# Datamining

Submitted By amlendu
Words 2101
Pages 9
MSc. Information System Management
Kyaw Khine Soe (3026039)
Boston Housing Dataset Analysis.

Table of Contents

- Introduction
- Problem Statement
- The associated data of Boston
- Data pre-processing / Data preparation
- Clustering Analysis
- Cluster segment profile
- Regression Analysis
- Predictive analysis using neural network node
- Decision tree node
- Regression node analysis
- Model Comparison
- The recommendation and conclusion
- Bibliography

Introduction

This report forms part of an assignment for Data Mining and Business Analytics. It uses the Boston Housing dataset to demonstrate prediction, cluster analysis, neural networks, and decision trees. Boston Housing is a real-estate dataset from Boston, Massachusetts. It is a small dataset of 506 rows that can be used to predict housing prices with regression, decision trees, and neural networks. The report analyzes property prices against property size and age and against environmental factors such as crime rate, proximity to the river (a dummy variable), distance to employment centers, and pollution.
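As a rough illustration of the kind of price prediction described above, the following is a minimal sketch of a one-variable least-squares regression. The function name `fit_simple_regression`, the variable names, and every number in it are assumptions made up for this example; they are not rows from the Boston Housing dataset and not the method used in the report.

```python
from statistics import mean


def fit_simple_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up illustrative values, NOT taken from the Boston Housing dataset.
rooms = [4.0, 5.0, 6.0, 7.0, 8.0]        # hypothetical average rooms per dwelling
price = [12.0, 17.0, 22.0, 27.0, 32.0]   # hypothetical prices in $1000s

slope, intercept = fit_simple_regression(rooms, price)

# Predict the price of a hypothetical 6.5-room dwelling from the fitted line.
predicted = slope * 6.5 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

A real analysis would use several predictors at once (crime rate, pollution, distance to employment centers) and hold out part of the 506 rows to check the fit.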
Problem Statement
In housing intelligence, real estate businesses are usually concerned with the following common questions:

1. Which areas have high crime rates? How do crime rates affect housing prices? How can crime be reduced?
2. Which areas have the highest and lowest house prices, based on the number of rooms, area, and pollution? What are their characteristics?
3. Are people willing to pay more for cleaner air? Are house prices higher near the Charles River or near the industrial zone?
4. How does the pupil-teacher ratio affect society? How does it affect a town's crime rate?
5. How do minority groups affect housing prices? Are they related to the crime rate?
6. What are the house...

### Similar Documents

#### Datamining

...Chapter 3 The Relational Model Review Questions 3.1 Discuss each of the following concepts in the context of the relational data model: (a) Relation (b) Attribute (c) Domain (d) Tuple (e) Intension and Extension (f) Degree and Cardinality. Each term defined in Section 3.2.1. 3.2 Describe the relationship between mathematical relations and relations in the relational data model? Let $D_1, D_2, \ldots, D_n$ be $n$ sets. Their Cartesian product is defined as: $D_1 \times D_2 \times \cdots \times D_n = \{(d_1, d_2, \ldots, d_n) \mid d_1 \in D_1, d_2 \in D_2, \ldots, d_n \in D_n\}$. Any set of n-tuples from this Cartesian product is a relation on the $n$ sets. Now let $A_1, A_2, \ldots, A_n$ be attributes with domains $D_1, D_2, \ldots, D_n$. Then the set $\{A_1{:}D_1, A_2{:}D_2, \ldots, A_n{:}D_n\}$ is a relation schema. A relation R defined by a relation schema S is a set of mappings from the attribute names to their corresponding domains. Thus, relation R is a set of n-tuples: $(A_1{:}d_1, A_2{:}d_2, \ldots, A_n{:}d_n)$ such that $d_1 \in D_1, d_2 \in D_2, \ldots, d_n \in D_n$. Each element in the n-tuple consists of an attribute and a value for that attribute. Discussed fully in Sections 3.2.2 and 3.2.3. 3.3 Describe the differences between a relation and a relation schema. What is a relational database schema? A relation schema is a named relation defined by a set of attribute and domain name pairs. A relational database schema is a set of relation schemas, each with a distinct name. Discussed in Section 3.2.3. 3.4 Discuss the......

Words: 3750 - Pages: 15

#### Datamining

...What is data mining: * Data mining (knowledge discovery from data) * Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amount of data * data processing using sophisticated data search capabilities and statistical algorithms to discover patterns and correlations in large preexisting databases; a way to discover new meaning in data. 2. KDD process * General functionality * Descriptive data mining * Predictive data mining * Different views lead to different classifications * Data view: Kinds of data to be mined * Knowledge view: Kinds of knowledge to be discovered * Method view: Kinds of techniques utilized * Application view: Kinds of applications adapted Data mining issues * Mining methodology * Mining different kinds of knowledge from diverse data types, e.g., bio, stream, Web * Performance: efficiency, effectiveness, and scalability * Pattern evaluation: the interestingness problem * Incorporation of background knowledge * Handling noise and incomplete data * Parallel, distributed and incremental mining methods * Integration of the discovered knowledge with existing one: knowledge fusion * User interaction * Data mining query languages and ad-hoc mining * Expression and visualization of data mining results * Interactive mining of knowledge at......

Words: 874 - Pages: 4

#### Essentials

Words: 1553 - Pages: 7

#### Managing Data Resources

...Chapter 7 Managing Data Resources True-False Questions

The benefits of a DBMS are immediately tangible. Answer: False. Difficulty: Easy. Reference: p. 234

Excellent hardware and software will result in inefficient information systems if file management is poor. Answer: True. Difficulty: Easy. Reference: p. 234

A record describes an entity. Answer: True. Difficulty: Easy. Reference: p. 235

In traditional file processing, each functional area, by developing its own specialized applications, contributes to data ...

Words: 4937 - Pages: 20

#### Rohan

...Indian Institute of Management Bangalore Customer Relationship Management Faculty: Prof. G. Shainesh Term VI PGP (2008-09) 3 Credit Course Background – The primary purpose of any business is to win and keep customers. Its competitors also seek to do the same. Most successful firms have developed capabilities for attracting customers through their marketing programs. But they have shown mixed results when it comes to retaining these customers. Customer Relationship Management helps businesses in successfully implementing strategies aimed at winning and retaining customers profitably. It is also helping businesses shift from a short-term transaction based mode of operation in their interactions with customers to a long-term relationship mode. Objectives – The objective of this course is to help students understand the concept and practice of CRM derived from research and applications across businesses. These concepts and applications from real life case studies will help identify opportunities, which can be successfully implemented for long term profitability. Pedagogy – The teaching methodology will include a mix of lectures, discussions of pre-readings, presentations by practitioners, exercises and case analysis. The cases are integrative in nature but will also help develop an appreciation of specific elements of CRM. Group Project – Option 1 - Identify any organisation which is practicing some form of CRM. Start working with them to......

Words: 1490 - Pages: 6

#### Study Guide

...Enterprise information systems focus on data warehouse Important terms and concepts i. Definitions and purposes of data warehouse ii. Definitions of data mart iii. Data ware data type iv. Metadata 4. Week 9 Data Communication a. Types of networks i. Pan/Lan/Can/Wan/Man ii. Bluetooth, WiFi, WiMax iii. Terms/concepts 1. Packet switching 2. 3. 4. 5. 6. 7. 8. 9. a. b. Internet protocol TCP/IP VOIP (definition and advantages/disadvantages) VPN (definition) Hotspots Access points Tunnel 5. Week 13 Business Intelligence Definitions and architecture of BI Analytical tools i. definitions ii. OLAP 1. Drill-down 2. Pivot tables & Pivoting 3. Slicing 4. Dicing iii. Data Mining 1. Definitions 2. Supervised data mining 3. Unsupervised datamining iv. Decision support systems 1. Decision types (unstructured/ structured/ semi-structured) 2. Comparison of DSS/MIS/TPS/EIS(executive information systems) 3. Model-oriented DSS 4. Data-oriented DSS 5. Sensitivity analysis Monitoring tools i. KPIs ii. Balanced scorecard iii. Digital Dashboards iv. scorecards c....

Words: 273 - Pages: 2

#### Singularity Notes

...nation's grid. Currently, the price sits as low as \$0.70 per watt. Once we have molecular nanotechnology based manufacturing, can produce solar panels extremely inexpensively, basically at the cost of raw materials. Could eventually be as inexpensive as a penny per square meter. Could put solar panels everywhere, on buildings, majority of human surfaces. Could put solar satellite into space and beam to earth via microwave. Each unit could provide billions of watts of electricity. Medicine 213 New ECG analysis for long-term unobtrusive monitoring, detect early warning signs of heart disease. AI programs to do pattern recognition and intelligent data mining in development of new drug therapies. Intelligent data mining tools to find new ways to disrupt metabolisms of pathogens. CPOE checks every order for possible allergies in a patient, drug interactions, duplications, drug restrictions, guidelines, etc. Pattern recognition applied to protein patterns can better detect ovarian cancer. Augmentations 197 Billions or trillions of nanobots can be put into the bloodstream. Use to scan the human brain to reverse engineer it. Blood...

Words: 399 - Pages: 2

#### A Study on the Systems Applications and Products (Sap) and Statistical Analysis System (Sas) Software

...Black, K. (2011). Applied business statistics making better business decisions. (6th ed.). Asia: John Wiley & Sons. Business information systems. (n.d.). Retrieved October 25, 2011 from http://moodle.apc.edu.ph/course/category.php?id=6. Galarpe K. (2010). SAS partners with Asia Pacific College for biz curriculum. Retrieved October 25, 2011 from http://www.abs-cbnnews.com/business/12/18/10/sas-partners-asia-pacific-college-biz-curriculum. Jasch, C. M. (2010). Environmental and material flow cost accounting: principles and procedures. New York: Springer. Kieso, D. E., Weygandt, J. J., & Warfield, T. D. (2010). Intermediate Accounting. (13th ed.). Asia: John Wiley & Sons. Livingstone, J. L. (1970). Management planning and control: mathematical models. McGraw-Hill. Manuel, Z. C. (2011). Accounting process. (17th ed.) Philippines: Raintree Trading and Publishing Inc. Miller, R. E., & Blair, P. D. (2009). Input-output analysis: foundations and extensions. (2nd ed.) Cambridge University Press. Minz B. (2010). Difference between SAS and SAP. Retrieved October 25, 2011 from http://itknowledgeexchange.techtarget.com/itanswers/sap-vs-sas/. Peterson. W. (1991). Advances in input-output analysis: technology, planning, and development. Oxford University Press. Raa, T. T. (2006). The Economics of input-output analysis. Cambridge University Press. SAS Institute Inc. (1976). SAS: the power to know. Retrieved October 25, 2011 from......

Words: 268 - Pages: 2

#### Marketing Chapter 5

...Marketing Management 12e, Kotler and Kellner Summary: Chapter 5 (pages 139 - 171) Building Customer Value, Satisfaction and Loyalty Successful marketing companies invert the chart and see customers at the top and managers at every level must be personally involved in knowing, meeting and serving customers. Customer Perceived Value Customers are more educated and informed than ever and they have the tools to verify companies’ claims and seek out superior alternatives. They estimate which offer will deliver the most perceived value and act on it. Customer perceived value (CPV) is the difference between the prospective customer’s evaluation of all the benefits and all the costs of an offering and the perceived alternatives. Total customer value is the perceived monetary value of the bundle of economic, functional and psychological benefits customers expect from a given market offering. Total customer cost is the bundle of costs customers expect to incur in evaluating, obtaining, using and disposing of a given market offering, including monetary, time, energy and psychic costs. Applying value concepts = what the customer perceives as value and applying these. Choices and implications = (given in the context of the tractor example) ▪ The buyer may be under orders to buy at the lowest price ▪ The buyer will retire before the company realises that the product is more expensive to run ▪ The buyer enjoys a long-term friendship with one of the sales......

Words: 2074 - Pages: 9

#### Crm Curriculum

...Indian Institute of Management Bangalore Customer Relationship Management Faculty: Prof. G. Shainesh Room C-103, Tel : 3334 Term IV PGP (2014-15) 3 Credit Course Background – Businesses aim to win and keep customers. Its competitors also seek to do the same. Most successful firms have developed capabilities for attracting customers through their marketing programs. But they have shown mixed results when it comes to retaining these customers. Customer Relationship Management helps businesses in successfully implementing strategies aimed at winning and retaining customers profitably. It is also helping businesses shift from a short-term transaction based mode of operation in their interactions with customers to a long-term relationship mode. Objectives – The objective of this course is to help students understand the concept and practice of CRM derived from research and applications across businesses. These concepts and applications from real life case studies will help identify opportunities, which can be successfully implemented for long term profitability. Pedagogy – The teaching methodology will include a mix of lectures, discussions, presentations by practitioners, videos, exercises and case analysis. The cases are integrative in nature but will also help develop an appreciation of specific elements of CRM. Each session will require preparation of assigned reading / case and active participation by students. A significant portion of the......

Words: 2004 - Pages: 9

#### Student

...Mounir El HANBALI *  :mounir.elhanbali@gmail.com : (+971) 52 8874084 INDUSTRIAL ENGINEER CAREER OBJECTIVE A trilingual (Arabic, French, English) graduate with a bachelor degree in Industrial Engineering, seeking an opportunity where I can apply my engineering, managerial skills, and add value to the existing systems of an organization. EDUCATION (in progress): Master of Science and Logistics at UOWD (University Of Wollongong in Dubai). 2008-2013: Bachelor Degree in Industrial Engineeringat IIHEM (International Institute for Higher Education in Morocco), Rabat, Morocco. 2007-2008: Selectividad Degree (preparatory classes of Spanish for higher education) and Certificate of proficiency at Cervantes Spanish Institute. 2006-2007: 1st year in Economics and Business Administration in the University of Economics and Social Sciences of Fes, Morocco. 2005-2006: High School Graduation, Experimental Sciences, Taourirt, Morocco. EXPERIENCE • Google Student Ambassador in the MENA Region: Responsibilities: Organizing seminars, meetings and activities to present the Google technology to the educational and professional sectors. Holcim (Maroc) Fes: Senior Project in the Quality Department (June 2013 to September 2013) Subject: the Assessment and Control of the Impact of Fluorine on the quality and cost of cement. Project Manager of Talent and Creativity Club (September 2010 to July2011) Responsibilities: Organizing artistic and entertainment events, Talent Show 5. Talent and......

Words: 667 - Pages: 3

#### Database 6th Ch1 and Ch2

...Chapter 1 Databases and Database Users Review Questions 1.1. Define the following terms: data, database, DBMS, database system, database catalog, program-data independence, user view, DBA, end user, canned transaction, deductive database system, persistent object, meta-data, and transaction-processing application. Answer: Data: Facts that can be recorded and that have implicit meaning Database: Collection of related data DBMS: Collection of programs that enables users to create and maintain a database (Software) Database system: database and DBMS software together Database catalog: structure of data files is stored Program-data independence: property that properties that DBMS access programs do not require such changes in most cases. The structure of data files is stored in the DBMS catalog separately from the access programs. User view: DBA: assisted by a staff that carries out these functions End user: the people whose jobs require access to the database for querying, updating, and generating reports; the database primarily exists for their use Canned transaction: that have been carefully programmed and tested Deductive database system: database systems provide capabilities for defining deduction rules for inferencing new information from the stored database facts Persistent object: Meta-data: The information stored in the catalog Transaction-processing application: 1.2. What four main types of actions involve databases? Briefly discuss......

Words: 1273 - Pages: 6

#### It Consultation for Mr Green

...data warehouse Term Paper CIS 111 The concept of data warehousing is easy to understand: to create a central location and permanent storage space for the various data sources needed to support a company’s analysis, reporting and other functions (2011). This actually traces back to about 1990 and the works of Bill Inmon. Inmon defined a data warehouse as ‘a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process’. Since that time, data warehouses have become very large and single-subject data marts have proliferated. Data warehousing has several advantages. A data warehouse delivers enhanced business intelligence: with data provided from various sources, managers and executives will no longer need to make business decisions based on limited data or their gut. In addition, “data warehouses and related BI can be applied directly to business processes including marketing segmentation, inventory management, financial management, and sales.” A data warehouse saves time: since business users can quickly access critical data from a number of sources, all in one place, they can rapidly make informed decisions on key initiatives. They will not waste precious time retrieving data from multiple sources. Not only that, but business executives can query the data themselves with little or no support from IT, saving more......

Words: 1106 - Pages: 5

#### Persuasion Notes

...Likable people are more persuasive: 1. Physical attractiveness- attractive people are more persuasive both in terms of getting what they request and in changing another's attitude. 2. Similarity- we like people who are like us. 3. Increased Familiarity- through repeated contact is another factor 4. Association- By connecting themselves or their products with positive things, advertisers, politicians, and merchandisers frequently seek to share in the positivity through the process of association. Commitment and Consistency: People want to be consistent for 3 reasons- 1.To be valued by society- personal consistency is highly valued by society 2. It is beneficial to daily life- also has a positive effect on public image 3. It provides a shortcut through life's complexity Commitments are most effective if they are- 1. Active 2. Public 3. Effortful 4. Internally motivated "Throwing a low ball"- telling someone you will do something, then once they agree and join you , you take it back and they will still be "onboard" because they've already found other reasons to be onboard. Reciprocity: Rule for reciprocation- one person has to pay back what the other has provided, etc. You scratch my back and I will scratch yours. This rule allows one to give to someone with confidence of what he is giving will not be lost. This sense of future obligation develops the continuation of relationships, transactions, and exchanges. The decision to comply with another's request......

Words: 1313 - Pages: 6