Sengamala Thayaar Educational Trust Women’s College
(Affiliated to Bharathidasan University)
(Accredited with ‘A’ Grade {3.45/4.00} by NAAC)
(An ISO 9001:2015 Certified Institution)
Sundarakkottai, Mannargudi-614 016.
Thiruvarur (Dt.), Tamil Nadu, India.

DATA MINING AND WARE HOUSING

V. GEETHA
ASSISTANT PROFESSOR
PG & RESEARCH DEPARTMENT OF COMPUTER SCIENCE

II M.Sc., COMPUTER SCIENCE
Semester: III
CORE COURSE VII - DATA MINING AND WARE HOUSING
P16CS31
Inst. Hours/Week: 5
Credit: 5

Objective: On successful completion of the course the students should have understood data mining techniques and the concepts and design of data warehousing.

UNIT I
Introduction – What is Data Mining – Data Warehouses – Data Mining Functionalities – Basic Data Mining Tasks – Data Mining Issues – Social Implications of Data Mining – Applications and Trends in Data Mining.

UNIT II
Data Preprocessing: Why Preprocess the Data? – Data Cleaning – Data Integration and Transformation – Data Reduction – Data Cube Aggregation – Attribute Subset Selection. Classification: Introduction – Statistical based algorithms – Bayesian Classification – Distance based algorithms – Decision tree based algorithms – ID3.

UNIT III
Clustering: Introduction – Hierarchical algorithms – Partitional algorithms – Minimum spanning tree – K-Means Clustering – Nearest Neighbour algorithm. Association Rules: What is an association rule? – Methods to discover an association rule – APRIORI algorithm – Partitioning algorithm.

UNIT IV
Data Warehousing: An introduction – Characteristics of a data warehouse – Data marts – Other aspects of data mart. Online analytical processing: OLTP & OLAP systems.

UNIT V
Developing a data warehouse: Why and how to build a data warehouse – Data warehouse architectural strategies and organizational issues – Design consideration – Data content – Meta data – Distribution of data – Tools for data warehousing – Performance considerations.

TEXT BOOKS
1.
Jiawei Han and Micheline Kamber, “Data Mining: Concepts and Techniques”, Morgan Kaufmann Publishers, 2006. (Unit I – Chapter 1: 1.2, 1.4; Chapter 11: 11.1) (Unit II – Chapter 2: 2.1, 2.3, 2.4, 2.5.1, 2.5.2)
2. Margaret H. Dunham, “Data Mining: Introductory and Advanced Topics”, Pearson Education, 2003. (Unit I – Chapter 1: 1.1, 1.3, 1.5) (Unit II – Chapter 4: 4.1, 4.2, 4.3, 4.4) (Unit III – Chapter 5: 5.1, 5.4, 5.5.1, 5.5.3, 5.5.4; Chapter 6: 6.1, 6.3)
3. C.S.R. Prabhu, “Data Warehousing: Concepts, Techniques, Products and Applications”, PHI, Second Edition. (Units IV & V)

REFERENCES:
1. Pieter Adriaans, Dolf Zantinge, “Data Mining”, Pearson Education, 1998.
2. Arun K. Pujari, “Data Mining Techniques”, Universities Press (India) Pvt, 2003.
3. S. Rajasekaran, G. A. Vijayalakshmi Pai, “Neural Networks, Fuzzy Logic, and Genetic Algorithms: Synthesis and Applications”, PHI.
4. Margaret H. Dunham, “Data Mining: Introductory and Advanced Topics”, Pearson Education, 2003.

*****

UNIT I

1.1 INTRODUCTION: WHAT IS DATA MINING?

Definition
Data mining refers to extracting or “mining” knowledge from large amounts of data. In simple words, data mining is a process used to extract usable data from a larger set of raw data. It is the practice of examining large pre-existing databases in order to generate new information.

Defining data mining also introduces several related terms:

Database - an organized collection of data, generally stored and accessed electronically from a computer system.
DBMS - a Database Management System is software that interacts with end users, applications, and the database itself to capture and analyze the data.
Data warehouse - a large store of data accumulated from a wide range of sources within a company and used to guide management decisions.
OLTP - Online Transaction Processing is a class of software programs capable of supporting transaction-oriented applications on the internet.
(e.g., log files, online banking).

KDD
Many people treat data mining as a synonym for another popularly used term, Knowledge Discovery from Data, or KDD. However, data mining is an essential step in the process of knowledge discovery.

Data mining as a step in the process of knowledge discovery
1. Data Cleaning
2. Data Integration
3. Data Selection
4. Data Transformation
5. Data Mining
6. Pattern Evaluation
7. Knowledge Presentation

Data Cleaning - to remove noise and inconsistent data.
Data Integration - where multiple data sources may be combined.
Data Selection - where data relevant to the analysis task are retrieved from the database.
Data Transformation - where data are transformed or consolidated into forms appropriate for mining by performing summary or aggregation operations.
Data Mining - an essential process where intelligent methods are applied in order to extract data patterns.
Pattern Evaluation - to identify the truly interesting patterns representing knowledge, based on some interestingness measures.
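
The knowledge discovery steps listed above can be illustrated with a small Python sketch. It is only a toy example, not a standard implementation: the two data sources (STORE_A, STORE_B), the record layout (basket_id, item, quantity), the function names, and the 0.5 support threshold are all assumptions made for illustration. The mining step here simply counts pairs of items that occur together in a basket, and support is used as the interestingness measure.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data: two sales sources; each record is (basket_id, item, quantity).
STORE_A = [(1, "bread", 1), (1, "milk", 2), (2, "bread", 1), (2, None, 1)]
STORE_B = [(3, "bread", 1), (3, "milk", 1), (3, "eggs", -5), (4, "eggs", 1)]


def clean(records):
    """Data cleaning: drop records with a missing item or an impossible quantity."""
    return [r for r in records if r[1] is not None and r[2] > 0]


def integrate(*sources):
    """Data integration: combine several sources into one list of records."""
    return [r for source in sources for r in source]


def select(records):
    """Data selection: keep only the attributes relevant to the analysis task."""
    return [(basket_id, item) for basket_id, item, _quantity in records]


def transform(pairs):
    """Data transformation: consolidate rows into one set of items per basket."""
    baskets = defaultdict(set)
    for basket_id, item in pairs:
        baskets[basket_id].add(item)
    return list(baskets.values())


def mine(baskets):
    """Data mining: count how many baskets contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(basket), 2))
    return counts


def evaluate(counts, n_baskets, min_support=0.5):
    """Pattern evaluation: keep pairs whose support meets the (assumed) threshold."""
    return {
        pair: count / n_baskets
        for pair, count in counts.items()
        if count / n_baskets >= min_support
    }


def run_kdd():
    """Run the steps in order and return the patterns that survive evaluation."""
    records = integrate(clean(STORE_A), clean(STORE_B))
    baskets = transform(select(records))
    return evaluate(mine(baskets), len(baskets))


if __name__ == "__main__":
    # Knowledge presentation: show the surviving patterns to the user.
    for pair, support in run_kdd().items():
        print(pair, f"support={support:.2f}")
```

With this toy data, cleaning removes the record with a missing item and the record with a negative quantity, and the only pair that survives evaluation is bread and milk, which appears in 2 of the 4 baskets. Each step is kept as a separate function so that the order of the steps stays visible.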