                           Sengamala Thayaar Educational Trust Women’s College 
                                           (Affiliated to Bharathidasan University) 
                                       (Accredited with ‘A’ Grade {3.45/4.00} By NAAC) 
                                           (An ISO 9001: 2015 Certified Institution) 
                                     Sundarakkottai, Mannargudi-614 016. 
                                      Thiruvarur (Dt.), Tamil Nadu, India. 
         
         
         
         
         
         
         
         
         
         
         
         
         
                DATA MINING AND WAREHOUSING 
         
         
         
                                          V.GEETHA 
                                    ASSISTANT PROFESSOR 
              PG & RESEARCH DEPARTMENT OF COMPUTER SCIENCE 
                               II M.Sc., COMPUTER SCIENCE 
                                        Semester : III 
        
                  CORE COURSE VII - DATA MINING AND WAREHOUSING 
                                          P16CS31 
                    Inst. Hours/Week : 5                       Credit : 5 
        
         Objective : On successful completion of the course, the students should have understood data mining 
         techniques and the concepts and design of data warehousing. 
        
         UNIT I 
         Introduction – What is Data Mining – Data Warehouses – Data Mining Functionalities – Basic Data Mining Tasks – 
         Data Mining Issues – Social Implications of Data Mining – Applications and Trends in Data Mining. 
        
         UNIT II 
         Data Preprocessing: Why preprocess the data? – Data Cleaning – Data Integration and Transformation – Data 
         Reduction – Data Cube Aggregation – Attribute Subset Selection. Classification: Introduction – Statistical-based 
         algorithms – Bayesian Classification – Distance-based algorithms – Decision tree-based algorithms – ID3. 
        
         UNIT III 
         Clustering: Introduction – Hierarchical algorithms – Partitional algorithms – Minimum spanning tree – K-Means 
         Clustering – Nearest Neighbour algorithm. Association Rules: What is an association rule? – Methods to discover 
         an association rule – Apriori algorithm – Partitioning algorithm. 
        
         UNIT IV 
         Data Warehousing: An introduction – Characteristics of a data warehouse – Data marts – Other aspects of data 
         marts. Online analytical processing: OLTP & OLAP systems. 
        
         UNIT V 
         Developing a data warehouse: Why and how to build a data warehouse – Data warehouse architectural strategies 
         and organizational issues – Design considerations – Data content – Metadata – Distribution of data – Tools for data 
         warehousing – Performance considerations. 
        
         TEXT BOOKS 
         1. Jiawei Han and Micheline Kamber, “Data Mining: Concepts and Techniques”, Morgan Kaufmann Publishers, 2006. 
            (Unit I – Chapter 1 – 1.2, 1.4; Chapter 11 – 11.1) (Unit II – Chapter 2 – 2.1, 2.3, 2.4, 2.5.1, 2.5.2) 
         2. Margaret H. Dunham, “Data Mining: Introductory & Advanced Topics”, Pearson Education, 2003. 
            (Unit I – Chapter 1 – 1.1, 1.3, 1.5) (Unit II – Chapter 4 – 4.1, 4.2, 4.3, 4.4) 
            (Unit III – Chapter 5 – 5.1, 5.4, 5.5.1, 5.5.3, 5.5.4; Chapter 6 – 6.1, 6.3) 
         3. C.S.R. Prabhu, “Data Warehousing: Concepts, Techniques, Products & Applications”, PHI, Second Edition. 
            (Unit IV & V) 
        
         REFERENCES 
         1. Pieter Adriaans, Dolf Zantinge, “Data Mining”, Pearson Education, 1998. 
         2. Arun K. Pujari, “Data Mining Techniques”, Universities Press (India) Pvt., 2003. 
         3. S. Rajasekaran, G. A. Vijayalakshmi Pai, “Neural Networks, Fuzzy Logic, and Genetic Algorithms: Synthesis 
            and Applications”, PHI. 
         4. Margaret H. Dunham, “Data Mining: Introductory and Advanced Topics”, Pearson Education, 2003. 
        
                                            ***** 
         UNIT I 
         1.1 INTRODUCTION : WHAT IS DATA MINING? 
         Definition 
             Data mining refers to extracting or "mining" knowledge from large amounts of data. 
             In simple words, data mining is defined as a process used to extract usable data from a larger set of raw data. 
             Data mining is the practice of examining large pre-existing databases in order to generate new information. 
         Defining data mining introduces several related terms: 
             Database 
                 - A database is an organized collection of data, generally stored and accessed electronically from a 
                   computer system. 
             DBMS 
                 - A Database Management System is software that interacts with end users, applications, and the 
                   database itself to capture and analyze the data. 
             Data warehouse 
                 - A large store of data accumulated from a wide range of sources within a company and used to guide 
                   management decisions. 
             OLTP 
                 - Online Transaction Processing is a class of software programs capable of supporting 
                   transaction-oriented applications on the Internet (e.g., log files, online banking). 
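The contrast between an OLTP-style transaction and the kind of summary query used to guide decisions can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS; the accounts table, the owner names, and the transfer function are hypothetical and chosen only for illustration.

```python
import sqlite3

# An in-memory database managed by SQLite (the DBMS in this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(1, "asha", 500), (2, "ravi", 300)],  # hypothetical sample rows
)
conn.commit()


def transfer(conn, src, dst, amount):
    """OLTP-style operation: a short transaction that applies fully or not at all."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )


transfer(conn, 1, 2, 200)        # succeeds and is committed
try:
    transfer(conn, 2, 1, 1000)   # fails, so the partial update is rolled back
except ValueError:
    pass

# Individual balances after the two attempts.
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
# A summary (aggregate) query of the kind used for management decisions.
total = conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]
print(balances, total)
```

A real data warehouse would hold data gathered from many such operational sources; the single aggregate query here only hints at that analytical use.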
         KDD 
             Many people treat data mining as a synonym for another popularly used term, Knowledge Discovery from 
             Data, or KDD. 
             However, data mining is only one essential step in the process of knowledge discovery. 
             Data mining as a step in the process of knowledge discovery: 
                 1. Data Cleaning 
                 2. Data Integration 
                 3. Data Selection 
                 4. Data Transformation 
                 5. Data Mining 
                 6. Pattern Evaluation 
                 7. Knowledge Presentation 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
Data Cleaning
    - To remove noise and inconsistent data.
Data Integration
    - Where multiple data sources may be combined.
Data Selection
    - Where data relevant to the analysis task are retrieved from the database.
Data Transformation
    - Where data are transformed or consolidated into forms appropriate for mining by performing summary or aggregation operations.
Data Mining
    - An essential process where intelligent methods are applied in order to extract data patterns.
Pattern Evaluation
    - To identify the truly interesting patterns representing knowledge, based on some interestingness measures.
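
The steps described above can be sketched as a small pipeline. The following Python example is only an illustration under stated assumptions: the toy records, the field names (`id`, `items`, `region`), the function names, and the choice of pair support as the interestingness measure are all hypothetical and are not taken from the course material.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: two "sources" of sales records (illustrative only).
store_a = [
    {"id": 1, "items": "Milk, bread", "region": "north"},
    {"id": 2, "items": "milk, butter", "region": "north"},
    {"id": 3, "items": None, "region": "north"},  # noisy record: items missing
]
store_b = [
    {"id": 4, "items": "bread, butter, milk", "region": "south"},
    {"id": 5, "items": "bread, milk", "region": "south"},
]


def clean(records):
    """Data cleaning: drop records whose item list is missing."""
    return [r for r in records if r["items"]]


def integrate(*sources):
    """Data integration: combine multiple sources into one list."""
    return [r for source in sources for r in source]


def select(records):
    """Data selection: keep only the field relevant to the analysis task."""
    return [r["items"] for r in records]


def transform(item_strings):
    """Data transformation: turn each string into a set of lowercase items."""
    return [
        frozenset(item.strip().lower() for item in s.split(","))
        for s in item_strings
    ]


def mine(transactions):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for t in transactions:
        counts.update(combinations(sorted(t), 2))
    return counts


def evaluate(pair_counts, n_transactions, min_support=0.5):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {
        pair: count / n_transactions
        for pair, count in pair_counts.items()
        if count / n_transactions >= min_support
    }


def present(patterns):
    """Knowledge presentation: format the patterns as readable lines."""
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]


def run_pipeline():
    records = integrate(clean(store_a), clean(store_b))
    transactions = transform(select(records))
    patterns = evaluate(mine(transactions), len(transactions))
    return present(patterns)


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

Each function corresponds to one step, so a step can be replaced (for example, a different interestingness measure in `evaluate`) without touching the others.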
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                