COMP 306: Data Mining (formerly numbered 300)
This course covers the theory and practice of storing (warehousing) and analyzing (mining) extremely large volumes of information. With data growing at exponential rates, knowledge gathering and exploration techniques are essential for gaining useful intelligence.
Prerequisites: COMP 251: Introduction to Database Systems or COMP 271: Data Structures I, and STAT 103: Fundamentals of Statistics, STAT 203: Statistics, ISSCM 241: Business Statistics, or PSYC 304: Statistics; or instructor permission.
Data warehousing and data mining are two major areas of exploration for knowledge discovery in databases. These topics gained great relevance in the 1990s and early 2000s, as web data began growing at an exponential rate. As businesses and scientific institutions alike collect more data, knowledge exploration techniques are needed to gain useful business intelligence. This course will cover a wide spectrum of industry-standard techniques, using widely available databases and tool packages for knowledge discovery.
Data mining targets relatively unstructured data, for which more sophisticated techniques are needed. The course aims to cover powerful data mining techniques, including clustering, association rules, and classification. It then teaches high-volume data processing mechanisms by building warehouse schemas such as star and snowflake schemas. OLAP query retrieval techniques are also introduced.
Students will be able to define and critically analyze data warehousing and data mining approaches for fields such as security, forensics, privacy, and marketing.