## CISC 351/3.0 Advanced Data Analytics

### Calendar description:

Design and implementation of complex analytics techniques; predictive algorithms at scale; deep learning; clustering at scale; advanced matrix decompositions; analytics on the Web; collaborative filtering; social network analysis; applications in specialized domains.

**Prerequisites:** (APSC 42 or APSC 143 or CISC 101 or CISC 110 or CISC 151 or CISC 121 or previous programming experience) and CISC 251 and (STAT 263 or STAT options)

**Learning Hours:** 120 (36L;24Lab;60P)

### Course Outline:

**Advanced Analytic Algorithms** (6 weeks)

- Deep learning (restricted Boltzmann machines, stacked denoising autoencoders, prediction and clustering using deep networks, the adversarial example problem)
- Gradient boosting trees
- Generalized linear models
- Scaling clustering algorithms (DBNorm)
- Advanced clustering using matrix decompositions (Singular Value Decomposition, Independent Component Analysis, Semi-Discrete Decomposition, Non-Negative Matrix Factorization)

**Application to non-tabular data** (4 weeks)

- Web analytics (PageRank, HITS, reducibility problem)
- Collaborative filtering and recommender systems
- Social network analysis (social network properties, viral spread, spectral embedding)
- Text analytics

**Scaling** (1 week)

- Scaling algorithms to larger data, introduction to clusters for analytics

**Examples** (1 week)

- Examples of real-world applications in natural language analysis, security, marketing

### Learning outcomes:

Upon successful completion of this course, a student will be able to:

- Design inductive model-building algorithms appropriate for datasets of substantial size and complexity with ill-defined requirements
- Plan ways to collect data, build models, and interpret results in network datasets
- Evaluate the modelling performance of such algorithms, and the implications for the real-world system that the data describes

### Possible Textbooks:

- Zaki and Meira, Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press.
- Skillicorn, Understanding Complex Datasets, Taylor and Francis.