Queen's School of Computing

CISC 372/3.0 Advanced Data Analytics

Original Author: David Skillicorn
Last Revised: 2019-03-20

Calendar Description

Inductive modelling of data, especially counting models; ensemble approaches to modelling; maximum likelihood and density-based approaches to clustering; visualization. Applications to non-numeric datasets such as natural language, social networks, Internet search, recommender systems. Introduction to deep learning. Ethics of data analytics.

Prerequisites: CISC 371/3.0 and 3.0 units of (STAT or STAT_Options)

Exclusions: CISC 351*/3.0

Learning hours: 120 (36L; 84P)

Degree Planning

  • This course is required for the Data Analytics focus of the COMP degree plan.
  • This course is a direct prerequisite to:
    • CISC 451/3.0 (Topics in Data Analytics)

Possible Texts

  • Zaki and Meira. Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press.

Topics

  • Data analytics as an epistemology, review of optimisation-based prediction and clustering, ethical issues in data analytics (1 week)
  • Assessing model quality (accuracy, F1 score, precision, recall, ROC, AUC) (1 week)
  • Predictors based on counting: decision trees, rule systems (2 weeks)
  • Bias, variance, ensemble techniques (random forests, XGBoost, bagging, boosting) (2 weeks)
  • Visualization (1 week)
  • Data analytics for graph data (Internet search, recommender systems) (2 weeks)
  • Social network analysis (1 week)
  • Natural language analytics (1 week)
  • Introduction to deep learning (1 week)