Queen's School of Computing

CISC451/3.0 Topics in Data Analytics

Calendar description:

Content will vary from year to year; typical areas covered may include: tools for large-scale data analytics (Hadoop, Spark), data analytics in the cloud, properties of large-scale social networks, applications of data analytics in security.
Prerequisites: CISC351
Learning Hours: 120 (36L;36Lab;48P)

Course Outline:

Potential topics include:
Tools and techniques (8 weeks)

  • Conceptual and pragmatic difficulties in large-scale analytics
  • Properties of clusters (for computation) and clouds (for storage)
  • Tools for analytics at scale (Hadoop, Spark, TensorFlow)
Applications of large-scale analytics (4 weeks), such as:

  • Social network analysis in networks of planetary scale (e.g. Facebook)
  • Textual analytics of large corpora (e.g. forums, tweets)
  • Collaborative filtering in large-scale review data (e.g. Amazon review datasets)

Learning outcomes:

Upon successful completion of this course, a student will be able to:

  1. Design inductive model-building algorithms appropriate for data of any scale and complexity, using cutting-edge technologies
  2. Plan ways to collect data, build models, and interpret results in large distributed (cloud) datasets
  3. Evaluate the modelling performance of such algorithms, and the implications for the real-world system that the data describes

Possible Textbooks:

As a topics course, the readings will vary from year to year and will consist of custom courseware and papers located through the Queen’s library databases.