CISC 432/3.0 Advanced Data Management Systems

Calendar description:

Storage and representation of “big data”, which are large, complex, structured or unstructured data sets. Provenance, curation, integration, indexing and querying of data.
Prerequisites: C- in CISC 235 and CISC 332
Learning Hours: 120 (36L;84P)

Course Outline:

Introduction (2 weeks)

Big Data Storage Systems (3 weeks)

Processing Frameworks (4 weeks)

Curation, Workflows, and Provenance (3 weeks)

Learning outcomes:

Upon successful completion of this course, a student will be able to:

  1. Create distributed storage structures for complex datasets
  2. Organize, integrate and process data from distributed storage systems
  3. Create metadata for complex datasets
  4. Articulate issues in data provenance and curation
  5. Build workflows and query the results

Possible Textbooks:

As this course is meant to remain current with the latest developments, it will be based on course notes and online resources.