Big Data Technology & Visualisation

Introduction

The management of an organisation's data lifecycle is an essential activity in modern business. In recent years, the advent of cloud computing and the emergence of big data have fundamentally challenged and changed the processes involved. This module will explore these changes and the challenges and opportunities they bring, and will give students practical exposure to the relevant tools.

The full data management lifecycle will be covered in this module, from data acquisition, storage, cleaning and engineering, and analysis through to data visualisation. These techniques will be implemented using current tools available in modern cloud environments, including a combination of relational and NoSQL data stores populated with data extracted from source APIs and open data sources. These data stores will be connected to dashboards and visualisations that communicate the value and insights of the data via web-based applications. Participants will complete a final capstone project that applies these methods to a real-world setting.
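As an illustration only, the lifecycle stages described above (acquisition, cleaning, storage, analysis) can be sketched in a few lines of Python. This is a minimal sketch and not part of the module materials: the sample records, the `readings` table and the function names are all hypothetical, a hard-coded list stands in for an API response, and an in-memory SQLite database stands in for a cloud-hosted relational store.

```python
import sqlite3

# Hypothetical sample records standing in for data returned by a source API.
RAW_RECORDS = [
    {"city": "Coventry", "temp_c": "12.5"},
    {"city": "coventry ", "temp_c": "13.5"},  # inconsistent casing and whitespace
    {"city": "Leeds", "temp_c": None},        # missing reading
    {"city": "Leeds", "temp_c": "9.0"},
]


def clean(records):
    """Cleaning: normalise city names and drop records with no reading."""
    cleaned = []
    for rec in records:
        if rec["temp_c"] is None:
            continue
        cleaned.append((rec["city"].strip().title(), float(rec["temp_c"])))
    return cleaned


def store(rows):
    """Storage: load the cleaned rows into a relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (city TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()
    return conn


def analyse(conn):
    """Analysis: average temperature per city, ready for a chart or dashboard."""
    cur = conn.execute(
        "SELECT city, AVG(temp_c) FROM readings GROUP BY city ORDER BY city"
    )
    return cur.fetchall()


conn = store(clean(RAW_RECORDS))
summary = analyse(conn)
for city, avg in summary:
    print(f"{city}: {avg:.1f}")
```

In a cloud setting each function would typically map to a separate managed service or job, but the order of the stages stays the same.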

Objectives

Upon successful completion participants will be able to:

  • Demonstrate a comprehensive understanding of the key differences between traditional approaches and Big Data technologies and analysis methods.
  • Evaluate real-world scenarios and determine appropriate database solutions (traditional and NoSQL).
  • Demonstrate a comprehensive understanding of cloud data architectures and the operational risks associated with them, and develop appropriate mitigation strategies.
  • Demonstrate a comprehensive understanding of the core concepts of visual communication and data visualisation.
  • Practically implement data pipelines and processing in a cloud setting.

Syllabus

1) Cloud computing
- Introduction to AWS
- AWS Glue
- Step functions and AWS Lambda
2) Data collection/extraction
- Working with APIs
- Web crawlers
- Open data
3) Data storage
- RDBMS and NoSQL databases
- Building a data store
- Querying and processing data from a database
4) Data processing
- Hadoop and MapReduce
- Apache Spark
- Lambda architecture
5) Data analysis
- Analysis software
- Operationalisation
6) Data visualisation
- Visualisation software
- Interactive data visualisation
- Dashboards
7) A practical simulation of the above topics

Assessment

  • Big Data Architecture Presentation (30%)
  • 4,000-word Post Module Assignment (70%)

Duration

2 weeks including 21 hours of lectures, 9 hours of seminars and 15 hours of supervised practical classes