The aim of the module is to introduce students to current techniques, methods and results from the active field of database systems and data management. Typical topics include query planning and optimization; transaction processing and concurrency control; big data management; data warehousing and OLAP; and the theory of databases.
By the end of the module the student should be able to:
- Demonstrate understanding of issues surrounding concurrency control and parallelism in data management.
- Express queries in different forms (relational algebra, SQL, etc.).
- Devise appropriate ways to store and index data.
- Show understanding of modern data processing paradigms such as NoSQL and MapReduce/Pig Latin.
- Explain methods suitable for particular types of data such as temporal, multimedia or spatial data.
The syllabus combines core conceptual topics in advanced databases with current ideas in database systems. Topics will be drawn from:
- Refresher on databases and modelling
- Relational algebra, tuple relational calculus, SQL, and equivalences between them
- Query planning, evaluation and optimization
- Transaction processing, concurrency, ACID rules, OLTP
- Online analytical processing (OLAP), data warehouses
- Data storage and indexing, B-trees and hashing
- NoSQL systems and the relaxation of ACID rules; consistency, availability, partition tolerance (the CAP theorem)
- Database security and privacy, including anonymisation and release
- Parallel databases, hardware and software
- Big data, MapReduce, Pig Latin
- Special purpose databases, e.g. temporal, spatial, or multimedia databases
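
As a small illustration of what "equivalences between relational algebra and SQL" can mean in practice, the sketch below evaluates one query twice: once as a selection followed by a projection over an in-memory relation, and once as SQL through Python's standard `sqlite3` module. The `enrolment` relation, its attributes (`name`, `module`, `mark`), the sample rows, and the helper names `select` and `project` are all hypothetical and chosen only for this example; they are not taken from the module materials.

```python
import sqlite3

# Hypothetical toy relation with attributes (name, module, mark).
ATTRS = ("name", "module", "mark")
ROWS = [
    ("Ada", "Databases", 72),
    ("Ben", "Databases", 48),
    ("Cy", "Algorithms", 65),
]


def select(relation, predicate):
    """Selection (sigma): keep only the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, *attrs):
    """Projection (pi): keep the named attributes; a set removes duplicates."""
    return {tuple(t[a] for a in attrs) for t in relation}


def algebra_query():
    """pi_name(sigma_{module = 'Databases' AND mark >= 50}(enrolment))."""
    relation = [dict(zip(ATTRS, row)) for row in ROWS]
    passed = select(
        relation,
        lambda t: t["module"] == "Databases" and t["mark"] >= 50,
    )
    return project(passed, "name")


def sql_query():
    """The same query expressed in SQL against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE enrolment (name TEXT, module TEXT, mark INTEGER)")
    conn.executemany("INSERT INTO enrolment VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(
        "SELECT DISTINCT name FROM enrolment WHERE module = ? AND mark >= ?",
        ("Databases", 50),
    ).fetchall()
    conn.close()
    return set(rows)


if __name__ == "__main__":
    # Both formulations should return the same set of one-attribute tuples.
    print(algebra_query() == sql_query())
```

`DISTINCT` is used in the SQL version because relational algebra works on sets, whereas SQL returns bags (multisets) by default; without it the two formulations could differ on data containing duplicate names.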