Data Science Across Disciplines: Principles, Practice and Critique
Interdisciplinary Postgraduate Modules
IM939 - Data Science Across Disciplines: Principles, Practice and Critique
15/20/30 CATS (7.5/10/15 ECTS)
What is this module about?
This module introduces students to the fundamental techniques, concepts and contemporary discussions across the broad field of data science. With data and data-related artefacts becoming ubiquitous in all aspects of social life, data science gains access to new sources of data, is taken up across an expanding range of research fields and disciplines, and increasingly engages with societal challenges. The module provides an advanced introduction to the theoretical and scientific frameworks of data science, and to the fundamental techniques for working with data using appropriate procedures, algorithms and visualisation. Students learn how to critically approach data and data-driven artefacts, and engage with and critically reflect on contemporary discussions around the practice of data science, its compatibility with different analytics frameworks and disciplines, and its relation to ongoing digital transformations of society.
As well as lectures discussing the theoretical, scientific and ethical frameworks of data science, the module features coding labs and workshops that expose students to the practice of working effectively with data, algorithms and analytical techniques. These sessions also provide a platform for reflective and critical discussions on data science practices, the resulting data artefacts, and how those artefacts can be interpreted, acted upon and go on to influence society.
Module convener: Cagatay Turkay
Teaching staff: Zofia Bednarowska-Michaiel, James Tripp
Note that the following sessions are indicative.
Session-01: Introduction, Historical Perspectives & Basic Concepts
This week discusses data science as a field that cuts across disciplines and provides a historical perspective on the subject.
Session-02: Thinking Data: Theoretical and Practical Concerns
This week explores the cultural, ethical, and critical challenges posed by data artefacts and data-intensive scientific processes. Engaging with Critical Data Studies, we discuss issues around data capture, curation, data quality, inclusion/exclusion and representativeness.
Session-03: Abstractions & Models
This week discusses ways of abstracting data. We start by visiting statistics as a means of representing data and its inherent characteristics.
Session-04: Structures and Spaces
This week explores the notion of structures and how data science can enable the extraction of “hidden” underlying groups (clusters) and hierarchical structures from data.
Session-05: Multi-Model Thinking and Rigour in Data Science
This week we focus on multi-model approaches as a way of thinking, and on how critical, pluralistic thinking can improve our understanding of the underlying phenomena implicit in data.
Session-06: Recognising and Avoiding Traps
This week we discuss how to recognise the various methodological and ethical traps and pitfalls that one can encounter during the data science process.
Session-07: Data Science & Society
This week we engage with academic and practitioner discourse on the social, cultural and ethical aspects of data science. We discuss how one can responsibly carry out data science research on social phenomena, whether data science can be a transformative power in society, which ethical and social frameworks can help us critically approach data science practices and their effects on society, and what constitutes ethical practice for data scientists.
Session-08: Data Science Workshop - 1 (Design Thinking in Data Science)
This week explores the question “Can we approach data science as a design problem?” and discusses how one can embrace a user-centred approach to design appropriate data science processes. We will do this through a hands-on practical in which we go through the data science process using applied cases.
Session-09: Data Science Workshop - 2
This week will involve hands-on practical workshops where we go through the data science process using applied examples. During the workshop, we will also explore concepts such as narrative and visual storytelling, as well as reflect on the design process for our analysis and artefacts.
The assessments are individual and involve two components: a critical review and a data-driven essay. In the critical review, students approach a selected data science project through a critical lens covered during the lectures. This short report expects students to engage with the related literature and reflect on the decisions made by the researchers of the project. In the second component, the data-driven essay, students report on a data science project that they have carried out on a chosen question with an appropriate data set. The essay reports on the data science process from initiation to evaluation to reflection, while engaging with the relevant literature in the domain. Both pieces vary in length, depending on the number of CATS a student wishes to complete.
| Component | 15 CATS | 20 CATS | 30 CATS |
|---|---|---|---|
| Critical Review | 1000 words (40%) | 1250 words (40%) | 1500 words (40%) |
| Final Essay | 1500 words (60%) | 2000 words (60%) | 3000 words (60%) |
- Data science as a scientific practice: Dhar, Vasant. "Data science and prediction." Communications of the ACM 56.12 (2013): 64-73.
- Iliadis, A. and Russo, F., 2016. Critical data studies: An introduction. Big Data & Society, 3(2), p.2053951716674238.
- Ginsberg, Jeremy, et al. "Detecting influenza epidemics using search engine query data." Nature (2008)
- Kandel, Sean, et al. "Enterprise data analysis and visualization: An interview study." IEEE Transactions on Visualization and Computer Graphics 18.12 (2012): 2917-2926.
- Osborne, Jason. "Notes on the use of data transformations." Practical Assessment, Research & Evaluation 8.6 (2002): 1-8.
- Osborne, Jason W., and Amy Overbay. "The power of outliers (and why researchers should always check for them)." Practical Assessment, Research & Evaluation 9.6 (2004): 1-12.
- Guyon, Isabelle, and André Elisseeff. "An introduction to variable and feature selection." The Journal of Machine Learning Research 3 (2003): 1157-1182.
- Ringnér, Markus. "What is principal component analysis?" Nature Biotechnology 26.3 (2008): 303.
- Jaworska, Natalia, and Angelina Chupetlovska-Anastasova. "A review of multidimensional scaling (MDS) and its utility in various psychological domains." Tutorials in Quantitative Methods for Psychology 5.1 (2009): 1-10.
- Kohavi, Ron. "A study of cross-validation and bootstrap for accuracy estimation and model selection." International Joint Conference on Artificial Intelligence (IJCAI) (1995).
- White, Douglas R., and Stephen P. Borgatti. "Betweenness centrality measures for directed graphs." Social Networks 16.4 (1994): 335-346.
- Heer, Jeffrey, and Ben Shneiderman. "Interactive dynamics for visual analysis." Queue 10.2 (2012): 30.
- Ruckenstein, M. and Schüll, N.D., 2017. The datafication of health. Annual Review of Anthropology, 46, pp.261-278.
- Pink, S., Ruckenstein, M., Willim, R. and Duque, M., 2018. Broken data: Conceptualising data in an emerging world. Big Data & Society, 5(1), p.2053951717753228.
- Demonstrate an in-depth understanding of the theoretical underpinnings, scientific and ethical frameworks of data science as applied across disciplines
- Demonstrate a critical understanding of the role that data and data intensive practices play in research, industry and the wider society
- Demonstrate an understanding of the workings and the practicalities of the data science process
- Apply and evaluate data science techniques and tools for particular scenarios and argue their suitability
- Demonstrate an ability to critique any resulting data artefacts, ranging from data-informed decisions to data-driven models, including from a user-centred perspective
- Develop and demonstrate an understanding of the societal, ethical, and cultural implications of advances in and applications of data science
Important Registration Information:
- Please first discuss your optional module choices with your personal tutor during the personal tutor meetings and get their approval
- Then complete and submit the optional module choice webform available on the CIM welcome page
- The webform opens on 30th September at 14:00 BST and closes on 1st October at 15:00 BST
- If there are any queries, please get in touch with Gheerdhardhini (PG Coordinator) via email@example.com
- External students: please contact the CIM PG Coordinator (Gheerdhardhini) by email (firstname.lastname@example.org) to request your optional module choice by Wednesday, 7th October (Week 1), 17:00 BST.
- Please be advised that you may be expected to have access to a laptop for some of these courses due to software requirements; the Centre is unable to provide a laptop for external students.
- Please be advised that some modules may have restricted numbers and places are allocated according to availability.
- Please note that a request does NOT guarantee a place on the module and is subject to availability.
- Gaining permission of a member of CIM teaching staff or a member of staff from your home department or filling in the eVision Module Registration (eMR) system with the desired module does NOT guarantee a place on that module.
- Requests after the specified deadline will not be considered.
- For CIM students: the CIM PG Coordinator will confirm your place on the module by Friday, 2nd October.
- For external students: only after confirmation of a place from the CIM PG Coordinator can students or their home departments confirm their registration on eVision/MRM. Registrations by students who have not received confirmation of a place from CIM will be rejected via the system.
NOTE – The above-mentioned registration deadline also applies to the CIM optional modules running in Term 2. We will consider registrations again in the first week of Term 2, but only in relation to modules where there is availability.
We are normally unable to allow students (registered or auditing) to join or leave the module after its second week.