IM939 Data Science Across Disciplines: Principles, Practice and Critique

Illustrative figure showing part of an agent network
15/20/30 CATs - (7.5/10/15 ECTS)
Term 1

What is this module about?

This module introduces students to the fundamental techniques, concepts and contemporary discussions across the broad field of data science. With data and data-related artefacts becoming ubiquitous in all aspects of social life, data science gains access to new sources of data, is taken up across an expanding range of research fields and disciplines, and increasingly engages with societal challenges. The module provides an advanced introduction to the theoretical and scientific frameworks of data science, and to the fundamental techniques for working with data using appropriate procedures, algorithms and visualisation. Students learn how to critically approach data and data-driven artefacts, and engage with and critically reflect on contemporary discussions around the practice of data science, its compatibility with different analytics frameworks and disciplines, and its relation to ongoing digital transformations of society. As well as lectures discussing the theoretical, scientific and ethical frameworks of data science, the module features coding labs and workshops that expose students to the practice of working effectively with data, algorithms, and analytical techniques, as well as providing a platform for reflective and critical discussions on data science practices, the resulting data artefacts, how they can be interpreted and actioned, and how they influence society.

The module has a significant component of practical learning, enabled by hands-on coding activities using Python in lab sessions and by workshops incorporating design thinking. Each week will have coding labs in Python that are aligned with the concepts covered in the lectures and involve data sets and problems drawn from real-world contexts. You will have plenty of help with this even if you are completely new to coding, while those more familiar with coding will also have opportunities that challenge understanding and facilitate new learning. We also have a comprehensive one-stop-shop online manual, which is a live document collating the practical material and a handy guide for your learning journey.

Module convener: Kavin Narasimhan

Teaching staff: Cagatay Turkay, Ching Jin, Esha Nasir, Busola Oronti

Indicative Syllabus

Note that the following sessions are indicative.

Session-01: Introduction, Historical Perspectives & Basic Concepts

This week discusses data science as a field that cuts across disciplines and provides a historical perspective on the subject.

Session-02: Thinking Data: Theoretical and Practical Concerns

This week explores the cultural, ethical, and critical challenges posed by data artefacts and data-intensive scientific processes. Engaging with Critical Data Studies, we discuss issues around data capture, curation, data quality, inclusion/exclusion and representativeness.

Session-03: Abstractions & Models

This week discusses ways of abstracting data. We start by visiting statistics as a means of representing data and its inherent characteristics.

Session-04: Structures and Spaces

This week explores the notion of structures and how data science can enable the extraction of “hidden” underlying groups (clusters) and hierarchical structures from data.

Session-05: Multi-Model Thinking and Rigour in Data Science

This week we focus on multi-model approaches as a way of thinking and how critical, pluralistic thinking can improve our understanding of the underlying phenomena implicit in data.

Session-06: Recognising and Avoiding Traps

This week we discuss how we can be aware of various methodological and ethical traps and pitfalls that one can encounter during the data science process.

Session-07: Data Science & Society

We will engage with academic and practitioner discourse on the social, cultural and ethical aspects of data science. We discuss how one can responsibly carry out data science research on social phenomena, whether data science can be a transformative power in society, which ethical and social frameworks can help us critically approach data science practices and their effects on society, and what constitutes ethical practice for data scientists.

Session-08: Data Science Workshop - 1 (Design Thinking in Data Science)

This week explores the question “Can we approach data science as a design problem?” and discusses how one can embrace a user-centred approach to design appropriate data science processes. We will do this through a hands-on practical in which we go through the data science process using applied cases.

Session-09: Data Science Workshop - 2

This week will involve hands-on practical workshops where we go through the data science process using applied examples. During the workshop, we will also explore concepts such as narrative and visual storytelling, as well as reflect on the design process for our analysis and artefacts.

Assessment

The assessments are individual and involve two components: a critical review and a data-driven essay. In the critical review, students approach a selected data science project through a critical lens covered during the lectures. This short report expects students to engage with the related literature and reflect on the decisions made by the researchers of the project. In the second component, the data-driven essay, students report on a data science project that they carried out on a chosen question with an appropriate data set. The essay reports on the data science process from initiation to evaluation to reflection while engaging with the relevant literature in the domain. These assessments vary in length, depending on the number of CATS a student wishes to complete.

                   15 CATS             20 CATS             30 CATS
Critical Review    1000 words (40%)    1250 words (40%)    1500 words (40%)
Final Essay        1500 words (60%)    2000 words (60%)    3000 words (60%)

Illustrative Bibliography

  • Data science as a scientific practice: Dhar, Vasant. "Data science and prediction." Communications of the ACM 56.12 (2013): 64-73.
  • Iliadis, A. and Russo, F., 2016. Critical data studies: An introduction. Big Data & Society, 3(2), p.2053951716674238.
  • Ginsberg, Jeremy, et al. "Detecting influenza epidemics using search engine query data." Nature (2008)
  • Kandel, Sean, et al. "Enterprise data analysis and visualization: An interview study." Visualization and Computer Graphics, IEEE Transactions on 18.12 (2012): 2917-2926.
  • Osborne, Jason. "Notes on the use of data transformations." Practical Assessment, Research & Evaluation 8.6 (2002): 1-8.
  • Osborne, Jason W., and Amy Overbay. "The power of outliers (and why researchers should always check for them)." Practical Assessment, Research & Evaluation 9.6 (2004): 1-12.
  • Guyon, Isabelle, and André Elisseeff. "An introduction to variable and feature selection." The Journal of Machine Learning Research 3 (2003): 1157-1182.
  • Ringnér, Markus (2008). "What is principal component analysis?". Nature biotechnology (1087-0156), 26 (3), p. 303.
  • Jaworska, Natalia, and Angelina Chupetlovska-Anastasova. "A review of multidimensional scaling (MDS) and its utility in various psychological domains." Tutorials in Quantitative Methods for Psychology 5.1 (2009): 1-10.
  • Kohavi, Ron. "A study of cross-validation and bootstrap for accuracy estimation and model selection." International Joint Conference on Artificial Intelligence (IJCAI) (1995).
  • White, Douglas R., and Stephen P. Borgatti. "Betweenness centrality measures for directed graphs." Social Networks 16.4 (1994): 335-346.
  • Heer, Jeffrey, and Ben Shneiderman. "Interactive dynamics for visual analysis." Queue 10.2 (2012): 30.
  • Ruckenstein, M. and Schüll, N.D., 2017. The datafication of health. Annual Review of Anthropology, 46, pp.261-278.
  • Pink, S., Ruckenstein, M., Willim, R. and Duque, M., 2018. Broken data: Conceptualising data in an emerging world. Big Data & Society, 5(1), p.2053951717753228.

Learning outcomes

  • Demonstrate an in-depth understanding of the theoretical underpinnings, scientific and ethical frameworks of data science as applied across disciplines
  • Demonstrate a critical understanding of the role that data and data intensive practices play in research, industry and the wider society
  • Demonstrate an understanding of the workings and the practicalities of the data science process
  • Apply and evaluate data science techniques and tools for particular scenarios and argue their suitability
  • Demonstrate an ability to critique any resulting data artefacts, ranging from data-informed decisions to data-driven models, including from a user-centred perspective
  • Develop and demonstrate an understanding of the societal, ethical, and cultural implications of advances in and applications of data science