IM950 - Scaling Data & Societies



20 CATS

Term 2


Module Convenor

Dr Ching Jin

Introductory description

Big data technologies involve scaling up: scaling up quantities of data, scaling up data infrastructures, scaling up data management, and scaling up the number of participants in a given technological system. This module provides an understanding of the technical, methodological and conceptual changes in the new forms of thinking, research and engineering required for understanding and working with scalable socio-technical systems. Beginning with the question of what 'scale' is in general and how data-based transformations redefine the limits of scale, the module presents students with a series of ‘lenses’, supported by hands-on laboratory exercises, through which the impact of scale manifests itself differently across contemporary data spaces. By the end of the module, students will have gained knowledge and a greater appreciation of the impact of big data on research in socio-technical systems at various scales and, conversely, of the multiple ways in which the concept of scale is driving developments in big data.

Principal module aims

In this module, students gain conceptual and methodological knowledge, together with practical experience, of the theoretical, scientific, and social aspects of scalable data systems. The module will enable students to develop general knowledge of the impact that ‘scalability’ has across different data spaces in socio-technical systems. This will provide the basis for developing an understanding of the technical and methodological aspects of big data analysis at different scales. The module also enables students to gain hands-on experience and develop practical skills in distributed data processing, decentralized blockchain technologies, and large-scale network analysis. The overall aim of the module is to enable students to develop a rigorous theoretical, methodological and technical appreciation of the issue of scale as it relates to data management, data analysis and digitally-mediated social life.

Assessment

  • One essay of 4000 words on a topic from a given list or of the student's choosing (subject to the convenor's approval). The principal aim of the assignment will be to critically analyse an issue relating to scaling data.

Outline syllabus

This outline is indicative only and shows the sort of topics that may be covered. Actual sessions held may differ.

Session 1: Conceptualising Scale This session introduces the key concepts and methodologies of the module relating to the issue of scale.

Session 2: Distributed Systems and Distributed Data This session introduces the distinction between centralized, decentralized, and distributed communications and data systems.

Session 3: Computing in the Cloud This session introduces students to the shift from Big Data to cloud computing, as instantiated in technologies such as Apache Spark and stream processing.

Session 4: Blockchain and Society This session discusses the origins of money; digital cash; distributed ledgers; forensic analysis of blockchains; and the sociology of the blockchain industry.

Session 5: Networks at Scale This session interrogates the notion that networks and network thinking are everywhere in an age of big data. We explore how networks expand and what properties they can exhibit.

Session 6: Individuals and/as Crowds This session examines the ways in which big data analytics operates on, affects and shapes populations and, secondarily, individuals.

Session 7: Prediction at Scale This session explores big data and algorithmic analytics, which are increasingly used for predicting the social world, e.g. group behaviour, financial markets or the spread of disease.

Session 8: Scaling Time This session explores the ways in which time and temporality play a role in contemporary big data processes and analytics.

Session 9: Assessment Workshop This session supports students in the development of their essay plans.

Learning outcomes

By the end of the module, students should be able to:

  • Demonstrate in-depth knowledge of the theoretical underpinnings, scientific techniques, and social impacts of scalable data systems
  • Demonstrate a critical understanding of the role that scale plays in research, industry and the wider society in an age of big data
  • Demonstrate a practical ability to describe and engage with the technical workings of large-scale technological ecosystems ranging from distributed data processing to decentralized blockchains
  • Demonstrate an appreciation of the societal, ethical, and cultural implications of advances in and applications of scalable technologies in an age of big data
  • Appreciate the value of understanding different disciplinary approaches and perspectives
  • Demonstrate confidence and competence in interdisciplinary work, specifically in understanding the variety of ways in which scale affects data and societies
  • Communicate ideas effectively in different ways and to people with different disciplinary backgrounds

Indicative reading list

  • Carr, E. S., & Lempert, M. (Eds.). (2016). Scale: Discourse and dimensions of social life. University of California Press.
  • Baran, P. (1964). On Distributed Communications: I. Introduction to Distributed Communication Networks. RAND.
  • Damji, J., Wenig, B., Das, T., & Lee, D. (2020). Learning Spark: Lightning-Fast Data Analytics (2nd ed.). O’Reilly Media, Inc.
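  • Armbrust, M., Fox, A., Griffith, R., Joseph, A., Katz, R., Konwinski, A., Lee, G., Patterson, D., Rabkin, A., Stoica, I., & Zaharia, M. (2009). Above the clouds: A Berkeley view of cloud computing (Technical Report UCB/EECS-2009-28). EECS Department, University of California, Berkeley.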
  • Nakamoto, S. (2009). Bitcoin: A peer-to-peer electronic cash system.
  • Zachariadis, M., Hileman, G., & Scott, S. V. (2019). Governance and control in distributed ledgers: Understanding the challenges facing blockchain technology in financial services. Information and Organization, 29(2), 105–117.
  • Barabási, Albert-László. 2009. ‘Scale-Free Networks: A Decade and Beyond’. Science 325 (5939): 412–13.
  • Holme, Petter. 2019. ‘Rare and Everywhere: Perspectives on Scale-Free Networks’. Nature Communications 10 (1): 1016.
  • Smith, Marc A., et al. 2015. ‘The Structures of Twitter Crowds and Conversations’. In Transparency in Social Media: Tools, Methods and Algorithms for Mediating Online Interactions, edited by Sorin Adam Matei et al., 67–108. Computational Social Sciences. Cham: Springer. https://doi.org/10.1007/978-3-319-18552-1_5.
  • Amoore, Louise. 2020. Cloud Ethics: Algorithms and the Attributes of Ourselves and Others. Durham: Duke University Press.
  • Scannell, R. Joshua. 2019. ‘This Is Not Minority Report: Predictive Policing and Population Racism’. In Captivating Technology: Race, Carceral Technoscience, and Liberatory Imagination in Everyday Life, edited by Ruha Benjamin, 107–29. Durham: Duke University Press.
  • Allen, R. L., & Mills, D. W. (2004). Signal analysis: Time, frequency, scale, and structure. IEEE Press.