EC9D8: Foundations of Data Science in Economics

  • Mateusz Stalinski (Module Leader)
  • Mirko Draca (Module Lecturer)
30 CATS - Department of Economics

Introduction

Analyses in all fields of Economics now make frequent use of large and detailed datasets ("big data"). The explosion in data access and availability opens many opportunities for applied research, as well as new challenges in handling, processing, and extracting meaningful conclusions from the data. The aim of the module is to introduce students to the R and Python programming languages and to basic concepts of data science, and to provide "hands-on" experience with economic data. The module lays the foundation for more advanced material.

Principal Aims

The primary aim of the module is to introduce students to the analytical tools of data science provided by the R and Python programming languages, with additional experience in Excel. It builds coding skills from basic principles and covers methods to acquire, process and manipulate large volumes of data, often obtained from the web. Data visualisation methods are also presented.

By the end of the module, students should feel comfortable with collecting, visualizing and presenting both structured and unstructured datasets. This should include using multiple programs/languages, depending on their application of interest, including Excel, R and Python. They should be aware of the advantages/disadvantages of each package and applications of each one.

Principal Learning Outcomes

Subject Knowledge and Understanding: Be able to process and work efficiently with large datasets. The teaching and learning methods that enable students to achieve this learning outcome are: Lectures, seminars, independent study. The summative assessment methods that measure the achievement of this learning outcome are: Exam.

Subject Knowledge and Understanding: Develop and enhance computer skills in the R language, including the writing of clear and reproducible R code. The teaching and learning methods that enable students to achieve this learning outcome are: Lectures, seminars, independent study. The summative assessment methods that measure the achievement of this learning outcome are: Exam.

Subject Knowledge and Understanding: Be able to use R to process data and apply data-science methods. The teaching and learning methods that enable students to achieve this learning outcome are: Lectures, seminars, independent study. The summative assessment methods that measure the achievement of this learning outcome are: Exam.

Subject Knowledge and Understanding: Develop and enhance computer skills in the Python language, including the writing of clear and reproducible Python code. The teaching and learning methods that enable students to achieve this learning outcome are: Lectures, seminars, independent study. The summative assessment methods that measure the achievement of this learning outcome are: Exam.

Subject Knowledge and Understanding: Be able to use Python to process data and apply data-science methods. The teaching and learning methods that enable students to achieve this learning outcome are: Lectures, seminars, independent study. The summative assessment methods that measure the achievement of this learning outcome are: Exam.

Syllabus

• Overview of R and data types. Data as text and date formats;

• Operators, loops, apply family, defining your own functions, scoping rules;

• Reading and writing data;

• Data extraction and acquisition from web or databases (web scraping);

• Organizing, merging and managing data;

• Data visualization;

• Econometrics in R;

• Big data methods: K-means and Principal Component Analysis;

• Optimization;

• Simulation and code profiling.

Context

Assessment

Assessment Method: Exam (100%)
Coursework Details: Exam (100%)
Exam Timing: January