Plan your data
As you start a new project, it is important to consider the data that you will create, gather and use during the project, and to decide how you will manage that data.
What is Research Data Management?
Data can outlive the research project that creates or collects it. You may continue to work on your data after funding has ceased, follow-up projects may analyse or add to the data, and other researchers may re-use it. Managing your data properly throughout its whole lifecycle is therefore increasingly important.
Many funders now ask you to plan how you will manage your data as part of their application process. Considering your data management options at an early stage helps you make the right decisions at the right time about creating, storing and sharing your data.
Make sure you know about your funders' expectations.
Data planning
Data planning involves deciding, at the outset of your research:
- Which software and file formats to use
- How to organise, store and manage your data
- What to include in the consent agreements you negotiate
These decisions all affect what can be done with your data in the future. Data planning is best done by writing a Data Management Plan.
Research data
Research data is the information that you use to answer your research questions. It is often arranged, formatted or designed to facilitate communication, interpretation and further processing.
The Digital Curation Centre defines research data as "a reinterpretable representation of information in a formalized manner suitable for communication, interpretation, or processing."
Types of research data
Much research data is created ‘new’ for a specific project because it answers a novel question, but it may also be data from a previous project that has been transformed, adjusted or reinterpreted to fit the needs of the new project. Five commonly used data types are:
- Observational: data captured in real time that is usually unique and irreplaceable. For example, remote sensing data, survey data, field recordings, sample data
- Experimental: data captured from lab equipment that is often reproducible. For example, gene sequences, chromatograms, magnetic field data
- Models or simulation: data generated from test models, where the model and its metadata may be more important than the output data. For example, climate models, economic models
- Derived or compiled: resulting from processing or combining ‘raw’ data. For example, text and data mining, compiled databases, 3D models
- Reference or canonical: a static or organic conglomeration or collection of datasets, probably published and curated. For example, gene sequence databanks, collections of letters or archives of historical images
Examples of research data
Research data can be electronic or hard copy (e.g. paper) and may include the following:
- Documents (text, Word, PDF), spreadsheets
- Laboratory notebooks, field notebooks, diaries
- Questionnaire responses, transcripts, codebooks
- Audiotapes, videotapes, photographs, films
- Slides, artefacts, specimens, samples
- Collections of digital objects acquired and generated during the research process (including digitised archive material)
- Database contents (video, audio, text, images)
- Models, algorithms, scripts
- Contents of an application (input, output, logfiles for analysis software, simulation software, schemas)
- Methodologies, workflows and protocols
The UK Data Council offers detailed information about research data management practices across academic disciplines.