Organise research data
By establishing some simple, effective data practices early in your project you can make your data files easy to store, find and use.
Data storage
A plan for the storage of research data is essential, both for the short and long term. The short term plan considers how to store data during research, whereas the long term plan identifies how and where to store data for archiving and future reuse after research activities end. For the short term, IT Services offer a flexible data storage service with backup. See "Preserve research data" for advice on longer term preservation and access.
Data formats
Your choice of file format will affect the usability and long term accessibility of your files and data. As technology changes, you should also plan for both hardware and software obsolescence.
File formats more likely to be accessible in the future have the following characteristics:
- Non-proprietary
- Open, documented standard
- Common usage by research community
- Standard representation (ASCII, Unicode)
- Unencrypted
- Uncompressed
Examples of preferred file format choices include:
- ODF, RTF or TXT, not Word (.doc or .docx)
- ASCII, not Excel (.xls or .xlsx)
- MPEG-4, not Quicktime
- TIFF, PNG or JPEG2000, not GIF or JPG
- XML or RDF, not RDBMS
If you are using proprietary software, consider migrating your data into a format with the above characteristics, in addition to keeping a copy in the original software format.
- Further guidance can be found from the University's Modern Records Centre
- Cornell University has published a guide to common file formats and when to use each
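As a rough illustration of the migration advice above, the following Python sketch flags files whose extension appears on the "not" side of the preferred-format examples and suggests the preferred alternative. The function name `flag_for_migration` and the extension table are assumptions made for this example (including the use of `.mov` for Quicktime); the table is not an exhaustive or official list.

```python
from pathlib import PurePath

# Hypothetical lookup table built from the preferred-format examples above.
# Keys are extensions from the "not ..." side; values are the suggested formats.
SUGGESTED_ALTERNATIVES = {
    ".doc": "ODF, RTF or TXT",
    ".docx": "ODF, RTF or TXT",
    ".xls": "ASCII text",
    ".xlsx": "ASCII text",
    ".mov": "MPEG-4",  # assumption: .mov is the usual Quicktime extension
    ".gif": "TIFF, PNG or JPEG2000",
    ".jpg": "TIFF, PNG or JPEG2000",
    ".jpeg": "TIFF, PNG or JPEG2000",
}


def flag_for_migration(filenames):
    """Return a dict mapping each flagged file name to its suggested format(s)."""
    flagged = {}
    for name in filenames:
        # Compare extensions case-insensitively, so ".XLSX" matches ".xlsx".
        suffix = PurePath(name).suffix.lower()
        if suffix in SUGGESTED_ALTERNATIVES:
            flagged[name] = SUGGESTED_ALTERNATIVES[suffix]
    return flagged
```

A check like this only identifies candidates. The migration itself still needs to be done in the original software, and the original file should be kept alongside the migrated copy.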
File naming and folders
Starting a project with a strategy for the consistent naming of both files and folders and creating appropriate file and folder structures will save time, avoid loss of data, allow re-use of the data, and assist in accurate location of data in the future.
- Jisc Digital Media has a guide on choosing file names
- UK Data Service advice for naming and structuring research data files
- UK Data Service recommended file formats
- Library of Congress recommended formats statement (digital and non-digital formats)
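One way to keep file names consistent is to generate them from the same few elements every time. The Python sketch below assumes a hypothetical convention of project, short description, date and version number; the convention and the function name `build_filename` are illustrative only and are not taken from the guides listed above.

```python
import re
from datetime import date


def _slug(text):
    """Lower-case the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project, description, created, version, extension):
    """Build a name like project_description_YYYYMMDD_vNN.ext (hypothetical convention)."""
    ext = extension.lstrip(".").lower()
    # ISO-style dates (YYYYMMDD) and zero-padded versions sort correctly by name.
    return f"{_slug(project)}_{_slug(description)}_{created:%Y%m%d}_v{version:02d}.{ext}"


# Example call with made-up values:
# build_filename("Soil Survey", "Interview notes", date(2024, 3, 5), 2, ".TXT")
# returns "soil-survey_interview-notes_20240305_v02.txt"
```

Whatever convention you choose, agreeing it at the start of the project and applying it to folders as well as files matters more than the exact pattern.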
Documentation and metadata
Good documentation for your data is like a ‘user’s guide’ to the data: it helps make the data understandable, verifiable and reusable. Just making the data available does not make it useful. If you or others come back to your data at a later time, you will need information on when, why, how and by whom the data was created.
Research funders often require the open access publication of research data metadata to facilitate the location, discovery and reuse of datasets. Documentation and metadata about a dataset are often mentioned together but can be very different things.
Metadata
Metadata are intended for reading by machines, and help to explain the purpose, origin, time references, geographic location, creator, access conditions and terms of use of a data collection. This allows data collections to be discovered for re-use and/or citation.
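To make the idea of machine-readable metadata concrete, here is a minimal Python sketch that checks a record for the elements named above and writes it out as JSON. The field names and the functions `missing_fields` and `to_json` are assumptions for this example; they do not follow any particular metadata standard, and a repository or funder may require a specific schema instead.

```python
import json

# Hypothetical field names, one per element mentioned in the text above.
REQUIRED_FIELDS = (
    "purpose",
    "origin",
    "time_reference",
    "geographic_location",
    "creator",
    "access_conditions",
    "terms_of_use",
)


def missing_fields(record):
    """Return the required fields that are absent or blank in the record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field, "")).strip()
    ]


def to_json(record):
    """Serialise a complete record as JSON text; raise ValueError if incomplete."""
    missing = missing_fields(record)
    if missing:
        raise ValueError("Missing metadata fields: " + ", ".join(missing))
    # ensure_ascii=False keeps Unicode text readable in the output.
    return json.dumps(record, indent=2, sort_keys=True, ensure_ascii=False)
```

Storing a small file like this next to the data gives discovery services something structured to index.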
Documentation
Documentation for a data collection or dataset includes high-level information on the research context and design, the data collection methods used, plus summaries of findings based on the data.
Documentation deposited alongside data files should enable users, with no prior knowledge of the research project and data collected, to understand what the data mean and be able to use the data correctly in their own research projects.
Further information and examples of best practice in documenting data can be found at UK Data Service.
Backup and security
It is essential during your project that you have plans in place to ensure the safe storage of your data as well as a strategy for regular backups.
If you store your data in one of the University’s storage options, it is automatically included in the main IT Services backup processes, so this can be an easy way to cover all your backup requirements.
For colleagues thinking of using a cloud service, the University has guidance on the selection and use of cloud services for storing data.
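If you also keep your own backup copies, one common way to confirm that a copy is intact is to compare checksums of the original and the backup. The Python sketch below uses SHA-256 from the standard library; it is an illustration only, the function names are made up for this example, and it does not describe how IT Services backups work.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original, backup):
    """Return True if both files have identical contents."""
    return sha256_of(original) == sha256_of(backup)
```

Recording the checksum at the time of backup also lets you detect later corruption of the original itself.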
Further help
Contact researchdata at warwick dot ac dot uk if you need further help and advice.