Database IO
Databases are a way of storing not just data, but also detailed metadata (data about data), and the relationships between data items. They make it easy to search and manage data, and can have a lot of advantages over files.
For "Big" computing (work done on clusters or specialised machines), you might find yourself producing a huge number of rather small files. This is "impolite" to others using the resource, can be inefficient in compute terms, and may well be inefficient for further data processing.
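As a rough illustration of the idea, the sketch below uses Python's built-in sqlite3 module to keep the output of several runs in one database rather than one small file per run. The table names (runs, samples), the column names, and the example values are all invented for this illustration and are not taken from the course materials. The metadata for each run lives in one table, the data points in another, and a foreign key records the relationship between them.

```python
import sqlite3


def store_runs(conn, runs):
    """Store each run's metadata and its samples in two related tables."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        " id INTEGER PRIMARY KEY, label TEXT NOT NULL, temperature REAL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " run_id INTEGER NOT NULL REFERENCES runs(id),"
        " step INTEGER, energy REAL)"
    )
    with conn:  # commit all the inserts as a single transaction
        for label, temperature, energies in runs:
            cur = conn.execute(
                "INSERT INTO runs (label, temperature) VALUES (?, ?)",
                (label, temperature),
            )
            run_id = cur.lastrowid  # key linking the samples to this run
            conn.executemany(
                "INSERT INTO samples (run_id, step, energy) VALUES (?, ?, ?)",
                [(run_id, step, e) for step, e in enumerate(energies)],
            )


def mean_energy_by_run(conn, min_temperature):
    """Search on metadata, then aggregate the data belonging to each match."""
    rows = conn.execute(
        "SELECT r.label, AVG(s.energy) FROM runs r"
        " JOIN samples s ON s.run_id = r.id"
        " WHERE r.temperature >= ?"
        " GROUP BY r.id, r.label ORDER BY r.label",
        (min_temperature,),
    )
    return rows.fetchall()


# Hypothetical data: (label, temperature, list of energies) per run.
conn = sqlite3.connect(":memory:")  # use a filename to keep the database
store_runs(
    conn,
    [
        ("run_a", 300.0, [1.0, 2.0, 3.0]),
        ("run_b", 350.0, [2.0, 4.0]),
        ("run_c", 250.0, [5.0]),
    ],
)
result = mean_energy_by_run(conn, 300.0)
print(result)
```

Selecting runs by temperature and averaging their samples is a single query here; with one file per run, the same task would mean opening and parsing every file.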
This self-taught course contains a short introduction to SQL databases, how to use them, and some examples of using them for research data management.
Download the slides to look over, a hands-on examples sheet, and some command snippets here.
Following Along
Start by reading over the slides until you come to "Hands on Time". At this point you should stop, fire up sqlite3, and explore some basics, following the HandsOn.pdf guide. The slides then cover some more advanced material.
You may also want to examine all the snippets in the Commands folder, and make sure you understand them.