Research data management using DataLad

"Learning the basics of the DataLad version control system for research data. DataLad is a community project built on top of git and git-annex and a critical tool for reproducible cognitive neuroscience."

Information

The estimated time to complete this training module is 3h.

The prerequisites to take this module are:

Contact Pierre Bellec if you have questions on this module, or if you want to check that you successfully completed all the exercises.

Resources

This module was presented by Adina Wagner during the HBM brainhack in 2020.

The material of the tutorial is available here.

The video of her presentation is available below:

Exercise

  • Follow along with Adina's tutorial. You can copy and paste the commands from the DataLad handbook section linked above while watching the video.
    • Warning: the URL for one of the books in the tutorial (byte-of-python.pdf) is broken, so the PDF is unreadable. This does not affect the tutorial, but do not be surprised if that document does not open. It also shows how important it is to create persistent URLs when you release material, such as those offered by platforms like Zenodo, OSF or figshare.
    • Warning 2: to follow the tutorial you may need to install new command-line tools, such as tree.
  • Check with Pierre Bellec to validate that the history of your datalad repository includes all the steps of the tutorial.
  • 🎉 🎉 🎉 you completed this training module! 🎉 🎉 🎉
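
Before checking with Pierre Bellec, it can help to list the history of your dataset yourself. The snippet below is a minimal sketch and not part of the tutorial. It relies on the fact that a DataLad dataset is a git repository, and it assumes git is installed. The function names `parse_log` and `dataset_history` are hypothetical helpers introduced here for illustration.

```python
import subprocess


def parse_log(text):
    """Split 'hash<TAB>subject' lines into (hash, subject) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        commit_hash, _, subject = line.partition("\t")
        entries.append((commit_hash, subject))
    return entries


def dataset_history(dataset_path):
    """Return the commits of the dataset at dataset_path, oldest first.

    Assumes dataset_path is the root of your DataLad dataset
    (a git repository) and that git is on the PATH.
    """
    result = subprocess.run(
        ["git", "-C", dataset_path, "log", "--reverse",
         "--pretty=format:%h%x09%s"],  # short hash, tab, commit subject
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)
```

Calling `dataset_history("path/to/your/dataset")` (the path is a placeholder) returns one entry per commit, which you can compare against the steps of the tutorial.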

More resources

If you want to learn more, check:

  • the DataLad handbook, which features a lot of additional resources as well!
  • the datalad datasets GitHub organization, which provides easy access to a number of data resources. DataLad repositories of this type are the easiest way to get access to datasets.
  • note that for the last part of the tutorial you will need to install Singularity and the datalad-container extension (installable through pip).
  • all of the OpenNeuro datasets, available on the OpenNeuro GitHub organization.
  • you can also read about the YODA principles for reproducible papers.
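
The tutorial relies on several tools mentioned in this module (git, git-annex, DataLad, tree, and, for the last part, Singularity and the datalad-container extension). As a rough sketch, the following reports which of them appear to be missing on your machine. The executable names and the Python module name `datalad_container` are assumptions about a typical installation, and `missing_requirements` is a hypothetical helper, not part of DataLad.

```python
import importlib.util
import shutil


def missing_requirements(
    tools=("git", "git-annex", "datalad", "tree", "singularity"),
    modules=("datalad_container",),
):
    """Return the names of tools and Python modules that were not found.

    tools   -- executable names looked up on the PATH (assumed names)
    modules -- importable Python module names (assumed names)
    """
    missing = [tool for tool in tools if shutil.which(tool) is None]
    missing += [
        name for name in modules
        if importlib.util.find_spec(name) is None
    ]
    return missing
```

An empty list from `missing_requirements()` suggests everything was found. Any name in the list still needs to be installed, or is installed under a different name than assumed here.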