Towards computational reproducibility when working with very large datasets

Adina Wagner
mas.to/@adswa

Psychoinformatics lab,
Institute of Neuroscience and Medicine, Brain & Behavior (INM-7)
Research Center Jülich



Slides: DOI 10.5281/zenodo.7835784 (Scan the QR code)

Acknowledgements

Co-authors
  • Laura K. Waite*
  • Małgorzata Wierzba*
  • Felix Hoffstaedter
  • Alexander Q. Waite
  • Benjamin Poldrack
  • Simon B. Eickhoff
  • Michael Hanke
DataLad software & ecosystem
  • Psychoinformatics Lab, Research Center Jülich
  • Center for Open Neuroscience, Dartmouth College
  • Joey Hess (git-annex)
  • >100 additional contributors
Funders
Collaborators
www.fz-juelich.de/en/inm/inm-7
https://www.nature.com/articles/s41597-022-01163-2

Cthulhu Merge

List:       linux-kernel
Subject:    Re: [GIT PULL] regulator updates for v3.13-rc1
From:       Linus Torvalds <torvalds () linux-foundation ! org>
Date:       2014-01-21 19:16:57

Christ. When you start doing octopus merges, you don't do it by half
measures, do you?

I just pulled the sound updates from Takashi, and as a result got your
merge commit 2cde51fbd0f3. That one has 66 parents.

That kind of merge either needs to be split up, or gitk needs to be
made better about visualizing it, because it ends up being *so* wide
that the history is hard to read.

I think you'll find that having that many parents also breaks old
versions of git.

Anyway, I'd suggest you try to limit octopus merges to ~15 parents or
less to make the visualization tools not go crazy.  Maybe aim for just
10 or so in most cases.

It's pulled, and it's fine, but there's clearly a balance between
"octopus merges are fine" and "Christ, that's not an octopus, that's a
Cthulhu merge".

    Linus

42k-way octopus merge -- broke GitLab (JuGit)

DataLad contact and more information

Website + Demos http://datalad.org
Documentation http://handbook.datalad.org
Talks and tutorials https://youtube.com/datalad
Development http://github.com/datalad
Support https://matrix.to/#/#datalad:matrix.org
Open data http://datasets.datalad.org
Mastodon @datalad@fosstodon.org
Twitter @datalad

Thank you for your attention!




Women are underrepresented in neuroscience. You can use the
Repository for Women in Neuroscience to find and recommend neuroscientists for
conferences, symposia, or collaborations, and help make neuroscience more open & diverse.