The what and the why and the how

Git workflows in datasets

"I can't continue my work on the project because my colleague is working on it at the moment"
"I have such a good rephrasing of the discussion, but my PI wanted to work on this part of the manuscript for the past two weeks"

"Alright team, I propose everyone reviews the proposal and adds changes and comments, and [poor scientific coordinator] will go through all documents and merge everything!"


$ git config --global init.defaultbranch main
$ datalad create mydataset --initial-branch main
$ git status
On branch main
nothing to commit, working tree clean
$ git branch
  git-annex
* main
My own very first version controlled project:
A branch shares history with its base branch but adds independent new changes
- Transparency
- Cleanliness
- If sandboxing fails, don't merge, and your default branch stays orderly
- Keep different preprocessing in parallel

from main?
computed when the fix has made it into preproc's timeline
require sandboxing and won't lead to disorder can continue on main
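
The sandboxing idea can be tried out in a throwaway repository. The sketch below uses plain Git in a temporary directory; the file names (`README.md`, `code/preproc.sh`), the commit messages, and the user identity are placeholders chosen for illustration, not taken from a real dataset. It creates a `preproc` branch from `main`, lets both branches move on independently, brings a fix from `main` into `preproc`'s timeline, and merges the sandbox back only at the end.

```shell
set -eu

# Throwaway repository in a temporary directory (hypothetical example data).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main      # make sure the first branch is called main
git config user.name "Example User"        # placeholder identity for the commits
git config user.email "user@example.com"
git config commit.gpgsign false

mkdir code
echo "this is a script for processing" > code/preproc.sh
echo "my dataset" > README.md
git add code/preproc.sh README.md
git commit -q -m "initial state"

# Sandbox: preproc shares history with main but adds independent changes.
git checkout -q -b preproc
echo "some parameter changes" >> code/preproc.sh
git commit -q -am "try new preprocessing parameters"

# Work that needs no sandboxing continues on main.
git checkout -q main
echo "fixed a typo" >> README.md
git commit -q -am "fix README"

# Bring the fix from main into preproc's timeline.
git checkout -q preproc
git merge -q --no-edit main

# The experiment worked out, so merge the sandbox back into main.
git checkout -q main
git merge -q --no-edit preproc

# Show the resulting history of both branches.
git log --oneline --graph --all
```

Had the experiment failed, skipping the last merge would have left `main` untouched, which is the point of the sandbox.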
the central sibling you can push your changes

ssh-keygen -t ed25519 -C "your_email@example.com"
(see here for instructions for each OS)

URL right from GitHub
Name the sibling dataset "upstream" as well.
By default, the dataset one clones from is known as "origin" to the local clone.
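
Where the names "upstream" and "origin" come from can be seen with a local bare repository standing in for GitHub. This is a minimal sketch in plain Git; the paths, the file content, and the user identity are made up for illustration, and with GitHub the URL would be the one copied from the repository page.

```shell
set -eu

work=$(mktemp -d)
# A bare repository stands in for the repository on GitHub (hypothetical path).
git init -q --bare "$work/central.git"
git --git-dir="$work/central.git" symbolic-ref HEAD refs/heads/main

# A dataset created locally has no sibling yet: add one and call it "upstream".
git init -q "$work/mydataset"
cd "$work/mydataset"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"        # placeholder identity for the commit
git config user.email "user@example.com"
git config commit.gpgsign false
echo "my dataset" > README.md
git add README.md
git commit -q -m "initial state"
git remote add upstream "$work/central.git"
git push -q upstream main

# A collaborator's clone knows the place it was cloned from as "origin".
git clone -q "$work/central.git" "$work/clone"

creator_remotes=$(git remote)
clone_remotes=$(git -C "$work/clone" remote)
echo "creator: $creator_remotes, clone: $clone_remotes"
```

Both names are only local labels for the same central repository, so each person can choose whichever name fits their own setup.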
they need to create a fork of the repository under their account and clone & push there.
"Before I merge, help me choose which modification to keep"
$ git pull upstream master
From github.com:adswa/mydataset
 * branch            master     -> FETCH_HEAD
Auto-merging code/preproc.sh
CONFLICT (content): Merge conflict in code/preproc.sh
Automatic merge failed; fix conflicts and then commit the result.
$ git status
On branch preproc
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)
Unmerged paths:
  (use "git add <file>..." to mark resolution)
        both modified:   code/preproc.sh
no changes added to commit (use "git add" and/or "git commit -a")
"I'm in a merge conflict!"
How to emergency-abort
What to do next
Which files contain conflicts
"<<<<<<<" followed by a refspec (e.g., HEAD, a branch name, a commit SHA) until "=======" indicates one set of changes.
"=======" until ">>>>>>>" followed by a refspec indicates the other set of changes.
$ git diff
diff --cc code/preproc.sh
index fc3f8e8,14a0a13..0000000
--- a/code/preproc.sh
+++ b/code/preproc.sh
@@@ -1,3 -1,1 +1,7 @@@
++<<<<<<< HEAD
+this is a script for processing
+some parameter changes
+some more parameter tweaks
++=======
+ fixed paths!
++>>>>>>> 9217a4b101159e6b5aab0a548aeb75fb82cca798
This is your most recent change
This is a conflicting change
Remove the "<<<<<<<", ">>>>>>>", and "=======" conflict mark-up afterwards
$ git diff
diff --cc code/preproc.sh
index fc3f8e8,14a0a13..0000000
--- a/code/preproc.sh
+++ b/code/preproc.sh
@@@ -1,3 -1,1 +1,4 @@@
+this is a script for processing
+some parameter changes
+some more parameter tweaks
+fixed paths!
$ git add code/preproc.sh
$ git commit
[preproc 1cab31a] Merge branch 'master' of github.com:adswa/mydataset into preproc
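
The whole conflict cycle can be rehearsed safely in a throwaway repository. The sketch below is an assumption-laden stand-in for the session above: it uses plain Git in a temporary directory, a local `main` branch instead of a remote `master`, and placeholder commit messages and user identity. It provokes a conflict in `code/preproc.sh`, lists the unmerged file, resolves the conflict by writing the wanted content, and concludes the merge.

```shell
set -eu

# Throwaway repository in a temporary directory (hypothetical example data).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"        # placeholder identity for the commits
git config user.email "user@example.com"
git config commit.gpgsign false

mkdir code
echo "this is a script for processing" > code/preproc.sh
git add code/preproc.sh
git commit -q -m "add preprocessing script"

# Two branches change the same place in the same file.
git checkout -q -b preproc
echo "some parameter changes" >> code/preproc.sh
git commit -q -am "tweak parameters"
git checkout -q main
echo "fixed paths!" >> code/preproc.sh
git commit -q -am "fix paths"

# The merge stops with a conflict; "git merge --abort" would undo the attempt here.
git checkout -q preproc
if git merge main >/dev/null 2>&1; then
    merge_result=clean
else
    merge_result=conflict
fi
conflicted=$(git diff --name-only --diff-filter=U)
echo "merge: $merge_result, unmerged: $conflicted"

# Resolve by writing the wanted content without any conflict mark-up,
# then stage the file and conclude the merge with the prepared message.
printf '%s\n' \
    "this is a script for processing" \
    "some parameter changes" \
    "fixed paths!" > code/preproc.sh
git add code/preproc.sh
git commit -q --no-edit
```

In real work the resolution step is done in an editor rather than by overwriting the file; the staging and commit steps stay the same.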