- cross-posted to:
- git@programming.dev
roadrunnertwice.dreamwidth.org
I disagree, hard.
I disagree with the general conclusion - I think it’s very easy to understand*: each repo has a graph of commits. Each commit includes the diff and metadata (like parent commits). There is a difference between your repo seeing the state of another repo (fetch) and copying commits from another repo into your repo (merge; pull is just a combination of fetch and merge). Tags are pointers to specific commits, branches are pointers to specific commits that get updated when you add a child commit to this commit. That’s a rather small set of very clear concepts for such a complex problem.
I also disagree with a lot of the reasoning. Like “If a commit has the same content but a different parent, it’s NOT the same commit” is not an “alien concept”. When I apply the same change to different parents, I end up with different versions. Which would be kinda bad for a Version Control System.
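To make the "same content, different parent" point concrete, here is a minimal sketch. It assumes a made-up commit format (the `commit_id` helper and the `tree-...` labels are hypothetical, not Git's real object layout); the only idea carried over is that the parent ID is part of what gets hashed.

```python
import hashlib
from typing import Optional


def commit_id(tree: str, parent: Optional[str], message: str) -> str:
    """Hash snapshot, parent ID and message into one ID (toy format, not Git's)."""
    payload = f"tree {tree}\nparent {parent or ''}\n\n{message}\n"
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


# Two unrelated starting points.
base_a = commit_id("tree-A", None, "start of history A")
base_b = commit_id("tree-B", None, "start of history B")

# The same resulting snapshot and the same message, on two different parents.
on_a = commit_id("tree-fixed", base_a, "fix typo")
on_b = commit_id("tree-fixed", base_b, "fix typo")

# Repeating the first one exactly gives the same ID again.
same_again = commit_id("tree-fixed", base_a, "fix typo")

print(on_a == on_b)        # different parents, so different IDs
print(on_a == same_again)  # identical inputs, so identical ID
```

In this toy model the ID is a fingerprint of "this snapshot, reached from that history", which is why the two commits cannot be treated as the same version.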
“This in turn means that you need to be comfortable and fluent in a branching many-worlds cosmology” - yes, if you need to handle different versions, you need to switch between them. That’s the complexity of what you’re doing, not the tool. And I like that Git is not trying to hide things that I need to know to understand what’s happening.
“distinguish between changes and snapshots that have the same intent and content but which are completely non-interchangeable and imply entirely different flows of historical events” How do you even end up in a situation like that? Anyway, sounds like you should be able to merge them without conflicts, if they are in fact completely interchangeable?
“The natural mental model is that names denote global identity.” Why should another repo care, which names I use? How would you even synchronize naming across different repos without adding complexity, e.g. if two devs created a branch “experimental” or “playground”. Why on earth should they be treated as the same branch?
“Git uses the cached remote content, but that’s likely out of date” I actually agree that this can lead to some errors and confusion. But automation exists - you can just fetch every x minutes.
“Branches aren’t quite branches, they’re more like little bookmark go-karts.” A dev describing what basically is just a pointer in this way leads to the suspicion that it might not be Git’s mental model that is alien.
“My favorite version of this is when the novice has followed someone’s dodgy advice to set pull.rebase = true” Maybe don’t do stupid stuff you don’t understand? We know what fetch is, we know what merge is. Pull is basically fetch & merge.
““Pull” presents the illusion that you can just ask Git to make everything okay for you” Just… what? The rest of the sentence doesn’t really fix this error in expectations.
* except the CLI of course, but I can use GUI-tools for most tasks
I also disagree with a lot of the reasoning. Like “If a commit has the same content but a different parent, it’s NOT the same commit” is not an “alien concept”. When I apply the same change to different parents, I end up with different versions. Which would be kinda bad for a Version Control System.
It’s also intuitive, it’s how frames in video compression work, too. And in fact if you have two kids that look virtually identical but are from different families, they are very clearly not the same person. Context matters, most people more-than-intuitively understand that.
“Git uses the cached remote content, but that’s likely out of date” I actually agree that this can lead to some errors and confusion. But automation exists - you can just fetch every x minutes.
Yeah and nevermind that virtually any tool does that for you. So this is a long-solved problem.
I’d expect a developer to understand that. A stack trace works the exact same way.
Hot take: Git is hard for people who do not know how to read documentation.
The Git book is very easy to read and only takes a couple of hours to read the most significant chapters. That’s how I learnt it myself.
Git is meant for developers, i.e. people who are supposed to be good at looking up online how stuff works.
developers, i.e. people who are supposed to be good at looking up online how stuff works.
How I wish this were true.
Each commit includes the diff and metadata (like parent commits).
Commits don’t store diffs, so you’re wrong from the start here.
Hence why people say “git is hard”
Yeah, you’re right, technically it’s not a “diff”, it’s the changed files.
I don’t think this technical detail has any consequences for the general mental model of Git though - as evidenced by the fact that I have been using Git for years without knowing this detail, and without any problems.
It’s all the files. Content-addressable storage means that they might not take up any more space. Smart checkout means they might not require disk operations. But it’s the whole tree.
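A rough sketch of why "the whole tree" does not have to mean "twice the space". Everything here is a toy: `ObjectStore` and `snapshot` are hypothetical names, and real Git also hashes a small header along with the file content, which this does not model.

```python
import hashlib


class ObjectStore:
    """Toy content-addressable store: the key is the hash of the content."""

    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        self.objects[key] = data  # storing identical content twice is a no-op
        return key


def snapshot(store: ObjectStore, files: dict) -> dict:
    """Record every file of the working tree, changed or not."""
    return {path: store.put(content) for path, content in files.items()}


store = ObjectStore()
first = snapshot(store, {"README": b"hello\n", "main.py": b"print(1)\n"})
second = snapshot(store, {"README": b"hello\n", "main.py": b"print(2)\n"})

print(len(first), len(second))  # both snapshots list both files
print(len(store.objects))       # but only three distinct contents are stored
```

Both snapshots are complete, yet the unchanged `README` is stored once, because both snapshots point at the same key.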
One problem, I think, is that git names are kinda bad. A git branch is just a pointer to a commit, it really doesn’t correspond to what we’d naturally think of as a branch in the context of a physical tree or even in a graph.
That’s a bit problematic for explaining git to programming newbies, because grokking pointers is famously one of the stumbling blocks people have, along with recursion. Front-end web developers who never learned C might not really grok pointers due to never really having to deal with them much.
Some other version control systems like mercurial have both a branch in a more intuitive sense (commits have a branch as a bit of metadata), as well as pointers to commits (mercurial, for example, calls them bookmarks).
As an aside, there’s a few version control systems like darcs where instead of the first-class concept being snapshots, it’s diffs. There’s no separate cherrypick command in darcs, it’s just one way you can use the regular commands.
A git branch is just a pointer to a commit, it really doesn’t correspond to what we’d naturally think of as a branch in the context of a physical tree or even in a graph.
But as the article points out, a commit includes all of its ancestors. Therefore pointing to a commit effectively is equivalent to a branch in the context of a tree.
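As a small illustration of "pointing to a commit gives you the whole line of history", here is a toy commit graph. The commit names and the `history` helper are made up for the example; it only shows that following parent links from a single pointer recovers everything behind it.

```python
# Toy commit graph: each commit lists its parents.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["b"],  # tip of "main"
    "d": ["b"],  # tip of "feature"
}

# A branch is nothing more than a name for one commit.
branches = {"main": "c", "feature": "d"}


def history(tip: str) -> list:
    """Follow parent links from a tip and return every commit reached."""
    seen, order, stack = set(), [], [tip]
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        order.append(cid)
        stack.extend(parents[cid])
    return order


print(history(branches["main"]))
print(history(branches["feature"]))
```

Note that the shared commits `a` and `b` show up in both histories, which is also why the pointer alone cannot tell you on which branch they were originally made.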
Some other version control systems like mercurial have both a branch in a more intuitive sense (commits have a branch as a bit of metadata), as well as pointers to commits (mercurial, for example, calls them bookmarks).
I mean, git has bookmarks too, they’re called tags.
What happens after you merge a feature branch into main and delete it? What happens to the branch?
Afterwards, what git commands can you run to see what commits were made as part of the feature branch and which were previously on main?
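One way to think about that question, sketched on a toy graph (commit names and the `reachable` helper are hypothetical): if the merge produced a real merge commit, its two parents still mark the two sides, even after the branch name is gone. As far as I know the real-world equivalent is a range such as `git log <merge>^1..<merge>^2`, but treat that as a pointer to look up rather than a recipe.

```python
# Toy graph after merging "feature" into "main" and deleting the branch name.
parents = {
    "a": [],
    "b": ["a"],
    "e": ["b"],       # made on main while the feature was in progress
    "c": ["b"],       # made on the feature branch
    "d": ["c"],       # made on the feature branch
    "m": ["e", "d"],  # merge commit: first parent = main side, second = feature side
}


def reachable(tip: str) -> set:
    """All commits reachable from tip by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(parents[cid])
    return seen


main_side, feature_side = parents["m"]
only_feature = reachable(feature_side) - reachable(main_side)
only_main = reachable(main_side) - reachable(feature_side)

print(sorted(only_feature))
print(sorted(only_main))
```

A fast-forward merge creates no merge commit, so in that case this sketch has nothing to start from, which is more or less the point being made above.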
Mercurial bookmarks correspond to git branches, while mercurial tags correspond to git tags.
Each commit includes the diff
It doesn’t. ☺
I totally disagree. Git is not hard. The way people learn git is hard. Most developers learn a couple of commands and believe they know git, but they don’t. Most teachers teach to use those commands and some more advanced commands, but this does not help to understand git. Learning commands sucks. It is like a cargo cult: you just do something similar to what others do and expect the same result, but you don’t understand how it works and why sometimes it does not do what you expect.
To understand git, you don’t need to learn commands. Commands are simple and you can always consult a man page to know how to do something if you understand how it should work. You only need to learn core concepts first, but nobody does. The reference git book is “Pro Git” and it perfectly explains how git works, but you need to start reading from the last chapter, 10 Git Internals. The concepts described there are very simple, but nobody starts learning git with them, almost nobody teaches them in the beginning of classes. That’s why git seems so hard.
Ahhhhh, that’s why! I should’ve known to read from the end not beginning lmao. Jokes aside, thanks for the advice I’ll try it out :)
Authors should write it in the opposite order.
I agree, the teaching is wrong. I always teach it visually. That seems to do the trick
Came here to say the same thing. The git book is an afternoon’s reading. It’s well worth the time - even if you think you know git.
People complain about the UX of the cli tool (perhaps rightly) but it’s honestly little different from the rest of the unix cli experience: ad hoc, arbitrary, inconsistent.
What’s important is a solid mental model and the vocabulary of primitive and compound operations built with it. How you spell it in the cli is just a thing you learn as you go.
In this thread - tons of smart people thinking that the tools we use to replace “make a backup of a file on a server somewhere” should require entire reference books, as if that’s normal.
Saying “it’s a graph of commits” makes no sense to a layperson. Hell the word “diff” makes no sense. Requiring training to get something right is acceptable, but “using a VCS” is a tiny tiny part of the job, not the whole job. I mean, even most of the commenters on this thread are getting small things wrong (and some are handwaving it away saying “oh that small detail doesn’t matter”).
Look, git is hard. It’s learnable, but it’s hard. The concepts are medium hard to understand, and the way it does things is unique and designed for distributed, asynchronous work - which are usually hard problems to solve.
While I agree 100% with your main point,
"it’s a graph of commits” makes no sense to a layperson
You’re probably putting your standards too low. Every coder should know what a graph is, the basic concept at least. If you can understand fizzbuzz you can understand graphs too.
the word “diff” makes no sense
diff is short for difference. And that basically explains it
Saying “it’s a graph of commits” makes no sense to a layperson.
Sure, but git is aimed at programmers. Who should have learned graph theory in university. It was part of the very first course I had as an undergraduate many years ago.
Git is definitely hard though for almost all the reasons in the article, perhaps other reasons too. But not understanding what a DAG is shouldn’t be one of them, for the intended target audience.
My favorite version of this is when the novice has followed someone’s dodgy advice to set pull.rebase = true, then they pull a shared branch that they’re collaborating on, into which their coworker has just merged origin/main. Instant Sorcerer’s Apprentice-scale chaos!
Why are you doing that? Don’t do that.
And anyway… it’s trivial to fix. If you still have the commit ID of the tip of the branch before the pull, go back to that. If not, look it up in the reflog. If that’s too much of a hassle, list the commits you only have locally, stash any changes, reset to the origin/the_branch and cherry-pick your commits again and/or apply the stash.
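The "look it up in the reflog" step can be pictured with a toy model. The `Branch` class below is hypothetical and much simpler than the real reflog; the assumption it illustrates is only that every position a branch tip has had gets recorded, so an earlier position can be found again.

```python
class Branch:
    """Toy branch that remembers every commit its tip has pointed at."""

    def __init__(self, name: str, tip: str):
        self.name = name
        self.tip = tip
        self.reflog = [tip]  # oldest entry first

    def move(self, new_tip: str) -> None:
        self.tip = new_tip
        self.reflog.append(new_tip)

    def previous_tip(self, steps_back: int = 1) -> str:
        return self.reflog[-1 - steps_back]


work = Branch("the_branch", "c3")
work.move("c4")                 # a normal local commit
before_pull = work.tip
work.move("x9")                 # a pull that made a mess
work.move(work.previous_tip())  # look up where the tip was before, go back there

print(work.tip == before_pull)
print(work.reflog)
```

In this model going back is itself just another recorded move, so even the recovery can be undone.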
I really embraced git once I understood that whatever I did locally, it’s most of the time relatively easy to recover from cock-ups. And it’s really difficult to lose work from the moment you’ve added it to a (local) commit or stashed it.
I do understand that git is daunting however, and there is plenty where I think the defaults are bad. Too often I’ve seen merge commits where someone merged the remote of a branch into the local copy of the same branch, or even this on main. And once this stuff gets pushed it’s nigh impossible to go back.
In my, rather short (5ish years professionally), career I needed to rebase once. And it was due to some releasing fuck up, a branch had to be released earlier and hence needed to be rebased on another feature branch scheduled for release.
Otherwise, fetch » pull » merge, all day, every day.
I rebase almost daily. I (almost) never merge the main branch into a feature branch, always rebase. I don’t see the point of polluting the history with this commit (assuming I’m the only dev on this branch). I also almost always do an interactive rebase before actually pushing a branch for the first time, in order to clean up commits. I mostly recreate my commits from scratch before pushing, but even then I sometimes forget to include a change in a commit I’ve just made so I then do an interactive rebase to fold fixup commits into the commits they should’ve been in.
I like merging for actually adding commits from a feature branch to main (or release or …)
A much simpler solution: don’t use the git CLI. And in my professional life I don’t know a single person who does. The shortcomings of git have long been abstracted away and as problematic as the CLI is, it’s now just an internal library of the tools we actually use.
Also the git pull criticism is weird. Yeah, it exists on paper, and yeah, once in a blue moon there’s a conflict after a pull with rebase, but… this doesn’t even begin to dent the oodles of time saved from just doing Ctrl+T in IntelliJ and being up-to-date with no further input. Why waste 20 minutes 40x-100x a day instead of 45 minutes once every 3-6 months? Especially this case:
My favorite version of this is when the novice has followed someone’s dodgy advice to set pull.rebase = true, then they pull a shared branch that they’re collaborating on, into which their coworker has just merged origin/main. Instant Sorcerer’s Apprentice-scale chaos!
I’m sorry, but are you collaborating or competing on a shared branch? If it is a collaborative effort, maybe just talk about it? And in fact, unless the other person is an utter asshole, they’ll have done so before merging in the new changes from main. That’s not even to mention that in 99,95% of cases or so, that exact scenario is perfectly fine and gets resolved without any issues whatsoever and no user input necessary. Bringing us once again to the situation where you save a moderate amount of time multiple times a day by always just pulling.
(edit)
Don’t get me wrong, all of this criticism is of course valid. But it feels like a very arcane case, as no project should be able to produce the issues frequently unless there’s some underlying problem in either the mode of collaboration or the structure of the project in the first place, and the usage of git is long abstracted away and the tools handle virtually any and all edge cases, including making merging far smarter than if you were to use the CLI.
I’ve used the git cli exclusively for more than a decade, professionally. I guess it varies wildly by team, but CLIs are the only unambiguous way to communicate instructions, both for humans and computers. That being said, I still don’t mess around with rebase for anything, and I do use a gui diff tool for merge conflict resolution. Practically everything you need to do with git can be done with like 10 commands (I’m actually being generous here, including reset, stash, and tag).
That being said, I still don’t mess around with rebase for anything
Rebasing has a worse reputation than it deserves. It’s something you just get used to - just like using git itself was when you started. There are a couple of strategies to make it easier and less anxiety inducing:
- Before starting a rebase of a long branch, create a new branch. That way in case you seriously mess up, you can just delete the rebasing branch and rename the old branch to restore everything (you can usually get away with git rebase --abort. This is just added safety). Even in case of a successful rebasing, you can just keep the backup branch around, as a faithful record of actual development history.
- Do only one (or max 2) operations in a single rebase. Do this over multiple rebases to get what you want.
After a while, rebasing becomes as simple as committing or merging.
Rebasing and merge conflicts are the top ways that git can turn into a mess. I know that rebasing could (in some circumstances) make merge conflicts less of an issue, but I just mostly think the value of “commit grooming” is overrated. I don’t want to argue about this, if you like doing it, go ahead.
I had to check and make sure I didn’t type the comment above because it sounds exactly like me.
All UIs do things slightly differently, the CLI is always exactly the same… Everywhere. UI for non trivial conflict resolution? Definitely. For everything else, CLI.
And, I’m also reticent to use rebase unless I have to. Gimme that good ole FF :)
UI for non trivial conflict resolution? Definitely.
I don’t know about that… Never found they help that much in conflict resolution. They give you some nice buttons for accept theirs or accept ours but really I find more often than not those are what breaks code as you often want a mash-up of both sides - which needs to be manually done even in UIs.
Otherwise it is just find the marked sections in the file, and make it look like what you want it to after the merge/rebase. And that is the hardest part - figuring out what it should look like. Which is made easier if you only ever have small commits and merge back to master frequently minimizing the amount your branches drift from each other.
don’t use the git cli. In my professional life I don’t know a single person who does
I do, I find it much simpler than using the GUIs
I mostly agree. The caveat to this is I’ve had to learn CLI for programmatic use cases like automation.
I honestly don’t get why folks dislike rebase. I use it constantly, especially to squash commits so that my pull requests are a single commit that can be reverted easily.
It’s also kinda annoying to have a history full of “merge” commits polluting the commit messages and an entwined mix of parallel branches crossing each other at every merge all over the timeline. Rebasing makes things so much cleaner, keeping the branches separate until a proper merge is needed once the branch is ready.
I use rebase when I’m working in a dev branch. If someone else has pushed changes to the main branch, rebasing the dev branch on top of main is a way to do the hard work of resolving merge conflicts up front. Then I can rerun tests and make sure everything still works with changes from the main branch. And finally, when it is time to merge my dev branch to main, it’s a simple fast-forward.
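Here is a toy model of that workflow, under loud assumptions: commits are just `(parent, change label)` pairs, the helpers `commit`, `rebase` and `can_fast_forward` are made up for this sketch, and conflicts are not modelled at all. It only shows why main can be fast-forwarded after the rebase but not before.

```python
import hashlib

commits = {}  # commit id -> (parent id or None, change label)


def commit(parent, change):
    """Create a commit whose ID depends on its parent and its change."""
    cid = hashlib.sha1(f"{parent}|{change}".encode("utf-8")).hexdigest()[:8]
    commits[cid] = (parent, change)
    return cid


def ancestors(tip):
    """The tip itself plus everything behind it, newest first."""
    out = []
    while tip is not None:
        out.append(tip)
        tip = commits[tip][0]
    return out


def rebase(tip, old_base, new_base):
    """Replay the changes between old_base and tip on top of new_base."""
    changes = []
    while tip != old_base:
        parent, change = commits[tip]
        changes.append(change)
        tip = parent
    new_tip = new_base
    for change in reversed(changes):  # oldest change first
        new_tip = commit(new_tip, change)
    return new_tip


def can_fast_forward(target_tip, source_tip):
    """True if target can move to source without creating a merge commit."""
    return target_tip in ancestors(source_tip)


root = commit(None, "init")
main = commit(root, "main: someone else's change")
dev = commit(commit(root, "dev: step 1"), "dev: step 2")

ff_before = can_fast_forward(main, dev)
rebased = rebase(dev, root, main)
ff_after = can_fast_forward(main, rebased)

print(ff_before, ff_after)
print(rebased != dev)  # the replayed commits are new commits with new IDs
```

The last line is also the reason for the usual warning about rebasing branches other people have already pulled: the old commits are not moved, they are replaced by new ones.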
Because rebase is fraught with peril, if you also push rebased branches upstream and someone else works off that branch.
If you stick to the rule of only using rebase on local branches that have never been pushed upstream, it’s an awesome tool. If you don’t, you’re eventually going to cause someone to have a bad day.
Yeah, basically anything that rewrites already pushed history and is then (force-)pushed is bound to create problems (unless it’s a solo dev only ever coding on a single device, who uses the remote repo as a mere backup solution).
Yep. I work exclusively in forks, and all my work is done on my machine, rebased, squashed and then pushed to my fork for a PR. No commits from main are ever touched in my rebase. It’s such a clean workflow for me.
git gets easier once you get the basic idea that branches are homeomorphic endofunctors mapping submanifolds of a Hilbert space.
(source)
Edit: but to actually have content in this comment, I’m not sure the mental model is the problem. It’s not that alien that a good explanation wouldn’t help, but it took a long time for git to start paying any sort of attention to “human readability.” It was and still is in a way “aggressively technical” and often felt like it purposefully wanted to keep anybody but the most UNIX-bearded kernel hackers from using it. The man pages were rarely helpful unless you already understood git, the options were very unintuitively named, etc etc. And considering Linus’ personality, I’m not exactly surprised.
With a little bit of more thought on how to make it more usable right from the start, I’m not sure it’d have such a reputation as it has now. The reason why I think this endofunctor joke is so funny is that that sort of explanation to “simplify” git wouldn’t have been at all out of place – followed by the UNIX beards scoffing at the poor lusers who didn’t understand their obviously clear description of what git branches are.
Reminds me of the old joke that monads are easy to understand, you just have to realize monads are just monoids in the category of endofunctors.
I might be suffering from Stockholm syndrome here, but my preferred ways of working with git are the cli and the fugitive vim plugin which is a fairly thin wrapper around the cli. It does take a middle ground approach on hiding the magic and forcing you to learn the magic which I suppose can be confusing for beginners when you work collaboratively and something happens that forces you to go beyond pull/add/commit/push
In my (admittedly limited) experience, mercurial is much more intuitive than git. I really dislike that git branches are only tags on the heads and completely ephemeral. It favours creating a single clean history instead of preserving what actually happened.
I only stick with these:
- pull
- add
- commit
- push
Easy.
Merge is love merge is life, get the hell out of here with that rebase witchcraft.
LazyGit is a thing ❤️🙌
Excellent article
Personally it was when I was trying to commit and I got stuck in an authentication loop of git asking for my username or email (even used --global) and it would not work or remember no matter what I tried (was recommended to reinstall mint, yeah no lmao not again).
Ended up installing the unofficial GUI and I’m MUCH happier but I tell ya if you bork something in Mint it’s really hard to fix it if you’re not a CLI wizard.
Git GUI wise I can do all the basic stuff I need and if something breaks then I use the CLI because there’s more documentation on it