What’s changed (Understanding diff)

(Q) How can I find the files that I’ve changed in this branch?

(A) I’ve set up an alias called git changed:

git config --global alias.changed '!git diff --name-only $(git merge-base HEAD $(git symbolic-ref refs/remotes/origin/HEAD | sed "s@^refs/remotes/origin/@@"))'

This command does:

  1. Gets the name of the branch your branch is off of (git symbolic-ref)
  2. Gets the commit used when you created your branch (git merge-base)
  3. Runs the git command to list just the files that have changed (git diff --name-only $COMMIT_WHERE_BRANCH_WAS_CREATED)
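
The three steps can be tried one at a time in a throwaway sandbox. This is a minimal sketch: the upstream and work directories, the feature/one branch and the file names are all made up for illustration, and it assumes the clone has an origin/HEAD (which git clone normally sets).

```shell
set -e
# Throwaway sandbox: an "upstream" repo and a clone of it (all names are made up).
sandbox=$(mktemp -d)
git init -q "$sandbox/upstream"
cd "$sandbox/upstream"
git symbolic-ref HEAD refs/heads/main      # make sure the first branch is called main
git config user.name "Example User"
git config user.email "user@example.com"
echo base > README
git add README
git commit -q -m "base"

git clone -q "$sandbox/upstream" "$sandbox/work"
cd "$sandbox/work"
git config user.name "Example User"
git config user.email "user@example.com"
git checkout -q -b feature/one
echo change > new.txt
git add new.txt
git commit -q -m "add new.txt"

# Step 1: the branch that origin/HEAD points at
default_branch=$(git symbolic-ref refs/remotes/origin/HEAD | sed "s@^refs/remotes/origin/@@")
# Step 2: the commit that HEAD and that branch have in common
fork_point=$(git merge-base HEAD "$default_branch")
# Step 3: the files that differ between that commit and the working tree
changed_files=$(git diff --name-only "$fork_point")
echo "default branch: $default_branch"
echo "changed files:  $changed_files"
```

Splitting the alias up like this makes it easier to see which step is at fault if git changed prints nothing or fails.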

(Q) How can I see all of the source code changes made on this branch?

(A) I use git changed2 which shows me the diff on each file.

(Q) How can I get a list of all of the commits that were made on this branch?

(A) Use the following

git log --oneline $(git-parent main)..HEAD

(Q) How can I get the commit where this branch started at?

(A) I wrote the script git-parent to give me that information. It requires you to specify the branch that you originally branched from.

For example, if I branched from main and I was on branch feature/one then the command would be

git co feature/one
git-parent main

To see the number of commits that have been made on this branch since it branched off of the main branch I would use:

git co feature/one
git rev-list --count $(git-parent main)..HEAD
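
The git-parent script itself is not shown here, so the following is only a guess at what such a helper might do. The git_parent function is a hypothetical stand-in built on git merge-base, not the original script, and the sandbox branch and commit names are made up.

```shell
set -e
# Hypothetical stand-in for git-parent: prints the commit shared by the
# given branch (default: main) and the current HEAD.
git_parent() {
    git merge-base "${1:-main}" HEAD
}

# Throwaway sandbox with one commit on main and two on feature/one.
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "A"
git checkout -q -b feature/one
git commit -q --allow-empty -m "C"
git commit -q --allow-empty -m "E"

start=$(git_parent main)
commit_count=$(git rev-list --count "$start"..HEAD)
echo "branch started at:  $start"
echo "commits since then: $commit_count"
git log --oneline "$start"..HEAD
```

If main has since been merged into the branch, merge-base returns the most recent common commit rather than the original branch point, so treat the result as an approximation in that case.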

(Q) How can I view all of the actual source code changes made to a file?

(A) Using git log -p shows every commit that modified somefile and the actual source code changes made in that commit.

git log -p -- somefile

If you want to see the diff for ALL of the files in each commit that changed somefile then you can add the --full-diff option, like this:

git log -p --full-diff -- somefile

This shows every change made in each commit that touched somefile. The -- somefile syntax restricts the log to commits where the specified file (path) was changed.

Git resolves somefile relative to your current directory (not the repository root), so the examples here assume you run them from the top of the repository. A more realistic example might be

git log -p --full-diff -- src/main/java/com/example/play/hello.java

Here src sits at the root of the repository you are working on, which is also the directory the command is run from. If you specify a directory then git will filter for any commit that changed ANY file within that directory (or its sub-directories).

git log -p -- src/main/resources

So this will show the differences for any file that was changed under src/main/resources.

The -p (patch) option causes the diff results to be shown in patch format (aka diff format).

With --full-diff, the diffs will include ALL files changed in that commit; without it, only the files matching the path are shown. Each diff compares the commit being shown with the previous version of that file (aka the parent commit).
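
To see the effect of --full-diff, here is a small sandbox sketch (the file names a.txt and b.txt are made up). One commit touches two files, and the script counts how many per-file diffs git log prints when the log is limited to a.txt.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
echo one > a.txt
echo one > b.txt
git add a.txt b.txt
git commit -q -m "add a.txt and b.txt"

# --format= suppresses the commit header so only the patches remain;
# each per-file diff starts with a "diff --git" line.
default_count=$(git log -p --format= -- a.txt | grep -c '^diff --git')
full_count=$(git log -p --full-diff --format= -- a.txt | grep -c '^diff --git')
echo "without --full-diff: $default_count file(s)"
echo "with --full-diff:    $full_count file(s)"
```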

Diff Filters - find only commits where a file was added or deleted

The --diff-filter option filters commits to only show those where files were added (A) or deleted (D). You can include multiple filters in one --diff-filter option as shown.

git log --diff-filter=AD --name-only -- somefile

Common filters:

  • A = Added
  • C = Copied
  • D = Deleted
  • M = Modified
  • R = Renamed
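
A quick way to check the filter is a sandbox where somefile is added, modified and then deleted. In this sketch (commit messages are made up) only the add and delete commits should survive the AD filter.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
echo v1 > somefile
git add somefile
git commit -q -m "add somefile"
echo v2 > somefile
git commit -q -a -m "modify somefile"
git rm -q somefile
git commit -q -m "delete somefile"

# All commits that touched the file, newest first
all_commits=$(git log --format=%s -- somefile)
# Only the commits that added (A) or deleted (D) it
added_or_deleted=$(git log --diff-filter=AD --format=%s -- somefile)
echo "$added_or_deleted"
```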

Changing the output of the diff

The git log command supports a --format option which allows you to specify the format for log records. For example,

git log --format="Commit: %h by %an on %ad"
# Sample output
Commit: abc123 by Alice on Mon Feb 19 12:00:00 2024
Commit: def456 by Bob on Sun Feb 18 15:30:00 2024

See https://git-scm.com/docs/git-log#_pretty_formats. NOTE: --pretty and --format are functionally the same! The git log documentation uses the term PRETTY FORMATS, so that is the right link.

(Q) How can I get a single file from another branch?

(A) As follows

FILE=the/path/to/file.txt
git show TheBranch:$FILE > $FILE.tmp
diff $FILE.tmp $FILE 

A practical example of using this: another developer committed to several repos that I’m working on and updated the release versions of those packages. I needed to update my application to use the release versions that are now committed and in the master branch.

To do this, I wrote the following script:

#!/bin/sh
# FILE: mygit-get-file
# Usage: mygit-get-file REPO/Some/Path/to/a/file.txt [...]
LIMIT_LINES=9999999
STARTING_DIR=$(pwd)
for arg in "$@"
do
    cd "$STARTING_DIR" || exit 1
    echo "FILE: $arg"
    # Given a filename that starts with the repo directory, e.g. REPO/Some/Path/to/a/file.txt
    # reponame=REPO
    # filepath=Some/Path/to/a/file.txt
    reponame=${arg%%/*}
    filepath=${arg#*/}
    cd "$reponame" || continue
    # Use main as the primary branch if origin has one, otherwise default to master for historical reasons
    if git show-ref --verify --quiet refs/remotes/origin/main
    then
       MAIN_BRANCH_NAME=main
    else
       MAIN_BRANCH_NAME=master
    fi
    echo "cd $reponame; git fetch; git show origin/$MAIN_BRANCH_NAME:$filepath | head -n $LIMIT_LINES"
    git fetch; git show "origin/$MAIN_BRANCH_NAME:$filepath" | head -n "$LIMIT_LINES"
    echo "------------------------------------------------------------"
done

(Q) How can I see a file as it was at an earlier commit?

(A) As follows

git show next~10:Documentation/README

This shows the contents of the file Documentation/README as they were 10 commits before the tip of the branch next. See https://git-scm.com/docs/git-show.
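
The same revision:path syntax works with any commit reference. A minimal sandbox sketch, with made-up file contents, that reads the previous version of a file without touching the working tree:

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
mkdir Documentation
echo "first version" > Documentation/README
git add Documentation/README
git commit -q -m "first"
echo "second version" > Documentation/README
git commit -q -a -m "second"

# HEAD~1:path means "the file at this path as of one commit before HEAD"
old_contents=$(git show HEAD~1:Documentation/README)
current_contents=$(cat Documentation/README)
echo "one commit ago: $old_contents"
echo "now:            $current_contents"
```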

Deleting branch, tag, commit

(Q) How can I delete a tag both locally and remotely?

(A) As follows

git tag -d 1.6.3
git push --delete origin 1.6.3

(Q) How can I delete a branch both locally and remotely?

(A) A similar syntax is used to delete a local/remote branch, but git may refuse to delete a branch that has unmerged changes.

# git tag tbd/branch1 feature/branch1  # Optionally tag the branch if you might need to get back to it
git branch -d feature/branch1
git push --delete origin feature/branch1

With -d, Git will only delete the branch if it has already been fully merged into its upstream branch (or into the current branch, if it has no upstream).

NOTE: If the -d option doesn’t work because the branch has not been fully merged, or for some other reason, you can use the -D option to force the deletion of the branch.

NOTE2: To get back to a branch that you deleted, you can use

git checkout -b feature/branch1 tbd/branch1
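
The whole local cycle (tag, refused delete, forced delete, restore) can be rehearsed in a sandbox. This sketch has no remote, so the git push --delete step is left out; the branch and tag names follow the example above.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "base"
git checkout -q -b feature/branch1
git commit -q --allow-empty -m "unmerged work"
tip=$(git rev-parse HEAD)
git tag tbd/branch1 feature/branch1     # safety net: remember where the branch was
git checkout -q main

# -d refuses here because feature/branch1 has a commit that main lacks
if git branch -d feature/branch1 2>/dev/null; then
    safe_delete=allowed
else
    safe_delete=refused
fi
git branch -q -D feature/branch1        # force the deletion

# Recreate the branch from the tag
git checkout -q -b feature/branch1 tbd/branch1
restored=$(git rev-parse HEAD)
echo "git branch -d was $safe_delete; restored tip matches: $([ "$restored" = "$tip" ] && echo yes || echo no)"
```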

Branch Hygiene

Periodically, you’ll want or need to clean up (delete) your local branches.

(Q) How can I get the pom version out of a pom.xml file to use it when creating a git tag for a release?

(A) As follows

MVN_VER=$(mvn help:evaluate -Dexpression=project.version -q -DforceStdout)

Git tags

Git has two kinds of tags: annotated and unannotated (also called lightweight). To see which is which, run the command

git for-each-ref refs/tags --format="%(refname:short) -> %(objecttype) (%(*objecttype)) %(objectname:short) %(authordate:iso8601-strict) %(*authorname)"
# Sample output (abbreviated)
25.1 -> tag (commit)
tags/jira-123-wip -> commit

# Things you can do with tags
git show jira-123-wip
git co jira-123-wip


# Show the object type of a tag: "tag" for an annotated tag, "commit" for an unannotated one
git cat-file -t 25.1

So what’s the difference?

  1. An annotated tag is a Git object in its own right, so when you ask a question about the tag you can ask about the tag object itself or about the object that the tag points to.
    • In the --format argument, a star, as in %(*authorname), means "the object this tag points to", so those fields are only filled in for annotated tags.
  2. An unannotated tag is simply an alias for a hash, aka a commit. It is also called a reference.

What is quite confusing about git tags is how some git commands work off of git objects and others work off of the pointers. An analogy (that is not perfect) is a symbolic link in UNIX. When you issue the ls (list files) command for a directory that contains a symbolic link, do you want to see the metadata about the symbolic link or the metadata (e.g., file creation date) of the file the symbolic link points to? Well, in most cases you want to operate on the file the symbolic link points to, and that is how many git commands work. So in most cases you can treat a tag as a hash because that’s how it’s most often used.

To create an annotated tag you can do

git tag -a v1.0.0 -m "Release version 1.0.0"
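
The difference between the two kinds shows up clearly with git cat-file -t. A minimal sandbox sketch, reusing the tag names from above:

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "release commit"
git tag jira-123-wip                            # unannotated tag
git tag -a v1.0.0 -m "Release version 1.0.0"    # annotated tag

light_type=$(git cat-file -t jira-123-wip)      # the name resolves straight to the commit
annotated_type=$(git cat-file -t v1.0.0)        # the tag is an object of its own
peeled_type=$(git cat-file -t "v1.0.0^{commit}")  # follow the tag to the commit it points to
echo "jira-123-wip     -> $light_type"
echo "v1.0.0           -> $annotated_type"
echo "v1.0.0^{commit}  -> $peeled_type"
```

The ^{commit} suffix is the explicit way to say "the commit behind this tag", which is what most commands do implicitly.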

RefLogs

You can think of reflogs as a log of every change Git makes to where HEAD and your branches point. Conceptually, it can be thought of as Git’s “undo history”.

Reflogs are local only and expire (usually after 90 days). You can configure this:

git config --global gc.reflogExpire "30 days"
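
As an illustration of the "undo history" idea, the sketch below throws away a commit with git reset --hard and then gets it back through the reflog. It runs in a throwaway sandbox and the commit messages are made up.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
lost=$(git rev-parse HEAD)

git reset -q --hard HEAD~1            # "second" is no longer on the branch
after_reset=$(git log -1 --format=%s)

git reflog                            # lists the positions HEAD has been at
git reset -q --hard "HEAD@{1}"        # HEAD@{1} = where HEAD was before the last move
recovered=$(git rev-parse HEAD)
echo "after reset: $after_reset, recovered: $(git log -1 --format=%s)"
```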

Rebasing

In git, the term rebase is used to mean: make all my changes appear as though they were made on top of that branch.

main---A---B---------H--I    # main branch
            \
             ---C--E--F---J  # mybranch

So in this example, we want mybranch’s changes to appear as though they were made after commit I on the main branch. To do this, use the following command

git co mybranch
git rebase main

Upon successful completion, the graph looks like this (the replayed commits C, E, F and J get new hashes, even though the diagram keeps their letters)

main---A---B---------H--I    # main branch
                         \
                          ---C--E--F---J  # mybranch

Git’s graphing utilities will often simplify this as a single line (even though there are technically two branches (logical graphs) involved)

main---A---B---------H--I---C--E--F------J  
                        ^                ^
                        Head for main    Head for mybranch
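
A scaled-down version of the diagrams (one commit on mybranch instead of four, made-up file names) can be run in a sandbox to confirm what the rebase changes: the branch now starts at the tip of main, and its commit has a new hash.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo b > main.txt
git add main.txt
git commit -q -m "B"
git checkout -q -b mybranch
echo c > feature.txt
git add feature.txt
git commit -q -m "C"
git checkout -q main
echo i >> main.txt
git commit -q -a -m "I"
old_tip=$(git rev-parse mybranch)

git checkout -q mybranch
git rebase -q main

main_tip=$(git rev-parse main)
base_after=$(git merge-base main mybranch)     # where mybranch now leaves main
new_tip=$(git rev-parse mybranch)
parent_subject=$(git log -1 --format=%s HEAD~1)  # the commit underneath the replayed C
echo "mybranch now sits on top of: $parent_subject"
```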

Squash multiple commits into a single commit before rebase

When rebasing, you can sometimes have merge conflicts. When this happens, it’s often easier to squash multiple commits into a single commit before rebasing.

It’s simpler because the way rebase works is that it re-plays each of your commits as though they were made on the branch being referenced.

For example, let’s say that in the previous diagram the two commits C and H changed the same file (say file1.java) and when they are merged a merge conflict occurs.

So when git replays the change of C after commit I it requires you to merge C and H so that your final branch includes all of the changes that were made after your branch started.

If you don’t do this, then you may be asked MULTIPLE times to resolve conflicts in the same file, once for each later commit that touches it. That’s because git replays your commits one at a time:

  • Apply C on top of I → C’ # C and H changed file1.java and a merge conflict occurs
  • Apply E on top of C’ → E’
  • Apply F on top of E’ → F’ # F and E’ changed file1.java and a merge conflict occurs
  • Apply J on top of F’ → J’

So when a file (file1.java in this case) keeps getting tweaked, merge conflicts can occur multiple times.

Squash multiple commits into a single commit

Let’s say you see that the last 4 (four) commits on your mybranch example need to be merged into a single commit. In Git’s terms, you want to squash them together.

For the following graph

main---A---B---------H--I    # main branch
            \
             ---C--E--F---J  # mybranch

Here are the last 5 commits

git log -n 5 --pretty=format:"%h %s"
6a3bf4b J
bf11147 F
dcec239 E
7fd28c8 C
23d4344 B

It’s worth noting that git log displays them in NEWEST to OLDEST order. See https://git-scm.com/docs/git-rebase#_interactive_mode.

We are going to use the git rebase --interactive command to squash them together as follows

git rebase -i 23d4344

Git opens an editor with the list below. You’ll leave the first line alone

pick 7fd28c8 C
pick dcec239 E
pick bf11147 F
pick 6a3bf4b J

It’s important to point out that these are in OLDEST to NEWEST order. The very first time you do this, this might be confusing. You’ll change the list to look like this

pick 7fd28c8 C
squash dcec239 E
squash bf11147 F
squash 6a3bf4b J

When you’re done, the commit history will be rewritten so that the 4 combined commits become a single commit with a new hash.

git log -n 2 --pretty=format:"%h %s"
8443f4b C, E, F and J commits all combined together into a single commit
23d4344 B  (this remains unchanged)
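
Interactive rebase needs an editor, so it is awkward to script. The sketch below reaches the same end state with a different technique, git reset --soft followed by a fresh commit, which is a common non-interactive alternative and not the rebase -i flow described above. The sandbox and its five commits (B, C, E, F, J) are made up to mirror the example.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
for name in B C E F J; do
    echo "$name" >> file1.java
    git add file1.java
    git commit -q -m "$name"
done
base=$(git rev-parse HEAD~4)          # commit B, the one to keep as-is

# Move the branch back to B but keep the changes from C, E, F and J staged,
# then record them as one new commit.
git reset -q --soft "$base"
git commit -q -m "C, E, F and J combined"

commit_count=$(git rev-list --count HEAD)
last_line=$(tail -n 1 file1.java)
git log -n 2 --pretty=format:"%h %s"
```

Unlike squash in the interactive editor, this approach does not offer the original commit messages for editing, so the combined message has to be written by hand.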
