Tutorial: Git on Ubuntu and OS X - 2020

Git

In this chapter, we'll set up Git and learn how to use it.

In Git, every checkout is really a full backup of all the data. We can copy an existing repository; in a distributed version control system this copying is called cloning, and the resulting repository is called a clone. Every clone contains the full history of the files and has the same functionality as the original repository.

If we want to delete a Git repository, we can simply delete the folder which contains the repository.

Git Installation

On Ubuntu, we can install the Git command line tool using the command below:

$ sudo apt-get install git

Git Configuration

Git allows us to store global settings in the .gitconfig file located in the user's home directory (~). Git records the committer and the author of a change in each commit, and this identity, along with other settings, can be kept in the global configuration.

These values can be set with the git config command.

We can also configure settings for a specific repository. With the --global flag a setting applies to every repository of the current user; without it, the setting applies only to the current Git repository.

User and Email Configuration

We need to tell Git which user to record in our commits by setting user.name and user.email:

$ git config --global user.name "k"
$ git config --global user.email "k@bogotobogo.com" 

$ git config --list
user.name=k
user.email=k@bogotobogo.com
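
The difference between global and repository-level settings can be checked with a small experiment. The script below is only a sketch: it points HOME at a throwaway directory so that our real ~/.gitconfig is not touched, and the repository name demo-repo and the name k-at-work are made up for the illustration.

```shell
# Assumption: HOME is redirected to a temporary directory for this demo only.
HOME="$(mktemp -d)"
export HOME
unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL

# Global settings land in $HOME/.gitconfig.
git config --global user.name "k"
git config --global user.email "k@bogotobogo.com"

mkdir "$HOME/demo-repo"            # hypothetical repository name
cd "$HOME/demo-repo" || exit 1
git init -q

# Without --global the value is written to this repository's .git/config.
git config user.name "k-at-work"   # hypothetical per-repository override

global_name="$(git config --global user.name)"
local_name="$(git config --local user.name)"
effective_name="$(git config user.name)"     # the repository value wins
effective_email="$(git config user.email)"   # falls back to the global value

echo "global=$global_name local=$local_name"
echo "effective=$effective_name <$effective_email>"
```

Inside demo-repo, commits would be recorded under the repository-level name, while the e-mail address still comes from the global file.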

Highlight color

The commands below enable color highlighting for Git in the console:

$ git config --global color.ui true
$ git config --global color.status auto
$ git config --global color.branch auto

Editor & Merge tool

Now we want to configure the default text editor that Git opens when it needs us to type in a message. By default, Git uses our system's default editor, which is generally vi. If we want to use vim as the default editor for Git:

$ git config --global core.editor vim

Another useful option we may want to configure is the tool used to resolve merge conflicts. Since Git does not set a default merge tool for integrating conflicting changes into our working tree, we can choose our own. Here we use kdiff3:

$ git config --global merge.tool kdiff3

To list all the Git settings currently in effect (global settings plus those of the current repository, if we are inside one):

$ git config --list 
user.name=k
user.email=k@bogotobogo.com
color.ui=true
color.status=auto
color.branch=auto
core.editor=vim

To query the global settings we can use:

$ git config --global --list
user.name=k
user.email=k@bogotobogo.com
color.ui=true
color.status=auto
color.branch=auto
core.editor=vim

Repository

Now it's time to create a local Git repository and commit our files into that repository.

$ mkdir ~/Repository1
$ cd ~/Repository1
$ mkdir MyFiles

The following command creates a Git repository in the current directory:

$ git init 
Initialized empty Git repository in /home/khong/Repository1/.git/

Every Git repository is stored in the .git folder of the directory in which the Git repository has been created. This directory contains the complete history of the repository. The .git/config file contains the configuration for the repository.

All files inside the repository folder excluding the .git folder are the working tree for a Git repository.

$ ls -la
total 16
drwxr-xr-x  4 khong khong 4096 Nov 12 00:21 .
drwxr-xr-x 43 khong khong 4096 Nov 12 00:06 ..
drwxr-xr-x  7 khong khong 4096 Nov 12 00:21 .git
drwxr-xr-x  2 khong khong 4096 Nov 12 00:06 MyFiles
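
To see the split between the .git folder and the working tree for ourselves, something like the following sketch could be used. It works in a temporary directory (an assumption made for the demo) rather than in ~/Repository1, and it also shows that removing the .git folder removes the repository while leaving the working files in place.

```shell
repo="$(mktemp -d)"     # throwaway directory, demo assumption
cd "$repo" || exit 1
git init -q
mkdir MyFiles

git_dir="$(git rev-parse --git-dir)"              # where the repository data lives
inside="$(git rev-parse --is-inside-work-tree)"   # "true" inside the working tree
[ -f .git/config ] && has_config=yes || has_config=no

echo "git dir: $git_dir"
echo "inside work tree: $inside"
echo ".git/config present: $has_config"

# Removing .git deletes the repository and its history;
# everything else in the folder is left alone.
rm -rf .git
[ -d MyFiles ] && files_kept=yes || files_kept=no
echo "MyFiles kept after removing .git: $files_kept"
```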

Let's create some files:

$ touch MyFiles/simple.txt
$ echo "file1" > file1
$ echo "file2" > file2
$ echo "file3" > file3

Let's check what we've done:

$ tree -a

Git Status

The git status command shows the working tree status, i.e. which files have changed, which are staged and which are not part of the staging area.

$ git status

Adding files to the staging area

Before committing to a Git repository, we need to mark the changes that should go into the commit. We do this by adding the new and changed files to the staging area, which records a snapshot of those files.

Now, we want to add all files to the index of the Git repository:

$ git add .

$ tree -a
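
The effect of git add can also be observed through git status. The sketch below uses a temporary repository with a single file (both are assumptions for the demo) and the --porcelain option, which prints a short, stable two-letter status per file.

```shell
repo="$(mktemp -d)"     # throwaway repository, demo assumption
cd "$repo" || exit 1
git init -q
echo "file1" > file1

before="$(git status --porcelain)"   # "?? file1": untracked, not yet staged
git add .
after="$(git status --porcelain)"    # "A  file1": staged as a new file
staged="$(git diff --cached --name-only)"   # names of the files in the staging area

echo "before add: $before"
echo "after add:  $after"
echo "staged:     $staged"
```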

Committing to the repository

After adding the files to the Git staging area, we can commit them to the Git repository. This creates a new commit object with the staged changes in the Git repository and the HEAD reference points to the new commit. The -m parameter allows us to specify the commit message.

Let's commit our files to the local repository:

$ git commit -m "Initial commit" 
[master (root-commit) 4324d18] Initial commit
 4 files changed, 3 insertions(+)
 create mode 100644 MyFiles/simple.txt
 create mode 100644 file1
 create mode 100644 file2
 create mode 100644 file3

$ git status
# On branch master
nothing to commit, working directory clean
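
As a rough check of what the commit did, we can compare HEAD with the tip of the branch and confirm that nothing is left to commit. This is a sketch in a temporary repository; the branch is forced to be named master and the identity is set locally so that the script does not depend on our global configuration.

```shell
repo="$(mktemp -d)"     # throwaway repository, demo assumption
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "k"
git config user.email "k@bogotobogo.com"

echo "file1" > file1
git add .
git commit -q -m "Initial commit"

head_commit="$(git rev-parse HEAD)"          # commit that HEAD resolves to
branch_commit="$(git rev-parse master)"      # commit at the tip of master
message="$(git log -1 --pretty=format:%s)"   # subject line of the latest commit
pending="$(git status --porcelain)"          # empty when the working tree is clean

echo "HEAD:   $head_commit"
echo "master: $branch_commit"
echo "message: $message"
```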

Git Log

The Git operations we performed have created a local Git repository in the .git folder and added all files to this repository via one commit. We can see the changes using the git log command:

$ git log
commit 4324d189e5996c8c442a2a284852a4750e1ca829
Author: k 
Date:   Tue Nov 12 00:55:50 2013 -0800

    Initial commit

Removing files from Git

To remove a file from Git, we have to remove it from our tracked files (more accurately, remove it from our staging area) and then commit.

We can use the git rm command to delete the file from our working tree and record the deletion of the file in the staging area.

$ touch rm_file
$ git add .
$ git commit -m "to_be_removed"
[master 1bb66ef] to_be_removed
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 rm_file
$ git rm rm_file
rm 'rm_file'
$ git commit -m "removing rm_file"
[master 92c1778] removing rm_file
 1 file changed, 0 insertions(+), 0 deletions(-)
 delete mode 100644 rm_file
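
A small sketch of the same idea, with one addition: git rm --cached stops tracking a file but keeps it on disk. The temporary repository and the second file, keep_file, are made up for the illustration.

```shell
repo="$(mktemp -d)"     # throwaway repository, demo assumption
cd "$repo" || exit 1
git init -q
git config user.name "k"
git config user.email "k@bogotobogo.com"

touch rm_file keep_file            # keep_file is a hypothetical second file
git add .
git commit -q -m "to_be_removed"

# git rm deletes the file from the working tree and stages the deletion.
git rm -q rm_file
staged="$(git status --porcelain)"   # "D  rm_file"
git commit -q -m "removing rm_file"

# git rm --cached only removes the file from the index; the file stays on disk.
git rm -q --cached keep_file
git commit -q -m "stop tracking keep_file"
tracked="$(git ls-files)"            # empty: nothing is tracked any more

echo "staged after git rm: $staged"
ls
```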

Branching

How does Git do branching?

How does Git store its data?

Git doesn't store data as a series of changesets or deltas, but instead as a series of snapshots.

When we commit in Git, Git stores a commit object that contains a pointer to the snapshot of the content we staged, the author and message metadata, and zero or more pointers to the commit or commits that were the direct parents of this commit: zero parents for the first commit, one parent for a normal commit, and multiple parents for a commit that results from a merge of two or more branches.
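
The parent counts mentioned above can be verified with git rev-list --parents, which prints a commit id followed by the ids of its parents. The following is a sketch only: the temporary repository, the branch name side, the file names, and the helper parents_of are all made up for the demo.

```shell
repo="$(mktemp -d)"     # throwaway repository, demo assumption
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "k"
git config user.email "k@bogotobogo.com"

# Hypothetical helper: number of parents of the given commit.
# rev-list prints "<commit> <parent>...", so we count words and subtract one.
parents_of() {
    echo $(( $(git rev-list --parents -n 1 "$1" | wc -w) - 1 ))
}

echo "one" > file.txt
git add file.txt
git commit -q -m "first"
root_parents="$(parents_of HEAD)"      # first commit: no parent

echo "two" >> file.txt
git commit -q -a -m "second"
normal_parents="$(parents_of HEAD)"    # normal commit: one parent

# Let two branches diverge, then merge them.
git checkout -q -b side
echo "side" > side.txt
git add side.txt
git commit -q -m "work on side"
git checkout -q master
echo "main" > main.txt
git add main.txt
git commit -q -m "work on master"
git merge -q -m "merge side" side
merge_parents="$(parents_of HEAD)"     # merge commit: two parents

echo "root=$root_parents normal=$normal_parents merge=$merge_parents"
```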

To see how Git does branching, let's take a directory containing three files, stage them all, and commit. Staging the files checksums each one, stores that version of the file in the Git repository (Git refers to these as blobs), and adds the checksum to the staging area:

$ echo "README" >> README
$ echo "test.rb" >> test.rb
$ echo "LICENSE" >> LICENSE

$ git add README test.rb LICENSE
$ git commit -m 'initial commit'

Running git commit checksums all project directories and stores them as tree objects in the Git repository. Git then creates a commit object that has the metadata and a pointer to the root project tree object so it can re-create that snapshot when needed.

This commit adds five objects to our Git repository: one blob for the contents of each of our three files, one tree that lists the contents of the directory and specifies which file names are stored as which blobs, and one commit with the pointer to that root tree and all the commit metadata. Conceptually, the data in our Git repository looks something like this:

If we make some changes and commit again, the next commit stores a pointer to the commit that came immediately before it.
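
The blobs, the tree, and the commit described above can be inspected with git cat-file and counted with git count-objects. The sketch below assumes a fresh temporary repository, so the three-file commit is the only thing in it.

```shell
repo="$(mktemp -d)"     # fresh throwaway repository, demo assumption
cd "$repo" || exit 1
git init -q
git config user.name "k"
git config user.email "k@bogotobogo.com"

echo "README" >> README
echo "test.rb" >> test.rb
echo "LICENSE" >> LICENSE
git add README test.rb LICENSE
git commit -q -m 'initial commit'

# Number of loose objects: three blobs, one tree, one commit.
object_count="$(git count-objects | cut -d' ' -f1)"

commit_type="$(git cat-file -t HEAD)"            # commit
tree_type="$(git cat-file -t 'HEAD^{tree}')"     # tree
blob_type="$(git cat-file -t HEAD:README)"       # blob

# The blob id recorded in the tree is the checksum of the file content.
blob_id="$(git rev-parse HEAD:README)"
hashed="$(git hash-object README)"

echo "objects: $object_count"
echo "README blob: $blob_id"
git cat-file -p 'HEAD^{tree}'   # lists which file name maps to which blob
git cat-file -p HEAD            # shows the tree pointer and the commit metadata
```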

After two more commits, our history might look something like the picture below:

A branch in Git is simply a lightweight movable pointer to one of these commits. The default branch name in Git has traditionally been master (newer versions let us choose another name with the init.defaultBranch setting). As we initially make commits, we're given a master branch that points to the last commit we made. Every time we commit, it moves forward automatically.

What happens if we create a new branch?

Doing so creates a new pointer for us to move around. Let's say we create a new branch called testing. We do this with the git branch command:

$ git branch testing

This creates a new pointer at the same commit we're currently on.

How does Git know what branch we're currently on?

It keeps a special pointer called HEAD. Note that this is quite different from the concept of HEAD in other VCSs we may be used to, such as Subversion or CVS.

In Git, this is a pointer to the local branch we're currently on. In this case, we're still on master. The git branch command only created a new branch - it didn't switch to that branch as shown in the picture below.

The HEAD file still points to the branch we were already on, master.
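
Both claims, that the new branch points at the same commit and that HEAD has not moved, can be checked directly. The sketch below uses a temporary repository with a single commit; everything except the branch names master and testing is a demo assumption.

```shell
repo="$(mktemp -d)"     # throwaway repository, demo assumption
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "k"
git config user.email "k@bogotobogo.com"

echo "test.rb" > test.rb
git add test.rb
git commit -q -m 'initial commit'

git branch testing

master_commit="$(git rev-parse master)"
testing_commit="$(git rev-parse testing)"    # same commit as master
current="$(git symbolic-ref --short HEAD)"   # branch that HEAD refers to
head_file="$(cat .git/HEAD)"                 # ref: refs/heads/master

echo "master:  $master_commit"
echo "testing: $testing_commit"
echo "current branch: $current"
echo ".git/HEAD contains: $head_file"
```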

To switch to an existing branch, we run the git checkout command. Let's switch to the new testing branch:

$ git checkout testing
Switched to branch 'testing'

This moves HEAD to point to the testing branch.

What is the significance of that?

Well, let's do another commit:

$ vim test.rb
$ git commit -a -m 'made a change'

The picture we see below is the outcome:

This is interesting, because now our testing branch has moved forward, but our master branch still points to the commit we were on when we ran git checkout to switch branches. Let's switch back to the master branch:

$ git checkout master
Switched to branch 'master'

We can see the result from the picture below:

That command did two things. It moved the HEAD pointer back to point to the master branch, and it reverted the files in our working directory back to the snapshot that master points to. This also means the changes we make from this point forward will diverge from an older version of the project. It essentially rewinds the work we've done in our testing branch temporarily so we can go in a different direction.
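
The two effects of git checkout master can be reproduced in a scratch repository. In this sketch the echo commands stand in for the vim edits, and the file contents "version 1" and "version 2" are made up so that the change is easy to spot.

```shell
repo="$(mktemp -d)"     # throwaway repository, demo assumption
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "k"
git config user.email "k@bogotobogo.com"

echo "version 1" > test.rb
git add test.rb
git commit -q -m 'initial commit'

git branch testing
git checkout -q testing
echo "version 2" > test.rb            # stands in for editing with vim
git commit -q -a -m 'made a change'

master_tip="$(git rev-parse master)"
testing_parent="$(git rev-parse 'testing^')"   # commit just before the tip of testing

git checkout -q master
current="$(git symbolic-ref --short HEAD)"     # HEAD points to master again
content="$(cat test.rb)"                       # file is back to master's snapshot

echo "current branch: $current"
echo "test.rb now contains: $content"
```

Here testing has moved one commit ahead, while master and the files in the working directory are back at the initial commit.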

Let's make a few changes and commit again:

$ vim test.rb
$ git commit -a -m 'made other changes'

Now our project history has diverged as shown in the picture below. We created and switched to a branch, did some work on it, and then switched back to our main branch and did other work. Both of those changes are isolated in separate branches: we can switch back and forth between the branches and merge them together when we're ready. And we did all that with simple branch and checkout commands.
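
Since the pictures are not needed to see the divergence, here is a sketch that rebuilds the same history in a temporary repository and asks Git to describe it. As before, the echo commands replace the vim edits and the file contents are invented for the demo.

```shell
repo="$(mktemp -d)"     # throwaway repository, demo assumption
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "k"
git config user.email "k@bogotobogo.com"

echo "version 1" > test.rb
git add test.rb
git commit -q -m 'initial commit'
first_commit="$(git rev-parse HEAD)"

git branch testing
git checkout -q testing
echo "testing change" >> test.rb      # stands in for editing with vim
git commit -q -a -m 'made a change'

git checkout -q master
echo "master change" >> test.rb       # stands in for editing with vim
git commit -q -a -m 'made other changes'

base="$(git merge-base master testing)"                  # last commit both branches share
only_testing="$(git rev-list --count master..testing)"   # commits only on testing
only_master="$(git rev-list --count testing..master)"    # commits only on master

echo "common ancestor: $base"
echo "only on testing: $only_testing, only on master: $only_master"
git --no-pager log --oneline --graph --all   # text drawing of the diverged history
```

The common ancestor is the initial commit, and each branch carries one commit that the other lacks, which is what diverged history means.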