LateX

Tuesday, June 12, 2012

Using git for simple projects.

This is not really related to the main purpose of this blog, but I had to write this article anyway, so I thought I might as well publish it.

Git is becoming a more and more popular tool for collaborative programming. Its main purpose is version control, so you can always revert to previous versions of your code and so on.

The problem is that many people get really confused by it, and there is no really good tutorial that describes what a proper workflow should look like. I know this can be really confusing, so I wrote this article.

Initial setup:
You need to update git variables. Type:
git config --global user.email "YOUREMAIL@gmail.com"
git config --global user.name "YOURNICKINGIT"
git config --global core.editor vim
This is so that other people can associate your changes to code with you. You will understand later on when we get to git log.
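
If you want to check that the settings were stored, you can read them back. This is only a minimal sketch: the scratch HOME directory is my assumption for the demo, so that it does not touch your real configuration. On your own machine you would just run the git config lines.

```shell
# Assumption for this demo only: a scratch HOME, so your real ~/.gitconfig is left alone.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

git config --global user.email "YOUREMAIL@gmail.com"
git config --global user.name "YOURNICKINGIT"

# Read the values back to confirm they were saved.
configured_email="$(git config --global --get user.email)"
configured_name="$(git config --global --get user.name)"
echo "$configured_name <$configured_email>"
```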

If you don't have vim, you have to install it. On Ubuntu type (if you are not on Ubuntu, I assume you are geeky enough to be able to google that):
sudo apt-get install vim 

Setting up a repository
  1. Use github - this is a really nice website and I really recommend it, as it has a really nice interface and allows incredibly easy sharing. The downside is that they make you pay if your project is not public.
  2. DIY - you can set up a repository on an ssh server that other people can clone onto their own computers and edit. In this tutorial I follow a model of a single master repo and multiple edit repos, which is very simple and powerful. The steps are:
    1. Log in to your ssh server and go to the directory where you want to set up the master git repository.
    2. Type:
      git init --shared=group project
      This will create a directory named project. Inside this directory there is a hidden folder .git which keeps all of the git state. All files in the project folder are subject to version control by git.
    3. There is some necessary initialization work that needs to be done. Don't worry about it for now (for the curious: to be able to clone a git repository successfully and push to it, it has to be non-empty; it also must not have checked out the branch you are pushing to; we will be pushing to branch master, so the repo is going to be checked out to the dev branch).
      cd project/ && touch TODO && git add TODO && git commit -a -m"Initial commit" && git checkout -b dev 
    4. Now this is your master repo. Ideally you should never modify it directly from the inside. Instead you should take your own local copy and push to it. Also make sure that all your friends have the rights to modify it (if they log in to the server as different users - this is advanced if you do not know unix permissions yet; if you do, you will be able to do it).
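
The DIY steps above can be tried out without a server. The sketch below runs them in a scratch directory, which is my stand-in for the directory on your ssh server. The demo identity and the line pinning the first branch name to master are assumptions I added so that it behaves the same on a fresh machine and on newer git versions.

```shell
# Demo identity so the commit works without any global git config (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

server_dir="$(mktemp -d)"   # stands in for the directory on your ssh server
cd "$server_dir"

git init -q --shared=group project
cd project
# Newer git versions may name the first branch differently; pin it to master as in this tutorial.
git symbolic-ref HEAD refs/heads/master
touch TODO
git add TODO
git commit -q -m "Initial commit"
git checkout -q -b dev      # park the master repo on dev so master can be pushed to

current_branch="$(git rev-parse --abbrev-ref HEAD)"
git branch                  # lists dev and master, with the star next to dev
```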

To get a local copy of the repo:
git clone <address>
where address can be for example:
local directory - for example:
/home/memyselfandi/projects/massdestructionofbananalovers
remote directory on an ssh server - it has to be structured like this:
ssh://username@server/path/to/repo
for example:
ssh://friendofmemyselfandi@secretgovermentorganisation.gov/home/memyselfandi/projects/massdestructionofbananalovers
(the shorter scp-like form username@server:path/to/repo also works)
This will create a folder with the name of your project containing the repository.
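
Here is a small sketch of cloning through a plain local path. It first builds a throwaway master repo with the same steps as in the setup section. The names work_dir and my_copy and the demo identity are made up for the example.

```shell
# Demo identity so the commit works without any global git config (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work_dir="$(mktemp -d)"
cd "$work_dir"

# Throwaway master repo, built as in the setup section.
git init -q --shared=group project
( cd project && git symbolic-ref HEAD refs/heads/master && touch TODO && git add TODO \
  && git commit -q -m "Initial commit" && git checkout -q -b dev )

# Clone it through a local path; "my_copy" is a hypothetical folder name.
git clone -q "$work_dir/project" my_copy
cd my_copy

cloned_branch="$(git rev-parse --abbrev-ref HEAD)"
remote_branches="$(git branch -r)"
echo "checked out: $cloned_branch"
echo "$remote_branches"
```

Note that the clone starts out on dev, because that is the branch the master repo has checked out. Branch master is still there as origin/master, and git checkout master gives you a local copy of it.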

Convention: In all the commands described it is assumed that you are inside your local repo.

To work with code:
git branch awesome_feature
Creates a new branch that copies the contents of the current one. A branch can be thought of as a new direction in which the project is going (≈ a feature or set of features). If you type git branch you will see a list of all the branches that you have in your local repo.
To understand branches better, imagine you have to add a new feature to a project without git. You would copy the directory with your project and work on the copy, because you would be worried about screwing up the original. That is exactly what branches do. You can have many branches to work on many features in parallel.
There is one special branch - master. It is a useful convention to keep this one unchanged. That way you can always keep the newest set of changes committed to the master repository there.
Observe the star next to the currently used branch. We want to change it to our newly created branch, so:

git checkout awesome_feature 
- changes the branch to awesome_feature. Branches are meant for developing logically independent changes. If you are working on two features in parallel, create an additional branch. Be careful not to create it from awesome_feature, because that would make a copy of awesome_feature rather than of master (unless your branch diverged in two different directions - but even in this case you would probably be better off first pushing your change to master, possibly disabled with a boolean flag, and then starting two new copies of the updated master).
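
A short sketch of the branch commands in a scratch repository. The name second_feature and the demo identity are made up; git checkout -b is a shortcut that creates a branch from a given starting point and switches to it in one step.

```shell
# Demo identity so the commit works without any global git config (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q .
git symbolic-ref HEAD refs/heads/master   # pin the first branch name to master
touch TODO
git add TODO
git commit -q -m "Initial commit"

git branch awesome_feature        # create the branch from the current one
git checkout -q awesome_feature   # switch to it

# A second, independent feature starts from master, not from awesome_feature.
git checkout -q -b second_feature master

current_branch="$(git rev-parse --abbrev-ref HEAD)"
git branch                        # the star now marks second_feature
```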

Now is the time to do the actual coding. Write whatever you want, and once you are sure a logical part of your code is done, clean up the code (what you commit will be shared with your friends and you don't want to make a mess on their computers, do you?). Now type git status. What you will see will look something like:
# On branch awesome_feature
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified:   TODO
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
# bananalover_decapitator.cpp
# banana_poisoner.cpp
While TODO is a file that already existed and that you modified, bananalover_decapitator.cpp and banana_poisoner.cpp are completely new files. By default git will just ignore new files, so you have to add them to git's consideration.

git add <files> 
- adds files to git's consideration. Make sure you add all the files you want to share with your friends. Check that it worked by typing git status again and verifying that the files are listed as new files. You can also add entire directories to git in one go. This is a sensible thing to do if you start version control of a previously uncontrolled project. Sometimes, however, that would also add some files you would prefer to ignore. This is actually a really common situation, so git lets you specify things to be ignored. They are listed in a special file located in your repository:
$ cat .gitignore

# Can ignore specific files
photo_of_red_banana

# Use wildcards as well
*~
*.swp

# Can also ignore a whole directory and everything inside it.
build/
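
To see the ignore rules in action, here is a minimal sketch in a scratch repository. The file names besides photo_of_red_banana are made up, and the directory rule is written as build/, which ignores the whole directory.

```shell
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q .

# A small ignore file in the spirit of the one above.
printf '%s\n' 'photo_of_red_banana' '*~' '*.swp' 'build/' > .gitignore

mkdir build
touch photo_of_red_banana notes.txt~ build/output.o banana_poisoner.cpp

# --short prints one line per file; "??" marks untracked files that are not ignored.
status_output="$(git status --short)"
echo "$status_output"
```

Only .gitignore itself and banana_poisoner.cpp should show up; the photo, the backup file and everything under build/ are skipped.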
git commit -a
Commits your changes as a part of the current branch (awesome_feature). It will open your text editor - in this case vim. You will be asked to type a short message describing your set of changes. Make sure it is descriptive enough for your friends to understand what you did. The editor you are using is vim, so hit i (letter 'i' on the keyboard) to start typing. Type your message normally. Hit esc and then ":wq" + enter (without quotes) to save your text file (if you need to know more about vim, just google it). Now if you type git log you should see your change on top (q to exit if the list of changes is long).
Now it is crucial to understand what just happened. Your modifications used to be saved only as changes to your local directory. By committing them you added them to your branch on your local computer. One commit is the smallest unit that git can manage. You can merge your commits with commits of your friends, revert to the state after a particular commit etc.
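
If you would rather skip the editor, the message can be given on the command line with -m. A minimal sketch in a scratch repository; the commit message, the file contents and the demo identity are made up for the example.

```shell
# Demo identity so the commits work without any global git config (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q .
git symbolic-ref HEAD refs/heads/master   # pin the first branch name to master
touch TODO
git add TODO
git commit -q -m "Initial commit"
git checkout -q -b awesome_feature

echo "poison the bananas" >> TODO         # modify a file git already knows
touch banana_poisoner.cpp
git add banana_poisoner.cpp               # new files must be added explicitly

# -a picks up modified tracked files; -m supplies the message, so no editor opens.
git commit -q -a -m "Add banana poisoner and update TODO"

latest_subject="$(git log -1 --format=%s)"
commit_count="$(git rev-list --count HEAD)"
git --no-pager log --oneline              # newest commit is listed first
```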
Now that you saved your work on your computer, it's time to share with your friends, but first...

make sure you are not conflicting with newest changes (made by your friends):
git checkout master # go to branch master
git pull # download the newest changes from master branch 
git checkout awesome_feature # back to your branch
git rebase master # magically merge newest updates by your friends with your code.
conflict resolution: Sometimes it might happen that you and your friend were editing the same file in a similar place. Should that happen, you will have to resolve a conflict. Git makes resolving conflicts really easy! In fact most of them are resolved automatically. If that didn't happen, git will output a message. To fix it, type git status to see which files contain conflicts. You have to open each of them and merge the changes by hand. Git will mark the two different versions it found in the file:
<<<<<<< HEAD:banana_bomb.cpp
kill_banana_lovers();
=======
kill_banana_lovers_and_their_families();
>>>>>>> 7b08cdcedc68edc6c5e87d7dc5ed7:banana_bomb.cpp
Once you have resolved the conflicts, you have to git add all the files that used to contain conflicts. Once this is done you may continue your rebase: git rebase --continue
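
The whole rebase-with-conflict round trip can be rehearsed in a scratch repository. In this sketch a commit made directly on master stands in for a friend's change that you pulled; the function name kill_banana_lovers_quietly and the demo identity are made up. Resolving by keeping our own version is just a choice made for the demo.

```shell
# Demo identity so the commits work without any global git config (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q .
git symbolic-ref HEAD refs/heads/master   # pin the first branch name to master
echo "kill_banana_lovers();" > banana_bomb.cpp
git add banana_bomb.cpp
git commit -q -m "Initial commit"

# Our change on the feature branch.
git checkout -q -b awesome_feature
echo "kill_banana_lovers_and_their_families();" > banana_bomb.cpp
git commit -q -a -m "Extend the bomb"

# Stand-in for a friend's change to the same line, already on master.
git checkout -q master
echo "kill_banana_lovers_quietly();" > banana_bomb.cpp
git commit -q -a -m "Friend's change"

git checkout -q awesome_feature
# The rebase stops with a non-zero exit code because both sides changed the same line.
git rebase master >/dev/null 2>&1 || echo "conflict reported, resolving by hand"

# List the files that still contain conflicts.
conflicted_files="$(git diff --name-only --diff-filter=U)"

# "Merge by hand": here we simply keep our own version.
echo "kill_banana_lovers_and_their_families();" > banana_bomb.cpp
git add banana_bomb.cpp
# GIT_EDITOR=true accepts the existing commit message without opening an editor.
GIT_EDITOR=true git rebase --continue >/dev/null 2>&1

git --no-pager log --oneline              # our commit now sits on top of the friend's
```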

Now, if there were any changes that you merged with, make sure that merging didn't break anything. If it did, see if you can fix it by yourself, or consult the friend whose changes you were merging with.

All the changes that you made have to be committed again. A sensible thing to do is to add them as a part of your last commit. You can quickly do that by using git commit -a --amend

Now share it! Push your code from your local branch awesome_feature to branch master on the master repository.
git push origin awesome_feature:master
This will push to wherever your origin is (remote or local).
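
Here is a sketch of the push, end to end, with a local directory playing the role of the master repository. The folder name my_copy, the change to TODO and the demo identity are made up for the example.

```shell
# Demo identity so the commits work without any global git config (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work_dir="$(mktemp -d)"
cd "$work_dir"

# Throwaway master repo, parked on dev as in the setup section.
git init -q --shared=group project
( cd project && git symbolic-ref HEAD refs/heads/master && touch TODO && git add TODO \
  && git commit -q -m "Initial commit" && git checkout -q -b dev )

git clone -q "$work_dir/project" my_copy
cd my_copy
git checkout -q -b awesome_feature origin/master
echo "poison the bananas" >> TODO
git commit -q -a -m "Update TODO"

# Push local branch awesome_feature into branch master of origin.
git push -q origin awesome_feature:master

local_tip="$(git rev-parse awesome_feature)"
remote_tip="$(git -C "$work_dir/project" rev-parse master)"
echo "local:  $local_tip"
echo "remote: $remote_tip"
```

The push is accepted because the master repo has dev checked out, not master, which is the reason for the initialization step in the setup section.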

Additional features


git blame <file> 
Displays the names of the people who last modified each line in a given file.

git diff 
Displays uncommitted changes. You can also compare files between different revisions (states after different commits).

git reset --hard <commit_hash> 
Resets the current branch to the given commit, in case somebody screwed up. Be careful: this also throws away any uncommitted changes in your working directory. If the bad change was not the last change, you are most probably better off writing a new commit that undoes it.

git stash
When you want to switch to a different branch but don't want to commit your changes, or want to transfer a small set of changes between branches without committing and rebasing, just stash them - use responsibly.
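
A minimal sketch of carrying an uncommitted change to another branch with stash. The branch name other_feature, the edited text and the demo identity are made up.

```shell
# Demo identity (stash creates commits internally, so it needs one too; assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q .
git symbolic-ref HEAD refs/heads/master   # pin the first branch name to master
touch TODO
git add TODO
git commit -q -m "Initial commit"
git branch other_feature

echo "half-finished idea" >> TODO         # an uncommitted change
git stash -q                              # shelve it; the working directory is clean again
status_after_stash="$(git status --short)"

git checkout -q other_feature
git stash pop -q                          # re-apply the change on the other branch
status_after_pop="$(git status --short)"
echo "$status_after_pop"
```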

git bisect 
Not sure which commit caused the failure? Use git bisect to binary search for it.
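
As a sketch, bisect can even run the search for you if you give it a command that tells good commits from bad ones. Everything in this scratch repository is made up for the demo: five commits, with a marker line "BUG" sneaking in at the third one.

```shell
# Demo identity so the commits work without any global git config (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q .
git symbolic-ref HEAD refs/heads/master   # pin the first branch name to master

# Five commits; the third one introduces the "bug".
for n in 1 2 3 4 5; do
  echo "step $n" >> history.txt
  if [ "$n" -eq 3 ]; then echo "BUG" >> history.txt; fi
  git add history.txt
  git commit -q -m "Commit $n"
done

first_commit="$(git rev-list --max-parents=0 HEAD)"
expected_culprit="$(git rev-parse HEAD~2)"

# Arguments: the known bad commit first, then a known good one.
git bisect start HEAD "$first_commit" >/dev/null
# The command must exit 0 on good commits and non-zero on bad ones.
git bisect run sh -c '! grep -q BUG history.txt' >/dev/null
found_culprit="$(git rev-parse refs/bisect/bad)"
git bisect reset >/dev/null 2>&1          # go back to where we started

git --no-pager log -1 --format='first bad commit: %s' "$found_culprit"
```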

git reflog 
Git is so meta - it even version controls what happens with itself. To display recent actions in git use git reflog. Each action also has its own hash, to which you can reset by using git reset --hard <action_hash>
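
Here is a small sketch of throwing a commit away with reset and getting it back through the reflog. The file name notes.txt and the demo identity are made up; HEAD@{1} is reflog shorthand for "where HEAD was one action ago".

```shell
# Demo identity so the commits work without any global git config (assumption).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q .
git symbolic-ref HEAD refs/heads/master   # pin the first branch name to master

echo "first" > notes.txt
git add notes.txt
git commit -q -m "First commit"
echo "second" >> notes.txt
git commit -q -a -m "Second commit"
lost_commit="$(git rev-parse HEAD)"

git reset -q --hard HEAD~1                # oops: the second commit is gone from the branch
commit_after_reset="$(git rev-parse HEAD)"

git --no-pager reflog                     # the old position is still listed here
git reset -q --hard "HEAD@{1}"            # jump back to where HEAD was before the reset
recovered_commit="$(git rev-parse HEAD)"
```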

Working with server-side code
Some people think that it is impossible to work on a local repository if you are working on server-side code, as you need access to server resources. This is only partially true. In most cases you can run a local copy of the server and access resources with hard state, like an SQL database, remotely. Using PHP+MySQL as an example, you can have a separate file locally and on the server specifying how to connect to the database, which you include in your PHP files. Just gitignore that file, so that git does not change it around. Run apache on your local computer to execute the PHP code.
Make sure you gitignore all your local configurations.

That is the end of this simple git usage tutorial. In general there are a lot more interesting problems and features in git, but this actually covers most of the cases for small and medium sized projects.

Good luck and remember - with great power comes great responsibility.

Also don't be afraid to experiment with git - you can always use reflog to repair your mess.

Please comment if you find any bug in this tutorial. It is one of the longest things I ever wrote, so it probably has some.