Source Code Management — Git (Basic Concepts)

Author : Rahul Krishna Upadhyaya

Topic : Source Code Management

Date : 16/04/2013

Overview : What is source control management? Why do we need to manage source code? Git as an example of Source Control Management. Some basic commands to get started with Git.

Source Control Management.

Source Control Management, as the name suggests, is the means by which the source code for any project is managed. Some of the features that every SCM application should provide:

  • It should maintain an incremental history of each file. Every time you tell the SCM tool that you want to save (commit) the code, it tracks the changes and saves those delta changes in its history.
  • It should track the author of each change to every file being monitored.
  • It should let several developers collaborate and contribute, even when editing the same file at the same time, while preserving the integrity of previously saved code and flagging new code that conflicts with the previous version.

Why do we need Source Control Management?

Q. The case where I am the single author of the complete content: I added some code and everything stopped working, and I want to revert to a state where I know everything was working fine.

A. Every time you save (commit) your code, a commit-id is generated for it. You also add a commit message stating the reasoning behind the changes. This lets you browse the history: choose any commit-id and see all the additions and deletions made in that commit, along with the reasoning recorded in its commit message. Using this, you can navigate to any earlier stage of the code, check what changes were made, or revert to any stable code-base.
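As a rough sketch of this scenario in Git, the commands below build a throwaway repository, break a file, inspect the history, and then undo the bad commit. The file name app.py, the author identity, and the commit messages are made up for illustration; git revert is only one of several ways to get back to a working state.

```shell
set -eu
demo=$(mktemp -d)                  # scratch directory for a hypothetical project
cd "$demo"
git init -q
git config user.name "Demo Author"         # illustrative identity, this repo only
git config user.email "demo@example.com"

echo "print('hello')" > app.py
git add app.py
git commit -q -m "Add working greeting"

echo "print('hello'" > app.py      # break the file: closing parenthesis removed
git commit -q -am "Refactor greeting"

git log --oneline                  # every commit-id with its message
git show --stat HEAD               # which files the latest commit touched

# Undo the bad commit by recording a new commit that reverses it
git revert --no-edit HEAD
cat app.py                         # the working version is back
```

Because the revert is itself a commit, the mistake and its fix both remain visible in the history.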

Q. The case where several authors are writing to the same project, possibly in the same file.

A. A source control management tool makes collaboration within a group of developers much easier. You write your changes and “push” the code to a remote central server. All the other developers can then “pull” the changes from this central server and synchronize their work-spaces with the changed code. If you are working on the same file and changing the same section, you will face “conflicts”. You either need to resolve them manually or set a policy that chooses “theirs” or “ours” when updating your work-space with the latest code.
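The following is a minimal sketch of that push, pull and conflict cycle using Git. It assumes a local bare repository (central.git) standing in for the central server and two hypothetical developers, Alice and Bob, who edit the same line of a made-up settings.txt file. The -X theirs option is one possible "policy"; -X ours would keep Bob's side instead.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q --bare central.git                # stands in for the central server

git clone -q central.git alice 2>/dev/null    # hides the "empty repository" warning
git -C alice config user.name "Alice"
git -C alice config user.email "alice@example.com"
echo "timeout = 10" > alice/settings.txt
git -C alice add settings.txt
git -C alice commit -q -m "Add settings"
git -C alice push -q origin HEAD              # publish Alice's current branch

git clone -q central.git bob
git -C bob config user.name "Bob"
git -C bob config user.email "bob@example.com"

# Both developers now change the same line
echo "timeout = 30" > alice/settings.txt
git -C alice commit -q -am "Raise timeout to 30"
git -C alice push -q origin HEAD

echo "timeout = 5" > bob/settings.txt
git -C bob commit -q -am "Lower timeout to 5"

# Bob pulls: the merge stops because both sides edited the same line
if ! git -C bob pull -q --no-rebase; then
    git -C bob status --short                 # "UU settings.txt" marks the conflict
    git -C bob merge --abort                  # back to the state before the pull
fi

# Policy-based resolution: prefer the incoming ("theirs") side
git -C bob pull -q --no-rebase --no-edit -X theirs
cat bob/settings.txt
```

Instead of aborting, Bob could also edit the conflict markers in settings.txt by hand, stage the file, and commit the result.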

Git – The Most Powerful SCM Tool.

Git is a distributed revision control and source code management (SCM) system with an emphasis on speed. Initially designed and developed by Linus Torvalds for Linux kernel development, Git has since been adopted by many other projects.

Every Git working directory is a full-fledged repository with complete history and full revision tracking capabilities, not dependent on network access or a central server.
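One way to see this for yourself is sketched below: a repository is cloned locally, the original is deleted, and the clone can still show the whole history without contacting anything. The directory names original and copy, and the file contents, are placeholders.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q original
git -C original config user.name "Demo Author"
git -C original config user.email "demo@example.com"
echo "one" > original/file.txt
git -C original add file.txt
git -C original commit -q -m "First change"
echo "two" > original/file.txt
git -C original commit -q -am "Second change"

git clone -q original copy        # the clone receives the entire history
rm -rf original                   # the source repository is gone
git -C copy log --oneline         # both commits are still available locally
```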

A Short History of Git

As with many great things in life, Git began with a bit of creative destruction and fiery controversy. The Linux kernel is an open source software project of fairly large scope. For most of the lifetime of Linux kernel maintenance (1991–2002), changes to the software were passed around as patches and archived files. In 2002, the Linux kernel project began using a proprietary DVCS called BitKeeper.

In 2005, the relationship between the community that developed the Linux kernel and the commercial company that developed BitKeeper broke down, and the tool’s free-of-charge status was revoked. This prompted the Linux development community (and in particular Linus Torvalds, the creator of Linux) to develop their own tool based on some of the lessons they learned while using BitKeeper. Some of the goals of the new system were as follows:

  • Speed
  • Simple design
  • Strong support for non-linear development (thousands of parallel branches)
  • Fully distributed
  • Able to handle large projects like the Linux kernel efficiently (speed and data size)

Since its birth in 2005, Git has evolved and matured to be easy to use while retaining these initial qualities. It’s incredibly fast, it’s very efficient with large projects, and it has an incredible branching system for non-linear development.

The Three States in Local Operations

Now, pay attention. This is the main thing to remember about Git if you want the rest of your learning process to go smoothly. Git has three main states that your files can reside in: committed, modified, and staged. Committed means that the data is safely stored in your local database. Modified means that you have changed the file but have not committed it to your database yet. Staged means that you have marked a modified file in its current version to go into your next commit snapshot.

This leads us to the three main sections of a Git project: the Git directory, the working directory, and the staging area.

Local Operations in Git

The Git directory is where Git stores the metadata and object database for your project. This is the most important part of Git, and it is what is copied when you clone a repository from another computer.

The working directory is a single checkout of one version of the project. These files are pulled out of the compressed database in the Git directory and placed on disk for you to use or modify.

The staging area is a simple file, generally contained in your Git directory, that stores information about what will go into your next commit. It’s sometimes referred to as the index, but it’s becoming standard to refer to it as the staging area.
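To make the three sections a little more concrete, here is a small sketch that pokes at each of them in a throwaway repository. The file name notes.txt is invented for the example, and the exact contents of the .git directory can vary between Git versions.

```shell
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo Author"
git config user.email "demo@example.com"

echo "first draft" > notes.txt
git add notes.txt

ls .git                   # the Git directory: HEAD, config, objects, refs, index, ...
git ls-files --stage      # entries currently recorded in the staging area (index)

git commit -q -m "Add notes"
git count-objects         # objects now stored in the Git directory's database
ls                        # the working directory: the checked-out files
```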

The basic Git workflow goes something like this:

  1. You modify files in your working directory.
  2. You stage the files, adding snapshots of them to your staging area.
  3. You do a commit, which takes the files as they are in the staging area and stores that snapshot permanently to your Git directory.

If a particular version of a file is in the Git directory, it’s considered committed. If it’s modified and has been added to the staging area, it is staged. And if it was changed since it was checked out but has not been staged, it is modified.
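The workflow and the three states can be followed step by step with git status. The sketch below uses a hypothetical report.txt in a scratch repository; the two-letter codes in the comments are what git status --short prints for each state.

```shell
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo Author"
git config user.email "demo@example.com"

echo "v1" > report.txt
git add report.txt
git commit -q -m "Add report"
git status --short                  # prints nothing: report.txt is committed

echo "v2" > report.txt              # 1. modify the file in the working directory
git status --short                  # " M report.txt": modified, not yet staged

git add report.txt                  # 2. stage the current version
git status --short                  # "M  report.txt": staged for the next commit

git commit -q -m "Update report"    # 3. commit the staged snapshot
git status --short                  # prints nothing again: committed
```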

Git Interaction with Remote Servers

Once we are done with the changes and commits on our local system, as explained above, we want to add these changes to the remote Git server, so that the central repository containing the code gets updated and all our peers can synchronize their work-spaces with our changes.

We use the git push <parameters> command to push those changes to the remote repository. We should also know which remote repository we are pointing to: git remote -v lists the remote repositories that the current Git repository knows of. If there are none, you need to add one before running git push.
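A minimal sketch of these commands follows. To keep it runnable anywhere, it assumes a local bare repository (central.git) in place of a real remote server; the remote name origin is the usual convention, and the project contents are placeholders.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q --bare central.git      # local stand-in for the remote Git server
git init -q project
cd project
git config user.name "Demo Author"
git config user.email "demo@example.com"
echo "hello" > README
git add README
git commit -q -m "Initial commit"

git remote -v                       # prints nothing: no remote is configured yet
git remote add origin "$work/central.git"
git remote -v                       # now lists origin for fetch and for push
git push -q -u origin HEAD          # publish the current branch and track it
```

With a real server, the path passed to git remote add would be replaced by the repository URL given by the hosting service.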

A Good Step-by-Step Reference

A website you could follow for comprehensive, step-by-step learning of Git:

Git Immersion

Working on GitHub is largely the same as working with Git on a remote or local machine, with a few additional things that you need to know.

A good place to learn those would be the TryGit tutorial.


About Rahul K Upadhyaya
I am a software developer. My core areas of interest are OpenStack as a technology, Python as a programming language, and Linux (Ubuntu/CentOS) as my favorite OS. When I am not at work, you will find me with my camera, photographing random, weird stuff and people. You can have a look at the pictures at http://rakrup.wordpress.com
