By: Karthik Janar
Anyone who is serious about working with computers needs to understand the basic concepts of files and folders and to have some familiarity with the command line interface (CLI) and its commands. Hands-on practice with the CLI is essential. Furthermore, if you are interested in programming or analytics, version control becomes very important. Git and GitHub provide a good toolset for creating, maintaining, and managing your files and folders and their versions, as well as for sharing them with others.
Some Basics and CLI
The command line interface is a way of working with files and folders that involves typing a command as opposed to pointing and clicking with a mouse. It is also a good tool and forms a part of 'Data Scientist Toolbox'. Whether you use Mac or Windows or Linux, all ship with CLIs. For this tutorial we use Git Bash for Windows and Terminal for Linux and Mac.
You can use it to create files, folders, and programs, to edit those files and folders, and to run the programs. A directory is a folder that contains a collection of folders and files. Each folder can have nested folders and files inside it. The forward slash "/" is the root folder, or topmost folder; all other folders are created under it. You can go up and down and move between folders using CLI commands. Some of the common commands that you will use are:
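#to create a folder
mkdir <folder name>
#to change directory
cd <folder name>
#to print the working directory name
pwd
#to clear the CLI screen and show a blank screen
clear
#to list all files and directories in the current directory
ls
#to copy a file
cp <file name> <destination>
#to move a file to another location or rename it
mv <file name> <destination or new name>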
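The commands above can be tried together in one short session. This is only a minimal sketch: the folder name demo, the file names, and the use of mktemp and echo to set up a throwaway work area are assumptions made for illustration, not part of the tutorial.

```shell
# Work inside a throwaway folder so nothing important is touched
# (mktemp -d creates a new empty temporary folder and prints its path)
cd "$(mktemp -d)"

mkdir demo                  # create a folder named demo
cd demo                     # move into it
pwd                         # print the full path of the current folder

echo "hello" > notes.txt    # create a small file to practice on
cp notes.txt copy.txt       # copy the file
mv copy.txt renamed.txt     # rename the copy
ls                          # list the contents of the current folder

cd ..                       # go back up one level
```

After the session, the demo folder holds notes.txt and renamed.txt; copy.txt no longer exists because mv renamed it.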
Files are created to save data within folders. When you create a file, you can type and save some data in it. Later you may open the same file, edit it, and save it again. This modification is a continuous process, and because each save overwrites the previous contents, you only ever have the single latest copy. All the intermediate versions are lost and cannot be traced.
Git is free, open source software for version control: it records each committed version of your files so that the intermediate versions remain available. It is distributed, and it handles everything from small to very large projects with speed and efficiency. It is one of the most commonly used version control systems right now. When other people have a different version of a file, Git makes it easier to reconcile the versions and work with them. Everything is stored in local repositories (called repos) on your computer, and you do most of the operations from the command line.
The first thing to do is to download and install Git. Go to https://git-scm.com/downloads, download the installer for your system, and run it. Once the installation is complete, search for the program 'Git Bash' and open it. Now configure Git with your name and email.
git config --global user.name "Your name here"
git config --global user.email "Your email here"
You only need to do this once; the values are then saved in your Git configuration on this computer. Use the same email here that you will use to register for a GitHub account in the next step. Now that you have set up Git locally on your computer, we will see how to set up GitHub on the web.
GitHub is a web-based hosting service for software development. It allows you to publish projects online for others to see and contribute to. From Git, users can push their local repository to GitHub and pull changes from it. GitHub also provides users with a home page that displays all of their repositories. The repositories that you have on GitHub are backed up on the server in case something happens to your local copies. But the real key to GitHub is its social aspect: users can follow one another, share projects, and contribute to each other's projects. That is the real power of GitHub.
What's more, it's FREE! So go to github.com and create a new account. Remember to register with the same email that you set in the step above. GitHub is very user-friendly. You can now try to create a new repository in your GitHub account. Just give it a name ('Hello-World') and a description, select the 'Initialize this Repository with a README' check box, and then click the 'Create Repository' button. Congrats, you've created your first GitHub repository.
Linking local Git repository with your GitHub Repository
Now you can create a copy on your computer of the repo you created on the web (GitHub), so that you can make changes to it. Open Git Bash and create a directory where you will store your copy of the repo. For example, use mkdir to create a folder named test-repo in your home directory, and then navigate into this new directory using cd. Finally, initialize a local Git repository there by using the command git init.
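The three steps just described can be sketched as follows. The folder name test-repo and its location in the home directory (~) come from the example above; the -p flag is an addition here so that mkdir does not complain if the folder already exists.

```shell
# Create a folder in the home directory to hold the local copy of the repo
mkdir -p ~/test-repo

# Move into the new folder
cd ~/test-repo

# Turn the folder into an empty local Git repository
git init
```

Git keeps its data in a hidden .git folder inside test-repo; listing the folder with ls -a shows it.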
Next, point your local repository to the remote repository. In other words, link your local repository with the remote GitHub repository by typing git remote add origin, followed by the URL of the remote repository that you created on GitHub.
git remote add origin https://github.com/yourUserNameHere/Hello-World.git
Now you've linked your local copy with the remote version on GitHub. You can also FORK a repository that was created by someone else into your own GitHub account. That creates a copy of the repository for you to edit and manage.
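To confirm that the link was recorded, you can ask Git to list its remotes. The sketch below is an assumption-laden illustration: it builds a throwaway repository with mktemp so that it can be run on its own, and it reuses the placeholder URL from above (yourUserNameHere is not a real account name). Adding a remote only stores the URL; it does not contact GitHub.

```shell
# Throwaway repository so the sketch does not touch a real project
cd "$(mktemp -d)"
git init -q

# Link the local repository to the remote one (placeholder URL)
git remote add origin https://github.com/yourUserNameHere/Hello-World.git

# List every configured remote together with its URL
git remote -v

# Print only the URL stored under the name origin
git remote get-url origin
```

In your own test-repo folder, only the last two commands are needed to check an existing link.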
Some git commands
When you add new files to the local repository, you need to tell Git about them so that they can be tracked.
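git add . stages new, changed, and deleted files in the current folder and its subfolders (Git 2.0 and later)
git add -u stages changes and deletions for files that are already tracked, but does not add new files
git add -A stages everything: new, changed, and deleted files across the whole repository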
You should do this before committing. Now let us say you have made some changes to the files. You need to commit this intermediate version by typing the command:
git commit -m "message"
#where message is a useful description of the changes you made
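The add and commit cycle can be sketched end to end as below. Everything specific in it is an assumption for illustration: the throwaway repository made with mktemp, the file name notes.txt, the commit messages, and the placeholder identity, which is set for this one repository only so that the sketch runs even where no global name and email are configured.

```shell
# Throwaway repository so the sketch does not touch a real project
cd "$(mktemp -d)"
git init -q

# Placeholder identity for this repository only (hypothetical values)
git config user.name "Your name here"
git config user.email "you@example.com"

# First version of a file
echo "first draft" > notes.txt
git status --short            # the new file is listed as untracked
git add -A                    # stage it
git commit -q -m "Add first draft of notes"

# Second version of the same file
echo "second draft" > notes.txt
git add -A
git commit -q -m "Revise notes"

# Both intermediate versions are now recorded
git log --oneline
```

Each commit appears as one line in the log, newest first, so the earlier draft is no longer lost when the file is overwritten.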
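The commit only updates your local repository; nothing has been sent to GitHub yet. To upload your commits, run the command below (replace master with main if that is your repository's default branch):
git push -u origin master
#-u remembers the remote branch, so later a plain git push is enough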
Some other git commands
#to create a branch and switch to it
git checkout -b branchname
#to see what branch you are working on
git branch
#to switch back to the master branch
git checkout master
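A short session shows how these branch commands fit together. It is a sketch under stated assumptions: the repository is a throwaway one made with mktemp, the identity values and the README.md file are placeholders, and git symbolic-ref is used only to make sure the first branch is called master, as in this tutorial, whatever default name your Git installation uses.

```shell
# Throwaway repository so the sketch does not touch a real project
cd "$(mktemp -d)"
git init -q

# Name the first branch master, to match the commands in this tutorial
git symbolic-ref HEAD refs/heads/master

# Placeholder identity for this repository only (hypothetical values)
git config user.name "Your name here"
git config user.email "you@example.com"

# A branch needs at least one commit to start from
echo "hello" > README.md
git add -A
git commit -q -m "Initial commit"

git checkout -q -b branchname   # create a branch and switch to it
git branch                      # the current branch is marked with *
git checkout -q master          # switch back to master
git branch                      # the * is now next to master
```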
You have learnt some basics of the command line interface and how to use it. More importantly, you have seen how to use Git and GitHub to create local and remote repositories and to maintain version control of your files and source code. You are now ready to start coding.