Source control, also known as version control, stores software code files with a detailed history of every modification made to those files.
Version control systems allow developers to modify code without worrying about permanently screwing something up. Unwanted changes can be easily rolled back to previous working versions of the code.
Source control also makes team software development easier. One developer can combine her code modifications with other developers' code by reviewing diff views that show line-by-line changes and then merging the appropriate code into the main code branch.
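As a rough illustration of what a line-by-line diff view shows, the sketch below compares two versions of a small file with Python's standard-library `difflib` module. The file name, branch labels and file contents are made up for the example, and real version control systems use their own diff implementations rather than `difflib`.

```python
import difflib

# Two hypothetical versions of the same file, split into lines.
original = [
    "def greet(name):",
    "    print('Hello ' + name)",
    "",
    "greet('world')",
]
modified = [
    "def greet(name):",
    "    print(f'Hello, {name}!')",
    "",
    "greet('world')",
]


def line_diff(old_lines, new_lines):
    """Return a unified diff of two lists of lines as a list of strings."""
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="greet.py (main branch)",
            tofile="greet.py (feature branch)",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Lines starting with "-" were removed, lines starting with "+" were added.
    for line in line_diff(original, modified):
        print(line)
```

Only the `print` line differs between the two versions, so it is the only line that shows up with `-` and `+` markers. The unchanged lines appear as context.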
Version control is a necessity on all software projects regardless of development time, codebase size or the programming language used. Every project should immediately begin by using a version control system such as Git or Mercurial.
There is a spectrum of philosophies for how to store projects within source code repositories.
On one extreme end of the spectrum, every line of code for every project within an organization is stored in a single repository. That approach is called monorepo and it is used by companies like Google. On the other end of the spectrum, there are potentially tens of thousands or more repositories that store parts of projects. That approach is known as multirepo or manyrepo.
For example, in a microservices architecture, there could be thousands of microservices and each one is stored within its own repository. No one repository contains the code for the entire application created by the interaction of the microservices.
There are many hybrid strategies for how to store source code that fall between these opposite approaches. What to choose will depend on your organization's needs, resources and culture.
Pulling code during a deployment is one way source control systems can fit into the deployment process.
Note that some developers recommend that deployment pipelines package the source code to deploy it and never have a production environment touch a source control system directly. However, for small scale deployments it's often easiest to pull from source control when you're getting started instead of figuring out how to wrap the Python code in a system installation package.
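For a small deployment that pulls straight from source control, the deploy step might look something like the sketch below. It assumes Git is the version control system; the repository URL, the deploy directory and the `main` branch name are placeholders, and the helper functions are hypothetical rather than part of any deployment tool.

```python
import subprocess


def build_pull_commands(repo_url, deploy_dir, branch="main", already_cloned=False):
    """Return the git commands a deploy script could run, as argument lists."""
    if not already_cloned:
        # First deployment: clone only the branch being deployed.
        return [
            ["git", "clone", "--branch", branch, "--single-branch", repo_url, deploy_dir]
        ]
    # Later deployments: update the existing checkout, refusing anything
    # other than a fast-forward so local edits on the server are not merged in.
    return [["git", "-C", deploy_dir, "pull", "--ff-only", "origin", branch]]


def run_commands(commands, runner=subprocess.run):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        runner(command, check=True)


if __name__ == "__main__":
    # Placeholder values; printing instead of running keeps the sketch safe to try.
    commands = build_pull_commands(
        "https://example.com/team/app.git", "/srv/app", already_cloned=True
    )
    for command in commands:
        print(" ".join(command))
```

Keeping command construction separate from execution makes the script easy to inspect or dry-run before it touches a server.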
Numerous source control systems have been created over the past several decades. In the past, proprietary source control software offered features tailored to large development teams and specific project workflows. However, open source systems are now used for version control on the largest and most complicated software projects in existence. There's no reason why your project should use anything other than an open source version control system in today's Python development world. The three primary choices are:
Git is a free and open source distributed version control system.
Mercurial is similar to Git, also a free and open source distributed version control system.
Subversion is a centralized system where developers check out a working copy from a hosted repository and commit their changes back to it.
Git and Mercurial can be downloaded and run on your own server. However, it's easy and cheap to get started with a hosted version control service. You can transition away from the service at a later time by moving your repositories if your needs change. A couple of recommended hosted version control services are:
GitHub is a software-as-a-service platform that provides a user interface, tools and backup for developers to use with their Git repositories. Accounts are free for public open source development and private Git repositories can also be hosted.
Bitbucket is Atlassian's software-as-a-service tool that provides a user interface, comparison tools and backup for Git projects. There are many features in Bitbucket focused on making it easier for groups of developers to work on projects together. Bitbucket also has private repositories for up to five users. Users pay for hosting private repositories with more than five users.
Staging Servers, Source Control & Deploy Workflows, And Other Stuff Nobody Teaches You is a comprehensive overview by Patrick McKenzie of why you need source control.
This lighthearted guide to the ten astonishments in version control history is a fun way to learn how systems developed over the past several decades.
A visual guide to version control is a detailed article with real-life examples for why version control is necessary in software development.
An introduction to version control shows the basic concepts behind version control systems.
What Is Version Control? Why Is It Important For Due Diligence? explains the benefits and necessity of version control systems.
Version control before Git with CVS goes into the history of version control systems and defines three generations: CVS and SVN belong to the second generation, while Git and Mercurial are third-generation version control systems.
About version control reviews the basics of distributed version control systems.
Monorepo versus multirepo version control strategies are a weirdly contentious topic in software development, likely because once a policy is set for an organization it is exceptionally difficult to change your approach. The following resources give more insight into the debate on how to structure your repositories.
Monorepo, Manyrepo, Metarepo is an awesome guide to varying ways of structuring your source repositories that contain more than one project. The guide covers advantages and disadvantages of common approaches used in both small and large organizations.
Repo Style Wars: Mono vs Multi goes into the implications of using one side or the other and why it is unlikely you can create a combination solution that will give you the advantages of both without the disadvantages.
Why Google Stores Billions of Lines of Code in a Single Repository covers the history and background of Google's source control monorepo, which is one of the largest, if not the largest, monorepos for an organization in the world.
Advantages of monorepos covers the benefits of using a monorepo. It does not discuss the downsides in detail but admits there are many, so the decision between the two strategies is not clear-cut.
Monorepos and the Fallacy of Scale argues that having all of an organization's code in a single repository encourages code sharing. The author considers the concerns often raised about tight coupling between components in a monorepo code base but says that the advantages outweigh the disadvantages overall.
Git is the most widely used source control system today. Its distributed design eliminates the need to check files in and out of a centralized repository, a requirement that becomes a problem when using Subversion without a network connection. There is a full page on Git with further details and resources.
How to use Subversion (SVN) lays out the basic concepts and provides the first few steps for getting started tracking files.
10 Most Used SVN Commands with Examples is a good refresher list if you've used SVN in the past but it has been a while since you worked with all the commands.
Pick a version control system. Git is recommended because there are a significant number of tutorials on the web to help both new and advanced users.
Learn basic use cases for version control such as committing changes, rolling back to earlier file versions and searching for when lines of code were modified during development history.
Ensure your source code is backed up in a central repository. A central repository is critical not only if your local development version is corrupted but also for the deployment process.
Integrate source control into your deployment process in three ways. First, pull the project source code from version control during deployments. Second, kick off deployments when code is modified by using webhooks or polling on the repository. Third, ensure you can roll back to a previous version if a code deployment goes wrong.
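The polling and rollback steps in the last checklist item can be sketched in a few lines of Python. This is only one possible shape for the logic, assuming Git: `parse_ls_remote` expects text in the format printed by `git ls-remote`, while `should_deploy`, `DeployHistory` and the commit hashes are hypothetical names and values invented for the example.

```python
def parse_ls_remote(output):
    """Extract the commit hash from the first line of `git ls-remote` output.

    Each line has the form "<commit hash><TAB><ref name>".
    Returns None when the output is empty.
    """
    lines = output.strip().splitlines()
    if not lines:
        return None
    return lines[0].split("\t")[0]


def should_deploy(deployed_commit, latest_commit):
    """Polling check: deploy only when the repository has a new commit."""
    return latest_commit is not None and latest_commit != deployed_commit


class DeployHistory:
    """Remember deployed commits so a bad release can be rolled back."""

    def __init__(self):
        self._commits = []

    def record(self, commit):
        self._commits.append(commit)

    @property
    def current(self):
        return self._commits[-1] if self._commits else None

    def rollback(self):
        """Drop the current commit and return the one deployed before it."""
        if len(self._commits) < 2:
            raise ValueError("no earlier deployment to roll back to")
        self._commits.pop()
        return self._commits[-1]


if __name__ == "__main__":
    history = DeployHistory()
    history.record("1a2b3c4")  # made-up commit hashes
    latest = parse_ls_remote("5d6e7f8\trefs/heads/main\n")
    if should_deploy(history.current, latest):
        history.record(latest)
    print("deployed:", history.current)
    print("rolled back to:", history.rollback())
```

A webhook-driven setup would replace the polling check with an HTTP endpoint that the hosting service calls when code is pushed, but the rollback bookkeeping could stay the same.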