Thoughts of a Thinking Craftsman: version control

Showing posts with label version control.

Wednesday, July 10, 2024

Elevating DevOps: A Journey from Novice to Master Craftsman

In the dynamic world of DevOps, the gap between theory and practice can sometimes seem as vast as space itself. Reflecting on my journey through countless interviews with DevOps engineers, a pattern emerges: a disconnect between claimed experience and practical knowledge. It's a concerning trend that sees the essence of DevOps, a craft of precision and innovation, being diluted by superficial engagement.

Rewind to the days before DevOps became the industry buzzword. I was at the helm of creating a comprehensive DevOps platform for Geometric Ltd., integrating both commercial and open-source tools. This experience was more than just 'clicking buttons'; it was about architecting a seamless workflow that propelled projects forward.

For those aspiring to master DevOps, here's a distilled essence of my experience:

Begin with the roots. Watch John Allspaw and Paul Hammond's seminal presentation at the Velocity 2009 conference. Let it be your DevOps genesis, and absorb its insights multiple times.
Dive deep into Version Control Systems (VCS). Whether it's Git, Mercurial, or Subversion, understanding the intricacies of VCS is non-negotiable. They are the backbone of any robust DevOps strategy.
Ensure everything is traceable. From source code to CI/CD configurations, every element should be version-controlled, allowing you to pinpoint changes with precision.
Immerse yourself in the wisdom of pioneers. My recommended reads include 'Release It!' by Michael Nygard, 'The Phoenix Project' by Gene Kim, Kevin Behr and George Spafford, and 'Accelerate' by Nicole Forsgren, Jez Humble and Gene Kim.
Learn from the best. Study the engineering blogs of tech giants like Netflix, Google, Facebook, and Uber to stay abreast of cutting-edge practices.

DevOps is not just a role; it's a mindset of continuous improvement and relentless pursuit of excellence. Let's commit to upholding the true spirit of DevOps and nurturing the next generation of software craftsmen.

Posted on LinkedIn: https://www.linkedin.com/posts/activity-7215942686271238144-7XjI

Tuesday, August 04, 2020

Experience of Contributing to Smart India Hackathon

Talking about my 4 years of experience contributing to Smart India Hackathon, and trying to summarize it in 4 minutes.

Smart India Hackathon is the largest hackathon in the world. In this year's hackathon, 60,000+ students participated.

Saturday, December 15, 2018

Sane Branch Management of Version Control Systems for Teams

[NOTE : This is still a 'draft', please point out mistakes and suggest changes/improvements]

Some time back Nagaraj Mali asked a question about 'best practices for repository branch management' on one of our WhatsApp groups. This is a common query. A 'bad branching strategy' or 'no strategy about branching' is a very common mistake in project teams. These mistakes can seriously impact the team's productivity and quality. Unfortunately very few managers, scrum masters or tech leads really understand the devastating impact of 'bad version control practices'. In this blog, I am attempting to explain my views and the logic behind the various practices that I recommend.

Mistake : Too many branches

A developer needs to understand the difference between a 'branch' and a 'tag'. Teams often use a 'branch' where a tag is sufficient. A 'branch' is an 'active line' of development. A common practice in product teams is to support the 3 previous versions (i.e. major releases) for bug fixes, alongside one new release. For example, if your current release is Ver10, you may be supporting major versions 9, 8 and 7. Essentially you have 4 'lines of development', so you need 4 branches. Ideally you should check out 'only' the branch that you are working on. There is no need to check out all branches. Since 'branches' represent 'live lines of development', once you stop supporting a particular major release, you should 'delete' the branch for that release. Typically you will have 4 to 6 branches and many, many tags.
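The support window above can be written down as a tiny policy check. This is only a minimal sketch: the branch naming scheme (`version10`, `version9`, ...) and both function names are my assumptions for illustration, not part of any tool.

```python
def live_branches(current_major, supported_previous=3):
    """Return the release branches that should exist, newest first.

    Branch names like 'version10' are a hypothetical naming scheme.
    """
    oldest = max(1, current_major - supported_previous)
    return [f"version{v}" for v in range(current_major, oldest - 1, -1)]


def branches_to_retire(existing_branches, current_major, supported_previous=3):
    """Return existing release branches that fall outside the support window."""
    keep = set(live_branches(current_major, supported_previous))
    return [b for b in existing_branches if b not in keep]


print(live_branches(10))
print(branches_to_retire(["version6", "version7", "version10"], 10))
```

With a current release of 10 and 3 supported previous releases, the first call lists the 4 live branches, and the second flags `version6` as a candidate for tagging and deletion.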

Mistake : Creating new branch for every minor release

Assume a major release is 'version 9'. Then 'version 9.1' is developed on the 'version 9' branch ONLY. Remember 'Version 9' is the supported major release and the 'live development line'. Version 9.1 is usually NOT a separate 'live development line'. Any code changes/bug fixes for Version 9.1 should be developed in the 'Version 9' branch.

Common complaint : We spend a long time merging.

This is another symptom of bad branching and bad practices followed by the team. Consider the following branch scenarios.

Release branches for past releases

Let's assume that new releases are developed in 'master' and the team is supporting 2 previous releases, 'Version1' and 'Version2'. Now the team has done a bug fix in 'Version1'. Obviously customers will expect the same bug fix to be available in a patch release of 'Version2', and also when the new Version3 becomes available. So every day branches must be merged 'upstream', i.e. merge Version1 commits into Version2, and merge Version2 commits into 'master'. If you get any merge failures, fix them. Daily changes are typically small and can be merged easily. This simple practice ensures that the time required to do merges is drastically reduced and bugs introduced because of merge problems are almost entirely eliminated.
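The 'upstream' merge order can be sketched as a small helper that only builds the git commands and does not run them. The function names are hypothetical, and I am assuming the release branches are passed in oldest first.

```python
def upstream_merges(release_branches, mainline="master"):
    """Pair each branch with the next newer line of development.

    release_branches must be ordered oldest to newest (an assumption
    of this sketch); the mainline is always the final merge target.
    """
    chain = list(release_branches) + [mainline]
    return list(zip(chain, chain[1:]))


def merge_commands(release_branches, mainline="master"):
    """Render the daily upstream merges as git commands (nothing is executed)."""
    commands = []
    for source, target in upstream_merges(release_branches, mainline):
        commands.append(f"git checkout {target}")
        commands.append(f"git merge {source}")
    return commands


for line in merge_commands(["Version1", "Version2"]):
    print(line)
```

The order matters: Version1 goes into Version2 first, so that the later merge of Version2 into 'master' carries both sets of fixes.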

Feature branches

Many teams create 'feature' branches for every new feature or bug fix. However, they don't delete the feature branch once the feature development is over. At the end of 'feature' development, the feature branch must be merged into 'master' and then deleted. It is best to use the 'git flow' plugin/workflow for working on feature branches.

A typical feature branch flow will be:
- create the feature branch from 'master' (so the 'parent branch' will be 'master')
- keep making changes and committing in the feature branch. It's OK to push the feature branch.
- every day, merge the 'parent branch' into the 'feature' branch. This way any changes made to the 'parent' branch by other developers are available to you, and any conflicting changes are detected early.
- once the feature is done, merge the 'feature branch' into the 'parent branch' (usually 'master'). Since you have been regularly merging 'master' into 'feature', you should see few, if any, conflicts when you merge 'feature' into 'master'. Now delete the feature branch.

In both scenarios the 'key' is the regular (preferably daily) merge, from 'release branch' to 'master' or from 'master' to the 'feature' branch.
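The feature branch flow can be spelled out as a list of plain git commands. This is a rough sketch under a few assumptions of mine: the `feature/` prefix, the placeholder commit message, and a local 'master' that is already up to date before each daily merge. The commands are only generated, not executed.

```python
def feature_branch_commands(feature, parent="master", sync_days=2):
    """Return the git commands for one feature branch's life cycle.

    'sync_days' is a hypothetical parameter: how many working days
    the feature takes, with one parent-to-feature merge per day.
    """
    branch = f"feature/{feature}"
    commands = [f"git checkout -b {branch} {parent}"]
    for _ in range(sync_days):
        # Daily routine: commit the work, then merge the parent into the feature.
        commands.append('git commit -am "work in progress"')
        commands.append(f"git merge {parent}")
    commands += [
        f"git checkout {parent}",
        f"git merge --no-ff {branch}",  # bring the finished feature into the parent
        f"git branch -d {branch}",      # the branch is merged, so delete it
    ]
    return commands


for line in feature_branch_commands("login", sync_days=1):
    print(line)
```

I used `--no-ff` here so the feature shows up as one merge commit in the parent's history; that is a matter of taste, and a plain `git merge` works too.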

Remember, it is OK to delete a branch

Remember, in version control there is rarely such a thing as a 'permanent delete'. When a branch is deleted, you will no longer see it in the 'branches list', but that does not mean all the history of that branch has to be lost. The recommended practice is to create a 'tag' from the branch and then delete the branch. This keeps the history intact, even for commits that were never merged anywhere else (in Git, such commits can eventually be garbage-collected once nothing refers to them). It also allows you to restore a branch from the tag if you need to do some critical bug fix in an older (now unsupported) release.
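Here is a minimal sketch of the 'tag, then delete' practice as git commands. The `archive/` tag prefix, the remote name `origin` and the function names are my assumptions; the commands are only generated, and they assume you are not currently on the branch being deleted.

```python
def archive_branch_commands(branch, tag_prefix="archive/", remote="origin"):
    """Return git commands that tag a branch tip and then delete the branch."""
    tag = f"{tag_prefix}{branch}"
    return [
        f"git tag {tag} {branch}",               # keep the history reachable via a tag
        f"git push {remote} {tag}",              # publish the tag
        f"git branch -D {branch}",               # delete the local branch
        f"git push {remote} --delete {branch}",  # delete the remote branch
    ]


def restore_branch_commands(branch, tag_prefix="archive/"):
    """Return the git command that recreates a branch from its archive tag."""
    return [f"git checkout -b {branch} {tag_prefix}{branch}"]


for line in archive_branch_commands("version7") + restore_branch_commands("version7"):
    print(line)
```

Tagging before deleting is the important part of the order: once the tag exists, the branch name is just a convenience that can be recreated at any time.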

Bottom line

Remember, 'version control' is a 'productivity tool' for the development team and NOT just a backup tool. Learn how to take advantage of the version control system and you will see a significant increase in the productivity of your team. Start by defining a policy for 'branch creation/branch deletion and daily merge'.

Sunday, September 25, 2016

SVNPlot Version 0.9.0 Released

Today I am releasing SVNPlot Version 0.9.0.

SVNPlot now works on Python 2.7.x and on Python > 3.5.x. The current release also contains many small bug fixes, especially related to Unicode handling.

You can download the installers from Bitbucket Download Page.

Saturday, February 13, 2010

svnplot - one year later

About one year back (Dec 2008) I changed jobs. Between the two jobs I had some free time, and I wrote the first version of svnplot during those 4-5 days. Then I released it as an 'open source' project on Google Code. Soon many people started using it, and I started getting bug reports filed on the project page. To me, this was an indication that people are really using this project.

I got bug reports from developers/scientists working in places like CERN and AMD.
One of the bug reports mentioned that "I'm using SVNPlot on over 100 of my users' repositories".
Svnplot was mentioned in discussions on the StatSVN forums.
StatSVN developers added a tag cloud of commonly used words in commit messages, inspired by the similar feature in svnplot. So, as mentioned by Benoit, "it is now a bilateral inspiration", since the initial features in svnplot were inspired by the excellent StatSVN project.

Many people contributed bug fixes and improvements to svnplot.

Chris Glasman added support for repository authentication.
Oscar Castaneda developed/contributed code to convert SVN logs to output files that can be used in CMU's ORA and Apache Agora, as part of Google Summer of Code 2009 (GSoC09). You can read the details of his contribution here.
kitpz2 contributed code for better pie-chart display of directory sizes.

I think the key advantage of svnplot is that it doesn't require a checked-out copy of the repository. Also, it is easy to hack.

So what's next?

I am now working on the next version of svnplot (0.6). The key new feature will be graphs generated on the client side with JavaScript and HTML canvas. This will reduce the dependency on matplotlib and make it easier for users to deploy. After checking a few JavaScript charting libraries like Flot and the jquery.Visualize plugin, I decided to use jqPlot. I am planning to release svnplot 0.6 in a few weeks' time.

Monday, April 13, 2009

Book recommendations for Software Developers

As I mentioned in my previous post on 'C++ Book Recommendations', I have now published a list of books for software developers. These are the books that helped me in developing my ideas about software development (irrespective of technology or programming language).

Check the list at "Book Recommendations for Software Developers"

Tuesday, March 24, 2009

Comparison of VSS, CVS and SVN

I have prepared a comparison of three commonly used version control systems: Visual SourceSafe (VSS), CVS and Subversion.

The weights and scores are based on my judgement. I think this type of weighted-score comparison may help you in convincing people (e.g. your project team, colleagues, senior management in your company) to use Subversion.

Total Score
Subversion : 251
CVS : 171
Visual SourceSafe(VSS) : 138

Check it out at Comparison of VSS, CVS and SVN
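For anyone who wants to build a similar comparison, the arithmetic of a weighted score is simple: multiply each criterion's score by its weight and add them up. The sketch below uses invented criteria, weights and scores for two placeholder tools. They are NOT the numbers from my comparison.

```python
def weighted_total(weights, scores):
    """Sum of weight * score over every criterion."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


# Hypothetical criteria and weights, for illustration only.
weights = {"atomic commits": 5, "branching and merging": 4, "tool support": 3}

# Hypothetical scores for two placeholder tools.
scores = {
    "Tool A": {"atomic commits": 5, "branching and merging": 4, "tool support": 3},
    "Tool B": {"atomic commits": 2, "branching and merging": 3, "tool support": 4},
}

ranking = sorted(
    ((name, weighted_total(weights, tool_scores)) for name, tool_scores in scores.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, total in ranking:
    print(f"{name} : {total}")
```

The value of the exercise is less in the final totals and more in making the weights explicit, so that people can argue about the criteria instead of the conclusion.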

Wednesday, January 28, 2009

Using Social Network Analysis with Version control data

As I mentioned in the last post, I am experimenting with using social network analysis (SNA) on version control data. Now, with the SVNPlot project, I have a way of converting the Subversion logs into an SQLite database. It allows me to query the data in many different ways.

I used the Rietveld repository data and did some preliminary analysis. I am not an expert on SNA, but the initial results look very interesting and promising. You can see the results on my website.

Update : Oscar Castaneda has added SNA data extraction to SVNPlot as part of a GSoC 2010 project. He has used these modifications to analyze Apache repositories and reported his findings at ApacheCon. Check the details at:

Life After Google Summer of Code by Oscar Castaneda
Oscar's GSoC 2010 proposal
Details on how to use his contributions in SVNPlot to extract the data.

Sunday, January 18, 2009

Social Network Analysis and Version Control

Recently I came across the concept of Social Network Analysis.

Given below is a small introduction to Social Network Analysis from the Orgnet site:

Social network analysis [SNA] is the mapping and measuring of relationships and flows between people, groups, organizations, computers, web sites, and other information/knowledge processing entities. The nodes in the network are the people and groups while the links show relationships or flows between the nodes. SNA provides both a visual and a mathematical analysis of human relationships.

The concept originated in the 'social sciences (sociology, anthropology)' to study relationships in communities. Today it is being used in fraud ring detection, identifying leaders in organizational networks, analyzing the resilience of computer networks and in various other ways. The various case studies on the Orgnet site can give you a good idea of the possibilities.

I started thinking about applying SNA to version control history, with files and authors as nodes. There is some research going on in this area in universities. The references below have a few links. A Google search for "data mining version control" will give you additional links.

With SVNPlot, I now have a way of converting Subversion logs into an SQLite database. Also, Python has some excellent libraries for network analysis. I am using NetworkX for analysis and Matplotlib for visualization. I think such analysis will be useful in:

Identifying the key developers and their specific areas in the project.
Identifying key files (files which are involved in code changes more frequently than others).
Identifying clusters of related files (across directories and modules).

I think the results will be useful to software development companies as well, especially for getting advance warning of problems and, on big projects, for identifying critical developers and planning the knowledge transfer when people move from one project to another. I see many exciting possibilities.
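To make the idea concrete, here is a minimal sketch of the author/file network built from commit data. It deliberately uses plain dictionaries instead of NetworkX so that it runs with nothing installed, and the commit history, author names and file names are all invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical commit history: (author, files changed in that commit).
commits = [
    ("alice", ["core/parser.py", "core/lexer.py"]),
    ("bob", ["core/parser.py", "ui/window.py"]),
    ("alice", ["core/parser.py", "docs/index.txt"]),
    ("carol", ["ui/window.py", "ui/dialog.py"]),
]


def build_network(commits):
    """Build author-file links, per-file change counts and file co-change counts."""
    files_by_author = defaultdict(set)
    authors_by_file = defaultdict(set)
    changes_by_file = defaultdict(int)
    changed_together = defaultdict(int)
    for author, files in commits:
        for path in files:
            files_by_author[author].add(path)
            authors_by_file[path].add(author)
            changes_by_file[path] += 1
        # Files changed in the same commit are candidates for a 'related files' cluster.
        for pair in combinations(sorted(set(files)), 2):
            changed_together[pair] += 1
    return files_by_author, authors_by_file, changes_by_file, changed_together


files_by_author, authors_by_file, changes_by_file, changed_together = build_network(commits)

# Key file: the one involved in the most commits.
key_file = max(changes_by_file, key=changes_by_file.get)
print(key_file, changes_by_file[key_file])

# Breadth of each developer: how many distinct files they touched.
for author in sorted(files_by_author):
    print(author, len(files_by_author[author]))
```

The same three structures map onto the three uses listed above: `files_by_author` for developers and their areas, `changes_by_file` for key files, and `changed_together` as the raw material for clusters of related files.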

The initial results are interesting. I will put up the charts/analysis etc on my site in a few days time.

References and Interesting Articles/Links

Introduction to Social Network Analysis (from orgnet.com)
Case studies of Social Network Analysis (from orgnet.com)
Wikipedia page on Social Networks (Check the history of Social Network Analysis)
Social Life of Routers (Computer networks as social networks)
Finding Go-to People and Subject Matter Experts in Organization
Predicting Defects using Network Analysis on Dependency Graphs – ICSE 2008
Mining Software Archives (a special issue of IEEE magazine)

Wednesday, January 14, 2009

SVNPlot - my first opensource project

During the 1-week gap between the two jobs, I finally started an open source project. The project is called SVNPlot. It is inspired by the excellent StatSVN Subversion statistics generation package.

SVNPlot generates graphs similar to StatSVN. The difference is in how the graphs are generated. SVNPlot generates these graphs in two steps. First it converts the Subversion logs into a 'sqlite3' database. Then it uses SQL queries to extract the data from the database and the excellent Matplotlib plotting library to plot the graphs.

I believe using SQL queries to extract the necessary data gives great flexibility in data extraction. Also, since sqlite3 is quite fast, it is possible to generate these graphs on demand.
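The two-step idea (logs into a database, then SQL to pull out the numbers) can be sketched with Python's built-in sqlite3 module. The table name, the columns and the sample rows below are a simplified, hypothetical schema for illustration and not the actual SVNPlot schema; the plotting step is left out.

```python
import sqlite3

# Step 1 (simulated): commit log entries stored in an in-memory SQLite database.
# The 'commits' table and its columns are a hypothetical schema for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commits (revno INTEGER PRIMARY KEY, author TEXT, commitdate TEXT)"
)
conn.executemany(
    "INSERT INTO commits (revno, author, commitdate) VALUES (?, ?, ?)",
    [
        (1, "alice", "2009-01-10"),
        (2, "bob", "2009-01-11"),
        (3, "alice", "2009-01-12"),
    ],
)


def commits_per_author(conn):
    """Step 2: use an SQL query to extract the data behind one graph."""
    query = (
        "SELECT author, COUNT(*) AS commit_count "
        "FROM commits GROUP BY author "
        "ORDER BY commit_count DESC, author"
    )
    return conn.execute(query).fetchall()


for author, count in commits_per_author(conn):
    print(author, count)
```

Each new graph then needs only a new query, which is where the flexibility comes from: the log conversion is done once, and the questions can change freely afterwards.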

As a tribute to Python and its author, Guido van Rossum, I have generated the graphs for the Rietveld project. Check it out here

SVNPlot is hosted on Google Code (http://code.google.com/p/svnplot/) and licensed under the New BSD license. For information on installation and usage, check the introduction page here

I am using Python to implement SVNPlot. I am a novice in Python, hence any suggestions for improvement are welcome.
