Rewriting the Wikimedia Deutschland fundraising

Last year we rewrote the Wikimedia Deutschland fundraising software. In this blog post I’ll give you an idea of what this software does, why we rewrote it and the outcome of this rewrite.

The application

Our fundraising software is a homegrown PHP application. Its primary functions are donations and membership applications. It supports multiple payment methods, needs to interact with payment providers, supports submission and listing of comments and exchanges data with another homegrown PHP application that does analysis, reporting and moderation.

The codebase was originally written in a procedural style, with most code residing directly in files (i.e., not even in a global function). There was very little design, and completely separate concerns such as presentation and data access were mixed together. As you can probably imagine, this code was highly complex and very hard to understand or change. There was unused code, broken code, features that might not be needed anymore, and mysterious parts whose purpose was unknown even to the guru who had maintained the codebase during the last few years. This mess, combined with the complete lack of a specification and unit tests, made development of new features extremely slow and error prone.

Why we rewrote

During the last year of the old application’s lifetime, we did refactor some parts and tried adding tests. In doing so, we figured that rewriting from scratch would be easier than trying to make incremental changes. We could start with a fresh design, add only the features we really need, and perhaps borrow some reusable code from the less horrible parts of the old application.

They did it by making the single worst strategic mistake that any software company can make: […] rewrite the code from scratch. —Joel Spolsky

We were aware of the risks involved with doing a rewrite of this nature and that often such rewrites fail. One big reason we did not decide against rewriting is that we had a time period of 9 months during which no new features needed to be developed. This meant we could freeze the old application and avoid parallel development, which would have resulted in some kind of feature race. Additionally, we set some constraints: we would only rewrite this application and leave the analysis and moderation application alone, and we would do a pure rewrite, avoiding the addition of new features into the new application until the rewrite was done.

How we got started

Since we had no specification, we tried visualizing the conceptual components of the old application, and then identified the “commands” they received from the outside world.

old-fun-code-diagram

Creating the new software

After some consideration, we decided to try out The Clean Architecture as a high level structure for the new application. For technical details on what we did and the lessons we learned, see Implementing the Clean Architecture.
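
To give a rough idea of how a “command” from the outside world can map onto a Clean Architecture use case, here is a minimal sketch. It is written in Python rather than PHP purely for illustration, and every name in it (AddDonationRequest, AddDonationUseCase, DonationRepository, InMemoryDonationRepository) is hypothetical and not taken from our codebase.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class AddDonationRequest:
    """Plain data describing the command coming from the outside world."""
    donor_name: str
    amount_in_cents: int
    payment_method: str


@dataclass(frozen=True)
class AddDonationResponse:
    success: bool
    message: str


class DonationRepository(Protocol):
    """Boundary owned by the use case; persistence details live elsewhere."""

    def store(self, request: AddDonationRequest) -> None: ...


class InMemoryDonationRepository:
    """Stand-in implementation (an assumption for this sketch), handy for tests."""

    def __init__(self) -> None:
        self.donations: list[AddDonationRequest] = []

    def store(self, request: AddDonationRequest) -> None:
        self.donations.append(request)


class AddDonationUseCase:
    """Application logic that knows nothing about HTTP, SQL or templates."""

    def __init__(self, repository: DonationRepository) -> None:
        self._repository = repository

    def add_donation(self, request: AddDonationRequest) -> AddDonationResponse:
        if request.amount_in_cents <= 0:
            return AddDonationResponse(False, "Amount must be positive")
        self._repository.store(request)
        return AddDonationResponse(True, "Donation stored")
```

The use case depends only on the repository boundary, so it can be exercised in tests with the in-memory stand-in and without any framework or database.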

The result

With a team of 3 people, we took about 8 months to finish the rewrite successfully. Our codebase is now clean and much, much easier to understand and work with. It took us over two man years to do this clean up, and presumably an even greater amount of time was wasted in dealing with the old application in the first place. This goes to show that the cost of not working towards technical excellence is very high.

We’re very happy with the result. For us, the team that wrote it, it’s easy to understand, and the same seems to be true for other people based on feedback we got from our colleagues in other teams. We have tests for pretty much all functionality, so we can refactor and add new functionality with confidence. So far we’ve encountered very few bugs, with most issues arising from us forgetting to add minor but important features to the new application, or misunderstanding what the behavior should be and then correctly implementing the wrong thing. This of course has more to do with the old codebase than with the new one. We now have a solid platform upon which we can quickly build new functionality or improve what we already have.

The new application is the first Wikimedia (Deutschland) application deployed on, and written in, PHP 7. Even though it was not an explicit goal of the rewrite, the new application has ended up with better performance than the old one, in part due to the use of PHP 7.

Near the end of the rewrite we got an external review performed by thePHPcc, during which Sebastian Bergmann, who you might know from PHPUnit fame, looked for code quality issues in the new codebase. The general result of that was a thumbs up, which we took the creative license to translate into this totally non-Sebastian approved image:

You can see our new application in action in production. I recommend you try it out by donating 🙂

Technical statistics

These are some statistics for fun. They were compiled after we did our rewrite, and were not used during development at all. As with most software metrics, they should be taken with a grain of salt.

In this visualization, each dot represents a single file. The size represents the Cyclomatic complexity while the color represents the Maintainability Index. The complexity is scored relative to the highest complexity in the project, which in the old application was 266 and in the new one is 30. This means that the red on the right (the new application) is a lot less problematic than the red on the left. (This visualization was created with PhpMetrics.)

fun-complexity

Global access in various Wikimedia codebases (lower is better). The rightmost is the old version of the fundraising application, and the one next to it is the new one. The new one has no global access whatsoever. LLOC stands for Logical Lines of Code. You can see the numbers in this public spreadsheet.

global-access-stats

Static method calls, often a big source of global state access, were omitted, since the tools used count many false positives (i.e. alternative constructors).

The differences between the projects can be made more apparent by visualizing them in another way. Here you have the number of lines per global access, represented on a logarithmic scale.

lloc-per-global

The following stats have been obtained using phploc, which counts namespace declarations and imports as LLOC. This means that for the new application some of the numbers are very slightly inflated.

  • Average class LLOC: 31 => 21
  • Average method LLOC: 4 => 3
  • Cyclomatic Complexity / LLOC : 0.39 => 0.10
  • Cyclomatic Complexity / Number of Methods: 2.67 => 1.32
  • Global functions: 58 => 0
  • Total LLOC: 5517 => 10187
  • Test LLOC: 979 => 5516
  • Production LLOC: 4538 => 4671
  • Classes: 105 => 366
  • Namespaces: 14 => 105

This is another visualization created with PhpMetrics that shows the dependencies between classes. Dependencies are static calls (including to the constructor), implementation and extension, and type hinting. The application’s top-level factory can be seen at the top right of the visualization.

fun-dependencies

Maps 4.0.0-RC1 released!

I’m happy to announce the first release candidate for Maps 4.0. Maps is a MediaWiki extension to work with and visualize geographical information. Maps 4.0 is the first major release of the extension since January 2014, and it brings a ton of “new” functionality.

First off, this blog post is about a release candidate, meant to gather feedback and not suitable for usage in production. The 4.0 release itself will be made one week from now if no issues are found.

Almost all features from the Semantic Maps extension got merged into Maps, with the notable omission of the form input, which now resides in Yaron Koren’s Page Forms extension. I realized that spreading out the functionality over both Maps and Semantic Maps was hindering development and making things more difficult for the users than needed. Hence Semantic Maps is now discontinued, with Maps containing the coordinate datatype, the map result formats for each mapping service, the KML export format and distance query support. All these features will automatically enable themselves when you have Semantic MediaWiki installed, and can be explicitly turned off with a new egMapsDisableSmwIntegration setting.

The other big change is that, after 7 years of no change, the default mapping service was changed from Google Maps to Leaflet. The reason for this alteration is that Google recently started requiring an API key to be obtained and specified for its maps to work on new websites. This would leave some users confused when they first installed the Maps extension and got a non-functioning map, even though the API key is mentioned in the installation instructions. Google Maps is of course still supported, and you can make it the default again on your wiki via the egMapsDefaultService setting.

Another noteworthy change is the addition of the egMapsDisableExtension setting, which allows for disabling the extension via configuration, even when it is installed. This has often been requested by those running wiki farms.

For a full list of changes, see the release notes. Also check out the new features in Maps 3.8, Maps 3.7 and Maps 3.6 if you have not done so yet.

Upgrading

Since this is a major release, please beware of the breaking changes, and that you might need to change configuration or things inside of your wiki. Update your mediawiki/maps version in composer.json to ~4.0@rc (or ~4.0 once the real release has happened) and run composer update.

Beware that as of Maps 3.6, you need MediaWiki 1.23 or later, and PHP 5.5 or later. If you choose to remain with an older version of PHP or MediaWiki, use Maps 3.5. Maps works with the latest stable versions of both MediaWiki and PHP, which are the versions I recommend you use.

Object Orientated Lua code

During the last few weeks I’ve been refactoring some horrible Lua code. This has been a ton of fun so far, and I learned many new things about Lua that I’d like to share.

Such Horrible Code

The code in question is that of a scripted Supreme Commander Forged Alliance Forever map called Final Rush Pro v4. Essentially all the code resides in a single Lua file slightly over 2500 lines long. It is entirely procedural, uses global state all over, contains plenty of copy pasted code and, unsurprisingly, does not have a single test. What’s more is that at least some of the code must have been written by people not even at home with procedural programming, as there are several instances where massive if-else blocks are used rather than loops.

Much Refactoring

The high level approach I took was to identify cohesive sets of code in the huge file and move them out into dedicated files. These dedicated files would then have their dependencies explicitly defined and could be cleaned up one by one. This graph shows the lines of code of the Lua file that acts as entry point over time:

final-rush-loc

The first example can be seen in moving the “PrebuildTents” code into its own file. This code, coincidentally, nicely illustrates the copy pasting and insane use of if-else over loops. One huge issue that remains when simply moving the code like that is that it remains in global/static scope. In other words, it’s not possible to use the code in the file with two different sets of local values. I did some searching on how to idiomatically achieve polymorphism in Lua.

One of the first things I read through was the Object Orientated Programming pages of the Programming in Lua book. Following that approach, I created the very first version of a simple wrapper around a list of player armies. As you can see there, I wrote tests for that code (more on those tests below). I was not too happy with that approach as it does not provide nice encapsulation. After looking at the code of some of the more prominent Lua tools I came across, I decided to go with a closure based approach instead. Initially I would define a this local table, which would then get functions bound to it. I switched to returning a map at the end of the closure, which makes it more clear what the public functions are, and leaves one less local variable to worry about. (The closure is assigned to newInstance rather than just returned due to the way the import mechanism of the framework works, which is different than Lua’s native require.)
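
The closure-based approach can be sketched as follows. The map itself is written in Lua; this illustration uses Python to show the same shape, and the names new_player_armies, count, contains and names_matching are hypothetical rather than taken from the repository.

```python
def new_player_armies(army_names):
    """Build a 'player armies' object whose state is hidden in the closure."""
    armies = tuple(army_names)  # private: only the functions below can reach it

    def count():
        return len(armies)

    def contains(name):
        return name in armies

    def names_matching(predicate):
        return [name for name in armies if predicate(name)]

    # Returning the map at the end makes the public functions explicit
    # and avoids a separate "this" variable.
    return {
        "count": count,
        "contains": contains,
        "names_matching": names_matching,
    }


# Two instances with different local values, which is exactly what
# code living in global/static scope cannot offer.
top_team = new_player_armies(["ARMY_1", "ARMY_2"])
bottom_team = new_player_armies(["ARMY_3"])
```

Because the state lives in the enclosing function's local variables, callers can only go through the returned functions, which gives the encapsulation that the table-based approach lacked.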

A downside of how the code in the files is organized is that you essentially need to read backwards when looking at how it is invoked. The public functions are listed at the very end of the file, with their dependencies defined before, and their dependencies defined before that. It would be nice to have the public functions more clearly visible at the top of the file, which is where you need to look for the constructor signature already.

Now that the creation of cohesive sets of code is mostly done, the entry point file is down to 44 Lines of Code. It defaults some options coming from the framework/game, and then invokes a high level module that sets up the various aspects of the game, which totals 70 Lines of Code.

My next steps are further cleanup of individual sets of code, with a focus on minimizing dependencies and separating concerns. For this I’m using practices, principles and patterns which are by and large language agnostic, so I won’t get into them here. You can find the code of the new version of the map in the Final Rush Pro 5 repository on GitHub, including many small refactoring commits in the git history.

Very Environment

My first modifications to the code were made with Notepad++ on Windows. While that editor provides syntax highlighting, there is no static code analysis or any of the essential things that require it, such as navigating to definitions. Hence I switched to my usual development environment, IntelliJ on Linux, using the IntelliJ Lua plugin.

While that switch to Linux made refactoring the code easier, it also prevented me from (manually) testing the code. This code, like many legacy balls of mud, binds very tightly to its framework, in this case the Supreme Commander game that only runs on Windows. While it’s often good to remove such binding, it’s not a trivial task, and not something I’d want to attempt without a fast feedback cycle.

The lack of fast feedback drove me to find a Lua testing tool to use. Several are listed on the lua-users wiki. After checking the project health of several tools, I decided to go with Busted, which I installed via LuaRocks. I then proceeded to create a wrapper for the list of players in the game (to replace code that was not only crappy but also incorrect) using Test Driven Development, resulting in a nice spec for the wrapper.

Unfortunately the same approach would not work for cleaning up most of the other code. The framework binding was just too tight, and in a lot of cases, contrary to the typical scenarios I’m used to (which are not games), such tight binding is perhaps simply the best that can be done. Hence I switched back to Windows.

On Windows I installed IntelliJ with the Lua plugin, TortoiseGit, and Busted. The latter was quite a hurdle, since my Windows administration skills are not exactly stellar. For Busted I needed to install Lua (ya really), LuaRocks and the MinGW compiler. Being able to run the tests in the IDE’s terminal was worth it though.

Wow Release

Version 5 of the map has now been released, see the release post for details on the new features.

Clean Architecture diagrams

I’m happy to release a few Clean Architecture related diagrams into the public domain (CC0 1.0).

These diagrams were created at Wikimedia Deutschland by Jan Dittrich, Charlie Kritschmar and myself for an upcoming presentation I’m doing on the Clean Architecture. There are plenty of diagrams available already if you include Onion Architecture and Hexagonal, which have essentially the same structure, though none I’ve found so far have a permissive license. Furthermore, I’m not so happy with the wording and structure of a lot of these. In particular, some bite off more than they can chew with the “dependencies pointing inward rule”, glossing over important restrictions which end up not being visualized at all.

These images are SVGs. Click them to go to Wikimedia Commons where you can download them.

Diagrams: Clean Architecture; Clean Architecture + Bounded Context; Clean Architecture + Bounded Contexts; Clean Architecture + Bounded Contexts

Maps 3.8 for MediaWiki released

I’m happy to announce the immediate availability of Maps 3.8. This feature release brings several enhancements and new features.

  • Added Leaflet marker clustering (by Peter Grassberger)
    • markercluster: Enables clustering, multiple markers are merged into one marker.
    • clustermaxzoom: The maximum zoom level where clusters may exist.
    • clusterzoomonclick: Whether clicking on a cluster zooms into it.
    • clustermaxradius: The maximum radius that a cluster will cover.
    • clusterspiderfy: At the lowest zoom level markers are separated so you can see all.
  • Added Leaflet fullscreen control (by Peter Grassberger)
  • Added OSM Nominatim Geocoder (by Peter Grassberger)
  • Upgraded Leaflet library to its latest version (1.0.0-r3) (by Peter Grassberger)
  • Made removal of marker clusters more robust. (by Peter Grassberger)
  • Unified system messages for several services (by Karsten Hoffmeyer)

Leaflet marker clusters

Google Maps API key

Due to changes to Google Maps, an API key now needs to be set. Upgrading to the latest version of Maps will not break the maps on your wiki in any case, as the change really is on Google’s end. If they are still working, you can keep running an older version of Maps. Of course it’s safer to upgrade and set the API key anyway. In case you have a new wiki or the maps broke for some reason, you will need to get Maps 3.8 or later and set the API key. See the installation configuration instructions for more information.

  • Added Google Maps API key egMapsGMaps3ApiKey setting (by Peter Grassberger)
  • Added Google Maps API version number egMapsGMaps3ApiVersion setting (by Peter Grassberger)

Upgrading

Since this is a feature release, there are no breaking changes, and you can simply run composer update, or replace the old files with the new ones.

Beware that as of Maps 3.6, you need MediaWiki 1.23 or later, and PHP 5.5 or later. If you choose to remain with an older version of PHP or MediaWiki, use Maps 3.5. Maps works with the latest stable versions of both MediaWiki and PHP, which are the versions I recommend you use.

Notes: Implementing DDD, chapter 2

Notes from Implementing Domain Driven Design, chapter 2: Domains, Subdomains and Bounded Contexts (p58 and later only)

  • User interface and service orientated endpoints are within the context boundary
  • Domain concepts in the UI form the Smart UI Anti-Pattern
  • A database schema is part of the context if it was created for it and not influenced from the outside
  • Contexts should not be used to divide developer responsibilities; modules are a more suitable tactical approach
  • A bounded context has one team that is responsible for it (while teams can be responsible for multiple bounded contexts)
  • Access and identity is its own context and should not be visible at all in the domain of another context. The application services / use cases in the other context are responsible for interacting with the access and identity generic subdomain
  • Context Maps are supposedly real cool

Maps 3.7 for MediaWiki released

I’m happy to announce the immediate availability of Maps 3.7. This feature release brings some minor enhancements.

  • Added rotate control support for Google Maps (by Peter Grassberger)
  • Changed coordinate display on OpenLayers maps from long-lat to lat-long (by Peter Grassberger)
  • Upgraded Google marker cluster library to its latest version (2.1.2) (by Peter Grassberger)
  • Upgraded Leaflet library to its latest version (0.7.7) (by Peter Grassberger)
  • Added missing system messages (by Karsten Hoffmeyer)
  • Internal code enhancements (by Peter Grassberger)
  • Removed broken custom map layer functionality. You no longer need to run update.php for full installation.
  • Translation updates by TranslateWiki

Upgrading

Since this is a feature release, there are no breaking changes, and you can simply run composer update, or replace the old files with the new ones.

Beware that as of Maps 3.6, you need MediaWiki 1.23 or later, and PHP 5.5 or later. If you choose to remain with an older version of PHP or MediaWiki, use Maps 3.5. Maps works with the latest stable versions of both MediaWiki and PHP, which are the versions I recommend you use.

PHP Unconference Europe 2016

Last week I attended the 2016 edition of the PHP Unconference Europe, taking place in Palma de Mallorca. This post contains my notes from various conference sessions. Be warned, some of them are quite rough.

Overall impression

Before getting to the notes, I’d like to explain the setup of the unconference and my general impression.

The unconference is two days long, not counting associated social events before and afterwards. The first day started with people discussing in small groups which sessions they would like to have, either by leading them themselves, or just wanting to attend. These session ideas were written down and put on papers on the wall. We then went through them one by one, with someone explaining the idea behind each session, and one or more presenters / hosts being chosen. The final step of the process was to vote on the sessions. For this, each person got two “sticky dots” (what are those things called anyway?), which they could either both put onto a single session, or split and vote on two sessions.

On each day we had 4 such sessions, with long breaks in between, to promote interaction between the attendees.

Onto my notes for individual sessions:

How we analyze your code

Analysis and metrics can be used for tracking progress and for analyzing the current state. Talk focuses on current state.

  • Which code is important
  • Probably buggy code
  • Badly tested code
  • Untested code

Finding the core (kore?): code rank (like Google page rank): importance flows to classes that are depended upon (fan-in). Qafoo Quality Analyzer. Reverse code rank: classes that depend on lots of other classes (fan-out)
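
To make the code rank idea concrete, here is a minimal PageRank-style sketch over a class dependency graph. It is not how Qafoo Quality Analyzer implements it (the talk did not go into that); the damping factor of 0.85, the iteration count and the example class names are all assumptions.

```python
def code_rank(dependencies, damping=0.85, iterations=50):
    """Rank classes by importance; dependencies maps a class to the classes it uses."""
    classes = set(dependencies)
    for targets in dependencies.values():
        classes.update(targets)
    n = len(classes)
    rank = {cls: 1.0 / n for cls in classes}

    for _ in range(iterations):
        new_rank = {cls: (1.0 - damping) / n for cls in classes}
        for cls in classes:
            targets = dependencies.get(cls, ())
            if targets:
                # Importance flows from a class to everything it depends on.
                share = damping * rank[cls] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A class without dependencies spreads its rank evenly.
                share = damping * rank[cls] / n
                for other in classes:
                    new_rank[other] += share
        rank = new_rank
    return rank


def reverse_graph(dependencies):
    """Flip every edge, so that code_rank() yields the reverse code rank."""
    reversed_deps = {}
    for source, targets in dependencies.items():
        for target in targets:
            reversed_deps.setdefault(target, []).append(source)
    return reversed_deps


# Hypothetical example graph.
deps = {
    "Controller": ["UseCase"],
    "UseCase": ["Donation", "Repository"],
    "Repository": ["Donation"],
}
ranks = code_rank(deps)
reverse_ranks = code_rank(reverse_graph(deps))
```

In this example the most depended-upon class (Donation) ends up with the highest code rank, while the class that pulls in everything else (Controller) tops the reverse code rank.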

Where do we expect bugs? Typically where code is hard to understand. We can look at method complexity: cyclomatic complexity, NPath complexity. Line Coverage exists, Path Coverage is being worked upon. Parameter Value Coverage. CRAP.
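
CRAP (Change Risk Anti-Patterns) combines the two ingredients mentioned here, complexity and coverage, into one score per method: CRAP(m) = comp(m)^2 * (1 - cov(m))^3 + comp(m). A minimal sketch of that formula, with coverage expressed as a fraction between 0 and 1 (the function name crap_score is my own):

```python
def crap_score(cyclomatic_complexity: int, coverage: float) -> float:
    """CRAP score of a method; coverage is a fraction between 0.0 and 1.0."""
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0.0 and 1.0")
    uncovered = 1.0 - coverage
    return cyclomatic_complexity ** 2 * uncovered ** 3 + cyclomatic_complexity


# A fully covered method scores just its complexity,
# an untested one scores complexity squared plus complexity.
fully_tested = crap_score(5, 1.0)
untested = crap_score(5, 0.0)
half_tested = crap_score(10, 0.5)
```

The cubic term means that coverage pays off most for complex methods: simple code stays low on the list even without tests, while complex untested code shoots to the top.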

Excessive coupling is bad. Incoming and outgoing dependencies. Different from code rank in that only direct dependencies are counted. Things that are depended on a lot should be stable and well tested (essentially the Stable Dependencies Principle).
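
Counting direct incoming and outgoing dependencies can be sketched as below. The instability value I = fan-out / (fan-in + fan-out) is the metric usually paired with the Stable Dependencies Principle; the function name and the example class names are hypothetical.

```python
from collections import defaultdict


def coupling_metrics(dependencies):
    """Return {class: (fan_in, fan_out, instability)}, counting direct dependencies only.

    dependencies maps a class name to the set of classes it uses directly.
    """
    fan_in = defaultdict(int)
    classes = set(dependencies)
    for targets in dependencies.values():
        for target in targets:
            classes.add(target)
            fan_in[target] += 1

    metrics = {}
    for cls in classes:
        incoming = fan_in[cls]
        outgoing = len(dependencies.get(cls, ()))
        total = incoming + outgoing
        # 0.0 means maximally stable (only depended upon), 1.0 maximally unstable.
        instability = outgoing / total if total else 0.0
        metrics[cls] = (incoming, outgoing, instability)
    return metrics


# Hypothetical example graph.
metrics = coupling_metrics({
    "Controller": {"UseCase"},
    "UseCase": {"Donation", "Repository"},
    "Repository": {"Donation"},
})
```

Classes with a high fan-in and an instability near zero are the ones that should be stable and well tested, since many other classes break when they change.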

Qafoo Quality Analyzer can be used to find dependencies across layers when they are in different directories. Very limited at present.

When finding highly complex code, don’t immediately assume it is bad. There are valid reasons for high complexity. Metrics can also be tricked.

The evolution of web application architecture

How systems interact with each other. Starting with simple architecture, looking at problems that arise as more visitors arrive, and then seeing how we can deal with those problems.

Users -> Single web app server -> DB

Next step: Multiple app servers + load balancers (round robin + session caching server)

Launch of shopping system resulted in app going down, as master db got too many writes, due to logging “cache was hit” in it.

Different ways of caching: entities, collections, full pages. Cache invalidation is hard, lots of dependencies even in simple domains.

When too many writes: sharding (split data across multiple nodes), vertical (by columns) or horizontal (by rows). Loss of referential integrity checking.
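
Horizontal sharding can be sketched as picking a node from a stable hash of the row key. This is a toy in-memory illustration, not something shown in the talk; shard_for and ShardedStore are hypothetical names.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a row key to a shard index using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Toy horizontal sharding: each shard holds a subset of the rows."""

    def __init__(self, shard_count: int) -> None:
        self.shards = [{} for _ in range(shard_count)]

    def put(self, key: str, row: dict) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = row

    def get(self, key: str) -> dict:
        return self.shards[shard_for(key, len(self.shards))][key]


store = ShardedStore(shard_count=4)
for user_id in range(100):
    store.put(f"user-{user_id}", {"id": user_id})
```

Nothing in this setup can check that a row on one shard references an existing row on another, which is the loss of referential integrity checking mentioned above.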

Complexity with relational database systems -> NoSQL: sharding, multi master, cross-shard queries. Usually no SQL or referential integrity, though those features are already lost when using sharding.

Combination of multiple persistence systems: problems with synchronization. Transactions are slow. Embrace eventual consistency. Same updating strategies can be used for caches.

Business people often know SQL, yet not NoSQL query languages.

Queues can be used to pass data asynchronously to multiple consumers. Following data flow of an action can be tricky. Data consistency is still a thing.

Microservices: separation of concerns on service and team level. Can simplify via optimal tech stack per service. Makes things more complicated: need automated deployment, orchestration, eventual consistency, failure handling.

Boring technology often works best, especially at the beginning of a project. Start with the simplest solution that works. Take team skills into account.

How to fuck up projects

Before the project

  • Buzzword first design
  • Mismatching expectations: huge customer expectations, no budget
  • Fuzzy ambitious vocabulary, directly into the contract (including made up words)
  • Meetings, bad mood, no eye contact
  • No decisions (no decision making process -> no managers -> saves money)
  • Customer Driven Development: customer makes decisions
  • Decide on environment: tools, mouse/touchpad, 1 big monitor or 2 small ones, JIRA, etc
  • Estimates: should be done by management

During the project

  • Avoid ALL communication, especially with the customer
  • If communication cannot be avoided: mix channels
  • Responsibility: use group chats and use “you” instead of specific names (cc everyone in mails)
  • Avoid issue trackers, this is what email and Facebook are for
  • If you cannot avoid issue trackers: use multiple or have one ticket with 2000 notes
  • Use ALL the programming languages, including PHP-COBOL
  • Do YOUR job, but nothing more
  • Only pressure makes diamonds: coding on the weekend
  • No breaks so people don’t lose focus
  • Collect metrics: Hours in office, LOC, emails answered, tickets closed

Completing the project

  • 3/4 projects fail: we can’t do anything about it
  • New features? Outsource
  • Ignore the client when they ask about the completed project
  • Change the team often, fire people on a daily basis
  • Rotate the customer’s contact person

Bonus

  • No VCS. FTP works. Live editing on production is even better
  • http://whatthecommit.com/
  • Encoding: emojis in function names, umlaut in file names. Mix encodings, also in MySQL
  • Agile is just guidelines, change goals during sprints often
  • Help others fuck up: release it as open source
  • git blame-someone-else

The future of PHP

This session started with some words from the moderator, who mainly talked about performance, portability and future adoption of, or moving away from, PHP.

  • PHP now fast enough to use many PHP libraries
  • PHP now better for long running tasks (though still no 64 bit for Windows)
  • PHP now has an Abstract Syntax Tree

The discussion that followed was primarily about the future of PHP in terms of adoption. The two languages most mentioned as competitors were Javascript and Java.

Java because it is very hard to get PHP into big enterprise, where people tend to cling to Java. A point made several times about this is that such choices have very little to do with technical sensibility, and are instead influenced by the education system, languages already used, newness / hipness and the HiPPO. Most people also don’t have the relevant information to make an informed choice, and do not make the effort to look up this information as they already have a preference.

Javascript is a competitor because web based projects, be it with a backend in PHP or in another language, need more and more Javascript, with no real alternatives. It was mentioned several times that not having alternatives is bad. Having multiple JS interpreters is cool, JS being the only choice for browser programming is not.

Introduction to sensible load testing

In this talk the speaker explained why it is important to do realistic load testing, and how to avoid common pitfalls. He explained how JMeter can be used to simulate real user behavior during peak load times. Preliminary slides link.

Domain Objects: not just for Domain Driven Design

This session was hard to choose, as it coincided with “What to look for in a developer when hiring, and how to test it”, which I also wanted to attend.

The Domain Objects session introduced what Value Objects are, and why they are better than long parameter lists and passing around values that might be invalid. While sensible enough, all very basic, with unfortunately no information for me whatsoever. I’m thinking it’d have been better to do this as a discussion, partly because the speaker was clearly very inexperienced, and gave most of the talk with his arms crossed in front of him. (Speaker, if you are reading this, please don’t be discouraged, practice makes perfect.)
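
For readers who have not come across Value Objects before, here is a minimal sketch of the idea. It uses Python rather than PHP for illustration, and EmailAddress and subscribe are hypothetical examples of mine, not something shown in the session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmailAddress:
    """Value object: immutable, compared by value, always valid once constructed."""
    value: str

    def __post_init__(self) -> None:
        # Deliberately simplistic check; real validation rules are out of scope here.
        local, separator, domain = self.value.partition("@")
        if not (local and separator and domain):
            raise ValueError(f"Not a valid email address: {self.value!r}")


def subscribe(address: EmailAddress) -> str:
    # No validation needed here: an EmailAddress cannot exist in an invalid state.
    return f"Subscribed {address.value}"
```

Since validation happens once in the constructor, every function that receives an EmailAddress can rely on it being valid, instead of re-checking a raw string or passing around values that might be invalid.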

Performance monitoring

I was only in the second half of this session, during which two performance monitoring tools were presented: Tideways by Qafoo and Instana.

Maps 3.6 for MediaWiki released

I’m happy to announce the immediate availability of Maps 3.6. This feature release brings marker clustering enhancements and a number of fixes.

These parameters were added to the display_map parser function, to allow for greater control over marker clustering. They are only supported together with Google Maps.

  • clustergridsize: The grid size of a cluster in pixels
  • clustermaxzoom: The maximum zoom level that a marker can be part of a cluster
  • clusterzoomonclick: If the default behavior of clicking on a cluster is to zoom in on it
  • clusteraveragecenter: If the cluster location should be the average of all its markers
  • clusterminsize: The minimum number of markers required to form a cluster

Bugfixes

  • Fixed missing marker cluster images for Google Maps
  • Fixed duplicate markers in OpenLayers maps
  • Fixed URL support in the icon parameter

Credits

Many thanks to Peter Grassberger, who made the listed fixes and added the new clustering parameters. Thanks also go to Karsten Hoffmeyer for miscellaneous support and to TranslateWiki for providing translations.

Upgrading

Since this is a feature release, there are no breaking changes, and you can simply run composer update, or replace the old files with the new ones.

There are, however, compatibility changes to keep in mind. As of this version, Maps requires PHP 5.5 or later and MediaWiki 1.23 or later. composer update will not give you a version of Maps incompatible with your version of PHP, though it is presently not checking your MediaWiki version. Fun fact: this is the first bump in minimum requirements since the release of Maps 2.0, way back in 2012.

Is Pair Programming worth it?

Every now and then I get asked how to convince one’s team members that Pair Programming is worthwhile. Often the person asking, or people I did pair programming with, while obviously enthusiastic about the practice, and willing to give it plenty of chance, are themselves not really convinced that it actually is worth the time. In this short post I share how I look at it, in the hope it is useful to you personally, and in convincing others.

Extreme Programming

The cost of Pair Programming

Suppose you are new to the practice and doing it very badly. You have one person hogging the keyboard and not sharing their thoughts, with the other paying more attention to twitter than to the development work. In this case you basically spend twice the time for the same output. In other words, the development cost is multiplied by two.

Personally I find it tempting to think about Pair Programming as doubling the cost, even though I know better. How much more total developer time you need is unclear, and really depends on the task. The more complex the task, the less overhead Pair Programming will cause. What is clear, is that when your execution of the practice is not pathologically bad, and when the task is more complicated than something you could trivially automate, the cost multiplication is well below two. An article on c2 wiki suggests 10-15% more total developer time, with the time elapsed being about 55% compared to solo development.
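
To put numbers on this, the sketch below turns an overhead percentage into total developer time and elapsed time for a pair. The 10-15% range is the one cited above; the function name and the 100 hour example task are assumptions for illustration.

```python
def pairing_estimate(solo_hours: float, overhead: float = 0.15, developers: int = 2):
    """Return (total developer hours, elapsed hours) when pairing on a task.

    overhead is the extra total developer time compared to solo work,
    e.g. 0.15 for 15% more.
    """
    total_hours = solo_hours * (1.0 + overhead)
    elapsed_hours = total_hours / developers
    return total_hours, elapsed_hours


# A task that takes one developer 100 hours:
low_total, low_elapsed = pairing_estimate(100, overhead=0.10)
high_total, high_elapsed = pairing_estimate(100, overhead=0.15)
```

With 10% overhead the pair spends about 110 developer hours in about 55 elapsed hours, which matches the 55% elapsed time figure; with 15% overhead it is about 115 developer hours in 57.5 elapsed hours. Either way that is far from the 200 developer hours that “doubling the cost” would suggest.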

If these are all the cost implications you think about with regards to Pair Programming, it’s easy to see how you will have a hard time to justify it. Let’s look at what makes the practice actually worthwhile.

The cost of not Pair Programming

If you do Pair Programming, you do not need a dedicated code review step. This is because Pair Programming is a continuous application of review. Not only do you not have to put time into a dedicated review step, the quality of the review goes up, as communication is much easier. The involved feedback loops are shortened. With dedicated review, the reviewer will often have a hard time understanding all the relevant context and intent. Questions get asked and issues get pointed out. Some time later the author of the change, who in the meantime has been working on something else, needs to get back to the reviewer, presumably forcing two mental context switches. When you are used to such a process, it is easy to become blind to this kind of waste unless you pay deliberate attention to it. Pair Programming eliminates this waste.

The shorter feedback loops and enhanced communication also help you with design questions. You have a fellow developer sitting next to you who you can bounce ideas off and they are even up to speed with what you are doing. How great is that? Pair Programming can be a lot of fun.

The above two points make Pair Programming pay more than for itself in my opinion, though it offers a number of additional benefits. You gain true collective ownership, and build shared commitment. There is knowledge transfer and Pair Programming is an excellent way of onboarding new developers. You gain higher quality, both internal in the form of better design, and external, in the form of fewer defects. While those benefits are easy to state, they are by no means insignificant, and deserve thorough consideration.

Give Pair Programming a try

As with most practices there is a reasonable learning curve, which will slow you down at first. Such investments are needed to become a better programmer and contribute more to your team.

Many programmers are more introverted and find the notion of having to pair program rather daunting. My advice when starting is to begin with short sessions. Find a colleague you get along with reasonably well and sit down together for an hour. Don’t focus too much on how much you got done. Rather than setting some performance goal with an arbitrary deadline, focus on creating a habit such as doing one hour of Pair Programming every two days. You will automatically get better at it over time.

If you are looking for instructions on how to Pair Program, there is plenty of google-able material out there. You can start by reading the Wikipedia page. I recommend paying particular attention to the listed non-performance indicators. There are also many videos, be it conference talks, or dedicated explanations of the basics.

Such disclaimer

I should note that while I have some experience with Pair Programming, I am very much a novice compared to those who have done it full time for multiple years, and can only guess at the sage incantations these mythical creatures would send your way.

Extreme Pair Programming