Readable Functions: Guard Clause

Guard Clauses are one of my favorite little tricks that allow simplifying code. A guard clauses is an if statement with a return in it. Consider the following code:

Using a Guard Clause we can simplify it to:

The if statement in this simplified code is a Guard Clause. You can have multiple Guard Clauses in a function.

The simplification removes ALL the state in the function, including the nasty and completely not needed mutation in the first form of the code. You can read the new code sequentially, and the code after the Guard Clause is not polluted by extra complexity arising from the special case.

This post is part of the Readable Functions series which contains tricks like this one and two general principles: Minimize State and Do One Thing.

My year in books

In 2018 I read more books than ever before. Here are some of my recommendations and thoughts.

The 34 books I read in 2018

I read 21 fiction books and 13 non-fiction ones. The fiction books where all either hard science fiction or pleb science fiction while the non-fiction ones where mostly about psychology or finance.

Top 2018 fiction book

The price for 2018 best fiction book that I read goes to Children of Time by Adrian Tchaikovsky. This book had me so gripped that I spend 5h per day reading it till it was finished.

Children of Time is a wonderful hard Science Fiction novel. It deals with the rise and fall of civilizations and takes a delightful evolutionary angle in the story. The story has a lot in common with A Deepness in The Sky by Vernor Vinge, one of my all time favorite books.

It also reminded me of Pandora’s Star by Peter F Hamilton, in particular how the Primes civilization (in Pandora’s Star) rises from the unthinking depths to something waging interstellar war. Beyond that these two novels are very different in tone and focus.

Top 2018 non-fiction book

I was going to pick The Righteous Mind by Jonathan Haidt though then realized I beat myself to it by including it in my 2017 book summary.

With The Righteous Mind being disqualified, first place goes to 12 Rules for Life: An Antidote to Chaos by Jordan Peterson.

This book contains a lot of useful information on personal growth, meaning, ethics and applied psychology. Though who am I kidding, the reason it gets first place is because rule 12 is “Pet a cat when you encounter one on the street”.

Second place goes to Homo Deus: A Brief History of Tomorrow. This book is about the human condition, the challenges we have faced as a species and, most important of all, what the outline of those challenges that are up ahead could look like.

While I really liked both 12 Rules and Homo Deus, I was not a fan of the final chapters(s) of either book. Since I read Thinking Fast and Slow I recognized the tendency for Duration Neglect via Peak-end Averaging. Knowing how your own mind tends to go off the rails is so useful.

Culture 2.0

13 of the 21 novels I read are part of the same series, so it deserves a mention: The Polity by Neal Asher. It has more than a few similarities with The Culture universe by Iain M Banks.

Th novel Prador Moon tells the story of how the Culture comes into contact with the war-like Idirans. Caught off balance, the ship minds quickly ramp up military production while fighting a retreating and delaying action. Wait… I meant Polity, Prador and ship AIs. Not sure how I got those mixed up.

That said the Polity novels lack certain bits of awesome common to the Culture novels, such as the definitely superior Culture ship names and antics. On the flip side, the Prador are much more build out than any alien race in the Culture novels, and so is the Jain threat.

I particularly enjoyed the later (more recent) novels in the series. Recommended for people that like stories of increasing scale in a well build out universe.

Not books

Much of what I read is in the form of blog posts and articles. And then there are also many educational videos I watch. Here are some resources that I spend a lot of time consuming or that I came across in 2018 and piqued my interest.

Such Stats

In 2018 I for the first time read over 1000 pages per month. Such achievement!

Pages read per year
Pages read per month since late 2012
I’m mostly a sucker for recently published books

Readable Functions: Do One Thing

Several tricks and heuristics that I apply to write easy to understand functions keep coming up when I look at other people their code. This post outlines the second key principle. The first principle is Minimize State. Following posts will contain specific tips and tricks, often building on these two principles.

Do One Thing

Often functions become less readable because they are doing multiple things. If your function Foo needs to do tasks A, B and C, then create a function for each of the tasks and call those from Foo.

Having small functions is OK. Having just 3 calls to other functions in your function is OK. Don’t be afraid of “too simple” code or of this style making things harder to follow (it does not (unless you are splitting things up stupidly)). And don’t be afraid of performance. In most programs the amount of function calls have effectively 0 impact on performance. There are exceptions, though unless you know you are dealing with one of these, don’t mash things together for performance reasons.

One Level of Abstraction

Doing one thing includes dealing with a single level of abstraction. For instance, suppose you have a function in which in some cases a message needs to be logged to the file system.

Here the details of how the error message is logged are on a lower level of abstraction than the rest of the function. This makes it harder to tell what the function is actually doing, because details that don’t matter on the higher level of abstraction clutter the high level logic. These details should be in their own function.

See also

PHP Typed Properties

Lately there has been a lot of hype around the typed properties that PHP 7.4 will bring. In this post I outline why typed properties are not as big of a game changer as some people seem to think and how they can lead to shitty code. I start by a short introduction to what typed properties are.

What Are Typed Properties

As of version 7.3, PHP supports types for function parameters and for function return values. Over the latest years many additions to PHP types where made, such as primitive (scalar) types like string and int (PHP 7.0), return types (PHP 7.0), nullable types (PHP 7.1) and parameter type widening (PHP 7.2). The introduction of typed properties (PHP 7.4) is thus a natural progression.

Typed properties work as follows:

You can do in two simple lines what takes a lot more boilerplate in PHP 7.3 or earlier. In these versions, if you want to have type safety, you need a getter and setter for each property.

Not only is it a lot more work to write all of these getters and setters, it is also easy to make mistakes when not automatically generating the code with some tool.

These advantages are what the hype is all about. People are saying it will save us from writing so much code. I think not, and I am afraid of the type of code those people will write using typed properties.

Applicability of Typed Properties

Let’s look at some of different types of classes we have in a typical well designed OO codebase.

Services are classes that allow doing something. Loggers are services, Repositories are services and LolcatPrinters are services. Services often need collaborators, which get injected via their constructor and stored in private fields. These collaborators are not visible from the outside. While services might have additional state, they normally do not have getters or setters. Typed properties thus do not save us from writing code when creating services and the added type safety they provide is negligible.

Entities (DDD term) encapsulate both data and behavior. Normally their constructors take a bunch of values, typically in the form of Value Objects. The methods on entities provide ways to manipulate these values via actions that make sense in the domain language. There might be some getters, though setters are rare. Having getters and setters for most of the values in your entities is an anti-pattern. Again typed properties do not save us from writing code in most cases.

Value Objects (DDD term) are immutable. This means you can have getters but not setters. Once again typed properties are of no real help. What would be really helpful however is a first-class Value Object construct part of the PHP language.

Typed properties are only useful when you have public mutable state with no encapsulation. (And in some cases where you assign to private fields after doing complicated things.) If you design your code well, you will have very little code that matches all of these criteria.

Going to The Dark Side

By throwing immutability and encapsulation out of the window, you can often condense code using typed properties. This standard Value Object …

… becomes the much shorter

The same goes for Services and Entities: by giving up on encapsulation and immutability, you gain the ability to not write a few lines of simple code.

This trade-off might actually make sense if you are working on a small codebase on your own or with few people. It can also make sense if you create a throw away prototype that you then actually throw away. For codebases that are not small and are worked on by several people writing a few simple getters is a low price to pay for the advantages that encapsulation and immutability provide.

Conclusion

Typed properties marginally help with type safety and in some rare cases can help reduce boilerplate code. In most cases typed properties do not reduce the amount of code needed unless you throw the valuable properties of immutability and encapsulation out of the window. Due to the hype I expect many junior programmers to do exactly that.

See Also

Readable Functions: Minimize State

Several tricks and heuristics that I apply to write easy to understand functions keep coming up when I look at other peoples code. In this post I share the first of two key principles to writing readable functions. Follow up posts will contain the second key principle and specific tricks, often building on these general principles.

What makes functional programming so powerful? Why do developers that have mastered it say it makes them so much more productive? What amazing features or capabilities does the functional paradigm provide to enable this enhanced productivity? The answer is not what you might expect if you never looked into functional programming. The power of the functional paradigm does not come from new functionality, it comes from restricting something we are all familiar with: mutable state. By minimizing or altogether avoiding mutable state, functional programs skip a great source of complexity, thus becoming easier to understand and work with.

Minimize Mutability

If you are doing Object Orientated Programming you are hopefully aware of the drawbacks of having mutable objects. Similar drawbacks apply to mutable state within function scope, even if those functions are part of a procedural program. Consider the below PHP code snippet:

This function is needlessly complex because of mutable state. The variable $thing is in scope in the entire function and it gets modified. Thus to understand the function you need to keep track of the value that was assigned and how that value might get modified/overridden. This mental overhead can easily be avoided by using what is called a Guard Clause:

This code snippet is easier to understand because there is no state. The less state, the less things you need to remember while simulating the function in your head. Even though the logic in these code snippets is trivial, you can already notice how the Accidental Complexity created by the mutable state makes understanding the code take more time and effort. It pays to write your functions in a functional manner even if you are not doing functional programming.

Minimize Scope

While mutable state is particularly harmful, non-mutable state also comes with a cost. What is the return value of this function?

It is a lot easier to tell what the return value is when refactored as follows:

To understand the return value you need to know where the last assignment to $bar happened. In the first snippet you, for no reason at all, need to scan up all the way to the first lines of the function. You can avoid this by minimizing the scope of $bar. This is especially important if, like in PHP, you cannot declare function scope values as constants. In the first snippet you likely spotted that $bar = 2 before you went through the irrelevant details that follow. If instead the code had been const bar = 2 like you can do in JavaScript, you would not have needed to make that effort.

Conclusion

With this understanding we arrive at two guidelines for scope in functions that you can’t avoid altogether in the first place. Thou shalt:

  • Minimize mutability
  • Minimize scope

Indeed, these are two very general directives that you can apply in many other areas of software design. Keep in mind that these are just guidelines that serve as a starting point. Sometimes a little state or mutability can help readability.

To minimize scope, create it as close as possible to where it is needed. The worst thing you can do is declare all state at the start of a function, as this maximizes scope. Yes, I’m looking at you JavaScript developers and university professors. If you find yourself in a team or community that follows the practice of declaring all variables at the start of a function, I recommend not going along with this custom because its harmful nature outweighs “consistency” and “tradition” benefits.

To minimize mutability, stop every time you are about to override a variable and ask yourself if you cannot simplify the code. The answer is nearly always that you can via tricks such as Guard Clauses, many of which I will share in follow up posts. I myself rarely end up mutating variables, less than once per thousand 1000 lines of code. Because each removal of harmful mutability makes your code easier to work with you reap the benefits incrementally and can start applying this style right away. If you are lucky enough to work with a language that has constants in function scopes, use them as default instead of variables.

See also

Thanks to Gabriel Birke for proofreading and making some suggestions.

Clean Architecture + Bounded Contexts diagram

I’m happy to release a two Clean Architecture + Bounded Contexts diagrams into the public domain (CC0 1.0).

I created these diagrams for Wikimedia Deutchland with the help of Jan Dittrich, Charlie Kritschmar and Hanna Petruschat. They represent the architecture of our fundraising codebase. I explain the rules of this architecture in my post Clean Architecture + Bounded Contexts. The new diagrams are based on the ones I published two years ago in my Clean Architecture Diagram post.

Diagram 1: Clean Architecture + DDD, generic version. Click to enlarge. Link: SVG version

Diagram 1: Clean Architecture + DDD, fundraising version. Click to enlarge. Link: SVG version

 

Base Libraries Should be Stable

In this post I go over the principles that govern package (library) design and one specific issue I have come across several times.

Robert C Martin has proposed six Package Principles. Personally I find the (linked) description on Wikipedia rather confusing, especially if you do not already understand the principles. Here is my take on library design using short and simple points:

Libraries should

  • cover a single responsibility
  • provide a clean and well segregated interface to their users
  • hide/encapsulate implementation details
  • not have cyclic dependencies
  • not depend on less stable libraries

The rest of this post focuses on the importance of the last point.

Stable Base Libraries

I am assuming usage of a package manager such as Composer (PHP) or NPM. This means that each library defines its dependencies (if any) and the version ranges of those that it is compatible with. Libraries have releases and these follow semver. It also means that libraries are ultimately consumed by applications that define their dependencies in the same way.

Consider the scenario where you have various applications that all consume a base library.

If this library contains a lot of concrete code, it will not be very stable and it will now and then be needed or desired to make a major release. A major release is one that can contain breaking changes and therefore does not automatically get used by consumers of the library. These consumers instead need to manually update the version range of the library they are compatible with and possibly adjust their code to work with the breaking changes.

In this scenario that is not a problem. Suppose the library is at version 1.2 and that all applications use version 1.x. If breaking changes are made in the library, these will need to be released as version 2.0. Due to the versioning system the consuming applications can upgrade at their leisure. There is of course some cost to making a major release. The consumers still do need to spend some time on the upgrade, especially if they are affected by the breaking changes. Still, this is typically easy to deal with and is a normal part of the development process.

Things change drastically when we have an unstable library that is used by other libraries, even if those libraries themselves are unstable.

In such a scenario making breaking changes to the base library are very costly. Let’s assume we make a breaking change to the base library and that we want to use it in our application. What do we actually need to do to get there?

  • Make a release of the base library (Library A)
  • Update library B to specify it is compatible with the new A and make a new release of B
  • Update library C to specify it is compatible with the new A and make a new release of C
  • Update library D to specify it is compatible with the new A and make a new release of D
  • Update our application to specify it is compatible with the new version of A

In other words, we need to make a new release of EVERY library that uses our base library and that is also used by our application. This can be very painful, and is a big waste of time if these intermediate libraries where not actually affected by the breaking change to begin with. This means that it is costly, and generally a bad idea, to have libraries depend on another library that is not very stable.

Stability can be achieved in several ways. If the package is very abstract, and we assume good design, it will be stable. Think of a package providing a logging interface, such as psr/log. It can also be approached by a combination of following the package principles and taking care to avoid breaking changes.

Conclusion

Keep the package principles in mind, and avoid depending on unstable libraries in your own libraries where possible.

See also

Clean Architecture + Bounded Contexts

In this follow-up to Implementing the Clean Architecture I introduce you to a combination of The Clean Architecture and the strategic DDD pattern known as Bounded Contexts.

At Wikimedia Deutschland we use this combination of The Clean Architecture and Bounded Contexts for our fundraising applications. In this post I describe the structure we have and the architectural rules we follow in the abstract. For the story on how we got to this point and a more concrete description, see my post Bounded Contexts in the Wikimedia Fundraising Software. In that post and at the end of this one I link you to a real-world codebase that follows the abstract rules described in this post.

If you are not yet familiar with The Clean Architecture, please first read Implementing the Clean Architecture.

Clean Architecture + Bounded Contexts

Diagram by Jeroen De Dauw, Charlie Kritschmar, Jan Dittrich and Hanna Petruschat. Click image to enlarge

Diagram depicting Clean Architecture + Bounded Contexts

In the top layer of the diagram we have applications. These can be web applications, they can be console applications, they can be monoliths, they can be microservices, etc. Each application has presentation code which in bigger applications tends to reside in a decoupled presentation layer using patterns such as presenters. All applications also somehow construct the dependency graph they need, perhaps using a Dependency Injection Container or set of factories. Often this involves reading configuration from somewhere. The applications contain ALL framework binding, hence they are the place where you will find the Controllers if you are using a typical web framework.

Since the applications are in the top layer, and dependencies can only go down, no code outside of the applications is allowed to depend on code in the applications. That means there is 0 binding to mechanisms such as frameworks and presentation code outside of the applications.

In the second layer we have the Bounded Contexts. Ideally one Bounded Context per subdomain. At the core of each BC we have the Domain Model and Domain Services, containing the business logic part of the subdomain. Dependencies can only point inwards, so the Domain model which is at the center cannot depend on anything more to the outside. Around the Domain Model are the Domain Services. These include interfaces for persistence services such as Repositories. The UseCases form the final ring. They can use both the Domain Model and the Domain Services. They also form a boundary around the two, meaning that no code outside of the Bounded Context is allowed to talk to the Domain Model or Domain Services.

The Bounded Contexts include their own Persistence Layer. The Persistence Layer can use a relational database, files on the file system, a remote web API, a combination of these, etc. It has implementations of domain services such as Repositories which are used by the UseCases. These implementations are the only thing that is allowed to talk to and know about the low-level aspects of the Persistence Layer. The only things that can use these service implementations are other Domain Services and the UseCases.

The UseCases, including their Request Models and Response Models, form the public interface of the Bounded Context. This means that there is 0 binding to the persistence mechanisms outside of the Bounded Context. It also means that the code responsible for the domain logic cannot be directly accessed elsewhere, such as in the presentation layer of an application.

The applications and Bounded Contexts contain all the domain specific code. This code can make use of libraries and of course the runtime (ie PHP) itself.

As examples of Bounded Contexts following this approach, see the Donation Context and Membership Context. For an application following this architecture, see the FundraisingFrontend, which uses both the Donation Context and Membership Context. Both these contexts are also used by another application the code of which sadly enough is not currently public. You can also read the stories of how we rewrote the FundraisingFontend to use the Clean Architecture and how we refactored towards Bounded Contexts.

Further reading

If you are not yet familiar with Bounded Contexts or how to design them well, I recommend reading Domain-Driven Design Distilled.

Bounded Contexts in the Wikimedia Fundraising Software

In this follow-up to rewriting the Wikimedia Deutschland fundraising I tell the story of how we reorganized our codebases along the lines of the DDD strategic pattern Bounded Contexts.

In 2016 the FUN team at Wikimedia Deutschland rewrote the Wikimedia Deutschland fundraising application. This new codebase uses The Clean Architecture and near the end of the rewrite got reorganized partially towards Bounded Contexts. After adding many new features to the application in 2017, we reorganized further towards Bounded Contexts, this time also including our other fundraising applications. In this post I explain the questions we had and which decisions we ended up making. I also link to the relevant code so you have real world examples of using both The Clean Architecture and Bounded Contexts.

This post is a good primer to my Clean Architecture + Bounded Contexts post, which describes the structure and architecture rules we now have in detail.

Our initial rewrite

Back in 2014, we had two codebases, each in their own git repository and not using code from the other. The first one being a user facing PHP web-app that allows people to make donations and apply for memberships, called FundraisingFrontend. The “frontend” here stands for “user facing”. This is the application that we rewrote in 2016. The second codebase mainly contains the Fundraising Operations Center, a PHP web-app used by the fundraising staff to moderate and analyze donations and membership applications. This second codebase also contains some scripts to do exports of the data for communication with third-party systems.

Both the FundraisingFrontend and Fundraising Operations Center (FOC) used the same MySQL database. They each accessed this database in their own way, using PDO, Doctrine or SQL directly. In 2015 we created a Fundraising Store component based on Doctrine, to be used by both applications. Because we rewrote the FundraisingFrontend, all data access code there now uses this component. The FOC codebase we have been gradually migrating away from random SQL or PDO in its logic to also using this component, a process that is still ongoing.

Hence as we started with our rewrite of the FundraisingFrontend in 2016, we had 3 sets of code: FundraisingFrontend, FOC and Fundraising Store. After our rewrite the picture still looked largely the same. What we had done was turning FundraisingFrontend from a big ball of mud into a well designed application with proper architecture and separation of subdomains using Bounded Contexts. So while there where multiple components within the FundraisingFrontend, it was still all in one codebase / git repository. (Except for several small PHP libraries, but they are not relevant here.)

Need for reorganization

During 2017 we did a bunch of work on the FOC application and associated export script, all the while doing gradual refactoring towards sanity. While refactoring we found ourself implementing things such as a DonationRepository, things that already existed in the Bounded Contexts part of the FundraisingFrontend codebase. We realized that at least the subset of the FOC application that does moderation is part of the same subdomains that we created those Bounded Contexts for.

The inability to use these Bounded Contexts in the FOC app was forcing us to do double work by creating a second implementation of perfectly good code we already had. This also forced us to pay a lot of extra attention to avoid inconsistencies.  For instance, we ran into issues with the FOC app persisting Donations in a state deemed invalid by the donations Bounded Context.

To address these issues we decided to share the Bounded Contexts that we created during the 2016 rewrite with all relevant applications.

Sharing our Bounded Contexts

We considered several approaches to sharing our Bounded Contexts between our applications. The first approach we considered was having a dedicated git repository per BC. To do this we would need to answer what exactly would go into this repository and what would stay behind. We where concerned that for many changes we’d need to touch both the BC git repo and the application git repo, which requires more work and coordination than being able to make a change in a single repository. This lead us to consider options such as putting all BCs together into a single repo, to minimize this cost, or to simply put all code (BCs and applications) into a single huge repository.

We ended up going with one repo per BC, though started with a single BC to see how well this approach would work before committing to it. With this approach we still faced the question of what exactly should go into the BC repo. Should that just be the UseCases and their dependencies, or also the presentation layer code that uses the UseCases? We decided to leave the presentation layer code in the application repositories, to avoid extra (heavy) dependencies in the BC repo and because the UseCases provide a nice interface to the BC. Following this approach it is easy to tell if code belongs in the BC repo or not: if it binds to the domain model, it belongs in the BC repo.

These are the Bounded Context git repositories:

The BCs still use the FundraisingStore component in their data access services. Since this is not visible from outside of the BC, we can easily refactor towards removing the FundraisingStore and having the data access mechanism of a BC be truly private to it (as it is supposed to be for BCs).

The new BC repos allow us to continue gradual refactoring of the FOC and swap out legacy code with UseCases from the BCs. We can do so without duplicating what we already did and, thanks to the proper encapsulation, also without running into consistency issues.

Clean Architecture + Bounded Contexts

During our initial rewrite we created a diagram to represent the favor of The Clean Architecture that we where using. For an updated version that depicts the structure of The Clean Architecture + Bounded Contexts and describes the rules of this architecture in detail, see my blog post The Clean Architecture + Bounded Contexts.

Diagram depicting Clean Architecture + Bounded Contexts

Clean Architecture: UseCase tests

When creating an application that follows The Clean Architecture you end up with a number of UseCases that hold your application logic. In this blog post I outline a testing pattern for effectively testing these UseCases and avoiding common pitfalls.

Testing UseCases

A UseCase contains the application logic for a single “action” that your system supports. For instance “cancel a membership”. This application logic interacts with the domain and various services. These services and the domain should have their own unit and integration tests. Each UseCase gets used in one or more applications, where it gets invoked from inside the presentation layer. Typically you want to have a few integration or edge-to-edge tests that cover this invocation. In this post I look at how to test the application logic of the UseCase itself.

UseCases tend to have “many” collaborators. I can’t recall any that had less than 3. For the typical UseCase the number is likely closer to 6 or 7, with more collaborators being possible even when the design is good. That means constructing a UseCase takes some work: you need to provide working instances of all the collaborators.

Integration Testing

One way to deal with this is to write integration tests for your UseCases. Simply get an instance of the UseCase from your Top Level Factory or Dependency Injection Container.

This approach often requires you to mutate the factory or DIC. Want to test that an exception from the persistence service gets handled properly? You’ll need to use some test double instead of the real service, or perhaps mutate the real service in some way. Want to verify a mail got send? Definitely want to use a Spy here instead of the real service. Mutability comes with a cost so is better avoided.

A second issue with using real collaborators is that your tests get slow due to real persistence usage. Even using an in-memory SQLite database (that needs initialization) instead of a simple in-memory fake repository makes for a speed difference of easily two orders of magnitude.

Unit Testing

While there might be some cases where integration tests make sense, normally it is better to write unit tests for UseCases. This means having test doubles for all collaborators. Which leads us to the question of how to best inject these test doubles into our UseCases.

As example I will use the CancelMembershipApplicationUseCase of the Wikimedia Deutschland fundrasing application.

This UseCase uses 3 collaborators. An authorization service, a repository (persistence service) and a mailing service. First it checks if the operation is allowed with the authorizer, then it interacts with the persistence and finally, if all went well, it uses the mailing service to send a confirmation email. Our unit test should test all this behavior and needs to inject test doubles for the 3 collaborators.

The most obvious approach is to construct the UseCase in each test method.

Note how both these test methods use the same test doubles. This is not always the case, for instance when testing authorization failure, the test double for the authorizer service will differ, and when testing persistence failure, the test double for the persistence service will differ.

Normally a test function will only change a single test double.

UseCases tend to have, on average, two or more behaviors (and thus tests) per collaborator. That means for most UseCases you will be repeating the construction of the UseCase in a dozen or more test functions. That is a problem. Ask yourself why.

If the answer you came up with was DRY then think again and read my blog post on DRY 😉 The primary issue is that you couple each of those test methods to the list of collaborators. So when the constructor signature of your UseCase changes, you will need to do Shotgun Surgery and update all test functions. Even if those tests have nothing to do with the changed collaborator. A second issue is that you pollute the test methods with irrelevant details, making them harder to read.

Default Test Doubles Pattern

The pattern is demonstrated using PHP + PHPUnit and will need some adaptation when using a testing framework that does not work with a class based model like that of PHPUnit.

The coupling to the constructor signature and resulting Shotgun Surgery can be avoided by having a default instance of the UseCase filled with the right test doubles. This can be done by having a newUseCase method that constructs the UseCase and returns it. A way to change specific collaborators is needed (ie a FailingAuthorizer to test handling of failing authorization).

Making the UseCase itself mutable is a big no-no. Adding optional parameters to the newUseCase method works in languages that have named parameters. Since PHP does not have named parameters, another solution is needed.

An alternative approach to getting modified collaborators into the newUseCase method is using fields. This is less nice than named parameters, as it introduces mutable state on the level of the test class. Since in PHP this approach gives us named fields and is understandable by tools, it is better than either using a positional list of optional arguments or emulating named arguments with an associative array (key-value map).

The fields can be set in the setUp method, which gets called by PHPUnit before the test methods. For each test method PHPUnit instantiates the test class, then calls setUp, and then calls the test method.

With this field-based approach individual test methods can modify a specific collaborator by writing to the field before calling newUseCase.

The choice of default collaborators is important. To minimize binding in the test functions, the default collaborators should not cause any failures. This is the case both when using the field-based approach and when using optional named parameters.

If the authorization service failed by default, most test methods would need to modify it, even if they have nothing to do with authorization. And it is not always self-evident they need to modify the unrelated collaborator. Imagine the default authorization service indeed fails and that the testWhenSaveFails_cancellationFails test method forgets to modify it. This test method would end up passing even if the behavior it tests is broken, since the UseCase will return the expected failure result even before getting to the point where it saves something.

This is why inside of the setUp function the example creates a “cancellable application” and puts it inside an in-memory test double of the repository.

I chose the CancelMembershipApplication UseCase as an example because it is short and easy to understand. For most UseCases it is even more important to avoid the constructor signature coupling as this issue becomes more severe with size. And no matter how big or small the UseCase is, you benefit from not polluting your tests with unrelated setup details.

You can view the whole CancelMembershipApplicationUseCase and CancelMembershipApplicationUseCaseTest.

See also: