Implementing the Clean Architecture

Both Domain Driven Design and architectures such as the Clean Architecture and Hexagonal are often talked about. It’s hard to go to a conference on software development and not run into one of these topics. However, it can be challenging to find good real-world examples. In this blog post I’ll introduce you to an application following the Clean Architecture and incorporating a lot of DDD patterns. The focus is on the key concepts of the Clean Architecture, and the most important lessons we learned implementing it.

The application

The real-world application we’ll be looking at is the Wikimedia Deutschland fundraising software. It is a PHP application written in 2016, replacing an older legacy system. While the application is written in PHP, the patterns followed are by and large language agnostic, and are thus relevant for anyone writing object-oriented software.

I’ve outlined what the application is and why we replaced the legacy system in a blog post titled Rewriting the Wikimedia Deutschland fundraising. I recommend you have a look at least at its “The application” section, as it will give you a rough idea of the domain we’re dealing with.

A family of architectures

Architectures such as Hexagonal and the Clean Architecture are very similar. At their core, they are about separation of concerns. They decouple from mechanisms such as persistence and the frameworks in use, and instead focus on the domain and high level policies. A nice short read on this topic is Uncle Bob’s blog post on the Clean Architecture. Another recommended post is Hexagonal != Layers, which explains how just creating a bunch of layers misses the point.

The Clean Architecture

[Diagram: The Clean Architecture]

The arrows crossing the circle boundaries represent the allowed direction of dependencies. At the core is the domain. “Entities” here means Entities such as in Domain Driven Design, not to be confused with ORM entities. The domain is surrounded by a layer containing use cases (sometimes called interactors) that form an API that the outside world, such as a controller, can use to interact with the domain. The use cases themselves only bind to the domain and certain cross cutting concerns such as logging, and are devoid of binding to the web, the database and the framework.

In this example you can see how the use case (UC) for canceling a donation gets a request object, does some stuff, and then returns a response object. Both the request and response objects are specific to this UC and lack both domain and presentation mechanism binding. The stuff that is actually done is mainly interaction with the domain through Entities, Aggregates and Repositories.
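A minimal sketch of what such a use case could look like. All class and method names are hypothetical and the domain logic is heavily simplified; this is not the application's actual code.

```php
<?php
// All names below are hypothetical; this is a sketch, not the application's code.

// Request model: plain data, no domain or framework types.
class CancelDonationRequest {
    private $donationId;

    public function __construct(int $donationId) {
        $this->donationId = $donationId;
    }

    public function getDonationId(): int {
        return $this->donationId;
    }
}

// Response model: again plain data, specific to this use case.
class CancelDonationResponse {
    private $donationId;
    private $succeeded;

    public function __construct(int $donationId, bool $succeeded) {
        $this->donationId = $donationId;
        $this->succeeded = $succeeded;
    }

    public function getDonationId(): int {
        return $this->donationId;
    }

    public function cancellationSucceeded(): bool {
        return $this->succeeded;
    }
}

// Much simplified domain entity.
class Donation {
    private $id;
    private $cancelled = false;

    public function __construct(int $id) {
        $this->id = $id;
    }

    public function getId(): int {
        return $this->id;
    }

    public function cancel(): void {
        $this->cancelled = true;
    }

    public function isCancelled(): bool {
        return $this->cancelled;
    }
}

// Repository interface owned by the domain; implementations live in data access code.
interface DonationRepository {
    public function getDonationById(int $id): ?Donation;
    public function storeDonation(Donation $donation): void;
}

// In-memory implementation, enough for the sketch and for tests.
class InMemoryDonationRepository implements DonationRepository {
    private $donations = [];

    public function getDonationById(int $id): ?Donation {
        return $this->donations[$id] ?? null;
    }

    public function storeDonation(Donation $donation): void {
        $this->donations[$donation->getId()] = $donation;
    }
}

class CancelDonationUseCase {
    private $repository;

    public function __construct(DonationRepository $repository) {
        $this->repository = $repository;
    }

    public function cancelDonation(CancelDonationRequest $request): CancelDonationResponse {
        $donation = $this->repository->getDonationById($request->getDonationId());

        // Unknown or already cancelled donations cannot be cancelled.
        if ($donation === null || $donation->isCancelled()) {
            return new CancelDonationResponse($request->getDonationId(), false);
        }

        $donation->cancel();
        $this->repository->storeDonation($donation);

        return new CancelDonationResponse($request->getDonationId(), true);
    }
}
```

Note that the use case only depends on the repository interface, so it can be exercised with the in-memory implementation and no database.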

This is a typical way of invoking a UC. The framework we’re using is Silex, which calls the function we provided when the route matches. Inside this function we construct our framework agnostic request model and invoke the UC with it. Then we hand over the response model to a presenter to create the appropriate HTML or other such format. This is all the framework bound code we have for canceling donations. Even the presenter does not bind to the framework, though it does depend on Twig.
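A framework-free sketch of such a handler. In Silex, the body of handleCancelDonation would sit inside the function registered for the route; here it is a plain function so the snippet runs on its own. The stub use case, the presenter (which builds plain strings rather than rendering a Twig template), the donation_id parameter and all other names are assumptions made for illustration.

```php
<?php
// Stand-ins with hypothetical names so the snippet is self-contained.

class CancelRequestModel {
    public $donationId = 0;
}

class CancelResponseModel {
    private $donationId;
    private $succeeded;

    public function __construct(int $donationId, bool $succeeded) {
        $this->donationId = $donationId;
        $this->succeeded = $succeeded;
    }

    public function getDonationId(): int {
        return $this->donationId;
    }

    public function cancellationSucceeded(): bool {
        return $this->succeeded;
    }
}

class StubCancelUseCase {
    public function cancelDonation(CancelRequestModel $request): CancelResponseModel {
        // A real use case would talk to the domain; this stub succeeds for positive IDs.
        return new CancelResponseModel($request->donationId, $request->donationId > 0);
    }
}

// Turns the response model into HTML. Knows nothing about the web framework.
class CancelDonationHtmlPresenter {
    public function present(CancelResponseModel $response): string {
        $id = htmlspecialchars((string) $response->getDonationId());

        return $response->cancellationSucceeded()
            ? "<p>Donation $id was cancelled.</p>"
            : "<p>Donation $id could not be cancelled.</p>";
    }
}

// The only part that would be framework bound: read the input, build the
// request model, invoke the use case, hand the response to the presenter.
function handleCancelDonation(
    array $postParameters,
    StubCancelUseCase $useCase,
    CancelDonationHtmlPresenter $presenter
): string {
    $request = new CancelRequestModel();
    $request->donationId = (int) ($postParameters['donation_id'] ?? 0);

    return $presenter->present($useCase->cancelDonation($request));
}

echo handleCancelDonation(
    ['donation_id' => '42'],
    new StubCancelUseCase(),
    new CancelDonationHtmlPresenter()
), "\n";
```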

If you are familiar with Silex, you might already have noticed that we’re constructing our UC differently than you might expect. We decided to go with our own top level factory, rather than using the dependency injection mechanism provided by Silex: Pimple. Our factory internally actually uses Pimple, though this is not visible from the outside. With this approach we gain nicer access to service construction, since we can have a getLogger() method with a LoggerInterface return type hint, rather than accessing $app['logger'] or some such, which forces us to bind to a string and leaves us without a type hint.
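The idea can be sketched as follows. To keep the snippet runnable without Composer packages, a small array of closures stands in for Pimple and a local SketchLogger interface stands in for a PSR-3 style LoggerInterface; both, like the class names, are assumptions and not the application's actual factory.

```php
<?php
// Stand-in for a logger interface such as PSR-3's LoggerInterface (assumption).
interface SketchLogger {
    public function log(string $message): void;
}

class InMemoryLogger implements SketchLogger {
    public $messages = [];

    public function log(string $message): void {
        $this->messages[] = $message;
    }
}

class TopLevelFactory {
    // String-keyed service definitions, hidden from the outside world.
    private $definitions = [];
    private $instances = [];

    public function __construct() {
        $this->definitions['logger'] = function (): SketchLogger {
            return new InMemoryLogger();
        };
    }

    // Lazily builds each service once, similar to a shared container service.
    private function getShared(string $key) {
        if (!array_key_exists($key, $this->instances)) {
            $this->instances[$key] = ($this->definitions[$key])();
        }

        return $this->instances[$key];
    }

    // Callers get a typed accessor instead of a string lookup.
    public function getLogger(): SketchLogger {
        return $this->getShared('logger');
    }

    // Handy in tests: swap in a different implementation.
    public function setLogger(SketchLogger $logger): void {
        $this->instances['logger'] = $logger;
    }
}

$factory = new TopLevelFactory();
$factory->getLogger()->log('Donation cancelled');
```

Because getLogger() declares its return type, IDEs and static analysis know what they are dealing with, which a lookup by string key cannot offer.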

[Screenshot: list of use cases]

This use case based approach makes it very easy to see what our system is capable of at a glance.

[Screenshot: use case directory structure]

And it makes it very easy to find where certain behavior is located, or to figure out where new behavior should be put.

All code in our src/ directory is framework independent, and all code binding to specific persistence mechanisms resides in src/DataAccess. The only framework bound code we have are our very slim “route handlers” (kinda like controllers), the web entry point and the Silex bootstrap.

For more information on The Clean Architecture I can recommend Robert C. Martin’s NDC 2013 talk. If you watch it, you will hopefully notice how we slightly deviated from the UseCase structure as he presented it. This is because PHP is an interpreted language, and thus does not need certain interfaces that are beneficial in compiled languages.

Lesson learned: bounded contexts

By and large we started with the donation related use cases and then moved on to the membership application related ones. At some point, we had a Donation entity/aggregate in our domain, and a bunch of value objects that it contained.

As you can see, one of those value objects is PersonalInfo. Then we needed to add an entity for membership applications. Like donations, membership applications require a name, a physical address and an email address. Hence it was tempting to reuse our existing PersonalInfo class.

Luckily a complication made us realize that going down this path was not a good idea. This complication was that membership applications also have a phone number and an optional date of birth. We could have forced code sharing by doing something hacky like adding new optional fields to PersonalInfo, or by creating a MorePersonalInfo derivative.

Approaches such as these, while resulting in some code sharing, also result in creating binding between Donation and MembershipApplication. That’s not good, as those two entities don’t have anything to do with each other. Sharing what happens to be the same at present is simply not a good idea. Just imagine that we did not have the phone number and date of birth in our first version, and then needed to add them. We’d either end up with one of those hacky solutions, or need to refactor code that has nothing to do (apart from the bad coupling) with what we want to modify.

What we did was rename PersonalInfo to Donor and introduce a new Applicant class.

These names are better since they are about the domain (see ubiquitous language) rather than some technical terms we needed to come up with.
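A rough sketch of the two resulting value objects, assuming plain string fields for brevity (the field names and types are illustrative, not taken from the application):

```php
<?php
// Belongs to the donation side of the domain.
class Donor {
    private $name;
    private $postalAddress;
    private $emailAddress;

    public function __construct(string $name, string $postalAddress, string $emailAddress) {
        $this->name = $name;
        $this->postalAddress = $postalAddress;
        $this->emailAddress = $emailAddress;
    }

    public function getName(): string {
        return $this->name;
    }

    public function getPostalAddress(): string {
        return $this->postalAddress;
    }

    public function getEmailAddress(): string {
        return $this->emailAddress;
    }
}

// Belongs to the membership side. It overlaps with Donor today, but it is
// free to evolve on its own: phone number and optional date of birth.
class Applicant {
    private $name;
    private $postalAddress;
    private $emailAddress;
    private $phoneNumber;
    private $dateOfBirth;

    public function __construct(
        string $name,
        string $postalAddress,
        string $emailAddress,
        string $phoneNumber,
        ?DateTimeImmutable $dateOfBirth = null
    ) {
        $this->name = $name;
        $this->postalAddress = $postalAddress;
        $this->emailAddress = $emailAddress;
        $this->phoneNumber = $phoneNumber;
        $this->dateOfBirth = $dateOfBirth;
    }

    public function getName(): string {
        return $this->name;
    }

    public function getPhoneNumber(): string {
        return $this->phoneNumber;
    }

    public function getDateOfBirth(): ?DateTimeImmutable {
        return $this->dateOfBirth;
    }
}
```

The duplication of three fields is deliberate: neither class has to change when the other one does.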

Amongst other things, this rename made us realize that we were missing some explicit boundaries in our application. The donation related code and the membership application related code were mostly independent from each other, and we agreed this was a good thing. To make it more clear that this is the case and highlight violations of that rule, we decided to reorganize our code to follow the strategic DDD pattern of Bounded Contexts.

[Screenshot: directory structure organized by bounded context]

This mainly consisted of reorganizing our directory and namespace structure, and a few instances of splitting some code that should not have been bound together.

Based on this we created a new diagram to reflect the high level structure of our application. This diagram, and a version with just one context, are available for use under CC-0.

[Diagram: Clean Architecture + Bounded Contexts]

Lesson learned: validation

A big question we had near the start of our project was where to put validation code. Do we put it in the UCs, or in the controller-like code that calls the UCs?

One of the first UCs we added was the one for adding donations. This one has a request model that contains a lot of information, including the donor’s name, their email, their address, the payment method, payment amount, payment interval, etc. In our domain we had several value objects for representing parts of donations, such as the donor or the payment information.

Since we did not want to have one object with two dozen fields, and did not want to duplicate code, we used the value objects from our domain in the request model.

If you’ve been paying attention, you’ll have realized that this approach violates one of the earlier outlined rules: nothing outside the UC layer is supposed to access anything from the domain. If value objects from the domain are exposed to whatever constructs the request model, i.e. a controller, this rule is violated. Apart from this abstract objection, we got into real trouble by doing this.

Since we started doing validation in our UCs, this usage of objects from the domain in the request necessarily forced those objects to allow invalid values. For instance, if we’re checking the validity of an email address in the UC (or a service used by the UC), then the request model cannot use an EmailAddress which does sanity checks in its constructor.

We thus refactored our code to avoid using any of our domain objects in the request models (and response models), so that those objects could contain basic safeguards.
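As an illustration, once request models only carry scalars, a domain value object is free to refuse invalid input. The sketch below uses PHP's filter_var() as the sanity check; that choice and the method names are assumptions, not necessarily what the application does.

```php
<?php
// Domain value object: an instance can only exist if the address passed the check.
class EmailAddress {
    private $address;

    public function __construct(string $address) {
        if (filter_var($address, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException('Not a valid email address: ' . $address);
        }

        $this->address = $address;
    }

    public function getFullAddress(): string {
        return $this->address;
    }
}

// The request model side only ever holds the raw string, valid or not.
$rawEmailFromRequest = 'not-an-address';

try {
    $email = new EmailAddress($rawEmailFromRequest);
} catch (InvalidArgumentException $exception) {
    // The UC validates the raw string first, so in practice it would
    // report a validation error before ever reaching this point.
    $email = null;
}
```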

We made a similar change by altering which objects get validated. At the start of our project we created a number of validators that worked on objects from the domain. For instance a DonationValidator working with the Donation Entity. This DonationValidator would then be used by the AddDonationUseCase. This is not a good idea, since the validation that needs to happen depends on the context. In the AddDonationUseCase certain restrictions apply that don’t always hold for donations. Hence having a general looking DonationValidator is misleading. What we ended up doing instead is having validation code specific to the UCs, be it as part of the UC, or when too complex, a separate validation service in the same namespace. In both cases the validation code would work on the request model, i.e. AddDonationRequest, and not bind to the domain.
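A minimal sketch of such a UC-specific validator, working purely on the request model. The fields, the minimum amount and the payment type identifiers are hypothetical examples of policy, not the application's real rules.

```php
<?php
// Request model for the add donation use case: scalars only.
class AddDonationRequest {
    public $amountInCents = 0;
    public $paymentType = '';
    public $donorEmail = '';
}

// Validation specific to adding a donation. It never touches the Donation entity.
class AddDonationValidator {
    const MINIMUM_AMOUNT_IN_CENTS = 100; // hypothetical policy
    const ALLOWED_PAYMENT_TYPES = ['direct_debit', 'bank_transfer', 'credit_card']; // hypothetical

    /**
     * @return string[] violation messages keyed by request field, empty when valid
     */
    public function validate(AddDonationRequest $request): array {
        $violations = [];

        if ($request->amountInCents < self::MINIMUM_AMOUNT_IN_CENTS) {
            $violations['amountInCents'] = 'Amount is below the minimum';
        }

        if (!in_array($request->paymentType, self::ALLOWED_PAYMENT_TYPES, true)) {
            $violations['paymentType'] = 'Unknown payment type';
        }

        if (filter_var($request->donorEmail, FILTER_VALIDATE_EMAIL) === false) {
            $violations['donorEmail'] = 'Email address is not valid';
        }

        return $violations;
    }
}

$request = new AddDonationRequest();
$request->amountInCents = 50;
$request->paymentType = 'direct_debit';
$request->donorEmail = 'jane@example.org';

$violations = (new AddDonationValidator())->validate($request);
```

A use case for a different operation on donations would get its own validator with its own rules, rather than sharing a general DonationValidator.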

After learning these two lessons, we had a nice approach for policy-based validation. That’s not all validation that needs to be done though. For instance, if you get a number via a web request, the framework will typically give it to you as a string, which might thus not be an actual number. As the request model is supposed to be presentation mechanism agnostic, certain validation, conversion and error handling needs to happen before constructing the request model and invoking the UC. This means that often you will have validation in two places: policy-based validation in the UC, and presentation specific validation in your controllers or equivalent code. If you have string to integer conversion, number parsing or something internationalization specific in your UC, you almost certainly messed up.
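To make the split concrete, here is a sketch of the presentation-side part: a hypothetical helper that a route handler could call to turn the raw amount string into integer cents before any request model is built. Accepting both a decimal comma and a decimal point is an assumption made for the example.

```php
<?php
// Presentation-level parsing: lives in controller-like code, not in the UC.
// Returns the amount in cents, or null when the input is not a plain amount.
function parseAmountToCents(string $rawAmount): ?int {
    // Accept "12,50" as well as "12.50" (assumed input formats).
    $normalized = str_replace(',', '.', trim($rawAmount));

    // Digits, optionally followed by a separator and one or two decimals.
    if (preg_match('/^\d+(\.\d{1,2})?$/', $normalized) !== 1) {
        return null;
    }

    $parts = explode('.', $normalized);
    $euros = (int) $parts[0];
    // "5" after the separator means 50 cents, so pad on the right.
    $cents = isset($parts[1]) ? (int) str_pad($parts[1], 2, '0') : 0;

    return $euros * 100 + $cents;
}

$postParameters = ['amount' => '12,50'];
$amountInCents = parseAmountToCents($postParameters['amount'] ?? '');

if ($amountInCents === null) {
    // Presentation-level error handling: the use case is never invoked.
    echo "Please enter a valid amount\n";
} else {
    // Only now would the request model be built, with a real integer.
    echo "Amount in cents: $amountInCents\n";
}
```

Whether 12.50 is then an acceptable donation amount is a policy question, and that check stays in the UC.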

Closing notes

You can find the Wikimedia Deutschland fundraising application on GitHub and see it running in production. Unfortunately the code of the old application is not available for comparison, as it is not public. If you have questions, you can leave a comment, or contact me. If you find an issue or want to contribute, you can create a pull request. If you are looking for my presentation on this topic, view the slides.

As a team we learned a lot during this project, and we set a number of firsts at Wikimedia Deutschland, or the wider Wikimedia movement for that matter. The new codebase is the cleanest non-trivial application we have, or that I know of in the PHP world. It is fully tested, contains less than 5% framework bound code, has strong strategic separation between both contexts and layers, has roughly 5% data access specific code and has tests that can be run without any real setup. (I might write another blog post on how we designed our tests and testing environment.)

Many thanks to my colleagues Kai Nissen and Gabriel Birke for being pretty awesome during our rewrite project.

41 thoughts on “Implementing the Clean Architecture”

    1. Hey Dmitriy, that’s a good question.

      We did not move the presenters to src, but moved the contexts out of it. Still, the question of why we did not put the presenters together with the associated context remains valid.

      We decided to limit our contexts to things bound to the domain model. That means the outermost “layer” of the contexts are the usecases, and that they remain fully framework independent.

      From the literature I figured that what you suggest is the more common approach. Indeed we discussed reorganizing our codebase as such, but no compelling arguments were found that justify the effort.

      I raised this question on the DDD CQRS list when we were talking about this in our team. Unfortunately we did not get any reply. https://groups.google.com/forum/#!topic/dddcqrs/tWc6iZGhvUU

  1. This is a fantastic writeup. It’s very nice to see that the Silex actions are almost point-for-point examples of Action-Domain-Responder. (Some of the actions might do with trivial refactorings to bring them even more in-line with the pattern, but overall it’s very well done.)

  2. A longer reply here: http://paul-m-jones.com/archives/6535

    An excerpt:

    “The only place where Jeroen’s implementation deviates from ADR is that the Action code builds the presentation itself, instead of handing off to a Responder. (This may be a result of adhering to the idioms and expectations specific to Silex.) Because the rest of the implementation is so well done, refactoring to a separated presentation in the form of a Responder is a straightforward exercise. Let’s see what that might look like.”

  3. First, great article. I am a big fan of DDD and UncleBob both. I took the time to clone the git repo so I could browse it in PhpStorm and was really enjoying going through it to see how you did the separation of concerns and boundaries etc. Then I came across https://github.com/wmde/FundraisingFrontend/blob/master/src/Factories/FunFunFactory.php … please explain, what the heck happened here???? there are 160 use statements at the top of that 1000+ line file … It looks like a dumping ground. Though I realize I am, it is not really my intention to be critical. I found this article searching for good examples of clean architecture in php and was happy I found your article.

    1. Hi Todd. You are right that this file has a lot of lines and many many dependencies. It is the top-level factory of our web application. It is the wiring where we inject the dependencies of our classes. This is single responsibility. And even though there is a lot of code, it is rather flat and uniform, making it easy to modify. We *could* split the top-level factory, though I suspect we’d be making a bad trade-off. Having a big top-level factory (or equivalent DI mechanism) is, as far as I am aware, not uncommon and typically not considered a problem.

    1. Hi Norman! There is more than one answer to that question and it depends on which code exactly you are talking about.

      In this codebase we have some instances where this architecture rule has been violated where it should not have been. If you search for TODOs you are likely to find some of those. These violations were created due to our lack of experience with The Clean Architecture. The ones that remain are not high priority issues, as long as they are not made worse, and hence likely will stay as they are.

      There might also be cases where we deliberately made an exception to this rule. There definitely are route handlers that use services directly rather than using a UseCase. If the UseCase just takes argument set x and passes it along to some service taking the same argument set, and thus not containing any application logic, having it is not necessary (and possibly harmful).

      1. Thank you for replying! (in such detail, too)

        That explains everything and answered my question! (also confirming my assumption, why certain decisions were made). It makes sense, to not use a “useCase” if it is simply delegating simple data-sets! 🙂

  4. That was a very good read about clean architecture. That’s the first time I see multiple DDD bounded contexts in action and it is nice. Regarding the previous comments, where controllers directly reuse a domain service instead of using a UseCase, well it’s a tradeoff. I faced a similar situation and I don’t think there is a right answer; sometimes it is not worth it to add delegating boilerplate. Accessing the domain directly from a low level is not as big an issue as accessing the low level directly from the domain. Very good job, your codebase is a reference.

  5. Why are you so firm that outer layer could only access one layer under them?
    Why View layer (controllers) can not have access to Domain Entities?
    Uncle Bob’s Dependency Rule says only that nothing from outer layers should be used in the inner layers.
    Your restriction is not mentioned.

    1. Hey Micheal. I also mentioned this restriction in this post https://www.entropywins.wtf/blog/2018/08/14/clean-architecture-bounded-contexts/ though likewise did not explain it much.

      I got that idea from Uncle Bob. Since this is quite long ago I do not remember where, though as far as I recall he mentioned it in several places.

      The basic idea here is that there should be no logic in the presentation layer. And there should be no presentation concerns in the domain model. Without the boundary that is formed by the Use Case layer, it is easy for such violations to happen. By having the Use Case layer as a boundary you also get a nicely defined public API to your domain model. Refactoring the model is a lot easier if only the Use Cases use it rather than also any number of presentation concerns, possibly in different applications.

  6. Hey guys,
    You’ve done a great job and I would say that your project is one of the best PHP + Clean projects which are available online. Additionally you put a lot of effort into the BC separation – something that is completely missing in the hundreds of “clean” projects around. The presenter part is also visible and well thought. So … gut gemacht 😉

    Do you have any plans to get rid of this spaghetti class any time soon?
    https://github.com/wmde/FundraisingFrontend/blob/master/app/Routes.php

    I am pretty sure you can organize this much better.
    The other huge file – FunFun – unfortunately it is too late to fix it but just as a suggestion – there is the “Pure DI” which could have probably fit well into the project.

    Anyways, excellent job one more time!

    1. Hi Marian! Thanks for the kind words.

      I would not classify either Routes.php or FunFunFactory as spaghetti. Yes, these two contain a lot of code. Yes, they contain a lot of things that could be separated. Most of the time that means there is a problem. However for these cases the code is (1) highly uniform and (2) highly independent.

      If you put the code from Routes.php in dedicated files, the structure would not change. The benefit would just be that you have one route per file. And not a clear cut benefit at that. Having things in one file has its upsides as well. We did put a number of routes in dedicated files because we “needed” dedicated classes for them. https://github.com/wmde/FundraisingFrontend/tree/master/app/Controllers

      You could double the size of FunFunFactory and it would make essentially no difference. Same goes with reducing its size 10x. I used this top level factory approach in many projects and find working with one that has 5 methods just as easy as one with 500. Using a DIC (like you have in Symfony for instance) has its ups and its downs. I’m not convinced it is the better approach, so I tend to default to the simpler one used in this project.

      1. Hi Jeroen,
        Thanks for the quick reply 🙂

        I noticed that you have controllers for some of the actions and I assume that it depends on the controller complexity. Still I would prefer to keep the “entry point” of a PHP application as short as possible since every request should go through it. I know that we have OpCache which works quite well but I think it is nice if we could pass the control to the specific code as fast as possible.

        For the dependency injection I meant just using DI but not necessarily with a DI-Container. I know that it is quite convenient to pass the big “bucket” to every controller so that they can ask for whatever they want (and therefore transitively depending on maaaaany classes) but the other approach is that every controller specifies the dependencies it needs and they are passed as constructor arguments by the responsible dependency provider.

        P.S. I hope you don’t get my comments as a criticism. I myself try to build a similar clean solution and I just see how common our problems/challenges/decision points are 🙂

  7. Can you please explain what does “would not bind to domain” exactly mean when you wrote – “In both cases the validation code would work on the request model, i.e. AddDonationRequest, and not bind to the domain”

    1. It means the validation code does not bind to the domain model. It does not operate on domain entities, i.e. Donation.
