Profiling and logging in MediaWiki

In this post I first describe the current approaches taken to logging and profiling in MediaWiki, then outline why these are problematic, and finish with a few thoughts on how things can be improved. The problems encountered and solutions proposed are not MediaWiki specific and can be apply to similar situations elsewhere.

The current approach

In MediaWiki, profiling, amongst many other things, is done by calling static functions. Sometimes these have the form of global functions such as wfProfileIn, and sometimes they are located in classes, such as SMW\Profiler::In.

Each usage of wfProfileIn needs to be matched with a closing wfProfileOut. Doing so provides you with information in the execution time between those two calls, if you turned profiling on.

Typically a wfProfileIn is made near the start of some method, and a wfProfileOut is made before each exit point of the method. Bigger methods can have multiple profiling sections. (Though those that do are certainly doing to much.)

Problems

Since every usage of methods such as wfProfileIn create a static binding, the user of your class does not have the option to not making this profiling call, or to have it be made to a different profiler. You could modify or swap out the global profiler, though that is going to create more problems then it solves in any non-trivial system. This also points out that global state is used, which we all know is bad.

Also see the excellent “static considered harmful” blog post by Kore Nordmann, and read about the aptly named STUPID acronym.

Concrete places where one runs into the above problems are when writing tests, and when one wants assert the profiling calls are made correctly. Or when you want to enable profiling only in part of your object graph. Or if you have any other usage that is not the default. Global state and static kill the flexibility of your code.

A different type of problem the current approach has is in the form of clutter. Profiling calls and logging calls are made through your business logic, while those two things are clearly quite distinct. This has a number of negative consequences.

First of all, it increases complexity, by adding more things into methods. Readability and understandability suffer, which in turn increase likelihood of new bugs being introduced and the code rotting further.

Having to put in an opening call to profileIn and matching closing calls to profileOut is a very error prone process. It’s quite easy to miss an exit point of a method. (Or rather, it is in the MediaWiki codebase, and in most codebases out there. Again, keeping methods appropriately focused eliminates this problem.) It is even easier to introduce new exit points later on, and either not realize one needs to also add a profileOut call, or simply forget about it. And since this code is not being tested due to it being static and global, you are not going to notice the error until someone that tried to profile starts shouting at you.

The clutter problem gets worse as more profiling, logging, debugging calls, and similar are added.

And all of this code gets executed the whole time, and has to be understood by every developer that needs to do some task affecting or affected by this code. In most cases, this is forcing things not relevant to people onto them. Does it make sense to have all these calls to wfProfile, wfLog, wfDebug and so on happen on your production system? As a developer changing some business rule in a method, or even merely understand it, do you want to be forced to hold into account these concerns all of the time while most of the time there would be no such need if concerns where properly separated?

Thoughts on how to make things better

MediaWiki should be rewritten from ground up, in Haskell.

More seriously, there are two main approaches that can be used to tackle some of the outlines problems.

  1. One way to achieve great separation between your domain logic and cross cutting concerns such as profiling and logging is to make use of the decorator pattern.

1000px-Proxy_pattern_diagram.svg

This is a very simple pattern with lots of similarity to other “wrapping patterns” such as Adapter and Proxy. Unlike the Adapter pattern, the wrapping class implements the same interface, rather then adapting to another one. The above UML diagram is actually for the Proxy pattern, and at least to me is easier to quickly understand then the Decorator one on Wikipedia. This does not matter much though, since the patterns are structurally identical as far as we are concerned here.

Imagine you have some service object from the domain layer titled “AwesomenessMultiplier”, containing a bunch of business logic. Depending on what kind of public methods this class has, you either create a new abstract class or interface with the name “AwesomenessMultiplier”, after you renamed the original class to something else. Lets say we renamed it to “ConcreteAwesomenessMultiplier”. Now you can create a ProfilingAwesomenessMultiplier that extends/implements AwesomenessMultiplier.

ProfilingAwesomenessMultiplier will be the place where you put all profiling calls. This class has all the public methods specified by AwesomenessMultiplier, after all, it implements/extends it. In these public methods you can add profiling, logging and whatnot calls as you like, and appropriately forward the call to an object taking care of the real behaviour. This object is stored in a field of type AwesomenessMultiplier.

So typically such a decorating class for logging or profiling will have two fields: a AwesomenessMultiplier and a Logger/Profiler. Both would be injected via the constructor. If you lack such a Profiler interface, you can put in static calls in the decorator as an intermediate step. It’ll get rid of the clutter problem, and decrease the static binding problem. The later will however still be in your decorator, and presumably in lots of other parts of your application. It is thus well worthwhile to create the trivial abstractions needed to get rid of this.

You can have a LoggingAwesomenessMultiplier wrap a AwesomenessMultiplier, and have that whole thing wrapped by a ProfilingAwesomenessMultiplier. And then pass it around through your system without any of it knowing anything about logging, profiling, or the use of decorators to hide those concerns.

  1. Proper dependency injection

In case one needs to make calls that are not at the start/end of public methods, using decorators does not work nicely. You could somewhat modify the approach by using inheritance so you can place calls around protected methods as well. That however breaks encapsulation and might cause more problems then it solves.

You are thus forced to have some logging, profile, etc, code in the same class with your business logic. It thus becomes very important the concerns outlined in this post are addressed as best as possible, as unlike with the decorator approach, they’ll be in your face the whole time.

Step one is getting rid of those nasty static calls relying on global state. Enter dependency injection.

Inject a thing implementing a Logger or a Profiler interface in the constructor of your object and stuff it into a field. Your code then calls $this->logger->log() rather then wfLog().

When creating such interfaces, keep the Interface Segregation Principle in mind. They should not contain everything that could possibly be related to logging. In case of PHP development, also consider the PSR-3 standard which has been adopted by a bunch of frameworks already.

In case of doing calls such as profileIn that need matching profileOut closing calls, do separate this on the method level. Rather then doing this

Consider doing the below:

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.