Wikibase DataModel: Entity v2

In a recent blog post I introduced the new Term classes added in Wikibase DataModel 0.7.3. That post also outlined plans for making some big changes to the Entity class and its derivatives. We have now taken the most difficult step in the process, which is already resulting in much nicer code.

As I’ve written about in the past, we had a good portion of technical debt related to our serialization code. Our Value Objects and Entities in DataModel had a public toArray method and a public static newFromArray method. This brought global state with it in places, caused a lot of static code, and forced an additional responsibility onto these objects, increasing their complexity. The new serialization components were created in part to address these issues.
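To illustrate the separation of responsibilities, here is a minimal sketch. The class names (ExampleTerm, ExampleTermSerializer, ExampleTermDeserializer) and the array keys are hypothetical stand-ins invented for this illustration, not the actual DataModel or serialization component classes. The point is only that the value object no longer knows anything about its array format.

```php
<?php
// Hypothetical value object: it holds data and knows nothing about array formats.
final class ExampleTerm {
    private $languageCode;
    private $text;

    public function __construct( $languageCode, $text ) {
        $this->languageCode = $languageCode;
        $this->text = $text;
    }

    public function getLanguageCode() {
        return $this->languageCode;
    }

    public function getText() {
        return $this->text;
    }
}

// Hypothetical serializer: the array format lives here, in a separate object.
// The 'language' and 'value' keys are assumptions made for this sketch.
final class ExampleTermSerializer {
    public function serialize( ExampleTerm $term ) {
        return array(
            'language' => $term->getLanguageCode(),
            'value' => $term->getText(),
        );
    }
}

// Hypothetical deserializer: an instance you can inject, rather than a
// static newFromArray method on the value object.
final class ExampleTermDeserializer {
    public function deserialize( array $serialization ) {
        return new ExampleTerm( $serialization['language'], $serialization['value'] );
    }
}

$serializer = new ExampleTermSerializer();
$deserializer = new ExampleTermDeserializer();

$array = $serializer->serialize( new ExampleTerm( 'en', 'Douglas Adams' ) );
$term = $deserializer->deserialize( $array );
```

Because the serializer and deserializer are plain instances, code that needs them can receive them through its constructor instead of reaching for static methods.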

I started by removing the old toArray and newFromArray code from DataModel, and in the process found that this went hand in hand with taking the next big step in breaking up the Entity hierarchy. The constructor of Entity (and its concrete derivatives) took an array in the internal serialization format as its only argument. The entities would hold this array internally and unstub specific parts when that data was requested in object form (for example when getSiteLinks was called). If all the serialization code got moved out of DataModel, then either this would need to go as well, or we’d need to make DataModel dependent on the new serialization components. Luckily the choice between introducing a cyclic dependency and removing some technical debt you need to get rid of in the near future anyway is an easy one.
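For readers who have not seen the old code, the stubbing behaviour can be sketched roughly as follows. This is not the removed DataModel code: LegacyStyleItem, LegacySiteLink and the 'links' array key are hypothetical names chosen for the illustration.

```php
<?php
// Hypothetical site link value object used by the sketch below.
final class LegacySiteLink {
    private $siteId;
    private $pageName;

    public function __construct( $siteId, $pageName ) {
        $this->siteId = $siteId;
        $this->pageName = $pageName;
    }

    public function getSiteId() {
        return $this->siteId;
    }

    public function getPageName() {
        return $this->pageName;
    }
}

// Hypothetical sketch of the old pattern: the entity keeps the raw array and
// only builds objects from it ("unstubs") when they are first requested.
final class LegacyStyleItem {
    private $data;
    private $siteLinks = null;

    public function __construct( array $data ) {
        $this->data = $data;
    }

    public function getSiteLinks() {
        if ( $this->siteLinks === null ) {
            // The entity itself has to understand the storage format here.
            $this->siteLinks = array();
            $links = isset( $this->data['links'] ) ? $this->data['links'] : array();

            foreach ( $links as $siteId => $pageName ) {
                $this->siteLinks[] = new LegacySiteLink( $siteId, $pageName );
            }
        }

        return $this->siteLinks;
    }
}

$item = new LegacyStyleItem( array( 'links' => array( 'enwiki' => 'Douglas Adams' ) ) );
$links = $item->getSiteLinks();
```

Note how the entity has to know the array layout in order to unstub it, which is exactly the coupling to the serialization format described above.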

Rather than taking a single array in some storage-specific format, the constructors of the Entity derivatives now take a list of the objects they need. For instance:
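As a rough sketch of the idea, using hypothetical stand-in classes (SketchItemId, SketchSiteLink, SketchItem) rather than the actual DataModel ones, and an assumed set of constructor arguments, construction might look like this:

```php
<?php
// Hypothetical stand-ins for illustration only; the actual DataModel classes
// and their constructor signatures may differ.
final class SketchItemId {
    private $serialization;

    public function __construct( $serialization ) {
        $this->serialization = $serialization;
    }

    public function getSerialization() {
        return $this->serialization;
    }
}

final class SketchSiteLink {
    private $siteId;
    private $pageName;

    public function __construct( $siteId, $pageName ) {
        $this->siteId = $siteId;
        $this->pageName = $pageName;
    }

    public function getSiteId() {
        return $this->siteId;
    }
}

final class SketchItem {
    private $id;
    private $labels;
    private $siteLinks;

    // The constructor receives ready-made objects, so nothing needs unstubbing.
    public function __construct( SketchItemId $id, array $labels, array $siteLinks ) {
        $this->id = $id;
        $this->labels = $labels;
        $this->siteLinks = $siteLinks;
    }

    public function getId() {
        return $this->id;
    }

    public function getLabels() {
        return $this->labels;
    }

    public function getSiteLinks() {
        return $this->siteLinks;
    }
}

$item = new SketchItem(
    new SketchItemId( 'Q42' ),
    array( 'en' => 'Douglas Adams' ),
    array( new SketchSiteLink( 'enwiki', 'Douglas Adams' ) )
);
```

The getters simply return what was passed in, so the entity no longer has to know anything about a storage format.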

As you can imagine, this makes a lot of things in Entity simpler. My last post on DataModel included this plot:

[Image: dm-complexity]

This is what the same plot looks like on the development branch of the DataModel 1.0 release:

[Image: entity53]

We nearly halved the complexity of our most complex class \o/. Some more stats: so far we have changed 47 files, with 768 additions and 2382 deletions. With these changes, our Scrutinizer CI quality rating went from 8.23 to 8.76. The release is definitely not done yet though: the big changes already described make a lot of smaller cleanups possible. We also have an incentive to kill deprecated things in this release, since we’ll be following semver properly afterwards and will have to bump to 2.x whenever we make a breaking change.

Entity has an equality method. This used to work by putting the array data the Entity held through a generic comparer object. Since we needed a replacement for this, I made most value objects in DataModel implement the Comparable interface. This was already made available in the 0.7.4 release. Now Entity simply delegates to the equals methods of the objects it holds, letting them decide how to compute equality for their type. This fixed quite a few inconsistencies that could occur in the old code (depending on how exactly you set data in the first place), such as SiteLink badges incorrectly being compared in an order-dependent fashion.
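A minimal sketch of this kind of delegation is shown below. The names ComparableSketch, BadgedSiteLink and LinkedItem are hypothetical, and sorting the badges before comparing them is just one assumed way of making the comparison order-independent, not necessarily how DataModel does it.

```php
<?php
// Hypothetical interface, modelled loosely on the idea of a Comparable with equals().
interface ComparableSketch {
    public function equals( $target );
}

final class BadgedSiteLink implements ComparableSketch {
    private $siteId;
    private $pageName;
    private $badges;

    public function __construct( $siteId, $pageName, array $badges = array() ) {
        $this->siteId = $siteId;
        $this->pageName = $pageName;
        $this->badges = $badges;
    }

    public function getSiteId() {
        return $this->siteId;
    }

    public function equals( $target ) {
        if ( !( $target instanceof self ) ) {
            return false;
        }

        // Badges form a set, so sort copies of both lists before comparing.
        $mine = $this->badges;
        $theirs = $target->badges;
        sort( $mine, SORT_STRING );
        sort( $theirs, SORT_STRING );

        return $this->siteId === $target->siteId
            && $this->pageName === $target->pageName
            && $mine === $theirs;
    }
}

final class LinkedItem implements ComparableSketch {
    private $siteLinks = array();

    public function __construct( array $siteLinks ) {
        // Index by site id so lookups do not depend on insertion order.
        foreach ( $siteLinks as $siteLink ) {
            $this->siteLinks[$siteLink->getSiteId()] = $siteLink;
        }
    }

    public function equals( $target ) {
        if ( !( $target instanceof self )
            || count( $this->siteLinks ) !== count( $target->siteLinks ) ) {
            return false;
        }

        // Delegate to each held object; it decides what equality means for its type.
        foreach ( $this->siteLinks as $siteId => $siteLink ) {
            if ( !isset( $target->siteLinks[$siteId] )
                || !$siteLink->equals( $target->siteLinks[$siteId] ) ) {
                return false;
            }
        }

        return true;
    }
}

$a = new LinkedItem( array( new BadgedSiteLink( 'enwiki', 'Douglas Adams', array( 'Q1', 'Q2' ) ) ) );
$b = new LinkedItem( array( new BadgedSiteLink( 'enwiki', 'Douglas Adams', array( 'Q2', 'Q1' ) ) ) );
$c = new LinkedItem( array( new BadgedSiteLink( 'enwiki', 'Douglas Adams', array( 'Q1' ) ) ) );

$sameDespiteBadgeOrder = $a->equals( $b );
$differentBadges = $a->equals( $c );
```

Here $a and $b compare as equal even though their badges were supplied in a different order, while $c does not, since it holds a different set of badges.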

Further splitting of Entity is on the roadmap, though perhaps not for the 1.0 release. For a list of 1.0 changes made so far, check the release notes.
