Evolving Architecture by Cheating

Both Data and Behaviour Are Important

It’s beyond the scope of this post to review the last three or four decades of how data and behaviour battled for importance. The answer that satisfies both is events. Events capture both data and behaviour. An event is a statement of fact: “Customer became gold star discount status”, “Distributor discontinued the item”. These are expressions of things we have no control over and that our system must understand; otherwise, they are not facts. If these are well defined and agreed on, our solution will be able to adhere easily to a technical vision that everyone in the organization understands.

What we discover with events is the concept of “ubiquitous language” (Evans, Domain-Driven Design, 2004). When we use the word “customer”, “product” or “gold star discount”, everyone in the company knows what is being discussed. This includes the details of events, which answers our focus on data. An event like “customer signed up” will have some data associated with it, such as the time of the sign-up, the customer’s name, their address, etc. The terms used to describe the data of the event should not be alien to anyone in the organization, up to and including the CEO.
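To make the idea of “an event with data” concrete, here is a minimal sketch in Python. The class and field names are assumptions chosen to mirror the examples above; they are not taken from any particular system.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


# frozen=True: a fact cannot be changed after it has happened.
@dataclass(frozen=True)
class CustomerSignedUp:
    customer_id: str        # hypothetical identifier
    name: str
    address: str
    signed_up_at: datetime


@dataclass(frozen=True)
class CustomerBecameGoldStar:
    customer_id: str
    granted_at: datetime


# The field names are the ubiquitous language: anyone in the company
# should be able to read this and recognize what it describes.
event = CustomerSignedUp(
    customer_id="c-1",
    name="Ada Example",
    address="1 Sample Street",
    signed_up_at=datetime(2014, 1, 15, 9, 30, tzinfo=timezone.utc),
)
```

The event name is in the past tense and the object is immutable, which is what separates a fact from a mutable record.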

The events also act as building blocks for the implementation of the system. They show what the aggregates of a system are. Aggregates are a concept from DDD: a specific way of thinking about objects, specifically closely related objects. An example would be an invoice object. It could have a few invoice line objects associated with it, and perhaps a customer name and address object. A border around this object doesn’t allow relationships with other entities. We don’t hold a reference to a customer that we can act upon from the invoice business logic; that is up to orchestration. Further, we do not manipulate the invoice line objects or the customer name and address objects directly. We can only manipulate them through the invoice object, in adherence to the Tell, Don’t Ask principle. These two restrictions give behaviour a central place and draw an important transactional boundary, which gives this entity freedom of mobility as it is not directly tied to other information in the system.
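The invoice aggregate described above could be sketched roughly as follows. The method names and the two business rules are assumptions made up for illustration; the point is only the shape: no reference to a customer aggregate, and no direct access to the lines.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class InvoiceLine:
    description: str
    amount: Decimal


@dataclass(frozen=True)
class BillingDetails:
    # A copy of the name and address, not a reference to a Customer object.
    customer_name: str
    customer_address: str


class Invoice:
    """Aggregate root: the only entry point for changing lines or billing details."""

    def __init__(self, invoice_id, billing):
        self.invoice_id = invoice_id
        self._billing = billing
        self._lines = []
        self._issued = False

    def add_line(self, description, amount):
        # Tell, Don't Ask: the caller states an intent and the invoice
        # enforces its own rules (both rules here are hypothetical).
        if self._issued:
            raise ValueError("cannot add lines to an issued invoice")
        if amount <= 0:
            raise ValueError("line amount must be positive")
        self._lines.append(InvoiceLine(description, amount))

    def issue(self):
        if not self._lines:
            raise ValueError("cannot issue an empty invoice")
        self._issued = True


invoice = Invoice("inv-1", BillingDetails("Ada Example", "1 Sample Street"))
invoice.add_line("Widget", Decimal("10.00"))
invoice.issue()
```

Because everything the invoice needs lives inside its border, it can be loaded, changed and saved in one transaction without touching anything else.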

Get Both Business and Technical Vision Understood Right Away

So how are the correct events for the system discovered? The system must be understood by everyone. The best way to achieve that is through an exercise called Event Storming. Event Storming is a term coined by Alberto Brandolini as he refined Greg Young’s approach to whiteboard sessions about workflow design with events, commands and document messages. In the exercise, representatives from all of the organization’s roles gather to discuss, argue and brainstorm about what events, or facts, the system they are building has. These will dictate what the Lego blocks for building our system will be. The value of the exercise lies in the discussion of the events and specifically what data they contain. A CEO should know that when the service rep puts a “customer signed up” event up on the board, the rep has included a Visa card number as a property of that event.

The exercise also defines commands and documents. Subsystems, or domains/bounded contexts, can also be defined by gathering related events and other messages together. These are logical layouts, and the implementation for an initial system may be held in a single web application. The important parts of the design have been put in place. Evolving the system now has some guidance through the organization of both the data and the behaviour via these constructs. It is important that the concepts permeate both the technical and non-technical discussions, and in so doing produce the model that best reflects the business.

Make Specifications Easy to Write

The mechanism of event sourcing forces us to replay past events for an object to get its current state. While this looks like quite the hoop to jump through, it’s actually an excellent vector to a simple and correct system. Anyone in the organization can easily specify behaviour (previously referred to as state transitions) with commands, which show intent. The setup for the specification is simply the previous events that pertain to the object. Similarly, success can only be measured by checking that the correct event has been generated by the object after processing the command. Alternatively, we can look for failure via an exception, a failure event, or a combination of both. An example of the form can be paraphrased as “GIVEN: the customer signed up with these parameters, WHEN: they try to purchase a product for $100, THEN: a ‘large purchase not allowed for new customer’ exception is thrown”. A simple JavaScript example can be found in the OpenParking project.
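The GIVEN/WHEN/THEN form above can be sketched in a few lines of Python. This is not the OpenParking example; the class names, the purchase limit and the rule that a customer is “new” until their first purchase are all assumptions made for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CustomerSignedUp:
    customer_id: str
    name: str


@dataclass(frozen=True)
class ProductPurchased:
    customer_id: str
    amount: Decimal


class LargePurchaseNotAllowedForNewCustomer(Exception):
    pass


class Customer:
    NEW_CUSTOMER_LIMIT = Decimal("50")  # hypothetical threshold

    def __init__(self, history):
        # Event sourcing: current state is rebuilt by replaying past events.
        self._customer_id = None
        self._purchases = 0
        for event in history:
            self._apply(event)

    def _apply(self, event):
        if isinstance(event, CustomerSignedUp):
            self._customer_id = event.customer_id
        elif isinstance(event, ProductPurchased):
            self._purchases += 1

    def purchase(self, amount):
        # Command handler: returns the new events, or raises on a rule violation.
        if self._purchases == 0 and amount > self.NEW_CUSTOMER_LIMIT:
            raise LargePurchaseNotAllowedForNewCustomer(amount)
        event = ProductPurchased(self._customer_id, amount)
        self._apply(event)
        return [event]


def spec_new_customer_cannot_make_large_purchase():
    given = [CustomerSignedUp("c-1", "Ada Example")]   # GIVEN
    customer = Customer(given)
    try:
        customer.purchase(Decimal("100"))              # WHEN
    except LargePurchaseNotAllowedForNewCustomer:
        return True                                    # THEN
    return False
```

Note that the specification never inspects the customer’s state. It only supplies past events, sends a command, and looks at what comes out.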

These specifications should eventually be easily understood by everyone, as they could be authored in plain text or displayed in a readable format with proper tooling. This will be the subject of the next post, as it is very important and deserves its own article. For now, this will get the job done and get all behaviour specified. The key is that there was no distraction over how data is presented to the user. Attention stayed focused on getting the source of truth right.

Reading State is Off-Limits

To display data, it is tempting to add properties on the object that expose its state to the user. One of the keys to keeping complexity down is to adhere to Command Query Responsibility Segregation, which applies the concept of CQS (Command Query Separation) at the architectural level. The object was built for processing commands; thus, it cannot be queried. Much like a nuclear physics experiment, we can only “bombard” our object with commands and look for evidence of particles that flew off from the reaction. If queries were allowed, the objects would quickly become a tangle of code that tries to process commands and then provide the state in lots of representations. That alone is reason enough not to introduce this code. Other reasons have to do with future concerns of scalability.

So the only solution is to build listeners (event handlers), which are the start of the idea of the read model: the etchings of particles that flew off. Most proponents of this architecture will bring the concept of read models in right away. It’s some store of information (it could be a relational database or a set of files) that gets populated when events are emitted after processing commands successfully. This store, in turn, can be queried to populate some UI or serve some API that requires queries to be performed. Meanwhile the object itself remains free of any querying complexities.
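A listener that populates a read model can be sketched as below. The tiny in-process bus and the dictionary standing in for a table are assumptions for illustration; a real system would pick its own transport and storage.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerSignedUp:
    customer_id: str
    name: str


class EventBus:
    """Minimal in-process publish/subscribe, keyed by event class."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event):
        for handler in self._handlers[type(event)]:
            handler(event)


class CustomerListReadModel:
    """Stands in for a table or file that a UI would query."""

    def __init__(self):
        self.names_by_id = {}

    def on_customer_signed_up(self, event):
        self.names_by_id[event.customer_id] = event.name


bus = EventBus()
read_model = CustomerListReadModel()
bus.subscribe(CustomerSignedUp, read_model.on_customer_signed_up)

# Published after the domain object has processed a command successfully.
bus.publish(CustomerSignedUp("c-1", "Ada Example"))
```

All query concerns live in the handler and its store, so the domain object never grows a getter for the sake of a screen.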

The Shortcut

But is a read model actually needed yet? It sounds like a lot of work to display information we already have in our object. The answer is no. There is a way to leave the domain object alone and get queries satisfied without a Read Model. The system can query the same event store that is used to bring the domain objects to their correct state. This is the agile practice of the “last responsible moment” applied to architecture. The interesting part is that this mechanism can be left in indefinitely, or performance issues may require it to be replaced immediately; it depends entirely on the Service Level Agreements your solution demands. Systems exist today that have only ever read events directly from the event store for queries and never added a relational database or key-value store to act as a read model.
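Answering a query straight from the events amounts to folding over the stream at read time. The sketch below assumes events are available as plain dictionaries with a "type" key; that shape, and the event names, are illustrative only.

```python
# An in-memory list stands in for whatever store holds the events.
events = [
    {"type": "CustomerSignedUp", "customer_id": "c-1", "name": "Ada Example"},
    {"type": "CustomerSignedUp", "customer_id": "c-2", "name": "Bob Example"},
    {"type": "CustomerBecameGoldStar", "customer_id": "c-1"},
]


def gold_star_customer_names(event_stream):
    """Answer 'who are our gold star customers?' by scanning the events."""
    names = {}
    gold = set()
    for event in event_stream:
        if event["type"] == "CustomerSignedUp":
            names[event["customer_id"]] = event["name"]
        elif event["type"] == "CustomerBecameGoldStar":
            gold.add(event["customer_id"])
    return sorted(names[customer_id] for customer_id in gold if customer_id in names)


gold_names = gold_star_customer_names(events)
```

The cost is a scan per query, which is exactly the trade the “last responsible moment” accepts until the SLAs say otherwise.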

What’s the cheapest event store? The most obvious, very low-tech solution is to store the events on disk and search them textually, using the file system libraries that come with the technology chosen. As performance becomes an issue, we can add some structure and conventions to this storage. An append-only file that grows forever can house all serialized events. Indexes can be built and trimmed in a similar fashion. The next step would be to simply add a secondary disk that stores an exact copy of the events and read from it if IO bottlenecks are hit due to concurrent reads and writes on the original disk. In a few years, the source of information for screens may well live in tables in a relational database, but that overhead didn’t need to be added at the beginning. Not doing so didn’t hinder delivering the best solution; it helped.
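An append-only file with a simple index might look like the sketch below. The one-JSON-object-per-line layout, the "stream" field and the class name are assumptions; only the standard library is used, in keeping with the low-tech spirit.

```python
import json
import os
import tempfile
from collections import defaultdict


class FileEventStore:
    """Append-only file of JSON lines plus an in-memory index of byte offsets."""

    def __init__(self, path):
        self._path = path
        self._offsets = defaultdict(list)  # stream id -> offsets of its events
        open(path, "ab").close()           # create the file if it is missing
        self._rebuild_index()

    def _rebuild_index(self):
        # The index is derived data: it can always be rebuilt from the file.
        with open(self._path, "rb") as f:
            while True:
                offset = f.tell()
                line = f.readline()
                if not line:
                    break
                self._offsets[json.loads(line)["stream"]].append(offset)

    def append(self, stream, event_type, data):
        record = json.dumps({"stream": stream, "type": event_type, "data": data})
        with open(self._path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # position where this event starts
            f.write(record.encode("utf-8") + b"\n")
        self._offsets[stream].append(offset)

    def read_stream(self, stream):
        with open(self._path, "rb") as f:
            for offset in self._offsets.get(stream, []):
                f.seek(offset)
                yield json.loads(f.readline())


path = os.path.join(tempfile.mkdtemp(), "events.jsonl")
store = FileEventStore(path)
store.append("customer-c-1", "CustomerSignedUp", {"name": "Ada Example"})
store.append("customer-c-2", "CustomerSignedUp", {"name": "Bob Example"})
store.append("customer-c-1", "CustomerBecameGoldStar", {})
types = [event["type"] for event in store.read_stream("customer-c-1")]
```

The same `read_stream` call serves both purposes discussed here: replaying events to rebuild a domain object and answering queries. This sketch ignores concurrent writers and crash recovery, which a production store would have to handle.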

A system that only uses a language’s native file access libraries has a very small technology footprint. There is no need to keep up with security updates, new features, deprecated features and incompatibilities with other libraries and frameworks; the development effort can concentrate on the business needs rather than on whether the latest version of one framework will work with another. Indexing is fairly trivial to add, as an event stream is an append-only mechanism and events are immutable.

Do This Everywhere

While the obvious beneficiary of this practice is a startup that has to get things right and operate on a very small budget, it can be done in other places. In an organization that has a draconian approach to data and has signed a deal with the devil where everything has to be stored in an Oracle database, a simple two-column key-value table will do. This makes it easy to implement a new system and circumvent red tape, with as few constraints as a new startup would face. This effectively gives larger companies a method to compete with startups.
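A two-column key-value table used as an event store could be sketched as follows. SQLite stands in for the mandated database purely so the example runs anywhere; the table name, column names and key convention are all assumptions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the corporate database
conn.execute("CREATE TABLE events (k TEXT PRIMARY KEY, v TEXT NOT NULL)")


def append_event(stream, version, event_type, data):
    # Key convention (an assumption): "<stream>:<zero-padded version>",
    # so the keys of one stream sort in the order the events happened.
    key = f"{stream}:{version:010d}"
    value = json.dumps({"type": event_type, "data": data})
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO events (k, v) VALUES (?, ?)", (key, value))


def read_stream(stream):
    # ";" is the character right after ":" in ASCII, so this range
    # matches exactly the keys that start with "<stream>:".
    rows = conn.execute(
        "SELECT v FROM events WHERE k >= ? AND k < ? ORDER BY k",
        (f"{stream}:", f"{stream};"),
    )
    return [json.loads(value) for (value,) in rows]


append_event("customer-c-1", 1, "CustomerSignedUp", {"name": "Ada Example"})
append_event("customer-c-1", 2, "CustomerBecameGoldStar", {})
history = read_stream("customer-c-1")
```

A side effect of making the key the primary key is that two writers cannot both store version 2 of the same stream; the second insert fails, which gives a crude form of optimistic concurrency.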

No Fair! You’re Cheating!

The key is to cheat, but cheat in a smart way. Agile cheated: it didn’t bother with huge design up front. To get the best architecture done quickly from the start, the Read Model was intentionally skipped. Lots of other software solution dogma can be omitted as well. But what must remain unchanged and uncompromised is the source of truth for everything. The best way to do this is to realize that events are facts and vice versa, and to make everything else depend on them. Publish them when they get created. All other parts of the system become subscribers to this information. The dreaded rewrite of version 1 doesn’t have to happen in a startup. Most organizations don’t have the facts straight or represented properly in the implementation because they end up being implied by mountains of code. Isolate the source of facts and state transitions as a top level concern apart from the rest of the system

…and go ahead and query the events directly while you can get away with it.
