The dinner/workshop last night on Ensemble Modeling at Top of Minds was great. It’s always a treat when you put a bunch of people in a room who all have solutions to the same old common problems in Data Warehousing, but where the solutions themselves have taken on very different forms. Different as they are, all of the techniques represented at the meeting (Anchor modeling, Data Vault, Focal Point, Hyperagility, and Head & Version) fall under Ensemble Modeling and share some important common denominators that separate them from other techniques, such as 3NF and Dimensional Modeling.
Much of the discussion boiled down to what these denominators are. Wherever our approaches differ, those differences cannot be part of the definition of Ensemble Modeling. In our latest paper we view an ensemble as the concept that can remember statements about similar things, usually by storing them in a database. The act of modeling is to define the boundaries between similar and dissimilar things. In other words, an ensemble should capture the essence of things similar enough to not belong to another ensemble, or loosely speaking “be the type of the thing”. In order to single out things belonging to the same ensemble, so-called instances, all forms of Ensemble Modeling assume that some part of the thing is immutable. This is the ‘glue’ that holds the thing together through time, or rather, keeps what we know about the thing together. Even if a thing itself has vanished, the memory of it can live forever.
This immutable part, joyfully referred to as “the king of the thing” at the dinner, is represented by an anchor table in Anchor, a hub in Data Vault, a focal in Focal Point, a head in Head & Version, and also a hub in Hyperagility. It is to this that the rest of the ensemble is attached. That rest, the mutable part, is decomposed into one or many actual parts depending on the technique. Head & Version groups all mutable values, as well as relationships, in a single table. The others have some degree of decomposition, but what drives the separation differs between techniques. Focal Point has predefined parts corresponding to abstract concept types, Data Vault has parts depending on rate of change and sometimes point of origin, Anchor has one part per role a value plays with respect to the thing, and Hyperagility has an extensible name-value-pair nature. In these respects Ensemble Modeling stands out from other techniques, where mutable and immutable parts are not distinguishable.
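As a rough illustration of the split between the immutable and the mutable part, here is a minimal Python sketch. It is closest in spirit to the one-part-per-role decomposition and does not claim to show how any of the techniques is actually implemented (they are table designs in a database). All names in it, such as `Identity`, `Instance`, `remember`, and `as_of`, as well as the sample customer data, are hypothetical and chosen only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date
from typing import Any


@dataclass(frozen=True)
class Identity:
    """The immutable part (hypothetical stand-in for an anchor, hub, focal, or head)."""
    ensemble: str  # which ensemble the thing belongs to, e.g. "Customer"
    key: int       # identifier that never changes once assigned


@dataclass
class Instance:
    """An instance: one immutable identity with mutable parts attached to it."""
    identity: Identity
    # Assumption for this sketch: one history per role, each a list of
    # (valid_from, value) statements.
    parts: dict[str, list[tuple[date, Any]]] = field(default_factory=dict)

    def remember(self, role: str, valid_from: date, value: Any) -> None:
        """Append a statement; earlier statements are never overwritten."""
        self.parts.setdefault(role, []).append((valid_from, value))

    def as_of(self, role: str, when: date) -> Any:
        """Return the latest value recorded for `role` at or before `when`, or None."""
        known = [(d, v) for d, v in self.parts.get(role, []) if d <= when]
        return max(known, key=lambda pair: pair[0])[1] if known else None


# The identity stays fixed while what we know about the thing changes over time.
customer = Instance(Identity("Customer", 42))
customer.remember("name", date(2020, 1, 1), "Ada Smith")
customer.remember("name", date(2023, 6, 1), "Ada Jones")

name_in_2021 = customer.as_of("name", date(2021, 1, 1))  # "Ada Smith"
name_in_2024 = customer.as_of("name", date(2024, 1, 1))  # "Ada Jones"
```

Grouping all roles into a single history per instance would move the sketch toward the Head & Version style, while keying the parts by something other than role (rate of change, say) would move it toward the other decompositions described above.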
To conclude the evening, we realized that, somehow, all of us have come up with the same fundamental ideas as solutions to problems we have seen and experienced in our careers. Even if these diverge in many respects, it is what they share at their core that has proven so successful where traditional techniques have failed. All in all, I am glad to be able to influence and be influenced by like-minded people, and hope that we have many more sessions yet to come. I suppose there is some truth to the old proverb “great minds think alike” after all.