Schema by Design

Lately, there’s been a lot of talk about when a schema should be applied to your data. This has led to a division of databases into two camps: those that do schema on write and those that do schema on read. The former is the more traditional, with relational databases as the main proponent, in which data has to be fitted into a predetermined schema before it can be written. The latter is the new challenger, driven by NoSQL solutions, in which data is stored more or less exactly as it arrives. In all honesty, both are pretty poor choices.

Schema on write imposes too much structure too early, so information is lost when it is molded into a shape that fits the model. Schema on read, on the other hand, leaves so much to the inquiring party to make sense of that understandability is lost. Wouldn’t it be great if there were a way to keep all information and at the same time impose a schema that makes it understandable? In fact, now there is a way, thanks to the latest research in information modeling and the transitional modeling technique.

Transitional modeling takes a middle road between schema on read and schema on write that I would like to call schema by design. It imposes the theoretical minimum of structure at write time, from which large parts of a schema can be derived. It is then up to modelers, who may even disagree on classifications, to provide enough auxiliary information for others to understand what the model represents. This “metainformation” becomes a part of the same body of information it describes, and abides by the same rules with the same minimum of structure.

But why stop there? As it turns out, types and identifiers can be described in the same way. They may be disagreed upon, be uncertain, or vary over time, just like information in general, so of course all that can be recorded. In transitional modeling you can go back to any point in time and answer an inquiry as it would have been answered then, and from the point of view of anyone who had an opinion at the time. Actually, it does not even stop there, since constraints over the information, like cardinalities, are also represented in the same way. It all follows the same minimum of structure.

What then is this miraculous structure? Well, it relies on two constructs only, called posits and assertions, both of which are given proper treatment in our latest scientific paper, entitled “Modeling Conflicting, Unreliable, and Varying Information”. It can be read and downloaded from ResearchGate or from the Anchor Modeling homepage. If you have an interest in information modeling, and what the future holds, give it an hour. Trust me, it will be well spent…
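To give a feel for what answering an inquiry "as of then, according to whom" can look like, here is a minimal sketch in Python. Treat it as an illustration only: the class shapes, the field names (`appearances`, `since`, `certainty`, `asserted_at`), the `answer` function, and the rule that a certainty above zero means "believed" are all assumptions made for this example and are not taken from the paper, which gives the proper definitions.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Posit:
    # Assumed shape: which identifiers appear in which roles, a value,
    # and the time from which that value is said to hold.
    appearances: FrozenSet[Tuple[int, str]]
    value: str
    since: date


@dataclass(frozen=True)
class Assertion:
    # Assumed shape: someone's opinion about a posit, held from a given time.
    asserter: str
    posit: Posit
    certainty: float  # assumption: > 0 means believed, 0 means no opinion
    asserted_at: date


def answer(assertions: Iterable[Assertion], asserter: str,
           appearances: FrozenSet[Tuple[int, str]],
           valid_at: date, known_at: date) -> Optional[str]:
    """Value `asserter` believed at `known_at` to hold at `valid_at`."""
    latest = {}  # most recent opinion per posit, as of known_at
    for a in assertions:
        if a.asserter != asserter or a.asserted_at > known_at:
            continue
        if a.posit.appearances != appearances or a.posit.since > valid_at:
            continue
        prev = latest.get(a.posit)
        if prev is None or a.asserted_at > prev.asserted_at:
            latest[a.posit] = a
    believed = [a.posit for a in latest.values() if a.certainty > 0]
    if not believed:
        return None
    # Among believed posits, the one that most recently took effect wins.
    return max(believed, key=lambda p: p.since).value


# Hypothetical data: two modelers disagree on how thing 42 is classified.
thing_class = frozenset({(42, "class")})
prospect = Posit(thing_class, "Prospect", date(2019, 6, 1))
customer = Posit(thing_class, "Customer", date(2019, 6, 1))

assertions = [
    Assertion("Alice", prospect, 1.0, date(2020, 1, 1)),
    Assertion("Bob", customer, 1.0, date(2020, 1, 1)),
    # Alice later changes her mind: retracts one opinion, adopts the other.
    Assertion("Alice", prospect, 0.0, date(2021, 1, 1)),
    Assertion("Alice", customer, 1.0, date(2021, 1, 1)),
]

alice_then = answer(assertions, "Alice", thing_class, date(2020, 6, 1), date(2020, 6, 1))
bob_then = answer(assertions, "Bob", thing_class, date(2020, 6, 1), date(2020, 6, 1))
alice_now = answer(assertions, "Alice", thing_class, date(2020, 6, 1), date(2021, 6, 1))
```

Nothing is ever updated or deleted in this sketch. Alice's change of mind is just two more records, which is why the earlier answer can still be reproduced by asking with an earlier `known_at`. Note also that the classification itself is stored as an ordinary record, in the same structure as any other information.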

Published by

Lars Rönnbäck
