Chris Conway
Chief Architect, Quantiv
In last month’s blog I talked about how useful data models are for understanding an application’s operation. That reminded me that, while this holds as a general principle, not all data is created equal for the purpose. Or, more precisely, not all data models (metadata) are the same.
Our company name, Quantiv, hints at where our preferences lie (more on this later). But that doesn’t mean some models are more valuable than others in an absolute sense. Instead, it means a model’s value varies depending on how it’s used. And in what feels like a world with an ever-increasing level of uncertainty – AI, climate change, political and economic upheaval – it’s crucial to know which data to use at a particular time.
The importance of context to metadata
As an example, having a detailed model of a customer’s attributes (address, age, income, etc.) could help to understand the differences between the types of customers supported by an application.
However, that ‘static’ model would be much less useful in understanding the different types of services the application provides, even to the same customers. For that, a detailed ‘dynamic’ model covering the application’s behaviour (for example orders, invoices, payments) would be needed.
In fact, how the model is used – the context – is as important to metadata as it is to data.
Perhaps the most crucial distinction is the one between ‘static’ structural models and ‘dynamic’ interaction models.
In this instance, static doesn’t suggest the data is unchanging. Instead, it refers to the data’s role of providing dimensional structures or perspectives for an application’s operations. In effect, it defines what the application used.
This contrasts with dynamic interaction data, which records the details about those application operations, i.e. what an organisation did with what it used.
This might sound like a pedantic, technical distinction. But it can be key to understanding the nature of the data, the volumes and varieties likely to be collected, and how it should be stored.
The difference between structural and interaction data
Typically, structural data will have a finite number of records. And while that volume might be large, and even increase over time, there’s a theoretical maximum number of rows that can be collected, such as the number of customers in the market.
However, the number of fields and the variety of possible data values mean the number of combinations can be extremely high, if not almost infinite, especially if human responses are included.
And this variation can happen even for a particular type of data. For example, some customers may have provided a favourite colour, while others provided favourite musicians.
Storing data of this sort needs flexibility and the ability to search across many fields. So, it’s often better to store it in ‘schema-less’ document databases rather than the more popular relational (or tabular) databases.
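To make the idea concrete, here is a minimal sketch of structural data held as schema-less documents. It uses plain Python dictionaries rather than a real document database, and the customer records, field names and helper functions are all invented for illustration.

```python
# Hypothetical customer records: each document carries only the fields
# that particular customer supplied, so the field sets differ.
customers = [
    {"id": 1, "name": "Asha", "city": "Manchester", "favourite_colour": "green"},
    {"id": 2, "name": "Ben", "city": "Leeds", "favourite_musician": "Nina Simone"},
    {"id": 3, "name": "Cara", "favourite_colour": "blue", "income_band": "B"},
]


def find_by_value(documents, term):
    """Return ids of documents where any field value contains the term."""
    term = term.lower()
    return [
        doc["id"]
        for doc in documents
        if any(term in str(value).lower() for key, value in doc.items() if key != "id")
    ]


def field_names(documents):
    """Return the sorted union of field names across all documents."""
    return sorted({key for doc in documents for key in doc})


print(find_by_value(customers, "green"))
print(field_names(customers))
```

The search does not need to know in advance which fields exist, which is the kind of flexibility a fixed set of table columns makes harder.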
This contrasts with interaction data, which has an unconstrained volume (hence dynamic) but is more limited in variety. Interaction data grows every time an operational activity is performed and so its volume could increase indefinitely. For example, while the number of customers may be restricted, the number of orders they could place is almost unlimited.
The variety is more constrained because an organisation’s processes impose limits on the different ways in which operations can be performed – and therefore on the structure and possible values of the fields in the data collected. So, while there might be a lot of rows, it’s likely many of them will look the same, particularly once quantity and time values are excluded.
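As a rough illustration of that last point, the sketch below counts how many distinct ‘shapes’ remain in a handful of order rows once quantity and time are set aside. The rows, the field names and the choice of which fields count as measures are assumptions made for the example.

```python
from collections import Counter

# Hypothetical order events; every row has the same fields.
orders = [
    {"customer": "C1", "product": "widget", "channel": "web", "quantity": 3, "placed_at": "2024-01-05T09:12"},
    {"customer": "C1", "product": "widget", "channel": "web", "quantity": 1, "placed_at": "2024-02-11T14:40"},
    {"customer": "C2", "product": "gadget", "channel": "phone", "quantity": 2, "placed_at": "2024-02-12T10:03"},
    {"customer": "C1", "product": "widget", "channel": "web", "quantity": 7, "placed_at": "2024-03-01T16:25"},
]

# Fields treated as quantity and time values (an assumption for this example).
MEASURES = {"quantity", "placed_at"}


def shape(row):
    """Return the row's non-measure fields as a hashable, order-independent key."""
    return tuple(sorted((k, v) for k, v in row.items() if k not in MEASURES))


# Count how many rows share each shape.
shape_counts = Counter(shape(row) for row in orders)

print(len(orders), "rows,", len(shape_counts), "distinct shapes")
```

Here four rows collapse to two shapes; in a real system the ratio of rows to shapes would be expected to grow as more activity is recorded.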
Data of this sort doesn’t need the flexibility of a document database, and searching is limited to specific fields, so this can fit better into a relational database. But even there, a more specialised interaction database can perform better than a raw relational one.
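A small sketch of the relational case, using Python’s built-in sqlite3 module with an in-memory database. The table layout, column names and sample rows are hypothetical, and this shows a plain relational table rather than a specialised interaction database.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Fixed schema: every order row has the same, known set of columns.
conn.execute(
    """
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        product     TEXT    NOT NULL,
        quantity    INTEGER NOT NULL,
        placed_at   TEXT    NOT NULL
    )
    """
)

# Searches are limited to specific fields, so those fields can be indexed.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

conn.executemany(
    "INSERT INTO orders (customer_id, product, quantity, placed_at) VALUES (?, ?, ?, ?)",
    [
        (1, "widget", 3, "2024-01-05"),
        (1, "widget", 1, "2024-02-11"),
        (2, "gadget", 2, "2024-02-12"),
    ],
)

# Total quantity ordered per customer.
totals = conn.execute(
    "SELECT customer_id, SUM(quantity) FROM orders "
    "GROUP BY customer_id ORDER BY customer_id"
).fetchall()

print(totals)
```

The number of rows can keep growing with each new order, but the columns and the queries against them stay the same.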
‘Mega-metadata’
The structural and interaction models could also be combined to produce a larger view of an application’s operation, which you could call ‘mega-metadata’. And while you might expect such a model to be more informative, the extra complexity of the data can make those individual views less clear. So, a definite case of ‘less is more’.
Equally, the fact that a combination is possible shouldn’t obscure how different the two kinds of data are. This is apparent in Quantiv’s products: we concentrate on the ‘Quantities’ and ‘time values’ associated with organisational events rather than the characteristics of participants in those events (hence the company name).
We know operational metrics based on this interaction data are critical in helping organisations to understand their current operations, and to predict and plan for what might happen in the future.
Our NumberWorks method is designed to identify and define those operational metrics and the context in which they exist. And our NumberCloud product collects that dynamic data, stores it, and exposes it across different applications.
To find out more, talk to the Quantiv team on 0161 927 4000 or email: info@quantiv.com