(US and Canada) Phil Miller, Co-CEO and Co-Founder, Solidatus, speaks with Baz Khuti, President US Modak, about the role of data lineage, the co-existence of automation and manual decluttering of data, and data privacy.
Miller addresses the importance of data lineage first, noting, "It's the operational way that the organization is built." He says that not understanding lineage poses a concern while building data infrastructure in an organization. The first part of lineage is to gather knowledge about the data by making connections with the people handling it.
At Solidatus, all the acquired knowledge is then put into a visual map. Miller compares it to the London tube map: if one of the tube lines has delays, one must think of other routes to get from one station to another.
"We're trying to crowdsource information, collaboratively putting it into a place that everybody who would have permission can use it efficiently," he says.
Regarding the challenges of building the visual map of data lineage, Miller identifies two problems. The first is discovery: figuring out how to get the organization's map into Solidatus as a blueprint. The second arises once the organization is mapped: making a change to the blueprint.
He explains that the first problem is "highly automated," while the second requires people who are motivated to drive the change and who understand its requirements and rationale. Solidatus can then be used to "build the requirements and to simulate the change, and to see if it works, to see the cost of ownership of things."
In an organization, Miller points out, 75% to 80% of the blueprint can be achieved by automation. However, there will always be aspects that require human engagement, particularly concerning historic source data that is long lost.
As for layout, Miller maintains that data and metadata applications fail because their layouts seem complex. "One of the joys of using Solidatus is that we made it a bit of a game, making it a pleasant experience for the user," he says. "We try all points to give you the power as a user to define the height above other people's datasets, even your own, so that you work hard and can focus on the important things. We try to declutter things as quickly as possible."
Miller emphasizes the need to focus on improving one's own application and recording its effect on users, rather than on knowing what other applications are available. "Data isn't expensive, complexity is. And complexity hides problems. Solidatus declutters data to understand what is important and what is not," he says.
He describes privacy as an "interesting factor" for a data professional. In the '90s, he says, the application and delivery of data moved extremely fast. That is no longer the case, and it seems that the work done by IT professionals in past years has slowed down the delivery of IT today. "Privacy is probably one of the elements at the root of this," Miller says. "You've got to use data for the things that people expect you to use it for."
There isn't just one rule about the use of data, and there isn't just one country in today's connected world, which further complicates the maintenance of privacy, he says. When data is managed and used across borders and organizations, concerns arise about trust in data sources and about data protection laws, anchoring the operational force. He recalls that before the introduction of GDPR, although there was data protection, "We were kind of just stuck in the gaze of the headlights of who can use what data, and it became very manual immediately."
Miller asserts that with a logically processed blueprint, data is codified. The blueprint answers the frequently asked questions about the storage, location, and purpose of data. "We can draw a graph of these inside Solidatus. We can link all these concerns, unlike the traditional way of doing this across borders and organizations, while adhering to innumerable business rules," he says.
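To make the idea of a codified blueprint concrete, here is a minimal sketch of a lineage graph in Python. It is not Solidatus's data model or API. The dataset names, attributes, and helper functions are all hypothetical, chosen only to show how storage, location, and purpose can be attached to nodes and how a cross-border flow can be found by walking the graph.

```python
from collections import deque

# Hypothetical lineage blueprint: each node records where the data is stored,
# which jurisdiction it sits in, and what it is used for. Illustrative only.
nodes = {
    "crm.customers": {"storage": "PostgreSQL", "location": "UK", "purpose": "account management"},
    "dw.customer_dim": {"storage": "warehouse", "location": "EU", "purpose": "reporting"},
    "mkt.campaigns": {"storage": "SaaS tool", "location": "US", "purpose": "marketing"},
    "fin.invoices": {"storage": "ERP", "location": "UK", "purpose": "billing"},
}

# Directed edges: data flows from the key to each dataset in the list.
edges = {
    "crm.customers": ["dw.customer_dim", "fin.invoices"],
    "dw.customer_dim": ["mkt.campaigns"],
}


def downstream(start):
    """Return every dataset reachable from `start`, in breadth-first order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        current = queue.popleft()
        for nxt in edges.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def cross_border_flows(start):
    """List downstream datasets whose location differs from the source's."""
    home = nodes[start]["location"]
    return [
        (name, nodes[name]["location"], nodes[name]["purpose"])
        for name in downstream(start)
        if nodes[name]["location"] != home
    ]


for name, location, purpose in cross_border_flows("crm.customers"):
    print(f"{name}: held in {location}, used for {purpose}")
```

With this toy data, the walk from `crm.customers` flags the reporting and marketing copies because they sit outside the UK, which is the kind of question (who uses what data, where, and why) that the article says becomes very manual without a blueprint.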
He highlights how larger organizations have taken responsibility for data lineage, with strong governance and legal frameworks in place. As a result, the approach to data usage within organizations has changed. Unlike in the past, the hierarchy no longer plays the blame game when there is a data breach. Even if the breach has taken place in a different country, as long as it is the same company, one person will take responsibility for it.
Throwing money and people at a problem is no longer the solution, he concludes. Miller now prioritizes automation backed by a proper framework and approach.