Daragh O Brien’s Post

I change how people think about information and data | 3 years' running All-Star Thought Leader Accredited by AIBF | Doctoral Candidate in Data Governance @ UL

[edited as I hit ‘post’ before my thought was fully finished] I am seeing a lot of discussion beginning from lawyers and others about how standards will be key to ensuring data quality in AI systems. I've been doing #DataQuality work for a long time. Standards won't fix things because, all too often, they are used the same way a drunk uses a lamppost: as much for support as for illumination. There are already standards for data quality (the ISO 8000 family) and for the formatting of certain types of data (e.g. telephone numbers and date formats), yet they aren't always applied or considered in the design of data capture processes.

And let's not forget that the HISTORIC data organisations hold (which they will train small LLMs or other processes on) won't have been captured for that purpose and will have errors, defects, and biases baked in. Organisations should consider that when adopting a technology that learns from your data, your technical and data debt will fall due.

O Brien’s 3rd Law of Automation has served me well since 1997: automating a badly understood process with badly managed data flowing through it just makes bad things happen faster than you can hope to keep up with.
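As a concrete illustration of the point about formatting standards not being applied at capture time (this is a minimal sketch of my own, not something from the post; the field names and records are hypothetical), ITU-T E.164 is the standard international phone-number format and ISO 8601 the standard date format. A capture-time check against them might look like:

```python
import re
from datetime import date

# E.164 international phone format: "+", then up to 15 digits, no leading zero.
E164_RE = re.compile(r"^\+[1-9]\d{1,14}$")

def validate_record(record: dict) -> list[str]:
    """Return a list of data-quality defects found in one captured record."""
    defects = []
    phone = record.get("phone", "")
    if not E164_RE.match(phone):
        defects.append(f"phone not in E.164 format: {phone!r}")
    try:
        # ISO 8601 calendar date, e.g. 1984-02-29; raises ValueError otherwise.
        date.fromisoformat(record.get("dob", ""))
    except ValueError:
        defects.append(f"dob not an ISO 8601 date: {record.get('dob')!r}")
    return defects

# Historic data typically fails checks like these, because it was never
# captured with them in place.
legacy = [
    {"phone": "+353871234567", "dob": "1984-02-29"},
    {"phone": "087 123 4567", "dob": "29/02/1984"},
]
for rec in legacy:
    print(rec, "->", validate_record(rec) or "clean")
```

Running checks like this retroactively over historic stores is exactly where the "data debt falls due": records captured before the standard was enforced surface as defects.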

Daragh O Brien

1y

Don't get me wrong... standards are GREAT. But if "certification" is approached as a tickbox activity (which it often is), then they are not the answer. As with every 'wicked problem', the obvious answer is often not correct (but is also equally not wrong).

Fully agree, Daragh. In my view, this is the biggest risk with the rush to implement AI everywhere and to train AI solutions on an organization's existing data stores. There is a huge body of work to do to validate and curate that data before it's used to train an AI model. Something I think needs to be baked into these solutions is accountability: who is accountable for the output from an AI/LLM solution? In the simplest case, if I write an email and get something wrong, I'm accountable for that. If my AI "co-pilot" writes an automated response and gets something wrong, I have to be accountable for that too. However, it could be much harder, and potentially more embarrassing, to have to fix a problem caused by an automated response.

Marc Nolte, CDMP, CDP

Data Management Apologist, Modeling Data Architect, Solution Designer, Educator, Community Builder

1y

The amount of data debt in my small town will keep me employed for as long as I want to work. A cultural shift is required.

Sami Laine

Data Management Advisor & Consultant | CDOIQ Nordic Symposium | DAMA Finland ry

1y

There are so many different kinds of standards. My impression, from what I have seen, is that ISO standards are very abstract process standards - tick the box - but do not really go into the depth or breadth of the data itself. DQ standards with ambiguous DQ dimensions are actually quite naive and more about terminology. Data quality standardization should be founded on context-dependent domain knowledge and the DATA itself - for example, healthcare data standards. These need to be built into the actual organizational work practices and software systems. There are simple reference data standards, like ICD diagnosis codes or NCSP procedure codes. There are ontologies like SNOMED. There are also technical data exchange standards like HL7/FHIR and data storage standards like openEHR.

The good news is that data can be made good. The bad news is that almost everyone must completely change the way they manage data, and particularly the foundational business application systems that collect, store, enrich, and share data. These need to be upgraded to follow all these kinds of sophisticated, domain-specific, context-focused standards like openEHR, using the other, more limited standards as building blocks around the core data standards.

Robert Lazorko

Senior Advisor, Asset Information Management

1y

I suggest that governing documents for data are the starting point for data quality. They should include standards ("directives"), specifications ("requirements"), and constraints ("rules and reference"). The governing documents should enable the governance model: the people (RACI), processes (behaviors), and KPIs (monitoring) needed to realize the necessary quality.

Heidi Saas

Data Privacy and Technology Attorney | Licensed in CT, MD, & NY | ForHumanity Fellow of Ethics and Privacy | Ethical AI Consultant | Change Agent | ⚡️ Disruptor ⚖️

1y

Yes! 👉"Standards won't fix things as, all too often they are used the same way a drunk uses a lamppost - as much for support as illumination."😂

Daniel Sereduick

EU Data Protection Counsel at Johnson & Johnson

1y

IMO, for many, AI really just accelerates the FAFO lifecycle.
