The Data Governance Imperative

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

During this episode, Steve Sarsfield and I discuss how data governance is about changing the hearts and minds of your company to see the value of data quality, the characteristics of a data champion, and creating effective data quality scorecards.

Steve Sarsfield is a leading author and expert in data quality and data governance.  His book The Data Governance Imperative is a comprehensive exploration of data governance focusing on the business perspectives that are important to data champions, front-office employees, and executives.  He runs the Data Governance and Data Quality Insider, which is an award-winning and world-recognized blog.  Steve Sarsfield is the Product Marketing Manager for Data Governance and Data Quality at Talend.

Popular OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Demystifying Data Science — Guest Melinda Thielbar, a Ph.D. Statistician, discusses what a data scientist does and provides a straightforward explanation of key concepts such as signal-to-noise ratio, uncertainty, and correlation.
  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.
  • Demystifying Master Data Management — Guest John Owens explains the three types of data (Transaction, Domain, Master), the four master data entities (Party, Product, Location, Asset), and the Party-Role Relationship, which is where we find many of the terms commonly used to describe the Party master data entity (e.g., Customer, Supplier, Employee).
  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.

Quality and Governance are Beyond the Data

Last week’s episode of DM Radio on Information Management, co-hosted as always by Eric Kavanagh and Jim Ericson, was a panel discussion about how and why data governance can improve the quality of an organization’s data, and the featured guests were Dan Soceanu of DataFlux, Jim Orr of Trillium Software, Steve Sarsfield of Talend, and Brian Parish of iData.

The relationship between data quality and data governance is a common question, and perhaps mostly because data governance is still an evolving discipline.  However, another contributing factor is the prevalence of the word “data” in the names given to most industry disciplines and enterprise information initiatives.

“Data governance goes well beyond just the data,” explained Orr.  “Administration, business process, and technology are also important aspects, and therefore the term data governance can be misleading.”

“So perhaps a best practice of data governance is not calling it data governance,” remarked Ericson.

From my perspective, data governance involves policies, people, business processes, data, and technology.  However, all of those last four concepts (people, business process, data, and technology) are critical to every enterprise initiative.

So I agree with Orr because I think that the key concept differentiating data governance is its definition and enforcement of the policies that govern the complex ways that people, business processes, data, and technology interact.

As it relates to data quality, I believe that data governance provides the framework for evolving data quality from a project to an enterprise-wide program by facilitating the collaboration of business and technical stakeholders.  Data governance aligns data usage with business processes through business relevant metrics, and enables people to be responsible for, among other things, data ownership and data quality.

“A basic form of data governance is tying the data quality metrics to their associated business processes and business impacts,” explained Sarsfield, the author of the great book The Data Governance Imperative, which explains that “the mantra of data governance is that technologists and business users must work together to define what good data is by constantly leveraging both business users, who know the value of the data, and technologists, who can apply what the business users know to the data.”

Data is used as the basis to make critical business decisions, and therefore “the key for data quality metrics is the confidence level that the organization has in the data,” explained Soceanu.  Data-driven decisions are better than intuition-driven decisions, but lacking confidence about the quality of their data can lead organizations to rely more on intuition for their business decisions.

The Data Asset: How Smart Companies Govern Their Data for Business Success, written by Tony Fisher, the CEO of DataFlux, is another great book about data governance, which explains that “data quality is about more than just improving your data.  Ultimately, the goal is improving your organization.  Better data leads to better decisions, which leads to better business.  Therefore, the very success of your organization is highly dependent on the quality of your data.”

Data is a strategic corporate asset and, by extension, data quality and data governance are both strategic corporate disciplines, because high quality data serves as a solid foundation for an organization’s success, empowering people, enabled by technology, to make better business decisions and optimize business performance.

Therefore, data quality and data governance both go well beyond just improving the quality of an organization’s data, because Quality and Governance are Beyond the Data.

 

Related Posts

Video: Declaration of Data Governance

Don’t Do Less Bad; Do Better Good

The Real Data Value is Business Insight

Is your data complete and accurate, but useless to your business?

Finding Data Quality

The Diffusion of Data Governance

MacGyver: Data Governance and Duct Tape

The Prince of Data Governance

Jack Bauer and Enforcing Data Governance Policies

Data Governance and Data Quality

Worthy Data Quality Whitepapers (Part 3)

In my April 2009 blog post Data Quality Whitepapers are Worthless, I called for data quality whitepapers worth reading.

This post is now the third entry in an ongoing series about data quality whitepapers that I have read and can endorse as worthy.

 

Matching Technology Improves Data Quality

Steve Sarsfield recently published Matching Technology Improves Data Quality, a worthy data quality whitepaper, which is a primer on the elementary principles, basic theories, and strategies of record matching.

This free whitepaper is available for download from Talend (requires registration by providing your full contact information).

The whitepaper describes the nuances of deterministic and probabilistic matching and the algorithms used to identify the relationships among records.  It covers the processes to employ in conjunction with matching technology to transform raw data into powerful information that drives success in enterprise applications, including customer relationship management (CRM), data warehousing, and master data management (MDM).

Steve Sarsfield is the Talend Data Quality Product Marketing Manager, and author of the book The Data Governance Imperative and the popular blog Data Governance and Data Quality Insider.

 

Whitepaper Excerpts

Excerpts from Matching Technology Improves Data Quality:

  • “Matching plays an important role in achieving a single view of customers, parts, transactions and almost any type of data.”
  • “Since data doesn’t always tell us the relationship between two data elements, matching technology lets us define rules for items that might be related.”
  • “Nearly all experts agree that standardization is absolutely necessary before matching.  The standardization process improves matching results, even when implemented along with very simple matching algorithms.  However, in combination with advanced matching techniques, standardization can improve information quality even more.”
  • “There are two common types of matching technology on the market today, deterministic and probabilistic.”
  • “Deterministic or rules-based matching is where records are compared using fuzzy algorithms.”
  • “Probabilistic matching is where records are compared using statistical analysis and advanced algorithms.”
  • “Data quality solutions often offer both types of matching, since one is not necessarily superior to the other.”
  • “Organizations often evoke a multi-match strategy, where matching is analyzed from various angles.”
  • “Matching is vital to providing data that is fit-for-use in enterprise applications.”
 

Related Posts

Identifying Duplicate Customers

Customer Incognita

To Parse or Not To Parse

The Very True Fear of False Positives

Data Governance and Data Quality

Worthy Data Quality Whitepapers (Part 2)

Worthy Data Quality Whitepapers (Part 1)

Data Quality Whitepapers are Worthless

Data Governance and Data Quality

Regular readers know that I often blog about the common mistakes I have observed (and made) in my professional services and application development experience in data quality (for example, see my post: The Nine Circles of Data Quality Hell).

According to Wikipedia: “Data governance is an emerging discipline with an evolving definition.  The discipline embodies a convergence of data quality, data management, business process management, and risk management surrounding the handling of data in an organization.”

Since I have never formally used the term “data governance” with my clients, I have been researching what data governance is and how it specifically relates to data quality.

Thankfully, I found a great resource in Steve Sarsfield's excellent book The Data Governance Imperative, where he explains:

“Data governance is about changing the hearts and minds of your company to see the value of information quality...data governance is a set of processes that ensures that important data assets are formally managed throughout the enterprise...at the root of the problems with managing your data are data quality problems...data governance guarantees that data can be trusted...putting people in charge of fixing and preventing issues with data...to have fewer negative events as a result of poor data.”

Although the book covers data governance more comprehensively, I focused on three of my favorite data quality themes:

  • Business-IT Collaboration
  • Data Quality Assessments
  • People Power

 

Business-IT Collaboration

Data governance establishes policies and procedures to align people throughout the organization.  Successful data quality initiatives require the Business and IT to forge an ongoing and iterative collaboration.  Neither the Business nor IT alone has all of the necessary knowledge and resources required to achieve data quality success.  The Business usually owns the data and understands its meaning and use in the day-to-day operation of the enterprise and must partner with IT in defining the necessary data quality standards and processes. 

Steve Sarsfield explains:

“Business users need to understand that data quality is everyone's job and not just an issue with technology...the mantra of data governance is that technologists and business users must work together to define what good data is...constantly leverage both business users, who know the value of the data, and technologists, who can apply what the business users know to the data.” 

Data Quality Assessments

Data quality assessments provide a much needed reality check for the perceptions and assumptions that the enterprise has about the quality of its data.  Data quality assessments help with many tasks including verifying metadata, preparing meaningful questions for subject matter experts, understanding how data is being used, and most importantly – evaluating the ROI of data quality improvements.  Building data quality monitoring functionality into the applications that support business processes provides the ability to measure the effect that poor data quality can have on decision-critical information.

Steve Sarsfield explains:

“In order to know if you're winning in the fight against poor data quality, you have to keep score...use data quality scorecards to understand the detail about quality of data...and aggregate those scores into business value metrics...solid metrics...give you a baseline against which you can measure improvement over time.” 

People Power

Although incredible advancements continue, technology alone cannot provide the solution.  Data governance and data quality both require a holistic approach involving people, process and technology.  However, by far the most important of the three is people.  In my experience, it is always the people involved that make projects successful.

Steve Sarsfield explains:

“The most important aspect of implementing data governance is that people power must be used to improve the processes within an organization.  Technology will have its place, but it's most importantly the people who set up new processes who make the biggest impact.”

Conclusion

Data governance provides the framework for evolving data quality from a project to an enterprise-wide initiative.  By facilitating the collaboration of business and technical stakeholders, aligning data usage with business metrics, and enabling people to be responsible for data ownership and data quality, data governance provides for the ongoing management of the decision-critical information that drives the tactical and strategic initiatives essential to the enterprise's mission to survive and thrive in today's highly competitive and rapidly evolving marketplace.

 

Related Posts

TDWI World Conference Chicago 2009

Not So Strange Case of Dr. Technology and Mr. Business

Schrödinger's Data Quality

The Three Musketeers of Data Quality

 

Additional Resources

Over on Data Quality Pro, read the following posts:

From the IAIDQ publications portal, read the 2008 industry report: The State of Information and Data Governance

Read Steve Sarsfield's book: The Data Governance Imperative and read his blog: Data Governance and Data Quality Insider