The Higher Education of Data Quality

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

On this episode of OCDQ Radio, we leave the corporate world, where data quality and master data management are mostly focused on the challenges of managing data about customers, products, and revenue, and we get schooled in the higher education of data quality.  In other words, we discuss data quality and master data management in higher education, where the focus is mostly on the challenges of managing data about students, courses, and tuition.

Our guest lecturer will be Mark Horseman, who has been working at the University of Saskatchewan for over 10 years and has been on the implementation team of many of the University’s enterprise software solutions.  Mark now works in Information Strategy and Analytics, leveraging his knowledge to assist the University in managing its data quality challenges.

Follow Mark Horseman on Twitter and read his Eccentric Data Quality blog to hear more about the challenges faced by Mark on his quest (yes, it’s a quest) to improve Higher-Education Data Quality.

Popular OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Demystifying Data Science — Guest Melinda Thielbar, a Ph.D. Statistician, discusses what a data scientist does and provides a straightforward explanation of key concepts such as signal-to-noise ratio, uncertainty, and correlation.
  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.
  • Demystifying Master Data Management — Guest John Owens explains the three types of data (Transaction, Domain, Master), the four master data entities (Party, Product, Location, Asset), and the Party-Role Relationship, which is where we find many of the terms commonly used to describe the Party master data entity (e.g., Customer, Supplier, Employee).
  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.

A Farscape Analogy for Data Quality

Farscape was one of my all-time favorite science fiction television shows.  In the weird way my mind works, Tom Redman’s recent blog post Four Steps to Fixing Your Bad Data (which has received great comments) triggered a Farscape analogy.

“The notion that data are assets sounds simple and is anything but,” Redman wrote.  “Everyone touches data in one way or another, so the tendrils of a data program will affect everyone — the things they do, the way they think, their relationships with one another, your relationships with customers.”

The key word for me was tendrils — like I said, my mind works in a weird way.

 

Moya and Pilot

On Farscape, the central characters of the show travel through space aboard Moya, a Leviathan, which is a species of living, sentient spaceships.  Pilot is a sentient creature (of a species also known as Pilots) with the vast capacity for multitasking that is necessary for the simultaneous handling of the many systems aboard a Leviathan.  The tendrils of a Pilot’s lower body are biologically bonded with the living systems of a Leviathan, creating a permanent symbiotic connection, meaning that, once bonded, a Pilot and a Leviathan can no longer exist independently for more than an hour or so, or both of them will die.

Leviathans were one of the many laudably original concepts of Farscape.  The role of the spaceship in most science fiction is analogous to the role of a boat.  In other words, traveling through space is most often imagined like traveling on water.  However, seafaring vessels and spaceships are usually seen as technological objects providing transportation and life support, but not actually alive in their own right (despite the fact that both types of ship are usually anthropomorphized, and usually as female).

Because Moya was alive, when she was damaged, she felt pain and needed time to heal.  And because she was sentient, highly intelligent, and capable of communicating with the crew through Pilot (who was the only one who could understand the complexity of the Leviathan language, which was beyond the capability of a universal translator), Moya was much more than just a means of transportation.  In other words, there truly was a symbiotic relationship not only between Moya and Pilot, but also between Moya and Pilot and their crew and passengers.

 

Enterprise and Data

(Sorry, my fellow science fiction geeks, but it’s not that Enterprise and that Data.  Perfectly understandable mistake, though.)

Although technically not alive in the biological sense, in many respects, an organization is like a living, sentient organism and, like space and seafaring ships, is often anthropomorphized.  An enterprise is much more than just a large organization providing a means of employment and offering products and/or services (and, in a sense, life support to its employees and customers).

As Redman explains in his book Data Driven: Profiting from Your Most Important Business Asset, data is not just the lifeblood of the Information Age, data is essential to everything the enterprise does, from helping it better understand its customers, to guiding its development of better products and/or services, to setting a strategic direction toward achieving its business goals.

So the symbiotic relationship between Enterprise and Data is analogous to the symbiotic relationship between Moya and Pilot.

Data is the Pilot of the Enterprise Leviathan.  The enterprise cannot survive without its data.  A healthy enterprise requires healthy data — data of sufficient quality capable of supporting the operational, tactical, and strategic functions of the enterprise.

Returning to Redman’s words, “Everyone touches data in one way or another, so the tendrils of a data program will affect everyone — the things they do, the way they think, their relationships with one another, your relationships with customers.”

So the relationship between an enterprise and its data, and its people, business processes, and technology, is analogous to the relationship between Moya and Pilot, and their crew and passengers.  It is the enterprise’s people, its crew (i.e., employees), who, empowered by high quality data and enabled by technology, optimize business processes for superior corporate performance, thereby delivering superior products and/or services to the enterprise’s passengers (i.e., customers).

 

So why isn’t data viewed as an asset?

So if this deep symbiosis exists, if these intertwined and symbiotic relationships exist, if the tendrils of data are biologically bonded with the complex enterprise ecosystem — then why isn’t data viewed as an asset?

In Data Driven, Redman references the book The Social Life of Information by John Seely Brown and Paul Duguid, who explained that “a technology is never fully accepted until it becomes invisible to those who use it.”  The term informationalization describes the process of building data and information into a product or service.  “When products and services are fully informationalized,” Redman noted, data “blends into the background and people do not even think about it anymore.”

Perhaps that is why data isn’t viewed as an asset.  Perhaps data has so thoroughly pervaded the enterprise that it has become invisible to those who use it.  Perhaps data is not viewed as an asset because it is invisible to those who are so dependent upon its quality.

 

Perhaps we only see Moya, but not her Pilot.

 

Related Posts

Organizing For Data Quality

Data, data everywhere, but where is data quality?

Finding Data Quality

The Data Quality Wager

Beyond a “Single Version of the Truth”

Poor Data Quality is a Virus

DQ-Tip: “Don't pass bad data on to the next person...”

Retroactive Data Quality

Hyperactive Data Quality (Second Edition)

A Brave New Data World

Commendable Comments (Part 10)

Welcome to the 300th Obsessive-Compulsive Data Quality (OCDQ) blog post!

You might have been expecting a blog post inspired by the movie 300, but since I already did that with Spartan Data Quality, instead I decided to commemorate this milestone with the 10th entry in my ongoing series for expressing my gratitude to my readers for their truly commendable comments on my blog posts.

 

Commendable Comments

On DQ-BE: Single Version of the Time, Vish Agashe commented:

“This has been one of my pet peeves for a long time. Shared version of truth or the reference version of truth is so much better, friendly and non-dictative (if such a word exists) than single version of truth.

I truly believe that starting a discussion with Single Version of the Truth with business stakeholders is a nonstarter. There will always be a need for multifaceted view and possibly multiple aspects of the truth.

A very common term/example I have come across is the usage of the term revenue. Unfortunately, there is no single version of revenue across the organizations (and for valid reasons). From a Sales Management perspective, they like to look at sales revenue (sales bookings), which is the business on which they are compensated, financial folks want to look at financial revenue, which is the revenue they capture in the books, and marketing possibly wants to look at marketing revenue (sales revenue before the discount), which is the revenue marketing uses to justify their budgets. So if you ever ask a group of people what the revenue of the organization is, you will get three different perspectives. And these three answers will be accurate in the context of three different groups.”

On Data Confabulation in Business Intelligence, Henrik Liliendahl Sørensen commented:

“I think this is going to dominate the data management realm in the coming years. We are not only met with drastically increasing volumes of data, but also increasing velocity and variety of data.

The dilemma is between making good decisions and making fast decisions, whether the decisions based on business intelligence findings should wait for assuring the quality of the data upon which the decisions are made, thus risking the decision being too late. If data quality always could be optimal by being solved at the root we wouldn’t have that dilemma.

The challenge is if we are able to have optimal data all the time when dealing with extreme data, which is data of great variety moving in high velocity and coming in huge volumes.”

On The People Platform, Mark Allen commented:

“I definitely agree and think you are burrowing into the real core of what makes or breaks EDM and MDM type initiatives -- it's the people.

Business models, processes, data, and technology all provide fixed forms of enablement or constraint. And where in the past these dynamics have been very compartmentalized throughout a company's business model and systems architecture, with EDM and MDM involving more integrated functions and shared data, people become more of the x-factor in the equation. This demands the presence of data governance to be the facilitating process that drives the collaborative, cross-functional, and decision making dynamics needed for successful EDM and MDM. Of course, the dilemma is that in a governance model people can still make bad decisions that inhibit people from working effectively.

So in terms of the people platform and data governance, there needs to be the correct focus on what are the right roles and good decisions made that can enable people to interact effectively.”

On Beware the Data Governance Ides of March, Jill Wanless commented:

“Our organization has taken the Hybrid Approach (starting Bottom-Up) and it works well for two reasons: (1) the worker bee rock stars are all aligned and ready to hit the ground running, and (2) the ‘Top’ can sit back and let the ‘aligned’ worker bees get on with it.

Of course, this approach is sometimes (painfully) slow, but with the ground-level rock stars already aligned, there is less resistance implementing the policies, and the Top’s heavy hand is needed much less frequently, but I voted for Hybrid Approach (starting Top-Down) because I have less than stellar patience for the long and scenic route.”

On Data Governance and the Buttered Cat Paradox, Rob Drysdale commented:

“Too many companies get paralyzed thinking about how to do this and implement it. (Along with the overwhelmed feeling that it is too much time/effort/money to fix it.) But I think your poll needs another option to vote on, specifically: ‘Whatever works for the company/culture/organization’ since not all solutions will work for every organization.

In some organizations, where things are highly structured, rigid, and controlled, there wouldn’t be the freedom at the grass-roots level to start something like this, and it might be frowned upon by upper-level management. In other organizations that foster grass-roots efforts, it could work.

However, no matter which way you can get it started and working, you need to have buy-in and commitment at all levels to keep it going and make it effective.”

On The Data Quality Wager, Gordon Hamilton commented:

“Deming puts a lot of energy into his arguments in 'Out of the Crisis' that the short-term mindset of the executives, and by extension the directors, is a large part of the problem.

Jackanapes, a lovely under-used term, might be a bit strong when the executives are really just doing what they are paid for. In North America we get what the directors measure! In fact, one quandary is that a proactive executive who invests in data quality is building the long-term value of their company but is also setting it up to be acquired by somebody who recognizes that the 'under the radar' improvements are making the prize valuable.

Deming says on p.100: 'Fear of unfriendly takeover may be the single most important obstacle to constancy of purpose. There is also, besides the unfriendly takeover, the equally devastating leveraged buyout. Either way, the conqueror demands dividends, with vicious consequences on the vanquished.'”

On Got Data Quality?, Graham Rhind commented:

“It always makes me smile when people attempt to put a percentage value on their data quality as though it were something as tangible and measurable as the fat content of your milk.

In order to make such a measurement one would need to know where 100% of the defects lie. If they knew that they would be able to resolve the defects and achieve 100% quality. In reality you cannot and do not know where each defect is and how many there are.

Even though tools such as profilers will tell you, for example, that 95% of your US address records have a valid state added, there is still no way to measure how many of these valid states are applicable to the real world entity on the ground. Mr Smith may be registered in the database at an existing and valid address, but if he moved last week there's a data quality issue that won't be discovered until one attempts to contact him.

The same applies when people say they have removed 95% of duplicates from their data. If they can measure it then they know where the other 5% of duplicates are and they can remove them.

But back to the point: you may not achieve 100% quality. In fact, we know you never will. But aiming for that target means that you're aiming in the right direction. As long as your goal is to get close to perfection and not to achieve it, I don't see the problem.”

On Data Governance Star Wars: Balancing Bureaucracy and Agility, Rob “Darth” Karel commented:

“A curious question to my Rebellious friend OCDQ-Wan, while data governance agility is a wonderful goal, and maybe a great place to start your efforts, is it sustainable?

Your agile Rebellion is like any start-up: decisions must be made quickly, you must do a lot with limited resources, everyone plays multiple roles willingly, and your objective is very targeted and specific. For example, to fire a photon torpedo into a small thermal exhaust port - only 2 meters wide - connected directly to the main reactor of the Death Star. Let's say you 'win' that market objective. What next?

The Rebellion defeats the Galactic Empire, leaving a market leadership vacuum. The Rebellion begins to set up a new form of government to serve all (aka grow existing market and expand into new markets) and must grow larger, with more layers of management, in order to scale. (aka enterprise data governance supporting all LOBs, geographies, and business functions).

At some point this Rebellion becomes a new Bureaucracy - maybe with a different name and legacy, but with similar results. Don't forget, the Galactic Empire started as a mini-rebellion itself spearheaded by the agile Palpatine!” 

You Are Awesome

Thank you very much for sharing your perspectives with our collablogaunity.  This entry in the series highlighted the commendable comments received on OCDQ Blog posts published between January and June of 2011.

Since there have been so many commendable comments, please don’t be offended if one of your comments wasn’t featured.

Please keep on commenting and stay tuned for future entries in the series.

By the way, even if you have never posted a comment on my blog, you are still awesome — feel free to tell everyone I said so.

Thank you for reading the Obsessive-Compulsive Data Quality (OCDQ) blog.  Your readership is deeply appreciated.

 

Related Posts

730 Days and 264 Blog Posts Later – The Second Blogiversary of OCDQ Blog

OCDQ Blog Bicentennial – The 200th OCDQ Blog Post

Commendable Comments (Part 9)

Commendable Comments (Part 8)

Commendable Comments (Part 7)

Commendable Comments (Part 6)

Commendable Comments (Part 5) – The 100th OCDQ Blog Post

Commendable Comments (Part 4)

Commendable Comments (Part 3)

Commendable Comments (Part 2)

Commendable Comments (Part 1)

Data Profiling Early and Often

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

On this episode of OCDQ Radio, I discuss data profiling with James Standen, the founder and CEO of nModal Solutions Inc., the makers of Datamartist, which is a fast, easy-to-use, visual data profiling and transformation tool.

Before founding nModal, James had over 15 years of experience in a broad range of roles involving data, ranging from building business intelligence solutions, creating data warehouses and a data warehouse competency center, through to working on data migration and ERP projects in large organizations.  You can learn more about and connect with James Standen on LinkedIn.

James thinks that while there is obviously good data and bad data, often bad data is just misunderstood and can be coaxed away from the dark side if you know how to approach it.  He does recommend wearing the proper safety equipment, however, and having the right tools.  For more of his wit and wisdom, follow Datamartist on Twitter, and read the Datamartist Blog.
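In the spirit of approaching bad data with the right tools, here is a minimal, generic sketch of the kinds of statistics a data profiler computes, written in Python with pandas.  To be clear, this is not the Datamartist product or its API, just an illustration; the input file and its columns are hypothetical assumptions.

```python
# Minimal data profiling sketch (illustrative, not the Datamartist tool).
# Computes basic column-level statistics: row counts, nulls, distinct values.
import pandas as pd

def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row of profile statistics per column of df."""
    stats = []
    for col in df.columns:
        series = df[col]
        stats.append({
            "column": col,
            "dtype": str(series.dtype),
            "rows": len(series),
            "nulls": int(series.isna().sum()),
            "null_pct": round(100 * series.isna().mean(), 2),
            "distinct": int(series.nunique(dropna=True)),
            # Most frequent value, if the column has any non-null values
            "top_value": series.mode(dropna=True).iloc[0] if series.notna().any() else None,
        })
    return pd.DataFrame(stats)

df = pd.read_csv("customers.csv")  # hypothetical input file
print(profile(df).to_string(index=False))
```

Even a summary this basic (null percentages, distinct counts, most frequent values) often surfaces the misunderstood bad data James describes, and it is a sensible first pass before any cleansing or transformation.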

Popular OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Demystifying Data Science — Guest Melinda Thielbar, a Ph.D. Statistician, discusses what a data scientist does and provides a straightforward explanation of key concepts such as signal-to-noise ratio, uncertainty, and correlation.
  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.
  • Demystifying Master Data Management — Guest John Owens explains the three types of data (Transaction, Domain, Master), the four master data entities (Party, Product, Location, Asset), and the Party-Role Relationship, which is where we find many of the terms commonly used to describe the Party master data entity (e.g., Customer, Supplier, Employee).
  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.

Data Governance and Information Quality 2011

Last week, I attended the Data Governance and Information Quality 2011 Conference, which was held June 27-30 in San Diego, California at the Catamaran Resort Hotel and Spa.

In this blog post, I summarize a few of the key points from some of the sessions I attended.  I used Twitter to help me collect my notes, and you can access the complete archive of my conference tweets on Twapper Keeper.

 

Assessing Data Quality Maturity

In his pre-conference tutorial, David Loshin, author of the book The Practitioner’s Guide to Data Quality Improvement, described five stages comprising a continuous cycle of data quality improvement:

  1. Identify and measure how poor data quality impedes business objectives
  2. Define business-related data quality rules and performance targets
  3. Design data quality improvement processes that remediate business process flaws
  4. Implement data quality improvement methods
  5. Monitor data quality against targets
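To make stages 2 and 5 a little more concrete, here is a minimal sketch of defining business-related data quality rules with a performance target and then monitoring conformance against that target.  The example records, rules, and the 98% target are illustrative assumptions on my part, not taken from David Loshin’s tutorial.

```python
# Illustrative sketch: define data quality rules with a target, then monitor.
import re

records = [
    {"email": "pat@example.com", "tuition_paid": 1200.00},
    {"email": "invalid-email", "tuition_paid": -50.00},
]

# Business-related data quality rules (stage 2) -- hypothetical examples
rules = {
    "valid_email": lambda r: re.match(r"[^@\s]+@[^@\s]+\.[^@\s]+$", r["email"]) is not None,
    "non_negative_tuition": lambda r: r["tuition_paid"] >= 0,
}

TARGET = 0.98  # assumed performance target: 98% of records must conform

# Monitor data quality against targets (stage 5)
for name, rule in rules.items():
    conformance = sum(rule(r) for r in records) / len(records)
    status = "OK" if conformance >= TARGET else "BELOW TARGET"
    print(f"{name}: {conformance:.0%} conformance ({status})")
```

Rules that fall below target would then feed the design and implementation of improvement processes (stages 3 and 4), closing the cycle.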

 

Getting Started with Data Governance

Oliver Claude from Informatica provided some tips for making data governance a reality:

  • Data Governance requires acknowledging People, Process, and Technology are interlinked
  • You need to embed your data governance policies into your operational business processes
  • Data Governance must be Business-Centric, Technology-Enabled, and Business/IT Aligned

 

Data Profiling: An Information Quality Fundamental

Danette McGilvray, author of the book Executing Data Quality Projects, shared some of her data quality insights:

  • Although the right technology is essential, data quality is more than just technology
  • Believing tools cause good data quality is like believing X-Ray machines cause good health
  • Data Profiling is like CSI — Investigating the Poor Data Quality Crime Scene

 

Building Data Governance and Instilling Data Quality

In the opening keynote address, Dan Hartley of ConAgra Foods shared his data governance and data quality experiences:

  • It is important to realize that data governance is a journey, not a destination
  • One of the commonly overlooked costs of data governance is the cost of inaction
  • Data governance must follow a business-aligned and business-value-driven approach
  • Data governance is as much about change management as it is anything else
  • Data governance controls must be carefully balanced so they don’t disrupt business processes
  • Common Data Governance Challenge: Balancing Data Quality and Speed (i.e., Business Agility)
  • Common Data Governance Challenge: Picking up Fumbles — Balls dropped between vertical organizational silos
  • Bad business processes cause poor data quality
  • Better Data Quality = A Better Bottom Line
  • One of the most important aspects of Data Governance and Data Quality — Wave the Flag of Success

 

Practical Data Governance

Winston Chen from Kalido discussed some aspects of delivering tangible value with data governance:

  • Data governance is the business process of defining, implementing, and enforcing data policies
  • Every business process can be improved by feeding it better data
  • Data Governance is the Horse, not the Cart, i.e., Data Governance drives MDM and Data Quality
  • Data Governance needs to balance Data Silos (Local Authority) and Data Cathedrals (Central Control)

 

The Future of Data Governance and Data Quality

The closing keynote panel, moderated by Danette McGilvray, included the following insights:

  • David Plotkin: “It is not about Data, Process, or Technology — It is about People”
  • John Talburt: “For every byte of Data, we need 1,000 bytes of Metadata to go along with it”
  • C. Lwanga Yonke: “One of the most essential skills is the ability to lead change”
  • John Talburt: “We need to be focused on business-value-based data governance and data quality”
  • C. Lwanga Yonke: “We must be multilingual: Speak Data/Information, Business, and Technology”

 

Organizing for Data Quality

In his post-conference tutorial, Tom Redman, author of the book Data Driven, described ten habits of those with the best data:

  1. Focus on the most important needs of the most important customers
  2. Apply relentless attention to process
  3. Manage all critical sources of data, including external suppliers
  4. Measure data quality at the source and in business terms
  5. Employ controls at all levels to halt simple errors and establish a basis for moving forward
  6. Develop a knack for continuous improvement
  7. Set and achieve aggressive targets for improvement
  8. Formalize management accountabilities for data
  9. Lead the effort using a broad, senior group
  10. Recognize that the hard data quality issues are soft and actively manage the needed cultural changes

 

Tweeps Out at the Ball Game

As I mentioned earlier, I used Twitter to help me collect my notes, and you can access the complete archive of my conference tweets on Twapper Keeper.

But I wasn’t the only data governance and data quality tweep at the conference.  Steve Sarsfield, April Reeve, and Joe Dos Santos were also attending and tweeting.

However, on Tuesday night, we decided to take a timeout from tweeting, and instead became Tweeps out at the Ball Game by attending the San Diego Padres and Kansas City Royals baseball game at PETCO Park.

We sang Take Me Out to the Ball Game, bought some peanuts and Cracker Jack, and root, root, rooted for the home team, which apparently worked since Padres closer Heath Bell got one, two, three strikes, you’re out on Royals third baseman Wilson Betemit, and the San Diego Padres won the game by a final score of 4-2.

So just like at the Data Governance and Information Quality 2011 Conference, a good time was had by all.  See you next year!

 

Related Posts

Stuck in the Middle with Data Governance

DQ-BE: Invitation to Duplication

TDWI World Conference Orlando 2010

Light Bulb Moments at DataFlux IDEAS 2010

Enterprise Data World 2010

Enterprise Data World 2009

TDWI World Conference Chicago 2009

DataFlux IDEAS 2009

Data Governance Star Wars

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

[Image: poll results from the Data Governance Star Wars blog debate]

Shown above are the poll results from the recent Star Wars themed blog debate about one of data governance’s biggest challenges, how to balance bureaucracy and business agility.  Rob Karel took the position for Bureaucracy as Darth Karel of the Empire, and I took the position for Agility as OCDQ-Wan Harris of the Rebellion.

However, this was a true debate format where Rob and I intentionally argued polar opposite positions with full knowledge that the reality is data governance success requires effectively balancing bureaucracy and business agility.

Just in case you missed the blog debate, here are the post links:

On this special, extended, and Star Wars themed episode of OCDQ Radio, I am joined by Rob Karel and Gwen Thomas to discuss this common challenge of effectively balancing bureaucracy and business agility on data governance programs.

Rob Karel is a Principal Analyst at Forrester Research, where he serves Business Process and Applications Professionals.  Rob is a leading expert in how companies manage data and integrate information across the enterprise.  His current research focus includes process data management, master data management, data quality management, metadata management, data governance, and data integration technologies.  Rob has more than 19 years of data management experience, working in both business and IT roles to develop solutions that provide better quality, confidence in, and usability of critical enterprise data.

Gwen Thomas is the Founder and President of The Data Governance Institute, a vendor-neutral, mission-based organization with three arms: publishing free frameworks and guidance, supporting communities of practitioners, and offering training and consulting.  Gwen also writes the popular blog Data Governance Matters, frequently contributes to IT and business publications, and is the author of the book Alpha Males and Data Disasters: The Case for Data Governance.

This extended episode of OCDQ Radio is 49 minutes long, and is divided into two parts, which are separated by a brief Star Wars themed intermission.  In Part 1, Rob and I discuss our blog debate.  In Part 2, Gwen joins us to provide her excellent insights.

Popular OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Demystifying Data Science — Guest Melinda Thielbar, a Ph.D. Statistician, discusses what a data scientist does and provides a straightforward explanation of key concepts such as signal-to-noise ratio, uncertainty, and correlation.
  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.
  • Demystifying Master Data Management — Guest John Owens explains the three types of data (Transaction, Domain, Master), the four master data entities (Party, Product, Location, Asset), and the Party-Role Relationship, which is where we find many of the terms commonly used to describe the Party master data entity (e.g., Customer, Supplier, Employee).
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.

The Art of Data Matching

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

On this episode of OCDQ Radio, I am joined by Henrik Liliendahl Sørensen for a discussion about the Art of Data Matching.

Henrik is a data quality and master data management (MDM) professional who also works in data architecture.  Henrik has worked for 30 years in the IT business across a wide range of business areas, such as government, insurance, manufacturing, membership, healthcare, public transportation, and more.

Henrik’s current engagements include working as practice manager at Omikron Data Quality, a data quality tool maker with headquarters in Germany, and as data quality specialist at Stibo Systems, a master data management vendor with headquarters in Denmark.  Henrik is also a charter member of the IAIDQ, and the creator of the LinkedIn Group for Data Matching for people interested in data quality and thrilled by automated data matching, deduplication, and identity resolution.

Henrik is one of the most prolific and popular data quality bloggers, regularly sharing his excellent insights about data quality, data matching, MDM, data architecture, data governance, diversity in data quality, and many other data management topics.
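For readers wondering what automated data matching looks like at its very simplest, here is a minimal sketch that scores candidate record pairs with a generic string-similarity measure and flags likely duplicates above a threshold.  This is not Henrik’s method nor any vendor’s algorithm; the names and the 0.85 threshold are illustrative assumptions, and production matching adds normalization, blocking, and multi-attribute scoring.

```python
# Illustrative data matching sketch: flag likely duplicate names
# using a generic string-similarity score from the standard library.
from difflib import SequenceMatcher
from itertools import combinations

customers = ["Henrik Sorensen", "Henrik Sørensen", "John Owens", "Jon Owens"]

def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two strings (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

THRESHOLD = 0.85  # assumed match threshold

for a, b in combinations(customers, 2):
    score = similarity(a, b)
    if score >= THRESHOLD:
        print(f"Possible duplicate: {a!r} ~ {b!r} (score {score:.2f})")
```

The art Henrik describes lies in everything this sketch omits: deciding which attributes to compare, how to normalize them, and where to set the threshold between false positives and missed matches.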

Popular OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Demystifying Data Science — Guest Melinda Thielbar, a Ph.D. Statistician, discusses what a data scientist does and provides a straightforward explanation of key concepts such as signal-to-noise ratio, uncertainty, and correlation.
  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.
  • Demystifying Master Data Management — Guest John Owens explains the three types of data (Transaction, Domain, Master), the four master data entities (Party, Product, Location, Asset), and the Party-Role Relationship, which is where we find many of the terms commonly used to describe the Party master data entity (e.g., Customer, Supplier, Employee).
  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.

Data Governance Star Wars: Balancing Bureaucracy and Agility

I was recently discussing data governance best practices with Rob Karel, a well-respected analyst at Forrester Research, and our conversation migrated to one of data governance’s biggest challenges — how to balance bureaucracy and business agility.

So Rob and I thought it would be fun to tackle this dilemma in a Star Wars themed debate across our individual blog platforms with Rob taking the position for Bureaucracy as the Empire and me taking the opposing position for Agility as the Rebellion.

(Yes, the cliché is true, conversations between self-proclaimed data geeks tend to result in Star Wars or Star Trek parallels.)

Disclaimer: Remember that this is a true debate format where Rob and I are intentionally arguing polar opposite positions with full knowledge that the reality is data governance success requires effectively balancing bureaucracy and agility.

Please take the time to read both of our blog posts, then we encourage your comments — and your votes (see the poll below).

Data Governance Star Wars

If you are having trouble viewing this video, you can watch it on Vimeo by clicking on this link: Data Governance Star Wars

The Force is Too Strong with This One

“Don’t give in to Bureaucracy—that is the path to the Dark Side of Data Governance.”

Data governance requires the coordination of a complex combination of factors, including executive sponsorship, funding, decision rights, arbitration of conflicting priorities, policy definition, policy implementation, data quality remediation, data stewardship, business process optimization, technology enablement, and, perhaps most notably, policy enforcement.

When confronted by this phantom menace of complexity, many organizations believe that the only path to success must be command and control—institute a rigid bureaucracy to dictate policies, demand compliance, and dole out punishments.  This approach to data governance often makes policy compliance feel like imperial rule, and policy enforcement feel like martial law.

But beware.  Bureaucracy, command, control—the Dark Side of Data Governance are they.  Once you start down the dark path, forever will it dominate your destiny, consume your organization it will.

No Time to Discuss this as a Committee

“There is a great disturbance in the Data, as if millions of voices suddenly cried out for Governance but were suddenly silenced.  I fear something terrible has happened.  I fear another organization has started by creating a Data Governance Committee.”

Yes, it’s true—at some point, an official Data Governance Committee (or Council, or Board, or Galactic Senate) will be necessary.

However, one of the surest ways to guarantee the failure of a new data governance program is to start by creating a committee.  This is often done with the best of intentions, bringing together key stakeholders from all around the organization, representatives of each business unit and business function, as well as data and technology stakeholders.  But when you start by discussing data governance as a committee, you often never get data governance out of the committee (i.e., all talk, mostly arguing, no action).

Successful data governance programs often start with a small band of rebels (aka change agents) struggling to restore quality to some business-critical data, or struggling to resolve inefficiencies in a key business process.  Once news of their successful pilot project spreads, more change agents will rally to the cause—because that’s what data governance truly requires, not a committee, but a cause to believe in and fight for—especially after the Empire of Bureaucracy strikes back and tries to put down the rebellion.

Collaboration is the Data Governance Force

“Collaboration is what gives a data governance program its power.  Its energy binds us together.  Cooperative beings are we.  You must feel the Collaboration all around you, among the people, the data, the business process, the technology, everywhere.”

Many rightfully lament the misleading term “data governance” because it appears to put the emphasis on “governing data.”

Data governance actually governs the interactions among business processes, data, technology and, most important—people.  It is the organization’s people, empowered by high quality data and enabled by technology, who optimize business processes for superior corporate performance.  Data governance reveals how truly interconnected and interdependent the organization is, showing how everything that happens within the enterprise happens as a result of the interactions occurring among its people.

Data governance provides the framework for the communication and collaboration of business, data, and technical stakeholders, and establishes an enterprise-wide understanding of the roles and responsibilities involved, and the accountability required to support the organization’s business activities and to materialize the value of the enterprise’s data as positive business impacts.

Enforcing data governance policies with command and control is the quick and easy path—to failure.  Principles, not policies, are what truly give a data governance program its power.  Communication and collaboration are the two most powerful principles.

“May the Collaboration be with your Data Governance program.  Always.”

Always in Motion is the Future

“Be mindful of the future, but not at the expense of the moment.  Keep your concentration here and now, where it belongs.”

Perhaps the strongest case against bureaucracy in data governance is the business agility that is necessary for an organization to survive and thrive in today’s highly competitive and rapidly evolving marketplace.  The organization must follow what works for as long as it works, but without being afraid to adjust as necessary when circumstances inevitably change.

Change is the only galactic constant, which is why data governance policies can never be cast in stone (or frozen in carbonite).

Will a well-implemented data governance strategy continue to be successful?  Difficult to see.  Always in motion is the future.  And this is why, when it comes to deliberately designing a data governance program for agility: “Do or do not.  There is no try.”

Click here to read Rob “Darth” Karel’s blog post entry in this data governance debate

Please feel free to also post a comment below and explain your vote or simply share your opinions and experiences.

Listen to Data Governance Star Wars on OCDQ Radio — In Part 1, Rob Karel and I discuss our blog mock debate, which is followed by a brief Star Wars themed intermission, and then in Part 2, Gwen Thomas joins us to provide her excellent insights.

Data Quality Pro

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

On this episode, I am joined by special guest Dylan Jones, the community leader of Data Quality Pro, the largest membership resource dedicated entirely to the data quality profession.

Dylan is currently overseeing the re-build and re-launch of Data Quality Pro into a next generation membership platform, and during our podcast discussion, Dylan describes some of the great new features that will be coming soon to Data Quality Pro.

Links for Data Quality Pro and Dylan Jones:

Popular OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Demystifying Data Science — Guest Melinda Thielbar, a Ph.D. Statistician, discusses what a data scientist does and provides a straightforward explanation of key concepts such as signal-to-noise ratio, uncertainty, and correlation.
  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.
  • Demystifying Master Data Management — Guest John Owens explains the three types of data (Transaction, Domain, Master), the four master data entities (Party, Product, Location, Asset), and the Party-Role Relationship, which is where we find many of the terms commonly used to describe the Party master data entity (e.g., Customer, Supplier, Employee).
  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.

Data Confabulation in Business Intelligence

Jarrett Goldfedder recently asked the excellent question: When does Data become Too Much Information (TMI)?

We now live in a 24 hours a day, 7 days a week, 365 days a year worldwide whirlwind of constant information flow, where the very air we breathe is literally teeming with digital data streams—continually inundating us with new information.

The challenge is that our time is a zero-sum game, meaning for every new information source we choose, others are excluded.

There’s no way to acquire all available information.  And even if we somehow could, due to the limitations of human memory, we often don’t remember much of the new information we do acquire.  In my blog post Mind the Gap, I wrote about the need to coordinate our acquisition of new information with its timely and practical application.

So I definitely agree with Jarrett that the need to find the right amount of information appropriate for the moment is the needed (and far from easy) solution.  Since this is indeed the age of the data deluge and TMI, I fear that data-driven decision making may simply become intuition-driven decisions validated after the fact by selectively choosing the data that supports the decision already made.  The human mind is already exceptionally good at doing this—the term for it in psychology is confabulation.

Although, according to Wikipedia, the term can be used to describe neurological or psychological dysfunction, as Jonathan Haidt explained in his book The Happiness Hypothesis, confabulation is frequently practiced by “normal” people as well.  For example, after buying my new smart phone, I chose to read only the positive online reviews about it, trying to make myself feel more confident I had made the right decision—and more capable of justifying my decision beyond saying I bought the phone that looked “cool.”

 

Data Confabulation in Business Intelligence

Data confabulation in business intelligence occurs when intuition-driven business decisions are claimed to be data-driven and justified after the fact using the results of selective post-decision data analysis.  This is even worse than when confirmation bias causes intuition-driven business decisions, which are justified using the results of selective pre-decision data analysis that only confirms preconceptions or favored hypotheses, resulting in potentially bad—albeit data-driven—business decisions.

My fear is that the data deluge will actually increase the use of both of these business decision-making “techniques” because they are much easier than, as Jarrett recommended, trying to make sense of the business world by gathering and sorting through as much data as possible, deriving patterns from the chaos and developing clear-cut, data-driven, data-justifiable business decisions.

But the data deluge generally broadcasts more noise than signal, and sometimes trying to get better data to make better decisions simply means getting more data, which often only delays or confuses the decision-making process, or causes analysis paralysis.

Can we somehow listen for decision-making insights among the cacophony of chaotic and constantly increasing data volumes?

I fear that the information overload of the data deluge is going to trigger an intuition override of data-driven decision making.

 

Related Posts

The Reptilian Anti-Data Brain

Data In, Decision Out

The Data-Decision Symphony

The Real Data Value is Business Insight

Is your data complete and accurate, but useless to your business?

DQ-View: From Data to Decision

TDWI World Conference Orlando 2010

Hell is other people’s data

Mind the Gap

The Fragility of Knowledge

Has Data Become a Four-Letter Word?

In her excellent blog post 'The Bad Data Ate My Homework' and Other IT Scapegoating, Loraine Lawson explained how “there are a lot of problems that can be blamed on bad data.  I suspect it would be fair to say that there’s a good percentage of problems we don’t even know about that can be blamed on bad data and a lack of data integration, quality and governance.”

Lawson examined whether bad data could have been the cause of the bank foreclosure fiasco, as opposed to, as she concludes, the more realistic causes being bad business and negligence, which, if not addressed, could lead to another global financial crisis.

“Bad data,” Lawson explained, “might be the most ubiquitous excuse since ‘the dog ate my homework.’  But while most of us would laugh at the idea of blaming the dog for missing homework, when someone blames the data, we all nod our heads in sympathy, because we all know how troublesome computers are.  And then the buck gets (unfairly) passed to IT.”

Unfairly blaming IT, or technology in general, when poor data quality negatively impacts business performance ignores the organization’s collective ownership of its problems and its shared responsibility for the solutions to those problems, and causes, as Lawson explained in Data’s Conundrum: Everybody Wants Control, Nobody Wants Responsibility, an “unresolved conflict on both the business and the IT side over data ownership and its related issues, from stewardship to governance.”

In organizations suffering from this unresolved conflict between IT and the Business—a dysfunctional divide also known as the IT-Business Chasm—bad data becomes the default scapegoat used by both sides.

Perhaps, in a strange way, placing the blame on bad data is progress when compared with the historical notions of data denial, when an organization’s default was to claim that it had no data quality issues whatsoever.

However, admitting that bad data not only exists, but is also having a tangible negative impact on business performance, doesn’t seem to have motivated organizations to take action.  Instead, many appear to prefer practicing bad data blamestorming, where the Business blames bad data on IT and its technology, and IT blames bad data on the Business and its business processes.

Or perhaps, by default, everyone just claims that “the bad data ate my homework.”

Are your efforts to convince executive management that data needs to be treated like a five-letter word (“asset”) being undermined by the fact that data has become a four-letter word in your organization?

 

Related Posts

The Business versus IT—Tear down this wall!

Quality and Governance are Beyond the Data

Data In, Decision Out

The Data-Decision Symphony

The Reptilian Anti-Data Brain

Hell is other people’s data

Promoting Poor Data Quality

Who Framed Data Entry?

Data, data everywhere, but where is data quality?

The Circle of Quality

Commendable Comments (Part 9)

Today is February 14 — Valentine’s Day — the annual celebration of enduring romance, where true love is publicly judged according to your willingness to purchase chocolate, roses, and extremely expensive jewelry, and privately judged in ways that nobody (and please, trust me when I say nobody) wants to see you post on Twitter, Facebook, Flickr, YouTube, or your blog.

This is the ninth entry in my ongoing series for expressing my true love to my readers for their truly commendable comments on my blog posts.  Receiving comments is the most rewarding aspect of my blogging experience.  Although I love all of my readers, I love my commenting readers most of all.

 

Commendable Comments

On Data Quality Industry: Problem Solvers or Enablers?, Henrik Liliendahl Sørensen commented:

“I sometimes compare our profession with that of dentists.  Dentists are also believed to advocate for good habits around your teeth, but are making money when these good habits aren’t followed.

So when 4 out of 5 dentists recommend a certain toothpaste, it is probably no good :-)

Seriously though, I take the amount of money spent on data quality tools as a sign that organizations believe there are issues best solved with technology.  Of course these tools aren’t magic.

Data quality tools only solve a certain part of your data and information related challenges.  On the other hand, the few problems they do solve may be solved very well and cannot be solved by any other line of products or in any practical way by humans in any quantity or quality.”

On Data Quality Industry: Problem Solvers or Enablers?, Jarrett Goldfedder commented:

“I think that the expectations of clients from their data quality vendors have grown tremendously over the past few years.  This is, of course, in line with most everything in the Web 2.0 cloud world that has become point-and-click, on-demand response.

In the olden days of 2002, I remember clients asking for vendors to adjust data only to the point where dashboard statistics could be presented on a clean Java user interface.  I have noticed that some clients today want the software to not just run customizable reports, but to extract any form of data from any type of database, to perform advanced ETL and calculations with minimal user effort, and to be easy to use.  It’s almost like telling your dentist to fix your crooked teeth with no anesthesia, no braces, no pain, during a single office visit.

Of course, the reality today does not match the expectation, but data quality vendors and architects may need to step up their game to remain cutting edge.”

On Data Quality is not an Act, it is a Habit, Rob Paller commented:

“This immediately reminded me of the practice of Kaizen in the manufacturing industry.  The idea being that continued small improvements yield large improvements in productivity when compounded.

For years now, many of the thought leaders have preached that projects from business intelligence to data quality to MDM to data governance, and so on, start small and that by starting small and focused, they will yield larger benefits when all of the small projects are compounded.

But the one thing that I have not seen it tied back to is the successes that were found in the leaders of the various industries that have adopted the Kaizen philosophy.

Data quality practitioners need to recognize that their success lies in the fundamentals of Kaizen: quality, effort, participation, willingness to change, and communication. The fundamentals put people and process before technology.  In other words, technology may help eliminate the problem, but it is the people and process that allow that elimination to occur.”

On Data Quality is not an Act, it is a Habit, Dylan Jones commented:

“Subtle but immensely important because implementing a coordinated series of small, easily trained habits can add up to a comprehensive data quality program.

In my first data quality role we identified about ten core habits that everyone on the team should adopt and the results were astounding.  No need for big programs, expensive technology, change management and endless communication, just simple, achievable habits that importantly were focused on the workers.

To make habits work they need the WIIFM (What’s In It For Me) factor.”

On Darth Data, Rob Drysdale commented:

“Interesting concept about using data for the wrong purpose.  I think that data, if it is the ‘true’ data, can be used for any business decision as long as it is interpreted the right way.

One problem is that data may have a margin of error associated with it and this must be understood in order to properly use it to make decisions.  Another issue is that the underlying definitions may be different.

For example, an organization may use the term ‘customer’ when it means different things.  The marketing department may have a list of ‘customers’ that includes leads and prospects, but the operational department may only call them ‘customers’ when they are generating revenue.

Each department’s data and interpretation of it is correct for their own purpose, but you cannot mix the data or use it in the ‘other’ department to make decisions.

If all the data is correct, the definitions and the rules around capturing it are fully understood, then you should be able to use it to make any business decision.

But when it gets misinterpreted and twisted to suit some business decision that it may not be suited for, then you are crossing over to the Dark Side.”

On Data Governance and the Social Enterprise, Jacqueline Roberts commented:

“My continuous struggle is the chaos of data electronically submitted by many, many sources, different levels of quality and many different formats while maintaining the history of classification, correction, language translation, where used, and a multitude of other ‘data transactions’ to translate this data into usable information for multi-business use and reporting.  This is my definition of Master Data Management.

I chuckled at the description of the ‘rigid business processes’ and I added ‘software products’ to the concept, since the software industry must understand the fluidity of the change of data to address the challenges of Master Data Management, Data Governance, and Data Cleansing.”

On Data Governance and the Social Enterprise, Frank Harland commented: 

“I read: ‘Collaboration is the key to business success. This essential collaboration has to be based on people, and not on rigid business processes . . .’

And I think: Collaboration is the key to any success.  This must have been true since the time man hunted the Mammoth.  When collaborating, it went a lot better to catch the bugger.

And I agree that the collaboration has to be based on people, and not on rigid business processes.  That is as opposed to based on rigid people, and not on flexible business processes. All the truths are in the adjectives.

I don’t mean to bash, Jim, I think there is a lot of truth here and you point to the exact relationship between collaboration as a requirement and Data Governance as a prerequisite.  It’s just me getting a little tired of Gartner saying things of the sort that ‘in order to achieve success, people should work together. . .’

I have a word in mind that starts with ‘du’ and ends with ‘h’ :-)”

On Quality and Governance are Beyond the Data, Milan Kučera commented:

“Quality is a result of people’s work, their responsibility, improvement initiatives, etc.  I think it is more about the company culture and its possible regulation by government.  The most complicated thing is to set up a ‘new’ information quality culture, because of its influence on every single employee.  It is about a well-balanced information value chain and quality processes at every ‘gemba’ where information is created.

Confidence in the information is necessary because we make many decisions based on it, and sometimes we do better or worse than before.  We should store and use as much accurate information as possible.

All stewardship or governance frameworks should help companies with changing their culture, defining quality measures (the most important being accuracy), building a cost of poor quality system (allowing them to monitor the impacts of poor quality information), and other necessary things.  Only then would we be able to trust corporate information and make decisions.

A small remark on technology only.  Data quality technology is a good tool for helping you analyze the ‘technical’ quality of data – patterns, business rules, frequencies, NULL or NOT NULL values, etc.  Many technology companies narrow information quality into an area of massive cleansing (scrap/rework) activities.  They can correct some errors, but in general this leads only to higher validity, not to information accuracy.  If cleansing is implemented as a regular part of the ETL processes, then the company institutionalizes massive correction, which is only a cost-adding activity, and I am sure it is not the right place to change data content – doing so increases data inconsistency within information systems.

Every quality management system (for example, TQM, TIQM, Six Sigma, Kaizen) focuses on improvement at the place where errors occur – the gemba.  All those systems require leaders, measures, trained people, and, simply, an adequate culture.

Technology can be a good assistant (helper), but a bad master.”
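
The ‘technical’ profiling Milan describes can be sketched in a few lines.  Here is a minimal, hypothetical example that computes a fill rate and value frequencies for a single column; note that it can flag missing and inconsistent values (validity), but it cannot tell you whether any value is actually correct (accuracy):

```python
from collections import Counter

# Hypothetical extract: profiling can flag NULLs and inconsistent casing,
# but it cannot tell you whether "Springfield" is the accurate city.
rows = [
    {"city": "Springfield"},
    {"city": "springfield"},
    {"city": None},
    {"city": "Shelbyville"},
    {"city": "SPRINGFIELD"},
]

values = [row["city"] for row in rows]
null_count = sum(1 for v in values if v is None)
fill_rate = 1 - null_count / len(values)
frequencies = Counter(v.strip().lower() for v in values if v is not None)

print(f"fill rate: {fill_rate:.0%}")  # fill rate: 80% -- one NULL in five rows
print(frequencies)                    # Counter({'springfield': 3, 'shelbyville': 1})
```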

On Can Data Quality avoid the Dustbin of History?, Vish Agashe commented:

“In a sense, I would say that the current definitions of, and approaches towards, data quality might very well not be able to avoid the Dustbin of History.

In the world of phones and PDAs, the quality of information about environments, current fashions and trends, locations, and the current moods of the customer might be more important than a single view of the customer or de-duplicated customers.  Given the pace at which consumers’ habits are changing, it might be the quality of information about the environment in which the transaction is likely to happen that will be more important than the quality of the post-transaction data itself . . . Just a thought.”

On Does your organization have a Calumet Culture?, Garnie Bolling commented:

“So true, so true, so true.

I see this a lot.  Great projects or initiatives start off, collaboration is expected across organizations, and there is initial interest: big meetings and events to jump-start the Calumet.  But what happens when the events no longer occur, and the funding to fly everyone to the same city to bond, share, and explore together dries up?

Here is what we have seen work.  After the initial kick-off, have small events and focus groups, and let the Calumet grow organically.  Sometimes after a big powwow, folks assume others are taking care of the communication and collaboration, but with a small venue, it slowly grows.

Success breeds success, and folks want to be part of that, so when the focus group achieves results, the growth happens.  This cycle is then, hopefully, repeated.

While it is important for folks to come together at the kick-off to see the big picture, it is the small rolling waves of success that will pick up momentum, and people will want to join the effort to collaborate rather than waiting for others to pick up the ball and run with it.

Thanks for posting, good topic.  Now where is my small focus group? :-)”

You Are Awesome

Thank you very much for sharing your perspectives with our collablogaunity.  This entry in the series highlighted the commendable comments received on OCDQ Blog posts published in October, November, and December of 2010.

Since there have been so many commendable comments, please don’t be offended if one of your comments wasn’t featured.

Please keep on commenting and stay tuned for future entries in the series.

By the way, even if you have never posted a comment on my blog, you are still awesome — feel free to tell everyone I said so.

 

Related Posts

Commendable Comments (Part 8)

Commendable Comments (Part 7)

Commendable Comments (Part 6)

Commendable Comments (Part 5)

Commendable Comments (Part 4)

Commendable Comments (Part 3)

Commendable Comments (Part 2)

Commendable Comments (Part 1)

Spartan Data Quality

My recent Twitter conversation with Dylan Jones, Henrik Liliendahl Sørensen, and Daragh O Brien was sparked by the blog post Case study with Data blogs, from 300 to 1000, which included a list of the top 500 data blogs ranked by influence.

Data Quality Pro was ranked #57, Liliendahl on Data Quality was ranked #87, The DOBlog was a glaring omission, and I was proud that OCDQ Blog was ranked #33 – at least until, being the data quality geeks we are, we noticed that it was also ranked #165.

In other words, there was an ironic data quality issue—a data quality blog was listed twice (i.e., a duplicate record in the list)!
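
Detecting that kind of duplicate is trivial once you think to look for it.  Here is a minimal sketch, using the rankings mentioned above, of the check we effectively performed by eye:

```python
from collections import defaultdict

# A hypothetical slice of the ranked list, keyed by blog name.
ranked_blogs = [
    (33, "OCDQ Blog"),
    (57, "Data Quality Pro"),
    (87, "Liliendahl on Data Quality"),
    (165, "OCDQ Blog"),  # the duplicate entry that started the fun
]

ranks_by_blog = defaultdict(list)
for rank, blog in ranked_blogs:
    ranks_by_blog[blog].append(rank)

duplicates = {blog: ranks for blog, ranks in ranks_by_blog.items() if len(ranks) > 1}
print(duplicates)  # {'OCDQ Blog': [33, 165]}
```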

Hilarity ensued, including some epic photoshopping by Daragh, leading, quite inevitably, to the writing of this Data Quality Tale, which is obviously loosely based on the epic movie 300—and perhaps also the epically terrible comedy Meet the Spartans.  Enjoy!

 

Spartan Data Quality

In 1989, an alliance of Data Geeks, led by the Spartans, an unrivaled group of data quality warriors, battled an invading data deluge in the mountain data center of Thermopylae, caused by the complexities of the Greco-Persian Corporate Merger.

Although they were vastly outnumbered, the Data Geeks overcame epic data quality challenges in one of the most famous enterprise data management initiatives in history—The Data Integration of Thermopylae.

This is their story.

Leonidas, leader of the Spartans, espoused an enterprise data management approach known as Spartan Data Quality, defined by its ethos of collaboration amongst business, data, and technology experts, collectively and affectionately known as Data Geeks.

Therefore, Leonidas was chosen as the Thermopylae Project Lead.  However, Xerxes, the new Greco-Persian CIO, believed that the data integration project was pointless, Spartan Data Quality was a fool’s errand, and the technology-only Persian approach, known as Magic Beans, should be implemented instead.  Xerxes saw the Thermopylae project as an unnecessary sacrifice.

“There will be no glory in your sacrifice,” explained Xerxes.  “I will erase even the memory of Sparta from the database log files!  Every bit and byte of Data Geek tablespace shall be purged.  Every data quality historian and every data blogger shall have their Ethernet cables pulled out, and their network connections cut from the Greco-Persian mainframe.  Why, uttering the very name of Sparta, or Leonidas, will be punishable by employee termination!  The corporate world will never know you existed at all!”

“The corporate world will know,” replied Leonidas, “that Data Geeks stood against a data deluge, that few stood against many, and before this battle was over, a CIO blinded by technology saw what it truly takes to manage data as a corporate asset.”

Addressing his small army of 300 Data Geeks, Leonidas declared: “Gather round!  No retreat, no surrender.  That is Spartan law.  And by Spartan law we will stand and fight.  And together, united by our collaboration, our communication, our transparency, and our trust in each other, we shall overcome this challenge.”

“A new Information Age has begun.  An age of data-driven business decisions, an age of data-empowered consumers, an age of a world connected by a web of linked data.  And all will know, that 300 Data Geeks gave their last breath to defend it!”

“But there will be so many data defects, they will blot out the sun!” exclaimed Xerxes.

“Then we will fight poor data quality in the shade,” Leonidas replied, with a sly smile.

“This is madness!” Xerxes nervously responded as the new servers came on-line in the data center of Thermopylae.

“Madness?  No,” Leonidas calmly said as the first wave of the data deluge descended upon them.  “THIS . . . IS . . . DATA !!!”

 

Related Posts

Pirates of the Computer: The Curse of the Poor Data Quality

Video: Oh, the Data You’ll Show!

The Quest for the Golden Copy (Part 1)

The Quest for the Golden Copy (Part 2)

The Quest for the Golden Copy (Part 3)

The Quest for the Golden Copy (Part 4)

‘Twas Two Weeks Before Christmas

My Own Private Data

The Tell-Tale Data

Data Quality is People!

#FollowFriday Spotlight: @PhilSimon

FollowFriday Spotlight is an OCDQ regular segment highlighting someone you should follow—and not just Fridays on Twitter.


Phil Simon is an independent technology consultant, author, writer, and dynamic public speaker for hire, who focuses on the intersection of business and technology.  Phil is the author of three books (see below for more details) and also writes for a number of technology media outlets and sites, and hosts the podcast Technology Today.

As an independent consultant, Phil helps his clients optimize their use of technology.  Phil has cultivated over forty clients in a wide variety of industries, including health care, manufacturing, retail, education, telecommunications, and the public sector.

When not fiddling with computers, hosting podcasts, putting himself in comics, and writing, Phil enjoys English Bulldogs, tennis, golf, movies that hurt the brain, fantasy football, and progressive rock.  Phil is a particularly zealous fan of Rush, Porcupine Tree, and Dream Theater.  Anyone who reads his blog posts or books will catch many references to these bands.

 

Books by Phil Simon

My review of The New Small:

By leveraging what Phil Simon calls the Five Enablers (cloud computing, Software-as-a-Service (SaaS), free and open source software (FOSS), mobility, and social technologies), small businesses no longer need to have technology as one of their core competencies, nor invest significant time and money in enabling technology.  This frees them to focus on their true core competencies and to compete against companies of all sizes.

The New Small serves as a practical guide to this brave new world of small business.

 

My review of The Next Wave of Technologies:

Organizations large and small that use technology to support the ongoing management of their decision-critical information face a constant challenge: information technology can never afford to remain static.  It must dynamically evolve and adapt in order to protect and serve the enterprise’s continuing mission to survive and thrive in today’s highly competitive and rapidly changing marketplace.


The Next Wave of Technologies is required reading if your organization wishes to avoid common mistakes and realize the full potential of new technologies—especially before your competitors do.

 

My review of Why New Systems Fail:

Why New Systems Fail is far from a doom and gloom review of disastrous projects and failed system implementations.  Instead, this book contains numerous examples and compelling case studies, which serve as a very practical guide for how to recognize, and more importantly, overcome the common mistakes that can prevent new systems from being successful.

Phil Simon writes about these complex challenges in a clear and comprehensive style that is easily approachable and applicable to diverse audiences, both academic and professional, as well as readers with either a business or a technical orientation.

 

Blog Posts by Phil Simon

In addition to his great books, Phil is a great blogger, and his brilliant blog posts are well worth checking out.

 

Knights of the Data Roundtable

Phil Simon and I co-host and co-produce Knights of the Data Roundtable, a wildly popular bi-weekly data management podcast sponsored by the good folks at DataFlux, a SAS Company.

The podcast is a frank and open discussion about data quality, data integration, data governance, and all things related to managing data.

 

Related Posts

#FollowFriday Spotlight: @hlsdk

#FollowFriday Spotlight: @DataQualityPro

#FollowFriday and Re-Tweet-Worthiness

#FollowFriday and The Three Tweets

Dilbert, Data Quality, Rabbits, and #FollowFriday

Twitter, Meaningful Conversations, and #FollowFriday

The Fellowship of #FollowFriday

Social Karma (Part 7) – Twitter

DQ-BE: Dear Valued Customer

Data Quality By Example (DQ-BE) is an OCDQ regular segment that provides examples of key data quality concepts.

The term “valued customer” is bandied about quite frequently and is often at the heart of enterprise data management initiatives such as Customer Data Integration (CDI), 360° Customer View, and Customer Master Data Management (MDM).

The role of data quality in these initiatives is an important, but sometimes mistakenly overlooked, consideration.

For example, a Service Contract Renewal Notice I recently received exemplifies the impact of poor data quality on Customer Relationship Management (CRM), since one of my service providers wants me—as a valued customer—to purchase a new service contract for one of my laptop computers.

Let’s give them props for generating a 100% accurate residential postal address, since how could I even consider renewing my service contract if I didn’t receive the renewal notice in the mail?  Let’s also acknowledge that my Customer ID is 100% accurate, since that is the “unique identifier” under which I have purchased all of my products and services from this company.

However, the biggest data quality mistake is that the name of their “Valued Customer” is not INDEPENDENT CONSULTANT.  (And they get bonus negative points for writing it in ALL CAPS.)

The moral of the story is that if you truly value your customers, then you should truly value your customer data quality.

At the very least—get your customer’s name right.
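
A simple safeguard against this particular failure is to screen customer names for known placeholder values before a mailing ever goes out.  Here is a minimal sketch, with a hypothetical placeholder list:

```python
# Hypothetical placeholder values that should never appear
# where a real customer name belongs.
PLACEHOLDER_NAMES = {"INDEPENDENT CONSULTANT", "VALUED CUSTOMER", "UNKNOWN", "N/A"}

def name_looks_suspect(name: str) -> bool:
    """Flag names that are placeholders or shouted in ALL CAPS."""
    cleaned = name.strip()
    if cleaned.upper() in PLACEHOLDER_NAMES:
        return True
    # An all-caps, multi-word "name" is often a job title or a default value.
    return cleaned.isupper() and len(cleaned.split()) > 1

print(name_looks_suspect("INDEPENDENT CONSULTANT"))  # True
print(name_looks_suspect("Jim Harris"))              # False
```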

 

Related Posts

Customer Incognita

Identifying Duplicate Customers

Adventures in Data Profiling (Part 7) – Customer Name

The Quest for the Golden Copy (Part 3) – Defining “Customer”

‘Tis the Season for Data Quality

The Seven Year Glitch

DQ-IRL (Data Quality in Real Life)

Data Quality, 50023

Once Upon a Time in the Data

The Semantic Future of MDM