Big Data and Big Analytics

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

Jill Dyché is the Vice President of Thought Leadership and Education at DataFlux.  Jill’s role at DataFlux is a combination of best-practice expert, key client advisor and all-around thought leader.  She is responsible for industry education, key client strategies and market analysis in the areas of data governance, business intelligence, master data management and customer relationship management.  Jill is a regularly featured speaker and the author of several books.

Jill’s latest book, Customer Data Integration: Reaching a Single Version of the Truth (Wiley & Sons, 2006), was co-authored with Evan Levy and shows the business breakthroughs achieved with integrated customer data.

Dan Soceanu is the Director of Product Marketing and Sales Enablement at DataFlux.  Dan manages global field sales enablement and product marketing, including product messaging and marketing analysis.  Prior to joining DataFlux in 2008, Dan held marketing, partnership and market research positions with Teradata, General Electric and FormScape, as well as data management positions in the Financial Services sector.

Dan received his Bachelor of Science in Business Administration from Kutztown University of Pennsylvania, and earned his Master of Business Administration from Bloomsburg University of Pennsylvania.

On this episode of OCDQ Radio, Jill Dyché, Dan Soceanu, and I discuss the recent Pacific Northwest BI Summit, where the three core conference topics were Cloud, Collaboration, and Big Data, the last of which led to a discussion about Big Analytics.

Popular OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Demystifying Data Science — Guest Melinda Thielbar, a Ph.D. Statistician, discusses what a data scientist does and provides a straightforward explanation of key concepts such as signal-to-noise ratio, uncertainty, and correlation.
  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including whether data quality matters less in larger data sets, and whether statistical outliers represent business insights or data quality issues.
  • Demystifying Master Data Management — Guest John Owens explains the three types of data (Transaction, Domain, Master), the four master data entities (Party, Product, Location, Asset), and the Party-Role Relationship, which is where we find many of the terms commonly used to describe the Party master data entity (e.g., Customer, Supplier, Employee).
  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, Star Wars-themed discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.

The Stakeholder’s Dilemma

Game theory models a strategic situation as a game in which an individual player’s success depends on the choices made by the other players involved in the game.  One excellent example is the game known as The Prisoner’s Dilemma, which is deliberately designed to demonstrate why two people might not cooperate—even if it is in both of their best interests to do so.

Here is the classic scenario.  Two criminal suspects are arrested, but the police have insufficient evidence for a conviction.  So they separate the prisoners and offer each the same deal.  If one testifies for the prosecution against the other (i.e., defects) and the other remains silent (i.e., cooperates), the defector goes free and the silent accomplice receives the full one-year sentence.  If both remain silent, both prisoners are sentenced to only one month in jail for a minor charge.  If each betrays the other, each receives a three-month sentence.  Each prisoner must choose to betray the other or to remain silent.
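To see why the two prisoners might not cooperate, it helps to check each prisoner's best response to each possible choice by the other.  The following is a minimal sketch, not a formal game theory model.  The sentences (in months) come from the scenario above, while the names SENTENCE_MONTHS and best_response are hypothetical and only used for illustration.

```python
# Sentences in months from the scenario above, keyed by (my_choice, other_choice).
# Each value is the sentence that *I* receive.
SILENT, BETRAY = "silent", "betray"

SENTENCE_MONTHS = {
    (SILENT, SILENT): 1,   # both stay silent: one month each
    (SILENT, BETRAY): 12,  # I stay silent, the other testifies: full one-year sentence
    (BETRAY, SILENT): 0,   # I testify, the other stays silent: I go free
    (BETRAY, BETRAY): 3,   # both testify: three months each
}


def best_response(other_choice):
    """Return the choice that minimizes my own sentence, given the other's choice."""
    return min((SILENT, BETRAY), key=lambda mine: SENTENCE_MONTHS[(mine, other_choice)])


for other in (SILENT, BETRAY):
    print(f"If the other prisoner chooses {other}, my best response is {best_response(other)}")
```

Whichever choice the other prisoner makes, betraying yields the shorter individual sentence, even though both remaining silent (one month each) is better for the pair than both betraying (three months each).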

If you have ever regularly watched a police procedural television series, such as Law & Order, then you have seen many dramatizations of the prisoner’s dilemma, including several sample outcomes of when the prisoners make different choices.

The Iterated Prisoner’s Dilemma

In iterated versions of the prisoner’s dilemma, players remember the previous actions of their opponent and change their strategy accordingly.  In many fields of study, these variations are considered fundamental to understanding cooperation and trust.

Here is an economics scenario with two players and a banker.  Each player holds a set of two cards, one printed with the word Cooperate (as in, with each other), the other printed with the word Defect.  Each player puts one card face-down in front of the banker.  Laying the cards face-down eliminates the possibility of a player knowing the other player’s selection in advance.  At the end of each turn, the banker turns over both cards and gives out the payments, which can vary, but one example is as follows.

If both players cooperate, they are each awarded $5.  If both players defect, they are each penalized $1.  But if one player defects while the other player cooperates, the defector is awarded $10, while the cooperator neither wins nor loses any money.

Therefore, the safest play is to always cooperate, since you would never lose any money—and if your opponent always cooperates, then you can both win on every turn.  However, although defecting creates the possibility of losing a small amount of money, it also creates the possibility of winning twice as much money.
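The worst-case and best-case payments for each card can be tabulated directly from the example amounts.  This is only an illustrative sketch: the dollar values are the ones given above, and the names PAYOFF and outcomes are hypothetical.

```python
COOPERATE, DEFECT = "Cooperate", "Defect"

# Payment to *me* for one turn, keyed by (my_card, opponent_card),
# using the example dollar amounts from the text.
PAYOFF = {
    (COOPERATE, COOPERATE): 5,
    (COOPERATE, DEFECT): 0,
    (DEFECT, COOPERATE): 10,
    (DEFECT, DEFECT): -1,
}


def outcomes(my_card):
    """All payments I could receive for a given card, over both opponent cards."""
    return [PAYOFF[(my_card, theirs)] for theirs in (COOPERATE, DEFECT)]


for card in (COOPERATE, DEFECT):
    print(f"{card}: worst case {min(outcomes(card))}, best case {max(outcomes(card))}")
```

Cooperating never drops below zero, while defecting risks a $1 loss in exchange for a possible $10 win.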

It is the iterated nature of this version of the prisoner’s dilemma that makes it so interesting for those studying human behavior.

For example, if you were playing against me, and I defected on the first two turns while you cooperated, I would have won $20 while you would have won nothing.  So what would you do on the third turn?  Let’s say that you choose to defect.

But if I defected yet again, although we would both lose $1, overall I would still be +$19 while you would be -$1.  And what if I continued defecting?  This would actually be an understandable strategy for me, if I were only playing for money, since you would have to defect 19 more times in a row before I broke even, by which time you would have also lost $20.  And if instead, you start cooperating again in order to stop your losses, I could win a lot of money, at the expense of losing your trust.
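The running totals in this example can be checked with a short simulation.  This is a sketch under the example payments described earlier ($5 each for mutual cooperation, -$1 each for mutual defection, $10 and $0 when one player defects against a cooperator).  The function name play and the 22-turn length are assumptions chosen to match the scenario: two turns where you cooperate, then twenty turns where we both defect.

```python
COOPERATE, DEFECT = "C", "D"

# Payment to the player holding the first card, keyed by (own_card, other_card).
PAYOFF = {
    (COOPERATE, COOPERATE): 5,
    (COOPERATE, DEFECT): 0,
    (DEFECT, COOPERATE): 10,
    (DEFECT, DEFECT): -1,
}


def play(my_cards, your_cards):
    """Return the running totals (me, you) after each turn."""
    me = you = 0
    history = []
    for mine, yours in zip(my_cards, your_cards):
        me += PAYOFF[(mine, yours)]
        you += PAYOFF[(yours, mine)]
        history.append((me, you))
    return history


# I defect on every turn; you cooperate twice, then defect from the third turn on.
turns = 22
my_cards = [DEFECT] * turns
your_cards = [COOPERATE] * 2 + [DEFECT] * (turns - 2)
history = play(my_cards, your_cards)

print("After turn 2: ", history[1])
print("After turn 3: ", history[2])
print("After turn 22:", history[-1])
```

After the third turn the totals are +$19 and -$1, and after 19 further turns of mutual defection I am back to $0 while you are down $20.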

Although the iterated prisoner’s dilemma is designed so that, over the long-term, cooperating players generally do better than non-cooperating players, in the short-term, the best result for an individual player is to defect while their opponent cooperates.

The Stakeholder’s Dilemma

Organizations embarking on an enterprise-wide initiative, such as data quality, master data management, and data governance, play a version of the iterated prisoner’s dilemma, which I refer to as The Stakeholder’s Dilemma.

These initiatives often bring together key stakeholders from all around the organization, representing each business unit or business function, and perhaps stakeholders representing data and technology as well.  These stakeholders usually form a committee or council, which is responsible for certain top-down aspects of the initiative, such as funding and strategic planning.

Of course, it is unrealistic to expect every stakeholder to cooperate equally at all times.  The realities of the fiscal calendar effect, conflicting interests, and changing business priorities mean that during any particular turn in the game (i.e., the current phase of the initiative), the amount of resources (money, time, people) allocated to the effort by a particular stakeholder will vary.

There will be times when sacrifices for the long-term greater good of the initiative will require that cooperating stakeholders either contribute more resources during the current phase, or receive fewer benefits from its deliverables, than defecting stakeholders.

As with the iterated prisoner’s dilemma, the challenge is what happens during the next turn (i.e., the next phase of the initiative).

If the same stakeholders repeatedly defect, then will the other stakeholders continue to cooperate?  Or will the spirit of trust, cooperation, and collaboration necessary for the continuing success of the ongoing initiative be irreparably damaged?

There are many, and often complex, reasons why enterprise-wide initiatives fail, but failing to play the stakeholder’s dilemma well is one very common reason, and it is also a reason why many future enterprise-wide initiatives will fail to garner support.

How well does your organization play The Stakeholder’s Dilemma?

The Data Governance Oratorio

Boston Symphony Orchestra

An oratorio is a large musical composition collectively performed by an orchestra of musicians and a choir of singers, all of whom accept a shared responsibility for the quality of their performance.  It also requires that individual performers accept accountability for playing their own musical instrument or singing their own lines, which includes an occasional instrumental or lyrical solo.

During a well-executed oratorio, individual mastery combines with group collaboration, creating a true symphony, a sounding together, which produces a more powerful performance than even the most consummate solo artist could deliver on their own.

 

The Data Governance Oratorio

Ownership, Responsibility, and Accountability comprise the core movements of the Data Governance ORA-torio.

Data is a corporate asset collectively owned by the entire enterprise.  Data governance is a cross-functional, enterprise-wide initiative requiring that everyone, regardless of their primary role or job function, accept a shared responsibility for preventing data quality issues, and for responding appropriately to mitigate the associated business risks when issues do occur.  However, individuals must still be held accountable for the specific data, business process, and technology aspects of data governance.

Data governance provides the framework for the communication and collaboration of business, data, and technical stakeholders, and establishes an enterprise-wide understanding of the roles and responsibilities involved, and the accountability required to support the organization’s business activities, and materialize the value of the enterprise’s data as positive business impacts.

Collective ownership, shared responsibility, and individual accountability combine to create a true enterprise-wide symphony, a sounding together by the organization’s people, who, when empowered by high quality data and enabled by technology, can optimize business processes for superior corporate performance.

Is your organization collectively performing the Data Governance Oratorio?

 

Related Posts

Data Governance and the Buttered Cat Paradox

Beware the Data Governance Ides of March

Zig-Zag-Diagonal Data Governance

A Tale of Two G’s

The People Platform

The Collaborative Culture of Data Governance

Connect Four and Data Governance

The Business versus IT—Tear down this wall!

The Road of Collaboration

Collaboration isn’t Brain Surgery

Shared Responsibility

The Role Of Data Quality Monitoring In Data Governance

Quality and Governance are Beyond the Data

Data Transcendentalism

Podcast: Data Governance is Mission Possible

Video: Declaration of Data Governance

Don’t Do Less Bad; Do Better Good

Jack Bauer and Enforcing Data Governance Policies

The Prince of Data Governance

MacGyver: Data Governance and Duct Tape

The Diffusion of Data Governance

Spartan Data Quality

My recent Twitter conversation with Dylan Jones, Henrik Liliendahl Sørensen, and Daragh O Brien was sparked by the blog post Case study with Data blogs, from 300 to 1000, which included a list of the top 500 data blogs ranked by influence.

Data Quality Pro was ranked #57, Liliendahl on Data Quality was ranked #87, The DOBlog was a glaring omission, and I was proud OCDQ Blog was ranked #33 – at least until, being the data quality geeks we are, we noticed that it was also ranked #165.

In other words, there was an ironic data quality issue—a data quality blog was listed twice (i.e., a duplicate record in the list)!
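For the data quality geeks, here is a minimal sketch of how such a duplicate could be flagged.  It only uses the four rankings mentioned above (not the full list of 500), the function name find_duplicates is hypothetical, and matching on a lowercased, trimmed blog name is an assumption; real duplicate detection usually needs fuzzier matching than this.

```python
from collections import defaultdict

# Only the rankings mentioned in this post; the full list is not reproduced here.
ranked_blogs = [
    (33, "OCDQ Blog"),
    (57, "Data Quality Pro"),
    (87, "Liliendahl on Data Quality"),
    (165, "OCDQ Blog"),
]


def find_duplicates(entries):
    """Group ranks by normalized blog name and keep names that appear more than once."""
    ranks_by_name = defaultdict(list)
    for rank, name in entries:
        ranks_by_name[name.strip().lower()].append(rank)
    return {name: ranks for name, ranks in ranks_by_name.items() if len(ranks) > 1}


print(find_duplicates(ranked_blogs))
```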

Hilarity ensued, including some epic Photoshopping by Daragh, leading, quite inevitably, to the writing of this Data Quality Tale, which is obviously loosely based on the epic movie 300, and perhaps also the epically terrible comedy Meet the Spartans.  Enjoy!

 

Spartan Data Quality

In 1989, an alliance of Data Geeks, led by the Spartans, an unrivaled group of data quality warriors, battled against an invading data deluge in the mountain data center of Thermopylae, caused by the complexities of the Greco-Persian Corporate Merger.

Although they were vastly outnumbered, the Data Geeks overcame epic data quality challenges in one of the most famous enterprise data management initiatives in history—The Data Integration of Thermopylae.

This is their story.

Leonidas, leader of the Spartans, espoused an enterprise data management approach known as Spartan Data Quality, defined by its ethos of collaboration amongst business, data, and technology experts, collectively and affectionately known as Data Geeks.

Therefore, Leonidas was chosen as the Thermopylae Project Lead.  However, Xerxes, the new Greco-Persian CIO, believed that the data integration project was pointless, Spartan Data Quality was a fool’s errand, and the technology-only Persian approach, known as Magic Beans, should be implemented instead.  Xerxes saw the Thermopylae project as an unnecessary sacrifice.

“There will be no glory in your sacrifice,” explained Xerxes.  “I will erase even the memory of Sparta from the database log files!  Every bit and byte of Data Geek tablespace shall be purged.  Every data quality historian and every data blogger shall have their Ethernet cables pulled out, and their network connections cut from the Greco-Persian mainframe.  Why, uttering the very name of Sparta, or Leonidas, will be punishable by employee termination!  The corporate world will never know you existed at all!”

“The corporate world will know,” replied Leonidas, “that Data Geeks stood against a data deluge, that few stood against many, and before this battle was over, a CIO blinded by technology saw what it truly takes to manage data as a corporate asset.”

Addressing his small army of 300 Data Geeks, Leonidas declared: “Gather round!  No retreat, no surrender.  That is Spartan law.  And by Spartan law we will stand and fight.  And together, united by our collaboration, our communication, our transparency, and our trust in each other, we shall overcome this challenge.”

“A new Information Age has begun.  An age of data-driven business decisions, an age of data-empowered consumers, an age of a world connected by a web of linked data.  And all will know, that 300 Data Geeks gave their last breath to defend it!”

“But there will be so many data defects, they will blot out the sun!” exclaimed Xerxes.

“Then we will fight poor data quality in the shade,” Leonidas replied, with a sly smile.

“This is madness!” Xerxes nervously responded as the new servers came on-line in the data center of Thermopylae.

“Madness?  No,” Leonidas calmly said as the first wave of the data deluge descended upon them.  “THIS . . . IS . . . DATA !!!”

 

Related Posts

Pirates of the Computer: The Curse of the Poor Data Quality

Video: Oh, the Data You’ll Show!

The Quest for the Golden Copy (Part 1)

The Quest for the Golden Copy (Part 2)

The Quest for the Golden Copy (Part 3)

The Quest for the Golden Copy (Part 4)

‘Twas Two Weeks Before Christmas

My Own Private Data

The Tell-Tale Data

Data Quality is People!

The People Platform

Platforms are popular in enterprise data management.  Most of the time, the term is used to describe a technology platform, an integrated suite of tools that enables the organization to manage its data in support of its business processes.

Other times the term is used to describe a methodology platform, an integrated set of best practices that enables the organization to manage its data as a corporate asset in order to achieve superior business performance.

Data governance is an example of a methodology platform, where one of its central concepts is the definition, implementation, and enforcement of policies, which govern the interactions between business processes, data, technology, and people.

But many rightfully lament the misleading term “data governance” because it appears to put the emphasis on data, arguing that since business needs come first in every organization, data governance should be formalized as a business process, and therefore mature organizations should view data governance as business process management.

However, successful enterprise data management is about much more than data, business processes, or enabling technology.

Business process management, data quality management, and technology management are all people-driven activities because people empowered by high quality data, enabled by technology, optimize business processes for superior business performance.

Data governance policies illustrate the intersection of business, data, and technical knowledge, which is spread throughout the enterprise, transcending any artificial boundaries imposed by an organizational chart, where different departments or different business functions appear as if they were independent of the rest of the organization.

Data governance policies reveal how truly interconnected and interdependent the organization is, and how everything that happens within the organization happens as a result of the interactions occurring among its people.

Michael Fauscette defines people-centricity as “our current social and business progression past the industrial society’s focus on business, technology, and process.  Not that business or technology or process go away, but instead they become supporting structures that facilitate new ways of collaborating and interacting with customers, suppliers, partners, and employees.”

In short, Fauscette believes people are becoming the new enterprise platform—and not just for data management.

I agree, but I would argue that people have always been—and always will be—the only successful enterprise platform.

 

Related Posts

The Collaborative Culture of Data Governance

Data Governance and the Social Enterprise

Connect Four and Data Governance

What Data Quality Technology Wants

Data and Process Transparency

The Business versus IT—Tear down this wall!

Collaboration isn’t Brain Surgery

Trust is not a checklist

Quality and Governance are Beyond the Data

Data Transcendentalism

Podcast: Data Governance is Mission Possible

Video: Declaration of Data Governance

Does your organization have a Calumet Culture?

In my previous post, I once again blogged about how the key to success for most, if not all, organizational initiatives is the willingness of people all across the enterprise to embrace collaboration.

However, what happens when an organization’s corporate culture doesn’t foster an environment of collaboration?

Sometimes as a result of rapid business growth, an organization trades effectiveness for efficiency, prioritizes short-term tactics over long-term strategy, and even encourages “friendly” competition amongst its relatively autonomous business units.

However, when the need for a true enterprise-wide initiative such as data governance becomes (perhaps painfully) obvious, the organization decides to bring representatives from all of its different “tribes” together to discuss the complexities of the business, data, technical, and (most important) people-related issues that would shape the realities of a truly collaborative environment.

“Calumet Culture” is the term I like using (and not just because of my affinity for alliteration) to describe the disingenuous way that I have occasionally witnessed these organizational stakeholder gathering “ceremonies” carried out.

Calumet was the Norman word used by Norman-French Canadian settlers to describe the “peace pipes” they witnessed the people of the First Nations (referred to as Native Americans in the United States) using at ceremonies marking a treaty between previously combative factions.

Simply gathering everyone together around the camp fire (or the conference room table) is an empty gesture, similar in many ways to non-Native Americans mimicking a “peace pipe ceremony” and using one of their words (Calumet) to describe what was in fact a deeply spiritual object used to convey true significance to the event.

When collaboration is discussed at strategic planning meetings with great pomp and circumstance, but after the meetings end, the organization returns to its non-collaborative status quo, then little, if any, true collaboration should be expected to happen.

Does your organization have a Calumet Culture?

In other words, does your organization have a corporate culture that talks the talk of collaboration, but doesn’t walk the walk?

If so, how have you attempted to overcome this common barrier to success?

Data Governance and the Social Enterprise

In his blog post Socializing Software, Michael Fauscette explained that in order “to create a next generation enterprise, businesses need to take two concepts from the social web and apply them across all business functions: community and content.”

“Traditional enterprise software,” according to Fauscette, “was built on the concept of managing through rigid business processes and controlled workflow.  With process at the center of the design, people-based collaboration was not possible.”

Peter Sondergaard, the global head of research at Gartner, explained at a recent conference that “the rigid business processes which dominate enterprise organizational architectures today are well suited for routine, predictable business activities.  But they are poorly suited to support people whose jobs require discovery, interpretation, negotiation and complex decision-making.”

“Social computing,” according to Sondergaard, “not Facebook, or Twitter, or LinkedIn, but the technologies and principles behind them will be implemented across and between all organizations, and it will unleash yet to be realized productivity growth.”

Since the importance of collaboration is one of my favorite topics, I like Fauscette’s emphasis on people-based collaboration and Sondergaard’s emphasis on the limitations of process-based collaboration.  The key to success for most, if not all, organizational initiatives is the willingness of people all across the enterprise to embrace collaboration.

Successful organizations view collaboration not just as a guiding principle, but as a call to action in their daily business practices.

As Sondergaard points out, the technologies and principles behind social computing are the key to enabling what many analysts have begun referring to as the social enterprise.  Collaboration is the key to business success.  This essential collaboration has to be based on people, and not on rigid business processes, since business activities and business priorities are constantly changing.

 

Data Governance and the Social Enterprise

Often the root cause of poor data quality can be traced to a lack of a shared understanding of the roles and responsibilities involved in how the organization is using its data to support its business activities.  The primary focus of data governance is the strategic alignment of people throughout the organization through the definition, implementation, and enforcement of the policies that govern the interactions between people, business processes, data, and technology.

A data quality program within a data governance framework is a cross-functional, enterprise-wide initiative requiring people to be accountable for its data, business process, and technology aspects.  However, policy enforcement and accountability are often confused with traditional notions of command and control, which is the antithesis of the social enterprise that instead requires an emphasis on communication, cooperation, and people-based collaboration.

Data governance policies for data quality illustrate the intersection of business, data, and technical knowledge, which is spread throughout the enterprise, transcending any artificial boundaries imposed by an organizational chart or rigid business processes, where different departments or different business functions appear as if they were independent of the rest of the organization.

Data governance reveals how interconnected and interdependent the organization is, and why people-driven social enterprises are more likely to survive and thrive in today’s highly competitive and rapidly evolving marketplace.

Social enterprises rely on the strength of their people asset to successfully manage their data, which is a strategic corporate asset because high quality data serves as a solid foundation for an organization’s success, empowering people, enabled by technology, to optimize business processes for superior business performance.

 

Related Posts

Podcast: Data Governance is Mission Possible

Trust is not a checklist

The Business versus IT—Tear down this wall!

The Road of Collaboration

Shared Responsibility

Enterprise Ubuntu

Data Transcendentalism

Social Karma

Podcast: Data Governance is Mission Possible

The recent Information Management article Data – Who Cares! by Martin ABC Hansen of Platon has the provocative subtitle:

“If the need to care for data and manage it as an asset is so obvious, then why isn’t it happening?”

Hansen goes on to explain some of the possible reasons under an equally provocative section titled “Mission Impossible.”  It is a really good article that I recommend reading, and it also prompted me to record my thoughts on the subject in a new podcast:

You can also download this podcast (MP3 file) by clicking on this link: Data Governance is Mission Possible

Some of the key points covered in this approximately 15-minute OCDQ Podcast include:

  • Data is a strategic corporate asset because high quality data serves as a solid foundation for an organization’s success, empowering people, enabled by technology, to make better business decisions and optimize business performance
  • Data is an asset owned by the entire enterprise, and not owned by individual business units nor individual people
  • Data governance is the strategic alignment of people throughout the organization through the definition and enforcement of the declared policies that govern the complex ways in which people, business processes, data, and technology interact
  • Five steps for enforcing data governance policies:
    1. Documentation: Use straightforward, natural language to document your policies in a way everyone can understand
    2. Communication: Effective communication requires that you encourage open discussion and debate of all viewpoints
    3. Metrics: Meaningful metrics can be effectively measured, and represent the business impact of data governance
    4. Remediation: Correct any combination of business process, technology, data, and people, and sometimes all four
    5. Refinement: Dynamically evolve and adapt your data governance policies, as well as their associated metrics
  • Data governance requires everyone within the organization to accept a shared responsibility for both failure and success
  • This blog post will self-destruct in 10 seconds . . . Just kidding, I didn’t have the budget for special effects

 

Related Posts

Shared Responsibility

Quality and Governance are Beyond the Data

Video: Declaration of Data Governance

Don’t Do Less Bad; Do Better Good

Delivering Data Happiness

The Circle of Quality

The Diffusion of Data Governance

Jack Bauer and Enforcing Data Governance Policies

The Prince of Data Governance

MacGyver: Data Governance and Duct Tape

 

Trust is not a checklist

This is my seventh blog post tagged Karma since I promised to discuss it directly and indirectly on my blog throughout the year after declaring KARMA my theme word for 2010 back on the first day of January, which is now almost ten months ago.

 

Trust and Collaboration

I was reminded of the topic of this post, trust, by this tweet by Jill Wanless sent from the recent Collaborative Culture Camp, which was a one-day conference on enabling collaboration in a government context, held on October 15 in Ottawa, Ontario.

I followed the conference Twitter stream remotely and found many of the tweets interesting, especially ones about the role that trust plays in collaboration, which is one of my favorite topics in general, and one that plays well with my karma theme word.

 

Trust is not a checklist

The title of this blog post comes from the chapter on The Emergence of Trust in the book Start with Why by Simon Sinek, where he explained that trust is an organizational performance category that is nearly impossible to measure.

“Trust does not emerge simply because a seller makes a rational case why the customer should buy a product or service, or because an executive promises change.  Trust is not a checklist.  Fulfilling all your responsibilities does not create trust.  Trust is a feeling, not a rational experience.  We trust some people and companies even when things go wrong, and we don’t trust others even though everything might have gone exactly as it should have.  A completed checklist does not guarantee trust.  Trust begins to emerge when we have a sense that another person or organization is driven by things other than their own self-gain.”

 

Trust is not transparency

This past August, Scott Berkun blogged about how “trust is always more important than authenticity and transparency.”

“The more I trust you,” Berkun explained, “the less I need to know the details of your plans or operations.  Honesty, diligence, fairness, and clarity are the hallmarks of good relationships of all kinds and lead to the magic of trust.  And it’s trust that’s hardest to earn and easiest to destroy, making it the most precious attribute of all.  Becoming more transparent is something you can do by yourself, but trust is something only someone else can give to you.  If transparency leads to trust, that’s great, but if it doesn’t you have bigger problems to solve.”

 

Organizational Karma

Trust and collaboration create strong cultural ties, both personally and professionally.

“A company is a culture,” Sinek explained.  “A group of people brought together around a common set of values and beliefs.  It’s not the products or services that bind a company together.  It’s not size and might that make a company strong, it’s the culture, the strong sense of beliefs and values that everyone, from the CEO to the receptionist, all share.”

Organizations looking for ways to survive and thrive in today’s highly competitive and rapidly evolving marketplace should embrace the fact that trust and collaboration are the organizational karma of corporate culture.

Trust me on this one—good karma is good business.

 

Related Posts

New Time Human Business

The Great Rift

Social Karma (Part 6)

The Challenging Gift of Social Media

The Importance of Envelopes

True Service

The Game of Darts – An Allegory

“I can make glass tubes”

My #ThemeWord for 2010: KARMA

The Business versus IT—Tear down this wall!

The Road of Collaboration

Video: Declaration of Data Governance