The Mullet Blogging Manifesto

Blogging is more art than science.  My personal blogging style can perhaps best be described as mullet blogging.  No, not the “business in the front, party in the back” haircut that I tried to rock back in the '80s (I couldn't pull it off, had to settle for a “tail” and had to cut that off because it made me look like an idiot – OK, more idiotic than usual).  By mullet blogging I mean:

“Take yourself and your blog seriously, but still have a sense of humor about both.”

As a mullet blogger, I hold the following truths to be self-evident, but I decided to write them down anyway.

 

Blogging is All about You

Not you meaning me, the blogger — you meaning you, the reader.

Blogging should always focus on the reader and provide them assistance with a specific problem, even if that problem is boredom or simply a need for entertainment.  Don't worry about your readers agreeing with you.  They will either thank you for your help or tell you that you're an idiot – either way, you have started a conversation, which should always be your blogging goal.

Brian Clark recently shared something to think about using the following quote from Robert McKee:

“When talented people write badly it’s generally for one of two reasons:

Either they’re blinded by an idea they feel compelled to prove,

Or they’re driven by an emotion they must express.

When talented people write well, it is generally for this reason:

They’re moved by a desire to touch the audience.”

B = U2C3

Blogging = Unique and Useful content that is Clear, Concise, and Consumable.

The conventional blogging wisdom is to be both Unique and Useful.  Although I normally like to defy conventions, I have to agree with the wise ones on these fundamentals.

One of the most important aspects of being unique is writing effective titles.  Most potential readers scan titles to determine whether or not they will click and read more.  There is obviously a delicate balance between effective titles and “baiting,” which will only alienate potential readers. 

If you write a compelling title that makes me click through to an interesting post, then “You Rock!”  However, if you write a “Shock and Awe” title followed by “Aw Shucks” content, then “You Suck!” 

Therefore, your content also has to be unique – your topic, position, voice, or a combination of all three.

One of the most important aspects of being useful is “infotainment” – that combination of information and entertainment that, when done well, can turn potential readers into raving fans.  Just don't forget about the previous section – your content has to be informative and entertaining to your readers.

The key to good blogging is to follow the Three C’s – Clear, Concise, and Consumable.

The attention span of a blog reader is not the same as that of someone reading a book, a newspaper (they still exist, right?), or a magazine article, or of someone sitting through a presentation.  Most people only scan blogs, rarely read a full post, and even more rarely leave a comment – regardless of how well the blog post is written.

Write blog posts that get to the point and stay on point (i.e., clear), are no longer than they need to be (i.e., concise), and are formatted to be easy to read on a computer screen (i.e., consumable).

 

Laugh, Think, Comment

The three things that you want your readers to do.

Although it is not as blatantly formulaic as the title of the previous section, here is another method to my blogging madness:

  1. Open with a joke
  2. Say something thought provoking
  3. End with a call to action

It's as easy as 1-2-3!  In my defense, I didn't say open with a good joke.  But seriously, humor can be a great way to start a conversation and hold your readers' attention for those few precious additional seconds while you are getting to your point.  Obviously, there will be times when the seriousness of your subject would make comedy inappropriate, and if you are not naturally inclined to use humor, then you shouldn't try to force it.

Thought-provoking content doesn't have to mean deep thoughts.  There is no need to channel Jean-Paul Sartre, for example.  However, to paraphrase Sartre: “Hell is other people's boring blogs.”

Obviously, comments are not the only type of call to action.  However, blogging is a conversation facilitated by the dialogue and discussion provided via comments from your readers.  Without comments, the conversation is only one way. 

I love the sound of my own voice and I talk to myself all the time (even in public).  However, the two-way conversation provided via comments not only greatly improves the quality of my blog content — much more importantly, it helps me better appreciate the difference between what I know and what I only think I know.

As Darren Rowse and Chris Garrett explained in their highly recommended ProBlogger book: “even the most popular blogs tend to attract only about a 1 percent commenting rate.”  Therefore, don't be too disappointed if you are not getting many comments.  Take that statistic as a challenge to motivate you to write blog posts that your readers simply cannot resist commenting on.

Respond to the comments you do receive.  This continues the two-way conversation and encourages comments from other readers.  Make sure to never talk down to your readers (either in your blog post or your comment responses).  It is perfectly fine to disagree and debate, just don't denigrate. 

Obviously, you should block all spam (leading argument for using comment moderation) and never feed the troll.

 

Stories and Metaphors and Analogies!  Oh, my!

I've a feeling we're not in Kansas anymore.  Especially me, since I live in Iowa.

Darren Rowse recently shared some great tips about why stories are an effective communication tool for your blog, including a list of some of the different types of stories you can tell.

My blog uses a lot of metaphors and analogies (and sometimes just plain silliness) in an attempt to make my posts more interesting.  This is necessary because I write about a niche topic, which although important, is also rather dull.

James Chartrand uses the term Method Blogging as (yes, you guessed it) a metaphor for blogging by comparing it to method acting.  Try experimenting with different styles like an actor experimenting with different types of roles and movie genres. 

Oftentimes, using stories, metaphors, and analogies in my content works very well.  But I admit, sometimes it simply sucks. 

However, I have never been afraid to look like an idiot.  After all, we idiots are important members of society – we make everyone else look smart by comparison.

 

The King, Queen, and Crown Prince of Blogging

Meet the Blogging Royal Family: Content, Marketing, and Context.

Content is King.  The primary reason that people are (or aren't) reading your blog is because of your content.

Marketing is Queen.  “If you blog it, they will read.”  Ah, no they won't – this ain't Field of Dreams.  Some of the best-written blogs on the Series of Tubes get hardly any love because they get hardly any marketing.  In addition to providing RSS and e-mail feeds, I use social media (e.g., Twitter, Facebook, LinkedIn) to promote my blog content.

However, too many bloggers have a selfish social media strategy.  Don't use it exclusively for self-promotion.  View social media as Social Karma.  Focus on helping others and you will get much more back than just a blog reader, a LinkedIn connection, a Twitter follower, or a Facebook friend.  In addition to blog promotion (which is important), I use social media to listen, to learn, and to help others when I can.

Larry Brooks recently explained that although content may still be king, at the very least, you must pay homage to the new Crown Prince — Context.  To paraphrase Brooks, context comes from clarity about your blogging goals, juxtaposed against the expectations and tolerances of your readers.  Basically, this above all: to thine own readers be true.

 

Emerson on Blogging

“Nothing can bring you peace but yourself.”

One of my favorite writers is Ralph Waldo Emerson.  The quote that started this section was pure Emerson.  What follows is a slight paraphrasing of one of my all-time favorite passages, which comes from his essay on Self-Reliance:

“What I must do is all that concerns me, not what the people think.  This rule, equally arduous in real and in online life, may serve for the whole distinction between greatness and meanness.  It is the harder because you will always find those who think they know what is your duty better than you know it.  It is easy in the world to live after the world's opinion; it is easy in solitude to live after our own; but the great blogger is one who in the midst of the blogosphere, keeps with perfect sweetness the independence of solitude.”

Bottom line — BE YOURSELF — Let your own personality shine through.  Make people feel like they are having a conversation with a real person and not just someone who is blogging what they think people want to read.

I hope that you found at least some of this manifesto helpful.  I also hope to see more of you around the blogosphere.

I'll be the balding blogger who used to almost have a mullet...

 

Related Posts

Collablogaunity

Brevity is the Soul of Social Media

Podcast: Your Blog, Your Voice

Customer Incognita

Many enterprise information initiatives are launched in order to unravel that riddle, wrapped in a mystery, inside an enigma, that great unknown, also known as...Customer.

Centuries ago, cartographers used the Latin phrase terra incognita (meaning “unknown land”) to mark regions on a map not yet fully explored.  In this century, companies simply cannot afford to use the phrase customer incognita to indicate what information about their existing (and prospective) customers they don't currently have or don't properly understand.

 

What is a Customer?

First things first, what exactly is a customer?  Those happy people who give you money?  Those angry people who yell at you on the phone or say really mean things about your company on Twitter and Facebook?  Why do they have to be so mean? 

Mean people suck.  However, companies who don't understand their customers also suck.  And surely you don't want to be one of those companies, do you?  I didn't think so.

Getting back to the question, here are some insights from the Data Quality Pro discussion forum topic What is a customer?:

  • Someone who purchases products or services from you.  The word “someone” is key because it’s not the role of a “customer” that forms the real problem, but the precision of the term “someone” that causes challenges when we try to link other and more specific roles to that “someone.”  These other roles could be contract partner, payer, receiver, user, owner, etc.
  • Customer is a role assigned to a legal entity in a complete and precise picture of the real world.  The role is established when the first purchase is accepted from this real-world entity.  Of course, the main challenge is whether or not the company can establish and maintain a complete and precise picture of the real world.

These working definitions were provided by fellow blogger and data quality expert Henrik Liliendahl Sørensen, who recently posted 360° Business Partner View, which further examines the many different ways a real-world entity can be represented, including when, instead of a customer, the real-world entity represents a citizen, patient, member, etc.

A critical first step for your company is to develop your definition of a customer.  Don't underestimate either the importance or the difficulty of this process.  And don't assume it is simply a matter of semantics.

Some of my consulting clients have indignantly told me: “We don't need to define it, everyone in our company knows exactly what a customer is.”  I usually respond: “I have no doubt that everyone in your company uses the word customer; however, I will work for free if everyone defines the word customer in exactly the same way.”  So far, I haven't had to work for free.

 

How Many Customers Do You Have?

You have done the due diligence and developed your definition of a customer.  Excellent!  Nice work.  Your next challenge is determining how many customers you have.  Hopefully, you are not going to try using any of these techniques:

  • SELECT COUNT(*) AS "We have this many customers" FROM Customers
  • SELECT COUNT(DISTINCT Name) AS "No wait, we really have this many customers" FROM Customers
  • Middle-Square or Blum Blum Shub methods (i.e., random number generation)
  • Magic 8-Ball says: “Ask again later”

One of the most common and challenging data quality problems is the identification of duplicate records, especially redundant representations of the same customer information within and across systems throughout the enterprise.  The need for a solution to this specific problem is one of the primary reasons that companies invest in data quality software and services.
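To see why neither of the two queries above can be trusted, here is a minimal sketch using Python's built-in sqlite3 module.  The table layout and the five sample records are invented purely for illustration (they are assumptions, not data from any real system): they represent three real-world customers, two of whom were entered twice with slightly different names, plus two different people who happen to share a name.

```python
import sqlite3

# Hypothetical sample data: five records describing only three real customers.
rows = [
    (1, "John Smith", "Boston"),
    (2, "Jon Smith", "Boston"),          # same person as record 1, name misspelled
    (3, "John Smith", "Des Moines"),     # a different person who shares a name
    (4, "Acme Corp", "Chicago"),
    (5, "ACME Corporation", "Chicago"),  # same company as record 4
]
TRUE_CUSTOMER_COUNT = 3  # known only because we made the data up

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT, City TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", rows)

# "We have this many customers": counts every record, duplicates included.
record_count = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]

# "No wait, we really have this many customers": merges the two different
# John Smiths (a false positive) yet still treats "Jon Smith" and
# "ACME Corporation" as new customers (false negatives).
distinct_names = conn.execute(
    "SELECT COUNT(DISTINCT Name) FROM Customers"
).fetchone()[0]

print(f"COUNT(*):              {record_count}")
print(f"COUNT(DISTINCT Name):  {distinct_names}")
print(f"Real-world customers:  {TRUE_CUSTOMER_COUNT}")
```

With this made-up data, both queries overstate the number of customers, and the second one manages to be wrong in both directions at once.  Exact matching on a single column is no substitute for business rules that define what counts as a duplicate.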

Earlier this year on Data Quality Pro, I published a five-part series of articles on identifying duplicate customers, which focused on the methodology for defining your business rules and illustrated some of the common data matching challenges.

Topics covered in the series:

  • Why a symbiosis of technology and methodology is necessary when approaching this challenge
  • How performing a preliminary analysis on a representative sample of real data prepares effective examples for discussion
  • Why using a detailed, interrogative analysis of those examples is imperative for defining your business rules
  • How both false negatives and false positives illustrate the highly subjective nature of this problem
  • How to document your business rules for identifying duplicate customers
  • How to set realistic expectations about application development
  • How to foster a collaboration of the business and technical teams throughout the entire project
  • How to consolidate identified duplicates by creating a “best of breed” representative record

To download the associated presentation (no registration required), please follow this link: OCDQ Downloads

 

Conclusion

“Knowing the characteristics of your customers,” stated Jill Dyché and Evan Levy in the opening chapter of their excellent book, Customer Data Integration: Reaching a Single Version of the Truth, “who they are, where they are, how they interact with your company, and how to support them, can shape every aspect of your company's strategy and operations.  In the information age, there are fewer excuses for ignorance.”

For companies of every size and within every industry, customer incognita is a crippling condition that must be replaced with customer cognizance in order for the company to remain competitive in a rapidly changing marketplace.

Do you know your customers?  If not, then they likely aren't your customers anymore.

Poor Quality Data Sucks

Fenway Park 2008 Home Opener

Over the last few months on his Information Management blog, Steve Miller has been writing posts inspired by a great 2008 book that we both highly recommend: The Drunkard's Walk: How Randomness Rules Our Lives by Leonard Mlodinow.

In his most recent post The Demise of the 2009 Boston Red Sox: Super-Crunching Takes a Drunkard's Walk, Miller takes on my beloved Boston Red Sox and the less than glorious conclusion to their 2009 season. 

For those readers who are not baseball fans, the Los Angeles Angels of Anaheim swept the Red Sox out of the playoffs.  I will let Miller's words describe their demise: “Down two to none in the best of five series, the Red Sox took a 6-4 lead into the ninth inning, turning control over to impenetrable closer Jonathan Papelbon, who hadn't allowed a run in 26 postseason innings.  The Angels, within one strike of defeat on three occasions, somehow managed a miracle rally, scoring 3 runs to take the lead 7-6, then holding off the Red Sox in the bottom of the ninth for the victory to complete the shocking sweep.”

 

Baseball and Data Quality

What, you may be asking, does baseball have to do with data quality?  Beyond simply being two of my all-time favorite topics, quite a lot actually.  Baseball data is mostly transaction data describing the statistical events of games played.

Statistical analysis has been a beloved pastime even longer than baseball has been America's Pastime.  Number-crunching is far more than just a quantitative exercise in counting.  The qualitative component of statistics – discerning what the numbers mean, analyzing them to discover predictive patterns and trends – is the very basis of data-driven decision making.

“The Red Sox,” as Miller explained, “are certainly exemplars of the data and analytic team-building methodology” chronicled in Moneyball: The Art of Winning an Unfair Game, the 2003 book by Michael Lewis.  Red Sox General Manager Theo Epstein has always been an advocate of so-called evidence-based baseball, or baseball analytics, pioneered by Bill James, the baseball writer, historian, statistician, current Red Sox consultant, and founder of Sabermetrics.

In another book that Miller and I both highly recommend, Super Crunchers, author Ian Ayres explained that “Bill James challenged the notion that baseball experts could judge talent simply by watching a player.  James's simple but powerful thesis was that data-based analysis in baseball was superior to observational expertise.  James's number-crunching approach was particular anathema to scouts.” 

“James was baseball's herald,” continues Ayres, “of data-driven decision making.”

 

The Drunkard's Walk

As Mlodinow explains in the prologue: “The title The Drunkard's Walk comes from a mathematical term describing random motion, such as the paths molecules follow as they fly through space, incessantly bumping, and being bumped by, their sister molecules.  The surprise is that the tools used to understand the drunkard's walk can also be employed to help understand the events of everyday life.”

Later in the book, Mlodinow describes the hidden effects of randomness by discussing how to build a mathematical model for the probability that a baseball player will hit a home run: “The result of any particular at bat depends on the player's ability, of course.  But it also depends on the interplay of many other factors: his health, the wind, the sun or the stadium lights, the quality of the pitches he receives, the game situation, whether he correctly guesses how the pitcher will throw, whether his hand-eye coordination works just perfectly as he takes his swing, whether that brunette he met at the bar kept him up too late, or the chili-cheese dog with garlic fries he had for breakfast soured his stomach.”

“If not for all the unpredictable factors,” continues Mlodinow, “a player would either hit a home run on every at bat or fail to do so.  Instead, for each at bat all you can say is that he has a certain probability of hitting a home run and a certain probability of failing to hit one.  Over the hundreds of at bats he has each year, those random factors usually average out and result in some typical home run production that increases as the player becomes more skillful and then eventually decreases owing to the same process that etches wrinkles in his handsome face.  But sometimes the random factors don't average out.  How often does that happen, and how large is the aberration?”
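Mlodinow's closing question can be explored with a small simulation.  What follows is only a sketch under assumed numbers: the 6 percent home run probability, the 500 at bats per season, and the 2,000 simulated seasons are my own illustrative choices, not figures from the book.  Every at bat is treated as an independent event with the same probability, which is a simplification of the many factors Mlodinow lists.

```python
import random
import statistics

# Assumed values for illustration only (not taken from Mlodinow's book).
HOME_RUN_PROBABILITY = 0.06   # chance of a home run in any single at bat
AT_BATS_PER_SEASON = 500      # "hundreds of at bats" each year
SEASONS = 2000                # number of simulated seasons

rng = random.Random(2009)     # fixed seed so the run is repeatable


def simulate_season(rng):
    """Count the home runs in one season of independent at bats."""
    return sum(
        1
        for _ in range(AT_BATS_PER_SEASON)
        if rng.random() < HOME_RUN_PROBABILITY
    )


season_totals = [simulate_season(rng) for _ in range(SEASONS)]

# The "typical home run production" the player's skill level implies.
expected_total = HOME_RUN_PROBABILITY * AT_BATS_PER_SEASON
average_total = statistics.mean(season_totals)

print(f"Expected per season:  {expected_total:.1f}")
print(f"Average over {SEASONS} seasons: {average_total:.1f}")
print(f"Worst season: {min(season_totals)}   Best season: {max(season_totals)}")
```

Across many seasons the average settles close to the expected total, yet the best and worst individual seasons sit far apart even though the player's underlying ability never changes.  That gap is the aberration Mlodinow is asking about.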

 

Conclusion

I have heard some (not Mlodinow or anyone else mentioned in this post) argue that data quality is an irrelevant issue.  The basis of their argument is that poor quality data are simply random factors that, in any data set of statistically significant size, will usually average out and therefore have a negligible effect on any data-based decisions. 

However, the random factors don't always average out.  It is important not only to measure exactly how often poor quality data occur, but also to acknowledge how large an aberration poor quality data can be, especially in data-driven decision making.
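One way to test the “it all averages out” argument is to compare two kinds of defects on the same data.  The sketch below is an illustration under my own assumptions, not a measurement of any real data set: the order amounts are randomly generated, the first defect is symmetric noise (the kind the argument assumes), and the second is a one-sided defect in which every 20th amount was lost and stored as zero.

```python
import random

rng = random.Random(42)   # fixed seed so the run is repeatable
RECORDS = 100_000


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical "true" order amounts centered on 100.
true_values = [rng.gauss(100, 15) for _ in range(RECORDS)]

# Defect 1: symmetric random noise, which the "averages out" argument assumes.
noisy_values = [value + rng.gauss(0, 15) for value in true_values]

# Defect 2: a one-sided error, where every 20th amount was stored as 0.
defaulted_values = [
    0.0 if index % 20 == 0 else value
    for index, value in enumerate(true_values)
]

true_mean = mean(true_values)
noisy_mean = mean(noisy_values)
defaulted_mean = mean(defaulted_values)

print(f"True average:                  {true_mean:.2f}")
print(f"Average with symmetric noise:  {noisy_mean:.2f}")
print(f"Average with zeroed amounts:   {defaulted_mean:.2f}")
```

In this sketch the symmetric noise barely moves the average, but the zeroed amounts drag it down by roughly five percent, and adding more records does not shrink that gap.  Defects that all lean in the same direction do not cancel each other out, no matter how large the data set.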

As every citizen of Red Sox Nation is taught from birth, the only acceptable opinion of our American League East Division rivals, the New York Yankees, is encapsulated in the chant heard throughout the baseball season (and not just at Fenway Park):

“Yankees Suck!”

From its inception, every organization bases its day-to-day business decisions on its data.  This decision-critical information drives the operational, tactical, and strategic initiatives essential to the enterprise's mission to survive and thrive in today's highly competitive and rapidly evolving marketplace.

It doesn't quite roll off the tongue as easily, but a chant heard throughout these enterprise information initiatives is:

“Poor Quality Data Sucks!”

Books Recommended by Red Sox Nation

Mind Game: How the Boston Red Sox Got Smart, Won a World Series, and Created a New Blueprint for Winning

Feeding the Monster: How Money, Smarts, and Nerve Took a Team to the Top

Theology: How a Boy Wonder Led the Red Sox to the Promised Land

Now I Can Die in Peace: How The Sports Guy Found Salvation Thanks to the World Champion (Twice!) Red Sox

Commendable Comments (Part 3)

In a July 2008 blog post on Men with Pens (one of the Top 10 Blogs for Writers 2009), James Chartrand explained:

“Comment sections are communities strengthened by people.”

“Building a blog community creates a festival of people” where everyone can, as Chartrand explained, “speak up with great care and attention, sharing thoughts and views while openly accepting differing opinions.”

I agree with James (and not just because of his cool first name) – my goal for this blog is to foster an environment in which a diversity of viewpoints is freely shared without bias.  Everyone is invited to get involved in the discussion and have an opportunity to hear what others have to offer.  This blog's comment section has become a community strengthened by your contributions.

This is the third entry in my ongoing series celebrating my heroes – my readers.

 

Commendable Comments

On The Fragility of Knowledge, Andy Lunn commented:

“In my field of Software Development, you simply cannot rest and rely on what you know.  The technology you master today will almost certainly evolve over time and this can catch you out.  There's no point being an expert in something no one wants any more!  This is not always the case, but don't forget to come up for air and look around for what's changing.

I've lost count of the number of organizations I've seen who have stuck with a technology that was fresh 15 years ago and a huge stagnant pot of data, who are now scrambling to come up to speed with what their customers expect.  Throwing endless piles of cash at the problem, hoping to catch up.

What am I getting at?  The secret I've learned is to adapt.  This doesn't mean jump on every new fad immediately, but be aware of it.  Follow what's trending, where the collective thinking is heading and most importantly, what do your customers want?

I just wish more organizations would think like this and realize that the systems they create, the data they hold, and the customers they have are in a constant state of flux.  They are all projects that need care and attention.  All subject to change, there's no getting away from it, but small, well planned changes are a lot less painful, trust me.”

On DQ-Tip: “Data quality is primarily about context not accuracy...”, Stephen Simmonds commented:

“I have to agree with Rick about data quality being in the eye of the beholder – and with Henrik on the several dimensions of quality.

A theme I often return to is 'what does the business want/expect from data?' – and when you hear them talk about quality, it's not just an issue of accuracy.  The business stakeholder cares – more than many seem to notice – about a number of other issues that are squarely BI concerns:

– Timeliness ('WHEN I want it')
– Format ('how I want to SEE it') – visualization, delivery channels
– Usability ('how I want to then make USE of it') – being able to extract information from a report (say) for other purposes
– Relevance ('I want HIGHLIGHTED the information that is meaningful to me')

And so on.  Yes, accuracy is important, and it messes up your effectiveness when delivering inaccurate information.  But that's not the only thing a business stakeholder can raise when discussing issues of quality.  A report can be rejected as poor quality if it doesn't adequately meet business needs in a far more general sense.  That is the constant challenge for a BI professional.”

On Mistake Driven Learning, Ken O'Connor commented:

“There is a Chinese proverb that says:

'Tell me and I'll forget; Show me and I may remember; Involve me and I'll understand.'

I have found the above to be very true, especially when seeking to brief a large team on a new policy or process.  Interaction with the audience generates involvement and a better understanding.

The challenge facing books, whitepapers, blog posts etc. is that they usually 'Tell us,' they often 'Show us,' but they seldom 'Involve us.'

Hence, we struggle to remember, and struggle even more to understand.  We learn best by 'doing' and by making mistakes.”

You Are Awesome

Thank you very much for your comments.  For me, the best part of blogging is the dialogue and discussion provided by interactions with my readers.  Since there have been so many commendable comments, please don't be offended if your commendable comment hasn't been featured yet.  Please keep on commenting and stay tuned for future entries in the series.

By the way, even if you have never posted a comment on my blog, you are still awesome — feel free to tell everyone I said so.

 

Related Posts

Commendable Comments (Part 1)

Commendable Comments (Part 2)

Blog-Bout: “Risk” versus “Monopoly”

A “blog-bout” is a good-natured debate between two bloggers.  This blog-bout is between Jim Harris and Phil Simon, where they debate which board game is the better metaphor for an Information Technology (IT) project: “Risk” or “Monopoly.”

 

Why “Risk” is a better metaphor for an IT Project

By Jim Harris

IT projects and “Risk” have a great deal in common.  I thought long and hard about this while screaming obscenities and watching professional sports on television, the source of all of my great thinking.  I came up with five world dominating reasons.

1. Both things start with the players marking their territory.  In Risk, the game begins with the players placing their “armies” on the territories they will initially occupy.  On IT projects, the different groups within the organization will initially claim their turf. 

Please note that the term “Information Technology” is being used in a general sense to describe a project (e.g., Data Quality or Master Data Management) and should not be confused with the IT group within an organization.  At a very high level, the Business and IT are the internal groups representing the business and technical stakeholders on a project.

The Business usually owns the data and understands its meaning and use in the day-to-day operation of the enterprise.  IT usually owns the hardware and software infrastructure of the enterprise's technical architecture. 

Both groups can claim they are only responsible for what they own, resist collaborating with the “other side” and therefore create organizational barriers as fiercely defended as the continental borders of Europe and Asia in Risk.

2. In both, there are many competing strategies.  In Risk, the official rules of the game include some basic strategies and over the years many players have developed their own fool-proof plans to guarantee victory.  Some strategies advocate focusing on controlling entire continents, while others advise fortifying your borders by invading and occupying neighboring territories.  And my blog-bout competitor Phil Simon half-jokingly claims that the key to winning Risk is securing the island nation of Madagascar.

On IT projects, you often hear a lot of buzzwords and strategies bandied about, such as Lean, Agile, Six Sigma, and Kaizen, to name but a few.  Please understand – I am an advocate for methodology and best practices, and there are certainly many excellent frameworks out there, including the paradigms I just mentioned.

However, a general problem that I have with most frameworks is their tendency to adopt a one-size-fits-all strategy, which I believe is an approach that is doomed to fail.  Any implemented framework must be customized to adapt to an organization’s unique culture. 

In part, this is necessary because changes of any kind will be met with initial resistance.  Forcing a one-size-fits-all approach all but tells the organization that everything it is currently doing is wrong, which will of course only increase the resistance to change.

Starting with a framework simply provides a reference of best practices and recommended options of what has worked on successful IT projects.  The framework should be reviewed in order to determine what can be learned from it and to select what will work in the current environment and what simply won't.     

3. Pyrrhic victories are common during both endeavors.  In Risk, sacrificing everything to win a single battle or to defend your favorite territory can ultimately lead you to lose the war.  Political fiefdoms can undermine what could otherwise have been a successful IT project.  Do not underestimate the unique challenges of your corporate culture.

Obviously, business, technical and data issues will all come up from time to time, and there will likely be disagreements regarding how these issues should be prioritized.  Some issues will likely affect certain stakeholders more than others. 

Keeping data and technology aligned with business processes requires getting people aligned and free to communicate their concerns.  Coordinating discussions with all of the stakeholders and maintaining open communication can prevent one stakeholder's Pyrrhic victory from causing the overall project to fail.

4. Alliances are the key to true victory.  In Risk, it is common for players to form alliances by combining their resources and coordinating their efforts in order to defend their shared borders or to eliminate a common enemy. 

On IT projects, knowledge about data, business processes, and supporting technology is spread throughout the organization.  Neither the Business nor IT alone has all of the information required to achieve success.

Successful projects are driven by an executive management mandate for the Business and IT to forge an alliance of ongoing and iterative collaboration throughout the entire project.

5. The outcomes of both are too often left to chance.  IT projects are complex, time-consuming, and expensive enterprise initiatives.  Success requires people taking on the challenge united by collaboration, guided by an effective methodology, and implementing a solution using powerful technology.

But the complexity of an IT project can sometimes work against your best intentions.  It is easy to get pulled into the mechanics of documenting the business requirements and functional specifications, drafting the project plan and then charging ahead on the common mantra: “We planned the work, now we work the plan.”

Once an IT project achieves some momentum, it can take on a life of its own and the focus becomes more and more about making progress against the tasks in the project plan, and less and less on the project's actual business goals.  Typically, this leads to another all too common mantra: “Code it, test it, implement it into production, and then declare victory.”

In Risk, the outcomes are literally determined by a roll of the dice.  If you allow your IT project to lose sight of its business goals, then you treat it like a game of chance.  And to paraphrase Albert Einstein:

“Do not play dice with IT Projects.”

Why “Monopoly” is a better metaphor for an IT Project

By Phil Simon

IT projects and “Monopoly” have a great deal in common.  I thought long and hard about this at the gym, the source of all of my great thinking.  I came up with six really smashing reasons.

1. Both things take much longer than originally expected.  IT projects typically take much longer than expected for a wide variety of reasons.  Rare is the project that finishes on time (with expected functionality delivered).

The same holds true for Monopoly.  Remember when you were a kid and you wanted to play a quick game?  Now, I consider the term “a quick game of Monopoly” to be the very definition of an oxymoron.  You’d better block off about four to six hours for a proper game.  Unforeseen complexities will doubtlessly delay even the best intentions.

2. During both endeavors, screaming matches typically erupt.  Many projects become tense.  I remember one in which two participants nearly came to blows.  Most projects have key players engage in very heated debates over strategic vision and execution.

With Monopoly, especially after the properties are divvied up, players scream and yell over what constitutes a “fair” deal.  “What do you mean Boardwalk for Ventnor Avenue and Pennsylvania Railroad isn’t reasonable?  IT’S COMPLETELY FAIR!”  Debates like this are the rule, not the exception.

3. While the basic rules may be the same, different people play by different rules.  The vast majority of projects on which I have worked have had the usual suspects: steering committees, executive sponsors, PMOs, different stages of testing, and ultimately system activation.  However, different organizations often try to do things in vastly different ways.  For example, on two similar projects in different organizations, you are likely to find differences with respect to:

  • the number of internal and external folks assigned to a project
  • the project’s timeline and budget
  • project objectives

By the same token, people play Monopoly in somewhat different ways.  Many don’t know about the auction rule.  Others replenish Free Parking with a new $500 bill after someone lands on it.  Also, many people disregard altogether the property assessment card while sticklers like me assess penalties when that vaunted red card appears.

4. Personal relationships can largely determine the outcome in both.  Negotiation is key on IT projects.  Clients negotiate rates, prices, and responsibilities with consulting vendors and/or software vendors.

In Monopoly, personal rivalries play a big part in who makes a deal with whom.  Often players chime in (uninvited, of course) with their opinions on potential deals, without a doubt to affect the outcome.

5. Little things really matter, especially at the end.  Towards the end of an IT project, snakes in the woodwork often come out to bite people when they least expect it.  A tightly staffed or planned project may not be able to withstand a relatively minor problem, especially if the go-live date is non-negotiable.

In Monopoly, the same holds true.  Laugh all you want when your opponent builds hotels on Mediterranean Avenue and Baltic Avenue, but at the end of the game those $250 and $450 charges can really hurt, especially when you’re low on cash.

6. Many times, each does not end; it is merely abandoned.  A good percentage of projects have their plugs pulled prior to completion.  A CIO may become tired of an interminable project and decide to simply end it before costs skyrocket even further.

I’d say that about half of the Monopoly games that I’ve played in the last fifteen years have also been called by “executive decision.”  The writing is on the board, as 1 a.m. rolls around and only two players remain.  Often player X simply cedes the game to player Y.

 

You are the Referee

All bouts require a referee.  Blog-bouts are refereed by the readers.  Therefore, please cast your vote in the poll and also weigh in on this debate by posting a comment below.  Since a blog-bout is co-posted, your comments will be copied (with full attribution) into the comments section of both of the blogs co-hosting this blog-bout.

 

About Jim Harris

Jim Harris is the Blogger-in-Chief at Obsessive-Compulsive Data Quality (OCDQ), which is an independent blog offering a vendor-neutral perspective on data quality.  Jim is also an independent consultant, speaker, writer and blogger with over 15 years of professional services and application development experience in data quality (DQ), data integration, data warehousing (DW), business intelligence (BI), customer data integration (CDI), and master data management (MDM).  Jim is also a contributing writer to Data Quality Pro, the leading online magazine and community resource dedicated to data quality professionals.

 

About Phil Simon

Phil Simon is the author of the acclaimed book Why New Systems Fail: Theory and Practice Collide and the highly anticipated upcoming book The Next Wave of Technologies: Opportunities from Chaos.  Phil is also an independent systems consultant and a dynamic public speaker for hire focusing on how organizations use technology.  Phil also writes for a number of technology media outlets.

Mistake Driven Learning

In his Copyblogger article How to Stop Making Yourself Crazy with Self-Editing, Sean D'Souza explains:

“Competency is a state of mind you reach when you’ve made enough mistakes.”

One of my continuing challenges is staying informed about the latest trends in data quality and its related disciplines, including Master Data Management (MDM), Dystopian Automated Transactional Analysis (DATA), and Data Governance (DG) – I am fairly certain that one of those three things isn't real, but I haven't figured out which one yet.

I read all of the latest books, as well as the books that I was supposed to have read years ago, when I was just pretending to have read all of the latest books.  I also read the latest articles, whitepapers, and blogs.  And I go to as many conferences as possible.

The basis of this endless quest for knowledge is fear.  Please understand – I have never been afraid to look like an idiot.  After all, we idiots are important members of society – we make everyone else look smart by comparison. 

However, I also market myself as a data quality expert.  Therefore, when I consult, speak, write, or blog, I am always at least a little afraid of not getting things quite right.  Being afraid of making mistakes can drive you crazy. 

But as a wise man named Seal Henry Olusegun Olumide Adeola Samuel (wisely better known by only his first name) lyrically taught us back in 1991:

“We're never gonna survive unless we get a little crazy.”

“It’s not about getting things right in your brain,” explains D’Souza, “it’s about getting things wrong.  The brain has to make hundreds, even thousands of mistakes — and overcome those mistakes — to be able to reach a level of competency.”

 

So, get a little crazy, make a lot of mistakes, and never stop learning.

 

Related Posts

The Fragility of Knowledge

The Wisdom of Failure

A Portrait of the Data Quality Expert as a Young Idiot

The Nine Circles of Data Quality Hell

Tweet 2001: A Social Media Odyssey

HAL 9000: “I am putting myself to the fullest possible use, which is all I think that any conscious entity can ever hope to do.”

As I get closer and closer to my 2001st tweet on Twitter, I wanted to pause for some quiet reflection on my personal odyssey in social media – but then I decided to blog about it instead.

 

The Dawn of OCDQ

Except for LinkedIn, my epic drama of social media adventure and exploration started with my OCDQ blog.

In my Data Quality Pro article Blogging about Data Quality, I explained why I started this blog and discussed some of my thoughts on blogging.  Most importantly, I explained that I am neither a blogging expert nor a social media expert.

But now that I have been blogging and using social media for over six months, I feel more comfortable sharing my thoughts and personal experiences with social media without worrying about sounding like too much of an idiot (no promises, of course).

 

LinkedIn

My social media odyssey began in 2007 when I created my account on LinkedIn, which, I admit, I initially viewed as just an online resume.  I put little effort into my profile, only made a few connections, and only joined a few groups.

Last year (motivated by the economic recession), I started using LinkedIn more extensively.  I updated my profile with a complete job history, asked my colleagues for recommendations, expanded my network with more connections, and joined more groups.  I also used LinkedIn applications (e.g. Reading List by Amazon and Blog Link) to further enhance my profile.

My favorite feature is the LinkedIn Groups, which not only provide an excellent opportunity to connect with other users, but also provide Discussions, News (including support for RSS feeds), and Job Postings.

By no means a comprehensive list, here are some LinkedIn Groups that you may be interested in:

For more information about LinkedIn features and benefits, check out the following posts on the LinkedIn Blog:

 

Twitter

Shortly after launching my blog in March 2009, I created my Twitter account to help promote my blog content.  In blogging, content is king, but marketing is queen.  LinkedIn (via group news feeds) is my leading source of blog visitors from social media, but Twitter isn't far behind. 

However, as Michele Goetz of Brain Vibe explained in her blog post Is Twitter an Effective Direct Marketing Tool?, Twitter has a click-through rate equivalent to direct mail.  Citing research from Pear Analytics, a “useful” tweet was found to have a shelf life of about one hour with about a 1% click-through rate on links.

In his blog post Is Twitter Killing Blogging?, Ajay Ohri of Decision Stats examined whether Twitter was a complement or a substitute for blogging.  I created a Data Quality on Twitter page on my blog in order to illustrate what I have found to be the complementary nature of tweeting and blogging. 

My ten blog posts receiving the most tweets (tracked using the Retweet Button from TweetMeme):

  1. The Nine Circles of Data Quality Hell 13 Tweets
  2. Adventures in Data Profiling (Part 1) 13 Tweets
  3. Fantasy League Data Quality 12 Tweets
  4. Not So Strange Case of Dr. Technology and Mr. Business 12 Tweets 
  5. The Fragility of Knowledge 11 Tweets
  6. The General Theory of Data Quality 9 Tweets
  7. The Very True Fear of False Positives 8 Tweets
  8. Data Governance and Data Quality 8 Tweets
  9. Adventures in Data Profiling (Part 3) 8 Tweets
  10. Data Quality: The Reality Show? 7 Tweets

Most of my social networking is done using Twitter (with LinkedIn being a close second).  I have also found Twitter to be great for doing research, which I complement with RSS subscriptions to blogs.

To search Twitter for data quality content:

If you are new to Twitter, then I would recommend reading the following blog posts:

 

Facebook

I also created my Facebook account shortly after launching my blog.  Although I almost exclusively use social media for professional purposes, I do use Facebook as a way to stay connected with family and friends. 

I created a page for my blog to separate my professional and personal aspects of Facebook without the need to manage multiple accounts.  Additionally, this allows you to become a “fan” of my blog without requiring you to also become my “friend.”

A quick note on Facebook games, polls, and trivia: I do not play them.  With my obsessive-compulsive personality, I have to ignore them.  Therefore, please don't be offended if, for example, I have ignored your invitation to play Mafia Wars.

By no means a comprehensive list, here are some Facebook Pages or Groups that you may be interested in:

 

Additional Social Media Websites

Although LinkedIn, Twitter, and Facebook are my primary social media websites, I also have accounts on three of the most popular social bookmarking websites: Digg, StumbleUpon, and Delicious.

Social bookmarking can be a great promotional tool that can help blog content go viral.  However, niche content is almost impossible to get to go viral.  Data quality is not just a niche – if technology blogging was a Matryoshka (a.k.a. Russian nested) doll, then data quality would be the last, innermost doll. 

This doesn't mean that data quality isn't an important subject – it just means that you will not see a blog post about data quality hitting the front pages of mainstream social bookmarking websites anytime soon.  Dylan Jones of Data Quality Pro created DQVote, which is a social bookmarking website dedicated to sharing data quality community content.

I also have an account on FriendFeed, which is an aggregator that can consolidate content from other social media websites, blogs, or anything providing an RSS feed.  My blog posts and my updates from other social media websites (except for Facebook) are automatically aggregated.  On Facebook, my personal page displays my FriendFeed content.

 

Social Media Tools and Services

Social media tools and services that I personally use (listed in no particular order):

  • Flock – The Social Web Browser Powered by Mozilla
  • TweetDeck – Connecting you with your contacts across Twitter, Facebook, MySpace and more
  • Digsby – Digsby = Instant Messaging (IM) + E-mail + Social Networks
  • Ping.fm – Update all of your social networks at once
  • HootSuite – The professional Twitter client
  • Twitterfeed – Feed your blog to Twitter
  • Google FeedBurner – Provide an e-mail subscription to your blog
  • TweetMeme – Add a Retweet Button to your blog
  • Squarespace Blog Platform – The secret behind exceptional websites

 

Social Media Strategy

As Darren Rowse of ProBlogger explained in his blog post How I use Social Media in My Blogging, Chris Brogan developed a social media strategy using the metaphor of a Home Base with Outposts.

“A home base,” explains Rowse, “is a place online that you own.”  For example, your home base could be your blog or your company's website.  “Outposts,” continues Rowse, “are places that you have an online presence out in other parts of the web that you might not own.”  For example, your outposts could be your LinkedIn, Twitter, and Facebook accounts.

According to Rowse, your Outposts will make your Home Base stronger by providing:

“Relationships, ideas, traffic, resources, partnerships, community and much more.”

Social Karma

An effective social media strategy is essential for both companies and individual professionals.  Using social media can help promote you, your expertise, your company and your products and services.

However, too many companies and individuals have a selfish social media strategy.

You should not use social media exclusively for self-promotion.  You should view social media as Social Karma.

If you can focus on helping others when you use social media, then you will get much more back than just a blog reader, a LinkedIn connection, a Twitter follower, a Facebook friend, or even a potential customer.

Yes, I use social media to promote myself and my blog content.  However, more than anything else, I use social media to listen, to learn, and to help others when I can.

 

Please Share Your Social Media Odyssey

As always, I am interested in hearing from you.  What have been your personal experiences with social media?

DQ-Tip: “Data quality is primarily about context not accuracy...”

Data Quality (DQ) Tips is an OCDQ regular segment.  Each DQ-Tip is a clear and concise data quality pearl of wisdom.

“Data quality is primarily about context not accuracy. 

Accuracy is part of the equation, but only a very small portion.”

This DQ-Tip is from Rick Sherman's recent blog post summarizing the TDWI Boston Chapter Meeting at MIT.

 

I define data using the Dragnet definition – it is “just the facts” collected as an abstract description of the real-world entities that the enterprise does business with (e.g. customers, vendors, suppliers).  A common definition for data quality is fitness for the purpose of use.  The common challenge is that data has multiple uses – each with its own fitness requirements.  Viewing each intended use as the information that is derived from data, I define information as data in use or data in action.

Alternatively, information can be defined as data in context.

Quality, as Sherman explains, “is in the eyes of the beholder, i.e. the business context.”

 

Related Posts

DQ-Tip: “Don't pass bad data on to the next person...”

The General Theory of Data Quality

The Data-Information Continuum

Commendable Comments (Part 2)

In a recent guest post on ProBlogger, Josh Hanagarne “quoted” Jane Austen:

“It is a truth universally acknowledged, that a blogger in possession of a good domain must be in want of some worthwhile comments.”

“The most rewarding thing has been that comments,” explained Hanagarne, “led to me meeting some great people I possibly never would have known otherwise.”  I wholeheartedly echo that sentiment. 

This is the second entry in my ongoing series celebrating my heroes – my readers.

 

Commendable Comments

Proving that comments are the best part of blogging, on The Data-Information Continuum, Diane Neville commented:

“This article is intriguing. I would add more still.

A most significant quote:  'Data could be considered a constant while Information is a variable that redefines data for each specific use.'

This tells us that Information draws from a snapshot of a Data store.  I would state further that the very Information [specification] is – in itself – a snapshot.

The earlier quote continues:  'Data is not truly a constant since it is constantly changing.'

Similarly, it is a business reality that 'Information is not truly a constant since it is constantly changing.'

The article points out that 'The Data-Information Continuum' implies a many-to-many relationship between the two.  This is a sensible CONCEPTUAL model.

Enterprise Architecture is concerned as well with its responsibility for application quality in service to each Business Unit/Initiative.

For example, in the interest of quality design in Application Architecture, an additional LOGICAL model must be maintained between a then-current Information requirement and the particular Data (snapshots) from which it draws.  [Snapshot: generally understood as captured and frozen – and uneditable – at a particular point in time.]  Simply put, Information Snapshots have a PARENT RELATIONSHIP to the Data Snapshots from which they draw.

Analyzing this further, refer to this further piece of quoted wisdom (from section 'Subjective Information Quality'):  '...business units and initiatives must begin defining their Information...by using...Data...as a foundation...necessary for the day-to-day operation of each business unit and initiative.'

From logically-related snapshots of Information to the Data from which it draws, we can see from this quote that yet another PARENT/CHILD relationship exists...that from Business Unit/Initiative Snapshots to the Information Snapshots that implement whatever goals are the order of the day.  But days change.

If it is true that 'Data is not truly a constant since it is constantly changing,' and if we can agree that Information is not truly a constant either, then we can agree to take a rational and profitable leap to the truth that neither is a Business Unit/Initiative...since these undergo change as well, though they represent more slowly-changing dimensions.

Enterprises have an increasing responsibility for regulatory/compliance/archival systems that will qualitatively reproduce the ENTIRE snapshot of a particular operational transaction at any given point in time.

Thus, the Enterprise Architecture function has before it a daunting task:  to devise a holistic process that can SEAMLESSLY model the correct relationship of snapshots between Data (grandchild), Information (parent) and Business Unit/Initiative (grandparent).

There need be no conversion programs or redundant, throw-away data structures contrived to bridge the present gap.  The ability to capture the activities resulting from the undeniable point-in-time hierarchy among these entities is where tremendous opportunities lie.”

On Missed It By That Much, Vish Agashe commented:

“My favorite quote is 'Instead of focusing on the exceptions – focus on the improvements.'

I think that it is really important to define incremental goals for data quality projects and track the progress through percentage improvement over a period of time.

I think it is also important to manage the expectations that the goal is not necessarily to reach 100% (which will be extremely difficult if not impossible) clean data but the goal is to make progress to a point where the purpose for cleaning the data can be achieved in much better way than had the original data been used.

For example, if marketing wanted to use the contact data to create a campaign for those contacts which have a certain ERP system installed on-site.  But if the ERP information on the contact database is not clean (it is free text, in some cases it is absent etc...) then any campaign run on this data will reach only X% contacts at best (assuming only X% of contacts have ERP which is clean)...if the data quality project is undertaken to clean this data, one needs to look at progress in terms of % improvement.  How many contacts now have their ERP field cleaned and legible compared to when we started etc...and a reasonable goal needs to be set based on how much marketing and IT is willing to invest in these issues (which in turn could be based on ROI of the campaign based on increased outreach).”
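The percentage-improvement tracking that Agashe describes can be sketched in a few lines of Python.  This is only a minimal illustration under stated assumptions: the list of standardized ERP names, the rule for what counts as a “clean” value, and the sample contact values are all hypothetical and are not taken from his comment.

```python
from typing import Iterable, Optional

# Hypothetical set of standardized ERP names; a real project would define its own.
KNOWN_ERP_SYSTEMS = {"SAP", "Oracle", "Microsoft Dynamics"}


def is_clean(erp_value: Optional[str]) -> bool:
    """Assumed rule: a value is clean if it is present and matches a standardized name."""
    return erp_value is not None and erp_value.strip() in KNOWN_ERP_SYSTEMS


def percent_clean(erp_values: Iterable[Optional[str]]) -> float:
    """Return the percentage of contacts whose ERP field is clean."""
    values = list(erp_values)
    if not values:
        return 0.0
    clean_count = sum(1 for value in values if is_clean(value))
    return 100.0 * clean_count / len(values)


# Illustrative ERP field values for the same eight contacts, before and after cleanup.
before = ["SAP", "sap r/3 (old?)", None, "Oracle", "", "oracle apps", "unknown", "SAP"]
after = ["SAP", "SAP", None, "Oracle", "", "Oracle", "unknown", "SAP"]

baseline = percent_clean(before)   # 3 of 8 contacts are clean
current = percent_clean(after)     # 5 of 8 contacts are clean
improvement = current - baseline   # progress measured in percentage points

print(f"Baseline: {baseline:.1f}% clean")
print(f"Current: {current:.1f}% clean")
print(f"Improvement: {improvement:.1f} percentage points")
```

Measuring against a baseline like this keeps the focus on progress toward a reasonable goal rather than on reaching 100% clean data.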

Proving that my readers are way smarter than I am, on The General Theory of Data Quality, John O'Gorman commented:

“My theory of the data, information, knowledge continuum is more closely related to the element, compound, protein, structure arc.

In my world, there is no such thing as 'bad' data, just as there are no 'bad' elements.  Data is either useful or not: the larger the audience that agrees that a string is representative of something they can use, the more that string will be of value to me.

By dint of its existence in the world of human communication and in keeping with my theory, I can assign every piece of data to one of a fixed number of classes, each with characteristics of their own, just like elements in the periodic table.  And, just like the periodic table, those characteristics do not change.  The same 109 usable elements in the periodic table are found and are consistent throughout the universe, and our ability to understand that universe is based on that stability.

Information is simply data in a given context, like a molecule of carbon in flour.  The carbon retains all of its characteristics but the combination with other elements allows it to partake in a whole class of organic behavior. This is similar to the word 'practical' occurring in a sentence: Jim is a practical person or the letter 'p' in the last two words.

Where the analogue bends a bit is a cause of a lot of information management pain, but can be rectified with a slight change in perspective.  Computers (and almost all indexes) have a hard time with homographs: strings that are identical but that mean different things.  By creating fixed and persistent categories of data, my model suffers no such pain.

Take the word 'flies' in the following: 'Time flies like an arrow.' and 'Fruit flies like a pear.'  The data 'flies' can be permanently assigned to two different places, and their use determines which instance is relevant in the context of the sentence.  One instance is a verb, the other a plural noun.

Knowledge, in my opinion, is the ability to recognize, predict and synthesize patterns of information for past, present and future use, and more importantly to effectively communicate those patterns in one or more contexts to one or more audiences.

On one level, the model for information management that I use makes no apparent distinction between the data: we all use nouns, adjectives, verbs and sometimes scalar objects to communicate.  We may compress those into extremely compact concepts but they can all be unraveled to get at elemental components. At another level every distinction is made to insure precision.

The difference between information and knowledge is experiential and since experience is an accumulative construct, knowledge can be layered to appeal to common knowledge, special knowledge and unique knowledge.

Common being the most easily taught and widely applied; Special being related to one or more disciplines and/or special functions; and, Unique to individuals who have their own elevated understanding of the world and so have a need for compact and purpose-built semantic structures.

Going back to the analogue, knowledge is equivalent to the creation by certain proteins of cartilage, the use to which that cartilage is put throughout a body, and the specific shape of the cartilage that forms my nose as unique from the one on my wife's face.

To me, the most important part of the model is at the element level.  If I can convince a group of people to use a fixed set of elemental categories and to reference those categories when they create information, it's amazing how much tension disappears in the design, creation and deployment of knowledge.”

 

Tá mé buíoch díot

Daragh O Brien recently taught me the Irish Gaelic phrase Tá mé buíoch díot, which translates as I am grateful to you.

I am very grateful to all of my readers.  Since there have been so many commendable comments, please don't be offended if your commendable comment hasn't been featured yet.  Please keep on commenting and stay tuned for future entries in the series.

 

Related Posts

Commendable Comments (Part 1)

Commendable Comments (Part 3)

Commendable Comments (Part 1)

Six months ago today, I launched this blog by asking: Do you have obsessive-compulsive data quality (OCDQ)?

As of September 10, here are the monthly traffic statistics provided by my blog platform:

OCDQ Blog Traffic Overview

 

It Takes a Village (Idiot)

In my recent Data Quality Pro article Blogging about Data Quality, I explained why I started this blog.  Blogging provides me a way to demonstrate my expertise.  It is one thing for me to describe myself as an expert and another to back up that claim by allowing you to read my thoughts and decide for yourself.

In general, I have always enjoyed sharing my experiences and insights.  A great aspect to doing this via a blog (as opposed to only via whitepapers and presentations) is the dialogue and discussion provided via comments from my readers.

This two-way conversation not only greatly improves the quality of the blog content, but much more importantly, it helps me better appreciate the difference between what I know and what I only think I know. 

Even an expert's opinions are biased by the practical limits of their personal experience.  Having spent most of my career working with what is now mostly IBM technology, I sometimes have to pause and consider if some of that yummy Big Blue Kool-Aid is still swirling around in my head (since I “think with my gut,” I have to “drink with my head”).

Don't get me wrong – “You're my boy, Blue!” – but there are many other vendors and all of them also offer viable solutions driven by impressive technologies and proven methodologies.

Data quality isn't exactly the most exciting subject for a blog.  Data quality is not just a niche – if technology blogging was a Matryoshka (a.k.a. Russian nested) doll, then data quality would be the last, innermost doll. 

This doesn't mean that data quality isn't an important subject – it just means that you will not see a blog post about data quality hitting the front page of Digg anytime soon.

All blogging is more art than science.  My personal blogging style can perhaps best be described as mullet blogging – not “business in the front, party in the back” but “take your subject seriously, but still have a sense of humor about it.”

My blog uses a lot of metaphors and analogies (and sometimes just plain silliness) to try to make an important (but dull) subject more interesting.  Sometimes it works and sometimes it sucks.  However, I have never been afraid to look like an idiot.  After all, idiots are important members of society – they make everyone else look smart by comparison.

Therefore, I view my blog as a Data Quality Village.  And as the Blogger-in-Chief, I am the Village Idiot.

 

The Rich Stuff of Comments

Earlier this year in an excellent IT Business Edge article by Ann All, David Churbuck of Lenovo explained:

“You can host focus groups at great expense, you can run online surveys, you can do a lot of polling, but you won’t get the kind of rich stuff (you will get from blog comments).”

How very true.  But before we get to the rich stuff of our village, let's first take a look at a few more numbers:

  • Not counting this one, I have published 44 posts on this blog
  • Those blog posts have collectively received a total of 185 comments
  • Only 5 blog posts received no comments
  • 30 comments were actually me responding to my readers
  • 45 comments were from LinkedIn groups (23), SmartData Collective re-posts (17), or Twitter re-tweets (5)

The ten blog posts receiving the most comments:

  1. The Two Headed Monster of Data Matching 11 Comments
  2. Adventures in Data Profiling (Part 4) 9 Comments
  3. Adventures in Data Profiling (Part 2) 9 Comments
  4. You're So Vain, You Probably Think Data Quality Is About You 8 Comments
  5. There are no Magic Beans for Data Quality 8 Comments
  6. The General Theory of Data Quality 8 Comments
  7. Adventures in Data Profiling (Part 1) 8 Comments
  8. To Parse or Not To Parse 7 Comments
  9. The Wisdom of Failure 7 Comments
  10. The Nine Circles of Data Quality Hell 7 Comments

 

Commendable Comments

This post will be the first in an ongoing series celebrating my heroes – my readers.

As Darren Rowse and Chris Garrett explained in their highly recommended ProBlogger book: “even the most popular blogs tend to attract only about a 1 percent commenting rate.” 

Therefore, I am completely in awe of my blog's current 88 percent commenting rate.  Sure, I get my fair share of the simple and straightforward comments like “Great post!” or “You're an idiot!” but I decided to start this series because I am consistently amazed by the truly commendable comments that I regularly receive.
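For the curious, the 88 percent figure appears to follow from the numbers listed earlier, assuming it is calculated as the share of published posts that received at least one comment (44 posts, 5 of which received none).  Here is a minimal Python sketch of that arithmetic; the function name is hypothetical and the calculation method is an assumption, not something stated above.

```python
def share_of_posts_with_comments(total_posts: int, posts_without_comments: int) -> float:
    """Return the percentage of posts that received at least one comment."""
    if total_posts <= 0:
        raise ValueError("total_posts must be positive")
    if not 0 <= posts_without_comments <= total_posts:
        raise ValueError("posts_without_comments must be between 0 and total_posts")
    posts_with_comments = total_posts - posts_without_comments
    return 100.0 * posts_with_comments / total_posts


# Figures taken from the list above: 44 published posts, 5 with no comments.
rate = share_of_posts_with_comments(total_posts=44, posts_without_comments=5)
print(f"{rate:.1f}% of posts received at least one comment")
```

Note that this measures commented posts, which is a different thing from the share of readers who leave a comment.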

On The Data Quality Goldilocks Zone, Daragh O Brien commented:

“To take (or stretch) your analogy a little further, it is also important to remember that quality is ultimately defined by the consumers of the information.  For example, if you were working on a customer data set (or 'porridge' in Goldilocks terms) you might get it to a point where Marketing thinks it is 'just right' but your Compliance and Risk management people might think it is too hot and your Field Sales people might think it is too cold.  Declaring 'Mission Accomplished' when you have addressed the needs of just one stakeholder in the information can often be premature.

Also, one of the key learnings that we've captured in the IAIDQ over the past 5 years from meeting with practitioners and hosting our webinars is that, just like any Change Management effort, information quality change requires you to break the challenge into smaller deliverables so that you get regular delivery of 'just right' porridge to the various stakeholders rather than boiling the whole thing up together and leaving everyone with a bad taste in their mouths.  It also means you can more quickly see when you've reached the Goldilocks zone.”

On Data Quality Whitepapers are Worthless, Henrik Liliendahl Sørensen commented:

“Bashing in blogging must be carefully balanced.

As we all tend to find many things from gurus to tools in our own country, I have also found one of my favourite sayings from Søren Kierkegaard:

If One Is Truly to Succeed in Leading a Person to a Specific Place, One Must First and Foremost Take Care to Find Him Where He is and Begin There.

This is the secret in the entire art of helping.

Anyone who cannot do this is himself under a delusion if he thinks he is able to help someone else.  In order truly to help someone else, I must understand more than he–but certainly first and foremost understand what he understands.

If I do not do that, then my greater understanding does not help him at all.  If I nevertheless want to assert my greater understanding, then it is because I am vain or proud, then basically instead of benefiting him I really want to be admired by him.

But all true helping begins with a humbling.

The helper must first humble himself under the person he wants to help and thereby understand that to help is not to dominate but to serve, that to help is not to be the most dominating but the most patient, that to help is a willingness for the time being to put up with being in the wrong and not understanding what the other understands.”

On All I Really Need To Know About Data Quality I Learned In Kindergarten, Daniel Gent commented:

“In kindergarten we played 'Simon Says...'

I compare it as a way of following the requirements or business rules.

Simon says raise your hands.

Simon says touch your nose.

Touch your feet.

With that final statement you learned very quickly in kindergarten that you can be out of the game if you are not paying attention to what is being said.

Just like in data quality, to have good accurate data and to keep the business functioning properly you need to pay attention to what is being said, what the business rules are.

So when Simon says touch your nose, don't be touching your toes, and you'll stay in the game.”

Since there have been so many commendable comments, I could only list a few of them in the series debut.  Therefore, please don't be offended if your commendable comment didn't get featured in this post.  Please keep on commenting and stay tuned for future entries in the series.

 

Because of You

As Brian Clark of Copyblogger explains, The Two Most Important Words in Blogging are “You” and “Because.”

I wholeheartedly agree, but prefer to paraphrase it as: Blogging is “because of you.” 

Not you meaning me, the blogger, but you meaning you, the reader.

Thank You.

 

Related Posts

Commendable Comments (Part 2)

Commendable Comments (Part 3)


The Wisdom of Failure

Earlier this month, I had the honor of being interviewed by Ajay Ohri on his blog Decision Stats, which is an excellent source of insights on business intelligence and data mining as well as interviews with industry thought leaders and chief evangelists.

One of the questions Ajay asked me during my interview was what methods and habits I would recommend to young analysts just starting in the business intelligence field.  Part of my response was:

“Don't be afraid to ask questions or admit when you don't know the answers.  The only difference between a young analyst just starting out and an expert is that the expert has already made and learned from all the mistakes caused by being afraid to ask questions or admitting when you don't know the answers.”

It is perhaps one of life’s cruelest paradoxes that some lessons simply cannot be taught, but instead have to be learned through the pain of making mistakes.  To err is human, but not all humans learn from their errors.  In fact, some of us find it extremely difficult to even simply acknowledge when we have made a mistake.  This was certainly true for me earlier in my career.

 

The Wisdom of Crowds

One of my favorite books is The Wisdom of Crowds by James Surowiecki.  Before reading it, I admit that I believed crowds were incapable of wisdom and that the best decisions are based on the expert advice of carefully selected individuals.  However, Surowiecki wonderfully elucidates the folly of “chasing the expert” and explains the four conditions that characterize wise crowds: diversity of opinion, independent thinking, decentralization and aggregation.  The book is also balanced by examining the conditions (e.g. confirmation bias and groupthink) that can commonly undermine the wisdom of crowds.  All in all, it is a wonderful discourse on both collective intelligence and collective ignorance with practical advice on how to achieve the former and avoid the latter.

 

Chasing the Data Quality Expert

Without question, a data quality expert can be an invaluable member of your team.  Often an external consultant, a data quality expert can provide extensive experience and best practices from successful implementations.  However, regardless of their experience, even with other companies in your industry, every organization and its data is unique.  An expert's perspective definitely has merit, but their opinions and advice should not be allowed to dominate the decision making process. 

“The more power you give a single individual in the face of complexity,” explains Surowiecki, “the more likely it is that bad decisions will get made.”  No one person regardless of their experience and expertise can succeed on their own.  According to Surowiecki, the best experts “recognize the limits of their own knowledge and of individual decision making.”

 

“Success is on the far side of failure”

One of the most common obstacles organizations face with data quality initiatives is that many initial attempts end in failure.  Some fail because of lofty expectations, unmanaged scope creep, and the unrealistic perspective that data quality problems can be permanently “fixed” by a one-time project as opposed to needing a sustained program.  However, regardless of the reason for the failure, it can negatively affect morale and cause employees to resist participating in the next data quality effort.

Although a common best practice is to perform a post-mortem in order to document the lessons learned, sometimes the stigma of failure persuades an organization to either skip the post-mortem or ignore its findings. 

However, in the famous words of IBM founder Thomas J. Watson: “Success is on the far side of failure.” 

A failed data quality initiative may have been closer to success than you realize.  At the very least, there are important lessons to be learned from the mistakes that were made.  The sooner you can recognize your mistakes, the sooner you can mitigate their effects and hopefully prevent them from happening again.

 

The Wisdom of Failure

In one of my other favorite books, How We Decide, Jonah Lehrer explains:

“The brain always learns the same way, accumulating wisdom through error...there are no shortcuts to this painstaking process...becoming an expert just takes time and practice...once you have developed expertise in a particular area...you have made the requisite mistakes.”

Therefore, although it may be true that experience is the path that separates knowledge from wisdom, I have come to realize that the true wisdom of my experience is the wisdom of failure.

 

Related Posts

A Portrait of the Data Quality Expert as a Young Idiot

All I Really Need To Know About Data Quality I Learned In Kindergarten

The Nine Circles of Data Quality Hell

Data Quality Blogging All-Stars

The 2009 Major League Baseball (MLB) All-Star Game is being held tonight at Busch Stadium in St. Louis, Missouri. 

For those readers who are not baseball fans, the All-Star Game is an annual exhibition held in mid-July that showcases the players with the best statistical performances from the first half of the MLB season.

As I watch the 80th Midsummer Classic, I offer this exhibition that showcases the bloggers with the posts I have most enjoyed reading from the first half of the 2009 data quality blogging season.

 

Dylan Jones

From Data Quality Pro:

 

Daragh O Brien

From The DOBlog:

 

Steve Sarsfield

From Data Governance and Data Quality Insider:

 

Daniel Gent

From Data Quality Edge:

 

Henrik Liliendahl Sørensen

From Liliendahl on Data Quality:

 

Stefanos Damianakis

From Netrics HD:

 

Vish Agashe

From Business Intelligence: Process, People and Products:

 

Mark Goloboy

From Boston Data, Technology & Analytics:

 

Additional Resources

Over on Data Quality Pro, read the data quality blog roundups from the first half of 2009:

From the IAIDQ, read the 2009 issues of the IAIDQ Blog Carnival:

El Festival del IDQ Bloggers (April 2009)

Welcome to the April 2009 issue of El Festival del IDQ Bloggers, which is a blog carnival for information/data quality bloggers being run as part of the celebration of the five year anniversary of the International Association for Information and Data Quality (IAIDQ).

 

A blog carnival is a collection of posts from different blogs on a specific theme that are published across a series of issues.  Anyone can submit a data quality blog post and experience the benefits of extra traffic, networking with other bloggers and discovering interesting posts.  It doesn't matter what type of blog you have as long as the submitted post has a data quality theme. 

El Festival del IDQ Bloggers will run monthly issues April through November 2009.

 

Can You Say Anything Interesting About Data Quality?

This simple question launched the first blog carnival of data quality that ran four issues from late 2007 through early 2008:

Blog Carnival of Data Quality (November 2007)

Blog Carnival of Data Quality (December 2007)

Blog Carnival of Data Quality (January 2008)

Blog Carnival of Data Quality (February 2008)

 

How to give your Data Warehouse a Data Quality Immunity System

Vincent McBurney is a manager for Deloitte consulting in Perth, Australia.  His excellent blog Tooling Around in the IBM InfoSphere looks at the world of data integration software and occasionally wonders what IBM is up to.  His data quality motto: "If it ain’t broke, don't fix it."

Vincent submitted How to give your Data Warehouse a Data Quality Immunity System that discusses how people who obsessively keep bad quality data out of a data warehouse may be making it unhealthy in the long run.

 

Stuck in First Gear

Michele Goetz is a free-lance consultant helping companies make sense of their business through better analysis, marketing best practices, and marketing solutions.  Her excellent blog Intelligent Metrix guides you on your journey from data to metrics to insight to intelligent decisions.  Her blog de-mystifies business intelligence and data management for the business, and helps you bridge the Business-IT gap for better processes and solutions that drive business success.

Michele submitted Stuck in First Gear that discusses the common problem when companies make big investments in enterprise class solutions but only use a portion of the capabilities, which is like driving a Porsche in first gear.

 

When Bad Data Becomes Acceptable Data

Daniel Gent is a bilingual business analyst experienced with the System Development Life Cycle (SDLC), decision making, change management, database design, data modeling, data quality management, project coordination, and problem resolution.  His excellent blog Data Quality Edge is a grassroots look at data quality for the data quality analyst in the trenches.

Daniel submitted When Bad Data Becomes Acceptable Data that discusses how you need to prioritize bad data and determine when it is acceptable to keep it for now.

 

Customer Value and Sustainable Quality

Daniel Bahula is a strategy and operations improvement professional with extensive project experience at multinational telco, software development and professional services companies.  His excellent blog DanBahula.net defies a simple definition and is a great example of how it doesn't matter what type of blog you have as long as the submitted post has a data quality theme.

Daniel submitted Customer Value and Sustainable Quality that discusses Six Sigma and its relevance to addressing data quality issues.

 

Data Quality, Entity Resolution, and OFAC Compliance

Bob Barker is the editor of Identity Resolution Daily, which is a corporate blog of Austin, TX-based Infoglide Software strongly dedicated to citizenship, integrity and communication.  The blog has recently been gaining guest bloggers with varying points of view, helping it to become an excellent site for information, dialogue and community.

Bob submitted Data Quality, Entity Resolution, and OFAC Compliance that discusses how entity resolution is different from name matching and traditional data quality.

 

Selecting Data Quality Software

Dylan Jones is the editor of Data Quality Pro, which is the leading data quality online magazine and free independent community resource dedicated to helping data quality professionals take their career or business to the next level.

Dylan submitted Selecting Data Quality Software that discusses how to find the right data quality technology for your needs and your budget.

 

AmazonFail - A Classic Information Quality Impact

Since 2006, IQTrainwrecks.com, which is a community blog provided and administered by the International Association for Information and Data Quality (IAIDQ), has been serving up regular doses of information quality disasters from around the world.

IAIDQ submitted AmazonFail - A Classic Information Quality Impact that looks behind the hype and confusion surrounding the #amazonfail debacle.

 

You’re a Leader - Lead

Daragh O Brien is an Irish information quality expert, conference speaker, published author in the field, and director of publicity for the IAIDQ.  His excellent blog The DOBlog, founded in 2006, was one of the first specialist information quality blogs.

Daragh submitted You’re a Leader - Lead that explains that although there’s a whole lot of great management happening in the world, what we really need are information quality leaders.

 

All I Really Need To Know About Data Quality I Learned In Kindergarten

My name is Jim Harris.  I am an independent consultant, speaker, writer and blogger with over 15 years of professional services and application development experience in data quality.  My blog Obsessive-Compulsive Data Quality is an independent blog offering a vendor-neutral perspective on data quality.

I submitted All I Really Need To Know About Data Quality I Learned In Kindergarten that explains how show and tell, the five second rule and other great lessons from kindergarten are essential to success in data quality initiatives.

 

Submit to Daragh

The May issue will be edited by Daragh O Brien and hosted on The DOBlog.

For more information, please follow this link:  El Festival del IDQ Bloggers


Data Quality Whitepapers are Worthless

During a 1609 interview, William Shakespeare was asked his opinion about an emerging genre of theatrical writing known as Data Quality Whitepapers.  The "Bard of Avon" was clearly not a fan.  His famously satirical response was:

Data quality's but a writing shadow, a poor paper

That struts and frets its words upon the page

And then is heard no more:  it is a tale

Told by a vendor, full of sound and fury

Signifying nothing.

 

Four centuries later, I find myself in complete agreement with Shakespeare (and not just because Harold Bloom told me so).

 

Today is April Fool's Day, but I am not joking around - call Dennis Miller and Lewis Black - because I am ready to RANT.

 

I am sick and tired of reading whitepapers.  Here is my "Bottom Ten List" explaining why: 

  1. Ones that make me fill out a "please mercilessly spam me later" contact information form before I am allowed to download them remind me of Mrs. Bun: "I DON'T LIKE SPAM!"
  2. Ones that, after I read their supposed pearls of wisdom, make me shake my laptop violently like an Etch-A-Sketch.  I have lost count of how many laptops I have destroyed this way.  I have started buying them in bulk at Wal-Mart.
  3. Ones comprised entirely of the exact same information found on the vendor's website make www = World Wide Worthless.
  4. Ones that start out good, but just when they get to the really useful stuff, refer to content only available to paying customers.  What a great way to guarantee that neither I nor anyone I know will ever become your paying customer!
  5. Ones that have a "Shock and Awe" title followed by "Aw Shucks" content because apparently the entire marketing budget was spent on the title.
  6. Ones that promise me the latest BUZZ but deliver only ZZZ are not worthless only when I have insomnia.
  7. Ones that claim to be about data quality, but have nothing at all to do with data quality:  "...don't make me angry.  You wouldn't like me when I'm angry."
  8. Ones that take the adage "a picture is worth a thousand words" too far by using a dizzying collage of logos, charts, graphs and other visual aids.  This is one reason we're happy that Pablo Picasso was a painter.  However, he did once write that "art is a lie that makes us realize the truth."  Maybe he was defending whitepapers.
  9. Ones that use acronyms without ever defining what they stand for remind me of that scene from Good Morning, Vietnam: "Excuse me, sir.  Seeing as how the VP is such a VIP, shouldn't we keep the PC on the QT?  Because if it leaks to the VC he could end up MIA, and then we'd all be put out in KP."
  10. Ones that really know they're worthless but aren't honest about it.  Don't promise me "The Top 10 Metrics for Data Quality Scorecards" and give me a list as pointless as this one.

 

I am officially calling out all writers of Data Quality Whitepapers. 

Shakespeare and I both believe that you can't write anything about data quality that is worth reading. 

Send your data quality whitepaper to Obsessive-Compulsive Data Quality and if it is not worthless, then I will let the world know that you proved Shakespeare and me wrong.

 

And while I am on a rant roll, I am officially calling out all Data Quality Bloggers.

The International Association for Information and Data Quality (IAIDQ) is celebrating its five year anniversary by hosting:

El Festival del IDQ Bloggers – A Blog Carnival for Information/Data Quality Bloggers

For more information about the blog carnival, please follow this link:  IAIDQ Blog Carnival

Do you have obsessive-compulsive data quality (OCDQ)?

Obsessive-compulsive data quality (OCDQ) affects millions of people worldwide.

The most common symptoms of OCDQ are:

  • Obsessively verifying data used in critical business decisions
  • Compulsively seeking an understanding of data in business terms
  • Repeatedly checking that data is complete and accurate before sharing it
  • Habitually attempting to calculate the cost of poor data quality
  • Constantly muttering a mantra that data quality must be taken seriously

While the good folks at Prescott Pharmaceuticals are busy working on a treatment, I am dedicating this independent blog as group therapy to all those who (like me) have dealt with OCDQ their entire professional lives.

Over the years, the work of many individuals and organizations has been immensely helpful to those of us with OCDQ.

Some of these heroes deserve special recognition:

Data Quality Pro – Founded and maintained by Dylan Jones, Data Quality Pro is a free independent community resource dedicated to helping data quality professionals take their career or business to the next level. With the mission to create the most beneficial data quality resource that is freely available to members around the world, Data Quality Pro provides free software, job listings, advice, tutorials, news, views and forums. Their goal is “winning-by-sharing” and they believe that by contributing a small amount of their experience, skill or time to support other members then truly great things can be achieved. With the new Member Service Register, consultants, service providers and technology vendors can promote their services and include links to their websites and blogs.

 

International Association for Information and Data Quality (IAIDQ) – Chartered in January 2004, IAIDQ is a not-for-profit, vendor-neutral professional association whose purpose is to create a world-wide community of people who desire to reduce the high costs of low quality information and data by applying sound quality management principles to the processes that create, maintain and deliver data and information. IAIDQ was co-founded by Larry English and Tom Redman, who are two of the most respected and well-known thought and practice leaders in the field of information and data quality. IAIDQ also provides two excellent blogs: IQ Trainwrecks and Certified Information Quality Professional (CIQP).

 

Beth Breidenbach – her blog Confessions of a database geek is fantastic in and of itself, but she has also compiled an excellent list of data quality blogs and provides them via aggregated feeds in both Feedburner and Google Reader formats.

 

Vincent McBurney – his blog Tooling Around in the IBM InfoSphere is an entertaining and informative look at data integration in the IBM InfoSphere covering many IBM Information Server products such as DataStage, QualityStage and Information Analyzer.

 

Daragh O Brien – is a leading writer, presenter and researcher in the field of information quality management, with a particular interest in legal aspects of information quality. His blog The DOBlog is a popular and entertaining source of great material.

 

Steve Sarsfield – his blog Data Governance and Data Quality Insider covers the world of data integration, data governance, and data quality from the perspective of an industry insider. Also, check out his new book: The Data Governance Imperative.