Popeye, Spinach, and Data Quality
As a kid, one of my favorite cartoons was Popeye the Sailor, who was empowered by eating spinach to take on many daunting challenges, such as battling his brawny nemesis Bluto for the affections of his love interest Olive Oyl, often kidnapped by Bluto.
I am reading the book The Half-Life of Facts: Why Everything We Know Has an Expiration Date by Samuel Arbesman. While examining how a novel fact, even a wrong one, spreads and persists, Arbesman explained that one of the strangest examples of the spread of an error is related to Popeye the Sailor. “Popeye, with his odd accent and improbable forearms, used spinach to great effect, a sort of anti-Kryptonite. It gave him his strength, and perhaps his distinctive speaking style. But why did Popeye eat so much spinach? What was the reason for his obsession with such a strange food?”
The truth begins over fifty years before the comic strip made its debut. “Back in 1870,” Arbesman explained, “Erich von Wolf, a German chemist, examined the amount of iron within spinach, among many other green vegetables. In recording his findings, von Wolf accidentally misplaced a decimal point when transcribing data from his notebook, changing the iron content in spinach by an order of magnitude. While there are actually only 3.5 milligrams of iron in a 100-gram serving of spinach, the accepted fact became 35 milligrams. Once this incorrect number was printed, spinach’s nutritional value became legendary. So when Popeye was created, studio executives recommended he eat spinach for his strength, due to its vaunted health properties, and apparently Popeye helped increase American consumption of spinach by a third!”
“This error was eventually corrected in 1937,” Arbesman continued, “when someone rechecked the numbers. But the damage had been done. It spread and spread, and only recently has gone by the wayside, no doubt helped by Popeye’s relative obscurity today. But the error was so widespread, that the British Medical Journal published an article discussing this spinach incident in 1981, trying its best to finally debunk the issue.”
“Ultimately, the reason these errors spread,” Arbesman concluded, “is because it’s a lot easier to spread the first thing you find, or the fact that sounds correct, than to delve deeply into the literature in search of the correct fact.”
What “spinach” has your organization been falsely consuming because of a data quality issue that was not immediately obvious, one that may have led to a long, and perhaps ongoing, history of data-driven decisions based on poor-quality data?
Popeye said “I yam what I yam!” Your organization yams what your data yams, so you had better make damn sure it’s correct.