Like the zombies of “The Walking Dead,” data growth is threatening to overrun us. Managers worry about the right data that can be captured and used effectively, but what about the right data that is missed? That data can be costly — and in the case of zombies, deadly.
As in the show, there is great debate about what’s right, only in this case, the debate is about what’s the right data for an analytics project and what isn’t. SearchCIO.com Senior News Writer Linda Tucci writes that today’s data quality theories pit clean data against big data. She tells the story of Greg Taffet, CIO at Miami-based U.S. Gas & Electric, whose team spends many hours making sure the data they’re using is the right data.
The other theory holds that there can never be enough data, and that the data many companies purge from their systems actually hides important knowledge of its own. This is the camp of big data gurus such as IBM’s Jeff Jonas.
Jonas has a proven track record of working with dirty data to answer difficult questions, such as “Who is stealing from my casino?”, but big data solutions are not right for every company. To really make big data work for you, you need lots of data, plus enough system and software sophistication to make large-scale number-crunching cost-effective.