I have a data problem: every day, we gather and retain more data. As our staff, students and applications grow, so do our internal, transactional data volumes and needs. On top of that, we are now adding massive amounts of external data -- social data, data from courseware providers, data from partner colleges, economic data, et cetera. With this increasing data load, I need to make the case for a fairly constant stream of storage and backup enhancements and new purchases. But how best can I make this case?
I always want to make a strong business case (not a technology case) any time I ask the university for money. This is pretty simple for things like new applications that will increase graduation rates or automate staff functions. It is more of a challenge when I am dealing with "IT plumbing" like storage and backup, whose benefits are more nebulous.
So, as I prepare to convince the university to spend money, I reflect on my business case successes and failures. From these, I recognize three successful practices.
First, focus the business case on what really adds value. But, when it comes to data, what is valuable? And, just because data is available, does that mean we need it or will use it? I start with two critical, value-based questions:
- What decisions would you like to make?
- What data do you need in order to make those decisions?
If we are capturing data that does not map to those decisions, we don't need it. Having some value-based criteria up front dramatically simplifies the entire storage and backup business case and project.
Second, not all of the data we capture is created equal, and our processes should reflect this reality. I like to stratify my data into four buckets -- regulatory, permanent record, transitional and summary. We keep the regulatory data only for the required period of time; we keep the permanent record data forever (and thus confirm your fears about a school really having a permanent record); and we keep the transitional data for as long as it is useful for analysis. For us, transitional data includes things like specific answers to survey questions, information about course utilization, service requests, et cetera. Before we purge transitional information, we identify whether we need any of it, in summary form, for future analyses. In making a business case, this stratification is essential. The university knows not all data is created equal, and if I treat it as though it is, it is obvious I am not aligned with that view. In practice, this also means my backup will be less expensive (why have a long-term backup approach for transitional data?).
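For readers who think in code, the stratification above can be sketched as a small retention check. This is only an illustration under stated assumptions: the retention periods, the `DataSet` type, the `next_action` function and the action names are all hypothetical, and the choice to keep summary data indefinitely is my assumption rather than a stated policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods in days; None means "never purge".
# Real values would come from regulation and university policy.
RETENTION_DAYS = {
    "regulatory": 7 * 365,      # assumed required period
    "permanent_record": None,   # kept forever
    "transitional": 2 * 365,    # assumed window of analytical usefulness
    "summary": None,            # assumption: summaries are kept indefinitely
}


@dataclass
class DataSet:
    name: str
    bucket: str               # one of the keys in RETENTION_DAYS
    created: date
    summarized: bool = False  # has a summary already been extracted?


def next_action(ds: DataSet, today: date) -> str:
    """Return 'keep', 'review_for_summary' or 'purge' for a data set."""
    limit = RETENTION_DAYS[ds.bucket]
    if limit is None or today - ds.created < timedelta(days=limit):
        return "keep"
    # Transitional data is checked for summary value before it is purged.
    if ds.bucket == "transitional" and not ds.summarized:
        return "review_for_summary"
    return "purge"


survey = DataSet("survey answers", "transitional", date(2020, 1, 1))
print(next_action(survey, date(2024, 1, 1)))
```

The point of a table like `RETENTION_DAYS` is that each bucket, not each data set, carries the retention and backup decision, which is what keeps the business case simple.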
Finally, I should take a truly objective look at alternatives to on-premises, self-managed data storage and backup. It could be that doing it all myself is just not a battle worth fighting -- particularly given all of the other battles I need to fight. Or, it could be that someone else is simply better at this than I am. No matter what, I cannot assume on-premises, self-managed storage is the most responsible, credible thing to do. At a minimum, taking an honest look at the alternatives will give my business case analysis an element of validity.
The nature of data and data management is changing as volumes increase and technologies iterate. But, I can still make a strong, value-based business case by applying some common sense to how I manage the data.