KCF Technologies found that there weren't many non-DIY options available when it came to Cassandra backup.
The company is based in State College, Pa., and manufactures and deploys sensors to help industrial businesses diagnose problems with their machines. Most are vibration sensors, but some measure moisture, temperature and oil quality. These sensors are mounted on industrial equipment like conveyor belts and pumps and send their readings to an on-site hub, which then transmits the data to a Cassandra database on Amazon Web Services (AWS).
KCF's core business, called SmartDiagnostics Sentry Services, analyzes the data gathered from its sensors to help customers detect faults in their machines or predict failures. The company has teams of trained vibration experts who can figure out from vibration patterns if anything is chipped or damaged. These experts work with the end customers to optimize their machines' life spans or recommend operational changes, and rely on the data in the Cassandra databases to make those calls.
"It's largely a replacement for a guy who walks around with a stethoscope and manually listens to all the machinery," said Brandon Bennett, cloud infrastructure engineer at KCF Technologies.
Cassandra needed 'care and feeding'
Bennett said KCF currently has tens of thousands of sensors deployed across all its customers' sites and estimated there are tens to hundreds of terabytes of sensor data living on its Cassandra database, which consists of 42 nodes. Backing up and maintaining Cassandra proved to be a challenge, as Bennett found it required a lot of "care and feeding" compared with SQL databases.
"A lot of management tools that you find baked into SQL Server are not there natively [in Cassandra]," Bennett said.
Before he bought Rubrik, Bennett relied on Amazon EBS (Elastic Block Store) snapshots as his Cassandra backup method. However, he could not guarantee consistency through this method, as the snapshot was a point-in-time capture of a disk volume, with no awareness of the database server itself. With the database spread across 42 nodes, there wasn't a way to ensure all the snapshots would be in sync.
Another problem came from one of Cassandra's features. Bennett had set up the Cassandra database to replicate data to three different availability zones, avoiding a single point of failure. Cassandra uses a quorum model to perform reads and writes; at least two of the three zones must agree for an operation to be considered consistent. While this is good for preventing data loss, it also created three times the amount of data to back up.
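The quorum arithmetic behind that setup can be sketched in a few lines. This is a minimal illustration, assuming a replication factor of 3 (one replica per availability zone, as described above). The function names are hypothetical and not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum read or write: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be unavailable while quorum operations still succeed."""
    return replication_factor - quorum(replication_factor)


rf = 3  # assumption: one replica in each of three availability zones
print(quorum(rf))              # 2
print(tolerated_failures(rf))  # 1
```

With three replicas, any two form a quorum, so one zone can be lost without blocking reads or writes. The cost is that every piece of data exists three times.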
Bennett first encountered Rubrik in November 2018 at the AWS re:Invent conference, and bought and implemented it in February 2019. Rubrik provided deduplication, consolidating the multiple database replicas to a single backup, effectively cutting the amount of space KCF was devoting to backup to one-third of what the Amazon EBS method required. Additionally, KCF was able to store this backup in a colder AWS tier, further lowering costs. Bennett said he didn't see many data protection vendors that offered a backup product for Cassandra databases while he was shopping around.
Rubrik acquired Datos IO in February 2018 specifically to protect applications built on NoSQL databases such as MongoDB, Amazon DynamoDB and Cassandra. In May 2019, Rubrik rival Cohesity took the same tack and acquired Imanis Data.
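The one-third figure follows directly from collapsing three replicas into one copy. The toy model below shows the idea with content hashing. It is not Rubrik's implementation, and it assumes the three replicas hold byte-identical blocks, which real Cassandra nodes generally do not on disk. The sample data and function name are hypothetical.

```python
import hashlib


def deduplicate(replicas):
    """Keep one copy of each distinct block across all replicas, keyed by content hash."""
    unique = {}
    for blocks in replicas:
        for block in blocks:
            unique.setdefault(hashlib.sha256(block).hexdigest(), block)
    return unique


# Hypothetical data: three identical replicas of the same four blocks
blocks = [b"reading-1", b"reading-2", b"reading-3", b"reading-4"]
replicas = [list(blocks) for _ in range(3)]

raw_size = sum(len(b) for r in replicas for b in r)
dedup_size = sum(len(b) for b in deduplicate(replicas).values())
print(f"raw: {raw_size} bytes, deduplicated: {dedup_size} bytes")
```

In this idealized case the deduplicated set is exactly one-third the size of the raw replicas, matching the reduction Bennett described.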
Bennett said he learned that a majority of customers were doing Cassandra backup on their own, through open source tools and custom scripting. However, he decided against this for two reasons: He wanted the assurance of a vendor and a service contract behind his backups, and he wanted to ensure the backup wasn't a system only he or a handful of programmers knew how to run.
"I wanted something a little more enterprise and robust, with a support contract and a vendor with a name on it," Bennett said. "I don't want to write something and go on vacation and nobody knows how it works."
SmartDiagnostics lives on the AWS cloud, but there was a period when KCF had an on-premises product. That version allowed customers to install an instance of SmartDiagnostics in their own server room. Bennett said the product proved unpopular because KCF's end customers -- plant managers -- didn't want to commit their own IT resources and provision the right size of servers to house the constantly growing sensor data.
Instead, Bennett describes KCF as a "cloud-first" company. The cloud is the only place that makes sense to him for a data set that only grows. Although it's unlikely a customer will ever need sensor data past a certain point in time, KCF doesn't delete any of its data. Bennett said historical data can be used to develop models and algorithms that will make SmartDiagnostics better.
"We have this philosophy to hold on to as much of this data as we can and do additional things with it," Bennett said.