In technology terms, Storj Labs is decentralized cloud storage. In layman's vernacular, it's Airbnb for disk drives.
Storj Labs CEO Ben Golub uses both descriptions for his company, but the Airbnb comparison best captures the unusual nature of a cloud storage company without storage. The startup rents capacity and bandwidth from companies -- called node operators -- around the world and turns that into an object storage-based cloud for its customers.
"Storj is different from most other cloud storage because we don't own and operate any disk drives," Golub said.
Storj also offers customers the option to pay through cryptocurrency, with its Storj token.
Perhaps its biggest challenge is convincing customers to store their data on drives rented by a cloud company that isn't AWS, Microsoft or Google. To do that, Storj has focused on adding security, reliability and performance to its service before going live. Storj went into beta with its commercially available, Amazon S3-compatible Tardigrade platform in August 2019 and moved into general availability in March 2020.
The 45-person Atlanta-based Storj Labs was funded by a $30 million crypto token sale in 2017, and also uses blockchain to facilitate payment -- but its service is not blockchain storage. Former Docker CEO Golub joined Storj Labs in 2018 as executive chairman. We spoke with Golub about this storage newcomer, its technology and business model, who's using its service and who it rents capacity from, and lessons learned from launching during a pandemic.
So how does a cloud storage company operate without any storage?
Ben Golub: We built a cloud storage service using spare capacity on thousands of drives in 85-plus countries. We're Airbnb for disk drives. With Airbnb, people rent out spare rooms in their houses. With our model, people rent out spare space on their disk drives. We do a lot to make sure we do it in a way that is enterprise-grade, rock solid in terms of security, durability and performance.
How do you guarantee enterprise-grade characteristics if you use other people's hard drives?
Golub: We constantly monitor all the drives to make sure they perform well, and we kick out any that aren't performing. But we designed the service so we're not dependent on any particular drives doing well. Whenever a customer uploads data, it first gets encrypted before coming to our service with keys only the user has. Then we use erasure coding. A file is broken up into 80 pieces, of which any 30 can be used to put it back together. And each of those 80 pieces goes to a different drive on the network. There's no single point of failure. The data is encrypted and split into pieces before it ever goes to the network.
If you wanted to compromise a single file, you'd have to compromise 30 different drives, each run by a different person in a different geographic area, with different networking protocols. And you'd still have an encrypted file, and you'd have to start the process again for the second file because it would be on 30 different drives across the network. There's no central honeypot of user data that can be compromised by a bad administrator or misconfigured print server or something like that. Someone described our security approach as "encrypted sand on encrypted beaches."
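The "any 30 of 80" property Golub describes can be illustrated with a toy Reed-Solomon-style code. This is a minimal sketch, not Storj's implementation: the interview gives only the 80 and 30 figures, so the prime field GF(257), the function names and the 30-byte stripe size are all assumptions made for illustration, and the encryption step that precedes erasure coding is left out.

```python
import random

P = 257  # assumed field: smallest prime above 255, so every byte is a field element

def _interpolate(points, x):
    """Evaluate at x the unique polynomial through `points`, mod P (Lagrange)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(block, n):
    """Turn k data bytes into n pieces of the form (x, value).

    The data sits at x = 0..k-1; the other pieces are extra points on the
    same degree k-1 polynomial. Values may reach 256, so they stay ints.
    """
    k = len(block)
    points = list(enumerate(block))
    return [(x, block[x] if x < k else _interpolate(points, x)) for x in range(n)]

def decode(pieces, k):
    """Rebuild the k data bytes from any k surviving pieces."""
    if len(pieces) < k:
        raise ValueError("need at least k pieces")
    points = pieces[:k]
    return bytes(_interpolate(points, x) for x in range(k))

# Hypothetical usage: one 30-byte stripe, 80 pieces, 50 of them lost at random.
stripe = bytes(random.randrange(256) for _ in range(30))
pieces = encode(stripe, 80)
survivors = random.sample(pieces, 30)
assert decode(survivors, 30) == stripe
```

Each piece is one-thirtieth the size of the stripe, so 80 pieces cost roughly 2.7 times the original data in raw capacity while tolerating the loss of any 50 of them.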
How did Storj Labs start?
Golub: It started in a college dorm in 2014. Founder [and chief strategy officer] Shawn Wilkinson started it as a project at Morehouse. He did a proof of concept in 2014, then came out with version 2 in 2017, which grew to 150 petabytes of capacity. Then we did a significant retooling over the last two years to make it enterprise-grade.
How many customers do you have?
Golub: We have north of 4,000 users after about a month-and-a-half of full-scale production. Our service is a B2B offering; we're not for consumers who want to back up hard drives. Our customers will usually start with a database or VM backup or archive and snapshotting. They're usually people storing a lot of data, but they need to access it quickly and reliably when they need it. And we're also seeing a lot of people use us for media storage and sharing of photos, videos, software distribution and things like that. We're not seeing a lot of usage yet in regulated industries.
What's your business model? How much do you pay people to rent their drive capacity, and how much do your customers pay?
Golub: It's a pretty simple subscription model. Customers pay us [$0.01] per gigabyte per month and [$0.045 per GB of] egress bandwidth. And roughly 60 cents of every dollar that we get from customers goes to people who rent out the space on their drives. They get compensated based on the amount of gigabytes stored and the amount of uploaded bandwidth we're consuming. This is almost pure profit for people renting space on their drives because they're using spare capacity on machines they already have. It takes no additional people, no additional cooling, and no additional power to run a drive at 75% capacity versus 100% capacity.
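As a rough worked example of that split, the sketch below takes the quoted storage price and the "roughly 60 cents of every dollar" figure and applies them to a hypothetical customer. The function names and the 5,000 GB workload are assumptions, and egress charges are left out for brevity; they would be added to the bill the same way before the split.

```python
from decimal import Decimal

STORAGE_RATE = Decimal("0.01")    # dollars per GB per month, as quoted
OPERATOR_SHARE = Decimal("0.60")  # "roughly 60 cents of every dollar"

def storage_charge(stored_gb):
    """Monthly storage charge in dollars for the given number of gigabytes."""
    return Decimal(stored_gb) * STORAGE_RATE

def split_revenue(bill):
    """Return (paid to node operators, kept by the service) for one bill."""
    to_operators = (bill * OPERATOR_SHARE).quantize(Decimal("0.01"))
    return to_operators, bill - to_operators

# Hypothetical customer storing 5,000 GB for one month.
bill = storage_charge(5000)
to_operators, kept = split_revenue(bill)
print(f"bill ${bill}, node operators ${to_operators}, service ${kept}")
```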
How do you convince your customers to turn over data to you that you'll place on someone else's storage that your customer has no idea where it is? What kind of proof of concept do you go through?
Golub: Well, typically, most people want to start with lower value data and move their way up the chain. As we walk people through the math on durability and security, they end up getting convinced quickly. The durability story is compelling. Even if you assume that some drives will fail, the chance is astronomically small that 50 drives each run in a different location by a different person with a different power supply will fail within the same short window. We haven't lost a single file since we went into alpha last July.
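The durability math can be sketched with a binomial tail. With 80 pieces and any 30 sufficient, a file is lost only if more than 50 pieces become unavailable in the same window. The model below assumes every drive fails independently with the same probability, and the 10% figure in the usage lines is purely illustrative; neither assumption comes from the interview, and the sketch ignores anything the service might do to replace lost pieces.

```python
from math import comb

def loss_probability(p_fail, n=80, k=30):
    """Probability that fewer than k of n pieces survive one window.

    Assumes each piece's drive fails independently with probability p_fail.
    Data is lost when more than n - k pieces fail at once.
    """
    return sum(
        comb(n, failed) * p_fail**failed * (1 - p_fail) ** (n - failed)
        for failed in range(n - k + 1, n + 1)
    )

# Hypothetical: one drive in ten fails within the same short window.
print(f"chance of losing a file: {loss_probability(0.10):.3e}")
```

Even under that pessimistic per-drive failure rate, the result is many orders of magnitude below the failure rate of any single drive, which is the point of spreading pieces across independent operators.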
Where do you find people to rent you capacity?
Golub: Almost everybody has excess capacity. The fact that they can monetize it with low effort, securely and for almost pure profit gives us a lot of inbound interest. We rent not only from individuals, but also from a lot of data centers, universities and businesses with spare capacity.
We will push out test data in advance of the need for customer data. So, we know there's capacity out there that we can test and we know it responds well, so when customers come on board we can delete the test data and put customer data out there.
How do you make sure your hosts don't access your customers' data?
Golub: What they receive is encrypted and only one-thirtieth of what you need to put back together a file. And no metadata. What they have is an encrypted blob of data, which even if it were decrypted would be useless. We also continually monitor the drives for uptime and to make sure they're storing the data they say they are.
You went GA just as the pandemic hit. Has the pandemic hurt your sales?
Golub: In some ways it's helped us. People are more willing to look at disruptive solutions during difficult times. We're also seeing the decentralized nature of what we do is a nice fit. It's impossible today to build out a new data center. You can't send people on site, you can't get construction permits, you can't get equipment. We can get you petabytes of capacity without asking anybody to leave the comfort of their home.
We didn't plan to launch in the midst of a pandemic, but we've been pleasantly surprised about the level of engagement we've seen, including from people doing research or data science around the pandemic. We've made capacity available to them for free.
Where does blockchain come in?
Golub: Blockchain is related to payments. Everyone renting out their drives to us is compensated using our storage token. The amount of token they receive from us is calculated first in dollars, then we do a conversion based on the amount of storage tokens. They're not exposed to token fluctuation.
Customers can pay in dollars or in tokens. There's a discount for paying in tokens. Blockchain is a good accelerant. It makes it easier to pay people in 85 countries. But we aren't using blockchain storage. Blockchain tends to be too slow and too cumbersome for the use cases we're seeing.
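The dollars-first payout Golub describes amounts to a single conversion at payout time. In this minimal sketch the function name, the $12 payout and both token prices are hypothetical; how Storj actually sources the token price is not covered in the interview.

```python
from decimal import Decimal

def payout_in_tokens(usd_owed, token_price_usd):
    """Convert a dollar-denominated payout into tokens at the current price."""
    return usd_owed / token_price_usd

owed = Decimal("12.00")  # hypothetical monthly payout, calculated in dollars
for price in (Decimal("0.10"), Decimal("0.20")):  # hypothetical token prices
    tokens = payout_in_tokens(owed, price)
    # The token count moves with the price; the dollar value paid does not.
    assert tokens * price == owed
    print(f"at ${price} per token: {tokens} tokens")
```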
Do you have plans to go beyond your Tardigrade object storage?
Golub: For the next several months, we want to do a first-class job of general object storage. We've had a lot of customer interest in a few areas. There's interest in things that look CDN-like. We've also seen a lot of interest in ways to give customers more control over where their data is stored. They may want drives in particular countries or drives in data centers that meet security criteria. We're building that out.