Unstructured data is among the fastest-growing data types. With organizations creating and trying to store more data every year, a natural question arises: What's the best storage for unstructured data?
Unstructured data is information that doesn't adhere to a traditional database format. Text in the form of email and documents, along with multimedia -- such as photo, video and audio files -- are common examples of unstructured data. When looking for the best way to store unstructured data, NAS and object storage are the two primary choices.
NAS vs. object storage
NAS has been around for decades and puts a hierarchical system of directories and folders between the users and their files. This approach enables organizations to neatly categorize individual files for later use.
Object storage, on the other hand, doesn't impose a file system paradigm on data. Instead, object systems use metadata tables that exist separately from the underlying data elements. The metadata table stores attributes that describe the underlying data, such as file name, creation date, user ID and the location from which the data can be retrieved.
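To make the idea of a separate metadata table concrete, here is a minimal in-memory sketch. It is illustrative only and does not reflect any particular product: the class names `ObjectMetadata` and `TinyObjectStore`, the `blob/...` location format and the `find_by_user` helper are all assumptions made for this example.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ObjectMetadata:
    # The attributes mentioned above: name, creation date, user ID, location.
    name: str
    created: datetime
    user_id: str
    location: str  # where the payload can be retrieved


class TinyObjectStore:
    """Hypothetical flat object store: no directories, only IDs and metadata."""

    def __init__(self):
        self._metadata = {}  # object ID -> ObjectMetadata
        self._blobs = {}     # location -> raw bytes, kept apart from the metadata

    def put(self, name, data, user_id):
        object_id = uuid.uuid4().hex
        location = f"blob/{object_id}"  # made-up location scheme
        self._blobs[location] = data
        self._metadata[object_id] = ObjectMetadata(
            name=name,
            created=datetime.now(timezone.utc),
            user_id=user_id,
            location=location,
        )
        return object_id

    def get(self, object_id):
        # Look up the metadata first, then follow its location to the data.
        meta = self._metadata[object_id]
        return self._blobs[meta.location]

    def find_by_user(self, user_id):
        # Queries run against the metadata table; payloads are never touched.
        return [oid for oid, m in self._metadata.items() if m.user_id == user_id]
```

A caller stores data with `put("report.pdf", b"...", "alice")`, keeps the returned ID and later retrieves the payload with `get`. There is no folder path to navigate; everything is found through the metadata.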
There are pros and cons to both approaches, especially in the context of unstructured data storage. And, in the NAS vs. object storage debate, the type of storage that's right for your organization also depends on the kind of workloads supported.
The pros and cons
The key advantages of using NAS for unstructured data storage are that it's organized, at least insofar as you create a decent folder structure, and it's user-friendly. NAS is also ubiquitous, with many services supporting NFS or SMB storage. In addition, it's relatively fast and supports applications where the data changes quickly.
Scalability, on the other hand, isn't a strong suit of NAS. This is changing with the advent of more capable, scale-out offerings, but NAS still isn't close to the scalability potential inherent in object storage systems.
In fact, scale is the biggest benefit of object storage systems. Increasing capacity is a simple exercise, and it's largely invisible once you get beyond adding hardware. The reason: Many object storage systems scale out rather than up. All you have to do is add another node and then tell your management tool to add the new node to the cluster. The system takes care of the rest behind the scenes, and your cluster now has more storage capacity.
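What happens behind the scenes varies by product, but a toy model can show why adding a node is relatively painless. The sketch below assumes rendezvous (highest-random-weight) hashing for placing objects on nodes; that is one possible placement technique chosen for illustration, not a description of how any specific object store works. The `Cluster` class and the node names are hypothetical.

```python
import hashlib


class Cluster:
    """Hypothetical scale-out cluster: capacity grows by adding nodes."""

    def __init__(self):
        self.nodes = {}  # node name -> capacity in TB

    def add_node(self, name, capacity_tb):
        self.nodes[name] = capacity_tb

    @property
    def total_capacity_tb(self):
        return sum(self.nodes.values())

    def node_for(self, key):
        # Rendezvous hashing: score every (node, key) pair and pick the
        # node with the highest score for this key.
        def score(node):
            return hashlib.sha256(f"{node}:{key}".encode()).hexdigest()

        return max(self.nodes, key=score)


cluster = Cluster()
for name in ("node-1", "node-2", "node-3"):
    cluster.add_node(name, capacity_tb=100)

keys = [f"object-{i}" for i in range(1000)]
before = {k: cluster.node_for(k) for k in keys}

cluster.add_node("node-4", capacity_tb=100)  # scale out by one node
after = {k: cluster.node_for(k) for k in keys}

# Objects either stay where they were or move to the new node.
moved = [k for k in keys if before[k] != after[k]]
```

With this placement scheme, every object in `moved` ends up on `node-4`, and the existing nodes never have to trade data among themselves. That property is part of what keeps an expansion from being disruptive in this model.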
Performance is the challenge with traditional object stores. However, this is also changing with newer object storage products. Another downside to object storage is that both the metadata and the object data must be updated. So, if you have fast-changing data, the process can take longer than with NAS. Moreover, although there are gateways and somewhat standardized access protocols, such as S3, object storage standards aren't as consistent as those of their file-based cousins.
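The update cost can be sketched with a simplified comparison. It assumes an object store that replaces the whole payload on every change and refreshes a metadata record alongside it, which is common for S3-style systems but not universal. The function names and the dictionary-based "store" are hypothetical stand-ins, not a real API.

```python
def patch_file_in_place(path, offset, new_bytes):
    """File-style update: seek to the offset and overwrite only those bytes."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(new_bytes)
    return len(new_bytes)  # bytes written


def patch_object(blobs, metadata, key, offset, new_bytes):
    """Object-style update (assumed model): rebuild and store the full payload,
    then update the metadata record that describes it."""
    old = blobs[key]
    updated = old[:offset] + new_bytes + old[offset + len(new_bytes):]
    blobs[key] = updated  # the whole object is written again
    metadata[key] = {
        "size": len(updated),
        "version": metadata[key]["version"] + 1,
    }
    return len(updated)  # bytes written
```

Under these assumptions, changing 4 bytes of a 1,000-byte file writes 4 bytes, while the same change to a 1,000-byte object writes all 1,000 bytes plus a metadata update. Real systems differ in the details, but the gap grows with object size and change frequency.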
The bottom line for unstructured data storage
So, which approach comes out on top when it comes to NAS vs. object storage? In general, if you have applications with fast-changing data that need streamlined access, NAS is probably your best option. If you have workloads for which storage is more of an archive and that don't need a high level of native integration with applications, object is the way to go.
Scale also plays a part in your NAS vs. object storage decision. NAS systems differ wildly in how far they can scale, so it's possible that, at some point, you could grow beyond the limits of the NAS product you pick.