Unstructured data is among the fastest-growing data types. As organizations create and try to store more of it each year, a natural question arises: What's the best storage for unstructured data?
Unstructured data is information that doesn't adhere to a traditional database format. Text, in the form of email and documents, and multimedia -- such as photo, video and audio files -- are common examples of unstructured data. When looking for the best way to store unstructured data, NAS and object storage are the two primary choices.
NAS vs. object storage
NAS has been around for decades and puts a hierarchical system of directories and folders between the users and their files. This approach enables organizations to neatly categorize individual files for later use.
Object storage, on the other hand, doesn't impose a file system paradigm on data. Instead, object systems use metadata tables that exist separate from underlying data elements. The metadata table stores attributes that describe the underlying data, such as file name, creation date, user ID and the location from which the data can be retrieved.
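To make the contrast with a folder hierarchy concrete, here is a minimal in-memory sketch in Python. It is only an illustration: the `ToyObjectStore` class, its method names and the attribute names are hypothetical and not taken from any product, and real object stores keep metadata in distributed databases rather than a dictionary.

```python
import time
import uuid


class ToyObjectStore:
    """Hypothetical flat object store: no directories, only IDs and metadata."""

    def __init__(self):
        self._data = {}      # object ID -> raw bytes (flat namespace)
        self._metadata = {}  # object ID -> attribute dict, kept separately

    def put(self, payload, **attributes):
        """Store bytes plus descriptive attributes; return the new object ID."""
        object_id = uuid.uuid4().hex
        self._data[object_id] = payload
        self._metadata[object_id] = {
            "created": time.time(),
            "size": len(payload),
            **attributes,
        }
        return object_id

    def get(self, object_id):
        """Retrieve the raw bytes by ID alone; no path is involved."""
        return self._data[object_id]

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            oid for oid, meta in self._metadata.items()
            if all(meta.get(k) == v for k, v in criteria.items())
        ]


store = ToyObjectStore()
video_id = store.put(b"...video bytes...", file_name="demo.mp4", user_id="alice")
store.put(b"...report text...", file_name="q3.txt", user_id="bob")

# Lookup is by attribute, not by walking a folder tree.
alice_objects = store.find(user_id="alice")
```

The point of the sketch is that retrieval depends on an ID and on queryable attributes, whereas a NAS client would locate the same data by its path in a directory tree.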
There are pros and cons to both approaches, especially in the context of unstructured data storage. And, in the NAS vs. object storage debate, the type of storage that's right for your organization also depends on the kind of workloads supported.
The pros and cons
The key advantages of using NAS for unstructured data storage are that it's organized, at least insofar as you create a decent folder structure, and it's user-friendly. NAS is also ubiquitous, with many services supporting NFS or SMB storage. In addition, it's relatively fast and supports applications where the data changes quickly.
Scalability, on the other hand, isn't a strong suit of NAS. This is changing with the advent of more capable, scale-out offerings, but NAS still isn't close to the scalability potential inherent in object storage systems.
In fact, scale is the biggest benefit of object storage systems. Increasing capacity is a simple exercise, and it's largely invisible once you get beyond adding hardware. The reason: Many object storage systems scale out rather than up. All you have to do is add another node and then tell your management tool to add the new node to the cluster. The system takes care of incorporating the new node behind the scenes, and your cluster now has more storage capacity.
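What happens behind the scenes when a node joins varies by product. One widely used placement technique in scale-out systems is consistent hashing, which lets a new node take over a share of the data without reshuffling everything else. The Python sketch below is a toy illustration under that assumption only; `ToyCluster`, the node names and the virtual-node count are hypothetical and do not describe any particular vendor's implementation.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


class ToyCluster:
    """Hypothetical scale-out cluster that places objects by consistent hashing."""

    def __init__(self, vnodes=64):
        self._vnodes = vnodes  # virtual points per node, to smooth the spread
        self._ring = []        # sorted list of (position, node name)

    def add_node(self, name):
        """Insert the node's virtual points into the sorted ring."""
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{name}#{i}"), name))

    def node_for(self, object_id):
        """Return the node owning the first ring point at or after the key's hash."""
        pos = _hash(object_id)
        idx = bisect.bisect_left(self._ring, (pos, ""))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring


cluster = ToyCluster()
for node in ("node-1", "node-2", "node-3"):
    cluster.add_node(node)

object_ids = [f"object-{n}" for n in range(10_000)]
before = {oid: cluster.node_for(oid) for oid in object_ids}

cluster.add_node("node-4")  # scale out by one node
after = {oid: cluster.node_for(oid) for oid in object_ids}

moved = sum(1 for oid in object_ids if before[oid] != after[oid])
print(f"{moved / len(object_ids):.0%} of objects moved to the new node")
```

In this scheme, only the objects that now hash to the new node's ring positions change owner; everything else stays where it was, which is one reason adding capacity can feel invisible to applications.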
Performance is the challenge with traditional object stores. However, this is also changing with newer object storage products. Another downside to object storage is that both the metadata and the object data must be updated whenever data changes. So, if you have fast-changing data, the process can take longer than with NAS. Moreover, although there are gateways and somewhat standardized access protocols, such as Amazon S3, object storage standards aren't as consistent as their file-based cousins.
Features of NAS and object storage systems
Although file and object storage are significantly different approaches at the logical level, the actual storage subsystems can share a large suite of features and functionality. Enterprise users can consider an array of NAS or object features, such as:
- Data tiering and placement. NAS and object storage systems can use file tagging and object metadata policies to organize data into tiers -- placing more important or frequently accessed data into faster storage, while relegating less-critical data to less-expensive nearline disks.
- Global namespace. Creating a "namespace" abstracts storage from the corresponding application, enabling the application to find and access data wherever it is stored -- on any suitable NAS or object storage system -- as a key means of seamless storage scalability.
- Performance and multi-tenancy. The storage system must be capable of handling simultaneous users or applications without introducing latency that can cause application delays or errors. This requires internal processing power -- often with the ability to access disks in parallel -- and suitable network bandwidth.
- Data protection. Consider the data resiliency features of the NAS or object storage device, such as RAID, replication or distributed/cluster storage approaches. Data protection eliminates any single point of failure that could lead to data loss and can be a critical part of business continuance and compliance.
- Flexible access. NAS and object storage systems can provide various ways of accessing data, such as representational state transfer (REST) or Simple Object Access Protocol (SOAP) APIs, as well as suitable storage protocols, including CIFS and NFS for file storage, Lustre or PanFS for object storage, and even Hadoop Distributed File System if the storage system supports big data analytics.
- Management options. NAS and object storage system management can include a variety of features, including self-configuring, auto-healing and auto-rebalance -- i.e., file relocation to spread out disk access -- capabilities.
- Cloud interface. Some file and object storage systems can provide a cloud interface that can support a private cloud or interoperate with public cloud storage offerings to build a seamless cloud/local storage infrastructure.
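As a rough illustration of the data tiering and placement item in the list above, the following Python sketch picks a tier from an item's metadata. The `ItemMetadata` fields, the `critical` tag, the 30-day window and the tier names are all assumptions made for the example; actual products expose their own policy engines and terminology.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ItemMetadata:
    name: str
    last_accessed: datetime
    critical: bool = False  # hypothetical tag set by an administrator


def choose_tier(item, now, hot_window=timedelta(days=30)):
    """Return 'fast' for critical or recently used data, else 'nearline'."""
    if item.critical or now - item.last_accessed <= hot_window:
        return "fast"
    return "nearline"


now = datetime(2024, 1, 31)
items = [
    ItemMetadata("orders.db", datetime(2024, 1, 30)),
    ItemMetadata("2019-archive.tar", datetime(2020, 6, 1)),
    ItemMetadata("ledger.bak", datetime(2021, 3, 15), critical=True),
]

# Map each item name to the tier the policy selects for it.
placement = {item.name: choose_tier(item, now) for item in items}
```

Here the recently accessed file and the file tagged as critical land on fast storage, while the stale archive is relegated to the nearline tier.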
NAS and object use cases in the enterprise
NAS and object storage share the same fundamental purpose: storing data for enterprise users and applications. But the strengths and weaknesses of both technologies make them each suited for different uses.
NAS offers a more traditional approach to data storage and is ideally suited to a wide range of tasks, anywhere file data must be stored or accessed, such as:
- streaming or retrieving any form of media -- such as image, video, audio and text -- represented as a file rather than an unstructured object;
- storing raw data files for analytics;
- storing data backups or using the NAS as a file replication target;
- running an array of open source business applications, such as SugarCRM, Vtiger CRM, OrangeHRM, Synology Office, Mattermost (chat) or even a variety of email servers, web servers and content management systems like WordPress -- almost any business application where block-based SAN storage isn't required;
- storing, accessing and hosting VMs;
- using NAS to provide file storage in a private cloud, typically through a NAS manufacturer's browser-based UI; and
- using NAS storage for test and development tasks, such as web-based or server-based applications.
Object storage also stores data. But the flat (non-hierarchical), unstructured, metadata-based nature of objects makes object storage attractive for various storage applications in the enterprise, including:
- streaming or retrieving any form of media -- such as image, video, audio and text -- represented as an unstructured object rather than a traditional file;
- storing data for analytics where objects can be extremely large databases, as object storage is often the foundation of vast and highly scalable storage facilities such as data warehouses or even data lake deployments; and
- storing data backups, since object storage is often replicated or distributed, making object storage highly resilient for tasks -- such as DR, backup and long-term archival storage -- that require only infrequent access.
NAS and object storage in the cloud
As more users and applications use the public cloud, providers are delivering an array of storage services designed to emulate file and object -- as well as block and application-specific -- storage resources that can offer global accessibility, high durability and high resilience.
- NAS. File-based storage services include the following:
  - Amazon EFS
  - Azure Files
  - Google Filestore
- Object. Object-based storage services include the following:
  - Amazon S3
  - Azure Blob
  - Google Cloud Storage
Organizations that are just starting to work with public cloud services, are developing a hybrid cloud infrastructure or have ongoing local storage requirements might consider selecting storage systems that are compatible with public clouds.
The key to hardware/cloud compatibility is typically in the storage system's OS platform. For example, Cohesity SmartFiles supports varied Amazon services, including S3, GovCloud, Snowball, EFS, FSx for Windows File Server and Amazon FSx for NetApp ONTAP. As another example, NetApp platforms such as NetApp ONTAP 9 support Google Cloud Storage.
Common NAS and object storage platforms
There are many different NAS and object storage system product offerings. NAS platforms include the following:
- Arcserve OneXafe
- Buurst SoftNAS
- Ciphertex CX
- Cloudian HyperFile NAS
- CTERA Edge X Series
- DataDirect Networks (DDN) EXAScaler
- Dell EMC PowerScale
- Hitachi NAS (HNAS) Platform
- HP StorageWorks 4400 Scalable NAS
- HP StorageWorks X9000
- HPE 3PAR StoreServ
- HPE StoreEasy 1000 Storage
- IBM Scale Out Network Attached Storage (SONAS)
- IBM System Storage N Series
- iXsystems TrueNAS
- LaCie NAS
- NetApp V-Series
- NetApp FAS
- Netgear ReadyNAS
- Oracle Pillar Axiom 600
- Oracle Sun Storage 7000 Unified Storage System
- Oracle ZFS Storage Appliance
- OVHcloud NAS-HA
- Panasas ActiveStor
- QCT QuantaVault
- QNAP NAS
- Quantum ATFS
- SnapServer NAS
- Synology DiskStation Manager (DSM)
- Western Digital Ultrastar SATA Series
Object storage platform offerings include the following:
- Cloudian HyperStore
- DataCore Swarm
- Dell EMC ECS
- FalconStor StorSafe
- Huawei OceanStor family
- Hitachi Content Platform (HCP)
- Inspur AS13000G5 series
- NetApp StorageGRID
- Pure Storage FlashBlade
- Scality Ring
Storage systems must be selected carefully, based on requirements for factors such as storage capacity, form factor (tower or rack mount), network and I/O performance, resilience features and scalability.
The bottom line for unstructured data storage
So, which approach comes out on top when it comes to NAS vs. object storage? In general, if you have applications that work with fast-changing data and need streamlined access, NAS is probably your best option. If you have workloads for which storage is more of an archive, and you don't need a very high level of native integration with applications, object storage is the way to go.
Scale also plays a part in your NAS vs. object storage decision. NAS systems vary widely in how far they can scale, so it's possible that, at some point, you could grow beyond the limits of the NAS product you pick.