Data is the fuel that businesses run on, so the success of an enterprise may depend on how well the organization accommodates and cares for its data. That may seem like a cliché these days, but regardless of whether your business sells physical products or provides services, information about your suppliers, subcontractors, employees, processes and customers is vital to the health of your company. Data storage is at the heart of any computer system, so decisions you make about the types of data storage systems to deploy will likely be a key factor in determining how well your company's servers, networks and related infrastructure serve your business's needs.
The basic requirement of a data storage system is to securely contain useful data that users and applications can access easily and quickly when needed. Of course, there are other factors to consider to ensure the best fit, as there are many data storage options and storage system configurations, each with its own strengths and weaknesses. So, finding the best fit involves matching a storage system's strengths to your company's critical applications.
Beginning a Data Storage Assessment
To configure storage that will adequately serve your company, start by collecting some basic facts about the organization:
- Company size: Number of employees, discrete business units, number of locations
- Current computing environment: Number of installed servers, networks, desktop and laptop PCs, and other computing devices
- Applications currently in use: End-user productivity, back-office applications and business-specific applications
- Current use of cloud services: Software as a service, application development platforms, file share-and-sync, storage, backup or disaster recovery
These findings will help steer you toward the most appropriate storage solution for your environment. If any of the above conditions suggest that company information is widely dispersed among isolated servers, desktop/laptop computers, and cloud services such as Google Drive, it's a strong indication that a central external storage system that can function as a shared data repository is needed.
In addition to documenting your current environment, it's a good idea to put together a wish list of new capabilities that you hope your new storage system will enable. This could include new applications, additional data sources, disaster recovery planning, access for mobile devices, and so forth.
Finally, another important consideration is whether your company has an internal IT department (or plans to add one). In-house technical skills or the need to engage external expertise could figure into the type of storage you acquire, where it's located and how it's managed. Your company's data center facilities will also help determine the appropriate storage system.
Four Basic Types of Data Storage
Storage systems -- or storage arrays -- provide several distinct types of storage that are best for handling certain types of data.
File storage. This is the most familiar type of storage where the storage system manages data based on its affiliation with specific file types, such as documents, spreadsheets and PDFs. While data may be stored in different drive locations, the system sees them as constituents of a single file and stores that information, along with some basic metadata, in the system's directory.
Central storage systems for files are called NAS -- network-attached storage -- and can typically handle file system protocols associated with Linux and Windows operating systems:
- Linux: NFS (Network File System)
- Windows: SMB (Server Message Block) and the older CIFS (Common Internet File System)
NAS devices use the common Ethernet protocol to connect with servers and users, so integrating a NAS box into an existing network is relatively easy.
NAS is very good for hosting user shares for business productivity software, any file-oriented applications, such as media production, and for database applications that are not highly transactional.
Block storage. In block storage arrays, the system accesses data in chunks without relying on their association with a particular file. The access method is similar to the way a drive's native firmware accesses and manages the data it stores. So, block storage generally provides better performance than NAS arrays, particularly for applications that access large amounts of data frequently, such as databases supporting applications for online retail. Because of the way it breaks down data and accesses it, block storage is particularly effective for large files.
Block storage is typically deployed as a storage area network (SAN) that may be separate from a company's data network that links its servers and end users. SANs support one of these communications protocols:
- Fibre Channel. Fibre Channel, or FC, SANs are built on specialized networks that are specifically designed for high-performance data access. FC SANs are relatively expensive to implement as they require special FC interface cards, switches and routers.
- iSCSI. iSCSI is a lower cost alternative to FC because, like NAS, it can use an existing Ethernet network infrastructure to connect servers to the storage system.
Unified storage. Increasingly, vendors are offering storage arrays that can function as either a file or block device -- or both simultaneously. These arrays are referred to as unified storage, and may be ideal for companies that need both file and block storage but not enough of each to warrant purchasing separate arrays.
Object storage. Object storage is similar to file storage, but it is far more scalable and can often host petabytes of data in millions of files. It is used by many cloud storage services because of its scalability and the ability to attach detailed metadata to each file or object it stores. Performance, however, is an issue with object storage, so it is often implemented to store archival data. Also, many applications aren't able to access object storage directly.
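As a rough illustration of how the four storage types map to workloads, the descriptions above can be sketched as a small decision helper. This is only a sketch: the `Workload` fields, the function names and the rule ordering are assumptions made for illustration, and a real assessment would weigh capacity, budget and performance targets as well.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical summary of one application's storage needs."""
    name: str
    highly_transactional: bool  # e.g. a database behind an online store
    archival: bool              # rarely accessed, kept long term


def suggest_type(workload: Workload) -> str:
    """Map one workload to the storage type the text associates with it."""
    if workload.archival:
        return "object"   # scalable, but slower: suited to archives
    if workload.highly_transactional:
        return "block"    # best performance for frequent data access
    return "file"         # user shares, media, lighter databases


def suggest_arrays(workloads: list[Workload]) -> set[str]:
    """Combine per-workload suggestions; offer unified when both file and block appear."""
    types = {suggest_type(w) for w in workloads}
    if {"file", "block"} <= types:
        types -= {"file", "block"}
        types.add("unified")
    return types


if __name__ == "__main__":
    apps = [
        Workload("user-shares", highly_transactional=False, archival=False),
        Workload("orders-db", highly_transactional=True, archival=False),
        Workload("old-projects", highly_transactional=False, archival=True),
    ]
    print(sorted(suggest_arrays(apps)))
```

Here a company with both file and block needs is pointed toward a unified array, which matches the case where neither need is large enough to justify a separate system.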
Centralized vs. Converged Storage Systems
SAN, NAS, unified and object storage systems are all available as centralized, external storage arrays. Typically, these systems provide data access to the servers that attach to them via switches and routers.
A new storage system architecture has emerged over the past decade that decentralizes the storage resources while still providing the same type of shared access to data. These systems are called hyperconverged infrastructure (HCI) and are based on conventional server technology. Essentially, HCI systems are collections of interconnected servers that share computing power and storage among all servers in the cluster. Additional storage or compute can be added by introducing additional servers into the cluster. It's a flexible and often low-cost design as it uses existing technology -- servers -- as the main components. HCI systems are also available as software-only products that allow users to assemble the cluster with the servers and storage of their choice.
The Drives Inside a Storage System
Two types of drives are available in most storage arrays: hard disk drives (HDDs) and solid-state drives (SSDs). Hard disks have been around for more than fifty years and still make up a large share of the data storage that's purchased annually. They're inexpensive, offer high capacities and can provide reasonable read/write performance for many applications.
SSDs -- unlike HDDs -- are built around NAND flash technology and have no moving parts. SSDs provide performance that easily outstrips hard drives. Early issues with SSDs, such as longevity, capacity and price, have been addressed, and the storage medium is now reliable, commodious, and not that much more expensive than hard disk drives. Thus, the scales have tipped toward solid state; most arrays either mix SSDs with HDDs for a speed and capacity combination or are all-flash arrays (AFAs), which are growing in popularity.
Onsite vs. Offsite Storage
In most cases, when you purchase a new storage array it will reside in your company's data center. Of course, the data center must be able to accommodate the storage device with adequate network and environmental (space, power, air conditioning, etc.) infrastructure as well as personnel who can handle the system management. If staff size or expertise is an issue, you can host the storage on your premises and have an outside company manage the system.
A more radical alternative -- and a way to cut storage costs -- is to not install the storage on your site at all. You could, for example, purchase the system and then have it installed at a co-location service that will manage it.
Or, you can forgo purchasing storage equipment entirely and subscribe to a cloud storage service. Cloud storage is the most mature of all cloud services and is readily available from both large vendors (e.g., AWS, Google, Microsoft) and smaller regional cloud storage outfits. One of the great advantages of cloud storage is that you only pay for what you use -- so if your capacity requirements drop, your costs will too.
There are some requirements for a successful cloud storage implementation, such as having appropriate communications bandwidth. The main bottleneck with cloud storage is the link between the user and the service, so a fairly big pipe is needed for acceptable access to your data. This is especially true if you keep your applications on servers back in your office and the data they need in the cloud.
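To get a feel for what "a fairly big pipe" means, a back-of-the-envelope calculation helps. The sketch below estimates how long it takes to move a given amount of data over a link. The function name and the 80% efficiency figure are assumptions for illustration; real throughput depends on latency, protocol overhead and whatever else shares the connection.

```python
def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to move data_gb gigabytes over a link_mbps link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually usable for the transfer (0 < efficiency <= 1).
    """
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    megabits = data_gb * 1000 * 8               # decimal gigabytes to megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # 500 GB over a 100 Mbps link at the assumed 80% efficiency
    print(f"{transfer_hours(500, 100):.1f} hours")   # about 13.9 hours
    # The same data over a 1 Gbps link
    print(f"{transfer_hours(500, 1000):.1f} hours")  # about 1.4 hours
```

Under these assumptions, pulling 500 GB across a 100 Mbps connection takes most of a working day, which is why applications that stay in the office while their data lives in the cloud need a generous link.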
Some companies have found that a hybrid approach works best, where some storage is maintained on premises while the bulk of the data is stored at the cloud service. Applications can use the locally stored data and get good response times while the less frequently accessed data is tucked away offsite on cheaper storage.
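One simple way to picture the hybrid split is a rule that keeps recently used data local and sends everything else to the cloud. The sketch below is a minimal illustration of that idea. The 30-day threshold, the function name and the data set names are assumptions, and commercial tiering products apply more sophisticated policies.

```python
from datetime import datetime, timedelta


def place_data(last_access: dict[str, datetime], now: datetime, hot_days: int = 30) -> dict[str, str]:
    """Assign each data set to 'local' or 'cloud' based on when it was last accessed.

    hot_days is an assumed cutoff: anything accessed within that window
    stays on premises for fast response times.
    """
    cutoff = now - timedelta(days=hot_days)
    return {
        name: "local" if accessed >= cutoff else "cloud"
        for name, accessed in last_access.items()
    }


if __name__ == "__main__":
    now = datetime(2024, 6, 30)
    usage = {
        "current-orders": datetime(2024, 6, 28),     # used two days ago
        "2019-project-files": datetime(2023, 1, 15), # untouched for over a year
    }
    print(place_data(usage, now))
```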
Protect Your Data
Regardless of where your company's data is stored, it needs to be protected against loss or damage. For sensitive data, such as customers' credit card numbers, extra precautions must be taken, but any data that's important to your company must be safeguarded.
The most common way to protect data stored on in-house systems is by using backup and restore software that can copy your data files and send the copies to another location, such as:
- Another storage system, preferably at a remote company site
- A tape library, which would allow you to ship the tapes offsite to a vaulting service (or second company site)
- A cloud storage service
Alternatively, you could use a cloud-based backup-as-a-service provider, which either uses your locally installed backup software or provides its own. As with the hybrid storage scheme described earlier, cloud backup often leaves a copy of the most recent data on a local storage system so that data can be restored quickly if needed. If you opt for cloud backup, you might consider enlisting the services of a second cloud backup service to duplicate the data stored with the primary cloud backup provider so that two offsite copies are always available.
While planning your backup strategy, it's a good idea to consider a disaster recovery (DR) plan as well. DR differs from backup in that in addition to data, all your company's applications, server and operating system configurations, network configurations, etc., are duplicated to another site that can be brought online when the primary systems become unavailable. If you use a cloud backup service, ask if they also support DR and can spin up your services in the same cloud environment where your data is stored.