Cloud architecture and infrastructure will be hot topics for IT organizations in 2021 as they ramp up their efforts to analyze and gain value from the mountains of data that continue to pile up.
Some may follow the lead of the hyperscalers on cloud architecture. Others could escalate their hybrid cloud efforts or opt for on-premises IT as a service. Multi-cloud may make sense for some IT shops. AI, containers and composable infrastructure could be important for many as they try to take advantage of their data assets.
Those are just a sampling of the technologies and approaches that a panel of prominent industry CTOs and analysts cited in making predictions on enterprise infrastructure and cloud architecture. Here, vendor CTOs share their sometimes divergent views on trends that could have a significant impact in the industry or on their technology roadmaps through 2021 and beyond, and analysts provide clues on what's next.
Cloud architecture predictions
John Morris, CTO, Seagate Technology: There will be more widespread adoption of hyperscale cloud architecture by the non-cloud enterprise to get improvements in scale and efficiency that didn't exist in traditional enterprise architecture. [The public] cloud doesn't work for all use cases. Privacy and regulation affect decisions on whether [the public] cloud can be used. Over time, we're seeing a repatriation of data and infrastructure from cloud back into on-prem or private infrastructure. As repatriation occurs, enterprises will want to achieve similar levels of scale and efficiency that the cloud has.
Three key elements of hyperscale architecture are the use of object storage, the adoption of disaggregated and composable infrastructure, and the utilization of tiering to get the right type and ratios of storage.
Tom Black, SVP/GM, storage business unit, Hewlett Packard Enterprise: More organizations will turn to a cloud-everywhere strategy in 2021… Hyperscalers like AWS are moving on premises, offering compute and storage in customers' own data centers. The HPE GreenLake business is growing, with over $4 billion in contracts, demonstrating customer demand for cloud services.
I also have a fundamental belief moving forward that we are in a multi-cloud, hybrid IT world. A multi-cloud world is increasingly becoming a reality as businesses look to avoid cloud lock-in. Cloud platforms aren't one-size-fits-all. Different clouds are innovating in different areas. For example, [Google Cloud Platform] is well known for AI/analytics. So, one might use on-prem for mission-critical applications, AWS for DevOps, and GCP for AI/analytics.
Alex McMullan, international CTO, Pure Storage: Multi-cloud deployments will never happen. Nobody's going to want to spend the time to write an application twice or three times when they could have two separate applications. It's too hard. Someone will do it just to prove it can be done, but it'll be a Pyrrhic victory. The cost will just not bring the benefits, I'm afraid. I do buy the hybrid cloud, of course, but multi-cloud just doesn't make sense from a business perspective.
Andy Walls, IBM fellow, CTO IBM Flash Storage: The pandemic is going to accelerate the need and the desire for organizations to build a hybrid cloud, with the flexibility to go easily to the cloud and back, and take your data there and back and do analytics.
We saw a real push of organizations to move things to the cloud in 2020 in some sectors, but in others, there's actually more standing up of external storage. A lot of us were concerned the pandemic would result in this massive movement to the cloud, and external storage would really drop. But I don't see that happening. I think you're going to see much more of a requirement that the internal infrastructure is set up in such a way that it can be automated, more flexible, even funded more flexibly, and that important applications can move to the cloud and back.
Randy Kerns, senior strategist and analyst, Evaluator Group: There will be a greater focus this year on the deployment of private or hybrid cloud infrastructures focused on containers. Most customers already have a significant investment in virtual machine environments, primarily with VMware. But now a lot of developers are delivering cloud-native applications that are container-based.
In the past, we've seen it was a toe in the water, or separate groups within companies that were not part of the central IT organization -- the end users' operations staffs typically stand up separate environments for containers. What I think we're going to see now is more concerted efforts that need strategic direction from the IT organization, rather than it being done by fragmented groups. I'm seeing significant interest in infrastructures that support containers -- things like OpenShift or Rancher. Customers really want to deploy something that is more or less prepackaged and has support, so it can get into operational use more quickly.
Trends in enterprise infrastructure
Sudhir Srinivasan, SVP and CTO, storage division, Dell Technologies: This will be the proving year for traditional vendors to deliver IT as a service, and the next two to three years will be when the heat of the battle starts with the clouds. Customers have been consuming cloud IT as a service, but now they want it on prem. That led to offerings like AWS Outposts, etc. We announced Project Apex late last year. It gets customers out of the business of managing IT. It's managed by the vendor. All you do is consume it. You decide when you need more, and if you don't need it, you can decommission it and stop paying for it. It goes from a heavy capital outlay and expensive operational cost model to a streamlined, pure Opex model.
Octavian Tanase, SVP, hybrid cloud engineering, NetApp: Customers will look to adopt infrastructures that use artificial intelligence to store data on the right tier, ranging from traditional nearline SAS HDD to second-generation storage class memory. Storage operating systems will employ AI that learns an application's signature and decides the most cost-effective place to store its data based on an SLA.
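The idea can be sketched in a few lines. This is a hypothetical illustration, not NetApp's actual algorithm: the tier names, latency figures and costs are made-up placeholders. Given an application's SLA, pick the cheapest tier that still meets the latency bound.

```python
# Illustrative tier table: (name, typical read latency in microseconds,
# relative $/GB). All numbers are placeholders, not vendor data.
TIERS = [
    ("nearline_sas_hdd", 8000, 1.0),
    ("tlc_nvme_ssd", 100, 4.0),
    ("storage_class_memory", 10, 12.0),
]

def place(app_signature: dict) -> str:
    """Choose the cheapest tier whose latency satisfies the app's SLA."""
    sla_latency_us = app_signature["sla_latency_us"]
    # Walk tiers from cheapest to most expensive.
    for name, latency_us, _cost in sorted(TIERS, key=lambda t: t[2]):
        if latency_us <= sla_latency_us:
            return name
    # Nothing meets the SLA; fall back to the fastest tier.
    return min(TIERS, key=lambda t: t[1])[0]

print(place({"sla_latency_us": 500}))     # -> tlc_nvme_ssd
print(place({"sla_latency_us": 50_000}))  # -> nearline_sas_hdd
```

A real system would learn the latency and access-pattern "signature" per application rather than take it as input, but the placement decision reduces to the same cost-under-constraint choice.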
Marc Staimer, founder and president, Dragon Slayer Consulting: Most of the topline storage systems that are going after high performance will go from 100 gig to 200 gig per port or higher by the end of this year, and you'll see 400 Gbps as well.
Some IT organizations on the bleeding edge have already mandated that 2021 projects must be at 200 Gbps per port or faster, because the amount of data that needs to be analyzed is growing faster than the bandwidth can handle it. They've got to have the latest and greatest and fastest and biggest and widest. We're talking labs, financial technology companies, some AI/analytics. Basically, in AI/analytics, it's a throughput issue, not an IOPS issue. You need to get as much data as you can in the shortest period of time to be analyzed. By the end of this year, 200 gig will become the same standard that 100 gig is in that space today. It's not going to be at the mid-tier and below, and even a lot of enterprises won't go that route. But enterprises that are concerned with bandwidth and throughput will -- so Big Pharma, energy, media and entertainment.
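The arithmetic behind the throughput argument is straightforward. A rough sketch, assuming a single port and an arbitrary 90% link efficiency: at analytics scale the question is how long it takes to stream the whole dataset, not how many small I/Os per second the array can serve.

```python
def scan_time_seconds(dataset_tb: float, link_gbps: float,
                      efficiency: float = 0.9) -> float:
    """Time to stream a dataset over one port at a given line rate."""
    bits = dataset_tb * 1e12 * 8               # dataset size in bits
    usable_bps = link_gbps * 1e9 * efficiency  # assume ~90% of line rate
    return bits / usable_bps

# Streaming a 100 TB dataset over a single port:
for gbps in (100, 200, 400):
    print(f"{gbps} Gbps: {scan_time_seconds(100, gbps) / 3600:.1f} h")
```

Doubling the port speed from 100 to 200 Gbps halves the scan time, which is exactly the lever these throughput-bound shops care about.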
Eric Burgener, research vice president, IDC: Customers buying storage platforms for AI/ML-driven next-generation workloads will accelerate the growth of newer scale-out unstructured storage systems at the expense of more traditional designs, driving a noticeable impact in 2021.
Customers are struggling a bit with what is the right type of platform to deploy AI/ML-driven big data analytics workloads that have to handle both low-latency, random, small-file I/O and high-throughput, sequential large-file I/O at the same time. Scale-up file system designs are really good at the first, and scale-out parallel file systems are really good at the second, but neither is very good at both. Handling 20 PB+ environments is also a challenge, and there are more customers that have already run into those limits with existing parallel file systems. This opens up the opportunity for newer designs. We'll see startups with true parallel file systems that implement distributed metadata, a unified global namespace that spans both file and object, and data protection and data reduction algorithms specifically rewritten to operate at petabyte-plus scale gain momentum.
New CXL interconnect
Raj Hazra, senior vice president of emerging products and corporate strategy, Micron: A major interconnect called CXL, or Compute Express Link, is going to open the door to an unprecedented level of system architecture innovations starting in the next 18 to 24 months. CXL changes how you connect memory and storage to compute.
Today, DRAM is connected to the CPU via a memory bus. The type and generation of DRAM, such as DDR4 or DDR5, is dictated by the vendor CPU. Similarly, you cannot easily mix and match different types of devices -- memory, accelerators, FPGAs and GPUs -- because they all use different interfaces determined by the CPU's characteristics, with physical limitations on the number of devices you can connect. CXL is an open interface that standardizes a single interconnect for all types of devices. It's not a proprietary interface that you have to license for a particular CPU. It provides the ability to connect the CPU to anything. We see CXL as a necessary step forward to create more interesting memory and storage hierarchies to allow systems to address diverse workload needs.
Extracting value from data
Kristie Mann, senior director, Optane persistent memory products, Intel: COVID really shook things up and started to speed up the emergence of AI and the move to everything digital. I'm seeing it everywhere in the proof of concepts that we're running with customers. What every one of them is trying to do is take what used to be their standard database infrastructure and build the capability on top of it to do analytics. We see increased adoption of more sophisticated retail and video recommendation engines, more interactive multiplayer gaming, more content being generated at the edge. We've seen lots of interest from the financial services industry in using analytics to do more mature credit card fraud detection. Customers are coming to us with ideas, and that's when you start to see the tipping point of a transition.
Morris, CTO, Seagate Technology: We're seeing 30% growth in data, but the percentage of data that gets stored is going down year over year, from maybe 5% to less than 3% over the next five years. Why? Part of it is the economics of data, and part is extracting the value from that data. So, we will see more widespread adoption of machine learning technology at the sources of data creation, at endpoints and the edge, to unlock the trapped value in data. And, assuming we can create the economics for it, I would expect to see more data stored. One of the better examples of a use-case-specific solution right now is autonomous vehicles, but more will emerge over time.
Kerns, Evaluator Group: We're going to see more data flow accelerator and processor offload, initially focused in the areas of machine learning and AI and then moving into more traditional IT environments. Once you have an environment that exploits solid-state devices and NVMe protocols between the storage and the server, you've already done a lot of acceleration there.
The next bottleneck is in the ability to process or handle the data. The central processor dedicates more and more work to try and process the data quicker. There are several initiatives to get data provided faster for computation and free up the processor. One is putting the intelligence in a plug-in card like a SmartNIC. VMware is going to be the first one to really drive this message home. VMware's Project Monterey, which uses the Nvidia BlueField SmartNIC, offloads the data handling. Another way is with computational storage, to move some processing to the individual devices so I don't have to transfer as much data. I think 2021 will be the year that everybody starts to see the value.
Walls, IBM: Computational storage is going to take off for two reasons -- the SNIA working group that has formed and the accelerating need of the hyperscalers to do this. A lot of processor time is wasted just doing 'if' statements in order to do AI and process all the data you need. If you could have the SSDs look for all of that data instead and just send it, then you save processing.
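The 'if' statements Walls describes are predicates, and the computational-storage idea is to evaluate them where the data lives. A toy sketch, with a hypothetical Drive class standing in for a real device: the host gets identical results either way, but the device-side query ships back only the matching records instead of the whole dataset.

```python
class Drive:
    """Toy stand-in for a storage device holding records."""

    def __init__(self, records):
        self.records = records

    def read_all(self):
        # Host-side filtering: every record crosses the bus.
        return list(self.records)

    def query(self, predicate):
        # "Computational storage": the predicate runs on the device,
        # so only matching records are transferred.
        return [r for r in self.records if predicate(r)]

drive = Drive([{"id": i, "temp": 20 + i % 50} for i in range(1_000_000)])

# Host-side: transfer 1,000,000 records, then filter on the CPU.
hot_host = [r for r in drive.read_all() if r["temp"] > 65]

# Device-side: transfer only the matching records (80,000 here).
hot_dev = drive.query(lambda r: r["temp"] > 65)
assert hot_host == hot_dev
```

Real computational storage drives and SmartNICs run the predicate on an embedded processor or FPGA rather than in Python, but the payoff is the same: the host CPU and the interconnect only see data that passed the filter.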
Tackling the security challenge
Morris, Seagate: Security has been a challenging area, and you hear about some kind of issue pretty much every day. We will see more widespread adoption of provenance capabilities to ensure that either the device or the data is uncompromised. For example, blockchain is a key enabler, where you can have multiple parties engage in a transaction with the ability to establish a trusted exchange. There are a number of significant open projects. For example, Microsoft's Project Cerberus and Google's OpenTitan seek to establish standards around infrastructure support for root of trust. Today, there is no easy way to establish root of trust across a wide array of components.
The adoption of a standard will allow components from a variety of vendors to deploy root of trust capability across the whole infrastructure. It's going to take multiple years before we see widespread adoption, but it is already beginning to happen with prototypes and proofs of concept. I think the earliest adoption of root of trust capabilities is going to be at the system level, and then it'll spread to the system components, including hard disk and SSDs.
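The core mechanism behind root-of-trust projects is a measurement chain: each boot stage extends a running hash with a measurement of the next component, so tampering with any component changes the final digest. A minimal sketch of that idea, illustrative only; real designs such as Cerberus and OpenTitan anchor the chain in dedicated hardware rather than software.

```python
import hashlib

def extend(chain: bytes, component: bytes) -> bytes:
    """Extend the measurement chain, TPM-style: H(chain || H(component))."""
    return hashlib.sha256(chain + hashlib.sha256(component).digest()).digest()

def measure(components: list) -> bytes:
    """Measure a boot sequence, starting from a fixed hardware-held value."""
    chain = b"\x00" * 32  # initial value held by the hardware root of trust
    for component in components:
        chain = extend(chain, component)
    return chain

boot = [b"bootloader-v2", b"firmware-v7", b"os-kernel-5.10"]
golden = measure(boot)  # expected digest, recorded at provisioning time

# An attacker swaps the firmware image; the measurement no longer matches.
tampered = [b"bootloader-v2", b"firmware-evil", b"os-kernel-5.10"]
assert measure(tampered) != golden
```

Because each step folds in the previous chain value, order matters too: the same components measured in a different sequence also fail to reproduce the golden digest, which is what lets a verifier attest the whole stack from a single hash.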