
https://www.techtarget.com/searchstorage/feature/The-significance-of-parallel-I-O-in-data-storage

The significance of parallel I/O in data storage

By Jon Toigo

Based on recent Storage Performance Council SPC-1 benchmark results, we are poised for a watershed moment in data storage performance. You could call it -- as the leader in the new technology, DataCore Software, almost certainly will -- parallel I/O.

SPC-1 measures the I/O operations per second (IOPS) handled by a storage system under a predefined enterprise workload that typifies the random queries and updates commonly found in OLTP, database and mail server applications. It's similar to a Transaction Processing Performance Council benchmark in the database world. In practical terms, SPC-1 rates the IOPS a data storage infrastructure can handle and, by virtue of the price of the kit evaluated, the cost per IOPS.

Until parallel I/O technology (re-)emerged, the two SPC-1 metrics generally rose together: IOPS could be accelerated, usually via expensive and proprietary hardware enhancements, which in turn drove up the cost per IOPS. Parallel I/O from DataCore Software breaks that relationship: IOPS climbs while the cost of the kit -- and with it the cost per IOPS -- falls. The result is steady improvement in I/O performance at a steadily decreasing price.
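
To make the price-performance metric concrete, here is a minimal sketch in Python. The dollar and IOPS figures are hypothetical, chosen only to illustrate how the two SPC-1 numbers relate; they are not drawn from any published result.

# Hypothetical figures, not actual SPC-1 results, used only to show how
# the two headline SPC-1 metrics relate to one another.
def cost_per_iops(total_system_price_usd: float, measured_iops: float) -> float:
    """SPC-1 price-performance: tested system price divided by SPC-1 IOPS."""
    return total_system_price_usd / measured_iops

# A traditional array that buys its performance with proprietary hardware:
print(f"${cost_per_iops(1_500_000, 500_000):.2f} per IOPS")    # $3.00 per IOPS

# A commodity server running a parallel I/O software stack:
print(f"${cost_per_iops(120_000, 1_500_000):.2f} per IOPS")    # $0.08 per IOPS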

Origins of parallel I/O

The term parallel I/O may sound like exotic new technology to some, but it is a simple concept based on well-established technology -- albeit technology that hasn't been much discussed outside the rarefied circles of high-performance computing for nearly three decades.

Parallel I/O is a subset of parallel computing, an area of computer science research and development that was all the rage from the late 1970s through the early 1990s. Back then, computer scientists and engineers worked on algorithms, interconnects and mainboard designs that would let them install and operate multiple low-performance central processor chips in parallel to support the requirements of new high-performance transaction processing applications. Those development efforts mostly fell on hard times when Intel and others pushed to the forefront a microprocessor architecture that used a single processor chip design (Unicore) and a serial bus to move application instructions in and out of memory and deliver I/O to the storage infrastructure.

The Unicore processor-based system -- on which the PC revolution, most client-server computing and the bulk of distributed server computing technology came to be based -- dominated the business computing scene for approximately 30 years. In accordance with Moore's Law, Unicore technology saw a doubling of transistors on a chip every two years; in accordance with House's Hypothesis, chip clock speeds doubled at roughly the same clip.

The impact of that progression put the kibosh on multiprocessor parallel processing development. PCs and servers based on Unicore CPUs evolved too quickly for parallel computing developers to keep pace. By the time a more complex and difficult-to-build multichip parallel processing machine could be designed, its performance had already been eclipsed by faster single-processor systems with a serial bus. Even as applications became more I/O-intensive, Unicore computers met their requirements with brute-force improvements in chip capacities and speeds.

Until they didn't. At the beginning of the millennium, House's Hypothesis fell apart. The trend line for processor clock rates became decoupled from the trend line for transistors per integrated circuit. For a number of technical reasons, mostly related to power consumption and heat, chip speeds plateaued. Instead of producing faster Unicore chips, developers began shipping multicore chips, capitalizing on the ongoing doubling of transistors on a chip die that Moore had forecast.

Today, multicore processors are de rigueur in servers, PCs, laptops, and even tablets and smartphones. Though some observers have failed to notice, multicore is quite similar to multichip from an architectural standpoint, which reopens the possibilities of parallel computing, including parallel I/O, for improved application performance.

Parallel I/O will improve performance

Most applications are not written to take advantage of parallel processing. Even the most sophisticated hypervisor-based software, while it may use separate processor cores to host specific virtual machines (VMs), still assigns each logical core to a VM and processes that VM's hosted application workload sequentially. Below this layer of application processing, however, parallelism can be applied to improve overall performance. That's where parallel I/O comes in.

DataCore has resurrected some of the parallel I/O algorithms that company chief scientist and co-founder Ziya Aral was working on back in the heyday of multiprocessor systems engineering and implemented them using the logical cores of a multicore chip. Each physical core in a multicore processor supports multithreading in hardware, which lets operating systems and hypervisors make more efficient use of chip resources. The combination of physical cores and multithreading presents the system with two or more logical cores for each physical core.
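
The physical-versus-logical split is easy to see on any machine. The following sketch assumes the third-party psutil package is installed; Python's built-in os.cpu_count() reports only the logical count.

# Minimal sketch: report physical cores vs. the logical cores exposed by
# simultaneous multithreading (e.g., Intel Hyper-Threading).
import psutil  # third-party package, assumed installed via `pip install psutil`

physical = psutil.cpu_count(logical=False)  # physical cores
logical = psutil.cpu_count(logical=True)    # logical cores (hardware threads)

print(f"Physical cores: {physical}")
print(f"Logical cores:  {logical} ({logical // physical} per physical core)")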

With so many logical processors available, many cores go unused. DataCore's technology takes some of those idle cores and creates a parallel processing engine developed explicitly to do nothing but service I/O requests from all hosted applications. Such an engine enables I/O to be processed in and out of many applications concurrently -- rather than sequentially -- which translates into much less time spent servicing I/O. This is parallel I/O, and it is exactly what DataCore has now demonstrated with the Storage Performance Council SPC-1 benchmark.
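
A toy sketch illustrates the difference between servicing I/O requests one at a time and servicing them concurrently. This is not DataCore's engine; a generic Python thread pool simply stands in for the idea of spare logical cores dedicated to nothing but I/O work.

# Toy comparison of serial vs. concurrent servicing of simulated I/O requests.
# A generic thread pool stands in for cores dedicated to I/O; this illustrates
# the concept only, not DataCore's implementation.
import time
from concurrent.futures import ThreadPoolExecutor

def io_request(req_id: int) -> int:
    time.sleep(0.05)  # simulate a 50 ms storage round trip
    return req_id

requests = list(range(32))

start = time.perf_counter()
for r in requests:                                  # serviced one at a time
    io_request(r)
serial_time = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:     # eight "I/O cores"
    list(pool.map(io_request, requests))            # serviced concurrently
parallel_time = time.perf_counter() - start

print(f"Sequential: {serial_time:.2f} s  Concurrent: {parallel_time:.2f} s")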

By itself, parallel I/O may not seem like much more than an interesting nuance in data storage stack design. But its practical implications are significant.

27 Jan 2016
