Parallel processing is a method in computing of running two or more processors (CPUs) to handle separate parts of an overall task. Breaking up different parts of a task among multiple processors will help reduce the amount of time to run a program. Any system that has more than one CPU can perform parallel processing, as well as multi-core processors which are commonly found on computers today.
Multi-core processors are IC chips that contain two or more processors for better performance, reduced power consumption and more efficient processing of multiple tasks. These multi-core set-ups are similar to having multiple, separate processors installed in the same computer. Most computers may have anywhere from two-four cores; increasing up to 12 cores.
Parallel processing is commonly used to perform complex tasks and computations. Data scientists will commonly make use of parallel processing for compute and data-intensive tasks.
How parallel processing works
Typically a computer scientist will divide a complex task into multiple parts with a software tool and assign each part to a processor, then each processor will solve its part, and the data is reassembled by a software tool to read the solution or execute the task.
Typically each processor will operate normally and will perform operations in parallel as instructed, pulling data from the computer’s memory. Processors will also rely on software to communicate with each other so they can stay in sync concerning changes in data values. Assuming all the processors remain in sync with one another, at the end of a task, software will fit all the data pieces together.
Computers without multiple processors can still be used in parallel processing if they are networked together to form a cluster.
Types of parallel processing
There are multiple types of parallel processing, two of the most commonly used types include SIMD and MIMD. SIMD, or single instruction multiple data, is a form of parallel processing in which a computer will have two or more processors follow the same instruction set while each processor handles different data. SIMD is typically used to analyze large data sets that are based on the same specified benchmarks.
MIMD, or multiple instruction multiple data, is another common form of parallel processing which each computer has two or more of its own processors and will get data from separate data streams.
Another, less used, type of parallel processing includes MISD, or multiple instruction single data, where each processor will use a different algorithm with the same input data.
Difference between Serial and parallel processing
Where parallel processing can complete multiple tasks using two or more processors, serial processing (also called sequential processing) will only complete one task at a time using one processor. If a computer needs to complete multiple assigned tasks, then it will complete one task at a time. Likewise, if a computer using serial processing needs to complete a complex task, then it will take longer compared to a parallel processor.
History of parallel processing
In the earliest computers, only one program ran at a time. A computation-intensive program which would take one hour to both run as well as and tape copying program that took one hour to run would take a total of two hours to run. An early form of parallel processing allowed the interleaved execution of both programs together. The computer would start an I/O operation, and while it was waiting for the operation to complete, it would execute the processor-intensive program. The total execution time for the two jobs would be a little over one hour.
The next improvement was multiprogramming. In a multiprogramming system, multiple programs submitted by users were each allowed to use the processor for a short time. To users, it appeared that all of the programs were executing at the same time. Problems of resource contention first arose in these systems. Explicit requests for resources led to the problem of the deadlock, where simultaneous requests for resources would effectively prevent program from accessing the resource. Competition for resources on machines with no tie-breaking instructions lead to the critical section routine.
Vector processing was another attempt to increase performance by doing more than one thing at a time. In this case, capabilities were added to machines to allow a single instruction to add (or subtract, or multiply, or otherwise manipulate) two arrays of numbers. This was valuable in certain engineering applications where data naturally occurred in the form of vectors or matrices. In applications with less well-formed data, vector processing was not so valuable.
The next step in parallel processing was the introduction of multiprocessing. In these systems, two or more processors shared the work to be done. The earliest versions had a master/slave configuration. One processor (the master) was programmed to be responsible for all of the work in the system; the other (the slave) performed only those tasks it was assigned by the master. This arrangement was necessary because it was not then understood how to program the machines so they could cooperate in managing the resources of the system.
SMP and MMP
Solving these problems led to the symmetric multiprocessing system (SMP). In an SMP system, each processor is equally capable and responsible for managing the flow of work through the system. Initially, the goal was to make SMP systems appear to programmers to be exactly the same as a single processor, multiprogramming systems. However, engineers found that system performance could be increased by someplace in the range of 10-20% by executing some instructions out of order and requiring programmers to deal with the increased complexity (the problem can become visible only when two or more programs simultaneously read and write the same operands; thus the burden of dealing with the increased complexity falls on only a very few programmers and then only in very specialized circumstances). The question of how SMP machines should behave on shared data is not yet resolved.
As the number of processors in SMP systems increases, the time it takes for data to propagate from one part of the system to all other parts also increases. When the number of processors is somewhere in the range of several dozen, the performance benefit of adding more processors to the system is too small to justify the additional expense. To get around the problem of long propagation times, a message passing system mentioned earlier was created. In these systems, programs that share data send messages to each other to announce that particular operands have been assigned a new value. Instead of a broadcast of an operand's new value to all parts of a system, the new value is communicated only to those programs that need to know the new value. Instead of shared memory, there is a network to support the transfer of messages between programs. This simplification allows hundreds, even thousands, of processors to work together efficiently in one system. Hence such systems have been given the name of massively parallel processing (MPP) systems.
The most successful MPP applications have been for problems that can be broken down into many separate, independent operations on vast quantities of data. In data mining, there is a need to perform multiple searches of a static database. In artificial intelligence, there is a need to analyze multiple alternatives, as in a chess game. Often MPP systems are structured as clusters of processors. Within each cluster the processors interact as in an SMP system. It is only between the clusters that messages are passed. Because operands may be addressed either via messages or via memory addresses, some MPP systems are called NUMA machines, for Non-Uniform Memory Addressing.
SMP machines are relatively simple to program; MPP machines are not. SMP machines do well on all types of problems, providing the amount of data involved is not too large. For certain problems, such as data mining of vast databases, only MPP systems will serve.
Continue Reading About parallel processing
- Carnegie-Mellon University hosts a listing of supercomputing and parallel processing research terms and links.
- At the University of Wisconsin, Doug Burger and Mark Hill have created The WWW Computer Architecture Home Page .