tags:
- dnc
5. Data parallelism #
Created Tue Jul 30, 2024 at 12:15 AM
Why parallel algorithms #
There are situations when runtime speed is needed, if not complete speed;
- Animations and games
- Search engine results
Here breaking down a problem so it can be solved by multiple computers/processors and combining the outputs is essential.
Parallelism and DnC #
Since each input is solved in isolation, it’s possible to employ multiple processors to make the work happen faster.
Principle of parallel algorithms #
We distribute the problem into parts, solve them and recombine. Each step has a cost. If the sum this is lower than sequential algorithm, significantly, then use parallel algorithms.
Data parallelism (heterogeneous speedup) #
Suppose a task involves working on independent aspects, then parallelism is a good fit here because:
- No need of managing communication between parts of the program
- Guaranteed speedup
Note: this kind of parallelism differs from the more involved algorithms where parts of the program communicate and synchronize to build the output.
Pitfalls of parallelism #
- Embarrassingly parallel problems - problems that can be trivially solved using parallelism. These are thought of as uninteresting, but they work fine.
- Hard to parallelize problems - suppose a problem involved looking at all pairs of a data. Here a parallel algorithm would not be good since not all pairs can be on the same processor. So parallelism shouldn’t be considered here.
- Naive parallelism - sometimes an algorithm/problem may be parallelizable. But it could be that the sequential algorithm can be on par or even better than the parallel algo. E.g. minmax game-tree search. This can be parallelized easily (one branch one processor), but its better to alpha-beta pruning, which is almost on par with sequential. Try to better the sequential algorithm before assuming parallelization is always better.
- Cost - even if adding cores increases speedup, cores aren’t inexpensive.
- Tooling is bad - Performance-tuning tools such as profilers are better developed for sequential machines/programs than for parallel models.
- Tough to debug - since parallel algorithms have a component of communication that needs to be managed by the programmer, multiple runs with the same input could lead to different executions.