“Big data” (the gathering, manipulation, analysis, and reporting of data from one or more data sets too large to be managed by traditional means) has had a big problem: the sheer quantity of data is more than a single computer, or even a high-end virtual or physical server with multiple CPU cores, can process efficiently. It is much better to divide the work among several computers or servers operating in parallel.
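
The idea of dividing the work can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real big data system: worker threads on one machine stand in for separate computers or servers, the "job" is a simple sum, and the names `split_into_chunks`, `process_chunk`, and `parallel_total` are hypothetical helpers invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, num_workers):
    """Split records into at most num_workers contiguous chunks of near-equal size."""
    chunk_size = max(1, -(-len(records) // num_workers))  # ceiling division
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def process_chunk(chunk):
    """Hypothetical per-worker job: sum the values in one chunk."""
    return sum(chunk)


def parallel_total(records, num_workers=4):
    """Hand each chunk to its own worker, then combine the partial results."""
    chunks = split_into_chunks(records, num_workers)
    # Assumption: threads stand in for the separate servers described above.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_sums = list(pool.map(process_chunk, chunks))
    return sum(partial_sums)
```

For example, `parallel_total(list(range(1, 101)))` returns 5050, the same answer a single loop would give. The point of the sketch is the shape of the work: split the data, process the pieces independently, then combine the partial results.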