Dask is a flexible parallel computing library for analytics. See documentation for more information. Dask allows distributed computation in Python.Dask provides advanced parallelism for analytics, enabling performance at scale for the tools you love.
This chart will deploy the following:
Dask is open source and freely available. It is developed in coordination with other community projects like Numpy, Pandas, and Scikit-Learn
Dask uses existing Python APIs and data structures to make it easy to switch between Numpy, Pandas, Scikit-learn to their Dask-powered equivalents.
You don’t have to completely rewrite your code or retrain to scale up.
Dask’s schedulers scale to thousand-node clusters and its algorithms have been tested on some of the largest supercomputers in the world.
But you don’t need a massive cluster to get started. Dask ships with schedulers designed for use on personal machines. Many people use Dask today to scale computations on their laptop, using multiple cores for computation and their disk for excess storage.
Not all computations fit into a big data frame.
Dask exposes lower-level APIs letting you build custom systems for in-house applications. This helps open source leaders parallelize their own packages and helps business leaders scale custom business logic.
Tell us about a new Kubernetes application
Never miss a thing! Sign up for our newsletter to stay updated.
Discover and learn about everything Kubernetes