We have released Python Polars 1.0. We are very excited to reach this milestone. Since its inception four years ago, Polars has gained 27.5K GitHub stars, has over 7 million monthly downloads, and is used by many companies in their production workloads.
Though these are just statistics, we are truly proud of what we have achieved in four years and how Polars has become a serious alternative to entrenched libraries.
Last year, the project ventured into a new phase with the start of the Polars company. This greatly improved open-source development hours and quality. We thank our core team and contributors who helped achieve the rapid development and improvement of Polars.
Production readiness
With this release, we signify that the Polars in-memory engine and API are production ready. We are convinced that Polars is in a state where it is one of the best open-source choices for fast data modeling that focuses on vertical scaling. We are confident that the core of our API is solid and offers a strong base for further improvements to Polars. Another driving factor in this conviction is that the project is now backed by the Polars company, which can guarantee continuous effort and support.
Future plans
Releasing 1.0 doesn’t mean Polars is finished. We still have big plans for improving functionality, scalability and performance. This major version release marks a point in time where the separation between the API and the actual implementation is solid enough that we can continue improving in a backward-compatible manner.
We don’t share our roadmap very often, but this is a good occasion to share a bit of what’s coming.
New engine design
Most notably, we are completely redesigning our streaming engine. This is a novel design that combines morsel-driven parallelism with Rust’s async state machines. The result is a hybrid push/pull-based engine that pairs the cache locality, parallelism and NUMA-awareness of morsel-driven parallelism with flexible operator designs that compile down to state machines, where the complexity is handled by rustc.
Streaming is very fast and memory efficient for many workloads, but it isn’t the best-performing design for many time series operations. Operations like rolling windows, window functions, and so on need more synchronization during streaming than when run in-memory. For these operations, we will fall back to the Polars in-memory engine if needed. This ensures we achieve optimal performance on every workload.
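To make the difference concrete, here is a minimal pure-Python sketch. It is not the Polars engine, and the names `morsels`, `streaming_sum` and `streaming_rolling_sum` are hypothetical, chosen only for illustration. A plain sum can reduce each morsel independently and combine the partial results at the end, while a rolling sum has to process morsels in order and carry the tail of each morsel into the next one, which is one form of the extra synchronization mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List, Optional


def morsels(values: List[float], size: int) -> Iterator[List[float]]:
    """Yield fixed-size chunks ("morsels") of the input."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def streaming_sum(values: List[float], size: int) -> float:
    """Reduce every morsel independently, then combine the partial sums."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(sum, morsels(values, size))
        return sum(partials)


def streaming_rolling_sum(
    values: List[float], size: int, window: int
) -> List[Optional[float]]:
    """Rolling sum over morsels that must be processed in order.

    The last `window - 1` values seen so far are carried into the next
    morsel, so morsels cannot be handled independently of each other.
    """
    out: List[Optional[float]] = []
    carry: List[float] = []
    for morsel in morsels(values, size):
        buf = carry + morsel
        # Only emit results for the new values, not for the carried ones.
        for i in range(len(carry), len(buf)):
            lo = i - window + 1
            out.append(sum(buf[lo:i + 1]) if lo >= 0 else None)
        carry = buf[-(window - 1):] if window > 1 else []
    return out
```

In this sketch the sum parallelizes trivially, whereas the rolling sum is forced into a sequential loop with shared state between morsels.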
GPU acceleration with NVIDIA RAPIDS
Another very exciting development is bringing GPU acceleration to Polars. A lot of progress has been made in this area and we can already run a substantial part of our test suite on the GPU. Combining GPU acceleration with the Polars optimizer leads to optimal performance and reduced memory pressure on the GPU. The GPU will take care of the parallelism, and our optimizer ensures that the minimum amount of work is executed.
Polars Cloud
With Polars Cloud, we aim to provide a managed service for organizations to reduce the complexity of hosting and scaling Polars. The development of Polars Cloud is progressing at a steady pace and we expect to be able to start initial beta testing this year. Many of Polars Cloud’s requirements directly lead to improvements to the open-source project. For instance, the new 1.0 release brings support for scanning Hive partitioned datasets in the cloud, caching cloud files and extending cloud support to many more formats. We are committed to using the Polars open-source engine as our runners in the managed service, ensuring that improvements benefit users of Polars Cloud as well as open-source users.
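For readers unfamiliar with the Hive partitioning mentioned above: column values are encoded in directory names as `key=value` segments, so a reader can skip whole files based on the path alone. The following is a minimal pure-Python sketch of that path convention only. The helpers `parse_hive_partitions` and `prune` and the bucket paths are hypothetical, and this is not how Polars implements scanning.

```python
from typing import Dict, List


def parse_hive_partitions(path: str) -> Dict[str, str]:
    """Extract `key=value` directory segments from a file path."""
    partitions: Dict[str, str] = {}
    # Skip the last segment, which is the file name itself.
    for segment in path.split("/")[:-1]:
        key, sep, value = segment.partition("=")
        if sep and key:
            partitions[key] = value
    return partitions


def prune(paths: List[str], **wanted: str) -> List[str]:
    """Keep only the files whose partition values match `wanted`."""
    return [
        p for p in paths
        if all(parse_hive_partitions(p).get(k) == v for k, v in wanted.items())
    ]


# Hypothetical dataset layout, for illustration only.
paths = [
    "s3://bucket/sales/year=2023/month=12/part-0.parquet",
    "s3://bucket/sales/year=2024/month=05/part-0.parquet",
    "s3://bucket/sales/year=2024/month=06/part-0.parquet",
]
selected = prune(paths, year="2024")
```

Here `selected` contains only the two 2024 files, without any file having been opened.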
Other short-term plans
Some other notable plans on the short-term roadmap include right joins, non-equi joins, extended metadata support, join re-ordering optimization, and extended SQL support.
Versioning philosophy
It is sometimes said that “the standard library is where modules go to die”. Though a bit of an exaggeration, we do feel that holding on tight to a 1.0 version can hold a project back. If we have flawed designs, we will issue a new breaking release. However, we expect the frequency and severity of these changes to strongly diminish over time.
Influenced by Rust’s nightly system, we also have an unstable API. This is functionality added to Polars that can be very useful, but we are not yet certain about its mechanics or API design. Such methods are marked as unstable. They can be used and are tested for correctness, but they may change in the future without it being considered a breaking change, so use these at your own risk.
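As an illustration of the general idea, a library can mark functionality as unstable with a decorator that emits a dedicated warning on use. The sketch below uses only the Python standard library. The names `UnstableWarning`, `unstable` and `experimental_mean` are hypothetical and are not presented as the actual Polars implementation.

```python
import functools
import warnings


class UnstableWarning(Warning):
    """Hypothetical warning category for functionality that may change."""


def unstable(func):
    """Mark `func` as unstable: every call emits an UnstableWarning."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"`{func.__name__}` is unstable and may change "
            "without it being considered a breaking change.",
            UnstableWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
        return func(*args, **kwargs)

    wrapper.is_unstable = True  # lets tooling and docs detect the marker
    return wrapper


@unstable
def experimental_mean(values):
    """Hypothetical unstable function, used only for illustration."""
    return sum(values) / len(values)
```

Users who accept the risk can silence the category with `warnings.simplefilter("ignore", UnstableWarning)`, while a test suite can turn it into an error to catch accidental use.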
Our full versioning policy is available in our documentation.
Last words
For help with upgrading to 1.0, we have released an upgrade guide. And lastly, we want to thank all the contributors and the community for getting Polars to where it is today.
There are a lot of exciting and challenging milestones on our roadmap. We are actively hiring for several roles. If you are excited about what is ahead for Polars, check out our hiring page.