Since the last edition of Polars in Aggregate in December:
- 12 releases have been published.
- 778 PRs were merged to open source Polars.
- 95 contributors submitted code to open source Polars. Thank you!
This post summarizes the biggest highlights of that work.
Highlighted Features
The Streaming Engine Keeps Expanding
This quarter extended streaming into more query patterns.
1.38 added a streaming merge join, and 1.39 added a streaming AsOf join.
This brings us closer to setting the streaming engine as the default for everyone.
We already recommend opting in by calling pl.Config.set_engine_affinity("streaming") at the start of your code.
Streaming Scan and Sink Across All Major Formats
PRs: #25948, #25900, #25746, #25910, #25964
All major formats, including CSV, now have a streaming scan implementation, making reads both faster and more memory-efficient.
In 1.37 the new sink pipeline landed for NDJSON, CSV, and IPC, and partitioned sink_* usage now dispatches to the new streaming engine by default.
1.39 added support for scan_ndjson() and scan_lines() to stream files directly from cloud storage instead of downloading the whole file first.
Delta Lake Read/Write
Delta got sink_delta() in 1.37, which means lazy queries can now write directly back to Delta Lake without first materializing the whole result in memory.
In 1.38, scan_delta() was refactored onto the Python dataset interface and gained file-statistics-based predicate pushdown for batch reads, making selective scans cheaper.
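A sketch of the read-filter-write roundtrip this enables. The table URIs and column name are assumptions, and running it requires the deltalake package plus real tables at those locations:

```python
import polars as pl

# Assumed locations; replace with real Delta table URIs.
src = "s3://my-bucket/events_delta"
dst = "s3://my-bucket/purchases_delta"

(
    pl.scan_delta(src)
    # Selective predicates like this can be skipped cheaply
    # thanks to file-statistics-based predicate pushdown.
    .filter(pl.col("event_type") == "purchase")
    # Streams the result straight into a Delta table,
    # without collecting it into memory first.
    .sink_delta(dst)
)
```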
Apache Iceberg Read/Write
1.39 completes the Iceberg read/write roundtrip with sink_iceberg().
Polars was already able to read Iceberg tables, but it can now scan, transform, and commit a new Iceberg snapshot from the streaming engine.
scan_iceberg() now accepts table identifier strings, making it easier to integrate with catalog-backed workloads.
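The full roundtrip can be sketched as below. The catalog/table identifiers are hypothetical, and running this requires pyiceberg and a configured catalog:

```python
import polars as pl

(
    # Table identifier strings now work directly, so a catalog-backed
    # table can be referenced without constructing a table object first.
    pl.scan_iceberg("warehouse.sales.orders")
    .filter(pl.col("status") == "shipped")
    # Streams the transformed result out and commits it
    # as a new Iceberg snapshot.
    .sink_iceberg("warehouse.sales.shipped_orders")
)
```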
Polars Cloud Gets Better Operational Tooling
This quarter brought the first layer of serious observability to Polars Cloud. The new Query Profiler exposes per-stage resource metrics, physical-plan details, shuffle percentages, and node-level CPU, RAM, and network usage. In the accompanying post we used it to tune a TPC-H query from a single oversized machine to a better-fitting cluster in five runs, ending up 54% faster and 64% cheaper. This allows users to move beyond “does it run?” to “why is this query expensive?”.
On top of that, we continue our work to make the distributed query planner faster and have implemented a cost-based planner that makes queries, on average, 10% faster.
New Partitioning API: pl.PartitionBy
PR: #26004
As Polars adds more sink paths and more cloud-oriented execution, users need one consistent way to describe how output should be partitioned.
PartitionBy gives that configuration a real object instead of scattering it across ad-hoc parameters.
It expresses hive-style partitioning, row- or byte-based file size limits, and user-definable file path logic, and plugs directly into the sink_* APIs.
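A sketch of what that configuration object might look like in use. The class name comes from this release, but the parameter names below (by, max_rows_per_file) and the paths are assumptions rather than the confirmed API; check the PR for the exact signature:

```python
import polars as pl

# Sketch only: parameter names are illustrative assumptions.
part = pl.PartitionBy(
    "out/events",           # base output path
    by=["year", "month"],   # hive-style partition keys
    max_rows_per_file=1_000_000,  # row-based file size limit
)

# The partitioning object plugs directly into the sink_* APIs.
pl.scan_parquet("events/*.parquet").sink_parquet(part)
```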
Popular Blog Posts
- Introducing the Query Profiler and Optimizing Cluster Configuration: If you are running distributed queries and still sizing clusters by gut feel, this post shows how to use the new profiler to find the real bottleneck and turn that into speed-ups and cost savings.
- Understanding the New Categorical: If you want to understand why Categorical had to change for the streaming and distributed future of Polars, this is the architectural exploration that explains the old tradeoffs and the new design.
- How Rabobank leverages window functions to drastically improve performance: If you need a real production case study, this one shows how a large bank rewrote critical transaction-enrichment logic in Polars and used window functions and Polars extensions to get both cleaner code and major performance gains.
And Much More
This is just a glimpse of what we’ve built.
If you want to stay up to date, you can follow us on the following platforms:
[LinkedIn] - [Twitter/X] - [Bluesky] - [GitHub]
And you can learn more about Polars Cloud at: https://cloud.pola.rs/