Polars 1.42 is out.
In this post we’ll go through three highlights: the query optimizer eliminates filters that can never match any rows, cloud I/O gets an adaptive concurrency controller, and DataFrame.is_sorted and Expr.is_sorted are now available.
Adaptive Cloud I/O Concurrency
PR: #27924
Polars 1.42 introduces an adaptive concurrency controller for reading Parquet and IPC files from cloud object stores (S3, GCS, Azure). It dynamically tunes the number of in-flight requests to the observed bandwidth and latency of the connection, improving throughput for workloads with many small or scattered reads on large instances. Internal benchmarking (TPC-H SF=1000) on a 64 vCPU instance has shown a 2x improvement overall, and up to 4x on individual I/O-bound queries.
There is no API change.
Existing read_parquet/scan_parquet and read_ipc/scan_ipc calls from cloud storage benefit automatically.
Contradictory Filter Elimination
PR: #27775
The query optimizer now detects when a filter predicate can never be true and folds the entire filter to an empty result. No data is scanned and no expression is evaluated. It recognises six categories of contradictions:
| Category | Description | Example |
|---|---|---|
| Logical negation | A predicate conjoined with its own negation | A AND NOT(A) |
| Inverse comparisons | Two comparisons that cannot hold together | x > 5 AND x <= 5, x == 5 AND x != 5 |
| Empty membership | Membership against an empty set | is_in([]) |
| Disjoint ranges | Lower bound above an upper bound | a > 5 AND a < 3 |
| Non-overlapping intervals | Two is_between ranges that never overlap | is_between(4, 6) AND is_between(0, 2) |
| Equality outside range | An equality that falls outside a bound | a == 5 AND a > 10 |
For example, the post-aggregation filter below can never be satisfied:
import polars as pl
result = (
pl.LazyFrame({"category": ["A", "B", "A"], "value": [10, 20, 30]})
.filter(pl.col("value") > 5)
.group_by("category")
.agg(pl.col("value").sum())
.filter(pl.col("value") > 100)
.filter(pl.col("value") < 50)
.collect()
)
# Returns an empty frame; the scan and aggregation never run.
Call show_graph() on the same query (without .collect()) to confirm the optimization fired.
In 1.41 the streaming plan has four nodes: the scan, a pre-aggregation filter, the group-by aggregation, and the contradictory post-aggregation filter.
In 1.42 the optimizer recognises at plan time that value > 100 AND value < 50 can never be satisfied and folds the entire query to an empty TABLE, so the scan, filter, and aggregation are never executed.

This mostly helps with programmatically built predicates, such as parameterized filters where a bound can produce low > high, or composed predicates that stack mutually exclusive conditions.
No code changes are required to benefit from it.
is_sorted() for DataFrame and Expr
Series.is_sorted() has been available for a while.
1.42 adds the same capability at the DataFrame and expression levels.
DataFrame.is_sorted checks whether the frame is sorted by one or more columns and returns a boolean:
import polars as pl
df = pl.DataFrame({"a": [1, 2, 3], "b": [5, 4, 3]})
df.is_sorted("a")
# Output
# True
df.is_sorted("b", descending=True)
# Output
# True
Expr.is_sorted works inside select or filter and returns a boolean expression:
import polars as pl
df = pl.DataFrame({"a": [1, 2, 3, 4], "b": [4, 3, 2, 1]})
df.select(pl.col("a").is_sorted(), pl.col("b").is_sorted(descending=True))
# Output
# shape: (1, 2)
# ┌──────┬──────┐
# │ a ┆ b │
# │ --- ┆ --- │
# │ bool ┆ bool │
# ╞══════╪══════╡
# │ true ┆ true │
# └──────┴──────┘
Both methods accept descending and nulls_last parameters, matching the existing Series.is_sorted interface.
They are currently marked unstable.
And More
The full list of changes is in the 1.42 release notes on GitHub.
Follow us for updates: