# polars

> Fast in-memory DataFrame library with lazy evaluation, parallel execution, and an Apache Arrow backend. Use it for ETL pipelines and datasets of roughly 1–100 GB when pandas is too slow but the data still fits in RAM.

**Use case**: Fast pandas replacement with lazy evaluation for large DataFrames

**Canonical URL**: https://agentcookbooks.com/skills/polars/

**Topics**: claude-code, skills, science, data-science

**Trigger phrases**: "use polars instead of pandas", "polars DataFrame", "lazy evaluation pipeline", "fast CSV loading", "polars groupby"

**Source**: [K-Dense AI](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/polars)

**License**: MIT

---

## What it does

`polars` is a Claude Code skill from K-Dense AI's [scientific-agent-skills repo](https://github.com/K-Dense-AI/scientific-agent-skills). It turns Claude into a Polars expert covering the full DataFrame API — lazy vs. eager evaluation, expression syntax, groupby and aggregation, joins, string operations, time series, and integration with Apache Arrow and Parquet — for workflows where pandas becomes the bottleneck.

A session produces Polars code that is idiomatic to Polars' expression syntax rather than a naive pandas-to-polars translation, making full use of lazy evaluation and parallel execution.

## When to use it

Reach for it when:

- Your pandas operations are slow and your data fits in RAM (roughly 1–100 GB)
- You're building ETL pipelines where lazy evaluation lets you compose transformations before executing them
- You need to process large CSV or Parquet files faster than pandas can manage

When *not* to reach for it:

- Data that doesn't fit in RAM — use `dask` for distributed or out-of-core processing
- Workflows tightly coupled to pandas-only libraries — some ML libraries don't accept Polars DataFrames directly

## Install

Copy the `SKILL.md` from K-Dense AI's [polars folder](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/polars) into `.claude/skills/polars/` in your project.

Trigger phrases: "use polars instead of pandas", "polars DataFrame", "lazy evaluation pipeline", "fast CSV loading", "polars groupby".

## What a session looks like

A typical session has three phases:

1. **Context and operation description.** Describe the data shape and the transformation you need — filter, groupby, join, or pipeline. Claude identifies whether lazy (LazyFrame) or eager (DataFrame) evaluation is more appropriate.
2. **Idiomatic Polars code.** Claude writes code using Polars' expression API (`pl.col()`, `pl.lit()`, method chaining) rather than row-iteration patterns, making full use of parallelism.
3. **Performance notes.** Claude points out where the code exploits Polars' query optimizer and flags any anti-patterns (e.g., using `.to_pandas()` unnecessarily) that would undo the performance gains.

## Receipts

**Where it works well:**
- Reading and filtering large CSV or Parquet files: a lazy scan with predicate pushdown applies the filter while the file is being read, which is typically much faster than loading everything with pandas `read_csv` and filtering afterwards
- Groupby aggregations on high-cardinality columns — parallel execution makes Polars substantially faster than pandas for groupby on columns with millions of unique values

**Where it backfires:**
- Libraries that only accept pandas DataFrames force a `.to_pandas()` conversion, which adds overhead and gives back some of the performance gains for those specific steps
- Polars' expression syntax has a learning curve; Claude's first-pass code is idiomatic, but debugging novel expressions requires understanding the expression context model

**Pattern that works:** start with a LazyFrame scan (`pl.scan_csv()`, `pl.scan_parquet()`) and build the full transformation chain before calling `.collect()`. Letting the query optimizer see the complete plan generally performs better than collecting at intermediate steps.

## Source and attribution

Originally authored by [K-Dense Inc.](https://github.com/K-Dense-AI). The canonical SKILL.md lives in the [`polars` folder](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/polars) of their public scientific-agent-skills repository.

License: MIT. Install, adapt, and redistribute with attribution preserved.

This page documents the skill from a practitioner's perspective. For the formal spec and any updates, defer to the source repo.