Ceph Erasure Coding Profile Planner

Pick the right k+m for your topology — storage efficiency, overhead, minimum failure domains, and min_size, with the exact CLI to create the EC pool. EC profiles can't be changed after pool creation, so get it right the first time.

k+m Planner

Failure Domain Aware

RBD / CephFS / RGW

BlueStore

Topology & Capacity

Capacity Input // pick one

OSD count × size

Raw capacity (TB)

OSD Count // total

OSD Size (TB) // each

EC Profile // k = data, m = parity

2+2 (Ceph default)

4+2 (recommended)

6+2

8+3

8+4

Custom

CRUSH Failure Domain

host

rack

osd

Available Domains // hosts/racks/OSDs you have

Use Case

RBD / CephFS

RGW (object)

EC Sizing Rules

Efficiency: k / (k+m)

Overhead: (k+m) / k raw bytes stored per usable byte

Fault tolerance: m simultaneous domain losses

min_size: k + 1 (k alone = no redundancy)

Min domains: k + m (recommend k+m+1)

Default pick: k=4, m=2 (~2× usable vs size=3)

Requires (RBD / CephFS): BlueStore + allow_ec_overwrites

Immutable: profile can't change after pool create
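
The sizing rules above reduce to a few lines of arithmetic. The sketch below is one possible way to compute them, not this tool's actual implementation; the names `EcPlan` and `plan_ec_profile` are hypothetical, and the usable figure is the theoretical maximum with no headroom reserved for full ratios or recovery.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EcPlan:
    k: int
    m: int
    efficiency: float         # usable fraction of raw capacity: k / (k+m)
    overhead: float           # raw bytes stored per usable byte: (k+m) / k
    min_size: int             # k + 1
    min_domains: int          # k + m
    recommended_domains: int  # k + m + 1
    usable_tb: float          # theoretical usable capacity, no headroom


def plan_ec_profile(k: int, m: int, osd_count: int, osd_size_tb: float) -> EcPlan:
    """Apply the EC sizing rules to an 'OSD count x size' capacity input."""
    if k < 1 or m < 1:
        raise ValueError("k and m must both be at least 1")
    raw_tb = osd_count * osd_size_tb
    return EcPlan(
        k=k,
        m=m,
        efficiency=k / (k + m),
        overhead=(k + m) / k,
        min_size=k + 1,
        min_domains=k + m,
        recommended_domains=k + m + 1,
        usable_tb=raw_tb * k / (k + m),
    )


# Hypothetical cluster: 12 OSDs of 8 TB each with the 4+2 profile.
plan = plan_ec_profile(k=4, m=2, osd_count=12, osd_size_tb=8.0)
print(plan.min_size, plan.min_domains, plan.usable_tb)
```

For this example cluster, 96 TB raw yields 64 TB usable, with min_size 5 and at least 6 failure domains.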

Documentation

Erasure Code Profiles ↗
Plugin configs, k/m semantics, overwrite support
Managing Pools ↗
ceph osd pool create, set, ls detail
Ceph PG Calculator →
Size pg_num/pgp_num for your new EC pool

How Ceph Erasure Coding Works

Erasure coding splits each object into k data chunks and computes m parity chunks, then stores all k+m chunks on distinct failure domains. The cluster can lose any m of those chunks and still reconstruct the full object — that's the fault tolerance. Unlike replication, EC doesn't store full copies, so it uses dramatically less raw storage for the same durability guarantee.
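
To make the idea of "lose a chunk, rebuild it from the rest" concrete, here is a toy k=2, m=1 code that uses plain XOR parity. This is only an illustration: Ceph's plugins use Reed-Solomon style codes that handle arbitrary k and m, and the function names below are hypothetical.

```python
from typing import List, Optional


def split_with_xor_parity(data: bytes) -> List[bytes]:
    """Split data into 2 data chunks plus 1 XOR parity chunk (toy k=2, m=1)."""
    if len(data) % 2:
        data += b"\x00"  # pad to even length; a real system records the true length
    half = len(data) // 2
    d0, d1 = data[:half], data[half:]
    parity = bytes(a ^ b for a, b in zip(d0, d1))
    return [d0, d1, parity]


def reconstruct(chunks: List[Optional[bytes]]) -> bytes:
    """Rebuild the data when at most one of the three chunks is missing (None)."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("this toy code tolerates only one lost chunk (m=1)")
    chunks = list(chunks)
    if missing:
        # XOR of the two surviving chunks yields the missing one.
        a, b = (c for c in chunks if c is not None)
        chunks[missing[0]] = bytes(x ^ y for x, y in zip(a, b))
    return chunks[0] + chunks[1]


stored = split_with_xor_parity(b"cephdata")
stored[0] = None  # simulate losing the failure domain that held chunk 0
print(reconstruct(stored))
```

Any single chunk can be dropped and the original bytes still come back; dropping two raises an error, which mirrors a profile running out of parity.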

The tradeoff is CPU cost (encoding/decoding) and reconstruction reads when chunks are missing, which is why EC suits sequential, less latency-sensitive workloads — RGW object storage, backups, cold archives — better than low-latency block storage.

Efficiency by profile

Profile	Efficiency (k/(k+m))	Overhead	Tolerates	Min Domains
2+2	50.0%	2.00×	2 failures	4
4+2	66.7%	1.50×	2 failures	6
6+2	75.0%	1.33×	2 failures	8
8+3	72.7%	1.38×	3 failures	11
8+4	66.7%	1.50×	4 failures	12

Ceph's own default EC profile is k=2, m=2, but the documentation recommends k=4, m=2 as a practical starting point for most clusters — it delivers roughly twice the usable capacity of 3x replication while keeping the domain count and CPU overhead manageable.
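
The "roughly twice the usable capacity" claim is easy to check by hand. The sketch below compares 3x replication with 4+2 on a hypothetical 120 TB raw cluster; the function names are illustrative and the figures ignore any headroom you would keep free in practice.

```python
def usable_replicated_tb(raw_tb: float, size: int = 3) -> float:
    """Usable capacity of a replicated pool: one usable byte per `size` raw bytes."""
    return raw_tb / size


def usable_ec_tb(raw_tb: float, k: int, m: int) -> float:
    """Usable capacity of an EC pool: k usable bytes per k+m raw bytes."""
    return raw_tb * k / (k + m)


raw_tb = 120.0  # hypothetical raw capacity
rep = usable_replicated_tb(raw_tb)
ec = usable_ec_tb(raw_tb, k=4, m=2)
print(f"size=3: {rep:.1f} TB, 4+2: {ec:.1f} TB, ratio: {ec / rep:.2f}x")
```

On 120 TB raw, replication leaves 40 TB usable and 4+2 leaves 80 TB, while both tolerate two simultaneous domain losses.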

Failure Domains, min_size, and the #1 EC Gotcha

Each of the k+m chunks must land on a distinct failure domain; otherwise losing one domain could take out more chunks than the profile tolerates. That means a 4+2 profile needs at least 6 hosts (with crush-failure-domain=host); with fewer, CRUSH cannot map all k+m chunks and the pool's PGs cannot reach active+clean. This tool recommends k+m+1 domains so the cluster has room to recover after losing one, rather than running at the bare minimum indefinitely.
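
A quick pre-flight check for the domain rule might look like the following. It is a sketch only: `check_failure_domains` and its three status strings are hypothetical names, and it counts domains without inspecting your real CRUSH map.

```python
def check_failure_domains(k: int, m: int, available_domains: int) -> str:
    """Classify a topology against the k+m (minimum) and k+m+1 (recommended) rules."""
    required = k + m
    if available_domains < required:
        return "insufficient"   # CRUSH cannot give every chunk its own domain
    if available_domains == required:
        return "bare-minimum"   # works, but no spare domain to recover onto
    return "ok"                 # at least k+m+1 domains


# A 4+2 profile with crush-failure-domain=host on 5, 6, and 7 hosts.
for hosts in (5, 6, 7):
    print(hosts, check_failure_domains(4, 2, hosts))
```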

min_size for an EC pool should be k+1, not k. With min_size equal to k, the pool would keep serving I/O with zero spare parity chunks, so any further loss during that window means unrecoverable data loss. Ceph defaults EC pools to min_size = k+1 and pauses I/O on any PG that drops below min_size rather than risking it.
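
The effect of min_size is easiest to see by walking a 4+2 placement group through successive chunk losses. The model below is a deliberate simplification of Ceph's PG states; the function name and the three state labels are hypothetical.

```python
from typing import Optional, Tuple


def pg_io_state(k: int, m: int, chunks_available: int,
                min_size: Optional[int] = None) -> Tuple[str, int]:
    """Return (state, spare_chunks) for a PG with `chunks_available` of k+m chunks up."""
    if min_size is None:
        min_size = k + 1  # the value recommended for EC pools
    if chunks_available < k:
        return ("unreadable", 0)  # fewer than k chunks: the object cannot be decoded
    spare = chunks_available - k  # chunks that can still be lost without losing data
    if chunks_available < min_size:
        return ("io-paused", spare)
    return ("active", spare)


# 4+2 pool losing chunks one at a time.
for up in (6, 5, 4, 3):
    print(up, pg_io_state(4, 2, up))

# Forcing min_size down to k keeps I/O going with no spare chunk left.
print(pg_io_state(4, 2, 4, min_size=4))
```

With the default, the PG pauses at 4 of 6 chunks; with min_size forced to 4 it stays active with zero spares, which is exactly the window where one more loss is fatal to new writes.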

The #1 EC gotcha: an erasure-code profile is locked in at pool creation. You cannot change k, m, or the plugin/technique on an existing pool — to change the profile you must create a new pool and migrate data. Plan capacity and fault tolerance carefully before running ceph osd pool create.
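
Because the profile is fixed once the pool exists, it helps to generate and review the commands before running them. The helper below is a sketch that only builds command strings; the profile name, pool name, and the pg_num of 128 are placeholders, and `ec_pool_commands` is a hypothetical name rather than this tool's generator.

```python
import shlex
from typing import List


def ec_pool_commands(profile: str, pool: str, k: int, m: int,
                     failure_domain: str = "host", pg_num: int = 128) -> List[str]:
    """Build the ceph CLI lines to define an EC profile and create a pool from it."""
    profile_q, pool_q = shlex.quote(profile), shlex.quote(pool)
    return [
        # Define the profile; k, m, plugin, and technique are fixed for pools built on it.
        f"ceph osd erasure-code-profile set {profile_q} k={k} m={m} "
        f"crush-failure-domain={failure_domain} "
        f"plugin=jerasure technique=reed_sol_van",
        # Review the stored profile before creating anything from it.
        f"ceph osd erasure-code-profile get {profile_q}",
        # Create the pool; pg_num and pgp_num are placeholders to be sized separately.
        f"ceph osd pool create {pool_q} {pg_num} {pg_num} erasure {profile_q}",
    ]


for line in ec_pool_commands("ec42", "ecpool", k=4, m=2):
    print(line)
```

Printing the commands first gives you one last chance to confirm k, m, and the failure domain before the pool is created.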

Frequently Asked Questions

Can I use erasure coding for RBD volumes?

Yes. Since Luminous, EC pools support overwrites for RBD and CephFS when the pool sits on BlueStore and has allow_ec_overwrites set to true. Performance is lower than replicated pools for small random writes because partial-chunk overwrites require a read-modify-write cycle, so EC is most often used for RGW object storage, backup targets, or CephFS data pools with mostly large sequential I/O.
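
As a sketch of what enabling RBD on an EC pool involves, the helper below builds the relevant commands. It assumes the usual layout in which image metadata lives in a replicated pool and only the data goes to the EC pool, since EC pools do not support omap. The pool and image names are placeholders and `rbd_on_ec_commands` is a hypothetical name.

```python
import shlex
from typing import List


def rbd_on_ec_commands(ec_pool: str, replicated_pool: str,
                       image: str, size_mb: int) -> List[str]:
    """Build CLI lines to enable overwrites on an EC pool and place an RBD image on it."""
    ec_q = shlex.quote(ec_pool)
    target = shlex.quote(f"{replicated_pool}/{image}")
    return [
        # Overwrites are off by default on EC pools and require BlueStore OSDs.
        f"ceph osd pool set {ec_q} allow_ec_overwrites true",
        # Tag the pool for RBD use.
        f"ceph osd pool application enable {ec_q} rbd",
        # Metadata goes to the replicated pool, data objects to the EC pool.
        f"rbd create --size {size_mb} --data-pool {ec_q} {target}",
    ]


for line in rbd_on_ec_commands("ecpool", "rbdmeta", "vm-disk-1", size_mb=10240):
    print(line)
```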

Why is m=1 discouraged for production?

With m=1, the pool tolerates exactly one chunk loss. If a second failure occurs anywhere in the cluster before the first is recovered — a very real scenario during a host reboot or disk replacement — you lose data. m=2 (or higher) gives a buffer for overlapping failures, which is why most production guidance treats m=1 as appropriate only for non-critical or scratch data.

What does crush-failure-domain actually control here?

It tells CRUSH which level of your hierarchy must be distinct across the k+m chunks. host means no two chunks of the same object can land on the same host; rack raises that guarantee to the rack level (protecting against a whole rack going dark — top-of-rack switch, PDU, etc.) but requires more physical domains to satisfy the same profile. Use the CRUSH helper to check whether your physical topology actually supports the domain you pick.

What plugin and technique should I use?

jerasure with technique=reed_sol_van is the default, well-tested, portable choice for most clusters and is what this tool's generated CLI uses. isa can be faster on Intel hardware with ISA-L support; clay reduces recovery network traffic at the cost of more CPU, useful for clusters where recovery bandwidth is the bottleneck. Stick with jerasure/reed_sol_van unless you have a specific reason to deviate.

How do I size pg_num for an EC pool?

The same target of ~100 PGs per OSD applies, but PG-per-OSD math for EC pools should account for k+m chunks landing on k+m different OSDs per PG (vs. size copies for replication). Use the Ceph PG Calculator and select Erasure Code with this profile's k+m to get pg_num and pgp_num sized correctly for the pool you're about to create.
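
For a rough idea of the number to expect, the usual formula divides the PG budget by k+m and rounds to a power of two. The sketch below assumes simple nearest-power-of-two rounding and a single pool holding all the data; the real calculator may apply extra rules, and the PG autoscaler can adjust pg_num later. `suggest_pg_num` is a hypothetical name.

```python
def suggest_pg_num(osd_count: int, k: int, m: int,
                   target_pgs_per_osd: int = 100, data_share: float = 1.0) -> int:
    """Estimate pg_num for an EC pool, where each PG occupies k+m OSDs."""
    raw = osd_count * target_pgs_per_osd * data_share / (k + m)
    raw = max(raw, 1.0)
    lower = 1 << (int(raw).bit_length() - 1)  # largest power of two <= raw
    upper = lower * 2
    # Pick whichever power of two is closer; ties go to the larger value.
    return lower if (raw - lower) < (upper - raw) else upper


# 12 OSDs with 4+2: 12 * 100 / 6 = 200, which rounds to 256.
print(suggest_pg_num(osd_count=12, k=4, m=2))
```

Set pgp_num to the same value as pg_num when you create the pool.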