
Ceph Erasure Coding Profile Planner

Pick the right k+m for your topology — storage efficiency, overhead, minimum failure domains, and min_size, with the exact CLI to create the EC pool. EC profiles can't be changed after pool creation, so get it right the first time.

EC Sizing Rules

Efficiency: k / (k+m)
Overhead: (k+m) / k, the raw storage consumed per unit of usable data
Fault tolerance: m simultaneous domain losses
min_size: k + 1 (k alone = no redundancy)
Min domains: k + m (recommend k+m+1)
Default pick: k=4, m=2, about 2× the usable capacity of size=3 replication
Requires (RBD / CephFS): BlueStore + allow_ec_overwrites
Immutable: profile can't change after pool create
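
The sizing rules above reduce to a few lines of arithmetic. The following is a minimal sketch, not this tool's actual implementation: the names `ECSizing`, `ec_sizing`, and `usable_tb` are hypothetical, and the usable figure ignores any full-ratio or recovery headroom you would reserve in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ECSizing:
    efficiency: float         # usable fraction of raw capacity, k / (k+m)
    overhead: float           # raw storage per unit of usable data, (k+m) / k
    fault_tolerance: int      # simultaneous failure-domain losses survived, m
    min_size: int             # k + 1
    min_domains: int          # k + m
    recommended_domains: int  # k + m + 1


def ec_sizing(k: int, m: int) -> ECSizing:
    """Apply the sizing rules to a k+m profile (hypothetical helper)."""
    if k < 1 or m < 1:
        raise ValueError("k and m must both be at least 1")
    n = k + m
    return ECSizing(k / n, n / k, m, k + 1, n, n + 1)


def usable_tb(osd_count: int, osd_size_tb: float, k: int, m: int) -> float:
    """Usable capacity in TB, before any headroom reserve is subtracted."""
    return osd_count * osd_size_tb * ec_sizing(k, m).efficiency


if __name__ == "__main__":
    s = ec_sizing(4, 2)
    print(f"4+2: efficiency {s.efficiency:.1%}, overhead {s.overhead:.2f}x, "
          f"min_size {s.min_size}, domains {s.min_domains} (recommend {s.recommended_domains})")
    print(f"12 OSDs x 8 TB: {usable_tb(12, 8.0, 4, 2):.1f} TB usable")
```

For example, twelve 8 TB OSDs give 96 TB raw, of which a 4+2 profile leaves 64 TB usable.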

How Ceph Erasure Coding Works

Erasure coding splits each object into k data chunks and computes m parity chunks, then stores all k+m chunks on distinct failure domains. The cluster can lose any m of those chunks and still reconstruct the full object — that's the fault tolerance. Unlike replication, EC doesn't store full copies, so it uses dramatically less raw storage for the same durability guarantee.

The tradeoff is CPU cost (encoding/decoding) and reconstruction reads when chunks are missing, which is why EC suits sequential, less latency-sensitive workloads — RGW object storage, backups, cold archives — better than low-latency block storage.

Efficiency by profile

Profile   Efficiency (k/(k+m))   Overhead   Tolerates    Min Domains
2+2       50.0%                  2.00×      2 failures   4
4+2       66.7%                  1.50×      2 failures   6
6+2       75.0%                  1.33×      2 failures   8
8+3       72.7%                  1.38×      3 failures   11
8+4       66.7%                  1.50×      4 failures   12

Ceph's own default EC profile is k=2, m=2, but the documentation recommends k=4, m=2 as a practical starting point for most clusters — it delivers roughly twice the usable capacity of 3x replication while keeping the domain count and CPU overhead manageable.
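
To see where the "roughly twice" figure comes from, compare each profile's usable fraction with that of a replicated pool. This is an illustrative sketch; `compare_to_replication` is a hypothetical helper name, and the profile list simply mirrors the table above.

```python
def compare_to_replication(profiles, replica_size=3):
    """Return {profile: (efficiency, gain over replication)} for each (k, m)."""
    replicated_fraction = 1 / replica_size  # size=3 keeps 1/3 of raw as usable
    result = {}
    for k, m in profiles:
        efficiency = k / (k + m)
        result[f"{k}+{m}"] = (efficiency, efficiency / replicated_fraction)
    return result


if __name__ == "__main__":
    profiles = [(2, 2), (4, 2), (6, 2), (8, 3), (8, 4)]
    for name, (eff, gain) in compare_to_replication(profiles).items():
        print(f"{name}: {eff:.1%} usable, {gain:.2f}x the usable capacity of size=3")
```

A 4+2 pool keeps 2/3 of raw capacity usable against 1/3 for size=3, which is the 2× ratio quoted above.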


Failure Domains, min_size, and the #1 EC Gotcha

Each of the k+m chunks must land on a distinct failure domain; otherwise, losing one domain could take out more chunks than the profile tolerates. That means a 4+2 profile needs at least 6 hosts (with crush_failure_domain=host). With fewer, CRUSH cannot map all six chunks to distinct hosts, so the pool's PGs never reach active+clean. This tool recommends k+m+1 domains so the cluster has a spare domain to recover onto after losing one, rather than running at the bare minimum indefinitely.
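
The domain check described above can be sketched as follows. The function name `check_domains` and the three status labels are hypothetical, chosen for illustration only.

```python
def check_domains(k: int, m: int, available_domains: int):
    """Classify a topology against a k+m profile.

    Returns (status, message), where status is one of:
      "insufficient": fewer than k+m domains, chunks cannot all be placed
      "minimum":      exactly k+m domains, no spare domain to recover onto
      "ok":           at least k+m+1 domains
    """
    required = k + m
    recommended = required + 1
    if available_domains < required:
        return ("insufficient",
                f"need {required} domains, have {available_domains}")
    if available_domains < recommended:
        return ("minimum",
                f"{available_domains} domains meets the minimum; "
                f"{recommended} leaves room to recover after a loss")
    return ("ok", f"{available_domains} domains >= recommended {recommended}")


if __name__ == "__main__":
    for hosts in (5, 6, 7):
        status, message = check_domains(4, 2, hosts)
        print(f"4+2 on {hosts} hosts: {status} ({message})")
```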

min_size for an EC pool defaults to k+1, not k. With min_size equal to k, the pool would keep serving I/O with zero spare parity chunks, and any further loss during that window would mean unrecoverable data loss. Ceph halts I/O on a PG that drops below min_size rather than risk it.
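
A simplified model of that behavior, for a single placement group, is sketched below. The function `pg_availability` and its return strings are hypothetical labels, not Ceph PG state names, and the model ignores recovery in progress.

```python
def pg_availability(k: int, m: int, lost_chunks: int, min_size=None) -> str:
    """Describe one PG after losing `lost_chunks` of its k+m chunks."""
    if min_size is None:
        min_size = k + 1  # default described above
    surviving = k + m - lost_chunks
    if surviving < k:
        # Fewer than k chunks remain: the object cannot be reconstructed.
        return "unrecoverable"
    if surviving < min_size:
        # Data is still reconstructible, but I/O is halted to protect it.
        return "recoverable, I/O halted"
    return "serving I/O"


if __name__ == "__main__":
    for lost in range(4):
        print(f"4+2 with {lost} chunk(s) lost: {pg_availability(4, 2, lost)}")
```

With 4+2, losing two chunks leaves exactly k=4 survivors: the data is intact, but the PG stops serving I/O until recovery restores a fifth chunk.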

The #1 EC gotcha: an erasure-code profile is locked in at pool creation. You cannot change k, m, or the plugin/technique on an existing pool — to change the profile you must create a new pool and migrate data. Plan capacity and fault tolerance carefully before running ceph osd pool create.
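
Because the profile is fixed at creation time, it helps to generate and review the commands before running anything. The sketch below only builds command strings; it does not execute them. The profile name `ec-4-2`, the pool name `ecpool`, and the pg_num of 128 are placeholder assumptions, and `ec_pool_commands` is a hypothetical helper.

```python
import shlex


def ec_pool_commands(profile: str, pool: str, k: int, m: int, pg_num: int,
                     failure_domain: str = "host",
                     plugin: str = "jerasure",
                     technique: str = "reed_sol_van"):
    """Return the ceph CLI commands to define a profile and create an EC pool."""
    return [
        # 1. Define the profile (k, m, failure domain, plugin, technique).
        shlex.join(["ceph", "osd", "erasure-code-profile", "set", profile,
                    f"k={k}", f"m={m}",
                    f"crush-failure-domain={failure_domain}",
                    f"plugin={plugin}", f"technique={technique}"]),
        # 2. Read it back and verify before it gets baked into a pool.
        shlex.join(["ceph", "osd", "erasure-code-profile", "get", profile]),
        # 3. Create the pool; pg_num and pgp_num are set to the same value.
        shlex.join(["ceph", "osd", "pool", "create", pool,
                    str(pg_num), str(pg_num), "erasure", profile]),
    ]


if __name__ == "__main__":
    for command in ec_pool_commands("ec-4-2", "ecpool", 4, 2, 128):
        print(command)
```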


Frequently Asked Questions

Can I use erasure coding for RBD volumes?

Yes. Since Luminous, EC pools support overwrites for RBD and CephFS when the pool sits on BlueStore and has allow_ec_overwrites set to true. Performance is lower than replicated pools for small random writes because partial-chunk overwrites require a read-modify-write cycle, so EC is most often used for RGW object storage, backup targets, or CephFS data pools with mostly large sequential I/O.
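
A minimal sketch of the commands involved follows. It only builds strings and runs nothing. One detail not stated above: RBD keeps image metadata in a replicated pool because EC pools do not support omap, so the image is created in a replicated pool with the EC pool passed as `--data-pool`. The pool names, image name, and size are placeholder assumptions, and `rbd_on_ec_commands` is a hypothetical helper.

```python
import shlex


def rbd_on_ec_commands(ec_pool: str, replicated_pool: str,
                       image: str, size: str = "10G"):
    """Return the commands to enable overwrites and place RBD data on an EC pool."""
    return [
        # Allow partial overwrites on the EC pool (requires BlueStore OSDs).
        shlex.join(["ceph", "osd", "pool", "set", ec_pool,
                    "allow_ec_overwrites", "true"]),
        # Tag the EC pool for RBD use.
        shlex.join(["ceph", "osd", "pool", "application", "enable",
                    ec_pool, "rbd"]),
        # Metadata lives in the replicated pool, data objects in the EC pool.
        shlex.join(["rbd", "create", "--size", size,
                    "--data-pool", ec_pool, f"{replicated_pool}/{image}"]),
    ]


if __name__ == "__main__":
    for command in rbd_on_ec_commands("ecpool", "rbd", "vm-disk-1"):
        print(command)
```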

Why is m=1 discouraged for production?

With m=1, the pool tolerates exactly one chunk loss per placement group. If a second failure hits another chunk of the same placement group before the first is recovered (a very real scenario during a host reboot or disk replacement), you lose data. m=2 (or higher) gives a buffer for overlapping failures, which is why most production guidance treats m=1 as appropriate only for non-critical or scratch data.

What does crush-failure-domain actually control here?

It tells CRUSH which level of your hierarchy must be distinct across the k+m chunks. host means no two chunks of the same object can land on the same host; rack raises that guarantee to the rack level (protecting against a whole rack going dark — top-of-rack switch, PDU, etc.) but requires more physical domains to satisfy the same profile. Use the CRUSH helper to check whether your physical topology actually supports the domain you pick.

What plugin and technique should I use?

jerasure with technique=reed_sol_van is the default, well-tested, portable choice for most clusters and is what this tool's generated CLI uses. isa can be faster on Intel hardware with ISA-L support; clay reduces recovery network traffic at the cost of more CPU, useful for clusters where recovery bandwidth is the bottleneck. Stick with jerasure/reed_sol_van unless you have a specific reason to deviate.

How do I size pg_num for an EC pool?

The same target of ~100 PGs per OSD applies, but PG-per-OSD math for EC pools should account for k+m chunks landing on k+m different OSDs per PG (vs. size copies for replication). Use the Ceph PG Calculator and select Erasure Code with this profile's k+m to get pg_num and pgp_num sized correctly for the pool you're about to create.
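
As a rough illustration of that math, the sketch below divides the cluster-wide PG budget by k+m and rounds to the nearest power of two. The rounding rule is an assumption made for this example and may differ from what the PG Calculator does; `ec_pg_num` and `data_fraction` (the share of cluster data expected in this pool) are hypothetical names.

```python
def ec_pg_num(osd_count: int, k: int, m: int,
              target_pgs_per_osd: int = 100,
              data_fraction: float = 1.0) -> int:
    """Estimate pg_num for an EC pool; each PG occupies k+m OSDs."""
    raw = osd_count * target_pgs_per_osd * data_fraction / (k + m)
    # Largest power of two not exceeding raw (at least 1).
    lower = 1 << (max(int(raw), 1).bit_length() - 1)
    upper = lower * 2
    # Assumption: pick whichever power of two is closer to the raw value.
    return lower if raw - lower < upper - raw else upper


if __name__ == "__main__":
    # 24 OSDs x 100 PGs per OSD / 6 chunks = 400, nearest power of two is 512.
    print(ec_pg_num(24, 4, 2))
```

Set pgp_num to the same value as pg_num when creating the pool.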