A different way to read the leaderboard.

The rating model on this site was designed by James Macdiarmid and laid out in his video “Our ranking system is broken. Let’s fix it!” This project implements the spec he described there.

The official WCA rankings capture something important: a single best performance. Records live there. But a single solve is a narrow lens — it says little about who is performing well right now. This site is an attempt at that second view: a performance rating per competitor per event, updated hourly from the public WCA export.

How a rating is computed

For each (competitor, event) pair, we compute one rating on singles and one on averages. The metric toggle at the top of the rankings page switches between them; for averaged events (3×3, 4×4, etc.) the leaderboard defaults to averages, and for single-only events (BLD, FMC, multi) it defaults to singles.

Take every round result from the competitor’s last 24 months in the event, counting back from their most recent competition rather than from today. A competitor needs at least 3 results in that window to appear. If their latest round is itself more than 24 months old as of today, they drop off the list entirely.
Normalise each round into a Kinch-style score: 100 × (WR / your_result). WR here is the all-time best in the same metric (best-ever average if we’re scoring averages, best-ever single if we’re scoring singles). The record holder scores 100; everyone else scales below.
Apply a context bonus. Each round’s score is multiplied by 1 + 0.01 × (placement + record), where placement rewards finals and medal positions (scaled sharply upward at national, continental and world championships) and record rewards national / continental / world records. A normal first-round solve at a local comp gets no bonus at all; a gold-medal world-record finals round at Worlds maxes out near +17%. In aggregate, the bonus lifts a top competitor’s final rating by a few tenths of a point.
Weight each round by 0.99 ^ days_since. Recent solves count for more — a result from 90 days ago carries about 40% of today’s weight.
Take the weighted mean. This is the raw rating.
If a competitor hasn’t entered the event for a while, the rating starts to decay. The grace period depends on how often the event is typically held: 90 days for frequent events (3×3, 2×2, OH, pyraminx, skewb, square-1), 180 days for the bigger cubes, clock, megaminx and 3-blind, and 365 days for the rare events (FMC, multi-blind, 4- and 5-blind) that are mostly scheduled only at larger competitions. After the grace period, the rating is multiplied by 0.9995 ^ (days − grace), where days counts the days since the competitor’s last result in the event.
Rank by rating within each event and metric using standard sports tied-rank semantics (two cubers on the same rating share a rank; the next slot skips ahead — 5, 5, 7).

The bonus formula above is taken directly from James’s reference spreadsheet: we reverse-engineered it from his Excel model rather than re-deriving it from first principles. On 11 reference competitors, the output matches his leaderboard with a mean absolute error of about 0.025 rating points (i.e., to two decimal places for nearly everyone).
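
As a rough illustration, the rating steps in this section could be sketched in Python as follows. This is not the site's actual code. The names (Round, rating, WINDOW_DAYS and so on) are hypothetical, "24 months" is approximated as 730 days, results are assumed to be lower-is-better, days_since is assumed to be measured from today, and the placement and record bonus points are taken as inputs because the lookup tables behind them are not reproduced on this page.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

WINDOW_DAYS = 730      # assumption: "24 months" approximated as 730 days
MIN_RESULTS = 3        # minimum results in the window to be listed
RECENCY_BASE = 0.99    # per-day recency weight
DECAY_BASE = 0.9995    # per-day inactivity decay after the grace period


@dataclass
class Round:
    day: date                # date of the round
    result: float            # assumption: lower is better (e.g. seconds)
    placement: float = 0.0   # placement bonus points (table not shown here)
    record: float = 0.0      # record bonus points (table not shown here)


def rating(rounds: list[Round], wr: float, grace_days: int,
           today: date) -> Optional[float]:
    """Return a rating, or None if the competitor is not listed."""
    if not rounds:
        return None
    latest = max(r.day for r in rounds)
    # Drop off entirely if even the latest round is too old.
    if (today - latest).days > WINDOW_DAYS:
        return None
    # The window counts back from the most recent round, not from today.
    cutoff = latest - timedelta(days=WINDOW_DAYS)
    window = [r for r in rounds if r.day >= cutoff]
    if len(window) < MIN_RESULTS:
        return None

    weighted_sum = 0.0
    weight_total = 0.0
    for r in window:
        score = 100.0 * wr / r.result                     # Kinch-style score
        score *= 1.0 + 0.01 * (r.placement + r.record)    # context bonus
        weight = RECENCY_BASE ** (today - r.day).days     # recency weight
        weighted_sum += weight * score
        weight_total += weight
    raw = weighted_sum / weight_total                     # weighted mean

    idle_days = (today - latest).days
    if idle_days > grace_days:                            # inactivity decay
        raw *= DECAY_BASE ** (idle_days - grace_days)
    return raw


if __name__ == "__main__":
    today = date(2025, 1, 1)
    # Hypothetical data: three rounds at exactly twice the record time.
    rounds = [Round(today - timedelta(days=d), 8.0) for d in (0, 30, 60)]
    print(rating(rounds, wr=4.0, grace_days=90, today=today))
```

In the usage lines, every round scores 50 and carries no bonus, so the weighted mean is 50 (up to floating-point rounding) whatever the individual weights are. Asked for again 100 days later with a 90-day grace period, the same rating would be multiplied by 0.9995 ^ 10.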

Data freshness

An hourly job polls the WCA results export. When a new export is published, we re-ingest the underlying data and rebuild every rating. Even on hours when nothing has changed on the WCA side we still recompute, so the inactivity decay stays current to the day.
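
One way to picture the hourly cycle is the sketch below. It is only an outline under stated assumptions: hourly_tick and the three callables passed into it are hypothetical stand-ins, not the project's real functions, and comparing export timestamps is just one plausible way to detect a new export.

```python
from datetime import datetime
from typing import Callable, Optional


def hourly_tick(
    last_seen_export: Optional[datetime],
    fetch_export_timestamp: Callable[[], datetime],
    reingest: Callable[[], None],
    rebuild_ratings: Callable[[], None],
) -> datetime:
    """Run one hourly cycle; return the newest export timestamp seen so far."""
    current = fetch_export_timestamp()
    if last_seen_export is None or current > last_seen_export:
        reingest()                 # a new export was published: reload the data
        last_seen_export = current
    rebuild_ratings()              # always recompute so the decay stays current
    return last_seen_export
```

The point of the shape is that re-ingestion is conditional while the rebuild is not, which is what keeps the inactivity decay moving on hours with no new export.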

Scope

The site rates 17 currently active WCA events. Discontinued events exist in the WCA record but are not scored. Ratings are per-event: there is no cross-event aggregate, because skill in 3×3 looks nothing like skill in multi-blind, and averaging across events would hide more than it reveals. The region filter narrows the leaderboard to a continent or country; global ranks are preserved when filtered, so the top European 3×3 cuber may still appear as #3, their rank in the world.
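
The interaction between tied ranks and the region filter can be shown with a small sketch. It assumes ranks are assigned once on the global list and that filtering happens afterwards. Entry, global_ranks and filter_continent are hypothetical names, the competitors are placeholders, and ties are detected by exact equality of ratings, which may differ from how the site compares them.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    continent: str
    rating: float


def global_ranks(entries: list[Entry]) -> list[tuple[int, Entry]]:
    """Sort by rating (highest first) and assign 1, 1, 3 style tied ranks."""
    ordered = sorted(entries, key=lambda e: e.rating, reverse=True)
    ranked: list[tuple[int, Entry]] = []
    for position, entry in enumerate(ordered, start=1):
        if ranked and ranked[-1][1].rating == entry.rating:
            rank = ranked[-1][0]   # tie: share the previous rank
        else:
            rank = position        # otherwise the rank is the list position
        ranked.append((rank, entry))
    return ranked


def filter_continent(ranked: list[tuple[int, Entry]],
                     continent: str) -> list[tuple[int, Entry]]:
    """Narrow the list without renumbering, so global ranks are preserved."""
    return [(rank, e) for rank, e in ranked if e.continent == continent]


if __name__ == "__main__":
    # Placeholder competitors, not real data.
    board = global_ranks([
        Entry("A", "North America", 95.0),
        Entry("B", "Asia", 95.0),
        Entry("C", "Europe", 90.0),
        Entry("D", "Europe", 80.0),
    ])
    for rank, entry in filter_continent(board, "Europe"):
        print(rank, entry.name)
```

Here A and B tie for rank 1, so C is ranked 3 globally and keeps that rank in the Europe-only view.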

Known gaps

DNFs are not yet factored in. The pipeline currently drops DNF rounds before computing a rating, which means reliability isn’t reflected. For 3×3 and the other fast events this barely matters, but for BLD, FMC, multi and clock it overstates unreliable solvers. The fix, following the spec author’s suggestion, is a per-event DNF-rate coefficient that discounts ratings proportionally to how often someone fails attempts in the window. We’re not shipping a number until it’s calibrated properly; until then, treat blind-event ratings with an extra grain of salt.

Credits and data

The rating model is by James Macdiarmid. The competition data is maintained by the World Cube Association and used under the terms of its public results export. This site is not affiliated with or endorsed by the WCA. Implementation is open source.
