Every racing tip looks obvious in hindsight. The horse broke smoothly, the pace set it up perfectly, and the trainer clearly had it ready to fire. But predicting those outcomes before the race is the problem every punter is trying to solve.
This article documents the framework FormRace uses to identify selections, why each signal matters, and what the academic and industry research says about the factors that most reliably separate winners from losers.
---
The Six Conviction Signals
FormRace scores every prediction across six equally-weighted components, each worth up to ~16.67 points, for a total conviction score out of 100. A selection only publishes as a tip when multiple signals align.
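The scoring arithmetic above can be sketched as follows. This is a minimal illustration, not FormRace's actual implementation: the signal names, the `conviction_score` function, and the idea of expressing each signal as a strength between 0 and 1 are all assumptions made for the example.

```python
# Hypothetical sketch: six equally weighted signals, each worth up to 100/6 points.
MAX_PER_SIGNAL = 100 / 6  # ~16.67 points per component

# Assumed identifiers for the six signals described in this article.
SIGNALS = (
    "monte_carlo",
    "blended_confidence",
    "red_flag",
    "track_bias",
    "llm_agreement",
    "kelly_strength",
)


def conviction_score(strengths: dict) -> float:
    """Sum the six components into a score out of 100.

    `strengths` maps each signal name to a value in [0, 1];
    missing signals count as 0 and out-of-range values are clamped.
    """
    total = 0.0
    for name in SIGNALS:
        strength = min(max(strengths.get(name, 0.0), 0.0), 1.0)
        total += strength * MAX_PER_SIGNAL
    return round(total, 2)


print(conviction_score({name: 1.0 for name in SIGNALS}))  # 100.0
print(conviction_score({name: 0.5 for name in SIGNALS}))  # 50.0
```

Treating a missing signal as zero means an absent signal lowers conviction rather than being silently ignored.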
1. Monte Carlo Confirmation
What it is: A simulation model runs thousands of iterations of each race, sampling from each runner's performance distribution. The output is a probability — e.g. "this horse wins in 28% of simulated races."
Why it matters: Single-point predictions are brittle. A horse ranked #1 by speed figures might be only marginally better than five other contenders. Monte Carlo confirmation shows whether the model's confidence is robust or paper-thin. A 28% win probability in a field of 14 is genuinely high, about four times the equal-chance baseline of ~7%. A 15% probability in a field of 8 is not: the baseline there is already 12.5%.
Research backing: Studies consistently show that well-calibrated probability models outperform heuristic rules over large sample sizes. The edge isn't in any single race — it's in having the right probability when the market is wrong.
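To make the simulation idea concrete, here is a minimal sketch. It assumes each runner's performance on the day is normally distributed with a mean and standard deviation; that distribution, the runner names, and the function names are illustrative assumptions, not a description of FormRace's model.

```python
import random


def simulate_win_probs(runners, n_sims=10_000, seed=0):
    """Estimate win probabilities by simulating the race n_sims times.

    `runners` maps a runner name to (mean, sd) of an assumed normal
    performance distribution. The highest sampled performance wins.
    """
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    wins = {name: 0 for name in runners}
    for _ in range(n_sims):
        best_name, best_perf = None, float("-inf")
        for name, (mean, sd) in runners.items():
            perf = rng.gauss(mean, sd)
            if perf > best_perf:
                best_name, best_perf = name, perf
        wins[best_name] += 1
    return {name: count / n_sims for name, count in wins.items()}


def edge_over_baseline(win_prob, field_size):
    """Ratio of a win probability to the equal-chance baseline 1/field_size."""
    return win_prob * field_size


# Hypothetical three-horse field: A is rated slightly better than B and C.
probs = simulate_win_probs({"A": (100, 5), "B": (95, 5), "C": (95, 5)})
print(probs)

# The two cases from the text: 28% in a field of 14 vs 15% in a field of 8.
print(round(edge_over_baseline(0.28, 14), 2))  # 3.92
print(round(edge_over_baseline(0.15, 8), 2))   # 1.2
```

The baseline ratio is what separates the two cases: 28% in a 14-horse field is nearly four times an equal chance, while 15% in an 8-horse field is barely above it.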
2. Blended Model Confidence
What it is: A weighted combination of two independent model outputs — an XGBoost gradient-boosted classifier trained on historical race data, and an LLM-based reasoning score that reads form notes and race conditions.
Why it matters: Ensembles tend to outperform single models when the members' errors are only weakly correlated, because combining them cancels part of each model's individual error. When the XGB model and the LLM reach the same conclusion independently, the bet is better founded than when either signal fires alone.
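A rough sketch of the blending step is below. The article does not state the actual weights, so the 50/50 default and the function names are hypothetical. The second function illustrates the error argument for the simple case of two equally accurate models averaged with equal weight.

```python
def blend_confidence(xgb_prob, llm_score, xgb_weight=0.5):
    """Weighted average of two model outputs, both assumed to lie in [0, 1].

    xgb_weight is a hypothetical parameter; the real weighting is not published.
    """
    if not 0.0 <= xgb_weight <= 1.0:
        raise ValueError("xgb_weight must be between 0 and 1")
    return xgb_weight * xgb_prob + (1.0 - xgb_weight) * llm_score


def averaged_error_variance(single_variance, correlation):
    """Error variance of the equal-weight average of two models.

    Assumes both models have the same error variance. With correlation 1
    averaging gains nothing; with correlation 0 the variance is halved.
    """
    return single_variance * (1.0 + correlation) / 2.0


print(round(blend_confidence(0.30, 0.40), 2))   # 0.35
print(averaged_error_variance(1.0, 1.0))        # 1.0
print(averaged_error_variance(1.0, 0.0))        # 0.5
```

The less the two models' errors move together, the more the blend improves on either model alone, which is why independence between the XGBoost and LLM outputs matters.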
Key inputs to the blended score:
3. Red Flag Penalty
What it is: The LLM explicitly flags known risk factors. A single flag halves this component's contribution; two or more flags zero it entirely.
Common red flags:
Why it matters: Red flags don't eliminate a horse from consideration — they reduce conviction. A selection that scores well on all five other signals but carries two red flags still has a significantly reduced conviction score, which is the correct response: lower the stake, not necessarily pass the race.
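The penalty rule can be written down directly. The sketch below assumes the penalty applies to the red flag component's own share of the score (100/6 points); the function names are illustrative.

```python
MAX_PER_SIGNAL = 100 / 6  # each of the six components is worth ~16.67 points


def red_flag_multiplier(n_flags):
    """No flags keeps the full component, one flag halves it, two or more zero it."""
    if n_flags < 0:
        raise ValueError("n_flags cannot be negative")
    if n_flags == 0:
        return 1.0
    if n_flags == 1:
        return 0.5
    return 0.0


def red_flag_points(n_flags):
    """Points contributed by the red flag component after the penalty."""
    return red_flag_multiplier(n_flags) * MAX_PER_SIGNAL


# A selection that maxes out the five other signals but carries two flags.
other_five = 5 * MAX_PER_SIGNAL
print(round(other_five + red_flag_points(2), 2))  # 83.33
print(round(other_five + red_flag_points(1), 2))  # 91.67
```

Under this reading, two flags cost a selection one sixth of its maximum score: conviction drops, but the horse is not excluded.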
4. Track Bias Alignment
What it is: A daily track bias reading — derived from race-by-race pattern recognition — that scores whether a horse's running style matches the prevailing advantage. On a day where front-runners are significantly outperforming, a confirmed sit-and-sprint runner gets a low alignment score.
Why it matters: Track bias is one of the most undervalued factors in Australian racing. A horse that is genuinely the best performer in the race can still lose on a day where the track condition or rail position systematically disadvantages its running style. Conversely, a marginal horse whose style suits a strong inside bias can run well above its form figures.
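One simple way to quantify a daily bias is an impact value per running style: the share of winners with that style divided by the share of runners with that style. The sketch below is an assumption about how such a reading could be built from the day's earlier races; the article does not specify FormRace's method, and the style labels and data are made up.

```python
from collections import Counter


def style_impact_values(runs):
    """Impact value per running style from the day's completed races.

    `runs` is an iterable of (running_style, won) pairs, one per runner.
    A value above 1.0 means the style is winning more than its share of
    runners would suggest; below 1.0 means it is underperforming.
    """
    runners = Counter()
    wins = Counter()
    for style, won in runs:
        runners[style] += 1
        if won:
            wins[style] += 1
    total_runners = sum(runners.values())
    total_wins = sum(wins.values())
    if total_wins == 0:
        return {}  # no completed races yet, so no reading
    return {
        style: (wins[style] / total_wins) / (count / total_runners)
        for style, count in runners.items()
    }


# Hypothetical day: four races run, leaders have won three of them.
day = (
    [("leader", True)] * 3 + [("leader", False)] * 3
    + [("closer", True)] * 1 + [("closer", False)] * 17
)
print(style_impact_values(day))
```

With only a handful of races the reading is noisy, so a real system would likely need to temper it with prior knowledge of the track.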
How to spot bias:
5. LLM Agreement
What it is: A binary signal — does the LLM's natural-language race analysis conclude with a BET or BUY recommendation? The LLM reads the full form narrative, trainer and jockey comments (where available), and race conditions before reaching its conclusion.
Why it matters: The quantitative model can't read between the lines. The LLM can factor in information that's hard to encode numerically: "trainer has won this race three of the past five years," "horse has been freshened specifically for this assignment," or "jockey is claiming 2kg and has ridden this horse in trackwork."
LLM signals that increase confidence:
6. Kelly Strength
What it is: The Kelly Criterion is a formula that calculates the mathematically optimal stake given a probability estimate and the available odds. A Kelly stake percentage of 10%+ means the model believes there is a large positive edge at the available price.
Why it matters: This signal directly quantifies value. A horse can have strong conviction across all other components but poor Kelly strength if the market has already priced in the edge. High Kelly strength means the model's probability is significantly higher than what the market implies.
The value threshold: FormRace requires a minimum value ratio of 1.30, meaning the model's win probability must be at least 1.3 times the probability implied by the odds. A $3.00 horse has a market-implied probability of ~33%, so the model needs to rate its chance at roughly 43% or better to pass this filter.
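The Kelly and value-ratio arithmetic can be checked with a short sketch. It uses the standard Kelly formula for a single win bet at decimal odds and takes the implied probability as 1 divided by the odds, ignoring bookmaker margin. The function names are illustrative, and how FormRace maps the Kelly percentage onto its signal score is not shown here.

```python
def implied_probability(decimal_odds):
    """Market-implied win probability from decimal odds (margin ignored)."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    return 1.0 / decimal_odds


def value_ratio(model_prob, decimal_odds):
    """Model probability divided by the market-implied probability."""
    return model_prob / implied_probability(decimal_odds)


def kelly_fraction(model_prob, decimal_odds):
    """Kelly stake as a fraction of bankroll: (b*p - q) / b, floored at 0.

    b is the net odds (decimal odds minus 1), p the model's win
    probability, and q = 1 - p.
    """
    b = decimal_odds - 1.0
    if b <= 0:
        raise ValueError("decimal odds must be greater than 1")
    fraction = (b * model_prob - (1.0 - model_prob)) / b
    return max(fraction, 0.0)


def passes_value_filter(model_prob, decimal_odds, min_ratio=1.30):
    """True when the value ratio meets the 1.30 threshold from the text."""
    return value_ratio(model_prob, decimal_odds) >= min_ratio


# A $3.00 horse the model rates at 44%: ratio 1.32, Kelly stake 16%.
print(round(value_ratio(0.44, 3.0), 2))     # 1.32
print(round(kelly_fraction(0.44, 3.0), 2))  # 0.16
print(passes_value_filter(0.44, 3.0))       # True
print(passes_value_filter(0.40, 3.0))       # False
```

Flooring the Kelly fraction at zero reflects that a negative result means no bet, not a reverse bet.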
---
What the Research Says
Beyond FormRace's specific signals, a body of racing research — from academics and full-time professional punters — identifies the factors that most consistently correlate with winning. Here is the evidence hierarchy:
Tier 1: High-Confidence Predictors
Recent speed figures and form trajectory
The two factors most strongly correlated with winning are average earnings per run (a class proxy) and average speed rating over the last 3–5 starts. Horses showing an upward trajectory in speed figures — especially when stepping to a class level where their figures are competitive — win more often than expected.
Class drop (stepping down)
Horses dropping in class carry a significant statistical advantage. Australian Benchmark ratings make this measurable: a horse rated 85 running in a BM 72 has form figures that are expected to dominate, all else equal. The key question is why the class drop — genuine form improvement targeting a weaker race, or a trainer struggling to find a race where the horse is competitive? Intent matters.
Going match
Track condition is a major predictor for horses with a clear preference. A "Good 4" specialist on a "Heavy 9" track is structurally disadvantaged regardless of class. The going-match signal becomes especially powerful when the track condition diverges significantly from the horse's optimal range.
Tier 2: Meaningful But Context-Dependent
Jockey and trainer combinations
A top jockey on an in-form trainer's horse at a track where that trainer has a strong record is a meaningful signal — but it must not be evaluated in isolation. The combination matters most when the horse's underlying form is also competitive. Elite trainers in elite form fire horses into races deliberately; that intent is worth pricing in.
Barrier draw and pace
Post position matters, but the magnitude varies by track, distance, and surface. Barrier draw is most significant in sprint races (1000–1200m) on tight tracks with short runs to the first turn. In longer races on wide tracks, the effect diminishes. The interaction between barrier draw and pace scenario (front-runners vs. closers) is where the real modelling edge lives.
Distance match
Horses running at their optimum distance (within ~100m) outperform horses stretching or shortening. The strongest positive signal is a horse that has run well at a distance multiple times and is returning to it after a spell or class adjustment.
Tier 3: Overrated by the Public
Jockey name alone
Australia's premier jockeys (Nash Rawiller, James McDonald, Damian Lane) win at high rates, but their mounts also start at short prices that eliminate most of the edge. When a big-name jockey is on a horse at $2.20, the market has already priced in the jockey premium. The value signal comes from finding elite jockeys on horses the market has under-priced.
Last-start winner status
Recency bias leads the market to over-bet last-start winners, which shortens their prices. A horse that won a BM 64 last week and runs in a BM 80 this week is not a last-start winner who will naturally repeat; it is a horse being tested at a materially harder class level. Once that context is accounted for, the "last-start winner" halo is often illusory.
Barrier trials
A strong trial is a fitness indicator, not a form indicator. It proves the horse is ready to run; it doesn't prove the horse is competitive at the class level of today's race. Barrier trial results should ease the red flag penalty (by removing fitness concerns), not independently drive selection.
---
The Framework in Practice
When all six signals align, a high-conviction selection looks like this:
That alignment is rare. On most days across most races, one or more signals are absent. The discipline is in passing those races, not forcing selections.
---
Updating This Framework
This framework will be updated as FormRace's settled result history grows. The key metrics to watch:
As the dataset accumulates, the signal weights will be adjusted to reflect what actually predicts Australian thoroughbred winners — not just what theory suggests should work.
---
Responsible gambling: Racing involves financial risk. National Gambling Helpline: 1800 858 858.