Every honest description of a trading idea ends the same way: here is how it could fail. This chapter is that ending. No hedging language, no "but we're confident" — just the plain list of things that could go wrong, followed by a glossary so you never have to guess what a word meant.
By now you've seen the whole machine. A crowd on a prediction market tends to overpay for surprises — for the low-probability weather events that feel dramatic but rarely happen. We estimate that overpayment at roughly 1.3x: the crowd charges about 30% more than the fair price for those long-shot outcomes. We forecast the weather ourselves with a large ensemble of models, compare our number to the market's number, and bet only when the gap is big enough to be worth it. That's the edge. This chapter is about all the ways that story could be wrong, incomplete, or simply unlucky.
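To make the arithmetic concrete, here is a minimal Python sketch of that comparison. It is an illustration, not polyAether's actual code: the function names, the `min_edge` threshold, and the choice to buy NO when the market looks too expensive are all assumptions made for the example. The only number taken from the text is the 1.3x overpricing estimate.

```python
OVERPRICING = 1.3  # from the text: the crowd pays ~1.3x fair value on long shots


def implied_fair_price(market_price, overpricing=OVERPRICING):
    """Fair price implied by a market price, if the 1.3x estimate holds."""
    return market_price / overpricing


def decide(our_prob, market_price, min_edge):
    """Return 'YES', 'NO', or None (no bet).

    min_edge is a hypothetical threshold: the smallest gap, in
    probability points, that we treat as worth trading.
    """
    gap = our_prob - market_price
    if gap >= min_edge:
        return "YES"   # market looks too cheap relative to our forecast
    if -gap >= min_edge:
        return "NO"    # market looks too expensive relative to our forecast
    return None        # gap too small to be worth it


print(implied_fair_price(0.13))                   # roughly 0.10
print(decide(0.09, 0.13, min_edge=0.03))          # NO
print(decide(0.26, 0.25, min_edge=0.05))          # None
```

A long shot priced at 13 cents would be worth about 10 cents if the 1.3x estimate held exactly. Whether it holds is the question the rest of this chapter is about.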
Where things actually stand
Let's start with the single most important fact, stated bluntly.
polyAether has never made a real bet with real money. Everything so far is paper trading — running the full system live, but recording imaginary trades instead of placing them. There is no track record. The edge is a hypothesis supported by historical analysis, not a proven fact confirmed by profit.
"Paper trading" means the software watches real markets in real time and decides what it would do — but no money changes hands. It's a flight simulator, not a flight. Simulators are genuinely useful; they catch bugs and let you measure behaviour safely. But a simulator has never crashed a real plane, and it has never landed one either. Keep that framing for everything below.
The risks, plainly
The edge might not be real
The whole thesis rests on that 1.3x overpricing. We measured it in historical data — but the past is not a promise. Markets learn. If enough traders notice the same pattern, they stop overpaying, and the gap we're betting on quietly closes. It's also possible the pattern was partly an illusion in the data — a coincidence that looked like a rule. We won't know for sure until real money has been at stake for a long time.
The classic trap: a pattern that fits yesterday's data perfectly but has no predictive power tomorrow — like a "system" for picking lottery numbers that only works on last week's draw.
Thin liquidity
Liquidity is how much you can buy or sell without moving the price against yourself. Weather markets are small. If we try to place a meaningful bet and there simply aren't enough people on the other side, we either can't fill the order or we push the price so far that the edge evaporates before we've finished buying. A brilliant forecast is worthless if there's no one to trade with.
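A small sketch shows how quickly this bites. The order book, the forecast, and the function name below are all made up for illustration; the point is only the mechanism of "walking the book", where each extra share costs more than the last.

```python
def walk_the_book(asks, shares_wanted):
    """Simulate buying `shares_wanted` against a list of ask levels.

    asks: list of (price, size) tuples, cheapest first.
    Returns (shares_filled, average_price_paid); the average is None
    if nothing could be filled.
    """
    filled = 0.0
    cost = 0.0
    for price, size in asks:
        if filled >= shares_wanted:
            break
        take = min(size, shares_wanted - filled)
        filled += take
        cost += take * price
    avg = cost / filled if filled else None
    return filled, avg


asks = [(0.10, 100), (0.12, 200), (0.15, 500)]  # hypothetical order book
our_prob = 0.13                                  # hypothetical forecast

filled, avg_price = walk_the_book(asks, 400)
print(filled, avg_price)        # 400 shares at an average near 0.1225
print(our_prob - asks[0][0])    # edge at the best quoted price: about 0.03
print(our_prob - avg_price)     # edge actually captured: about 0.0075
```

In this toy book, a 3-point edge at the quoted price shrinks to less than 1 point once the order is large enough to eat through the cheap levels. Asking for more than the 800 shares on offer simply leaves part of the order unfilled.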
Competition
We are not the only clever people looking at these markets. If a better-funded, faster competitor is doing something similar, they can take the good prices before we do, leaving us the scraps. Edges in markets are shared until they're gone.
Model risk
Our forecast comes from weather models, and weather models are wrong sometimes. A physics assumption breaks, a data feed goes stale, or our probability estimate turns out to be miscalibrated, meaning that the things we call "20% likely" don't actually happen 20% of the time. If our number is off, our "edge" is just confident nonsense, and confident nonsense loses money faster than honest uncertainty.
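One common way to check for miscalibration is a reliability table: group past forecasts by the probability we stated, then compare each group's average stated probability with how often the event actually happened. The sketch below is a generic version of that check, not polyAether's diagnostic code; the function name, the bin count, and the sample data are assumptions.

```python
from collections import defaultdict


def reliability_table(probs, outcomes, n_bins=5):
    """Group forecasts into equal-width probability bins.

    probs: forecast probabilities in [0, 1].
    outcomes: 1 if the event happened, 0 if not.
    Returns a list of (mean_forecast, observed_frequency, count) rows,
    one per non-empty bin, in order of increasing probability.
    """
    bins = defaultdict(list)
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[idx].append((p, o))

    rows = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_p = sum(p for p, _ in pairs) / len(pairs)
        freq = sum(o for _, o in pairs) / len(pairs)
        rows.append((mean_p, freq, len(pairs)))
    return rows


# Made-up history: five "20%" calls and five "80%" calls.
probs = [0.2] * 5 + [0.8] * 5
outcomes = [1, 0, 0, 0, 0] + [1, 1, 1, 1, 0]

for mean_p, freq, n in reliability_table(probs, outcomes):
    print(f"said {mean_p:.2f}, happened {freq:.2f}, n={n}")
```

In a well-calibrated history the first two columns roughly agree in every row. With only a handful of forecasts per bin, as in this toy data, the comparison is too noisy to prove much either way.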
Venue risk
The venue is the platform where the market lives (for us, Polymarket). Venue risk is everything outside our control there: the site goes down, rules change, a market settles (pays out) based on a weather reading we didn't expect, funds get frozen, or the legal picture shifts. You can be completely right about the weather and still lose because the venue did something surprising. Chapter 7 covers why settlement is the sneakiest part of all this.
Weather is genuinely uncertain
This one isn't a bug; it's the nature of the thing. The atmosphere is chaotic. Even a perfect model can only give probabilities, never certainties. We are betting on being right on average over many bets, not on any single forecast. Over a small number of trades, plain bad luck can look exactly like a broken strategy. Only a large number of trades can tell the two apart, and accumulating them takes time.
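How much can luck hide a real edge? The sketch below computes, for a simplified setup, the chance of being in the red after a given number of bets even when the forecast is exactly right. The assumptions are loud ones: every bet is the same size, every bet is a YES share bought at the same price, the outcomes are independent, and the true probability is known. None of that holds in practice, and the numbers are invented.

```python
from math import comb


def prob_net_loss(n_bets, price, true_prob):
    """P(total profit < 0) after n equal-sized, independent YES bets.

    Each share costs `price` and pays 1 if the event happens, so a win
    earns (1 - price) and a loss costs `price`. Exact break-even is
    counted as "not a loss".
    """
    total = 0.0
    for wins in range(n_bets + 1):
        pnl = wins * (1 - price) - (n_bets - wins) * price
        if pnl < -1e-12:  # small tolerance for floating-point ties
            total += (comb(n_bets, wins)
                      * true_prob ** wins
                      * (1 - true_prob) ** (n_bets - wins))
    return total


# Hypothetical long shot: priced at 10 cents, truly 13% likely.
for n in (20, 200, 1000):
    print(n, round(prob_net_loss(n, price=0.10, true_prob=0.13), 3))
```

In this toy setup, a genuinely profitable strategy still has roughly a one-in-four chance of showing a loss after 20 bets. The chance shrinks as the number of bets grows, which is the sense in which only a long record can separate bad luck from a bad idea.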
The edge is plausible and carefully measured, but unproven. Even if it's real, thin liquidity, competition, model errors, and venue surprises can eat it. And weather itself guarantees that being right "on average" still means losing plenty of individual bets. This is a research project, not a sure thing.
The guardrails
We can't remove these risks, but we can refuse to let any one of them sink us. Chapter 8 is the full story; here's the short version. We bet only a fraction of what the math says is optimal (fractional Kelly). We cap how much rides on any single market, on the total book, and on any one day. We limit exposure to bets that would all lose together (a correlation cap). And there's a kill switch: a single control that halts everything instantly if the system misbehaves. None of this makes the edge real. It just makes sure that if we're wrong, we live to learn from it.
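Here is a rough sketch of how those guardrails could stack on top of each other. The Kelly formula used is the standard one for a contract bought at `price` that pays 1 if the event happens: f = (p - price) / (1 - price). Everything else is an assumption for illustration: the `Limits` class, every number in it, and the idea of modelling the correlation cap as a per-group limit. These are placeholders, not polyAether's settings.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    # All values are hypothetical fractions of bankroll.
    kelly_fraction: float = 0.25   # bet a quarter of full Kelly
    max_per_market: float = 0.02
    max_total_book: float = 0.20
    max_per_day: float = 0.05
    max_per_group: float = 0.04    # group = bets that would lose together


def full_kelly(p, price):
    """Full-Kelly bankroll fraction for a binary contract paying 1."""
    if not 0 < price < 1:
        raise ValueError("price must be strictly between 0 and 1")
    return max(0.0, (p - price) / (1 - price))


def position_size(p, price, limits, book_used=0.0, day_used=0.0,
                  group_used=0.0, kill_switch=False):
    """Bankroll fraction to stake after applying every guardrail."""
    if kill_switch:
        return 0.0  # the kill switch overrides everything else
    wanted = limits.kelly_fraction * full_kelly(p, price)
    room = min(
        limits.max_per_market,
        limits.max_total_book - book_used,
        limits.max_per_day - day_used,
        limits.max_per_group - group_used,
    )
    return max(0.0, min(wanted, room))


limits = Limits()
print(full_kelly(0.40, 0.25))                              # about 0.20
print(position_size(0.40, 0.25, limits))                   # 0.02
print(position_size(0.40, 0.25, limits, kill_switch=True)) # 0.0
```

With a 40% forecast against a 25-cent price, full Kelly would stake about a fifth of the bankroll. A quarter of that is still 5%, so in this example the per-market cap of 2% is what actually binds.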
Every term, defined
If a word in this course ever felt like a bluff, here's the plain meaning.
- Ensemble — Running the weather model many times with slightly different starting conditions and getting a spread of answers instead of one. The spread is the forecast's uncertainty. polyAether uses a ~122-member ensemble built from three model families (GFS, ICON, ECMWF).
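- Brier score — A grade for probability forecasts: the average squared gap between the probability we gave and what actually happened (1 if the event occurred, 0 if not). Lower is better, and 0 is perfect. It rewards being both confident and right, and punishes confident-but-wrong hard. It's how we check whether our forecasts are actually any good.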
- Kelly — A formula for bet size that grows your money fastest over the long run given your edge. Full Kelly is aggressive and swingy, so we use a fraction of it to stay calmer and safer.
- Edge — The gap between our estimated probability and the market's price. If we think an event is 40% likely and the market prices it at 25%, the edge is that 15-point difference — the reason to bet at all.
- Calibration — Whether your probabilities match reality. If everything you call "30%" happens 30% of the time, you're well calibrated. Miscalibration turns a real edge into a fake one.
- Liquidity — How much you can trade without moving the price against yourself. High liquidity = deep market, easy to get in and out. Low liquidity = thin market, your own order shifts the price.
- Bucket — A range a market question carves the weather into, e.g. "high temperature 70–72°F." Each bucket is a separate yes/no bet.
- Station — The specific weather-observation site whose reading officially decides a market. polyAether tracks ~80 curated stations. Which station settles a market matters enormously (see Chapter 7).
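- PIT — Probability Integral Transform. A diagnostic that checks whether our ensemble's uncertainty is honest: not too confident, not too timid. For a well-calibrated ensemble, a histogram of PIT values is roughly flat. A U shape means the spread is too narrow, a hump in the middle means it is too wide, and a lopsided shape points to a systematic bias. Any of these means the forecast needs fixing.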
- Settlement — How a market pays out: which real-world measurement, from which station, at which moment, decides who was right.
- Paper trading — Running the whole system live but with imaginary money. Where we are right now.
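Two of the terms above, the Brier score and the PIT, are simple enough to show in a few lines. The sketch below uses the textbook definition of the Brier score and a rank-based approximation of the PIT (the fraction of ensemble members at or below the observed value). The function names and sample numbers are made up, and this is not necessarily how polyAether computes either one.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome.

    outcomes: 1 if the event happened, 0 if not. Lower is better.
    """
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def pit_value(ensemble, observed):
    """Empirical PIT: share of ensemble members at or below the observation.

    Collected over many forecasts, these values should look roughly
    uniform on [0, 1] if the ensemble spread is honest.
    """
    return sum(m <= observed for m in ensemble) / len(ensemble)


print(brier_score([0.5, 0.5], [1, 0]))          # 0.25: always saying 50%
print(brier_score([1.0, 0.0], [1, 0]))          # 0.0: confident and right
print(brier_score([0.9], [0]))                  # about 0.81: confident and wrong
print(pit_value([70, 71, 72, 73], 71.5))        # 0.5: observation mid-ensemble
```

A single PIT value says nothing on its own. The diagnostic comes from the shape of the histogram over many forecasts.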
That's the honest picture, and that's the whole vocabulary. If you've read this far, you understand polyAether as well as we do — including, importantly, everything we don't yet know.