A Dam Builder’s Yardstick in the Lone Star State

Rolling Hurst tested as an early warning for Texas electricity spikes, in forty-eight different configurations, and (spoiler) it lost every one.

A Civil Servant on the Nile

In the 1940s, an unfussy British civil servant named Harold Edwin Hurst was trying to design the Aswan High Dam. The problem is easy to state and horrible to solve: how big does the reservoir have to be so that the dam never runs dry in a drought and never overtops in a flood?

The Nile's annual flood records went back roughly 800 years. Hurst waded into them and noticed that the river did not behave the way classical probability theory said it should. Floods clustered. Droughts clustered. A run of wet years made another wet year likelier, not less likely. The river had memory.

He boiled the effect down to a single number. Take the range of cumulative deviations from the mean, rescale it by the standard deviation, plot that ratio against window size on log-log axes, fit a line, and read off its slope. That slope is the Hurst exponent, conventionally written H. It lives between 0 and 1:

- H = 0.5 — coin-flip independence. A random walk. No memory.

- H > 0.5 — persistent. Trends continue. Wet years follow wet years.

- H < 0.5 — anti-persistent. Mean-reverting. Up-days followed by down-days.

Hurst measured H ≈ 0.7 on the Nile, sized the dam accordingly, and accidentally gave Mandelbrot a career. The exponent turned out to apply to rather more than hydrology: stock returns, heartbeats, network traffic, river flows on four continents, and — as we shall see in a moment — the wholesale price of electricity in Texas.

It is a beautiful tool. The question in front of us is whether it is a useful tool, which is an entirely different and much nastier question.

Electricity Markets for People Who Have Better Things to Do

Electricity is the only commodity in civilian life that must be produced at the exact instant it is consumed. There is no warehouse. Batteries help at the margins but do not change the basic physics: generation and demand must match, continuously, or the grid falls over.

Wholesale electricity markets exist to coordinate this. Every few minutes — every five in some markets, every fifteen in others — operators run an auction. Generators bid the price they will accept to produce the next block of megawatt-hours. The market clears at a single price that everyone producing in that interval gets paid. That clearing price is the wholesale price, and it is the number we care about.

Most of the time it is boring. $20, $30, $50 per megawatt-hour. Coal, gas, wind, solar all undercut each other, nobody loses their shirt, the lights stay on.

And then, occasionally, it is not boring. Demand surges, a generator trips, the wind stops. Cheap capacity is exhausted and the market reaches down into increasingly expensive reserves. Prices climb. Climb some more. And then detonate.

Seven years of ERCOT real-time prices across the four zonal hubs. Flat, flat, flat — and then a very bad week in February 2021.

The flat green floor you see in the chart above is the normal market doing normal market things. The vertical stalactites are scarcity events: moments when supply and demand came dangerously close to not matching. In February 2021, during Winter Storm Uri, gas pipelines in Texas literally froze. The price pinned at $9,000/MWh — the legal maximum — for the better part of a week. People froze. Other people made fortunes. A handful of retail electricity suppliers went bankrupt inside seventy-two hours.

Knowing a spike is coming is worth a great deal of money. It is also worth a great deal of not-dying, if you happen to be running a dialysis machine at the time. Which is why anyone with a background in time-series analysis, and a surplus of spare weekends, eventually looks at the ERCOT price tape and asks: is there a signal hiding in here?

Capacity Markets, and the Absence Thereof

Most grids run a capacity market alongside the energy market. It works like a retainer. Generators are paid simply for being available — for standing at attention, with fuel on site and a turbine warmed up, ready to produce electricity if called. They get that retainer whether they run or not.

The logic is insurance. Building a gas peaker plant is expensive. If you only get paid when you actually produce, and you only produce three days a year during summer heatwaves, the economics never close. You do not build the plant. The peaker does not exist. The heatwave arrives. The lights go out. Everybody is very cross.

Capacity payments fix this by socialising the cost of reserve margin. Everybody's electricity bill is slightly higher; in exchange, somebody else's lights stay on during the bad week.

Texas does not do this.

ERCOT is an energy-only market. Generators get paid for electricity they produce, and for nothing else. There is no retainer, no availability fee, no capacity auction. The only signal that the system is running short of reserves is the price itself, which is allowed to go mental. Through 2021 the ceiling was $9,000/MWh. From the start of 2022 it was lowered, grudgingly, to $5,000/MWh. Both are absurdly high, a hundred times a normal operating price or more, and that is deliberate. In an energy-only market, scarcity pricing has to do the job that a capacity payment does elsewhere.

This design makes Texas electricity cheaper on average than almost anywhere else in the developed world. It also makes it the most volatile energy-only wholesale market on the planet. You are not hallucinating those spikes in the chart above: the market is supposed to behave like that.

It is also the ideal natural experiment. If a fractal precursor to spikes exists anywhere, it exists here.

Why Hurst Might Save Texas (Spoiler Alert)

The theory is almost elegant enough to be true.

Classical forecasting tools — ARIMA models, LSTMs, the usual suspects — are fundamentally averaging machines. They minimize squared error on the bulk of the distribution. They are excellent at predicting next Tuesday's typical price and uselessly bad at predicting next Tuesday's spike, because the spike lives in the fat tail that the averaging ignored. And the tail is exactly what you care about.

The Hurst exponent is different. It does not try to predict the next value. It measures the character of the memory in the signal. Specifically: how persistent are the trends right now?

The hypothesis: before a regime transitions from calm into a scarcity event, its memory structure ought to change. Maybe the short-term returns become more persistent as the market climbs the supply stack. Maybe they become less persistent as the system hunts between generators at the margin. Either way, something in the fractal structure of the local price movement should look different in the minutes before a spike than in the hours of calm that preceded it.

That shift, if it exists, is the leading indicator. Measure Hurst on a rolling window, watch for the characteristic move, alert the dispatcher, make money and save lives.

That's the hypothesis. Let's go break it.

The Experiment

Seven years of ERCOT real-time settlement prices at 15-minute resolution, across all four zonal hubs. Roughly 245,000 samples per hub; just under one million in total. 2019 through 2025. Winter Storm Uri included, treated honestly: run the analysis with the storm in, run it again with the worst 0.5% of prices stripped, report both.

The codebase is C11, libc and libm only, because the target audience deserves a PDF and CSVs rather than a notebook that stops working the moment anyone touches it. Gnuplot for figures. XeLaTeX for the document. No Python. There will be no Python.

Before pointing any new number at Texas, the toolchain was run against a sibling project's European (EPEX) data and asked to reproduce a previously-published whitepaper table to seven decimal places. It did. A drifted toolchain is useless; a validated one is the minimum cost of entry.

Then:

Global scaling. Compute Hurst three ways across the full seven-year series for each hub: rescaled-range (R/S) on prices, R/S on signed-log returns, and DFA (detrended fluctuation analysis, less twitchy around non-stationarity). Answer: ERCOT prices are highly persistent, H ≈ 0.84 — almost exactly the same as European power markets. ERCOT returns are strongly mean-reverting, H ≈ 0.30. Prices drift with memory; individual jumps reverse within minutes. The market has a fractal character. So far, so unsurprising.

Regime segmentation. Slice the dataset by season × time-of-day × day-of-week and recompute H in each cell. Winter overnight stretches show the longest memory; shoulder-season afternoons the shortest. HB_WEST — the wind-heavy hub — behaves visibly differently from the other three. (HB_WEST is where surplus West Texas wind regularly drives prices negative. Four thousand negative-price intervals in 2022 alone. The other hubs run an order of magnitude lower.)

Local Hurst by season and time-of-day for each hub.

A publishable result on its own. Also — bluntly — not the question we came here to answer.

The headline experiment. Compute Hurst on a rolling window across the full seven years. Three window sizes: 24 hours (96 samples), 48 hours (192 samples), 7 days (672 samples). For each rolling H value, look forward by a horizon Δ ∈ {15 min, 1 h, 4 h, 24 h} and ask: does this value of H predict whether a price spike happens within the next Δ?
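
The one thing a rolling indicator must get right is causality: the window ending at time t may only contain samples up to t. Here is a rough sketch of that loop, with the window and horizon sizes above converted to 15-minute samples. The names are hypothetical, and `window_mean` is only a stand-in statistic so the loop can be exercised; a Hurst estimator with the same signature would slot in the same way.

```c
#include <assert.h>
#include <stddef.h>

/* Window and horizon sizes from the text, in 15-minute samples. */
static const size_t WINDOW_SAMPLES[3]  = {96, 192, 672};  /* 24 h, 48 h, 7 d */
static const size_t HORIZON_SAMPLES[4] = {1, 4, 16, 96};  /* 15 min .. 24 h  */

/* Any per-window statistic, for example a Hurst estimator. */
typedef double (*window_stat_fn)(const double *x, size_t n);

/* Placeholder statistic, used here only to exercise the loop. */
double window_mean(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++) s += x[i];
    return s / (double)n;
}

/* out[k] holds the statistic of x[k .. k + win - 1], i.e. the window that
 * ENDS at time index k + win - 1. No future samples leak into any window.
 * Returns the number of values written (n - win + 1, or 0 if n < win). */
size_t rolling_stat(const double *x, size_t n, size_t win,
                    window_stat_fn stat, double *out)
{
    assert(win >= 1 && stat != NULL);
    if (n < win) return 0;
    for (size_t k = 0; k + win <= n; k++)
        out[k] = stat(x + k, win);
    return n - win + 1;
}
```

Each output value is then paired with whatever happens in the Δ samples after its window closes, never with anything inside the window.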

A "spike" is any price crossing the trailing 30-day 95th percentile. The question becomes a binary classification problem: from the local Hurst right now, discriminate "spike incoming" from "no spike incoming".

The natural metric is ROC-AUC. It is the probability that, if you draw one genuinely-spike-about-to-happen window and one quiet window at random, your predictor ranks the spike window higher. AUC = 1.0 is a perfect oracle. AUC = 0.5 is a coin. Anything clearly north of 0.5 is signal.
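
That ranking probability can be computed directly from ranks, without drawing a curve, via the Mann-Whitney U statistic. A minimal sketch, with hypothetical names and tied scores given their average rank:

```c
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { double score; int label; } scored_t;

static int cmp_scored(const void *a, const void *b)
{
    double x = ((const scored_t *)a)->score;
    double y = ((const scored_t *)b)->score;
    return (x > y) - (x < y);
}

/* ROC-AUC as the Mann-Whitney U statistic divided by n_pos * n_neg.
 * Ties share their average rank, so a constant score yields exactly 0.5.
 * Returns NaN if either class is empty or allocation fails. */
double roc_auc(const double *score, const unsigned char *label, size_t n)
{
    scored_t *s = malloc(n * sizeof *s);
    if (!s) return NAN;
    size_t n_pos = 0;
    for (size_t i = 0; i < n; i++) {
        s[i].score = score[i];
        s[i].label = label[i] != 0;
        if (s[i].label) n_pos++;
    }
    qsort(s, n, sizeof *s, cmp_scored);

    double rank_sum = 0.0;               /* sum of ranks of the positives */
    for (size_t i = 0; i < n; ) {
        size_t j = i;
        while (j < n && s[j].score == s[i].score) j++;
        double avg_rank = 0.5 * (double)(i + 1 + j);  /* ranks i+1 .. j */
        for (size_t k = i; k < j; k++)
            if (s[k].label) rank_sum += avg_rank;
        i = j;
    }
    free(s);

    size_t n_neg = n - n_pos;
    if (n_pos == 0 || n_neg == 0) return NAN;
    double np = (double)n_pos, nn = (double)n_neg;
    return (rank_sum - 0.5 * np * (np + 1.0)) / (np * nn);
}
```

For scores {0.1, 0.4, 0.35, 0.8} with labels {0, 0, 1, 1}, three of the four spike/quiet pairs are ranked correctly, so the function returns 0.75.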

Four hubs, three windows, four horizons. 48 cells. Every one of them run both on the raw series and on the event-stripped series. No cherry-picking. Report the grid.

A week's worth of price overlaid on rolling local Hurst around a summer scarcity event. The eyeball fails to find a precursor, which turns out to be an accurate preview.

The Result

ROC-AUC for every (hub × window × horizon) cell. The colour bar does not have to range very far.

Every one of the 48 cells lands between AUC 0.43 and 0.50. Zero cells above 0.5. All the uncertainty around 0.5 lives on the wrong side of it, consistently, across every hub and every window and every horizon.

In Orac-speak: "The signal you are looking for is not here, and it is not here in forty-eight different ways."

A follow-on check (comparing the distribution of local Hurst in the 24 hours before a spike against matched non-spike windows) produced the same answer from the opposite angle. The two distributions differ by about 0.022 in mean against a within-class spread of roughly 0.11, and that gap vanishes entirely once the event-stripped series is used. Two independent tests, same direction, same negligible effect size.

Box plots of local Hurst in spike-preceding windows vs matched quiet windows. The boxes overlap so thoroughly the question answers itself.

So The Answer Is No

Rolling local Hurst, as a stand-alone leading indicator of near-term price spikes on ERCOT, does not work. The evidence is as clean as a negative result ever gets: 48 cells, uniform direction, all below 0.5, cross-confirmed by a second independent test, robust to event-stripping.

It is also an honest finding. The toolchain reproduced a prior published result exactly; the data passed every sanity check, every DST edge case, every row-count audit; the test grid was predeclared. There is nowhere for the result to hide.

Why It Doesn't Work

Two reasons, one of them the interesting one.

First, the uninteresting reason. ERCOT scarcity spikes are frequently exogenous. Winter Storm Uri was not an emergent feature of fractal price dynamics; it was a frozen gas pipeline in Pampa. No amount of memory analysis on the pre-storm price tape reveals an approaching ice front. A Hurst exponent is a property of endogenous dynamics. It cannot see a weather system coming, and pretending otherwise is cargo-cult statistics. The report says so on page one.

Second, the interesting reason. ERCOT's price is a composite signal. Layered inside it are:

- A slow demand cycle driven by weather and time-of-day.

- A medium-frequency supply cycle driven by wind, solar, and unit commitment.

- A fast noise process driven by dispatch and the bid stack.

- A spike process driven by the tails of the supply-demand mismatch.

Rolling Hurst collapses all of that into a single scalar per window. If the predictive information lives at one specific frequency band — say, 4-to-12-hour oscillations in net-load-minus-wind — then averaging across bands is exactly the wrong move. The signal gets diluted into noise from the other bands before it reaches the classifier. A single summary statistic, however elegant, is the wrong instrument.

Put less politely: we went hunting megalodons with a bathroom scale and are now reporting that the bathroom did not contain any megalodons. This is true, but perhaps not conclusive.

What would actually work, if anything does, is something that preserves the frequency-band structure: wavelet decomposition of the price signal, computing Hurst per band, and asking the predictive question at each scale separately. Or a multi-factor classifier that takes local Hurst as one feature among several and lets it earn incremental signal on top of baselines like rolling volatility, hour-of-day, and season. Or a change-point detector that models the scarcity process directly rather than hoping a summary statistic will see it coming.
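
For a sense of what the first of those follow-ons involves, here is a sketch of a single Haar analysis step, the simplest possible wavelet split. Haar is chosen only for brevity and is not a recommendation; the function name is hypothetical and nothing here comes from the whitepaper.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* One Haar analysis step on x[0..n-1], n even.
 * approx[i] carries the smooth half of the signal, detail[i] the
 * fluctuation at the current scale. Each output has n / 2 entries. */
void haar_step(const double *x, size_t n, double *approx, double *detail)
{
    assert(n % 2 == 0);
    const double s = 1.0 / sqrt(2.0);    /* orthonormal scaling */
    for (size_t i = 0; i < n / 2; i++) {
        approx[i] = (x[2 * i] + x[2 * i + 1]) * s;
        detail[i] = (x[2 * i] - x[2 * i + 1]) * s;
    }
}
```

Applying the step again to `approx` peels off the next, coarser band. At 15-minute resolution the level-j detail covers oscillations with periods of roughly 2^j to 2^(j+1) samples, so a 4-to-12-hour band sits around levels 4 and 5. A rolling Hurst on each detail series separately is then the per-band version of the experiment.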

All three are tractable follow-ons. None of them are the answer to "does Hurst, by itself, predict ERCOT spikes." That question has an answer, and the answer is no.

The Moral of the Story

A clean null result on a sharply-posed question is worth more than a mushy positive on a vague one. Everybody with a degree in the matter already believes fractal methods characterise market structure; the actual operational question is whether a specific, named fractal tool beats a coin on a specific, named prediction task. That question deserves an answer, and now it has one.

Harold Hurst measured the Nile and sized a dam that has not run dry in sixty years. The exponent that bears his name accurately describes the memory of ERCOT prices: persistent levels, mean-reverting returns, structurally different behaviour in the wind-heavy west. Everything Hurst designed the tool to do, it does.

It just does not, by itself, tell you when the lights are about to go out in Texas.

There are worse things to learn at the cost of one weekend of C and a few million CPU-seconds.

Toolchain: ~1,500 lines of C11 (libc + libm only), gnuplot, xelatex. Data: ERCOT NP6-785-ER, 2019-2025, 15-minute real-time settlement point prices, four zonal hubs.

The full whitepaper, including methodology, figures, and CSVs, is available here. If you're working on a quantitative problem in energy markets and want analysis that tells you what's true rather than what you hoped to find, get in touch.
