What is Statistical Arbitrage?

Statistical arbitrage (stat arb) exploits pricing inefficiencies between related securities. Unlike pure arbitrage (which is risk-free in theory), stat arb uses statistical models to identify when the relationship between securities has deviated from its historical norm, and bets that the relationship will eventually normalize. That bet can lose money if the relationship does not revert.

The most common form is pairs trading: simultaneously buying the relative underperformer and shorting the relative outperformer of two related stocks, typically in the same sector, and profiting when their prices converge.

Pairs Trading Fundamentals

Pairs trading involves three key steps:

  1. Pair Selection

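    Identify two securities whose prices have historically moved together, typically from the same sector or industry (e.g., Coca-Cola and Pepsi, Visa and Mastercard). Strong correlation is a useful first screen, but the pair should also pass a cointegration test.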

  2. Spread Calculation

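    Calculate the price ratio or the spread (the price difference, often weighted by a hedge ratio) between the pair. When this value deviates significantly from its historical mean, usually measured as a z-score, a trading opportunity may exist.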

  3. Trade Execution

    When the spread widens beyond a threshold (e.g., 2 standard deviations from its mean), short the outperformer and buy the underperformer, then close both legs when the spread returns to its mean.
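The spread calculation in step 2 can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function name spread_zscore and the sample prices are made up for the example, and it assumes the spread is defined as the log of the price ratio, with the mean and standard deviation taken over all observations before the latest one.

```python
import math
import statistics


def spread_zscore(prices_a, prices_b):
    """Z-score of the latest log price ratio relative to the earlier history.

    Assumption: the spread is log(price_a / price_b); the latest observation
    is excluded from the mean and standard deviation it is compared against.
    """
    if len(prices_a) != len(prices_b) or len(prices_a) < 3:
        raise ValueError("need two equally long series with at least 3 prices")
    spread = [math.log(a / b) for a, b in zip(prices_a, prices_b)]
    history, latest = spread[:-1], spread[-1]
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)  # sample standard deviation
    if stdev == 0:
        raise ValueError("spread has no variation in the lookback window")
    return (latest - mean) / stdev


# Hypothetical prices: stock A jumps on the last day while stock B stays flat.
prices_a = [100.0, 101.0, 99.0, 100.0, 110.0]
prices_b = [100.0, 100.0, 100.0, 100.0, 100.0]
z = spread_zscore(prices_a, prices_b)
print(f"z-score of latest spread: {z:.2f}")
```

A large positive z-score here means A is expensive relative to B, so the trade in step 3 would short A and buy B. Real strategies typically use a rolling lookback window of fixed length rather than the full history.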

Cointegration: The Key Concept

For pairs trading, you need more than correlation: you need cointegration. Two stocks are cointegrated if some linear combination of their prices (the spread, weighted by a hedge ratio) is stationary, meaning it is mean-reverting and tends to return to a stable long-term equilibrium.

Correlation vs. Cointegration

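Correlation measures how closely the two stocks' returns move together from one period to the next. Cointegration measures whether the spread between their prices reverts to a mean. A pair can be highly correlated without being cointegrated, for example two stocks that both trend upward while the gap between them keeps widening. Such a pair is unsuitable for pairs trading because its spread has no stable mean to revert to.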
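One common way to check for cointegration is the two-step Engle-Granger approach: regress one price on the other to estimate a hedge ratio, then test whether the regression residual (the spread) is stationary. The sketch below uses only the Python standard library and makes several simplifying assumptions: it uses a plain Dickey-Fuller regression with no lagged difference terms, it compares the t-statistic to an approximate 5% critical value of -3.34 rather than computing a p-value, and the data are simulated. The function names are illustrative. In practice a statistics library would normally be used instead, for example the coint function in statsmodels.tsa.stattools.

```python
import math
import random

# Approximate 5% critical value for an Engle-Granger test on two series with
# a constant (MacKinnon's asymptotic tables). Treated as an assumption here.
EG_CRIT_5PCT = -3.34


def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def engle_granger_stat(x, y):
    """Return (hedge_ratio, t_stat) for a simplified Engle-Granger test."""
    intercept, beta = ols(x, y)
    resid = [yi - intercept - beta * xi for xi, yi in zip(x, y)]
    lagged = resid[:-1]
    delta = [b - a for a, b in zip(resid, resid[1:])]
    # Dickey-Fuller regression without a constant: delta_t = rho * resid_{t-1} + u_t
    sll = sum(l * l for l in lagged)
    rho = sum(l * d for l, d in zip(lagged, delta)) / sll
    sse = sum((d - rho * l) ** 2 for l, d in zip(lagged, delta))
    se = math.sqrt(sse / (len(delta) - 1) / sll)
    return beta, rho / se


# Simulated data: x is a random walk, y tracks 2 * x plus stationary noise,
# so the pair is cointegrated by construction.
random.seed(42)
x = [100.0]
for _ in range(499):
    x.append(x[-1] + random.gauss(0, 1))
y = [5.0 + 2.0 * xi + random.gauss(0, 1) for xi in x]

beta, t_stat = engle_granger_stat(x, y)
print(f"hedge ratio = {beta:.3f}, t-stat = {t_stat:.2f}, "
      f"cointegrated at 5% = {t_stat < EG_CRIT_5PCT}")
```

A t-statistic below the critical value rejects the hypothesis that the spread is a random walk, which is the evidence of mean reversion that pairs trading relies on. Ordinary t-test critical values do not apply here because the residual comes from an estimated regression.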

Implementation Example

Pairs Trading Rules

  • Pair: Two stocks with confirmed cointegration (p-value < 0.05 on a cointegration test such as Engle-Granger)
  • Entry (Long Spread): Z-score of spread < -2.0
  • Entry (Short Spread): Z-score of spread > 2.0
  • Exit: Z-score returns to 0 (mean)
  • Stop Loss: Z-score reaches ±3.0 or relationship breaks down
  • Position Sizing: Dollar-neutral (equal dollar amounts long and short)
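The rules above can be expressed as a small state machine. The following is a hedged sketch rather than a complete strategy: the names next_position and dollar_neutral_shares and the sample z-score path are hypothetical, "long spread" is taken to mean buying the first stock and shorting the second, and the code assumes no new trade is opened while the z-score is already beyond the stop level. The "relationship breaks down" condition is not modeled.

```python
ENTRY_Z = 2.0  # entry threshold from the rules above
STOP_Z = 3.0   # stop-loss threshold from the rules above


def next_position(z, position):
    """Return the new position given the current z-score.

    position: +1 = long spread, -1 = short spread, 0 = flat.
    """
    if position == 0:
        if -STOP_Z < z < -ENTRY_Z:
            return 1    # spread unusually low: go long the spread
        if ENTRY_Z < z < STOP_Z:
            return -1   # spread unusually high: go short the spread
        return 0        # inside the band, or already past the stop (assumption)
    if position == 1:
        # Exit at the mean (take profit) or at the stop level (cut loss).
        return 0 if (z >= 0 or z <= -STOP_Z) else 1
    if position == -1:
        return 0 if (z <= 0 or z >= STOP_Z) else -1
    raise ValueError("position must be -1, 0, or 1")


def dollar_neutral_shares(dollars_per_leg, price_long, price_short):
    """Whole-share counts that put roughly equal dollars in each leg."""
    return int(dollars_per_leg // price_long), int(dollars_per_leg // price_short)


# Hypothetical z-score path: a short-spread trade that reverts to the mean,
# followed by a long-spread trade that is stopped out.
z_path = [0.5, 1.8, 2.3, 1.1, -0.1, -2.4, -3.2]
position = 0
positions = []
for z in z_path:
    position = next_position(z, position)
    positions.append(position)
print(positions)

print(dollar_neutral_shares(10_000, 50.0, 200.0))
```

One limitation of this sketch is that a stopped-out trade can be re-entered as soon as the z-score moves back inside the stop level. A real implementation would likely add a cooldown period or re-test cointegration before trading the pair again.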

Risks and Challenges

  • Relationship breakdown: Historical relationships can change permanently (e.g., after a merger or a shift in one company's business), so the spread may never revert
  • Convergence timing: Spread may take longer to normalize than expected
  • Execution costs: Trading two securities roughly doubles transaction costs relative to a single-stock trade
  • Short selling risks: Borrowing costs, short squeezes, and recall risk
  • Model risk: Statistical relationships found in historical data may not persist