Scientific Methodology
Overview
This document describes the scientific methods and mathematical foundations underlying the climate indices and climate extreme event analysis in this package.
Part 1: Climate Indices (SPI & SPEI)
1.1 Standardized Precipitation Index (SPI)
Developed by: McKee, Doesken, and Kleist (1993)
Purpose: Quantify precipitation deficit and surplus relative to long-term climatology for monitoring both dry (drought) and wet (flood/excess) conditions
Interpretation:
- Negative values: Dry conditions (drought)
- Positive values: Wet conditions (flooding/excess precipitation)
Mathematical Foundation:
Step 1: Temporal Aggregation
For a given time scale \(n\) (months), calculate the rolling sum:
\[ P_i^n = \sum_{j=i-n+1}^{i} P_j \]
Where:
- \(P_i^n\) = accumulated precipitation for n-month period ending at month i
- \(P_j\) = monthly precipitation
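As a minimal sketch of the aggregation step, the rolling sum can be written in plain Python. The function name `rolling_sum` is hypothetical and is not part of this package's API; it also assumes the first \(n-1\) months are left undefined (NaN) because their window is incomplete.

```python
import math

def rolling_sum(precip, n):
    """Return n-month accumulations; the first n-1 entries are NaN."""
    out = []
    for i in range(len(precip)):
        if i < n - 1:
            out.append(math.nan)  # window not yet full (assumed convention)
        else:
            out.append(sum(precip[i - n + 1 : i + 1]))
    return out

monthly = [10.0, 0.0, 25.0, 40.0, 5.0]
print(rolling_sum(monthly, 3))  # [nan, nan, 35.0, 65.0, 70.0]
```

On gridded data the same operation would normally be vectorized along the time axis rather than looped.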
Step 2: Distribution Fitting
Fit gamma distribution to aggregated precipitation using Maximum Likelihood Estimation:
Gamma PDF:
\[ f(x) = \frac{1}{\beta^\alpha \Gamma(\alpha)} x^{\alpha-1} e^{-x/\beta} \]
Where:
- \(\alpha\) = shape parameter
- \(\beta\) = scale parameter
- \(\Gamma(\alpha)\) = gamma function
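Parameter Estimation (Thom's closed-form approximation to the maximum likelihood solution):
\[ \alpha = \frac{1}{4A} \left(1 + \sqrt{1 + \frac{4A}{3}}\right) \]
\[ \beta = \frac{\bar{x}}{\alpha} \]
Where:
\[ A = \ln(\bar{x}) - \frac{\sum\ln(x)}{N} \]
- \(\bar{x}\) = mean of the non-zero aggregated precipitation values
- \(N\) = number of non-zero values (not the time scale \(n\))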
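The estimator above can be sketched in a few lines. This is an illustration only: `fit_gamma_thom` is a hypothetical name, zeros are assumed to be excluded before fitting (they are handled separately through \(q\)), and the sketch does not guard against the degenerate case where all values are identical (\(A = 0\)).

```python
import math

def fit_gamma_thom(values):
    """Estimate gamma shape (alpha) and scale (beta) from positive values."""
    positive = [x for x in values if x > 0]  # assumption: zeros are dropped here
    count = len(positive)
    mean = sum(positive) / count
    # A = ln(mean) - mean(ln(x)); positive unless all values are equal
    A = math.log(mean) - sum(math.log(x) for x in positive) / count
    alpha = (1.0 + math.sqrt(1.0 + 4.0 * A / 3.0)) / (4.0 * A)
    beta = mean / alpha
    return alpha, beta

alpha, beta = fit_gamma_thom([20.0, 35.0, 50.0, 80.0, 120.0])
print(alpha, beta)  # alpha * beta reproduces the sample mean
```

By construction \(\alpha \beta = \bar{x}\), which gives a quick sanity check on any fitted pair.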
Step 3: Probability Calculation
Calculate cumulative probability:
\[ G(x) = \int_0^x f(t) \, dt \]
Account for zero precipitation:
\[ H(x) = q + (1-q) \cdot G(x) \]
Where:
- \(q\) = probability of zero precipitation
- \(1-q\) = probability of non-zero precipitation
Step 4: Standardization
Transform to standard normal distribution:
\[ \text{SPI} = \Phi^{-1}(H(x)) \]
Where:
- \(\Phi^{-1}\) = inverse standard normal cumulative distribution function
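Steps 3 and 4 can be sketched with the standard library alone. The names `gamma_cdf` and `spi_value` are hypothetical, the power series is a simple stand-in for a library routine such as SciPy's regularized incomplete gamma function, and clipping \(H(x)\) away from 0 and 1 is an assumption made here so that the inverse normal stays finite.

```python
import math
from statistics import NormalDist

def gamma_cdf(x, alpha, beta):
    """G(x): regularized lower incomplete gamma, evaluated by power series."""
    if x <= 0:
        return 0.0
    z = x / beta
    term = 1.0 / alpha
    total = term
    k = 1
    # Sum z^k / (alpha (alpha+1) ... (alpha+k)) until terms are negligible
    while abs(term) > 1e-15 * abs(total):
        term *= z / (alpha + k)
        total += term
        k += 1
    return total * math.exp(-z + alpha * math.log(z) - math.lgamma(alpha))

def spi_value(x, alpha, beta, q):
    """Map an accumulated precipitation value to a standardized index."""
    h = q + (1.0 - q) * gamma_cdf(x, alpha, beta)  # mixed distribution H(x)
    h = min(max(h, 1e-6), 1.0 - 1e-6)              # assumed clipping bounds
    return NormalDist().inv_cdf(h)

# The median of the fitted distribution maps to an index of about zero
print(spi_value(math.log(2.0), alpha=1.0, beta=1.0, q=0.0))
```

The series is adequate for the moderate arguments typical of monthly totals; a production implementation would rely on a tested numerical library.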
Interpretation:
| Category | SPI Value | Hex | RGB |
|---|---|---|---|
| Exceptionally Dry | -2.00 and below | #760005 | rgb(118, 0, 5) |
| Extremely Dry | -2.00 to -1.50 | #ec0013 | rgb(236, 0, 19) |
| Severely Dry | -1.50 to -1.20 | #ffa938 | rgb(255, 169, 56) |
| Moderately Dry | -1.20 to -0.70 | #fdd28a | rgb(253, 210, 138) |
| Abnormally Dry | -0.70 to -0.50 | #fefe53 | rgb(254, 254, 83) |
| Near Normal | -0.50 to +0.50 | #ffffff | rgb(255, 255, 255) |
| Abnormally Moist | +0.50 to +0.70 | #a2fd6e | rgb(162, 253, 110) |
| Moderately Moist | +0.70 to +1.20 | #00b44a | rgb(0, 180, 74) |
| Very Moist | +1.20 to +1.50 | #008180 | rgb(0, 129, 128) |
| Extremely Moist | +1.50 to +2.00 | #2a23eb | rgb(42, 35, 235) |
| Exceptionally Moist | +2.00 and above | #a21fec | rgb(162, 31, 236) |
Time Scales:
- SPI-1: Monthly, sensitive to short-term conditions
- SPI-3: Seasonal, reflects soil moisture
- SPI-6: Medium-term, agricultural water stress
- SPI-12: Annual, hydrological drought
- SPI-24: Long-term, reservoir/groundwater drought
1.2 Standardized Precipitation Evapotranspiration Index (SPEI)
Developed by: Vicente-Serrano, Beguería, and López-Moreno (2010)
Purpose: Extend SPI by including temperature effects via evapotranspiration for monitoring both dry and wet conditions with climate change sensitivity
Interpretation:
- Negative values: Dry conditions (drought with temperature effects)
- Positive values: Wet conditions (surplus with temperature effects)
Mathematical Foundation:
Step 1: Water Balance
Calculate monthly water balance:
\[ D_i = P_i - PET_i \]
Where:
- \(D_i\) = climatic water balance for month \(i\)
- \(P_i\) = precipitation
- \(PET_i\) = potential evapotranspiration
Step 2: Temporal Aggregation
Same as SPI, but on water balance:
\[ D_i^n = \sum_{j=i-n+1}^{i} D_j \]
Steps 3-4: Distribution and Standardization
Same process as SPI:
- Fit log-logistic or gamma distribution to \(D^n\)
- Calculate cumulative probability
- Transform to standard normal
Key Difference from SPI:
SPEI includes temperature effects through PET, making it sensitive to:
- Rising temperatures (increases PET → more negative D)
- Climate change impacts
- Agricultural water stress
PET Methods Used:
Thornthwaite Method:
\[ PET = 16 \left(\frac{10T}{I}\right)^a \]
Where:
- \(T\) = monthly mean temperature (°C)
- \(I\) = annual heat index = \(\sum(T_i/5)^{1.514}\)
- \(a = 0.49239 + 1.7912 \times 10^{-2}I - 7.71 \times 10^{-5}I^2 + 6.75 \times 10^{-7}I^3\)
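A minimal sketch of the Thornthwaite formula as written above follows. It computes the unadjusted value in mm/month and omits the day-length and month-length correction that the full method applies. The function name is hypothetical, and treating months at or below 0 °C as contributing nothing (PET = 0) is an assumption of this sketch.

```python
def thornthwaite_unadjusted(monthly_temps_c):
    """Unadjusted PET (mm/month) for 12 monthly mean temperatures in deg C."""
    # Assumption: months with T <= 0 add nothing to the heat index and get PET = 0
    heat_index = sum((t / 5.0) ** 1.514 for t in monthly_temps_c if t > 0)
    a = (0.49239 + 1.7912e-2 * heat_index
         - 7.71e-5 * heat_index ** 2 + 6.75e-7 * heat_index ** 3)
    return [16.0 * (10.0 * t / heat_index) ** a if t > 0 else 0.0
            for t in monthly_temps_c]

temps = [2.0, 4.0, 8.0, 12.0, 17.0, 21.0, 24.0, 23.0, 19.0, 13.0, 7.0, 3.0]
pet = thornthwaite_unadjusted(temps)
print([round(p, 1) for p in pet])
```

Because the exponent \(a\) depends on the annual heat index \(I\), the function needs a full year of temperatures even when only one month of PET is required.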
Hargreaves-Samani Method:
\[ PET = 0.0023 \cdot R_a \cdot (T_{mean} + 17.8) \cdot \sqrt{T_{max} - T_{min}} \]
Where:
- \(R_a\) = extraterrestrial radiation (MJ/m²/day, converted to mm/day equivalent)
- \(T_{mean}\) = mean daily temperature (°C)
- \(T_{max} - T_{min}\) = diurnal temperature range (°C)
Extraterrestrial Radiation (\(R_a\)):
Calculated using FAO-56 equations based on latitude and day of year:
\[ R_a = \frac{24 \times 60}{\pi} \cdot G_{sc} \cdot d_r \cdot [\omega_s \cdot \sin(\phi) \cdot \sin(\delta) + \cos(\phi) \cdot \cos(\delta) \cdot \sin(\omega_s)] \]
Where:
- \(G_{sc}\) = solar constant (0.0820 MJ/m²/min)
- \(d_r\) = inverse relative distance Earth-Sun
- \(\omega_s\) = sunset hour angle
- \(\phi\) = latitude (radians)
- \(\delta\) = solar declination
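The radiation and Hargreaves-Samani equations can be combined in a short sketch. The expressions for \(d_r\), \(\delta\) and \(\omega_s\) are the standard FAO-56 forms, and 0.408 is the FAO-56 factor that converts MJ/m²/day to mm/day of evaporation. Function names are hypothetical, and clamping the sunset-angle argument for polar day and night is an assumption of this sketch.

```python
import math

G_SC = 0.0820  # solar constant, MJ m^-2 min^-1

def extraterrestrial_radiation(lat_deg, day_of_year):
    """FAO-56 extraterrestrial radiation R_a in MJ m^-2 day^-1."""
    phi = math.radians(lat_deg)
    d_r = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    delta = 0.409 * math.sin(2.0 * math.pi * day_of_year / 365.0 - 1.39)
    # Clamp so polar day/night does not push acos outside its domain
    cos_ws = max(-1.0, min(1.0, -math.tan(phi) * math.tan(delta)))
    omega_s = math.acos(cos_ws)
    return (24.0 * 60.0 / math.pi) * G_SC * d_r * (
        omega_s * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(omega_s))

def hargreaves_pet(t_mean, t_max, t_min, lat_deg, day_of_year):
    """Hargreaves-Samani PET in mm/day."""
    ra_mm = 0.408 * extraterrestrial_radiation(lat_deg, day_of_year)
    return 0.0023 * ra_mm * (t_mean + 17.8) * math.sqrt(max(t_max - t_min, 0.0))

print(round(extraterrestrial_radiation(-20.0, 246), 1))  # 32.2
print(hargreaves_pet(25.0, 32.0, 18.0, -20.0, 246))
```

For monthly indices, a daily value like this would typically be evaluated at a representative day of each month and scaled by the number of days; that step is left out here.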
When to use Hargreaves:
- Arid and semi-arid regions
- When Tmin/Tmax data is available
- When closer agreement with physically based PET methods (e.g., Penman-Monteith) is needed than Thornthwaite typically provides
1.3 Calibration Period
Standard Practice: 30-year climatological normal
WMO Recommendation: 1991-2020 (current standard period)
Purpose:
- Establish local probability distribution
- Ensure stationary reference period
- Allow spatial and temporal comparisons
Implementation:
    spi_12 = spi(precip, scale=12,
                 calibration_start_year=1991,
                 calibration_end_year=2020)
Part 2: Climate Extreme Event Analysis (Run Theory)
2.1 Run Theory
Developed by: Yevjevich (1967)
Purpose: Identify and characterize climate extreme events based on continuous periods beyond a threshold. Works for both dry (drought) and wet (flood/excess) events.
Mathematical Foundation:
Run Definition
A run is a continuous sequence where the variable remains below (or above) a threshold level.
Truncation level (threshold): \(x_0\)
Drought run (dry events): Continuous period where \(SPI < x_0\) (negative threshold)
Wet run (wet events): Continuous period where \(SPI > x_0\) (positive threshold)
Event Identification
For drought events (negative threshold):
- Start: First month where \(SPI_i < x_0\)
- End: Last month where \(SPI_i < x_0\) before recovery
- Recovery: \(SPI_i \geq x_0\)
For wet events (positive threshold):
- Start: First month where \(SPI_i > x_0\)
- End: Last month where \(SPI_i > x_0\) before return to normal
- Return: \(SPI_i \leq x_0\)
Filtering: Events shorter than minimum duration excluded
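The identification rules above can be sketched as a single pass over the series. `find_runs` is a hypothetical helper, not the package's function; it assumes a negative threshold selects dry runs and a non-negative one selects wet runs, and it returns inclusive index pairs.

```python
def find_runs(spi, threshold, min_duration=1):
    """Return inclusive (start, end) index pairs of runs beyond the threshold."""
    # Assumption: negative threshold -> dry runs (SPI < x0), else wet runs (SPI > x0)
    if threshold < 0:
        beyond = [v < threshold for v in spi]
    else:
        beyond = [v > threshold for v in spi]
    runs, start = [], None
    for i, flag in enumerate(beyond):
        if flag and start is None:
            start = i                      # event onset
        elif not flag and start is not None:
            runs.append((start, i - 1))    # recovery ends the run
            start = None
    if start is not None:
        runs.append((start, len(spi) - 1))  # run still open at end of record
    return [(s, e) for s, e in runs if e - s + 1 >= min_duration]

spi = [0.2, -1.3, -1.6, -0.4, -1.1, -1.4, -2.0, 0.1]
print(find_runs(spi, -1.0, min_duration=2))  # [(1, 2), (4, 6)]
print(find_runs(spi, -1.0, min_duration=3))  # [(4, 6)]
```

Missing values compare as false against the threshold, so in this sketch a NaN month ends any run in progress.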
2.2 Climate Extreme Event Characteristics
Duration (D)
\[ D = t_{end} - t_{start} + 1 \]
Number of consecutive months below the threshold (dry events) or above it (wet events).
Units: months
Magnitude (M) - Cumulative
\[ M = \sum_{i=start}^{end} |x_0 - SPI_i| \]
Total accumulated deviation from threshold during event (deficit for drought, surplus for wet).
Units: Same as SPI (standardized units)
Interpretation:
- Represents the total accumulated deficit (dry events) or surplus (wet events)
- Monotonically increases during event
- Analogous to cumulative financial debt
Magnitude (M_inst) - Instantaneous
\[ M_{inst}(t) = |x_0 - SPI_t| \quad \text{for each month } t \text{ in event} \]
Current severity at specific time.
Units: Same as SPI
Interpretation:
- Varies during event (rises-peaks-falls)
- Like NDVI crop phenology
- Shows event evolution pattern
Relationship:
\[ M = \sum M_{inst}(t) \quad \text{for all } t \text{ in event} \]
Intensity (I)
\[ I = \frac{M}{D} \]
Average severity per month.
Units: SPI units per month
Interpretation:
- High I: severe but possibly short event
- Low I: mild but possibly long event
Peak (P)
\[ P = \min(SPI_i) \quad \text{for } i \in [start, end] \quad \text{(dry events; use } \max \text{ for wet events)} \]
Most severe SPI value during event.
Units: SPI units
Interpretation:
- Worst moment of event (most severe dry or wet condition)
- Related to maximum instantaneous magnitude
- \(M_{inst}(t_{peak}) = |x_0 - P|\)
Peak Date
\[ t_{peak} = \arg\min(SPI_i) \quad \text{for } i \in [start, end] \quad \text{(dry events; use } \arg\max \text{ for wet events)} \]
When peak severity occurred.
Inter-arrival Time (T)
\[ T_n = t_{start}(n+1) - t_{start}(n) \]
Time between consecutive event onsets.
Units: months
Interpretation:
- Event frequency (dry or wet)
- Return period approximation
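The characteristics in this section can be computed for one event as in the sketch below. Function names and the returned dictionary keys are hypothetical, the event is given by inclusive start and end indices, and the peak is taken as the minimum for a negative threshold and the maximum otherwise.

```python
def event_characteristics(spi, start, end, threshold):
    """Duration, magnitude, intensity and peak of one event (inclusive indices)."""
    segment = spi[start:end + 1]
    inst = [abs(threshold - v) for v in segment]  # instantaneous magnitude per month
    duration = end - start + 1
    magnitude = sum(inst)                         # cumulative magnitude M
    # Assumption: negative threshold -> dry event, so the peak is the minimum
    peak = min(segment) if threshold < 0 else max(segment)
    return {
        "duration": duration,
        "magnitude": magnitude,
        "intensity": magnitude / duration,
        "peak": peak,
        "peak_index": start + segment.index(peak),
    }

def inter_arrival_times(starts):
    """Months between consecutive event onsets."""
    return [b - a for a, b in zip(starts, starts[1:])]

spi = [0.2, -1.3, -1.6, -0.4, -1.1, -1.4, -2.0, 0.1]
print(event_characteristics(spi, 4, 6, -1.0))  # duration 3, peak -2.0 at index 6
print(inter_arrival_times([1, 4]))             # [3]
```

For the event in months 4 to 6, the instantaneous magnitudes are roughly 0.1, 0.4 and 1.0, giving a cumulative magnitude near 1.5 and an intensity near 0.5.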
2.3 Period Aggregation
Purpose: Answer decision-maker questions about specific time periods
Mathematical Foundation:
For a spatial grid and time period \([t_{start}, t_{end}]\):
Number of Events
\[ N(x,y) = \text{count of events at location } (x,y) \text{ during period} \]
Total Event Months
\[ D_{total}(x,y) = \sum D_i \quad \text{for all events } i \text{ at } (x,y) \]
Total Magnitude
\[ M_{total}(x,y) = \sum M_i \quad \text{for all events } i \text{ at } (x,y) \]
Mean Magnitude
\[ M_{mean}(x,y) = \frac{M_{total}(x,y)}{N(x,y)} \]
Maximum Magnitude
\[ M_{max}(x,y) = \max(M_i) \quad \text{for all events } i \text{ at } (x,y) \]
Worst Peak
\[ P_{worst}(x,y) = \min(P_i) \quad \text{for all events } i \text{ at } (x,y) \quad \text{(dry events; use } \max \text{ for wet events)} \]
Percent Time in Events
\[ Pct(x,y) = \frac{D_{total}(x,y)}{n_{months}} \times 100 \]
Where \(n_{months} = (t_{end} - t_{start} + 1)\) in months
Implementation:
- Applied independently to each grid cell
- Parallelizable across spatial domain
- Efficient for large gridded datasets
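For a single grid cell, the aggregation reduces to a few sums over that cell's events. The sketch below is illustrative: `aggregate_period` and its dictionary keys are hypothetical, events are assumed to be dry events given as `(duration, magnitude, peak)` tuples that fall entirely inside the period, and statistics that are undefined without events are returned as NaN.

```python
import math

def aggregate_period(events, n_months):
    """Summarize (duration, magnitude, peak) tuples for one grid cell."""
    n_events = len(events)
    total_months = sum(d for d, _, _ in events)
    total_magnitude = sum(m for _, m, _ in events)
    return {
        "n_events": n_events,
        "total_months": total_months,
        "total_magnitude": total_magnitude,
        # Mean, maximum and worst peak are undefined when no event occurred
        "mean_magnitude": total_magnitude / n_events if n_events else math.nan,
        "max_magnitude": max((m for _, m, _ in events), default=math.nan),
        "worst_peak": min((p for _, _, p in events), default=math.nan),
        "pct_time": 100.0 * total_months / n_months,
    }

summary = aggregate_period([(2, 0.9, -1.6), (3, 1.5, -2.0)], n_months=24)
print(summary["n_events"], summary["total_months"], summary["worst_peak"])  # 2 5 -2.0
```

Since each cell is processed independently, a function of this shape can be mapped over the grid in parallel.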
Part 3: Statistical Considerations
3.1 Probability Distributions
Original SPI (McKee et al., 1993): Used gamma distribution
This Package Supports:
- Gamma (default): Bounded at zero, robust globally, well-established
- Pearson Type III: More flexible (3 parameters), recommended for SPEI
- Log-Logistic: Often used for SPEI, handles negative water balance
- Generalized Extreme Value (GEV): For extreme value analysis
- Generalized Logistic: Alternative for heavy-tailed distributions
Gamma Advantages:
- Bounded at zero (like precipitation)
- Flexible shape (α parameter)
- Robust globally
- Well-established theory
Pearson III Considerations:
- Sometimes used for SPI
- More flexible (3 parameters)
- Caution: May fail in arid regions with very low variability
Recommendation: Gamma for SPI (default), Pearson III or Log-Logistic for SPEI
3.2 Zero Precipitation Handling
The gamma distribution is defined only for x > 0, but precipitation can be zero.
Solution:
\[ H(x) = q + (1-q) \cdot G(x) \]
Where:
\[ q = P(\text{precipitation} = 0) = \frac{n_{zeros}}{n_{total}} \]
Mixed distribution:
- Discrete component at zero (probability q)
- Continuous component (gamma) for x > 0
3.3 Minimum Data Requirements
- SPI/SPEI: Minimum 30 years monthly data (50+ recommended)
- Dry/wet event analysis: Minimum 20 years (30+ for return periods)
For practical input specifications (formats, dimensions, units, quality checks), see Data Model & Outputs.
3.4 Spatial Consistency
Important: Each grid cell fitted independently
Reason:
- Different climate regimes
- Different precipitation distributions
- Spatial heterogeneity
Result: An SPI of -1.0 corresponds to the same cumulative probability (~16th percentile) at every grid cell
Part 4: Operational Considerations
4.1 Near-Real-Time Updates
Approach:
- Fit distribution on historical calibration period (1991-2020)
- Save fitted parameters
- For new data, use pre-fitted parameters
Advantage:
- Consistent with historical baseline
- Fast operational updates
- No refitting required
Implementation:
    # One-time calibration
    params = spi(precip_calibration, scale=12, ...)
    params.to_netcdf('spi_params.nc')

    # Operational use
    spi_new = spi(precip_new, scale=12, fitting_params=params)
4.2 Threshold Selection
Common Thresholds:
| Threshold | Percentile | Use Case |
|---|---|---|
| -0.5 | ~31st | Early warning |
| -0.8 | ~21st | Agricultural concern |
| -1.0 | ~16th | Standard operational |
| -1.2 | ~12th | Conservative threshold |
| -1.5 | ~7th | Severe drought |
| -2.0 | ~2nd | Extreme drought |
Recommendation: -1.0 or -1.2 for operational monitoring
Rationale:
- Balances sensitivity vs false alarms
- Captures significant droughts
- Allows minimum duration filtering
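Because SPI and SPEI are standard normal by construction, the percentile attached to any threshold follows from the standard normal CDF. A short check using only the standard library (the variable names are illustrative):

```python
from statistics import NormalDist

thresholds = [-0.5, -0.8, -1.0, -1.2, -1.5, -2.0]

# Percentile = 100 * Phi(x0), where Phi is the standard normal CDF
percentiles = {x0: 100.0 * NormalDist().cdf(x0) for x0 in thresholds}

for x0, pct in percentiles.items():
    print(f"threshold {x0:+.1f} -> {pct:.1f}% of months expected below")
```

This gives roughly 30.9, 21.2, 15.9, 11.5, 6.7 and 2.3 percent for the six thresholds, which is a useful way to translate a candidate threshold into an expected frequency before running the event analysis.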
4.3 Minimum Duration
Purpose: Filter short-term fluctuations
Typical Values:
- SPI-1: min_duration = 2-3 months
- SPI-3: min_duration = 2-3 months (already smoothed)
- SPI-6: min_duration = 2 months
- SPI-12: min_duration = 3 months (captures sustained events)
Impact:
- Higher = fewer, longer events
- Lower = more events, includes brief dry spells
Part 5: Validation and Quality Control
5.1 Input Data Quality
Input data should be CF-compliant NetCDF, monthly resolution, with minimal missing data (<5%). See Data Model & Outputs for the full input contract and recommended quality checks.
5.2 Distribution Fitting Quality
Checks:
- Parameters within reasonable ranges
- Shape (α): typically 0.5-5
- Scale (β): positive, reasonable magnitude
- Fitted distribution matches data
- No extreme outliers
Fallback:
- If fitting fails: use default parameters
- Issue warning to user
- Set problematic cells to NaN
5.3 Output Validation
Checks:
- SPI/SPEI range typically -3 to +3
- Mean ≈ 0, StdDev ≈ 1 (over calibration period)
- No excessive NaN values
- Spatial patterns reasonable
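The numeric checks above can be scripted for a single time series as in the sketch below. The function name, the tolerances (0.1 for mean and standard deviation) and the 5% NaN limit are assumptions chosen for illustration, not values prescribed by this package; the spatial-pattern check still requires visual inspection.

```python
import math
from statistics import fmean, pstdev

def check_index(values, mean_tol=0.1, std_tol=0.1, max_nan_fraction=0.05):
    """Return pass/fail flags for basic SPI/SPEI output checks."""
    finite = [v for v in values if not math.isnan(v)]
    if not finite:
        return {"range_ok": False, "mean_ok": False, "std_ok": False, "nan_ok": False}
    nan_fraction = 1.0 - len(finite) / len(values)
    return {
        "range_ok": all(-3.0 <= v <= 3.0 for v in finite),  # typical range only
        "mean_ok": abs(fmean(finite)) <= mean_tol,
        "std_ok": abs(pstdev(finite) - 1.0) <= std_tol,
        "nan_ok": nan_fraction <= max_nan_fraction,
    }

series = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
print(check_index(series))  # all four flags are True for this series
```

The mean and standard deviation checks are only meaningful when `values` covers the calibration period; values slightly outside -3 to +3 are unusual rather than invalid, so a failed range flag is a prompt to inspect the fit.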
Part 6: Limitations and Assumptions
6.1 SPI Limitations
- Precipitation only: Ignores temperature, evapotranspiration
- Stationarity: Assumes constant climate (violated under climate change)
- Distribution choice: Gamma may not fit all locations perfectly
- Data requirements: Needs long, quality-controlled records
6.2 SPEI Limitations
- PET uncertainty: Different methods give different results
- Data requirements: Needs temperature + precipitation
- Stationarity: Climate change affects both P and PET
- Complexity: More inputs = more potential errors
6.3 Run Theory Limitations
- Threshold dependence: Results sensitive to threshold choice
- Minimum duration: Arbitrary choice affects event count
- Independence: Adjacent events may not be truly independent
- Spatial correlation: Events often span multiple grid cells (not captured in point analysis)
6.4 Assumptions
- Monthly data: Sub-monthly processes not captured
- Point analysis: Each grid cell independent
- Calibration period: Representative of long-term climate
- Trend: No explicit detrending is applied (the series is assumed stationary)
Part 7: Best Practices
7.1 Index Selection
Use SPI when:
- Only precipitation data available
- Purely meteorological drought
- Comparison with global studies (most common)
- Simplicity desired
Use SPEI when:
- Temperature data available
- Agricultural drought (crop water stress)
- Climate change analysis
- Temperature effects important
7.2 Time Scale Selection
| Scale | Application |
|---|---|
| SPI-1 | Month-to-month variability, not recommended for drought monitoring |
| SPI-3 | Seasonal drought, soil moisture |
| SPI-6 | Agricultural season, crop impacts |
| SPI-12 | Hydrological year, water resources |
| SPI-24 | Long-term drought, groundwater, reservoirs |
Recommendation: Use SPI-12 for general drought monitoring
7.3 Calibration Period
Use the most recent 30-year WMO normal period (currently 1991–2020). Update every 10 years. For climate change studies, use a fixed historical baseline. See Data Model & Outputs for calibration conventions and parameter reuse workflow.
7.4 Threshold and Duration
Standard Approach:
- Threshold: -1.0 or -1.2
- Minimum duration: 3 months for SPI-12
Sensitivity Analysis:
- Test multiple thresholds
- Evaluate against known events
- Document choice rationale
References
SPI
McKee, T.B., Doesken, N.J., Kleist, J. (1993). The relationship of drought frequency and duration to time scales. 8th Conference on Applied Climatology, 17-22 January, Anaheim, CA.
Guttman, N.B. (1999). Accepting the Standardized Precipitation Index: a calculation algorithm. Journal of the American Water Resources Association, 35(2), 311-322.
WMO (2012). Standardized Precipitation Index User Guide (WMO-No. 1090). Geneva, Switzerland.
SPEI
Vicente-Serrano, S.M., Beguería, S., López-Moreno, J.I. (2010). A Multiscalar Drought Index Sensitive to Global Warming: The Standardized Precipitation Evapotranspiration Index. Journal of Climate, 23(7), 1696-1718.
Beguería, S., Vicente-Serrano, S.M., Reig, F., Latorre, B. (2014). Standardized precipitation evapotranspiration index (SPEI) revisited: parameter fitting, evapotranspiration models, tools, datasets and drought monitoring. International Journal of Climatology, 34(10), 3001-3023.
Run Theory
Yevjevich, V. (1967). An objective approach to definitions and investigations of continental hydrologic droughts. Hydrology Papers 23, Colorado State University, Fort Collins, CO.
Tallaksen, L.M., van Lanen, H.A.J. (2004). Hydrological Drought: Processes and Estimation Methods for Streamflow and Groundwater. Developments in Water Science 48, Elsevier, Amsterdam.
PET
Thornthwaite, C.W. (1948). An approach toward a rational classification of climate. Geographical Review, 38(1), 55-94.
Hargreaves, G.H., Samani, Z.A. (1985). Reference crop evapotranspiration from temperature. Applied Engineering in Agriculture, 1(2), 96-99.
See Also
- Data Model & Outputs - Input contract, calibration conventions, output schemas
- Probability Distributions - Distribution selection and fitting methods
- Validation & Test Results - Quality verification and test coverage
- Implementation Details - Code architecture
- API Reference - Function documentation
- User Guides - Practical usage