pyErosivity

Rainfall erosivity calculation in Python

View on GitHub



About

pyErosivity is a Python package for computing rainfall erosivity (R-factor / EI30) from precipitation time series at any temporal resolution. Erosivity quantifies the capacity of rain to detach and transport soil — a key driver in the USLE/RUSLE family of soil erosion models. The package handles the full pipeline from raw precipitation data to per-event and annual erosivity statistics, including temporal resolution correction and bootstrap uncertainty estimation.

Note: The erosivity methodology is based on Wischmeier & Smith (1978), Renard et al. (1997), and DIN 19708:2017-08. The kinetic energy formula defaults to Rogler & Schwertmann (1981), with Brown & Foster (1987) and McGregor et al. (1995) also available. I developed this package as part of my PhD research at the University of Padova.

Pipeline

The standard call order is:

  1. remove_incomplete_years() — removes years exceeding the missing-data tolerance; returns cleaned DataFrame and detected time resolution.
  2. get_events() — separates storms by a minimum dry-spell gap (default 6 hours, Wischmeier & Smith 1958).
  3. remove_short() — discards events shorter than a minimum duration.
  4. get_events_values() — for each event computes depth, kinetic energy (E_kin), and peak intensities for all accumulation windows supported by the data resolution (imax_5, imax_10, imax_15, imax_30, imax_60).
  5. compute_erosivity() — computes EI30 = E_kin × IMax30 (or IMax60 for hourly data); adds erosivity_EU [kJ m⁻² mm h⁻¹] and erosivity_US [MJ mm ha⁻¹ h⁻¹].
  6. get_only_erosivity_events() — applies the Wischmeier dual criterion to retain erosive events.
  7. [optional] apply_rusle_split() — further splits erosive events using the Renard et al. (1997) 6-hour / 1.27 mm within-storm rule.

Event Filtering Criteria

An event is classified as erosive if it satisfies at least one of two criteria (OR logic):

Formulation Criterion (i) — depth Criterion (ii) — intensity Notes
Standard RUSLE event_depth ≥ 12.7 mm IMax15 ≥ 25.4 mm/h Wischmeier & Smith (1978); 6.35 mm in 15 min
Alternative event_depth ≥ 12.7 mm IMax30 ≥ 12.7 mm/h Wider window; looser filter; validated vs RIST

Both thresholds derive from the same physical quantity — 6.35 mm concentrated within the accumulation window. The wider 30-min window selects more events than IMax15 on identical data.

Same 6.35 mm burst measured at 15, 30, and 60-min windows
The same 6.35 mm rain burst gives IMax15 = 25.4 mm/h, IMax30 = 12.7 mm/h, and IMax60 = 6.35 mm/h. All three erosivity thresholds represent the same physical event at different accumulation windows.

Kinetic Energy Formulas

get_events_values() accepts a formula argument. All formulas are normalized to [kJ m⁻² mm⁻¹]:

KeyFormulaReferenceNotes
'rogler' (default) e_k = (11.89 + 8.73·log₁₀I) × 10⁻³, capped at 28.33×10⁻³ for I ≥ 76.2 mm/h Rogler & Schwertmann (1981) / DIN 19708 European calibration, log form
'brown_foster' e_k = 0.029·(1 − 0.72·exp(−0.05I)) Brown & Foster (1987), RUSLE standard Exponential form, natural plateau
'mcgregor' e_k = 0.029·(1 − 0.72·exp(−0.082I)) McGregor et al. (1995), RUSLE2 Steeper decay at low intensities

Key Functions

FunctionDescription
Core pipeline
remove_incomplete_years()Removes years with >10% missing data; auto-detects time resolution
get_events()Extracts events using a minimum dry-spell gap (default 6 h); removes events near data gaps
remove_short()Filters events shorter than a minimum duration
get_events_values()Computes depth, E_kin, and imax_5/10/15/30/60 for each event; supports formula choice
compute_erosivity()Adds erosivity_EU and erosivity_US (EI30 = E_kin × IMax30/60); auto-detects imax column
get_only_erosivity_events()Dual-criterion filter: IMax ≥ threshold OR depth ≥ 12.7 mm
apply_rusle_split()Renard (1997) within-storm splitting: splits where any 6 h window < 1.27 mm, then re-filters
Resolution correction
find_optimal_thr_imax30()Finds the intensity threshold that minimises the gap in mean annual event count to a reference (Fischer et al. 2018)
compute_sf_annual_r()Scaling factor from year-by-year annual R comparison (same calendar period required)
compute_sf_clim()Scaling factor from climatological mean annual R; no year matching — for obs vs CPM
compute_sf_per_event()Scaling factor from matched per-event EI (inner join on event date)
Statistics & uncertainty
get_mean_annual_stats()Mean annual event count, R-factor, depth, and intensity from an event table
bootstrapping_erosivity_60min()Block bootstrap over calendar years for observational data
bootstrapping_erosivity_CPM_60min()Bootstrap with optional pre-defined year-draw sequence (for CPM ensemble comparisons)
Kinetic energy (unit functions)
E_kin_i_Rogler()Rogler & Schwertmann (1981) / DIN 19708 — default
E_kin_i_BrFr()Brown & Foster (1987) — RUSLE standard
E_kin_i_McGregor()McGregor et al. (1995) — RUSLE2 variant

Study 1: Validation against RIST 3.99

pyErosivity was validated against RIST 3.99 (USDA Rainfall Intensity Summarization Tool) — the reference implementation of RUSLE — using 31 years (1990–2020) of 5-min data from station VE_0091 near Kreuzbergpass, Italy (~1600 m a.s.l.).

Validation scatter and mean annual statistics against RIST
Validation against RIST 3.99. pyErosivity recovers 962–966 erosive events (vs RIST 927–962) and matches the mean annual R-factor to within 0.4% for both IMax30 and IMax15 criteria. The 4-event gap is fully explained by RIST's internal inch rounding at the 12.8 mm threshold.
RIST IMax30pyEr IMax30RIST IMax15pyEr IMax15
Events (total)962966927931
Events / yr31.031.229.930.0
Mean depth [mm]31.431.532.332.3
R-factor [MJ mm ha⁻¹ h⁻¹ yr⁻¹]2027.82019.71993.01985.6

Study 2: Effect of Temporal Resolution

At coarser temporal resolution (15, 30, 60 min), IMax30 is underestimated because the sliding window straddles fixed clock bins. Two distinct problems arise:

Sketch showing how coarse binning merges events or creates false dry spells
Coarse binning can both lose events (a short burst straddles a bin boundary and never reaches the intensity threshold) and merge events (the inter-storm gap shrinks below the 6-hour separation threshold).

At 60-min resolution the IMax-only zone empties completely for any threshold: an event intense enough to exceed 12.7 mm/h in one hour automatically accumulates ≥ 12.7 mm and satisfies the depth criterion instead. The R-factor drops by ~43% relative to 5-min.

Event classification scatter at all resolutions and criteria
Event depth vs peak intensity at 5, 15, 30, and 60-min resolution for three selection criteria. At 60-min the IMax-only zone (steelblue) disappears entirely; all selected events are in the depth-only zone (tomato).

Study 3: Threshold Calibration & Scaling Factor Correction

Coarser resolution introduces two independent biases: (1) event selection bias — the intensity criterion selects fewer events; (2) intensity underestimation bias — EI30 is lower even for the same storm. Two correction steps address them:

  1. Threshold calibration: find_optimal_thr_imax30() sweeps all unique IMax values and finds the cut-off that recovers the 5-min event count. For 60-min data the optimal threshold is ~6.8 mm/h (vs the naive 12.7 mm/h).
  2. Scaling factor: a multiplicative SF removes the remaining within-event intensity bias. Three approaches are available — compute_sf_annual_r (year-by-year), compute_sf_per_event (matched events), and compute_sf_clim (climatological mean; required when obs and CPM cover different periods).
Annual R-factor before and after scaling factor correction
Annual R-factor scatter (5-min reference vs 60-min target) before and after applying each scaling factor approach. After correction, all three approaches align the mean annual R with the 5-min reference.

Study 4: Bootstrap Uncertainty & OBS vs CPM Comparison

bootstrapping_erosivity_60min() and bootstrapping_erosivity_CPM_60min() estimate uncertainty by block-resampling calendar years (default 1000 iterations). The CPM variant accepts a pre-defined year-draw array (randy) so that different ensemble members are compared on identical samples.

Bootstrap distributions for OBS and CPM
Bootstrap distributions (boxplots) for mean annual events, mean IMax, mean depth, and mean annual R-factor. OBS = 31 years (VE_0091, 1990–2020); CPM = ETH convection-permitting model, historical run (10 years, 1996–2005). The shorter CPM record produces substantially wider uncertainty boxes.

The comparison highlights that threshold calibration and scaling factor correction address temporal resolution bias but do not replace a full bias correction of the underlying precipitation field. The CPM's ~95% wet bias at this Alpine station propagates directly into erosive event counts and R-factor even after resolution correction.


References

Wischmeier, W. H. & Smith, D. D. (1978).
Predicting Rainfall Erosion Losses. USDA Agricultural Handbook 537.

Renard, K. G. et al. (1997).
Predicting Soil Erosion by Water (RUSLE). USDA Agricultural Handbook 703.

Rogler, H. & Schwertmann, U. (1981).
Erosivität der Niederschläge und Isoerodentkarte Bayerns.
Journal of Rural Engineering and Development, 22, 99–112.

Fischer, F. K., Winterrath, T. & Auerswald, K. (2018).
Temporal- and spatial-scale and positional effects on rain erosivity.
Hydrology and Earth System Sciences, 22, 6505–6518.
https://doi.org/10.5194/hess-22-6505-2018

Brown, L. C. & Foster, G. R. (1987).
Storm erosivity using idealized intensity distributions.
Transactions of the ASAE, 30(2), 379–386.

van Dijk, A. I. J. M., Bruijnzeel, L. A. & Rosewell, C. J. (2002).
Rainfall intensity–kinetic energy relationships: a critical literature appraisal.
Journal of Hydrology, 261(1), 1–23.
https://doi.org/10.1016/S0022-1694(02)00020-3