Core Analysis Functions
The gcs_core module provides fundamental analysis functions for hysteresis metrics,
CVc/CVq analysis, C-Q slopes, and statistical calculations.
Hysteresis Metrics
- calculate_all_hysteresis_metrics(df: DataFrame, time_col: str = 'date', discharge_col: str = 'Q', concentration_col: str = 'C') → Dict[str, any]
Calculate hysteresis metrics using all three methods (HARP, Zuecco, Lloyd).
This function applies scientifically validated hysteresis analysis methods to time series data (typically a moving window around each segment).
- Parameters:
df (pd.DataFrame) – Time series data containing discharge and concentration (length >= 5 recommended)
time_col (str) – Column name for time values
discharge_col (str) – Column name for discharge/flow values
concentration_col (str) – Column name for concentration values
- Returns:
Dictionary containing:
- harp_metrics: dict from calculate_harp_metrics()
- zuecco_metrics: dict from calculate_zuecco_metrics()
- lloyd_metrics: dict from calculate_lawlerlloyd_metrics()
- classifications: dict with hysteresis direction classifications
- processed_data: dict with processed DataFrames from each method
- error: str if calculation failed
- Return type:
dict
Wrapper function that applies all three hysteresis methods (HARP, Zuecco, Lloyd/Lawler) to a single dataset. Returns a dictionary containing metrics from all methods plus classification results.
- Returns:
- Dictionary with keys:
harp_metrics: HARP method results
zuecco_metrics: Zuecco method results
lloyd_metrics: Lloyd/Lawler method results
classifications: Direction classifications
processed_data: DataFrames from each method
error: Error message if calculation failed
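To make the idea of a hysteresis direction metric concrete, the following is a minimal sketch of a normalized hysteresis index in the spirit of the Lloyd-style approach: discharge and concentration are scaled to [0, 1], the series is split at peak discharge, and the rising-limb concentration minus the falling-limb concentration is averaged over a few discharge levels. The function name simple_hysteresis_index, the chosen levels, and the NaN handling are assumptions for illustration; this is not the implementation used by gcs_core.

```python
import numpy as np


def simple_hysteresis_index(q, c, levels=(0.25, 0.5, 0.75)):
    """Hypothetical sketch: mean of (rising - falling) normalized concentration.

    Positive values suggest a clockwise loop (higher C on the rising limb),
    negative values a counter-clockwise loop.
    """
    q = np.asarray(q, dtype=float)
    c = np.asarray(c, dtype=float)
    # Assumption: refuse very short or constant series instead of guessing.
    if q.size < 5 or np.ptp(q) == 0 or np.ptp(c) == 0:
        return float("nan")

    # Min-max normalize both variables to [0, 1].
    qn = (q - q.min()) / np.ptp(q)
    cn = (c - c.min()) / np.ptp(c)

    peak = int(np.argmax(qn))
    if peak == 0 or peak == q.size - 1:
        return float("nan")  # one of the limbs is missing

    def limb_at(qs, cs):
        # np.interp needs increasing x values, so sort each limb by discharge.
        # Levels outside a limb's range are clamped to its end values.
        order = np.argsort(qs)
        return np.interp(levels, qs[order], cs[order])

    rising = limb_at(qn[: peak + 1], cn[: peak + 1])
    falling = limb_at(qn[peak:], cn[peak:])
    return float(np.mean(rising - falling))
```

For a window in which concentration rises ahead of discharge, the index is positive; swapping the limbs flips the sign.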
CVc/CVq Variability Analysis
- compute_cvc_cvq_windows(df, qcol='Q_mLs', ccol='Zn_mgL', window=5)
Compute CVc/CVq ratios and C-Q slopes over rolling windows.
Based on Musolff et al. (2015) for chemostatic vs chemodynamic behavior detection. https://doi.org/10.1016/j.advwatres.2015.09.026
- Parameters:
df (pd.DataFrame) – Input dataframe with time series of concentration and flow measurements
qcol (str) – Name of the flow column
ccol (str) – Name of the concentration column
window (int) – Number of measurements to use for rolling window analysis
- Returns:
DataFrame with CVc/CVq analysis for each window, including:
- CVc, CVq: Coefficients of variation
- CVc_CVq: Ratio (>1 = chemodynamic, <1 = chemostatic)
- cq_slope_loglog: C-Q slope in log-log space (power-law exponent b)
- Return type:
pd.DataFrame
Compute rolling window analysis of coefficient of variation ratios following Musolff et al. (2015). Calculates CVc/CVq to distinguish chemostatic from chemodynamic behavior.
- Interpretation:
CVc/CVq > 1: Chemodynamic (concentration varies more than flow)
CVc/CVq < 1: Chemostatic (concentration buffered relative to flow)
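The ratio described above can be sketched with pandas rolling windows. This is a minimal illustration, assuming the coefficient of variation is the sample standard deviation divided by the mean within each window; the helper name rolling_cv_ratio and the "behavior" column are hypothetical and are not part of gcs_core.

```python
import numpy as np
import pandas as pd


def rolling_cv_ratio(df, qcol="Q_mLs", ccol="Zn_mgL", window=5):
    """Hypothetical helper: rolling CVc, CVq and their ratio."""
    out = pd.DataFrame(index=df.index)
    for name, col in (("CVc", ccol), ("CVq", qcol)):
        # Only full windows produce a value; earlier rows stay NaN.
        roll = df[col].rolling(window=window, min_periods=window)
        out[name] = roll.std() / roll.mean()

    # Avoid dividing by zero when flow is constant within a window.
    out["CVc_CVq"] = out["CVc"] / out["CVq"].replace(0, np.nan)

    # Label each window; windows without a defined ratio get no label.
    labels = pd.Series(
        np.where(out["CVc_CVq"] > 1, "chemodynamic", "chemostatic"),
        index=out.index,
    )
    out["behavior"] = labels.where(out["CVc_CVq"].notna())
    return out
```

With the default window of 5, the first four rows carry no ratio because the window is not yet full.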
C-Q Slope Calculation
- compute_cq_slope(q1: float, q2: float, c1: float, c2: float, kind: str = 'loglog') → float
Compute the local C-Q slope for a segment between two consecutive points.
- Parameters:
q1 (float) – Discharge at segment start.
q2 (float) – Discharge at segment end.
c1 (float) – Concentration at segment start.
c2 (float) – Concentration at segment end.
kind ({'loglog','linear'}, optional) –
'loglog' returns Δlog(C) / Δlog(Q) (i.e., power-law slope b)
'linear' returns ΔC / ΔQ
- Returns:
The requested slope, or NaN if undefined.
- Return type:
float
In log-log mode the slope corresponds to the exponent b of the power-law relation C = a·Q^b, i.e. log(C) = log(a) + b·log(Q), evaluated between the two points of the segment.
- Interpretation:
b > 0.15: Dilution/flushing signature
b < -0.15: Enrichment/loading signature
|b| < 0.1: Chemostatic buffering
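The two slope definitions given in the parameter list can be written out as a short sketch. The function name two_point_cq_slope is hypothetical, and the choices of when to return NaN (zero change in Q, non-positive values in log-log mode) and of raising on an unknown kind are assumptions about what "undefined" means, not confirmed behavior of compute_cq_slope.

```python
import math


def two_point_cq_slope(q1, q2, c1, c2, kind="loglog"):
    """Hypothetical sketch of a two-point C-Q slope."""
    if kind == "linear":
        dq = q2 - q1
        # Slope is undefined when discharge does not change.
        return (c2 - c1) / dq if dq != 0 else math.nan

    if kind == "loglog":
        # Logarithms require strictly positive values (assumption: NaN otherwise).
        if min(q1, q2, c1, c2) <= 0:
            return math.nan
        dlogq = math.log(q2) - math.log(q1)
        if dlogq == 0:
            return math.nan
        return (math.log(c2) - math.log(c1)) / dlogq

    raise ValueError(f"unknown kind: {kind!r}")
```

For example, a tenfold increase in both Q and C gives a log-log slope of about 1, while a segment with q1 equal to q2 has no defined slope.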
Flow Dynamics Analysis
- analyze_segment_flow_dynamics(segment_data: DataFrame, percentiles: Dict, ccol: str | None = None, qcol: str | None = None) → Dict
Analyze high-resolution Q dynamics within a segment, with window-scale hysteresis analysis.
Integrates hourly discharge data to capture within-segment flow dynamics that may not be apparent from sampling-point averages. Calculates hysteresis metrics on the time window (not event-scale or point-to-point).
- Parameters:
segment_data (pd.DataFrame) – High-resolution time series for the segment window
percentiles (dict) – Global percentile thresholds for Q levels
ccol (str, optional) – Name of concentration column
qcol (str, optional) – Name of flow column
- Returns:
Dictionary containing:
- peak_q: Maximum Q in segment
- peak_time: When peak occurred
- days_since_peak: Days from peak to segment end
- days_to_peak: Days from segment start to peak
- flow_phase: 'rising' | 'at_peak' | 'early_decline' | 'late_decline' | 'low'
- peak_position: 'early' | 'middle' | 'late' in segment
- q_trend: slope of Q over segment
- q_acceleration: change in Q rate
- q_level: 'low' | 'medium' | 'high'
- q_range: Q variability within segment
- window_HI_*: Window-scale hysteresis metrics (if ccol provided)
Returns None if no data available.
- Return type:
dict or None
Analyze high-resolution flow data to determine flow phase (rising, falling, peak, low), days since peak, and flow level categories. Used for improved classification accuracy when hourly or sub-daily flow data is available.
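A few of the peak-related quantities listed in the return value can be derived directly from a high-resolution discharge series. The sketch below covers only peak_q, peak_time, days_to_peak, days_since_peak and q_range; the helper name summarize_segment_peak and the default column names are assumptions, and the phase, position and level categories are left out because their thresholds are not given here.

```python
import pandas as pd


def summarize_segment_peak(segment, qcol="Q", time_col="date"):
    """Hypothetical sketch: locate the discharge peak within a segment."""
    # Drop rows without discharge and order by time with a clean index.
    seg = (
        segment.dropna(subset=[qcol])
        .sort_values(time_col)
        .reset_index(drop=True)
    )
    if seg.empty:
        return None  # mirrors the documented "None if no data available"

    peak_idx = seg[qcol].idxmax()
    peak_time = seg.loc[peak_idx, time_col]
    start = seg[time_col].iloc[0]
    end = seg[time_col].iloc[-1]
    one_day = pd.Timedelta(days=1)

    return {
        "peak_q": float(seg.loc[peak_idx, qcol]),
        "peak_time": peak_time,
        # Dividing two Timedeltas yields a plain float number of days.
        "days_to_peak": (peak_time - start) / one_day,
        "days_since_peak": (end - peak_time) / one_day,
        "q_range": float(seg[qcol].max() - seg[qcol].min()),
    }
```

With hourly data, a peak twelve hours into a two-day segment gives days_to_peak of 0.5 and days_since_peak of 1.5.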
Statistical Functions
- compute_change_percentiles(data: DataFrame, sites: List[str], ccol: str, qcol: str) → Dict[str, float]
Calculate percentiles for concentration and flow changes (ΔC and ΔQ).
These percentile-based thresholds make the classification compound-agnostic by adapting to the specific distribution of each compound’s dynamics.
- Parameters:
data (pd.DataFrame) – Time series data
sites (list of str) – Sites to include
ccol (str) – Concentration column
qcol (str) – Flow column
- Returns:
Percentile thresholds for dC and dQ at various levels (p01, p05, p08, p10, p25, p50, p75, p90, p95)
- Return type:
dict
Compute percentile-based thresholds for concentration and flow changes. Used in the classification system to determine phase boundaries in a compound-agnostic manner.
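The percentile thresholds can be illustrated with a short sketch that differences concentration and flow within each site and then takes percentiles of the pooled changes. The helper name change_percentiles, the "site" column name, and the key format (for example dC_p05) are assumptions for illustration; the actual keys returned by compute_change_percentiles may differ.

```python
import numpy as np
import pandas as pd

# Percentile levels named in the documentation above.
LEVELS = (1, 5, 8, 10, 25, 50, 75, 90, 95)


def change_percentiles(data, ccol, qcol, sites=None, site_col="site"):
    """Hypothetical sketch: percentiles of step-to-step changes in C and Q."""
    if sites is not None:
        data = data[data[site_col].isin(sites)]

    # Difference within each site so changes never span two sites.
    grouped = data.groupby(site_col, sort=False)
    changes = {
        "dC": grouped[ccol].diff().dropna().to_numpy(),
        "dQ": grouped[qcol].diff().dropna().to_numpy(),
    }

    out = {}
    for name, values in changes.items():
        for p in LEVELS:
            key = f"{name}_p{p:02d}"
            # No data left after filtering: report NaN rather than raising.
            out[key] = float(np.percentile(values, p)) if values.size else float("nan")
    return out
```

Because the thresholds come from the data's own distribution of changes, the same call works for compounds with very different concentration ranges.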
Notes
Data Requirements
- For hysteresis analysis:
Minimum 5 data points
Recommended 10-15 points for single events
20-30 points for robust window-scale analysis
- For time series classification:
Minimum 20-30 sampling points per site
Recommended 50+ points covering multiple cycles
High-resolution Q data (hourly) improves accuracy
Scientific References
Musolff, A. et al. (2015). Catchment controls on solute export. Advances in Water Resources, 86, 133-146.
Thompson, S. E. et al. (2011). Comparative hydrology across AmeriFlux sites. Water Resources Research, 47(10).
See Also
Hysteresis Methods - Individual hysteresis method functions
Classification Functions - Geochemical phase classification
Quick Start Guide - Usage examples