Core Analysis Functions

The gcs_core module provides fundamental analysis functions for hysteresis metrics, CVc/CVq analysis, C-Q slopes, and statistical calculations.

Hysteresis Metrics

calculate_all_hysteresis_metrics(df: DataFrame, time_col: str = 'date', discharge_col: str = 'Q', concentration_col: str = 'C') -> Dict[str, any]

Calculate hysteresis metrics using all three methods (HARP, Zuecco, Lloyd).

This function applies scientifically validated hysteresis analysis methods to time series data (typically a moving window around each segment).

Parameters:
  • df (pd.DataFrame) – Time series data containing discharge and concentration (length >= 5 recommended)

  • time_col (str) – Column name for time values

  • discharge_col (str) – Column name for discharge/flow values

  • concentration_col (str) – Column name for concentration values

Returns:

Dictionary containing:
  • harp_metrics: dict from calculate_harp_metrics()

  • zuecco_metrics: dict from calculate_zuecco_metrics()

  • lloyd_metrics: dict from calculate_lawlerlloyd_metrics()

  • classifications: dict with hysteresis direction classifications

  • processed_data: dict with processed DataFrames from each method

  • error: str if calculation failed

Return type:

dict

Wrapper function that applies all three hysteresis methods (HARP, Zuecco, Lloyd/Lawler) to a single dataset. Returns a dictionary containing metrics from all methods plus classification results.

Returns:
Dictionary with keys:
  • harp_metrics: HARP method results

  • zuecco_metrics: Zuecco method results

  • lloyd_metrics: Lloyd/Lawler method results

  • classifications: Direction classifications

  • processed_data: DataFrames from each method

  • error: Error message if calculation failed
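A caller will usually want to check the error key before reading the per-method results. The following is a minimal sketch of that pattern, not part of gcs_core: the helper name available_methods and the hand-built example dictionary are hypothetical, and only the key names are taken from the list above.

```python
from typing import Any, Dict, List, Optional

# Key names as documented above for the returned dictionary.
METRIC_KEYS = ("harp_metrics", "zuecco_metrics", "lloyd_metrics")


def available_methods(result: Dict[str, Any]) -> Optional[List[str]]:
    """Return the method keys that hold non-empty results, or None on failure.

    Hypothetical helper: it only inspects the dictionary shape documented
    for calculate_all_hysteresis_metrics() and does not call gcs_core.
    """
    if result.get("error"):
        return None
    return [key for key in METRIC_KEYS if result.get(key)]


# Hand-built stand-in for a returned dictionary (values are placeholders,
# not real metric names or numbers).
example = {
    "harp_metrics": {"placeholder": 1.0},
    "zuecco_metrics": {"placeholder": 2.0},
    "lloyd_metrics": {},
    "classifications": {},
    "processed_data": {},
}

print(available_methods(example))
print(available_methods({"error": "too few points"}))
```

Using result.get("error") rather than result["error"] assumes the key may be absent on success; the documentation only states that it is set when the calculation failed.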

CVc/CVq Variability Analysis

compute_cvc_cvq_windows(df, qcol='Q_mLs', ccol='Zn_mgL', window=5)

Compute CVc/CVq ratios and C-Q slopes over rolling windows.

Based on Musolff et al. (2015) for chemostatic vs chemodynamic behavior detection. https://doi.org/10.1016/j.advwatres.2015.09.026

Parameters:
  • df (pd.DataFrame) – Input dataframe with time series of concentration and flow measurements

  • qcol (str) – Name of the flow column

  • ccol (str) – Name of the concentration column

  • window (int) – Number of measurements to use for rolling window analysis

Returns:

DataFrame with CVc/CVq analysis for each window, including:
  • CVc, CVq: Coefficients of variation

  • CVc_CVq: Ratio (>1 = chemodynamic, <1 = chemostatic)

  • cq_slope_loglog: C-Q slope in log-log space (power-law exponent b)

Return type:

pd.DataFrame

Compute rolling window analysis of coefficient of variation ratios following Musolff et al. (2015). Calculates CVc/CVq to distinguish chemostatic from chemodynamic behavior.

Interpretation:
  • CVc/CVq > 1: Chemodynamic (concentration varies more than flow)

  • CVc/CVq < 1: Chemostatic (concentration buffered relative to flow)
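The ratio can be illustrated without the library. The sketch below computes the coefficient of variation (standard deviation divided by mean) for concentration and flow in each window and takes their ratio. It is an assumption-laden illustration rather than the gcs_core implementation: the function names are hypothetical, and the use of the sample standard deviation and of overlapping windows advanced by one measurement are choices made here.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean; NaN if undefined."""
    if len(values) < 2:
        return math.nan
    m = mean(values)
    if m == 0:
        return math.nan
    return stdev(values) / m


def cvc_cvq_ratio(c: Sequence[float], q: Sequence[float]) -> float:
    """CVc/CVq for one window; NaN if either CV is undefined or CVq is zero."""
    cv_c = coefficient_of_variation(c)
    cv_q = coefficient_of_variation(q)
    if math.isnan(cv_c) or math.isnan(cv_q) or cv_q == 0:
        return math.nan
    return cv_c / cv_q


def rolling_ratios(c: Sequence[float], q: Sequence[float], window: int = 5) -> List[float]:
    """CVc/CVq over overlapping windows, advancing one measurement at a time."""
    n = min(len(c), len(q))
    return [
        cvc_cvq_ratio(c[i:i + window], q[i:i + window])
        for i in range(n - window + 1)
    ]


# Illustrative values only: concentration is nearly constant while flow varies.
conc = [2.0, 2.1, 1.9, 2.0, 2.05, 2.0, 1.95]
flow = [1.0, 5.0, 10.0, 3.0, 8.0, 2.0, 6.0]
ratios = rolling_ratios(conc, flow, window=5)
print(["chemostatic" if r < 1 else "chemodynamic" for r in ratios])
```

In this made-up series every window has a ratio below 1, so each one falls on the chemostatic side of the interpretation above.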

C-Q Slope Calculation

compute_cq_slope(q1: float, q2: float, c1: float, c2: float, kind: str = 'loglog') -> float

Compute the local C-Q slope for a segment between two consecutive points.

Parameters:
  • q1 (float) – Discharge value at segment start.

  • q2 (float) – Discharge value at segment end.

  • c1 (float) – Concentration value at segment start.

  • c2 (float) – Concentration value at segment end.

  • kind ({'loglog','linear'}, optional) –

    • 'loglog' returns Δlog(C) / Δlog(Q) (i.e., power-law slope b)

    • 'linear' returns ΔC / ΔQ

Returns:

The requested slope, or NaN if undefined.

Return type:

float

In log-log space the slope is the exponent b of the power law C = a·Q^b, equivalently log(C) = log(a) + b·log(Q). For a single segment it is evaluated from the two endpoints as Δlog(C) / Δlog(Q).

Interpretation:
  • b > 0.15: Dilution/flushing signature

  • b < -0.15: Enrichment/loading signature

  • |b| < 0.1: Chemostatic buffering
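The two-point slope described in this section can be sketched in a few lines. This is an illustration under stated assumptions, not the source of compute_cq_slope: the function name two_point_cq_slope is hypothetical, and treating non-positive values (in log-log mode) and a zero change in Q as "undefined" is a guess at what the documented NaN return covers.

```python
import math


def two_point_cq_slope(q1: float, q2: float, c1: float, c2: float,
                       kind: str = "loglog") -> float:
    """Slope between two consecutive (Q, C) points; NaN if undefined."""
    if kind == "loglog":
        # Logarithms need strictly positive inputs (assumed NaN case).
        if min(q1, q2, c1, c2) <= 0:
            return math.nan
        dq = math.log(q2) - math.log(q1)
        dc = math.log(c2) - math.log(c1)
    elif kind == "linear":
        dq = q2 - q1
        dc = c2 - c1
    else:
        # Assumption: unknown kinds are rejected; the library may differ.
        raise ValueError(f"unknown kind: {kind!r}")
    if dq == 0:
        # No change in discharge, so the slope is undefined.
        return math.nan
    return dc / dq


# Two points on the power law C = 2 * Q**0.5, so b should be close to 0.5.
b = two_point_cq_slope(q1=1.0, q2=4.0, c1=2.0, c2=4.0)
print(round(b, 6))
print(two_point_cq_slope(1.0, 3.0, 2.0, 8.0, kind="linear"))
```

Because only two points are used, the result is sensitive to measurement noise in either endpoint, which is worth keeping in mind when comparing it against the thresholds above.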

Flow Dynamics Analysis

analyze_segment_flow_dynamics(segment_data: DataFrame, percentiles: Dict, ccol: str | None = None, qcol: str | None = None) -> Dict

Analyze high-resolution Q dynamics within a segment, with window-scale hysteresis analysis.

Integrates hourly discharge data to capture within-segment flow dynamics that may not be apparent from sampling-point averages. Calculates hysteresis metrics on the time window (not event-scale or point-to-point).

Parameters:
  • segment_data (pd.DataFrame) – High-resolution time series for the segment window

  • percentiles (dict) – Global percentile thresholds for Q levels

  • ccol (str, optional) – Name of concentration column

  • qcol (str, optional) – Name of flow column

Returns:

Dictionary containing:
  • peak_q: Maximum Q in segment

  • peak_time: When peak occurred

  • days_since_peak: Days from peak to segment end

  • days_to_peak: Days from segment start to peak

  • flow_phase: 'rising' | 'at_peak' | 'early_decline' | 'late_decline' | 'low'

  • peak_position: 'early' | 'middle' | 'late' in segment

  • q_trend: slope of Q over segment

  • q_acceleration: change in Q rate

  • q_level: 'low' | 'medium' | 'high'

  • q_range: Q variability within segment

  • window_HI_*: Window-scale hysteresis metrics (if ccol provided)

Returns None if no data available.

Return type:

dict or None

Analyze high-resolution flow data to determine the flow phase (rising, at peak, early or late decline, low), the days since peak, and the flow level category. Used to improve classification accuracy when hourly or other sub-daily flow data are available.
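The peak-timing fields follow directly from their descriptions. The sketch below derives peak_q, peak_time, days_to_peak and days_since_peak from a list of timestamped discharge values. It is not the gcs_core code: the function name peak_timing and the list-of-tuples input are hypothetical, and the phase, position and level categories are left out because their thresholds are not documented here.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple

SECONDS_PER_DAY = 86400.0


def peak_timing(series: List[Tuple[datetime, float]]) -> Optional[Dict[str, Any]]:
    """Peak discharge and its timing within a segment; None if there is no data."""
    if not series:
        return None
    ordered = sorted(series, key=lambda point: point[0])
    # max() keeps the first maximum, i.e. the earliest peak if values tie.
    peak_time, peak_q = max(ordered, key=lambda point: point[1])
    start, end = ordered[0][0], ordered[-1][0]
    return {
        "peak_q": peak_q,
        "peak_time": peak_time,
        "days_to_peak": (peak_time - start).total_seconds() / SECONDS_PER_DAY,
        "days_since_peak": (end - peak_time).total_seconds() / SECONDS_PER_DAY,
    }


# Illustrative daily values: the peak falls one day into a four-day segment.
segment = [
    (datetime(2024, 1, 1), 1.0),
    (datetime(2024, 1, 2), 6.0),
    (datetime(2024, 1, 3), 4.0),
    (datetime(2024, 1, 4), 2.5),
    (datetime(2024, 1, 5), 2.0),
]
timing = peak_timing(segment)
print(timing["days_to_peak"], timing["days_since_peak"])
```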

Statistical Functions

compute_change_percentiles(data: DataFrame, sites: List[str], ccol: str, qcol: str) -> Dict[str, float]

Calculate percentiles for concentration and flow changes (ΔC and ΔQ).

These percentile-based thresholds make the classification compound-agnostic by adapting to the specific distribution of each compound’s dynamics.

Parameters:
  • data (pd.DataFrame) – Time series data

  • sites (list of str) – Sites to include

  • ccol (str) – Concentration column

  • qcol (str) – Flow column

Returns:

Percentile thresholds for dC and dQ at various levels (p01, p05, p08, p10, p25, p50, p75, p90, p95)

Return type:

dict

Compute percentile-based thresholds for concentration and flow changes. Used in the classification system to determine phase boundaries in a compound-agnostic manner.
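As a rough illustration of percentile thresholds on changes, the sketch below differences a series and reads off the documented percentile levels. Several details are assumptions made for the example and may not match gcs_core: signed consecutive differences are used, percentiles are linearly interpolated between ranks, and the helper names and the "dC_p50"-style keys are hypothetical.

```python
import math
from typing import Dict, List, Sequence

# Percentile levels listed in the Returns description above.
LEVELS = (1, 5, 8, 10, 25, 50, 75, 90, 95)


def percentile(sorted_values: Sequence[float], p: float) -> float:
    """Percentile of pre-sorted values, linearly interpolated between ranks."""
    if not sorted_values:
        return math.nan
    rank = (len(sorted_values) - 1) * p / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def change_percentiles(values: Sequence[float], prefix: str) -> Dict[str, float]:
    """Percentiles of signed consecutive differences, keyed like 'dC_p05'."""
    deltas: List[float] = sorted(b - a for a, b in zip(values, values[1:]))
    return {f"{prefix}_p{level:02d}": percentile(deltas, level) for level in LEVELS}


# Illustrative concentrations: consecutive differences are 1, 2, 3 and 4.
concentration = [0.0, 1.0, 3.0, 6.0, 10.0]
thresholds = change_percentiles(concentration, prefix="dC")
print(thresholds["dC_p25"], thresholds["dC_p50"], thresholds["dC_p75"])
```

Because the thresholds come from the data's own distribution, the same code applied to a compound with much larger or smaller concentrations yields thresholds on that compound's scale, which is the compound-agnostic property described above.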

Notes

Data Requirements

For hysteresis analysis:
  • Minimum 5 data points

  • Recommended 10-15 points for single events

  • 20-30 points for robust window-scale analysis

For time series classification:
  • Minimum 20-30 sampling points per site

  • Recommended 50+ points covering multiple cycles

  • High-resolution Q data (hourly) improves accuracy

Scientific References

  • Musolff, A. et al. (2015). Catchment controls on solute export. Advances in Water Resources, 86, 133-146.

  • Thompson, S. E. et al. (2011). Comparative hydrology across AmeriFlux sites. Water Resources Research, 47(10).

See Also