TEVA workshop (TEST)


Authors:
Owners: This resource does not have an owner who is an active HydroShare user. Contact CUAHSI (help@cuahsi.org) for information on this resource.
Type: Resource
Storage: The size of this resource is 5.4 MB
Created: Jan 03, 2025 at 4:17 p.m.
Last updated: Jan 03, 2025 at 4:43 p.m.
Sharing Status: Public
Views: 72
Downloads: 16

Abstract

Underwood et al. (2023) recently introduced the tandem evolutionary algorithm (TEVA) of Hanley et al. (2020) to the water resources and ecology domains, applying it to identify the features (catchment-scale attributes) and feature interactions that are important in determining patterns in dissolved organic carbon (DOC) across the continental US. TEVA has particular advantages for feature selection in large, multivariate observational data sets of complex systems such as riverscapes or ecosystems, and has been shown to outperform logistic regression and random forest at identifying feature interactions and equifinality (Hanley et al., 2020; Anderson et al., 2020). TEVA finds relationships among multiple variables that may result from either additive processes or feature interactions. It not only extracts the features significantly associated with a given outcome class(es), but also identifies the specific value ranges associated with those features (Underwood et al., 2023; Hanley et al., 2020). The algorithm is also robust to mixed data types (continuous, categorical), missing data, censored data, skewed distributions, and unbalanced target classes or clusters (Hanley et al., 2020).

Given n observations of p features across a study domain and a target of one or more classes or outcomes, the algorithm identifies and archives two types of clauses that fall below a given fitness threshold. In the first pass, TEVA identifies Conjunctive Clauses (CCs): combinations of variables, which may or may not be correlated, that interact to produce an outcome. For example, an Extreme Flood may result from steep slopes + shallow soils + intense rainfall. In a second pass, TEVA identifies Disjunctive Clauses (DCs): sets of CCs linked by a logical “OR”. For example, an Extreme Flood may result from (steep slopes + shallow soils + intense rainfall) OR (high antecedent soil moisture + rainfall) OR (thick snow pack + high temperatures). DCs are multi-order, while the CCs comprising a DC can themselves range from first-order to multi-order (Underwood et al., 2023).
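
The clause structure described above can be sketched in a few lines of Python. This is an illustration only, not the TEVA codebase: the class `RangeCondition`, the functions `cc_matches` and `dc_matches`, the feature names, and every threshold are hypothetical, and the sketch assumes that a missing value never satisfies a condition. It shows only how an archived clause would be evaluated against an observation (a CC as a logical AND of feature value ranges, a DC as a logical OR of CCs), not how TEVA evolves or scores clauses. Two of the three flood CCs from the example are shown for brevity.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class RangeCondition:
    """One feature constrained to a closed value range [low, high]."""
    feature: str
    low: float
    high: float

    def matches(self, obs: Mapping[str, Optional[float]]) -> bool:
        value = obs.get(self.feature)
        # Assumption for this sketch: a missing value never satisfies a condition.
        return value is not None and self.low <= value <= self.high


def cc_matches(cc: Sequence[RangeCondition], obs: Mapping[str, Optional[float]]) -> bool:
    """A conjunctive clause holds when ALL of its conditions hold (logical AND)."""
    return all(cond.matches(obs) for cond in cc)


def dc_matches(dc: Sequence[Sequence[RangeCondition]], obs: Mapping[str, Optional[float]]) -> bool:
    """A disjunctive clause holds when ANY of its CCs holds (logical OR)."""
    return any(cc_matches(cc, obs) for cc in dc)


INF = float("inf")

# Hypothetical feature names and thresholds, chosen only to mirror the flood example.
cc_steep = [
    RangeCondition("slope_deg", 20, 90),         # steep slopes
    RangeCondition("soil_depth_m", 0, 0.5),      # shallow soils
    RangeCondition("rain_mm_per_hr", 30, INF),   # intense rainfall
]
cc_wet = [
    RangeCondition("soil_moisture_frac", 0.8, 1.0),  # high antecedent soil moisture
    RangeCondition("rain_mm_per_hr", 10, INF),       # rainfall
]
flood_dc = [cc_steep, cc_wet]

# Made-up observations; None marks a missing measurement.
steep_site = {"slope_deg": 30, "soil_depth_m": 0.3, "rain_mm_per_hr": 50, "soil_moisture_frac": None}
wet_site = {"slope_deg": 5, "soil_depth_m": 2.0, "rain_mm_per_hr": 15, "soil_moisture_frac": 0.9}
dry_site = {"slope_deg": 5, "soil_depth_m": 2.0, "rain_mm_per_hr": 2, "soil_moisture_frac": 0.2}

for name, site in [("steep", steep_site), ("wet", wet_site), ("dry", dry_site)]:
    print(name, dc_matches(flood_dc, site))
```

The steep and wet sites satisfy different CCs yet both satisfy the DC, which is how a DC can represent equifinality: distinct combinations of conditions leading to the same outcome.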

In this workshop, we illustrate the application of TEVA to 91 observations from forested catchments across the CONUS, each described by 54 catchment attributes inferred to be important to DOC dynamics. Combinations of these catchment attributes were identified in CCs and DCs with a high probability of being linked to an outcome class of High or Low mean DOC concentration. Target classes were assigned using Jenks natural breaks for the 91 catchments with sufficient (≥3) observations of DOC in stream water to calculate a mean value. TEVA was originally implemented in MATLAB; the codebase has since been ported to the open-source language Python and is accessed through CUAHSI’s JupyterHub.
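
As a rough illustration of the class-assignment step, the sketch below computes a two-class Jenks natural break by exhaustive search: it picks the split of the sorted values that minimizes the summed within-class squared deviation from the class means. The function name `two_class_jenks_break` and the sample values are hypothetical (the values are made up and are not the workshop's DOC data), and the workshop notebooks may well use a library implementation instead.

```python
from typing import List, Sequence


def _sum_sq_dev(values: Sequence[float]) -> float:
    """Sum of squared deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def two_class_jenks_break(values: Sequence[float]) -> float:
    """Return the upper bound of the lower class for a two-class natural break.

    Every split of the sorted data into two non-empty groups is tried, and the
    one with the smallest total within-class squared deviation is kept.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values to form two classes")
    best_index, best_cost = 1, float("inf")
    for i in range(1, len(data)):
        cost = _sum_sq_dev(data[:i]) + _sum_sq_dev(data[i:])
        if cost < best_cost:
            best_index, best_cost = i, cost
    return data[best_index - 1]


# Made-up mean concentrations for seven hypothetical catchments.
mean_doc = [1.0, 9.5, 1.5, 2.0, 8.0, 1.2, 10.0]

break_value = two_class_jenks_break(mean_doc)
classes: List[str] = ["Low" if v <= break_value else "High" for v in mean_doc]

print(break_value)
print(classes)
```

The exhaustive search is quadratic in the number of values, which is not a concern for a data set of 91 catchments.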

How to Cite

Bogan, A. (2025). TEVA workshop (TEST), HydroShare, http://www.hydroshare.org/resource/74c25d1c16fb4a6cafa61b7e851a7ba4

This resource is shared under the Creative Commons Attribution CC BY.

http://creativecommons.org/licenses/by/4.0/
