Authors: | |
---|---|
Owners: | This resource does not have an owner who is an active HydroShare user. Contact CUAHSI (help@cuahsi.org) for information on this resource. |
Type: | Resource |
Storage: | The size of this resource is 5.4 MB |
Created: | Jan 03, 2025 at 4:17 p.m. |
Last updated: | Jan 03, 2025 at 4:43 p.m. |
Citation: | See how to cite this resource |
Sharing Status: | Public |
---|---|
Views: | 72 |
Downloads: | 16 |
Abstract
Underwood et al. (2023) have recently introduced the tandem evolutionary algorithm (TEVA) of Hanley et al. (2020) to the water resources and ecology domains, and applied it to identify features (catchment-scale attributes) and feature interactions important in determining patterns in dissolved organic carbon (DOC) across the continental US. TEVA has particular advantages for feature selection in large, multivariate observational data sets of complex systems like riverscapes or ecosystems, and has been shown to outperform logistic regression and random forest for identifying feature interactions and equifinality (Hanley et al., 2020; Anderson et al., 2020). TEVA finds combinations of multiple variables whose association with an outcome may result from either additive processes or feature interactions. It not only extracts features significantly associated with a given outcome class(es), but also identifies the specific value ranges associated with those features (Underwood et al., 2023; Hanley et al., 2020). The algorithm is also robust to mixed data types (continuous, categorical), missing data, censored data, skewed distributions, and unbalanced target classes or clusters (Hanley et al., 2020).
When presented with n observations of p features across a study domain and a target of one or more classes or outcomes, the algorithm identifies and archives two types of clauses below a given fitness threshold. In the first pass, TEVA identifies Conjunctive Clauses (CCs): combinations of variables, correlated or not, that interact to produce an outcome. For example, an Extreme Flood may result from steep slopes + shallow soils + intense rainfall. A second pass of TEVA identifies Disjunctive Clauses (DCs): sequences of CCs that are linked with a logical "OR" statement. For example, an Extreme Flood may result from (steep slopes + shallow soils + intense rainfall) OR (high antecedent soil moisture + rainfall) OR (thick snow pack + high temperatures). DCs are multi-order, while the CCs comprising a DC can themselves range from first-order to multi-order (Underwood et al., 2023).
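The clause structure described above can be illustrated with a minimal Python sketch. This is not the TEVA codebase. It only shows how a CC (a logical AND over feature value ranges) and a DC (a logical OR over CCs) could be evaluated against observations. The feature names, value ranges, observations, and the choice to treat a missing value as a non-match are all hypothetical assumptions made for illustration.

```python
from math import inf
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: a CC maps each feature name to an inclusive
# (low, high) value range; a DC is a list of CCs joined by logical OR.
Clause = Dict[str, Tuple[float, float]]
Observation = Dict[str, Optional[float]]


def cc_matches(clause: Clause, obs: Observation) -> bool:
    """True when every feature in the CC falls inside its range (logical AND).

    Assumption for this sketch only: a missing value counts as a non-match.
    """
    for name, (low, high) in clause.items():
        value = obs.get(name)
        if value is None or not (low <= value <= high):
            return False
    return True


def dc_matches(clauses: List[Clause], obs: Observation) -> bool:
    """True when at least one CC in the DC matches (logical OR)."""
    return any(cc_matches(clause, obs) for clause in clauses)


# Illustrative clauses mirroring the Extreme Flood example (made-up ranges).
steep_shallow_intense: Clause = {
    "slope_deg": (15.0, 90.0),
    "soil_depth_m": (0.0, 0.5),
    "rain_mm_per_hr": (30.0, inf),
}
wet_soil_rain: Clause = {
    "soil_moisture_frac": (0.8, 1.0),
    "rain_mm_per_hr": (10.0, inf),
}
snow_melt: Clause = {
    "snow_depth_m": (1.0, inf),
    "air_temp_c": (5.0, inf),
}
extreme_flood_dc = [steep_shallow_intense, wet_soil_rain, snow_melt]

# Made-up observations, one dict per site.
observations: List[Observation] = [
    {"slope_deg": 25, "soil_depth_m": 0.3, "rain_mm_per_hr": 40,
     "soil_moisture_frac": 0.4, "snow_depth_m": 0.0, "air_temp_c": 12},
    {"slope_deg": 3, "soil_depth_m": 2.0, "rain_mm_per_hr": 15,
     "soil_moisture_frac": 0.9, "snow_depth_m": 0.0, "air_temp_c": 10},
    {"slope_deg": 5, "soil_depth_m": 1.5, "rain_mm_per_hr": 0,
     "soil_moisture_frac": 0.3, "snow_depth_m": 1.8, "air_temp_c": 8},
    {"slope_deg": 20, "soil_depth_m": 0.4, "rain_mm_per_hr": 5,
     "soil_moisture_frac": 0.2, "snow_depth_m": 0.0, "air_temp_c": 15},
]

flags = [dc_matches(extreme_flood_dc, obs) for obs in observations]
print(flags)
```

In this sketch the first three observations each satisfy a different CC, so the DC covers all three, while the fourth satisfies none. This is the sense in which a DC can capture several distinct pathways to the same outcome.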
In this workshop, we illustrate the application of TEVA to a data set of 91 observations from forested catchments across the conterminous United States (CONUS), each described by 54 catchment attributes inferred to be important to DOC dynamics. Combinations of these catchment attributes were identified in CCs and DCs with a high probability of being linked to an outcome class of High or Low mean DOC concentration. Target classes were assigned using Jenks natural breaks for the 91 catchments with sufficient (≥3) observations of DOC in stream water to calculate a mean value. TEVA was originally implemented in the MATLAB programming language; the codebase has now been ported to the open-source language Python and is accessed through CUAHSI's JupyterHub.
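As a rough illustration of the target-class assignment, the sketch below applies the natural-breaks idea to the two-class case: it sorts the values and picks the split that minimizes the total within-class sum of squared deviations. It is a minimal sketch, not the workshop's code. The function names and the mean DOC values are hypothetical, and it assumes exactly two classes (High and Low).

```python
from typing import List, Sequence


def sum_sq_dev(values: Sequence[float]) -> float:
    """Sum of squared deviations of the values from their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def two_class_break(values: Sequence[float]) -> float:
    """Return the upper bound of the lower class for a two-class split.

    Every split point of the sorted values is tried, and the one with the
    smallest total within-class sum of squared deviations is kept.
    """
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two values to form two classes")
    best_split = min(
        range(1, len(ordered)),
        key=lambda i: sum_sq_dev(ordered[:i]) + sum_sq_dev(ordered[i:]),
    )
    return ordered[best_split - 1]


# Made-up mean DOC concentrations, one value per catchment.
mean_doc: List[float] = [1.2, 1.5, 1.9, 2.3, 2.8, 9.5, 10.4, 12.1]

break_value = two_class_break(mean_doc)
labels = ["Low" if v <= break_value else "High" for v in mean_doc]
print(break_value, labels)
```

The exhaustive search is practical here because a two-class split of n values has only n - 1 candidates. Splitting into more classes is usually done with a dynamic-programming formulation instead.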
How to Cite
This resource is shared under the Creative Commons Attribution CC BY.
http://creativecommons.org/licenses/by/4.0/