Data and code for: Modeling Seasonal Effects of River Flow on Water Temperatures in an Agriculturally Dominated California River


Resource type: Composite Resource
Storage: The size of this resource is 102.0 MB
Created: Mar 08, 2021 at 6:35 p.m.
Last updated: Mar 26, 2021 at 3:43 p.m.
Sharing Status: Public
Views: 69
Downloads: 11

Abstract

This resource contains the data and scripts used for: Asarian, J.E., and Robinson, C. (in review) Modeling Seasonal Effects of River Flow on Water Temperatures in an Agriculturally Dominated California River. Due to manuscript revisions, the numbering of tables in these scripts no longer exactly matches the manuscript. This will be updated/revised after receiving reviewers' comments.

Abstract from the article:
Low summer river flows can increase vulnerability to warming, impacting coldwater fish. Water managers need tools to quantify the complex linkages between flow and water temperature, yet statistical models often assume a constant relationship between these variables. In California’s snowmelt and groundwater-influenced Scott River where agricultural irrigation consumes most summer river flow, flow variation had stronger effects on water temperature in April–July than other months. Using 24 years of daily air temperature and flow data as predictors, we compared multiple statistical methods for modeling daily Scott River water temperatures, including generalized additive models with non-linear interactions between flow and day of the year. Models with seasonally varying flow effects performed better than those assuming a constant relationship between water temperature and flow. Cross-validation root mean squared errors of the selected models were ≤1 °C. We applied the models to several instream flow scenarios currently being considered by stakeholders and regulatory agencies. Relative to historic conditions, the most protective flow scenario would reduce average annual maximum temperature from 25.9 °C to 24.6 °C, reduce average annual degree-days exceedance of 22 °C (a cumulative thermal stress metric) from 107 to 54, and delay the onset of water temperatures greater than 22 °C during some drought years. Withdrawal of river water after 1 June, including for groundwater management purposes, could contribute to additional exceedances of 22 °C. These methods can be applied to model any stream with long-term flow and water temperature measurements, with applications including scenario prediction and infilling data gaps.

The files are organized into 5 folders: R_Scripts, SourceDataFiles, CompiledData, WorkingFiles, and Outputs. Details of each file are provided in the README.txt file.


Resource Level Coverage

Spatial

Coordinate System/Geographic Projection:
WGS 84 EPSG:4326
Coordinate Units:
Decimal degrees
Place/Area Name:
Scott River, Siskiyou County, California, USA
North Latitude
41.6410°
East Longitude
-122.9320°
South Latitude
41.5992°
West Longitude
-123.0146°


Content

README.txt

INTRODUCTION
This resource contains the data and scripts used for: Asarian, J.E., and Robinson, C. (in review) Modeling Seasonal Effects of River Flow on Water Temperatures in an Agriculturally Dominated California River.  Due to manuscript revisions, the numbering of tables in these scripts no longer exactly matches the manuscript.  This will be updated/revised after receiving reviewers' comments.  

For questions, contact: Eli Asarian (Riverbend Sciences, Eureka, CA, USA), eli@riverbendsci.com

Abstract from the article:
We compared multiple statistical approaches for predicting daily water temperatures in California's Scott River, where agricultural irrigation consumes most summer river flow, affecting coldwater fish. Statistical models often assume a constant relationship between flow and water temperature. At our snowmelt and groundwater-influenced study site, water temperatures in April–July are much cooler during high flows than low flows, whereas flow has less effect in late summer and fall. Using 24 years of daily air temperature and river flow data as predictors, we compared a widely-used non-linear logistic regression approach to two methods that allow flow effects to vary seasonally: harmonic regression and generalized additive models. Models that included seasonally varying flow effects performed much better than those assuming a constant relationship between water temperature and flow. Cross-validation root mean squared errors of the final models were ≤1 °C. We applied the models to several instream flow scenarios currently being considered by stakeholders and regulatory agencies. Relative to historic conditions, the most protective flow scenario would reduce average annual maximum temperature from 25.9 °C to 24.6 °C, reduce average annual degree-days exceedance of 22 °C (a metric of cumulative thermal stress) from 107 to 54, and delay the onset of water temperatures greater than 22 °C during some drought years, but we caution that diversions for managed aquifer recharge after 1 June could cause thermal impacts. Our methods can be applied in any stream with long-term flow and water temperature measurements.
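
The degree-days exceedance metric mentioned in the abstract can be illustrated with a short sketch. This is not taken from the resource's R scripts: it assumes the metric is the sum, over all days in a year, of the amount by which the daily maximum water temperature exceeds 22 °C. The function name and the example values are hypothetical.

```python
def degree_days_exceedance(daily_max_temps_c, threshold_c=22.0):
    """Sum of daily exceedances above threshold_c, in degree-days.

    Assumed definition (not confirmed by the README): each day contributes
    max(0, T - threshold). Days with missing data (None) are skipped.
    """
    return sum(
        max(0.0, temp - threshold_c)
        for temp in daily_max_temps_c
        if temp is not None
    )


# Hypothetical week of daily maximum water temperatures (degrees C).
example_week = [21.5, 22.0, 23.4, 25.0, 24.1, None, 22.6]
print(round(degree_days_exceedance(example_week), 1))
```

Days at or below the threshold contribute nothing, so the metric grows with both the duration and the magnitude of warm periods.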


INSTRUCTIONS FOR RUNNING THE SCRIPTS
1. Download the entire contents of this resource and place all 5 folders in a single folder (i.e., match the data folder structure of the resource)
2. Open the "r.Rproj" file in RStudio. This file will set R's working directory so that the scripts run properly.
a) Open the R script "R_Scripts/MasterScript.R"
b) Review the list of packages at the top of "R_Scripts/MasterScript.R", and install any missing packages.
c) Run all lines in "MasterScript.R". This will call the 3 individual scripts ("R_Scripts/Script_A_compilation.R", "R_Scripts/Script_B_prepare.R", and "R_Scripts/Script_C_analysis.R") that compile the data, prepare the data for analysis, and implement the analysis, respectively. MasterScript.R is provided for convenience, but for troubleshooting purposes it might be necessary to run each script individually.
d) If the R console shows an "Output created:" message for each of the 3 scripts, the scripts all ran successfully, and updated outputs should now appear in the "Outputs/FiguresManuscript", "Outputs/TablesManuscript", and "Outputs/R_Markdown" folders.
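
Before opening the project, it may help to confirm that the download preserved the folder structure described in step 1. The following is a minimal sketch, independent of the resource's R scripts, that checks for the 5 folder names listed in this README. The function name is hypothetical, and the folder names are assumed to match the README's list exactly.

```python
from pathlib import Path

# Folder names as listed in this README; adjust if the download differs.
EXPECTED_FOLDERS = [
    "R_Scripts",
    "SourceDataFiles",
    "CompiledData",
    "WorkingFiles",
    "Outputs",
]


def missing_folders(project_root):
    """Return the expected folder names that are absent under project_root."""
    root = Path(project_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from the folder that holds the downloaded resource.
    missing = missing_folders(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected folders found.")
```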


FILE CONTENTS
As detailed below, this resource is organized into 5 folders: R_Scripts, SourceDataFiles, CompiledData, WorkingFiles, and Outputs.

"R_Scripts" folder
This folder contains all of the scripts used to compile the data, prepare the data for analysis, and implement the analysis.

"Outputs/R_Markdown" folder
This folder contains 3 HTML files that are reports showing code and resulting outputs from the running of the R scripts. These HTML files will be overwritten every time the scripts are re-run.

"Outputs/TablesManuscript/Table_2_ModelTrainingStats.csv"
Table 2 from article. See article for caption.

"Outputs/FiguresManuscript"
This folder contains all the figures from the manuscript. See article for captions.

"Outputs/ModelDiagnostics" folder contains model diagnostic figures and tables
There is a series of 5 files for each model. File names start with the response variable (e.g., "wtemp_max" or "wtemp_mean"), followed by the model number (e.g., GAM1 ... GAM11, with key provided in Outputs/TablesManuscript/Table_2_ModelTrainingStats.csv or Outputs/TablesOther/ModelInfo.stats.all.csv), followed by 1 of 5 suffixes:
1. _ACF_no_AR1.png is the autocorrelation function plot from itsadug::acf_resid function, for the no-AR1 version of the model
2. _ACF_with_AR1.png is the autocorrelation function plot from itsadug::acf_resid function, for the with-AR1 version of the model
3. _appraise_with_AR1.png is the output from the gratia::appraise function.
4. _concurvity.csv is the output from the mgcv::concurvity function.
5. _draw_with_AR1.png is the output from the gratia::draw function.
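
The naming convention above can be used to sort diagnostic files by response variable and model. The sketch below is an illustration only, not part of the resource's scripts. It assumes the response variable and model number are joined by an underscore (the README does not state the separator), so the example file names are hypothetical.

```python
# Suffixes and response-variable prefixes as listed in this README.
SUFFIXES = [
    "_ACF_no_AR1.png",
    "_ACF_with_AR1.png",
    "_appraise_with_AR1.png",
    "_concurvity.csv",
    "_draw_with_AR1.png",
]
RESPONSES = ["wtemp_max", "wtemp_mean"]


def parse_diagnostic_name(filename):
    """Split a diagnostics file name into response, model, and diagnostic kind.

    Returns None when the name does not follow the documented pattern.
    """
    for suffix in SUFFIXES:
        if filename.endswith(suffix):
            stem = filename[: -len(suffix)]
            break
    else:
        return None
    for response in RESPONSES:
        if stem.startswith(response):
            # Drop whatever separator joins the response and the model number.
            model = stem[len(response):].strip("_.- ")
            return {"response": response, "model": model, "kind": suffix.lstrip("_")}
    return None


# Hypothetical file name; the underscore before "GAM1" is an assumption.
print(parse_diagnostic_name("wtemp_max_GAM1_ACF_no_AR1.png"))
```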

"Outputs/TablesOther/"
This folder contains 6 files:
1. Validation.daily.csv is the daily time series of validation results for each model. This table is used to generate the points in Figure 4, S5, and S6. Key to column names: 
Model.withnum = Model name and number (see article for details)
Parameter = Daily maximum or Daily mean
YearType = LOYOCV cross-validation or Out-of-sample validation
Year = Calendar year
Date = Date in Month/Day/Year format
DayOfYear = Julian day 1 to 366
value.obs = Observed temperature in degrees C
value.pred = Modeled temperature in degrees C
value.resid = Residual (Modeled - Observed) temperature in degrees C
2. Validation.summary.csv is a summary of validation results for each model. This table is used to generate the text in Figure 4 and S5. Key to column names: 
Model.withnum = Model name and number (see article for details)
Parameter = Daily maximum or Daily mean
YearType = LOYOCV cross-validation or Out-of-sample validation
n = number of days with measured temperature data
n.years = number of years with measured temperature data
R2 = coefficient of determination (r-squared)
RMSE = root mean squared error
3. ModelInfo.stats.all.csv is a summary of training and validation results for each model. This table is a precursor to Table 2 and the text in Figure 4 and S5. Key to column names: 
Model.Num = model number
Model.withnum = combination of Model.Num and Model
Model = model name
Parameter = Daily maximum or Daily mean
Formula = short version of formula used in Table 2 in article 
Formula.long = full formula (excluding correlation structure)
AIC = Akaike information criterion
fREML = fast restricted maximum likelihood score
AR1 = autocorrelation coefficient
edf = effective degrees of freedom 
RMSE.train = root mean squared error for training dataset
R2.train = coefficient of determination for training dataset
RMSE.xval = root mean squared error for cross-validation
R2.xval = coefficient of determination for cross-validation
RMSE.outval = root mean squared error for out-of-sample validation dataset
R2.outval = coefficient of determination for out-of-sample validation dataset
4. Scenario.Inputs.csv is the time series of scenario inputs, which are a version of the data used to create Figure 3.
Date = Date in Month/Day/Year format
Year = Calendar year
DayOfYear = Julian day 1 to 366
USGS.FLOW.log10cms = measured flow in log10 transformed original units (cms)
ATemp.mean.2dweighted50 = air temp in original units (degrees C)
usgs.flow.log10cms = measured flow in standardized units
atemp.mean.2dweighted50 = air temp in standardized units
USGS.FLOW.log10cms.usfs = USFS flows in log10 transformed original units (cms)
USGS.FLOW.log10cms.cdfw = CDFW flows in log10 transformed original units (cms)
usgs.flow.log10cms.usfs = USFS flows in standardized units
usgs.flow.log10cms.cdfw = CDFW flows in standardized units
atemp.mean.2dweighted50.quant0.05 = air temp 0.05 quantile in standardized units
atemp.mean.2dweighted50.quant0.5 = air temp 0.50 quantile in standardized units
atemp.mean.2dweighted50.quant0.95 = air temp 0.95 quantile in standardized units
usgs.flow.log10cms.quant0.05 =  flow 0.05 quantile in standardized units
usgs.flow.log10cms.quant0.5 = flow 0.50 quantile in standardized units
usgs.flow.log10cms.quant0.95 = flow 0.95 quantile in standardized units
5. Scenario.Outputs.Group1.csv contains the outputs of the Group 1 scenarios
DayOfYear = Julian day 1 to 366
ScenarioAir = Air temperatures used in scenario (scenario is defined by the pairing of ScenarioAir and ScenarioQ)
ScenarioQ = Flow used in scenario (scenario is defined by the pairing of ScenarioAir and ScenarioQ)
Parameter = Daily maximum or Daily mean
value = modeled temperature in degrees C 
6. Scenario.Outputs.Group2.csv contains the outputs of the Group 2 scenarios
ScenarioAir = Air temperatures used in scenario (scenario is defined by the pairing of ScenarioAir and ScenarioQ)
ScenarioQ = Flow used in scenario (scenario is defined by the pairing of ScenarioAir and ScenarioQ)
Year = Calendar year
Date = Date in Month/Day/Year format
DayOfYear = Julian day 1 to 366
Parameter = Daily maximum only (Daily mean not run for Group 2 scenarios)
value = modeled temperature in degrees C 
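
As an illustration of how the columns in Validation.daily.csv relate to those in Validation.summary.csv, the sketch below recomputes n, RMSE, and R2 for each combination of Model.withnum, Parameter, and YearType. It is a standalone example, not the authors' R code. In particular, R2 is computed here as 1 - SS_res/SS_tot, which is an assumption and may differ from the definition used in the scripts. The function names are hypothetical.

```python
import csv
import math
from collections import defaultdict


def summarize_validation(rows):
    """Group rows by model, parameter, and validation type; return n, RMSE, R2.

    rows: iterable of dicts keyed by the Validation.daily.csv column names.
    Rows with missing or non-numeric observed/predicted values are skipped.
    """
    groups = defaultdict(list)
    for row in rows:
        try:
            obs = float(row["value.obs"])
            pred = float(row["value.pred"])
        except (TypeError, ValueError):
            continue
        if math.isnan(obs) or math.isnan(pred):
            continue
        key = (row["Model.withnum"], row["Parameter"], row["YearType"])
        groups[key].append((obs, pred))

    summary = {}
    for key, pairs in groups.items():
        n = len(pairs)
        mean_obs = sum(obs for obs, _ in pairs) / n
        ss_res = sum((pred - obs) ** 2 for obs, pred in pairs)
        ss_tot = sum((obs - mean_obs) ** 2 for obs, _ in pairs)
        summary[key] = {
            "n": n,
            "RMSE": math.sqrt(ss_res / n),
            # Assumed definition of R2; undefined when observations are constant.
            "R2": 1 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        }
    return summary


def summarize_validation_file(path):
    """Read a CSV laid out like Validation.daily.csv and summarize it."""
    with open(path, newline="") as handle:
        return summarize_validation(csv.DictReader(handle))
```

For example, summarize_validation_file("Outputs/TablesOther/Validation.daily.csv") would return one entry per model, parameter, and validation type, assuming the file uses the column names listed above.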

"SourceDataFiles" folder
This folder contains all the source data used in the analysis:
1. ghcnd.raws.ccal.csv, ghcnd.raws.coak.csv, and ghcnd.raws.cqua.csv are daily mean, maximum, and minimum air temperature data that are downloaded from the Global Historical Climatology Network - Daily (GHCN-D, Menne et al. 2012, http://doi.org/10.7289/V5D21VHZ) via the rnoaa package when Script_A_compilation.R is run. The GHCN-D site identifier code is provided in the "id" column of the data files. Units are Degrees C.
2. PRISM_tmin_tmean_tmax_provisional_4km_19900101_20201231_41.5994_-122.9336.csv is gridded air temperature from PRISM (Daly et al., 2008) downloaded from: https://prism.oregonstate.edu/explorer.
3. Flow.usgs.daily.csv is daily flow data from the U.S. Geological Survey's (USGS) gage 11519500 SCOTT R NR FORT JONES CA. Units are cfs.
4. WTemp.daily.batch1.csv contains daily average stream temperature data from the Quartz Valley Indian Reservation (QVIR) and U.S. Forest Service (USFS). Units are Degrees C. The USFS data were extracted from the USFS National Resource Information System Aquatic Surveys (USFS_NRIS_AqS) by Callie McConnell (USFS Corvallis). The QVIR data were obtained by request from Crystal Robinson (Crystal.Robinson@qvir-nsn.gov) at QVIR. Data are from several functionally equivalent sites all located near the USGS gage 11519500 SCOTT R NR FORT JONES CA.
5. WTemp.hourly.batch2.csv contains 15 to 60 minute resolution stream temperature data from the Quartz Valley Indian Reservation (QVIR), U.S. Forest Service Klamath National Forest (USFS_KNF), and U.S. Bureau of Reclamation (USBR). The data cover different years than the daily data in WTemp.daily.batch1.csv. Units are Degrees C. The QVIR data were obtained by request from Crystal Robinson (Crystal.Robinson@qvir-nsn.gov) at QVIR. USFS data were obtained from Maija Meneks (Maija.Meneks@usda.gov), and apparently were not input into the USFS NRIS AqS database because they covered only part of the summer season. USBR stream temperature data were downloaded from the USGS Data Grapher: https://or.water.usgs.gov/cgi-bin/grapher/graph_setup.pl?basin_id=all&site_id=11519500.
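
The sub-daily stream temperature data must be reduced to daily statistics before they can be combined with the daily data. The sketch below shows one way such an aggregation might look. It is not the authors' method: the input is a hypothetical list of (timestamp, temperature) pairs because the column names of WTemp.hourly.batch2.csv are not documented here, and no completeness screening (e.g., a minimum number of readings per day) is applied. The output keys borrow the column names used in the compiled data file.

```python
from collections import defaultdict
from datetime import datetime


def daily_stats(readings):
    """Aggregate (datetime, temp_c) readings into per-day summary statistics."""
    by_day = defaultdict(list)
    for timestamp, temp_c in readings:
        by_day[timestamp.date()].append(temp_c)
    return {
        day: {
            "WTemp.min": min(values),
            "WTemp.mean": sum(values) / len(values),
            "WTemp.max": max(values),
            "WTemp.n": len(values),
            "WTemp.range": max(values) - min(values),
        }
        for day, values in sorted(by_day.items())
    }


# Hypothetical readings in degrees C.
example = [
    (datetime(2020, 7, 1, 0, 0), 18.0),
    (datetime(2020, 7, 1, 12, 0), 22.0),
    (datetime(2020, 7, 2, 0, 0), 19.0),
]
for day, stats in daily_stats(example).items():
    print(day, stats)
```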

"CompiledData" folder
This folder contains a single file srga.daily.csv that has all the compiled stream temperature, air temperature, and flow data used in the analysis. Key to columns:
- Date: Date in Month/Day/Year format
- Year: Calendar year
- USGS.FLOW.cfs: Daily average flow in units of cfs, from USGS gage 11519500 SCOTT R NR FORT JONES CA
- ATemp.mean: Daily average air temperature in units of degrees C. Data are primarily from GHCND site USR0000CQUA, with some infilling based on other stations (see article for details)
- ATemp.max: Daily maximum air temperature in units of degrees C. Data are primarily from GHCND site USR0000CQUA, with some infilling based on other stations (see article for details)
- WTemp.min: daily minimum stream temperature in units of degrees C.
- WTemp.mean:  daily mean stream temperature in units of degrees C.
- WTemp.max:  daily maximum stream temperature in units of degrees C.
- WTemp.n: number of stream temperature measurements from which daily stats are derived
- WTemp.range: WTemp.max - WTemp.min
- Source.Entity: Original source of stream temperature data (QVIR, USFS_KNF, USFS_NRIS_AqS, USBR)
- Site.Code: stream temperature site location identifier from the original source. Different sources used different codes over time, but all are essentially the same site (gage 11519500 SCOTT R NR FORT JONES CA).

"TemporaryWorkingFiles/ModelDiagnostics" folder contains temporary files generated during the running of the R scripts. Users do not need to be aware of what they are, although "R_Scripts/Script_C_analysis.R" shows how they are created and used.


REFERENCES:

Daly, C., Halbleib, M., Smith, J. I., Gibson, W. P., Doggett, M. K., Taylor, G. H., Curtis, J., & Pasteris, P. P. (2008). Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States. International Journal of Climatology, 28(15), 2031–2064. https://doi.org/10.1002/joc.1688

Menne, M. J., Durre, I., Korzeniewski, B., McNeal, S., Thomas, K., Yin, X., Anthony, S., Ray, R., Vose, R. S., Gleason, B. E., & Houston, T. G. (2012a). Global Historical Climatology Network - Daily (GHCN-Daily), Version 3.26. NOAA National Climatic Data Center. http://doi.org/10.7289/V5D21VHZ. [Accessed 2021-01-11]

Menne, M. J., Durre, I., Vose, R. S., Gleason, B. E., & Houston, T. G. (2012b). An Overview of the Global Historical Climatology Network-Daily Database. Journal of Atmospheric and Oceanic Technology, 29(7), 897–910. https://doi.org/10.1175/JTECH-D-11-00103.1


Related Resources

The content of this resource serves as the data for: Asarian, J.E., and Robinson, C. (in review) Seasonally Varying Regression Highlights River Flow as a Key Driver of Water Temperatures in California's Agriculturally Dominated and Groundwater-Influenced Scott River.

Credits

Funding Agencies

This resource was created using funding from the following sources:
Agency Name Award Title Award Number
Klamath Tribal Water Quality Consortium
U.S. Environmental Protection Agency, Region IX

How to Cite

Asarian, J. E., C. Robinson (2021). Data and code for: Modeling Seasonal Effects of River Flow on Water Temperatures in an Agriculturally Dominated California River, HydroShare, http://www.hydroshare.org/resource/a6653e2919964f9b840ec0340d86e11c

This resource is shared under the Creative Commons Attribution CC BY.

 http://creativecommons.org/licenses/by/4.0/
