--- title: "tikatuwq: Water Quality Indices and Temporal Trends" author: "tikatuwq developers" output: rmarkdown::html_vignette: number_sections: true vignette: > %\VignetteIndexEntry{tikatuwq: Water Quality Indices and Temporal Trends} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 4, dpi = 96, message = FALSE, warning = FALSE, fig.alt = "Figure generated by tikatuwq package" ) ``` ## Introduction This vignette focuses on the methods implemented in tikatuwq for computing water quality indices and analyzing temporal trends. We cover: - Water Quality Index (IQA/WQI) calculation methods - Trophic State Index (IET) for lakes and reservoirs - Temporal trend analysis using robust and parametric methods - Parameter-specific analysis tools ## Water Quality Index (IQA/WQI) ### Method overview The IQA combines sub-indices (Qi) for individual parameters using weighted arithmetic mean. The sub-indices are obtained by piecewise-linear interpolation over approximate curves (CETESB/NSF style). ```{r load-package} library(tikatuwq) data("wq_demo", package = "tikatuwq") ``` ### Default parameters and weights The default IQA implementation uses 9 parameters with standard weights: ```{r iqa-weights} # Default weights default_weights <- c( od = 0.17, coliformes = 0.15, dbo = 0.10, nt_total = 0.10, p_total = 0.10, turbidez = 0.08, tds = 0.08, pH = 0.12, temperatura = 0.10 ) # Check sum equals 1 sum(default_weights) ``` ### Computing IQA ```{r compute-iqa} # Compute IQA with default settings df_iqa <- iqa(wq_demo, na_rm = TRUE) # View results cols_show <- intersect(c("ponto", "IQA", "IQA_status"), names(df_iqa)) head(df_iqa[, cols_show, drop = FALSE]) # Distribution hist(df_iqa$IQA, breaks = 10, main = "IQA Distribution", xlab = "IQA") ``` ### Handling missing parameters When `na_rm = TRUE`, weights are rescaled per row to use only available parameters: ```{r iqa-missing} # Example with missing parameters df_missing <- wq_demo df_missing$tds <- NULL # Remove one parameter df_iqa_missing <- iqa(df_missing, na_rm = TRUE) head(df_iqa_missing$IQA) ``` ### Custom weights You can provide custom weights: ```{r iqa-custom-weights} # Custom weights (must sum to 1) custom_weights <- c( od = 0.20, coliformes = 0.20, dbo = 0.10, nt_total = 0.10, p_total = 0.10, turbidez = 0.10, tds = 0.10, pH = 0.05, temperatura = 0.05 ) df_iqa_custom <- iqa(wq_demo, pesos = custom_weights, na_rm = TRUE) cols_show2 <- intersect(c("IQA", "IQA_status"), names(df_iqa_custom)) head(df_iqa_custom[, cols_show2, drop = FALSE]) ``` ### Classification The IQA values are automatically classified into qualitative categories: ```{r iqa-classification} # Classification function classify_iqa(c(15, 40, 65, 80, 95)) # English labels classify_iqa(c(15, 40, 65, 80, 95), locale = "en") # Distribution in demo data table(df_iqa$IQA_status) ``` ## Trophic State Index (IET) ### Carlson IET For lentic systems, the Carlson Trophic State Index uses Secchi depth, chlorophyll-a, and total phosphorus: ```{r iet-carlson, eval=FALSE} # Example dataset with required parameters # df_lake <- data.frame( # ponto = c("L1", "L2"), # secchi = c(2.5, 1.0), # meters # clorofila = c(5, 20), # ug/L # p_total = c(0.02, 0.10) # mg/L (converted to ug/L internally) # ) # # iet_carlson(df_lake, .keep_ids = TRUE) ``` The function automatically: - Converts p_total (mg/L) to tp (ug/L) if needed - Accepts aliases (sd for secchi, chla for clorofila) - Returns TSI values with classification ### Lamparelli IET Similar to Carlson but with different equations and thresholds: ```{r iet-lamparelli, eval=FALSE} # iet_lamparelli(df_lake, .keep_ids = TRUE) ``` ## Temporal Trend Analysis ### Single parameter trend The `trend_param()` function computes Theil-Sen slope and Spearman correlation: ```{r trend-single} # Add temporal structure to demo data df_temporal <- wq_demo df_temporal$data <- as.Date("2025-01-01") + seq_len(nrow(df_temporal)) - 1 # Compute trend for turbidity trend_result <- trend_param(df_temporal, param = "turbidez") print(trend_result) ``` The result includes: - `slope`: Theil-Sen slope - `p_value`: Spearman correlation p-value - `trend`: classification (increasing, decreasing, stable) ### Plotting trends ```{r plot-trend} library(ggplot2) # Plot with trend line p_trend <- plot_trend(df_temporal, param = "turbidez", method = "theilsen") print(p_trend) # With LOESS smoothing p_loess <- plot_trend(df_temporal, param = "turbidez", method = "loess") print(p_loess) ``` ### Multiple parameters Use `param_trend_multi()` to analyze trends across multiple parameters: ```{r trend-multi} # Trends for multiple parameters params <- c("turbidez", "od", "dbo") trends_multi <- param_trend_multi(df_temporal, parametros = params) print(trends_multi) ``` ## Parameter-specific Analysis ### Summary statistics ```{r param-summary} # Summary for one parameter summary_turb <- param_summary(df_temporal, parametro = "turbidez") print(summary_turb) # Multi-parameter summary summary_multi <- param_summary_multi(df_temporal, parametros = c("turbidez", "od", "dbo")) print(summary_multi) ``` ### Parameter plots ```{r param-plot} # Single parameter plot p1 <- param_plot(df_temporal, parametro = "turbidez") print(p1) # Multi-parameter plot p2 <- param_plot_multi(df_temporal, parametros = c("turbidez", "od", "dbo")) print(p2) ``` ## Statistical Methods ### Theil-Sen estimator The Theil-Sen method is robust to outliers: ```{r theil-sen-details, eval=FALSE} # Theil-Sen computes median of all pairwise slopes # For data with outliers, it is more reliable than OLS # Used by default in trend_param() and plot_trend() ``` ### Spearman correlation Non-parametric test for monotonic trends: ```{r spearman-details, eval=FALSE} # Spearman correlation tests for monotonic relationship # Does not assume linearity # p-value indicates significance of trend ``` ## Best Practices ### Choosing parameters for IQA - Include all 9 default parameters when possible - Use `na_rm = TRUE` if some parameters are missing - Adjust weights only if you have domain knowledge ### Handling censored values - Use `nd_policy = "ld2"` (default) for conservative estimates - Consider `nd_policy = "na"` if censored values should not influence results - Document your choice in reports ### Trend analysis - Use Theil-Sen for robust estimates with outliers - Require at least 4 observations per group for reliable trends - Consider seasonal effects when analyzing temporal data ### Units consistency - Ensure all parameters use standard units (mg/L, NTU, etc.) - Use `clean_units()` to convert if needed - Document unit conversions in methodology sections ## References - Carlson, R. E. (1977). A trophic state index for lakes. *Limnology and Oceanography*, 22(2), 361-369. - Lamparelli, M. C. (2004). Graus de trofia em corpos d'agua do estado de Sao Paulo: avaliacao dos metodos de monitoramento. *Tese de Doutorado, Universidade de Sao Paulo*. - CETESB. (2021). Aguas superficiais: indice de qualidade das aguas (IQA). *Companhia Ambiental do Estado de Sao Paulo*. ## Summary This vignette covered: 1. IQA calculation with default and custom weights 2. Handling missing parameters 3. IET methods for lentic systems 4. Temporal trend analysis (Theil-Sen, Spearman) 5. Parameter-specific analysis tools For workflow examples, see the "From raw water quality data to CONAMA report" vignette.