2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,7 +1,7 @@
Package: BioMonTools
Type: Package
Title: Biomonitoring and Bioassessment Calculations
Version: 1.2.4.9008
Version: 1.2.4.9012
Authors@R: c(
person("Erik W.", "Leppo",
email="Erik.Leppo@tetratech.com",
4 changes: 4 additions & 0 deletions NAMESPACE
@@ -14,6 +14,10 @@ export(metric.values.fish)
export(metvalgrpxl)
export(qc.checks)
export(qc_taxa)
export(qc_taxa_match_official)
export(qc_taxa_values_ffg)
export(qc_taxa_values_habit)
export(qc_taxa_values_tolval)
export(rarify)
export(taxa_translate)
importFrom(rlang,.data)
21 changes: 20 additions & 1 deletion NEWS
@@ -4,10 +4,29 @@ NEWS

<!-- NEWS.md is generated from NEWS.Rmd. Please edit that file -->

#> Last Update: 2026-03-18 20:26:19.353336
#> Last Update: 2026-03-18 22:51:30.206983

# Version History

## Changes in version 1.2.4.9012 (2026-03-18)

- feature: Add qc_taxa_values_tolval function

## Changes in version 1.2.4.9011 (2026-03-18)

- refactor: Add default column name to qc_taxa_values_ffg
- feature: Add qc_taxa_values_habit function

## Changes in version 1.2.4.9010 (2026-03-18)

- feature: Add qc_taxa_values_ffg function

## Changes in version 1.2.4.9009 (2026-03-18)

- deprecate: Change qc_taxa to qc_taxa_match_official
- Will be removed in a future version
- refactor: Add qc_taxa_match_official and update with new name

## Changes in version 1.2.4.9008 (2026-03-18)

- test: Add test for metric.values for collapsing, bugs and fish, Issue
21 changes: 20 additions & 1 deletion NEWS.md
@@ -4,10 +4,29 @@ NEWS

<!-- NEWS.md is generated from NEWS.Rmd. Please edit that file -->

#> Last Update: 2026-03-18 20:26:19.353336
#> Last Update: 2026-03-18 22:51:30.206983

# Version History

## Changes in version 1.2.4.9012 (2026-03-18)

- feature: Add qc_taxa_values_tolval function

## Changes in version 1.2.4.9011 (2026-03-18)

- refactor: Add default column name to qc_taxa_values_ffg
- feature: Add qc_taxa_values_habit function

## Changes in version 1.2.4.9010 (2026-03-18)

- feature: Add qc_taxa_values_ffg function

## Changes in version 1.2.4.9009 (2026-03-18)

- deprecate: Change qc_taxa to qc_taxa_match_official
- Will be removed in a future version
- refactor: Add qc_taxa_match_official and update with new name

## Changes in version 1.2.4.9008 (2026-03-18)

- test: Add test for metric.values for collapsing, bugs and fish, Issue
19 changes: 19 additions & 0 deletions NEWS.rmd
@@ -18,6 +18,25 @@ cat(paste0("Last Update: ",Sys.time()))

# Version History

## Changes in version 1.2.4.9012 (2026-03-18)

* feature: Add qc_taxa_values_tolval function

## Changes in version 1.2.4.9011 (2026-03-18)

* refactor: Add default column name to qc_taxa_values_ffg
* feature: Add qc_taxa_values_habit function

## Changes in version 1.2.4.9010 (2026-03-18)

* feature: Add qc_taxa_values_ffg function

## Changes in version 1.2.4.9009 (2026-03-18)

* deprecate: Change qc_taxa to qc_taxa_match_official
+ Will be removed in a future version
* refactor: Add qc_taxa_match_official and update with new name

## Changes in version 1.2.4.9008 (2026-03-18)

* test: Add test for metric.values for collapsing, bugs and fish, Issue #131
249 changes: 8 additions & 241 deletions R/qc_taxa.R
@@ -1,56 +1,10 @@
#' Quality Control Check on User Data Against Master Taxa List
#'
#' This function compares the user's data frame to a data frame with the
#' official (or user supplied) master taxa list (benthic macroinvertebrates).
#' This function has been deprecated (March 2026).
#'
#' Output is a data frame with matches.
#' The new function is qc_taxa_match_official.
#'
#' Messages are output to the console with the number of matches and which user
#' taxa did not match the official list.
#'
#' The official list is stored online but the user can input their own saved
#' copy.
#'
#' Any columns in the user input file that match the official master taxa list
#' will be renamed with the "_NonOfficial" suffix.
#'
#' New/different taxa in the user data are handled by the 'useOfficialTaxaInfo'
#' parameter. For taxa that did not match the master taxa list the user has
#' options on how to handle the differences for the phylogeny (e.g., columns for
#' phylum, class, family, etc.) and autecology (e.g., columns for FFG, habit,
#' tolerance value, etc.). The options are below.
#'
#' * only_official = use only official master taxa information. Any
#' non-matching taxa will not have any master taxa information.
#'
#' * only_user = only use the information provided by the user. Information
#' from the 'Official' will not be used. This should only be used for
#' non-official calculations.
#'
#' * add_new = hybrid approach that uses official master taxa information, when
#' present, but includes user information for non-matching taxa if the column
#' names match.
#'
#' Default master taxa lists are saved as CSV files online at:
#'
#' https://github.com/leppott/MBSStools_SupportFiles
#'
#' The files can be downloaded with the following code.
#'
#' **Benthic Macroinvertebrate**
#'
#' url_mt_bugs <- "https://github.com/leppott/MBSStools_SupportFiles/raw/master/Data/CHAR_Bugs.csv"
#' df_mt_bugs <- read.csv(url_mt_bugs)
#'
#' The master taxa files are periodically updated. Update dates will be logged
#' on the GitHub repository.
#'
#' Expected fields include:
#'
#' **Benthic Macroinvertebrates**
#'
#' + TAXON, Phylum, Class, Order, Family, Genus, Other_Taxa, Tribe, FFG,
#' FAM_TV, Habit, FinalTolVal07, Comment
#' This function exists only as a wrapper to avoid breaking older code.
#'
#' @param DF_User User taxa data.
#' @param DF_Official Official master taxa list. Can be a local file or
@@ -94,198 +48,11 @@ qc_taxa <- function(DF_User,
DF_Official = NULL,
fun.Community = NULL,
useOfficialTaxaInfo = "only_Official") {
##FUNCTION ~ mastertaxa ~START
#
boo_DEBUG <- FALSE
if(boo_DEBUG==TRUE){##IF~boo_DEBUG~START
# # # Bugs
# DF_User<- taxa_bugs_genus
# DF_Official = NULL
# fun.Community = "bugs"
# useOfficialTaxaInfo = "only_Official"
# #
}##IF~boo_DEBUG~END

# Col Suffixes
sfx_Official <- "_Official"
sfx_NonOfficial <- "_NonOfficial"

# QC
## inputs as data frames (just in case have a tibble)
DF_User <- data.frame(DF_User)
# DF_Official handled when checking URL
## Community, convert community to lowercase
fun.Community <- tolower(fun.Community)

# Taxa list, official
# run the proper sub function
if (fun.Community == "bugs") {##IF.START
url_mt <- "https://github.com/leppott/MBSStools_SupportFiles/raw/master/Data/CHAR_Bugs.csv"
col_mt <- c("Taxon",
"Phylum",
"Class",
"Order",
"Family",
"Genus",
"Other_Taxa",
"Tribe",
"FFG",
"FAM_TV",
"Habit",
"FinalTolVal07",
"Comment")
col_taxon <- col_mt[1]
# } else if(fun.Community == "fish"){
# url_mt <- "https://github.com/leppott/MBSStools_SupportFiles/raw/master/Data/CHAR_Fish.csv"
# col_mt <- c("SPECIES", "TYPE", "PTOLR", "NATIVE", "TROPHIC", "SILT"
# , "PIRHALLA","DATE.ADDED", "REASON", "SOURCE", "FAM", "GENUS"
# , "SP_SCI", "IN_KEY", "APPROX_ID" )
# col_taxon <- col_mt[1]
# future functionality
} else {
msg <- "The only valid value for fun.Community is 'bugs'."
stop(msg)
}##IF ~ fun.community ~ END

# Master Taxa
# Download "official" list if none provided
if(is.null(DF_Official)){
# 404 Error if file not found
df_mt <- utils::read.csv(url_mt)
} else {
df_mt <- data.frame(DF_Official)
}## IF ~ is.null(DF_Official) ~ END

# Names to upper case
names(DF_User) <- toupper(names(DF_User))
names(df_mt) <- toupper(names(df_mt))
# col_mt <- toupper(col_mt)
col_taxon <- toupper(col_taxon)

# QC check for col_taxon
if (!col_taxon %in% names(DF_User)) {
stop(paste0("DF_User missing column; ", col_taxon))
} ## IF, stop

# taxa names to ALL CAPS for bugs and fish
DF_User[, col_taxon] <- toupper(DF_User[, col_taxon])

# Check Numbers
taxa_user <- sort(unique(DF_User[, col_taxon]))
taxa_user_n <- length(taxa_user)
boo_taxa_match <- taxa_user %in% df_mt[, col_taxon]
sum_taxa_match <- sum(boo_taxa_match)
taxa_nonmatch <- taxa_user[!boo_taxa_match]
# Output to Console
msg <- paste0("Taxa match, ", sum_taxa_match, " / ", taxa_user_n)
message(msg)
# Inform user of the non-matches
if(sum_taxa_match != taxa_user_n){
n_nonmatch <- taxa_user_n - sum_taxa_match
str_tax <- ifelse(n_nonmatch == 1, "taxon", "taxa")
msg_1 <- paste0("The following user ",
str_tax,
" (",
n_nonmatch,
"/",
taxa_user_n,
") did not match the master list.\n")
msg_2 <- paste0(taxa_nonmatch, collapse = "\n")
message(paste0(msg_1, msg_2))
}##IF ~ non-matches ~ END



# Merge and Munge Columns
## Columns
# col_mt_nonTaxon <- col_mt[!(col_mt %in% col_taxon)]
# col_mt_nonOfficial <- paste0(col_mt_nonTaxon, sfx_NonOfficial)
# boo_col_match <- colnames(DF_User) %in% col_mt_nonTaxon
# col_mod <- colnames(DF_User)[boo_col_match]
## Rename matching columns before merge
#names(DF_User)[boo_col_match] <- paste0(names(DF_User)[boo_col_match]
# , "_NonOfficial")
# more control than using suffixes in merge()
#
## Merge
# df_merge <- merge(DF_User, df_mt
# , by = col_taxon
# , all.x = TRUE)
## Munge Cols
if(useOfficialTaxaInfo == "only_Official"){
# Do Nothing
# leave in "_NonOfficial" columns
df_result <- merge(DF_User, df_mt,
by = col_taxon,
all.x = TRUE,
suffixes = c(sfx_NonOfficial, ""))

#names(df_result) <- gsub(".x$", "", names(df_result))

# df_result <- dplyr::left_join(DF_User, df_mt
# , by = col_taxon
# , suffix = c(sfx_NonOfficial, ""))

} else if(useOfficialTaxaInfo == "only_user"){
# Reverse and keep _NonOfficial and remove official field
# # Remove Official Cols
# col_keep <- !(names(df_merge) %in% col_mod)
# df_result <- df_merge[, col_keep]
# # Revert "_NonOfficial"
# names(df_result) <- gsub("_NonOfficial$", "", names(df_result))

df_result <- merge(DF_User, df_mt,
by = col_taxon,
all.x = TRUE,
suffixes = c("", sfx_Official))


# df_result <- dplyr::left_join(DF_User, df_mt
# , by = col_taxon
# , suffix = c("", sfx_Official))

} else if(useOfficialTaxaInfo == "add_new"){
# add user info for new taxa to official columns
# df_result <- df_merge
# df_merge[df_merge[, col_taxon] == taxa_nonmatch, col_mod] <-
# df_merge[df_merge[, col_taxon] == taxa_nonmatch, paste0(col_mod
# , "_NonOfficial")]

df_result <- merge(DF_User, df_mt,
by = col_taxon,
all.x = TRUE,
suffixes = c(sfx_NonOfficial, ""))

# df_result <- dplyr::left_join(DF_User, df_mt
# , by = col_taxon
# , suffix = c(sfx_NonOfficial, ""))

col_match_y <- names(df_result)[grepl(paste0(sfx_NonOfficial,"$")
, names(df_result))]
col_match_x <- gsub(paste0(sfx_NonOfficial,"$"), "", col_match_y)
df_result[df_result[, col_taxon] %in% taxa_nonmatch, col_match_x] <-
df_result[df_result[, col_taxon] %in% taxa_nonmatch, col_match_y]

} else {
# Stop if wrong values
msg <- "Valid values for useOfficialTaxaInfo are
'only_Official', 'only_user', or 'add_new'."
stop(msg)
}

# QC
## Missing Columns

## Valid values
# Bugs = "FFG", "FAM_TV", "Habit", "FinalTolVal07"
# Fish = TYPE, PTROLR, TROPHIC

# Other columns for metric calculation
# Bugs = EXCLUDE, STRATA_R
# Fish =


# Output
return(df_result)
.Deprecated("qc_taxa_match_official")
qc_taxa_match_official(DF_User,
DF_Official,
fun.Community,
useOfficialTaxaInfo)
#
}##FUNCTION ~ qc_taxa ~ END
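
The wrapper pattern used here can be sketched in isolation. This is a minimal illustration, not the package's code: `new_check` and `old_check` are hypothetical stand-ins for `qc_taxa_match_official` and `qc_taxa`, and the arithmetic body is a placeholder. It assumes only base R's `.Deprecated()`, which signals a warning that names the replacement function.

```r
# Hypothetical replacement function (stand-in for the renamed function).
# The body is a placeholder so the example is self-contained.
new_check <- function(x, scale = 1) {
  x * scale
}

# Deprecated wrapper: keep the old name and signature, warn the caller,
# then forward every argument to the replacement.
old_check <- function(x, scale = 1) {
  # The argument is the name of the NEW function, not the old one.
  .Deprecated("new_check")
  # Call the NEW function; calling old_check() here would recurse forever.
  new_check(x, scale)
}

# Older code keeps working; the warning is caught and reported here
# so the returned value can still be inspected.
res <- withCallingHandlers(
  old_check(2, scale = 3),
  warning = function(w) {
    message("caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
```

Two details matter in a wrapper like this: the string passed to `.Deprecated()` should be the new function's name so the warning points users to the replacement, and the forwarding call must target the new function rather than the wrapper itself.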