Clean BIEN Trait Data — clean_trait_data

Cleans and processes plant trait data from BIEN by handling missing values, removing duplicates, computing mean trait values per species, and optionally removing outliers.

Usage

clean_trait_data_BIEN(
  data,
  remove_outliers = FALSE,
  outlier_threshold = 3,
  author_info = FALSE
)

Arguments

data: A dataframe containing plant trait data. Must include the columns: scrubbed_species_binomial, trait_name, trait_value, unit, method, and url_source.
remove_outliers: Logical. If TRUE, removes outliers based on a specified threshold. Default is FALSE.
outlier_threshold: Numeric. The number of standard deviations from the mean to classify a value as an outlier. Default is 3.
author_info: Logical. If TRUE, includes authorship and contact information for the collected data in the output. Default is FALSE.
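
The rule described for remove_outliers and outlier_threshold can be illustrated with a minimal base R sketch. The helper is_outlier() is hypothetical and not part of the package, and whether clean_trait_data_BIEN applies the threshold per species, per trait, or across the whole dataset is not stated here, so the sketch simply works on a single numeric vector.

```r
# Hypothetical helper (not part of the package): flags values that lie more
# than `threshold` standard deviations away from the mean of `x`.
is_outlier <- function(x, threshold = 3) {
  z <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
  flagged <- abs(z) > threshold
  # Undefined z-scores (missing values, a single value, zero spread)
  # are treated as "not an outlier".
  flagged[is.na(flagged)] <- FALSE
  flagged
}

# Twelve similar measurements and one extreme value.
values <- c(rep(10, 10), 11, 9, 100)
flags <- is_outlier(values, threshold = 3)

# Keep only the values that were not flagged.
kept <- values[!flags]
```

With small samples a threshold of 3 rarely flags anything, because a single value cannot sit far from a mean and standard deviation that it dominates, so a lower outlier_threshold may be worth considering when each group has few records.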

Value

A cleaned dataframe with the columns: scrubbed_species_binomial and mean_trait_value. When author_info = TRUE, the optional columns project_pi and project_pi_contacts are also included.
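
As a rough picture of the shape of the returned value, the per-species mean can be approximated in base R with aggregate(). This is a sketch under the assumption that the mean is taken over trait_value grouped by scrubbed_species_binomial. It is not the function's actual implementation and skips the missing-value, duplicate, and outlier handling.

```r
# Small illustrative input using two of the documented columns.
traits <- data.frame(
  scrubbed_species_binomial = c("Quercus robur", "Quercus robur", "Pinus sylvestris"),
  trait_value = c(20, 22, 5)
)

# One row per species, holding the mean of its trait values.
species_means <- aggregate(
  trait_value ~ scrubbed_species_binomial,
  data = traits,
  FUN = mean
)

# Rename the value column to match the documented output column.
names(species_means)[2] <- "mean_trait_value"

print(species_means)
```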

Examples

if (FALSE) { # \dontrun{
data <- data.frame(
  scrubbed_species_binomial = c("Quercus robur", "Quercus robur", "Pinus sylvestris"),
  trait_name = c("Leaf Area", "Leaf Area", "Needle Length"),
  trait_value = c(20, 22, 5),
  unit = c("cm2", "cm2", "cm"),
  method = c("measurement", "measurement", "observation"),
  url_source = c("source1", "source2", "source3")
)

cleaned_data <- clean_trait_data_BIEN(data, remove_outliers = FALSE)

print(cleaned_data)
} # }