I am trying to help some friends create a formatted "checklist" of plant species found in our state.
The data looks like this (except there are over 3,000 taxa):
dat<- as.data.frame(cbind(clade = c("Clade x", "Clade x", "Clade x", "Clade y", "Clade y", "Clade z", "Clade z"),
family = c("FAMILY A", "FAMILY A", "FAMILY B", "FAMILY C", "FAMILY C", "FAMILY D", "FAMILY E"),
taxon = c("Juniperus osteosperma", "Ephedra viridis", "Achillea millefolium", "Artemisia tridentata var. tridentata", "Iva axillaris", "Pleiacanthus spinosus", "Packera multilobata"),
life_history = c("tree", "shrub", "forb", "shrub", "forb", "forb", "forb"),
County = c("All counties", "WP", "WP, CK", "EU, WP, WA", "CK", "DG", "DG, CC"),
non.native = c("", "", "Non-native", "", "Non-native", "", "")))
I would like to write this out to a Word document, where each row becomes its own entry and the entries are grouped by clade and then by family. I would also like to format certain parts of the output strings (for example, all words except "var." in taxon should be italicized).
The output I am looking for would be something like this:
I was able to combine columns needed for each entry into a string using this:
entry<- paste0(dat$taxon, ". ",
dat$life_history, ". ",
dat$County, ". ",
ifelse(!is.na(dat$non.native), dat$non.native, ""))
entry
I have tried to use dplyr to group by clade and family and a for-loop to get separate "entries" for each row, but can't seem to make the for-loop recognize the groupings.
dat %>% group_by(clade, family) %>%
for (clade in unique(dat$clade)) {
cat(glue::glue("\n\n# {clade} \n \n "))
for(family in unique(dat$family)) {
cat(glue::glue("\n\n# {family} \n \n "))
for(entry in unique(dat$entry)) {
cat(glue::glue("{entry} \n \n"))
}
}
}
This results in an error: 4 arguments passed to 'for' which requires 3. If I delete the group_by line, I get output where every family and every entry is repeated for every clade, not just the ones that belong together.
How do I get it to print only the things that actually belong to each group?
Here's a for loop approach similar to your attempt.
First, let's clean up the entry definition using glue and add it as a column to the data. Note I also wrap the taxon in *s so it will be italicized.
library(dplyr)

# Build one Markdown string per row; the asterisks italicize the taxon
dat = mutate(dat,
  entry_glue = glue::glue(
    "*{taxon}*. {life_history}. {County}. {non.native}"
  )
)
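The question also asks for "var." to stay upright, which wrapping the whole taxon in asterisks does not do. One possible way to handle that is sketched below. The helper name `italicize_taxon` and the set of rank abbreviations in `rank_pattern` are my own assumptions, not something from the question, so adjust them to whatever ranks appear in the real data.

```r
# Hypothetical helper: italicize a taxon name in Markdown while leaving
# rank abbreviations (assumed here: "var.", "subsp.", "f.") upright.
italicize_taxon <- function(taxon, rank_pattern = "var\\.|subsp\\.|f\\.") {
  # Match a rank abbreviation surrounded by single spaces
  pattern <- paste0(" (", rank_pattern, ") ")
  # Close the italics before the rank and reopen them after it,
  # then wrap the whole name in asterisks
  paste0("*", gsub(pattern, "* \\1 *", taxon), "*")
}

italicize_taxon(c("Iva axillaris", "Artemisia tridentata var. tridentata"))
```

The result could then be used in place of `*{taxon}*` when building the entry string, for example `glue::glue("{italicize_taxon(taxon)}. {life_history}. ...")`.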
The trick to making this for loop work is to subset the data at each level, so that we're not looping through every entry in the whole data frame for every family and for every clade. (I also made the clades second level headers with ## and the families third level headers with ###. I don't think Stack Overflow formats 2nd and 3rd level headers differently, but Word does and I think they will look better this way.)
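# In an R Markdown chunk, set results = "asis" so the headers are rendered
for (clade_i in unique(dat$clade)) {
  cat(glue::glue("\n\n## {clade_i} \n \n "))
  # Keep only the rows that belong to this clade
  clade_dat = filter(dat, clade == clade_i)
  for (family_i in unique(clade_dat$family)) {
    cat(glue::glue("\n\n### {family_i} \n \n "))
    # Keep only the rows that belong to this family within the clade
    family_dat = filter(clade_dat, family == family_i)
    for (entry_i in unique(family_dat$entry_glue)) {
      cat(glue::glue("{entry_i} \n \n"))
    }
  }
}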
Result:
Juniperus osteosperma. tree. All counties.
Ephedra viridis. shrub. WP.
Achillea millefolium. forb. WP, CK. Non-native
Artemisia tridentata var. tridentata. shrub. EU, WP, WA.
Iva axillaris. forb. CK. Non-native
Pleiacanthus spinosus. forb. DG.
Packera multilobata. forb. DG, CC.
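If you would rather build the Markdown as a character vector instead of printing it with cat(), the same subsetting idea can be wrapped in a function. This is only a sketch in base R: `checklist_markdown` is a name I made up, and it assumes the data has `clade`, `family`, and `entry_glue` columns as above.

```r
# Hypothetical helper: return the checklist as Markdown lines, grouped by
# clade (## headers) and then family (### headers).
checklist_markdown <- function(dat) {
  out <- character(0)
  for (clade_i in unique(dat$clade)) {
    out <- c(out, paste("##", clade_i), "")
    # Subset to the current clade before looking at families
    clade_dat <- dat[dat$clade == clade_i, ]
    for (family_i in unique(clade_dat$family)) {
      out <- c(out, paste("###", family_i), "")
      # Subset again so only this family's entries are emitted
      family_dat <- clade_dat[clade_dat$family == family_i, ]
      # Interleave each entry with a blank line so each is its own paragraph
      out <- c(out, rbind(as.character(family_dat$entry_glue), ""))
    }
  }
  out
}

# Small demo using three rows from the question's data
demo <- data.frame(
  clade = c("Clade x", "Clade x", "Clade y"),
  family = c("FAMILY A", "FAMILY B", "FAMILY C"),
  entry_glue = c("*Juniperus osteosperma*. tree. All counties.",
                 "*Achillea millefolium*. forb. WP, CK. Non-native",
                 "*Iva axillaris*. forb. CK. Non-native")
)
cat(checklist_markdown(demo), sep = "\n")

# To save the result, something like writeLines(checklist_markdown(dat),
# "checklist.md") should work; the file name is an assumption. The .md file
# can then be converted to Word with pandoc, if it is installed.
```

Returning a vector rather than printing makes the output easier to inspect or post-process before it goes into the document.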