I am using the textstem package to lemmatise words in some responses. However, there is one word ("spotting") which I do not want to be reduced to "spot". I want it to remain as "spotting". How can I do this? Do I need to make a custom dictionary? I am currently doing:
lemmatize_strings(df, dictionary = lexicon::hash_lemmas)
You can create your own dictionary by removing the token "spotting" from the default one:
# hash_lemmas is a data.table, so you can refer to the column as token instead of hash_lemmas$token
my_lex <- lexicon::hash_lemmas[!token == "spotting", ]
df_lemmatized <- lemmatize_strings(df, dictionary = my_lex)
Or, if you want to do it without storing your own lexicon as a separate object:
df_lemmatized <- lemmatize_strings(df, dictionary = lexicon::hash_lemmas[!token == "spotting", ])
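To check that the filtered dictionary behaves as intended, a minimal sketch along these lines may help. It assumes the responses are held in a character vector; the `responses` vector and its contents are made up for illustration and are not from the question. The subsetting uses base-R style indexing on the `token` column, which should give the same result as the data.table form above.

```r
library(textstem)

# Copy of the default lemma table without the "spotting" row.
# Indexing with a logical vector avoids relying on data.table column syntax.
lemmas <- lexicon::hash_lemmas
my_lex <- lemmas[lemmas$token != "spotting", ]

# Hypothetical example responses (not from the original question)
responses <- c("we were spotting birds", "the dogs were running")

# Lemmatise with the filtered dictionary; "spotting" has no entry any more,
# so it should be left as it is while other words are still reduced.
lemmatized <- lemmatize_strings(responses, dictionary = my_lex)
print(lemmatized)
```

If the responses live in a data frame column, passing that column (for example a hypothetical `df$response`) instead of the whole data frame is probably the safer input.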