I have a list of words that I want to spell-check with hunspell, but the list may contain specialized words that hunspell does not know and that it should not correct (these words are not defined in advance, and there are too many to add by hand).
Which method could I use to solve this?
I have already tried to find and upgrade the dictionary.
Here is the list of words:
keywords <- c("Millimeter", "OMT", "Chooz",
  "DCTPC", "JEM", "EUSO",
  "EUSO", "EUSO", "PDM",
  "FPGA", "Chooz", "Cepheids",
  "Circumstellar", "Tokamak", "ASIC",
  "TiSAFT", "CoRoT", "Unes",
  "Radioastronomy", "Coronagraphy", "Fiber",
  "Ultrastable", "Puslsar", "Magnetohydrodynamic",
  "KSZ", "Gaussianity", "Raman",
  "Gravimetry", "Casimir", "transfert",
  "TES", "MEMS", "CMB",
  "CMB", "TES", "Blazar",
  "modeling", "DFB", "linewidth",
  "Asteroseismology", "ExPRES", "NDA",
  "rephasing", "Nulling", "Gyroscop",
  "Atmopsheric", "fibers", "Spectroscopie",
  "d'absorption", "Calculs", "Aluminum",
  "Transneptunian", "Planetology", "Ultrastable")
Some are genuinely misspelled, like "transfert" or "d'absorption", but others are specialized terms or acronyms. Here is the code:
# hunspell() is vectorized: it returns a list with the misspelled words found in each keyword
bad_matrix <- hunspell(keywords, dict = dict_lang)
# TRUE where hunspell flagged the keyword as misspelled
bad_index <- lengths(bad_matrix) != 0
Use dictionary() with the add_words parameter:
library("hunspell")
keywords<-c("Millimeter", "OMT","Chooz")
words <- c("OMT", "wiskey")
correct_pkg <- hunspell_check(words)
correct_custom <- hunspell_check(words, dict = dictionary("en_US", add_words=keywords))
correct_pkg
correct_custom
Output
> correct_pkg
[1] FALSE FALSE
> correct_custom
[1] TRUE FALSE
Notice how in the second case "OMT" gets accepted as a word.
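To tie this back to the question, here is a minimal sketch of checking a whole keyword vector against the extended dictionary and collecting suggestions only for the words that are still flagged. The `known_terms` whitelist and the shortened `keywords` vector are hypothetical, chosen for illustration; which of the remaining words hunspell flags depends on the installed en_US dictionary.

```r
library(hunspell)

# Hypothetical whitelist of domain terms that should never be corrected
known_terms <- c("OMT", "Chooz", "FPGA", "Tokamak")

# en_US dictionary extended with the whitelist
custom_dict <- dictionary("en_US", add_words = known_terms)

# Shortened sample of the keywords from the question
keywords <- c("Millimeter", "OMT", "Chooz", "transfert", "Atmopsheric")

# TRUE where the extended dictionary accepts the word
is_ok <- hunspell_check(keywords, dict = custom_dict)

# Words that are still flagged, with hunspell's suggestions for each
flagged <- keywords[!is_ok]
suggestions <- hunspell_suggest(flagged, dict = custom_dict)
names(suggestions) <- flagged
suggestions
```

If the whitelist is long, it could be loaded from a plain-text file (one term per line) with `readLines()` instead of being typed into the script.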