I have trained a newsmap model in the Newsmap package for quanteda in R and am trying to export the large dictionary it constructed based on my corpus (not the seed dictionary). I have tried this code, but it only gives me the 10 most associated terms per country in a list format, which I also fail to extract in order to form a dictionary object I can use in R.
Dict <-coef(model)
I would really appreciate any and all help!
You only need to extract the names of the vectors with desired number of words passed to n
.
> quanteda::dictionary(lapply(coef(model, n = 1000), FUN = names))
Dictionary object with 226 key entries.
- [bi]:
- burundi, burundi's, bujumbura, burundian, nkurunziza, uprona, msd, nduwimana, hutus, tutsi, radebe, drcongo, rapporteur, elderly, mushikiwabo, generation, kayumba, faustin, hutu, olga [ ... and 980 more ]
- [dj]:
- djibouti, djibouti's, djiboutian, western-led, pretty, photo, watkins, ask, entebbe, westerners, mujahideen, salvation, osprey, persistent, horn, afdb, donors, ismael, nevis, grenade [ ... and 980 more ]
- [er]:
- eritrea, eritreans, eritrean, keetharuth, issaias, eritrea's, binnie, sheila, somaliland, catania, mandeb, brutal, sicily's, lana, horn, lampedusa, aman, afdb, donors, monitoring [ ... and 980 more ]
- [et]:
- ethiopia, ethiopian, addis, ababa, addis, ababa, hailemariam, desalegn, ethiopians, maasho, ethiopia's, mandeb, igad, dibaba, genzebe, mesfin, bekele, spla, shrikesh, laxmidas [ ... and 980 more ]
- [ke]:
- kenya, kenyan, nairobi, nairobi, uhuru, lamu, mombasa, mpeketoni, kenyans, kws, nairobi's, akwiri, ruto, westgate, kenyatta's, mombasa, makaburi, kenyatta, kenya's, ol [ ... and 980 more ]
- [km]:
- comoros, mazen, emiratis, oil-rich, canterbury, lahiya, shoukri, gender, wadia, lombok, brisbane's, entire, christiana, blahodatne, everest's, culiacan, kamensk-shakhtinsky, protestants, pk-5, parwan [ ... and 980 more ]
[ reached max_nkey ... 220 more keys ]