I have a text file, and each line is of the form:
TAB WORD TAB PoS TAB FREQ#
Word PoS Freq
the Det 61847
of Prep 29391
and Conj 26817
a Det 21626
in Prep 18214
to Inf 16284
it Pron 10875
is Verb 9982
to Prep 9343
was Verb 9236
I Pron 8875
for Prep 8412
that Conj 7308
you Pron 6954
Would one of you regex wizards kindly assist me in isolating the WORDS from the file? I'll do a find and replace in TextPad, hopefully, and that will be that. Multiple find-and-replace passes are fine. One thing: notice that searching for "verb" would also turn up the WORD "verb," not just the part of speech, so be careful. In the end I want 1 word per line.
Thanks so much!
I think Microsoft Excel would handle this better than a regex.
Just paste the whole text into Excel and it will be split into table columns. Then select the cells in the word column and copy them into Notepad.
I bet this is the easiest path.
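If Excel instead puts each line into a single column, extract the word in a separate column with a formula like this, where C1 is the cell holding the line and maxchar stands for the number of characters to keep: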
=Trim(LEFT(C1,maxchar))
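If a script is an option, the same "pick the word column" idea can be sketched in Python. This is only a rough sketch, assuming every data line follows the TAB WORD TAB PoS TAB FREQ layout from the question. The names LINE_PATTERN and extract_words are made up for illustration, and the pattern would likely need adjusting before it works in TextPad's own find-and-replace syntax. Because it picks the word by position rather than by searching for text, a WORD such as "verb" is not confused with the part of speech.

```python
import re

# Assumed layout (from the question): TAB WORD TAB PoS TAB FREQ.
# The leading tab is optional in case some lines lack it.
# Requiring digits in the last field also skips the "Word PoS Freq" header.
LINE_PATTERN = re.compile(r"^\t?([^\t]+)\t[^\t]+\t\d+\s*$")


def extract_words(lines):
    """Return the WORD field of every line that matches the assumed layout."""
    words = []
    for line in lines:
        match = LINE_PATTERN.match(line.rstrip("\n"))
        if match:
            words.append(match.group(1))
    return words


if __name__ == "__main__":
    # A few lines taken from the sample in the question.
    sample = [
        "\tWord\tPoS\tFreq\n",
        "\tthe\tDet\t61847\n",
        "\tto\tInf\t16284\n",
        "\tto\tPrep\t9343\n",
    ]
    # Prints one word per line.
    print("\n".join(extract_words(sample)))
```

To run it on a real file, pass the open file object to extract_words and write the joined result to a new file. Lines that do not fit the layout are silently dropped, so it is worth comparing line counts afterwards.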