I'm looking for a free or open source rhyming database.
I've found the CMU pronunciation "database" and its series of apps but I can't make sense of them or figure out where the data's coming from.
A simple text file with the word and its phonemes is all I need.
Does anybody here know where I'd find one or where I would begin to derive such a list from the CMU files?
The cmudict is a text file and it's format is really simple. First, the word is listed. Then, there are two spaces. Everything following the two spaces is the pronunciation. Where a word may have two different ways of being spoken you will see two entries for the word like
word
word(1)
At the beginning of the file they've listed symbols and punctuation. The symbol is followed by the english spelling of said symbols name with no space between them. This is then followed by the two space divider and the arpabet code. Since you're only looking for rhymes you don't have to do anything special with the symbols section since you're never going to be looking for a rhyme to ...ELLIPSIS
The information about how ARPAbet codes map to IPA is listed in wikipedia http://en.wikipedia.org/wiki/Arpabet and each mapping shows example words. It's pretty easy to see how the two relate to one another and that may help you to understand how to read the ARPAbet codes if you are familiar with IPA.
Basically, if you've already found the cmudict then you've already got what you asked for: a database of words and their pronunciations. To find words that rhyme you'll have to parse the flat file into a table and run a query to find words that end with the same ARPAbet code.
Once you've got the data into whatever kind of database you choose, you can then use that database to find correlations between the arpabet codes. You could find rhymes, consonance, assonance, and other mnemonic devices. It would go something like
I got bored and wrote a Node.js module that covers "Part: Stuff" listed above. If you've got Node.js installed on your machine you can get the module by running npm install cmudict-to-sqlite
See https://npmjs.org/package/cmudict-to-sqlite for the README or just look in the module for docs.