I want to calculate tf-idf weights. To find idf, I need a large database of different documents, and then I have to build another database with the columns (word, count). So my question is: where can I find a recent database of idf (or count) coefficients for words? Many search engines use such a database, so maybe it is possible to find one on the Internet for different languages? I don't want to build this database myself.
idf is Inverse Document Frequency. In other words, the frequency of the term goes in the denominator. So what you want are word frequency tables. Wiktionary:Frequency lists should get you started. Keep in mind that these lists treat inflected forms of a word as the same word, e.g. be, is, am, are, ...
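Once you have a table of counts, turning it into idf values is a one-liner. Here is a minimal sketch in Python, assuming the common definition idf = log(N / df), where df is the number of documents containing the word and N is the total number of documents. The function names and the numbers are made up for illustration. Note that a word frequency list gives occurrence counts rather than document counts, so plugging it in here is only an approximation.

```python
import math


def idf_from_counts(doc_counts, total_docs):
    """Return {word: idf} using idf = log(total_docs / df).

    doc_counts maps each word to the number of documents containing it.
    Words with a zero count are skipped to avoid division by zero.
    """
    return {
        word: math.log(total_docs / df)
        for word, df in doc_counts.items()
        if df > 0
    }


def tf_idf(tokens, idf, default_idf=0.0):
    """Return {token: tf * idf} for one tokenized document.

    tf is the relative frequency of the token in the document.
    Tokens missing from the idf table get default_idf (an assumption;
    pick whatever fallback suits your data).
    """
    counts = {}
    for tok in tokens:
        counts[tok] = counts.get(tok, 0) + 1
    n = len(tokens)
    return {tok: (c / n) * idf.get(tok, default_idf) for tok, c in counts.items()}


# Illustrative, made-up counts: "the" appears in every document,
# "database" in 10 out of 1000.
doc_counts = {"the": 1000, "database": 10}
idf = idf_from_counts(doc_counts, total_docs=1000)
weights = tf_idf(["the", "database", "the"], idf)
```

With these numbers, "the" gets an idf of zero because it appears in every document, so its tf-idf weight vanishes no matter how often it occurs, while the rarer "database" keeps a positive weight.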