java memory rss db2 openjpa

Java RSS Reader with OpenJPA, many Entries in Database - Best Practices


This is my first post here. I have already found many useful hints for problems with the program I am working on right now, thanks for that! But I still have a question of my own, and this one might be a little more general.

I am building a Java program for my diploma thesis. It is an RSS reader (I'm using ROME for that), and all RSS entries are saved in a DB2 database (I am using OpenJPA as the persistence layer). All incoming entries are automatically tagged (using MAUI) and given a "relevance score" that depends on the ratings the user gave to previous entries (I am still working on that algorithm). There is a Swing GUI that lists all the feeds and the entries belonging to them; the user can see the tags, add new tags (MAUI's machine learning uses those to improve future tagging), and rate the entries.

So far I have implemented all the basic functionality and it works just fine. However, I am wondering about the performance of this program, considering that all feeds will be held both in the database and in the GUI if I follow through with my current approach.

To simplify, I have the objects Ressource and Entry. A Ressource is an RSS feed and the Entries are the individual "RSS news" items. Each Ressource has x Entries, but an Entry belongs to one Ressource; that is how I modelled it in DB2 and with the JPA annotations. At runtime I create a List of all Ressources (with "SELECT * FROM RESSOURCES" as a named query that I call through the EntityManager). From then on I can access a Ressource's entries and fill a list in the GUI with them. Fine, I like it that way: I get all the information out of the DB at the start and it is turned into Java objects. So far I have a few hundred RSS entries and the program needs about 7 MB of memory. Great.

BUT: what happens once we have tens of thousands of entries? Won't the program need far too much memory? How can I tell JPA to load only, say, 100 entries for each Ressource (when retrieving the Ressource objects with JPA), and how can I fetch more dynamically?

I know there might be ways to work around the problem by writing my own queries, but I hope you know what I mean: I want to use standard JPA functionality without having my whole database turned into objects all the time, resulting in a huge memory demand.

Thanks a lot for your help, Matthias


Solution

  • Use pagination, just like Google does, for example. Instead of loading all the entries, load the first 100, and make it possible to load the next page of 100 entries, and so on.

    See setMaxResults() and setFirstResult() in the JPA Query interface.

    Also add a search form to your GUI, because browsing through 10,000 entries to find the one you're looking for is not something anyone wants to do.
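
A minimal sketch of the pagination idea, not a definitive implementation. PageLoader is a hypothetical helper (it is not part of JPA or OpenJPA) that keeps track of the offset and asks a fetcher function for the next page. The fetcher is kept abstract so the sketch runs without a database; the comment shows how it might be wired to setFirstResult() and setMaxResults(). The JPQL string and the field names e.ressource and e.id in that comment are assumptions about the entity mapping.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.BiFunction;

/**
 * Hypothetical helper that loads results one page at a time.
 *
 * The fetcher receives (firstResult, maxResults) and returns at most
 * maxResults items. With JPA it could look like this (entity and field
 * names are assumptions, adapt them to your mapping):
 *
 *   (first, max) -> em.createQuery(
 *           "SELECT e FROM Entry e WHERE e.ressource = :r ORDER BY e.id DESC",
 *           Entry.class)
 *       .setParameter("r", ressource)
 *       .setFirstResult(first)   // offset of the first row to return
 *       .setMaxResults(max)      // page size
 *       .getResultList()
 *
 * A stable ORDER BY matters: without it, rows may repeat or go missing
 * between pages.
 */
final class PageLoader<T> {
    private final BiFunction<Integer, Integer, List<T>> fetcher;
    private final int pageSize;
    private int nextFirstResult = 0;
    private boolean exhausted = false;

    PageLoader(BiFunction<Integer, Integer, List<T>> fetcher, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.fetcher = fetcher;
        this.pageSize = pageSize;
    }

    /** Returns the next page, or an empty list once everything was loaded. */
    List<T> nextPage() {
        if (exhausted) {
            return Collections.emptyList();
        }
        List<T> page = fetcher.apply(nextFirstResult, pageSize);
        nextFirstResult += page.size();
        // A short (or empty) page means there is nothing left to fetch.
        if (page.size() < pageSize) {
            exhausted = true;
        }
        return page;
    }

    /** False once a short page has been seen. */
    boolean hasMore() {
        return !exhausted;
    }
}
```

In the GUI, a "load more" button (or a scroll listener) could call nextPage() and append the result to the list model, so only the entries the user has actually paged through are held in memory.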