Search code examples
windowsdatabaseperllan

How can I manipulate a local database with Perl?


I'm a Perl programmer with some nice scripts that go fetch HTTP pages (from a text file-list of URLs) with cURL and save them to a folder.

However, the number of pages to get is in the tens of millions. Sometimes the script fails on number 170,000 and I have to start the script again manually. It automatically reads the URL and sees if there is a page downloaded and skips. But, with a few hundred thousand, it still takes a few hours to skip back up to where it left off. Obviously, this is not going to pan out in the end.

I've been told that instead of saving to a text file, which is hard to search and modify, I need to use a database. I don't know much about databases, just messed around with MySQL on a school server a year ago. I just need the ability to add millions of rows and a few static columns, search/modify one quickly, and do this all locally on a lan (or a single computer if that's difficult). And of course, I need to access this database using perl.

Where should I start? What do I need to download to get a server started on Windows? Which Perl modules should I use? (I'm using an ActiveState distro)


Solution

  • Since you only need to search on one column, you may wish to consider a key/value store database like the Berkeley DB by using either BerkeleyDB or DB_File.

    Generally, you can think of these key/value databases as being Perl hashes that operate from a disk rather than memory. Exact key look ups are very fast. Everything else requires scanning the whole dataset.