database perl flat-file

Replacing a flat-file DB with a proper database with record-level editing


I cannot install SQLite on a remote machine, so I have to find a way to store a large amount of data in some kind of database structure.

Example data

key,values...
key,values....
..

There are currently about a million rows in a 20MB flat file, and every hour I have to read through each record and value in the file and update or add records. Since it is a flat file, I have to rewrite the whole file each time.

I am looking at the Storable module, but I think it also writes data sequentially. I want to edit only those records which need to be changed.

Reading and updating random records is a requirement. Additions can go anywhere (order is not important).

Can anyone suggest something? How will I know if I can set up a native Berkeley database file on these systems, which are a mixture of Solaris and Linux?
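One way to find out which DBM back ends a given host offers is to try loading each module and see which ones succeed. The sketch below assumes only a stock Perl; the helper name `available_dbm_modules` is made up for illustration, and the candidate list is the set of DBM modules that AnyDBM_File knows about.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Candidate DBM back ends, in the order AnyDBM_File tries them by default.
my @candidates = qw(NDBM_File DB_File GDBM_File SDBM_File ODBM_File);

# Return the subset of the given module names that load successfully.
# (Hypothetical helper, not part of any library.)
sub available_dbm_modules {
    my @names = @_;
    return grep { my $module = $_; eval "require $module; 1" } @names;
}

my @found = available_dbm_modules(@candidates);
print "Available DBM modules: @found\n";
```

Running this on each Solaris and Linux host shows what is safe to rely on. DB_File is the Berkeley DB binding, and SDBM_File ships with Perl itself, so it should normally appear in the list.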

________________finally__________________

Finally I understood things better (thank you all), and based on your suggestions I used AnyDBM_File. It found NDBM_File (backed by a C library) installed on all the OSes. So far so good.
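For reference, a minimal sketch of that AnyDBM_File setup might look like the following. It is not the script that was actually run: the scratch directory, the file name `anydbm_demo`, and the variable names are assumptions made so the example is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Fcntl;                      # O_RDWR, O_CREAT
use File::Temp qw(tempdir);
use AnyDBM_File;                # loads the first DBM back end it can find

# After loading, @AnyDBM_File::ISA holds only the back end that was picked.
my $backend = $AnyDBM_File::ISA[0];
print "Back end in use: $backend\n";

# Hypothetical scratch location so the demo cleans up after itself.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/anydbm_demo";

tie my %db, 'AnyDBM_File', $file, O_RDWR|O_CREAT, 0644
    or die "Cannot open $file: $!\n";

$db{a628234} = "0.178532683639599";   # add or overwrite a single record
my $stored = $db{a628234};            # read it back by key
print "a628234 = $stored\n";

untie %db;
```

To prefer a different back end, `@AnyDBM_File::ISA` can be set in a `BEGIN` block before `use AnyDBM_File`, listing the modules in the desired order.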

Just to check how it would play out in the real world, I ran a sample script to add 1 million records (the maximum I think I may ever get in a day; normally it is between 500k and 700k). It created a 110G data file on my disk! And all the records were like:

a628234 = 0.178532683639599

I mean, my real-world records are longer than that. Compare this to the flat file, which holds 700k+ real-life records and is only 15MB on disk.
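Before giving up on the DBM file, it may be worth checking how much disk space it really occupies. ndbm-style files are hashed and can contain large holes, so on filesystems that support sparse files the size shown by `ls -l` can be much larger than the space actually allocated. Whether that explains the 110G figure here is only a guess. The sketch below compares the two numbers; `size_report` is a made-up helper, and it assumes `stat` reports blocks in 512-byte units.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return (apparent size, allocated size) in bytes for a file,
# or an empty list if the file cannot be examined.
# Assumption: st_blocks is counted in 512-byte units.
sub size_report {
    my ($path) = @_;
    my @st = stat $path or return;
    my $apparent  = $st[7];                 # st_size: what "ls -l" shows
    my $allocated = ($st[12] || 0) * 512;   # st_blocks: space actually used
    return ($apparent, $allocated);
}

for my $path (@ARGV) {
    my @sizes = size_report($path);
    if (@sizes) {
        printf "%s: apparent %d bytes, allocated %d bytes\n", $path, @sizes;
    }
    else {
        warn "Cannot stat $path: $!\n";
    }
}
```

It can be run as `perl size_report.pl mydb.pag mydb.dir` (file names hypothetical). If the allocated figure is far below the apparent one, the file is sparse and the real disk cost is the smaller number.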

I am disappointed with the slowness and bloat of this, so for now I think I will pay the price of rewriting the whole file each time an edit is required.
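A rewrite-on-every-edit approach could be sketched roughly as follows, assuming the `key,values...` line format shown earlier. The function name `rewrite_flat_file` and the shape of the updates hash are hypothetical. The file is streamed line by line into a temporary file, which then replaces the original, so a crash partway through does not leave a half-written data file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Rewrite a "key,values..." flat file, applying updates and additions.
# $updates is a hash ref mapping key => new values string (assumed format).
sub rewrite_flat_file {
    my ($path, $updates) = @_;
    my %pending = %$updates;
    my $tmp = "$path.tmp.$$";

    open my $in,  '<', $path or die "Cannot read $path: $!\n";
    open my $out, '>', $tmp  or die "Cannot write $tmp: $!\n";

    while (my $line = <$in>) {
        chomp $line;
        # Replace the record if an update is pending for its key.
        if ($line =~ /^([^,]*),/ && exists $pending{$1}) {
            my $key = $1;
            $line = "$key," . delete $pending{$key};
        }
        print {$out} "$line\n";
    }

    # Keys never seen in the file are new records; order does not matter.
    print {$out} "$_,$pending{$_}\n" for sort keys %pending;

    close $in;
    close $out or die "Cannot close $tmp: $!\n";
    rename $tmp, $path or die "Cannot replace $path: $!\n";
    return;
}
```

A call such as `rewrite_flat_file("data.txt", { a628234 => "0.25" })` (file name hypothetical) updates that key if present and appends it otherwise.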

Thanks again for all your help.


Solution

  • As suggested in the comments, you can use the SDBM_File module. For example:

    #!/usr/bin/perl 
    use strict;
    use warnings;
    use v5.14;
    
    use Fcntl;
    use SDBM_File;
    
    my $filename = "dbdb";
    
    my %h;
    
    tie %h, 'SDBM_File', $filename, O_RDWR|O_CREAT, 0666
        or die "Error: $!\n";
    
    # Populate the dbdb file. This line only needs to run once;
    # on later runs you can delete it and the output
    # will still be "16,40".
    $h{$_} = $_ * 2 . "," . $_ * 5  for 1..100;
    
    say $h{8};
    
    untie %h;
    

    Output: 16,40
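Building on the same idea, the update-or-add step from the question might be sketched like this. The helper `upsert_record`, the scratch directory, and the sample keys are assumptions for illustration. Each assignment through the tied hash touches only that record, so the whole file is not rewritten. Note that SDBM_File limits the combined length of a key and its value to 1008 bytes, which matters if real records are long.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Fcntl;
use File::Temp qw(tempdir);
use SDBM_File;

# Update the record for $key if it exists, otherwise add it.
# Returns 'updated' or 'added'. (Hypothetical helper.)
sub upsert_record {
    my ($db, $key, $values) = @_;
    my $action = exists $db->{$key} ? 'updated' : 'added';
    $db->{$key} = $values;
    return $action;
}

# Hypothetical scratch location so the demo cleans up after itself.
my $dir      = tempdir(CLEANUP => 1);
my $filename = "$dir/dbdb";

tie my %h, 'SDBM_File', $filename, O_RDWR|O_CREAT, 0666
    or die "Error: $!\n";

my @log;
push @log, upsert_record(\%h, 'a628234', '0.17');   # new key
push @log, upsert_record(\%h, 'a628234', '0.25');   # existing key
push @log, upsert_record(\%h, 'b100001', '1,2,3');  # another new key

my $final = $h{a628234};
print "Actions: @log\n";
print "a628234 = $final\n";

untie %h;
```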