Tags: database, out-of-memory, low-memory, crate

Crate - What is the minimum memory requirement for a node host?


I can find cheap VPS hosts with 128 MB RAM, and I wonder whether that is enough to run a Crate node for a tiny database, initially for testing. (I'm not looking for the recommended amount of memory, but for the minimum needed to avoid out-of-memory exceptions. Crate would be the only service on the node.)


Solution

  • It is possible to run Crate in such an environment. I wouldn't recommend it, though. In any case you need to take a few precautions:

    1. Select a lean Linux distribution that actually boots and runs with such a small memory footprint. Alpine might be one choice.
    2. Install Java. You need at least openjdk7 (update 55 and up).
    3. Install and start Crate from the tarball as explained on the Crate website.

    On a virtual machine with 128 MB RAM running Alpine 3.3, I installed openjdk8-jre (you have to enable the community repository in /etc/apk/repositories). I downloaded the Crate 0.54.7 tarball and simply extracted it. I set CRATE_HEAP_SIZE=64m, since half of the available memory is the recommended value.
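
    The half-of-RAM rule can also be computed instead of typed in by hand. The following is a minimal sketch, assuming a Linux host that exposes /proc/meminfo; the helper name heap_mb_from_kb is made up for illustration, and only CRATE_HEAP_SIZE comes from the setup described here.

```shell
#!/bin/sh
# Hypothetical helper: take a memory size in kB and print half of it in MB.
heap_mb_from_kb() {
    # kB / 1024 gives MB, / 2 gives half; integer division rounds down.
    echo $(( $1 / 2048 ))
}

# MemTotal is reported in kB on Linux; skip quietly on other systems.
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    CRATE_HEAP_SIZE="$(heap_mb_from_kb "$total_kb")m"
    export CRATE_HEAP_SIZE
    echo "CRATE_HEAP_SIZE=$CRATE_HEAP_SIZE"
fi
```

    On a machine that reports 120472 kB of total memory this gives 58m, slightly below the 64m chosen by hand for a nominal 128 MB machine.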

    I created a table "demo"

    DROP TABLE IF EXISTS demo;
    CREATE TABLE demo (
        data string
    );
    

    and filled it with 10,000 records, each a random string of about 10 KB, using a slow bash script that generated every string like this:

    head -c7380 /dev/urandom | uuencode - | grep ^M | tr -d '\n\047'
    

    This took a few minutes (about 20 records/s); with bulk inserts it should be much faster and take only seconds.
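
    A bulk variant of the load could look roughly like the sketch below. It is not the script used for the measurements above. It assumes that Crate's HTTP endpoint is reachable at http://localhost:4200/_sql and accepts a JSON body with "stmt" and "bulk_args". It also swaps uuencode for base64, so that the generated strings contain no quotes or backslashes and need no JSON escaping. The function names and the batch size of 100 are arbitrary choices for illustration.

```shell
#!/bin/sh
# Assumption: Crate listens on its default HTTP port on this host.
CRATE_URL="${CRATE_URL:-http://localhost:4200/_sql}"

# Print one random string of exactly 10000 characters.
# 7500 random bytes encode to 7500 / 3 * 4 = 10000 base64 characters.
random_record() {
    head -c 7500 /dev/urandom | base64 | tr -d '\n'
}

# Print a JSON bulk request with $1 rows: one statement, many argument lists.
bulk_payload() {
    n=$1
    i=0
    printf '{"stmt": "INSERT INTO demo (data) VALUES (?)", "bulk_args": ['
    while [ "$i" -lt "$n" ]; do
        if [ "$i" -gt 0 ]; then
            printf ','
        fi
        printf '["%s"]' "$(random_record)"
        i=$((i + 1))
    done
    printf ']}\n'
}

# Send 10,000 records as 100 requests of 100 rows each (about 1 MB per request).
load_demo() {
    batch=0
    while [ "$batch" -lt 100 ]; do
        bulk_payload 100 |
            curl -sS -X POST "$CRATE_URL" \
                 -H 'Content-Type: application/json' \
                 --data-binary @- > /dev/null
        batch=$((batch + 1))
    done
}
```

    Calling load_demo sends the requests; nothing is sent when the functions are merely defined. Keeping each request body near 1 MB is a cautious guess for a 64 MB heap, not a measured optimum.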

    The net amount of data was about 100 MB, which took up 287 MB gross on disk (roughly 2.9 times the raw size) as reported by the admin UI.

    The operating system, the installed software, and the data together occupied 820 MB on disk.

    I configured swap space of twice the amount of RAM and got the following footprint (the Crate process itself, without data, takes up about 40 MB):

    # free
                 total       used       free     shared    buffers     cached
    Mem:        120472     117572       2900          0        652       6676
    -/+ buffers/cache:     110244      10228
    Swap:       240636     131496     109140
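
    The second row of this output is derived from the first: buffers and cache are subtracted from "used" and added to "free". A short sketch that redoes the arithmetic with the values copied from the output above (all in kB; the variable names are made up for illustration):

```shell
#!/bin/sh
# Values in kB, copied from the free output above.
used_kb=117572
free_kb=2900
buffers_kb=652
cached_kb=6676
swap_used_kb=131496

# Memory held by processes, excluding buffers and page cache.
used_by_apps=$((used_kb - buffers_kb - cached_kb))
# Memory that could be handed to processes if the caches were dropped.
available=$((free_kb + buffers_kb + cached_kb))
# Rough total demand: process memory in RAM plus what was pushed to swap.
demand=$((used_by_apps + swap_used_kb))

echo "used by applications: ${used_by_apps} kB"
echo "effectively free:     ${available} kB"
echo "RAM plus swap in use: ${demand} kB"
```

    This prints 110244 kB, 10228 kB and 241740 kB. The last figure, about 236 MB, suggests that the workload needed roughly twice the physical RAM, so much of it was running out of swap.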
    

    A substring search over all 10,000 records (SELECT count(*) FROM demo WHERE data LIKE '%ABC%') took about 1.9 seconds.

    Summary: Yes, it's possible, but you lose a lot of features if you actually do so. Your results will heavily depend on the type of queries you actually run.