Search code examples
hashmaparduinohashtable

Arduino Hash Table/Dictionary


I am trying to get a Hash Table or Dictionary to work on the Arduino Mega 2560. My goal is to have something like

dictionary[ENGLISH]["ACCOUNT"] = "Account";
dictionary[ENGLISH]["DATE_AND_TIME"] = "Date and Time";
dictionary[ENGLISH]["IDLE"] = "Idle";
dictionary[ENGLISH]["Language"] = "Languge"
dictionary[ENGLISH]["MAIN_MENU"] = "Main Menu";
dictionary[ENGLISH]["PRESCRIPTION"] = "Prescription";
dictionary[ENGLISH]["SETTINGS"] = "Settings";
dictionary[ENGLISH]["SOUND"] = "Sound";

where ENGLISH is essentially a constant 0, and I will have SPANISH and FRENCH as well (1 and 2 respectively). That is, an array of 3 dictionary elements.

On a first Google search, I found a link to a library that models the C++ STL, but it is not working on Arduino 1.0.3 for me at all. I was wondering if anybody had an alternative for using maps/hash tables in arduino for me, or a fix to get the library mentioned working.

For some context of my situation, I am modeling a menu system via a touchscreen on the Arduino, it has to accept 3 languages (for the buttons). The chosen language is found in a location in EEPROM and I'll keep it in the variable 'lang', when I need to print something to the screen, I'll do something like:

    screen.print(dictionary[lang]["SOUND"], CENTER, 23);

and depending on the 'lang' the user has chosen, it will print accordingly, ideally.


Solution

  • I think a hash table may be unnecessary here, and there are good reasons to avoid it on this platform anyway.

    Why a hash table may be unnecessary

    Usually in such situations there is no need for a string key, since the key is visible only inside your code, and the key is not interacted with from outside your program. Thus, the usual solution is to have a (pseudo) integer key, in the form of #define which is handled by the preprocessor and not your program:

    #define kWORDIDX_LANGUAGE   1
    #define kWORDIDX_SOUND      2
    #define kWORDIDX_MAINMENU   3
    #define kWORDIDX_SPAGHETTI  4
    ...
    dictionary[ENGLISH][kWORDIDX_SOUND] = "Sound";
    ...
    

    You can then access your dictionary entries like this Sreen.print(dictionary[lang][kWORDIDX_SOUND], CENTER, 23); or something similar.

    The advantages of this approach are:

    • You are saving memory space: no string keys are used
    • You are saving processing time: while hash table access is technically O(1), there are still constant factors involved to compute the hash. Array access is faster.
    • You code is less prone to errors: if you misspell your key using string accesses (which are a form of magic numbers anyway) you'll get a hash table miss. If you misspell one of the #defined keys, you get a compile error, which is what you want

    Why you don't want a hash table on an Arduino

    Arduino is a memory-constrained platform: it has very limited memory. The problem with using a real hash map is as follows:

    • Strings take up more memory space that ints (usually). Using #define keys, which are converted by the compiler to integer literals, you are using 1, 2, or 4 bytes per key (depending on your compiler settings), whereas each string key requires strlen(s) + 1 memory space to store. This space is actually used twice: once in Flash (that is where variables are initialized from) and once in SRAM.
    • The hash map data structure itself takes up more memory: there is the overhead of either empty entries in the hash map (in open addressing scheme) or of the linked list (in the separate chaining scheme). Since your data is read-only you don't need that overhead, since the hash table will not be added to.
    • One trick people use to save on memory in Arduinos is storing read-only data (such as strings) in program memory only (and not SRAM) using the PROGMEM keyword. If you build your dictionary using a hash map, this route will not be available to you. On the other hand, given a #define-type indexing scheme as above, it will be very easy to store all your language strings as PROGMEM strings.