I am writing a bitfield abstraction class, which wraps around a 32-bit piece of memory (u32 = unsigned int) and provides access to individual bits or ranges within that memory.
To implement this, I have used a std::map where the unique key is a pointer (not std::string) to a C-character array representing the mnemonic, and the value is a struct containing the bitfield properties (such as mnemonic, starting position, length, initial value and field value). All of these properties are constant and defined on startup, except for the field value which is changed only when the underlying u32 value is changed. (Also note: I have just reused the mnemonic pointer value as the unique key).
This is being used in an emulator where getBitfieldValue(), which returns the bitfield value (read only), is being called many times per second.
On compiling and profiling the code under VS 2015 update 3 (using -O2 and any speed optimisations I could find), it shows that the getBitfieldValue() function, and by extension std::find(), is taking up around 60-70% of total CPU time... much too slow.
I have tried using other map implementations, such as Boost::flat_map, google::dense_hash_map or std::unordered_map, and they help somewhat but still end up being too slow (~50-60%).
My guess is I am using a map for the wrong purpose, but I am not sure, considering that there are only 5-20 bitfield mappings (a small lookup size)... It just seems much too slow. Most of the time is also spent looking up the same field.
The relevant class source code can be found here: BitfieldMap32
An example of how the map is initialised at startup (run one time only):
struct Fields
{
    // String literals are const, so the pointers must be const char *.
    static constexpr const char * ADDR = "ADDR";
    static constexpr const char * SPR = "SPR";
};
ExampleClass() // constructor
{
    // registerField(mnemonic, start position, length, initial value)
    registerField(Fields::ADDR, 0, 31, 0);
    registerField(Fields::SPR, 31, 1, 0);
}
And how the field value is accessed (read only):
// getFieldValue definition.
// Note: the result of find() is not checked, so fieldName must already be registered.
const u32 & BitfieldMap32_t::getFieldValue(const char* fieldName)
{
    return mFieldMap.find(fieldName)->second.mFieldValue;
}
// Field access.
const u32 value = ExampleClassPointer->getFieldValue(Fields::ADDR);
Any ideas on how to reduce the lookup time? Or do I need to change the implementation altogether?
IIUC, using a dictionary (std::map or std::unordered_map) is huge overkill. Perhaps you should use the following:
The class should just be a wrapper around an internal storage of an integer (or at most an std::bitset).
The mnemonics should be enums, not std::strings.
Internally, have an std::vector efficiently mapping each enum value to a bitmask. (If you're using C++11 enums, see here how to convert an enum value into a position within the std::vector).
Each operation should just take the mnemonic, find by index the bitmask, and apply it to the internal storage.
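The steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the names Field, toIndex, Bitfield32 and Entry are hypothetical and not taken from the linked BitfieldMap32 source, and a fixed-size std::array stands in for the std::vector since the number of fields is known at compile time here. Each lookup becomes an array index plus a mask and shift, with no searching.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using u32 = std::uint32_t;

// Hypothetical field identifiers; Count is a sentinel giving the number of fields.
enum class Field : std::size_t { ADDR, SPR, Count };

// Convert a scoped (C++11) enum value into an array index.
constexpr std::size_t toIndex(Field f) { return static_cast<std::size_t>(f); }

class Bitfield32 {
public:
    // Record the mask and shift for a field once, at startup.
    void registerField(Field f, unsigned start, unsigned length) {
        assert(length >= 1 && start < 32 && start + length <= 32);
        // Build `length` low bits set; shifting a u32 by 32 is undefined, so special-case it.
        const u32 ones = (length == 32) ? 0xFFFFFFFFu : ((u32{1} << length) - 1u);
        mFields[toIndex(f)] = Entry{ones << start, start};
    }

    // Read a field: index into the table, then mask and shift.
    u32 getFieldValue(Field f) const {
        const Entry& e = mFields[toIndex(f)];
        return (mValue & e.mask) >> e.shift;
    }

    // Write a field without disturbing the other bits.
    void setFieldValue(Field f, u32 v) {
        const Entry& e = mFields[toIndex(f)];
        mValue = (mValue & ~e.mask) | ((v << e.shift) & e.mask);
    }

    u32 getValue() const { return mValue; }
    void setValue(u32 v) { mValue = v; }

private:
    struct Entry { u32 mask; unsigned shift; };
    std::array<Entry, toIndex(Field::Count)> mFields{};  // one entry per field
    u32 mValue = 0;                                      // the wrapped 32-bit value
};
```

With the question's layout, one would call registerField(Field::ADDR, 0, 31) and registerField(Field::SPR, 31, 1) in the constructor, and read a field with getFieldValue(Field::ADDR). Because the field value is computed from the underlying u32 on demand, there is no cached per-field value to keep in sync.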