Search code examples
c++boostboost-multi-index

How to use Boost::multi_index_container with checking if enum has set on specific bit


I wonder if the following code snippet can be modified to remove rv::filter and use boost::multi_index instead.

I have a struct Person containing first name, last name, job position and access level defined as enum. A single person may have access to any combination of "Level1", "Level2", Level3" which is represented in the code as AccessLevel::Level1 | AccessLevel::Level2 etc.

I want to find all persons whose job position is equal to "developer" and has access to Level1. I solved it by finding the developers firstly, and later filtering it by access level.

But I wonder is it possible to modify boost::multi_index_key somehow to achieve the same result.

When I call equal_range with std::tuple("developer", AccessLevel::Level1), it returns a developer with AccessLevel1 only (what is expected result because AccessLevel::Level1 | AccessLevel::Level2 is a different value).

#include <string>
#include <ranges>
#include <iostream>
#include <boost/multi_index/key.hpp>
#include <boost/range/iterator_range.hpp>
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>

namespace bmi = boost::multi_index;
namespace rv = std::ranges::views;

enum AccessLevel
{
    Level1 = 1 << 0,
    Level2 = 1 << 1,
    Level3 = 1 << 2,
};

AccessLevel operator|(AccessLevel lhs, AccessLevel rhs)
{
    return static_cast<AccessLevel>(static_cast<int>(lhs) | static_cast<int>(rhs));
}

AccessLevel operator&(AccessLevel& lhs, AccessLevel rhs)
{
    return static_cast<AccessLevel>(static_cast<int>(lhs) & static_cast<int>(rhs));
}

struct Person
{
    std::string firstName;
    std::string lastName;
    std::string jobPosition;
    AccessLevel accessLevel;
};

struct JobPositionTag {};


using Persons = bmi::multi_index_container<
    Person,
    bmi::indexed_by<
        bmi::ordered_non_unique<
            bmi::tag<JobPositionTag>,
            bmi::key<&Person::jobPosition>
        >
    >
>;

int main()
{
    Persons m_persons;

    m_persons.insert({"John", "Johnovsky", "developer", AccessLevel::Level1 | AccessLevel::Level2 | AccessLevel::Level3 });
    m_persons.insert({"Shrek", "Ogr", "developer", AccessLevel::Level2  });
    m_persons.insert({"Foo", "Bar", "PO", AccessLevel::Level1});
    m_persons.insert({"Hackerman", "", "developer", AccessLevel::Level1 });

    auto& byJobPosition = m_persons.get<JobPositionTag>();
    auto [beginIt, endIt] = byJobPosition.equal_range("developer");

    auto filtered = boost::make_iterator_range(beginIt, endIt) 
                    | rv::filter([](auto&& person) { return person.accessLevel & AccessLevel::Level1; });

    for (auto&& dev : filtered) 
    {
        std::cout << dev.firstName << " " << dev.lastName << " : " << dev.jobPosition << "\n";    
    }
}

Is there any way to achieve it? Thanks in advance!


Solution

  • Indexes operate on the principle that there is an ordering that can speed up lookup. Since your levels form an independent set, there is no such ordering.

    The best you can hope to achieve is to have independent indices:

    template <AccessLevel l> constexpr bool isLevel(Person const& p) { return p.accessLevel & l; }
    
    using Persons = bmi::multi_index_container<
        Person,
        bmi::indexed_by< //
            bmi::ordered_non_unique<bmi::tag<struct ByPos>, bmi::key<&Person::jobPosition>>,
            bmi::ordered_non_unique<bmi::tag<struct ByLevel1>, bmi::key<isLevel<Level1>>>, //
            bmi::ordered_non_unique<bmi::tag<struct ByLevel2>, bmi::key<isLevel<Level2>>>, //
            bmi::ordered_non_unique<bmi::tag<struct ByLevel3>, bmi::key<isLevel<Level3>>>  //
            >>;
    

    However, this leaves the question of which order to query by first:

    Live On Compiler Explorer

    Persons m_persons{
        {"John",      "Johnovsky", "developer", Level1 | Level2 | Level3},
        {"Shrek",     "Ogr",       "developer", Level2},
        {"Foo",       "Bar",       "PO",        Level1},
        {"Hackerman", "",          "developer", Level1},
    };
    
    auto& byPos = m_persons.get<ByPos>();
    auto& byL1  = m_persons.get<ByLevel1>();
    
    fmt::print("by pos first {}\n",
               make_iterator_range(byPos.equal_range("developer")) | rv::filter(isLevel<Level1>));
    
    fmt::print("by lvl first {}\n",
               make_iterator_range(byL1.equal_range(true)) |
                   rv::filter([](auto&& person) { return person.jobPosition == "developer"; }));
    

    Printing

    by pos first [{Hackerman '': developer}, {John 'Johnovsky': developer}]
    by lvl first [{Hackerman '': developer}, {John 'Johnovsky': developer}]
    

    There's probably a good deal of science about how to approach such optimization questions (as it is central to normal-form database representations) that I'm not fully aware of. I can mostly recommend to measure performance for your actual needs.

    Not Independent?

    The name "level" implies a superseeding of "ranks", and also the names Level1/Level2/Level3 imply potential order. If one of the levels is actually more significant then you could either order these by their integral values, or consider a composite key:

    using Persons = bmi::multi_index_container<
        Person,
        bmi::indexed_by<bmi::ordered_non_unique<bmi::tag<struct ByPos>, bmi::key<&Person::jobPosition>>,
                        bmi::ordered_non_unique<bmi::tag<struct ByLevel>,
                                                bmi::key<isLevel<Level1>, isLevel<Level2>, isLevel<Level3>>>>>;
    

    Leading to Live:

    fmt::print("by lvl first {}\n",
               make_iterator_range(byLev.equal_range(true)) |
                   rv::filter([](auto&& person) { return person.jobPosition == "developer"; }));
    

    Of course, this still leaves situations in which post-filtering is required. Also, it will not help for this particular query if the rank significance order is actually inverted.

    Other Avenues

    This kind of question comes up a bunch. There are different approaches you may want to consider:

    • consider external indexes. E.g. using boost::intrusive::set<> you can make the actual container elements part of other sets
    • consider interval containers
    • even more advanced, using spatial indexing (e.g. rtree) can allow you to query multi-dimensional orthogonal space; I think it's slightly less interesting here because all the axes will be [true,false] only, but I still thought I should mention it.

    Some inspirational examples from past answers: Selection in the codomain of interval_map, boost::multi_index_container, operations on std::set inside container

    • lastly it reminded me of this Indexing several vector fields using multi_index where I suggest making the many-to-many indexes external. The interesting bit here is the technique to dynamically map strings (like your job positions) onto integral domain