Search code examples
c++containersapi-design

API design for retrieving object from container that may not exists


What are the ways for creating API for retrieving object from indexed custom container, when the object in question may not exists?

So far I thought about:

  1. Throw an exception

    T get(int index) const
    {
        if(not_exists(index)) throw std::out_of_range("Index is out of range");
        return get_base(index);
    }
    
  2. Construct T and return it

    T get(int index) const
    {
        if(not_exists(index)) return T{};
        return get_base(index);
    }
    
  3. Return bool and retrieve as reference

    bool get(int index, T & obj) const
    {
        if(not_exists(index)) return false;
        obj = get_base(index); return true;
    }
    
  4. Use default argument if not found

    T get(int index, T def_obj) const
    {
        if(not_exists(index)) return def_obj;
        return get_base(index);
    }
    
  5. Combine 4 + 2

    T get(int index, T def_obj = {}) const
    {
        if(not_exists(index)) return def_obj;
        return get_base(index);
    }
    
  6. Modify container to add such object (warning - the get will no longer be const!)

    T get(int index, T def_obj = {})
    {
        if(not_exists(index)) set(index, def_obj);
        return get_base(index);
    }
    

What are the pros and cons of each solution? Have I missed anything?

I am especially worried about reasoning in heavily concurrent environment and I want to have as intuitive and safe API for client as possible.


Solution

  • The fundamental issue here is of semantics: #1 and #3 are the only ones where presence is distinguishable from absence; #6 always succeeds in returning an element of the container; and the others always succeed in returning some value. The application governs which of these you need.

    In this regard, #1 and #3 are complete: either is sufficient for implementing any other (given some other means of adding elements to emulate #6). If interference from other threads can be avoided, #4 and #5 are equally powerful: they can be used to detect absence by offering two different default values. Alternatively, bool contains(int index) const; can be added to allow distinguishing absence (again with external synchronization as needed).

    However, these emulations (except #2/4/5 from #1/3) involve repeated lookups that might have inadequate performance. For some underlying data structures, yet other operations might be necessary for best performance: for example, moving an element from one index to another without reconstructing it.

    Meanwhile, all of these approaches have practical issues, at least in a generic context.

    1. Some experts believe that logic_error is always a mistake; certainly throwing the exception in a reasonably common case is expensive. However, a reference can be returned here, which is very useful.
    2. T must be value-constructible (similar but not identical to default-constructible).
    3. T must be assignable (and the client must have constructed one, perhaps used for multiple calls). Undefined behavior can result from ignoring the flag (so mark it [[nodiscard]]).
    4. Two T objects must be constructed per call. The default could be made a reference to allow a reference return (and support a cumbersome form of detecting missing values), but to avoid allowing temporary arguments an rvalue-reference overload (or a constrained template) would then be required.
    5. Like #4, this constructs two objects. In a template context, improves on #2 in that value-constructibility is required only if the default value is used.
    6. This could of course have three variations like #2/4/5. It would be more useful if it returned a reference (like map::operator[]) to allow mutating the (potentially) new element.

    If T might be expensive to construct (even from {}), only #1 (as used by map::at) and the optional suggestion are efficient; conveniently they are also complete. Perhaps the fastest variation is to return const T*, using a null pointer to indicate absence. Picking between them is a question of fine-tuning performance trade-offs (unless your shop has it in for exceptions or pointers in general). For a cheap T, #5 is attractive if its semantics suffice; otherwise #3 might be best (for its similarity to if(std::cin >> x)).