I want to create a class in order to manage markup language (such as HTML) in C++. I would like my class to retain attributes and sub-tags. The problem is, given encapsulated containers, how to properly abstract the accesses and what to return in order to provide an easy way to check if the value returned is valid.
I defined my class with two maps as private members (namely std::map<std::string, Tag> _children and std::map<std::string, std::string> _attr). I defined two functions to populate these fields, and I would like to define two functions for read access to the stored elements.
The problem is, I don't want to break my abstraction and, as I'm doing this in order to work on my C++ skills, I would like to find the proper way (or cleaner way, or standard way) to do it.
One basic solution would be to simply return map.find(s), but then I would have to declare the return type of my function as std::map<std::string, Tag>::const_iterator, which would break the abstraction. I could instead dereference the iterator returned by map.find(), but if the value is not in the map I would be dereferencing a non-dereferenceable iterator (_children.cend()).
What I defined so far:
#include <map>
#include <regex>
#include <string>

using namespace std;

class Tag {
    static const regex re_get_name, re_get_attributes;
    string _name;
    map<string,string> _attr;
    map<string,Tag> _children;
public:
    Tag(const string &toParse) {
        /* Parse line using the regex */
    }
    const string& name() const {
        return _name;
    }
    Tag& add_child(const Tag& child) {
        // map::insert takes a key/value pair; emplace builds the pair in place
        _children.emplace(child._name, child);
        return *this;
    }
    SOMETHING get_child(const string& name) const {
        map<string,Tag>::const_iterator val = _children.find(name);
        /* Do something here, but what ? */
        return something;
    }
    SOMETHING attr(const string& name) const {
        map<string, string>::const_iterator val = _attr.find(name);
        /* Do something here, but what ? */
        return something;
    }
};
const regex Tag::re_get_name("^<([^\\s]+)");
const regex Tag::re_get_attributes(" ([^\\s]+) = \"([^\\s]+)\"");
What would be the proper way to handle this kind of situation in C++? Should I create my own Tag::const_iterator type? If so, how? Should I go for a more "C" approach, where I just define the return type as Tag& and return NULL if the map doesn't contain my key? Should I be more OOP, with a static member static const Tag NOT_FOUND, and return a reference to this object if the element isn't in my map? I also thought of throwing an exception, but exception management seems to be quite heavy and inefficient in C++.
std::optional could help you, but it needs a C++17-ready standard library, so in the meantime you could also use boost::optional, which is more or less the same, since AFAIK the design of std::optional was based on the Boost one (Boost is often the source of new C++ standard proposals).
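To illustrate the idea on its own, here is a minimal sketch of an optional-returning lookup, assuming a C++17 compiler. The names find_attr and find_attr_example are hypothetical and only serve the illustration; the caller checks the result with has_value(), operator bool, or value_or().

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical helper: returns a copy of the value stored under `key`,
// or an empty optional when the key is absent.
std::optional<std::string> find_attr(const std::map<std::string, std::string>& attrs,
                                     const std::string& key) {
    const auto it = attrs.find(key);
    if (it == attrs.end()) {
        return std::nullopt;
    }
    return it->second;
}

// Usage sketch: the caller never sees an iterator, only "value or nothing".
void find_attr_example() {
    const std::map<std::string, std::string> attrs{{"href", "index.html"}};

    const auto href = find_attr(attrs, "href");
    assert(href.has_value() && *href == "index.html");

    const auto id = find_attr(attrs, "id");
    assert(!id);                             // empty optional converts to false
    assert(id.value_or("none") == "none");   // fallback when nothing was found
}
```

With boost::optional the shape is the same; only the header and the namespace change.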
Although I am reluctant to make a proposal because of the general problem with your approach, I still wrote one for you, but please consider the points after the code:
#include <string>
#include <regex>
#include <map>
#include <boost/optional.hpp>

class Tag {
    static const std::regex re_get_name, re_get_attributes;
    using string = std::string;
    string _name;
    std::map<string,string> _attr;
    std::map<string,Tag> _children;
public:
    Tag(const string &toParse) {
        /* Parse line using the regex */
    }
    const string& name() const {
        return _name;
    }
    Tag& add_child(const Tag& child) {
        _children.emplace(child._name, child);
        return *this;
    }
    boost::optional<Tag> get_child(const string& name) const {
        auto val = _children.find(name);
        return val == _children.cend() ? boost::optional<Tag>{} : boost::optional<Tag>{val->second};
    }
    boost::optional<string> attr(const string& name) const {
        auto val = _attr.find(name);
        return val == _attr.cend() ? boost::optional<string>{} : boost::optional<string>{val->second};
    }
};
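One thing to keep in mind with the version above: an optional holding a Tag holds a copy, so get_child copies the whole subtree on every successful lookup. If that matters, the pointer-style approach mentioned in the question avoids the copy. The following is only a sketch under that assumption; Node is a hypothetical stand-in for Tag, reduced to the child lookup, and get_child_example is an illustrative helper.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the Tag class, reduced to the child lookup.
class Node {
    std::string name_;
    std::map<std::string, Node> children_;
public:
    explicit Node(std::string name) : name_(std::move(name)) {}

    const std::string& name() const { return name_; }

    Node& add_child(const Node& child) {
        children_.emplace(child.name_, child);
        return *this;
    }

    // Returns a pointer to the stored child, or nullptr when it is absent.
    // No copy is made; the pointer stays valid until that child is erased.
    const Node* get_child(const std::string& name) const {
        const auto it = children_.find(name);
        return it == children_.end() ? nullptr : &it->second;
    }
};

// Usage sketch: the null check replaces the comparison against end().
void get_child_example() {
    Node root("html");
    root.add_child(Node("body"));

    if (const Node* body = root.get_child("body")) {
        assert(body->name() == "body");
    }
    assert(root.get_child("head") == nullptr);
}
```

The trade-off is that the caller now holds a pointer into the container, so it must not outlive the parent object.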
As you can see, you are basically just reimplementing the container semantics of std::map, but with the parser logic built in. I strongly disagree with this approach, since parsing gets ugly very fast, and mixing value-generation code into a container that could (and should) be used as a value class will make things even worse.
My first suggestion is to just declare and use your Tag class/struct as a value class, i.e. one that simply contains the std::maps as public members. Put your parsing functions in a namespace along with the Tag container, and let them just be functions, or distinct classes if needed.
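A minimal sketch of what this first suggestion could look like. The namespace name markup, the function parse_open_tag and both regexes are hypothetical and purely illustrative; they are not a complete parser and handle only a single opening tag.

```cpp
#include <cassert>
#include <map>
#include <regex>
#include <string>

namespace markup {  // hypothetical namespace name

// Plain value type: no invariants to protect, so the members are public.
struct Tag {
    std::string name;
    std::map<std::string, std::string> attr;
    // Same recursive layout as in the question; like the original code, this
    // relies on the standard library accepting a map of a not-yet-complete Tag.
    std::map<std::string, Tag> children;
};

// Hypothetical free function: builds a Tag from one opening tag,
// e.g. <a href="x.html">. Parsing lives outside the value type.
inline Tag parse_open_tag(const std::string& line) {
    static const std::regex name_re(R"re(^<([^\s>]+))re");
    static const std::regex attr_re(R"re(([^\s=]+)\s*=\s*"([^"]*)")re");

    Tag tag;
    std::smatch m;
    if (std::regex_search(line, m, name_re)) {
        tag.name = m[1].str();
    }
    // Collect every key="value" pair found in the line.
    for (std::sregex_iterator it(line.begin(), line.end(), attr_re), end; it != end; ++it) {
        tag.attr[(*it)[1].str()] = (*it)[2].str();
    }
    return tag;
}

// Usage sketch: callers use the maps directly, with their normal interface.
inline void parse_open_tag_example() {
    const Tag t = parse_open_tag("<a href=\"x.html\">");
    assert(t.name == "a");
    assert(t.attr.count("href") == 1);
    assert(t.attr.find("id") == t.attr.end());
}

}  // namespace markup
```

Because the maps are public, the question of what the accessors should return disappears: callers use find, count or at on the maps themselves.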
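My second suggestion is a small one: don't prefix names with _. Names beginning with an underscore are reserved in the global namespace (and names beginning with an underscore followed by an uppercase letter, or containing a double underscore, are reserved everywhere), so a leading underscore is easy to get wrong and is widely considered bad style; you can use it as a suffix instead. Also, don't put using namespace directives at global scope, i.e. outside of a function or namespace block: it's bad style in a .cpp file and extremely bad style in a header (.h/.hpp).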
My third suggestion: use the Boost.Spirit Qi parser framework. You would just declare your value classes as in my first suggestion, and Qi would fill them automatically via Boost.Fusion. If you already know EBNF notation, you can write an EBNF-like grammar directly in C++, and the compiler will generate a parser via template magic. Qi, and especially Fusion, have some issues, but they make things much easier in the long run. Regexes do only half of the parsing work, at best.