I am writing a C++ program that uses boost::program_options and have run into some problems. Part of my code is given here.
int main(int argc, char* argv[]) {
    options_description desc("useage: filterfq", options_description::m_default_line_length * 2, options_description::m_default_line_length);
    options_description generic("Gerneric options", options_description::m_default_line_length * 2, options_description::m_default_line_length);
    generic.add_options()
        ("help,h", "produce help message")
    ;
    options_description param("Parameters", options_description::m_default_line_length * 2, options_description::m_default_line_length);
    param.add_options()
        ("checkQualitySystem,c", bool_switch(), "only check quality system of the fastq file")
        ("baseNrate,N", value<float>() -> default_value(0.05), "maximum rate of \'N\' base allowed along a read")
        ("averageQuality,Q", value<float>() -> default_value(0), "minimum average quality allowed along a read")
        ("perBaseQuality,q", value<int>() -> default_value(5), "minimum quality per base allowed along a read")
        ("lowQualityRate,r", value<float>() -> default_value(0.5), "maximum low quality rate along a read")
        ("rawQualitySystem,s", value<int>(), "specify quality system of raw fastq\n0: Sanger\n1: Solexa\n2: Illumina 1.3+\n3: Illumina 1.5+\n4: Illumina 1.8+")
        ("preferSpecifiedRawQualitySystem,p", bool_switch(), "indicate that user prefers the given quality system to process")
    ;
    options_description input("Input", options_description::m_default_line_length * 2, options_description::m_default_line_length);
    input.add_options()
        ("rawFastq,f", value< vector<path> >() -> required() -> multitoken(), "raw fastq file(s) that need cleaned, required")
    ;
    options_description output("Output", options_description::m_default_line_length * 2, options_description::m_default_line_length);
    output.add_options()
        ("cleanQualitySystem,S", value<int>() -> default_value(4), "specify quality system of cleaned fastq, the same as rawQualitySystem")
        ("outDir,O", value<path>() -> default_value(current_path()), "specify output directory, not used if cleanFastq is specified")
        ("outBasename,o", value<string>(), "specify the basename for output file(s), required if outDir is specified")
        ("cleanFastq,F", value< vector<path> >() -> multitoken(), "cleaned fastq file name(s), not used if outDir or outBasename is specified")
        ("droppedFastq,D", value< vector<path> >() -> multitoken(), "fastq file(s) containing reads that are filtered out")
    ;
    desc.add(generic).add(param).add(input).add(output);
    variables_map vm;
    store(command_line_parser(argc, argv).options(desc).run(), vm);
    if (vm.count("help")) {
        cout << desc << "\n";
        return 0;
    }
    ...
}
The #include and using namespace parts are omitted here. When I ran the command to see the help message, it printed the following:
useage: filterfq:
Gerneric options:
-h [ --help ] produce help message
Parameters:
-c [ --checkQualitySystem ] only check quality system of the fastq file
-N [ --baseNrate ] arg (=0.0500000007) maximum rate of 'N' base allowed along a read
-Q [ --averageQuality ] arg (=0) minimum average quality allowed along a read
-q [ --perBaseQuality ] arg (=5) minimum quality per base allowed along a read
-r [ --lowQualityRate ] arg (=0.5) maximum low quality rate along a read
-s [ --rawQualitySystem ] arg specify quality system of raw fastq
0: Sanger
1: Solexa
2: Illumina 1.3+
3: Illumina 1.5+
4: Illumina 1.8+
-p [ --preferSpecifiedRawQualitySystem ] indicate that user prefers the given quality system to process
Input:
-f [ --rawFastq ] arg raw fastq file(s) that need cleaned, required
Output:
-S [ --cleanQualitySystem ] arg (=4) specify quality system of cleaned fastq, the same as rawQualitySystem
-O [ --outDir ] arg (="/home/tanbowen/filterfq") specify output directory, not used if cleanFastq is specified
-o [ --outBasename ] arg specify the basename for output file(s), required if outDir is specified
-F [ --cleanFastq ] arg cleaned fastq file name(s), not used if outDir or outBasename is specified
-D [ --droppedFastq ] arg fastq file(s) containing reads that are filtered out
The help message looks somewhat ugly, especially the "0.0500000007", and I want to improve it. I have googled for a long time but cannot find a solution, so I am asking for help here with the following problems:
One extra question: how can I prevent a command like the following from being executed:
filter -f <some file> -f <some file>
i.e., do not allow the same option to be specified more than once?
Thanks very much!!
Yes, see below; the example shows the general form of collecting and displaying formatted options.
Look at the constructor of options_description. It allows you to specify column widths.
Here is a (real) example of a custom option value. In my case I wanted to collect a buffer size in bytes, but also wanted to be able to parse things like 4K or 1M.
#include <cstddef>
#include <iostream>
#include <regex>
#include <string>
#include <boost/program_options.hpp>

namespace po = boost::program_options;

// Option value holding a size in bytes; accepts an optional G/M/K suffix.
struct bytesize_option
{
    bytesize_option(std::size_t val = 0) : _size(val) {}
    std::size_t value() const { return _size; }
    void set(std::size_t val) { _size = val; }
private:
    std::size_t _size;
};

std::ostream& operator<<(std::ostream& os, bytesize_option const& hs);
std::istream& operator>>(std::istream& is, bytesize_option& hs);

namespace {
    constexpr auto G = std::size_t(1024 * 1024 * 1024);
    constexpr auto M = std::size_t(1024 * 1024);
    constexpr auto K = std::size_t(1024);
}

// Output operator: this is what renders the default value in the help text.
std::ostream& operator<<(std::ostream& os, bytesize_option const& hs)
{
    auto v = hs.value();
    if (v % G == 0) { return os << (v / G) << 'G'; }
    if (v % M == 0) { return os << (v / M) << 'M'; }
    if (v % K == 0) { return os << (v / K) << 'K'; }
    return os << v;
}

// Input operator: this is what parses the token given on the command line.
std::istream& operator>>(std::istream& is, bytesize_option& hs)
{
    std::string s;
    is >> s;
    static const std::regex re(R"regex((\d+)([GMKgmk]){0,1})regex");
    std::smatch match;
    auto matched = std::regex_match(s, match, re);
    if (!matched) {
        throw po::validation_error(po::validation_error::invalid_option_value);
    }
    if (match[2].matched)
    {
        switch (match[2].str().at(0))
        {
            case 'G':
            case 'g':
                hs.set(std::stoul(match[1].str()) * G);
                break;
            case 'M':
            case 'm':
                hs.set(std::stoul(match[1].str()) * M);
                break;
            case 'K':
            case 'k':
                hs.set(std::stoul(match[1].str()) * K);
                break;
        }
    }
    else {
        hs.set(std::stoul(match[1].str()));
    }
    return is;
}
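If you want to unit test the suffix handling without pulling in Boost, the same conversion logic can be written as plain functions. This is only a sketch: format_bytesize and parse_bytesize are hypothetical names that are not part of the code above, and parse_bytesize reports failure through its return value instead of throwing po::validation_error.

```cpp
#include <cstddef>
#include <regex>
#include <sstream>
#include <string>

// Hypothetical Boost-free helpers mirroring the stream operators above.
namespace {
    constexpr std::size_t K = 1024;
    constexpr std::size_t M = K * 1024;
    constexpr std::size_t G = M * 1024;
}

// Format a byte count using the largest suffix that divides it exactly.
std::string format_bytesize(std::size_t v)
{
    std::ostringstream os;
    if (v % G == 0)      { os << (v / G) << 'G'; }
    else if (v % M == 0) { os << (v / M) << 'M'; }
    else if (v % K == 0) { os << (v / K) << 'K'; }
    else                 { os << v; }
    return os.str();
}

// Parse digits with an optional G/M/K suffix (either case).
// Returns false if the text does not match; out is left untouched then.
// Note: std::stoul may still throw std::out_of_range for very long digit runs.
bool parse_bytesize(const std::string& s, std::size_t& out)
{
    static const std::regex re(R"((\d+)([GMKgmk])?)");
    std::smatch match;
    if (!std::regex_match(s, match, re)) { return false; }

    std::size_t scale = 1;
    if (match[2].matched) {
        switch (match[2].str()[0]) {
            case 'G': case 'g': scale = G; break;
            case 'M': case 'm': scale = M; break;
            default:            scale = K; break;  // only 'K' or 'k' remain
        }
    }
    out = std::stoul(match[1].str()) * scale;
    return true;
}
```

With these, parse_bytesize("4K", n) sets n to 4096 and format_bytesize(4096) gives back "4K", while input such as "4KB" is rejected.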
You would use bytesize_option like so:
return boost::shared_ptr<po::option_description> {
    new po::option_description("server.max-header-size,x",
                               po::value(&_max_hdr_size)
                                   ->default_value(_max_hdr_size),
                               "The maximum size (in bytes) of a HTTP header "
                               "that the server will accept")
};
where, in this case, _max_hdr_size is defined as:
bytesize_option _max_hdr_size;
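As for the "0.0500000007" in the question: that text is what you get when the float closest to 0.05 is printed with 9 significant digits, and I am assuming that is roughly what the library does when it turns a float default into help text. The sketch below reproduces the effect with the standard library only; format_float is a hypothetical helper, not a Boost function.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: print a float with the given number of significant digits.
std::string format_float(float v, int significant_digits)
{
    std::ostringstream os;
    os << std::setprecision(significant_digits) << v;
    return os.str();
}
```

format_float(0.05f, 9) yields "0.0500000007", whereas 0.5f is exactly representable and stays "0.5", which matches the help output in the question. If a full custom type is more than you need, default_value also has a two-argument overload that takes the display text, so something like value<float>()->default_value(0.05f, "0.05") should show (=0.05) in the help message.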