I've already read through this answer to my question, but it is a decade old, so I'm hoping things have changed.
I'm looking for a way to remove whitespace from a const char * at compile time (producing a new string). For example:
constexpr static const char * initial = "hello world";
constexpr static const char * after = strip_whitespace(initial); // "helloworld"
But I have no idea how to implement something like strip_whitespace as either a constexpr function or a macro.
Using const char* to store pointers to compile-time strings is something you should avoid. Compile-time string manipulation libraries usually use fixed-size buffers.
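As a rough illustration of why a buffer works where a pointer does not: a constexpr function has no storage it could point a new const char* at, but it can return a fixed-size array by value. A minimal sketch using std::array, assuming C++17 or later (to_buffer and hello_buffer are hypothetical names, not part of the code in this answer):

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: copies a string literal, including its trailing nil,
// into a fixed-size buffer that is returned by value.
template <std::size_t N>
constexpr std::array<char, N> to_buffer(char const (&lit)[N]) {
    std::array<char, N> out{};
    for (std::size_t i = 0; i < N; ++i)
        out[i] = lit[i];
    return out;
}

// The whole buffer is a compile-time value; no pointer to a temporary is involved.
constexpr auto hello_buffer = to_buffer("hello world");
static_assert(hello_buffer.size() == 12);  // 11 characters plus the trailing nil
static_assert(hello_buffer[4] == 'o');
static_assert(hello_buffer[11] == '\0');
```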
You start with this, a basic compile-time string:
#include <cstddef>
template<std::size_t N>
struct ct_string {
  // zero-initialized, so no constexpr constructor leaves a byte uninitialized
  char bytes[N] = {};
  // empty string; operator+ needs this to create its result buffer
  constexpr ct_string() = default;
  // does not include trailing nil
  // so ct_string<10> has a max size of 9
  // if there is an earlier nil character, size()
  // is the length up to that nil
  [[nodiscard]] constexpr std::size_t size() const {
    std::size_t r = 0;
    while(r + 1 < N && bytes[r])
      ++r;
    return r;
  }
  constexpr char& operator[](std::size_t i) { return bytes[i]; }
  constexpr char const& operator[](std::size_t i) const { return bytes[i]; }
  // from a "string literal":
  constexpr ct_string( char const(&arr)[N] ) {
    for (std::size_t i = 0; i < N; ++i)
      bytes[i] = arr[i];
  }
  constexpr char const* data() const { return bytes; }
  constexpr operator char const*() const { return data(); }
};
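To make the size() comments concrete, here is a small sketch, assuming C++17 for class template argument deduction. The struct is a condensed copy of the one above with the buffer zero-initialized so it is usable in constant expressions; the variable names are made up for illustration:

```cpp
#include <cstddef>

template <std::size_t N>
struct ct_string {
    char bytes[N] = {};
    constexpr ct_string(char const (&arr)[N]) {
        for (std::size_t i = 0; i < N; ++i)
            bytes[i] = arr[i];
    }
    constexpr std::size_t size() const {
        std::size_t r = 0;
        while (r + 1 < N && bytes[r])
            ++r;
        return r;
    }
    constexpr char const& operator[](std::size_t i) const { return bytes[i]; }
};

// N is deduced from the literal, trailing nil included: ct_string<12>.
constexpr ct_string greeting = "hello world";
static_assert(sizeof(greeting.bytes) == 12);
static_assert(greeting.size() == 11);

// An earlier nil cuts the reported size short: ct_string<6>, size() == 2.
constexpr ct_string cut_short = "ab\0cd";
static_assert(sizeof(cut_short.bytes) == 6);
static_assert(cut_short.size() == 2);
```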
We can then add a few operations:
template<std::size_t A, std::size_t B>
[[nodiscard]] constexpr ct_string<A+B-1> operator+( ct_string<A> lhs, ct_string<B> rhs ) {
  ct_string<A+B-1> retval;
  // copy up to first nil in lhs:
  for (std::size_t i = 0; i < lhs.size(); ++i)
    retval[i] = lhs[i];
  // copy entire rhs buffer, including trailing nils:
  for (std::size_t i = 0; i < B; ++i)
    retval[lhs.size()+i] = rhs[i];
  // zero out the leftovers, if any (retval holds A+B-1 bytes):
  for (std::size_t i = lhs.size() + B; i < A+B-1; ++i)
    retval[i] = 0;
  return retval;
}
// copies characters for which f(c) is true:
template<std::size_t N, class F>
[[nodiscard]] constexpr ct_string<N> filter_if( ct_string<N> lhs, F f ) {
  std::size_t j = 0;
  for(std::size_t i = 0; i < N; ++i)
  {
    lhs[j] = lhs[i];
    if (f(lhs[i])) // if we fail, the next iteration overwrites lhs[j]
    {
      ++j;
    }
  }
  // 0 out everything from j to end of buffer:
  for(; j < N; ++j)
    lhs[j] = 0;
  return lhs;
}
// removes every occurrence of x:
template<std::size_t N>
[[nodiscard]] constexpr ct_string<N> filter( ct_string<N> lhs, char x ) {
  return filter_if( lhs, [x](char y){ return x!=y; } );
}
Use is simple:
constexpr ct_string hello_world = "hello world";
constexpr ct_string shorten = filter(hello_world, ' ');
And test code:
#include <cstdio>
int main() {
  printf("%s\n", hello_world.data());
  printf("%s\n", shorten.data());
}
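Since printf only shows the result at run time, a static_assert is one way to confirm that the work really happens at compile time. The sketch below is a condensed restatement of the code above, assuming C++17. same_text is a hypothetical helper introduced only for the check, and the ct_string here is given a zero-initialized buffer and a default constructor so that operator+ can build its result:

```cpp
#include <cstddef>

template <std::size_t N>
struct ct_string {
    char bytes[N] = {};
    constexpr ct_string() = default;
    constexpr ct_string(char const (&arr)[N]) {
        for (std::size_t i = 0; i < N; ++i)
            bytes[i] = arr[i];
    }
    constexpr std::size_t size() const {
        std::size_t r = 0;
        while (r + 1 < N && bytes[r])
            ++r;
        return r;
    }
    constexpr char& operator[](std::size_t i) { return bytes[i]; }
    constexpr char const& operator[](std::size_t i) const { return bytes[i]; }
};

// Concatenation: the result holds both strings and one trailing nil.
template <std::size_t A, std::size_t B>
constexpr ct_string<A + B - 1> operator+(ct_string<A> lhs, ct_string<B> rhs) {
    ct_string<A + B - 1> retval{};            // starts out all nil
    std::size_t const n = lhs.size();
    for (std::size_t i = 0; i < n; ++i)
        retval[i] = lhs[i];
    for (std::size_t i = 0; i < B; ++i)       // includes rhs's trailing nil
        retval[n + i] = rhs[i];
    return retval;
}

// Keeps the characters for which keep(c) is true, then pads with nils.
template <std::size_t N, class F>
constexpr ct_string<N> filter_if(ct_string<N> s, F keep) {
    std::size_t j = 0;
    for (std::size_t i = 0; i < N; ++i) {
        char const c = s[i];
        if (keep(c))
            s[j++] = c;
    }
    for (; j < N; ++j)
        s[j] = 0;
    return s;
}

template <std::size_t N>
constexpr ct_string<N> filter(ct_string<N> s, char x) {
    return filter_if(s, [x](char y) { return x != y; });
}

// Hypothetical helper: compares a ct_string with a string literal.
template <std::size_t A, std::size_t B>
constexpr bool same_text(ct_string<A> const& a, char const (&b)[B]) {
    if (a.size() != B - 1)
        return false;
    for (std::size_t i = 0; i + 1 < B; ++i)
        if (a[i] != b[i])
            return false;
    return true;
}

constexpr ct_string hello_world = "hello world";
constexpr ct_string shorten = filter(hello_world, ' ');
static_assert(same_text(shorten, "helloworld"));

constexpr ct_string foo = "foo";
constexpr ct_string bar = "bar";
static_assert(same_text(foo + bar, "foobar"));
```

If either static_assert failed, the program would not compile, so nothing here depends on running the binary.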
Fancier code, like treating characters other than ' ' as whitespace, is left to you.
My buffers are always at least long enough and always null-terminated, and extra nulls are appended if we shorten the string algorithmically.
Live example which includes supporting tabs as whitespace.
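For what it's worth, treating more characters as whitespace can be sketched with filter_if and a different predicate. This is not the linked live example, just one possible version assuming C++17. is_space and its list of characters are assumptions, strip_whitespace is the name from the question, and ct_string and filter_if are condensed from the code above:

```cpp
#include <cstddef>

template <std::size_t N>
struct ct_string {
    char bytes[N] = {};
    constexpr ct_string(char const (&arr)[N]) {
        for (std::size_t i = 0; i < N; ++i)
            bytes[i] = arr[i];
    }
    constexpr std::size_t size() const {
        std::size_t r = 0;
        while (r + 1 < N && bytes[r])
            ++r;
        return r;
    }
    constexpr char& operator[](std::size_t i) { return bytes[i]; }
    constexpr char const& operator[](std::size_t i) const { return bytes[i]; }
};

// Keeps the characters for which keep(c) is true, then pads with nils.
template <std::size_t N, class F>
constexpr ct_string<N> filter_if(ct_string<N> s, F keep) {
    std::size_t j = 0;
    for (std::size_t i = 0; i < N; ++i) {
        char const c = s[i];
        if (keep(c))
            s[j++] = c;
    }
    for (; j < N; ++j)
        s[j] = 0;
    return s;
}

// Assumption: these six characters count as whitespace
// (the set std::isspace accepts in the "C" locale).
constexpr bool is_space(char c) {
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\v' || c == '\f' || c == '\r';
}

template <std::size_t N>
constexpr ct_string<N> strip_whitespace(ct_string<N> s) {
    return filter_if(s, [](char c) { return !is_space(c); });
}

constexpr ct_string initial = "hello \tworld\n";
constexpr ct_string after = strip_whitespace(initial);
static_assert(after.size() == 10);
static_assert(after[4] == 'o' && after[5] == 'w');
static_assert(after[10] == '\0');
```

The buffer keeps its original length and is padded with nils, so with the full type above, after.data() can be used wherever a const char* is expected.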