c++ error-handling pugixml

Convert pugixml's result.offset to column/line


I need user-friendly error reporting for an application that uses pugixml.
I am currently using result.offset.
Is there a way to get the line and column instead? I am potentially dealing with large XML files, if that makes a difference.


Solution

  • pugixml does not provide this out of the box: tracking line numbers on every parse is relatively expensive, and once parsing is complete it is impossible to recover line/column information in the general case.

    Here's a snippet that builds an offset -> line mapping, which you can use when parsing fails or when you need the information for other reasons. Feel free to tweak the file I/O code to match your requirements.

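    #include <algorithm>  // std::lower_bound
    #include <cstddef>    // ptrdiff_t, size_t
    #include <cstdio>     // fopen, fread, fclose
    #include <utility>    // std::pair
    #include <vector>
    
    // Byte offsets of every '\n' in the file, in ascending order.
    typedef std::vector<ptrdiff_t> offset_data_t;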
    
    bool build_offset_data(offset_data_t& result, const char* file)
    {
        FILE* f = fopen(file, "rb");
        if (!f) return false;
    
        ptrdiff_t offset = 0;
    
        char buffer[1024];
        size_t size;
    
        while ((size = fread(buffer, 1, sizeof(buffer), f)) > 0)
        {
            for (size_t i = 0; i < size; ++i)
                if (buffer[i] == '\n')
                    result.push_back(offset + i);
    
            offset += size;
        }
    
        fclose(f);
    
        return true;
    }
    
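    // Returns a 1-based (line, column) pair for a byte offset such as result.offset.
    std::pair<int, int> get_location(const offset_data_t& data, ptrdiff_t offset)
    {
        // The number of newlines strictly before offset is the zero-based line index.
        offset_data_t::const_iterator it = std::lower_bound(data.begin(), data.end(), offset);
        size_t index = it - data.begin();
    
        // Column is the distance from the previous newline (or from the start of the file on line 1).
        ptrdiff_t column = index == 0 ? offset + 1 : offset - data[index - 1];
    
        return std::make_pair(static_cast<int>(1 + index), static_cast<int>(column));
    }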
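
    If the document is already in memory (for example, because it was parsed with load_buffer), the same mapping can be built without touching the file again. The following is only a sketch: build_offset_data_from_buffer, locate and format_location are hypothetical names, not part of pugixml, and std::to_string assumes C++11. The lookup mirrors get_location above. Columns are counted in bytes, which assumes the parsed buffer was not converted to another encoding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: records the byte offset of every '\n' in an in-memory buffer.
std::vector<std::ptrdiff_t> build_offset_data_from_buffer(const char* data, std::size_t size)
{
    std::vector<std::ptrdiff_t> newlines;

    for (std::size_t i = 0; i < size; ++i)
        if (data[i] == '\n')
            newlines.push_back(static_cast<std::ptrdiff_t>(i));

    return newlines;
}

// Hypothetical helper: same lookup as get_location, returning a 1-based (line, column) pair.
std::pair<int, int> locate(const std::vector<std::ptrdiff_t>& newlines, std::ptrdiff_t offset)
{
    assert(offset >= 0);

    // Count the newlines strictly before offset; that count is the zero-based line index.
    std::vector<std::ptrdiff_t>::const_iterator it =
        std::lower_bound(newlines.begin(), newlines.end(), offset);
    std::ptrdiff_t index = it - newlines.begin();

    // Distance from the previous newline, or from the start of the buffer on line 1.
    std::ptrdiff_t column = index == 0 ? offset + 1 : offset - newlines[index - 1];

    return std::make_pair(static_cast<int>(index + 1), static_cast<int>(column));
}

// Hypothetical helper: formats a compiler-style "file:line:column: message" string.
std::string format_location(const std::string& file, std::pair<int, int> loc, const char* message)
{
    return file + ":" + std::to_string(loc.first) + ":" + std::to_string(loc.second) + ": " + message;
}
```

    In an application, the offset passed to locate would be result.offset and the message would be result.description(), both taken from the pugi::xml_parse_result returned by the load call. Since the table is only needed when parsing fails, it can be built lazily on the error path, so successful parses of large files pay nothing extra.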