parsing, trace, packet-capture, bgp

BGP MRT format parsing


I'm trying to parse the BGP traces downloaded here. The BGP packet traces are reportedly stored in the files with the prefix updates, and these MRT-format files can supposedly be read with PyBGPdump.

I downloaded one file and followed the instructions (or this better-formatted version):

import pybgpdump

# Count the BGP messages in an MRT dump (Python 2, matching the traceback below)
cnt = 0
dump = pybgpdump.BGPDump('sample.dump.gz')
for mrt_h, bgp_h, bgp_m in dump:
    cnt += 1
print cnt, 'BGP messages in the MRT dump'

However, I got this error:

Traceback (most recent call last):
  File "bgp-stats.py", line 8, in <module>
    for mrt_h, bgp_h, bgp_m in dump:
  File "/usr/local/lib/python2.7/dist-packages/pybgpdump.py", line 61, in next
    bgp_m = dpkt.bgp.BGP(bgp_h.data)
  File "/usr/local/lib/python2.7/dist-packages/dpkt/dpkt.py", line 89, in __init__
    self.unpack(args[0])
  File "/usr/local/lib/python2.7/dist-packages/dpkt/bgp.py", line 152, in unpack
    self.data = self.update = self.Update(self.data)
  File "/usr/local/lib/python2.7/dist-packages/dpkt/dpkt.py", line 89, in __init__
    self.unpack(args[0])
  File "/usr/local/lib/python2.7/dist-packages/dpkt/bgp.py", line 247, in unpack
    attr = self.Attribute(self.data)
  File "/usr/local/lib/python2.7/dist-packages/dpkt/dpkt.py", line 89, in __init__
    self.unpack(args[0])
  File "/usr/local/lib/python2.7/dist-packages/dpkt/bgp.py", line 326, in unpack
    self.data = self.as_path = self.ASPath(self.data)
  File "/usr/local/lib/python2.7/dist-packages/dpkt/dpkt.py", line 89, in __init__
    self.unpack(args[0])
  File "/usr/local/lib/python2.7/dist-packages/dpkt/bgp.py", line 376, in unpack
    seg = self.ASPathSegment(self.data)
  File "/usr/local/lib/python2.7/dist-packages/dpkt/dpkt.py", line 94, in __init__
    (self.__class__.__name__, args[0]))
dpkt.dpkt.UnpackError: invalid ASPathSegment: '\x1d\xf6\x00\x00\x1d\xf6\x00\x00\x1d\xf6\x00\x00F\xe0'

It seems to be a format issue. I searched for "sample.dump.gz" and found it here. With that file, the script runs fine:

(999, 'BGP messages in the MRT dump')

Any insight into what is happening here? None of the trace files I downloaded are readable, and I have no idea how to parse the files from the repository I found.

Many thanks!


Solution

  • This is currently a bug in the dpkt library. There is an open issue in the official repository, but it dates from 2015. The problem is that the BGP Update parser treats the AS numbers in the AS path as 2-byte (2-octet) AS numbers, even though they are encoded as 4-byte (4-octet) AS numbers. So when it reaches the beginning of a 4-byte-encoded AS path of length two:

    \x00\x00\xab\xcd   \x00\x00\x12\x34
    

    it tries to read two 2-byte AS numbers and then stops. So instead of 43981 4660 it reads 0 43981 and misinterprets the remaining bytes.

    There is currently no quick fix, as the problem is quite tricky. To know how an AS path is encoded, one would have to look at the capabilities that were negotiated in the BGP Open message. I'm not sure how other parsers handle this.

    You could bump the issue in the repo or try an alternative library like mrtparse.
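
To make the misparse described above concrete, here is a minimal standalone sketch. It does not use dpkt; the helper names `decode_as_path_segment` and `decode_as_path` are made up for illustration. It assumes the usual AS_PATH segment layout (a 1-byte segment type, a 1-byte AS count, then the AS numbers) and decodes the same bytes once with 4-byte and once with 2-byte AS numbers:

```python
import struct


def decode_as_path_segment(data, asn_size):
    """Decode one AS_PATH segment (hypothetical helper, not dpkt's API).

    Layout assumed: 1-byte segment type, 1-byte AS count, then `count`
    AS numbers of `asn_size` bytes each (2 or 4), in network byte order.
    Returns (segment_type, list_of_asns, remaining_bytes).
    """
    if len(data) < 2:
        raise ValueError("truncated segment header")
    seg_type, count = struct.unpack_from("!BB", data, 0)
    needed = 2 + count * asn_size
    if len(data) < needed:
        raise ValueError("segment claims %d ASNs but data is too short" % count)
    code = "H" if asn_size == 2 else "I"
    asns = list(struct.unpack_from("!%d%s" % (count, code), data, 2))
    return seg_type, asns, data[needed:]


def decode_as_path(data, asn_size):
    """Decode consecutive segments until the attribute data is used up."""
    segments = []
    while data:
        seg_type, asns, data = decode_as_path_segment(data, asn_size)
        segments.append((seg_type, asns))
    return segments


# One AS_SEQUENCE segment (type 2) holding two AS numbers, 4 bytes each.
segment = b"\x02\x02" + b"\x00\x00\xab\xcd" + b"\x00\x00\x12\x34"

# Correct width: both AS numbers are recovered.
print(decode_as_path(segment, 4))

# Wrong width: the first segment yields 0 and 43981, and the leftover
# bytes are then misread as further segment headers until decoding fails.
print(decode_as_path_segment(segment, 2))
try:
    decode_as_path(segment, 2)
except ValueError as exc:
    print("2-byte decoding failed: %s" % exc)
```

With the 2-byte width, the leftover bytes end up being treated as a new segment header that claims far more AS numbers than there is data for, which is the same kind of failure as the `invalid ASPathSegment` error in the question. As far as I can tell from the MRT specification (RFC 6396), MRT records also carry a subtype such as BGP4MP_MESSAGE_AS4 that indicates 4-byte AS numbers, so a parser reading MRT files could in principle pick the width from the record subtype instead of from the Open message; I have not checked how individual libraries do this.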