We are parsing pcap files created with the tcpdump command. From these pcap files we are trying to extract the GET request information in the Raw layer and print it in a readable format.
pkts = rdpcap(filename)
for pkt in pkts:
    if Raw in pkt:
        raw_test = pkt[Raw].load
        if "GET" in raw_test:
            # do stuff
The resulting text of raw_test comes out looking like this:
▒פ▒▒▒▒▒▒2▒nk▒N▒▒bEr▒▒(|▒▒▒▒Ǫ=▒▒Ih▒H+%▒2.▒L[▒▒▒sl▒E▒▒▒k6▒]=މf▒d▒O▒hB{6s▒▒▒7O2!PCG&▒A.4I▒耓▒X▒▒▒W]▒▒M5@▒▒▒vK▒#Ċ▒ ▒▒▒m]Zb_▒8▒▒▒nb~
]▒h▒6▒.̠▒49ؾG?▒▒▒4▒Ӹ▒▒G▒▒́G▒:Y▒▒▒▒.▒8▒▒d▒i4▒JAC)▒▒AO▒k▒z-▒▒S30▒X?▒▒W5B▒yW▒m▒▒▒/ƈ:G▒▒▒E▒▒<▒▒▒m▒]▒▒▒▒t▒:▒▒▒Ŕ▒W▒▒D▒E▒▒▒▒▒࿄▒▒zZ▒▒x▒]▒▒{{▒▒u▒){▒▒o▒▒G▒F▒▒▒▒▒v
▒▒▒b.
We have also tried formatting it via pkt.sprintf("{Raw:%Raw.load%}\n"), but that yielded the same output.
P.S. Please do not link us to other related stack posts/questions as we have come across many of them already, and none of them seem to fix our problem.
Thank you in advance; any help is greatly appreciated!
Please try this. I assume the HTTP traffic is going to port 80:
# Inside the "for pkt in pkts:" loop; the Raw check skips segments with no payload
if TCP in pkt and Raw in pkt and pkt[TCP].dport == 80 \
        and pkt[Raw].load.startswith(b"GET"):
    print(pkt[Raw].load)
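For a more readable printout, here is a slightly fuller sketch of the same idea. My guess (I can't confirm it from your capture) is that a test for "GET" anywhere in the payload also matches encrypted or compressed data that merely contains those bytes, so this version only accepts payloads that start with "GET " and prints just the request line and headers. The names format_get_request and print_get_requests are made up for this example, and port 80 is still an assumption.

```python
def format_get_request(payload):
    """Return the header block of an HTTP GET request as text, or None.

    `payload` is the raw TCP payload as bytes (e.g. pkt[Raw].load).
    """
    # Match at the start only: "GET" somewhere inside binary data is not a request.
    if not payload.startswith(b"GET "):
        return None
    # Headers end at the first blank line; anything after it is ignored.
    head = payload.split(b"\r\n\r\n", 1)[0]
    # Decode as ASCII and replace any byte that is not valid ASCII.
    text = head.decode("ascii", errors="replace")
    return "\n".join(text.split("\r\n"))


def print_get_requests(filename, port=80):
    """Print every GET request sent to `port` in the capture `filename`."""
    # Imported here so format_get_request stays usable without scapy installed.
    from scapy.all import rdpcap, Raw, TCP

    for pkt in rdpcap(filename):
        if TCP in pkt and Raw in pkt and pkt[TCP].dport == port:
            request = format_get_request(pkt[Raw].load)
            if request is not None:
                print(request)
                print("-" * 40)
```

Call it as print_get_requests("capture.pcap"), where the file name is a placeholder for your own capture. If nothing is printed, the requests are probably not plain HTTP on that port (for example HTTPS on 443), and no amount of formatting will make that payload readable.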