Tags: xml, wireshark, tshark, pyshark

How to access the text representation of an XML payload contained in a sniffed HTTP packet in pyshark?


I need to reverse engineer the XML-based communication between an application and a server.

In Wireshark there is an option to export the raw text of an HTTP packet's XML payload to a text file or to the clipboard.

I'd like to achieve the same in pyshark in order to log all XML communication programmatically.

Below is what I have so far. I cannot figure out how to access the unparsed text of the packet's XML payload; I can only access the parsed XML layer or call pretty_print() on it.

How can I access the unparsed XML in pyshark?

import pyshark

filtered_cap2 = pyshark.LiveCapture(interface=['4'], bpf_filter='tcp port 80')

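for packet in filtered_cap2.sniff_continuously(packet_count=500):
    try:
        # Prints the dissected XML layer, not the original payload text
        packet.xml.pretty_print()
    except AttributeError:
        # Packet has no XML layer; skip it
        pass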

Solution

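  • Use packet.http.file_data, which holds the body of the HTTP message as dissected by tshark. You can also try packet.http.file_data.raw_value, which exposes the same field as a hex string.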
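
To make this concrete, here is a minimal sketch of how the question's capture loop could log the body text. The helper names (decode_raw_value, http_body_text, log_xml_payloads) are hypothetical and not part of pyshark. The sketch assumes that raw_value is a hex string, that the payload is UTF-8, and that stripping colons is a harmless precaution in case the hex arrives colon-separated.

```python
import binascii


def decode_raw_value(raw_value, encoding="utf-8"):
    """Turn a hex string such as '3c613e' into text ('<a>')."""
    # Assumption: raw_value is plain hex; colons are stripped defensively.
    hex_digits = raw_value.replace(":", "")
    return binascii.unhexlify(hex_digits).decode(encoding, errors="replace")


def http_body_text(packet, encoding="utf-8"):
    """Return the unparsed HTTP body of a packet as text, or None if absent."""
    http_layer = getattr(packet, "http", None)
    file_data = getattr(http_layer, "file_data", None)
    if file_data is None:
        return None  # no HTTP layer, or an HTTP message without a body
    raw_value = getattr(file_data, "raw_value", None)
    if raw_value:
        return decode_raw_value(raw_value, encoding)
    # Fall back to the field's display string when no raw value is present.
    return str(file_data)


def log_xml_payloads(interface="4", packet_count=500):
    """Hypothetical driver: print the body of each sniffed HTTP packet."""
    # Imported lazily so the helpers above work without pyshark/tshark.
    import pyshark

    capture = pyshark.LiveCapture(interface=interface,
                                  bpf_filter="tcp port 80")
    for packet in capture.sniff_continuously(packet_count=packet_count):
        body = http_body_text(packet)
        if body is not None:
            print(body)
```

Calling log_xml_payloads() with the interface from the question should print each body as it arrives. If the output looks garbled, the payload may be compressed or use another character encoding, in which case the encoding argument or the decoding step would need adjusting.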