
Parsing protocol buffers without a .proto file


I am reverse-engineering an Android app as part of a security project. My first step is to work out the protocol used between the app and its server, and I have found that it is protocol buffers. As I understand it, the original .proto file is needed to deserialize a protobuf-encoded message. Since I don't have it, I ran protod against the app to recover any .proto files embedded in it.

I have the app unpacked into a set of .smali and .so files. Running protod against the .so files yields only one .proto file: google/protobuf/descriptor.proto.

I was under the impression that users of protocol buffers write their own .proto files, which might reference google/protobuf/descriptor.proto, but according to protod, google/protobuf/descriptor.proto is the only .proto file used by the app. Is this actually possible, and is google/protobuf/descriptor.proto enough for me to deserialize the messages between the app and the server?


Solution

  • When you write a .proto file, you can set the optimize_for option to LITE_RUNTIME, which omits the descriptors from the generated code to reduce the size of the binary. I believe this is common practice in mobile development, where code size is at a premium. This may explain why you found only a single .proto file. It is unlikely that the app actually transfers any data using descriptor.proto, since that file is mostly an implementation detail of the protocol buffers library.

    If you cannot find any other descriptors, your best bet may be to interpret the messages without them. The protocol buffers wire format is described in the official encoding documentation. An easy way to get started is to define a proto2 message type with no fields and parse the data as that type. You can then use the reflection API to examine the message's "unknown fields" and try to work out what they represent.
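
To make the schema-less approach concrete, here is a minimal sketch of a raw wire-format decoder in plain Python. It does not use the protobuf library or its reflection API; it simply walks the tag/value pairs directly, which gives roughly the same information as inspecting unknown fields. The function names (read_varint, decode_raw, try_nested) and the sample bytes are made up for illustration, and the deprecated group wire types (3 and 4) are deliberately not handled.

```python
import struct


def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # high bit clear: last byte of the varint
            return result, pos
        shift += 7
        if shift >= 70:              # a valid varint is at most 10 bytes
            raise ValueError("varint too long")


def decode_raw(buf):
    """Return a list of (field_number, wire_type, value) for one message."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if field_number == 0:
            raise ValueError("field number 0 is not valid")
        if wire_type == 0:           # varint: int32/int64/uint/sint/bool/enum
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:         # 64-bit: fixed64/sfixed64/double
            if pos + 8 > len(buf):
                raise ValueError("truncated 64-bit value")
            value = struct.unpack_from("<Q", buf, pos)[0]
            pos += 8
        elif wire_type == 2:         # length-delimited: string/bytes/submessage
            length, pos = read_varint(buf, pos)
            if pos + length > len(buf):
                raise ValueError("truncated length-delimited value")
            value = bytes(buf[pos:pos + length])
            pos += length
        elif wire_type == 5:         # 32-bit: fixed32/sfixed32/float
            if pos + 4 > len(buf):
                raise ValueError("truncated 32-bit value")
            value = struct.unpack_from("<I", buf, pos)[0]
            pos += 4
        else:                        # groups (3, 4) are not handled here
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields


def try_nested(value):
    """Guess whether a length-delimited value is itself a message."""
    try:
        return decode_raw(value)
    except ValueError:
        return None


# Hypothetical sample: field 1 = varint, field 2 = string, field 3 = submessage.
sample = bytes.fromhex("08 96 01  12 07 74 65 73 74 69 6e 67  1a 03 08 96 01")

if __name__ == "__main__":
    for number, wire_type, value in decode_raw(sample):
        print(number, wire_type, value)
```

The wire format does not say whether a length-delimited field holds a string, raw bytes, a nested message, or a packed repeated field, so try_nested is only a heuristic: a value that happens to parse cleanly may still be plain bytes. Likewise, a varint could be a signed, unsigned, or zigzag-encoded integer, and you will need to compare several captured messages to decide. If you have protoc installed, `protoc --decode_raw < message.bin` should give a similar dump without writing any code.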