Tags: sqlite, firefox-addon, blob, indexeddb, firefox-addon-webextensions

Parsing FB-Purity's Firefox idb (Indexed Database API) object_data blob from Linux bash


From a Linux bash script, I want to read the structured data stored by a particular Firefox add-on called FB-Purity.

I have found a folder called .mozilla/firefox/b8eab5j0.default/storage/default/moz-extension+++37a9788c-671d-4cae-ba5c-fbdb8788499a^userContextId=4294967295/ whose .metadata file contains the string moz-extension://37a9788c-671d-4cae-ba5c-fbdb8788499a. Opening that URL in Firefox shows the add-on's details, so I am pretty sure that this folder belongs to the add-on.

That folder contains an idb directory, which sounds like the Indexed Database API, a W3C standard that Firefox has apparently used since last year to store add-on data.

The idb folder only contains an empty folder and an SQLite file.

Unfortunately, the SQLite file does not contain much structured application data, but its object_data table contains a 95 KB blob that probably holds the real structured data:

INSERT INTO `object_data` VALUES (1,'0pmegsjfoetupsf.742612367',NULL,NULL,
X'e08b0d0403000101c0f1ffe5a201000400ffff7b00220032003100380035003000320022003a002
2005300610074006f0072007500200055007205105861006e00690022002c00220036003100350036
[... 95KB ...]
00780022007d00000000000000');

Question: Any clue what this blob's format is? How can I extract it (using the command line or any library or Linux tool) to JSON or any other readable format?


Solution

  • Well, I had a fun day today figuring this out and ended up creating a Python tool that can read the data from these IndexedDB database files and print it (and maybe more at some point): moz-idb-edit

    To answer the technical parts of the question first:

    • Both the key (name) and the data (value) use Mozilla-proprietary formats whose only documentation, at this time, appears to be the source code.
    • The keys use a purpose-built encoding whose rough description is available in mozilla-central/dom/indexedDB/Key.cpp; that file also contains the only known implementation. Its main advantage appears to be that it is relatively compact, supports every key type websites may throw at it, and sorts correctly under plain binary comparison.
    • The values are stored using SpiderMonkey's internal StructuredClone representation, which is also used when moving values between processes in the browser. Again, there are no docs to speak of, but the source code is fortunately quite easy to understand. Before being added to the database, however, the generated binary is compressed on the fly using Google's Snappy compression, which “does not aim for maximum compression [but instead …] aims for very high speeds and reasonable compression”; probably not a bad idea considering that we're dealing with wasteful web content here.
    • To locate the correct IndexedDB file for an extension's local storage data, one needs to resolve the extension's static ID to a so-called “internal UUID”, whose value is different in every browser profile (to make tracking based on installed add-ons a lot harder). The mapping table for this is stored as a pref (“extensions.webextensions.uuids”) in prefs.js. The IDB path is then ${MOZ_PROFILE}/storage/default/moz-extension+++${EXT_UUID}^userContextId=4294967295/idb/3647222921wleabcEoxlt-eengsairo.sqlite
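
The path lookup described in the last point can be sketched in a few lines of Python. This is only an illustration, not how moz-idb-edit does it: it assumes the pref appears in prefs.js as a single user_pref(...) line whose value is a quoted string containing JSON, and the function names (parse_uuid_map, idb_path) are made up for this sketch.

```python
import json
import re
from pathlib import Path

# File name quoted in the answer above; assumed to be the same for every
# extension's storage.local database.
IDB_FILE = "3647222921wleabcEoxlt-eengsairo.sqlite"

# Assumption: the pref sits on one line, as
# user_pref("extensions.webextensions.uuids", "<JSON inside a string>");
UUIDS_PREF = re.compile(
    r'^user_pref\("extensions\.webextensions\.uuids",\s*(".*")\);\s*$',
    re.MULTILINE,
)


def parse_uuid_map(prefs_text: str) -> dict:
    """Return the {static extension ID: internal UUID} table from prefs.js text."""
    match = UUIDS_PREF.search(prefs_text)
    if match is None:
        raise LookupError("extensions.webextensions.uuids not found")
    # The pref value is a quoted string that itself contains JSON, so it is
    # decoded twice: the string literal first, then the object inside it.
    return json.loads(json.loads(match.group(1)))


def idb_path(profile: Path, ext_id: str) -> Path:
    """Build the path of the extension's IndexedDB file inside a profile."""
    uuids = parse_uuid_map((profile / "prefs.js").read_text(encoding="utf-8"))
    origin = f"moz-extension+++{uuids[ext_id]}^userContextId=4294967295"
    return profile / "storage" / "default" / origin / "idb" / IDB_FILE
```

Calling idb_path(Path.home() / ".mozilla/firefox/b8eab5j0.default", ext_id) with the add-on's static ID should then yield the SQLite file from the question, provided the assumptions about the prefs.js layout hold.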

    For all practical purposes, you can read the value of a single storage key of any extension by downloading the project mentioned above. Basic usage is:

    $ ./moz-idb-edit --extension "${EXT_ID}" --profile "${MOZ_PROFILE}" "${STORAGE_KEY}"
    

    Where ${EXT_ID} is the extension's static ID (check its manifest.json file or look in about:support#extensions-tbody if you're unsure), ${MOZ_PROFILE} is the Firefox profile directory (also in about:support) and ${STORAGE_KEY} is the name of the key you'd like to query (unfortunately querying all keys is not supported yet).
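
To find out which key names exist in the first place, the key column of object_data can be read directly with Python's sqlite3 module. The sketch below is not part of moz-idb-edit. It rests on my reading of Key.cpp, so treat both points as assumptions: a string key starts with the type byte 0x30, and each plain ASCII character is stored shifted up by one. Keys with non-ASCII characters use a multi-byte form that is deliberately not handled here, and the function names are invented for this example.

```python
import sqlite3

STRING_MARKER = 0x30  # assumed type byte that introduces a string key


def decode_ascii_key(raw) -> str:
    """Undo the assumed +1 shift of an ASCII-only IndexedDB string key."""
    # sqlite3 returns bytes for BLOB columns and str for TEXT ones.
    data = raw.encode("latin-1") if isinstance(raw, str) else bytes(raw)
    if not data or data[0] != STRING_MARKER:
        raise ValueError("not a string key")
    body = data[1:]
    if any(b == 0 or b >= 0x80 for b in body):
        raise ValueError("non-ASCII key; the multi-byte form is not handled")
    return bytes(b - 1 for b in body).decode("ascii")


def list_storage_keys(db_path: str) -> list:
    """Return the decoded key of every row in the object_data table."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute("SELECT * FROM object_data").fetchall()
    finally:
        con.close()
    # Column order follows the INSERT statement quoted in the question:
    # the encoded key is the second value of each row.
    return [decode_ascii_key(row[1]) for row in rows]
```

Applied to the key from the question, decode_ascii_key(b"0pmegsjfoetupsf.742612367") gives "oldfriendstore-631501256", which looks like a plausible storage key name. It is safer to run list_storage_keys on a copy of the SQLite file, since Firefox may hold a lock on the original while it is running.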

    Writing data is not supported yet either.

    I'll update this answer as I implement more features (or drop me an issue on the project page!).