java algorithm encryption web-scraping restful-url

League of Legends Read Chunks/Keyframes through its RESTful API

I am planning to do game data mining on League of Legends (LoL), but I am stuck at parsing replay files. The most popular replay recorder appears to be LOL Replay, which records games as binary .lrf files. I dumped an .lrf file to look for patterns in it. As far as I can tell, the file has two parts:

The first part is metadata and is human-readable. At its end, it contains an encryption key (32 bytes) and a client hash for this .lrf file.

The second part has several sections. Each section appears to follow a "RESTful URL + encrypted data + (possibly) padding" layout. For example:

?S4GI____GET /observer-mode/rest/consumer/getGameDataChunk/EUW1/1390319411/1/token
?S4GH____?￥?G??,\??1?q??"Lq}?n??&??????l??(?^P???￥I?v??k>x??Z?￡??3Gug
......
??6GI____GET /observer-mode/rest/consumer/getGameDataChunk/EUW1/1390319411/2/token

Some of the characters are not even printable.

I have followed this link and this wiki. It seems that the contents are compressed with GZIP and then encrypted with Blowfish in ECB mode with PKCS5Padding. However, I failed to decrypt the contents using the 32-byte encryption key from the metadata. I am also not sure where to start and stop reading, because the JVM keeps throwing "Given final block not properly padded".

So my question is:

Is anyone familiar with Blowfish and PKCS5Padding? Which part of the binary data between two consecutive RESTful URLs should I read for decryption? Am I using the right key (the 32-byte encryption key in the metadata)?
Given the patterns around each RESTful URL, can anyone guess exactly which algorithm LoL uses to encrypt/decrypt the contents? Is it Blowfish?

Any help would be appreciated. Thank you guys.

Edit @6.17:

Following Divis and avbor's answers, I tried the following Java snippet to decode chunks:

    // Decode EncryptKey with GameId
    byte[] gameIdBytes = ("502719605").getBytes();
    SecretKeySpec gameIdKeySpec = new SecretKeySpec(gameIdBytes, "Blowfish");
    Cipher gameIdCipher = Cipher.getInstance("Blowfish/ECB/PKCS5Padding");
    gameIdCipher.init(Cipher.DECRYPT_MODE, gameIdKeySpec);
    byte[] encryptKeyBytes = Base64.decode("Sf9c+zGDyyST9DtcHn2zToscfeuN4u3/");
    byte[] encryptkeyDecryptedByGameId = gameIdCipher.doFinal(encryptKeyBytes);

    // Initialize the chunk cipher
    SecretKeySpec chunkSpec = new SecretKeySpec(encryptkeyDecryptedByGameId, "Blowfish");
    Cipher chunkCipher = Cipher.getInstance("Blowfish/ECB/PKCS5Padding");
    chunkCipher.init(Cipher.DECRYPT_MODE, chunkSpec);

    byte[] chunkContent = getChunkContent();
    byte[] chunkDecryptedBytes = chunkCipher.doFinal(chunkContent);

It runs without error when decrypting the encryption key with the game ID. However, the last two lines fail. For now I have hard-coded getChunkContent() to return a byte array containing the bytes between two RESTful URLs. But Java either throws "Exception in thread "main" javax.crypto.IllegalBlockSizeException: Input length must be multiple of 8 when decrypting with padded cipher"

or throws "Exception in thread "main" javax.crypto.BadPaddingException: Given final block not properly padded".

I notice that the hex pattern between two RESTful URLs is as follows: (hex for first URL, e.g. /observer-mode/rest/consumer/getKeyFrame/EUW1/502719605/2/token) + 0a + (chunk contents) + 000000 + (hex for next URL)

My questions are:

Which part of the chunk needs to be included? Do I need to include the "0a" right after the previous URL? Do I need to include the "000000" before the next URL?
Am I using the right cipher transformation (Blowfish/ECB/PKCS5Padding)?

My test .lrf file can be downloaded from: https://www.dropbox.com/s/yl1havphnb3z86d/game1.lrf

EDIT @ 6.18

Thanks to Divis! Using the snippet above, I successfully decrypted some chunk data without error. Two things are worth noting when you write your own getChunkContent():

The chunk content starts right after "hex for previous url 0a".
The chunk content ends as close as possible to "000000 (hex for next url)", at the point where its size is exactly a multiple of 8.
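The two rules above can be sketched as a small helper. This is only an illustration of the heuristic described here, not a confirmed description of the .lrf layout: the class `ChunkSliceSketch`, the method names, and the idea of passing explicit offsets are all assumptions made for the example.

```java
import java.util.Arrays;

class ChunkSliceSketch {

    // Naive byte-pattern search; returns the index of the first match at or after 'from', or -1.
    static int indexOf(byte[] haystack, byte[] needle, int from) {
        outer:
        for (int i = Math.max(from, 0); i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // urlEnd:       index just past the last byte of the previous URL (where the 0x0a sits).
    // nextUrlStart: index of the first byte of the next URL.
    // Assumption: the ciphertext starts right after the 0x0a and is the largest
    // multiple of 8 bytes that fits before the next URL (which drops the "000000" trailer).
    static byte[] sliceChunk(byte[] file, int urlEnd, int nextUrlStart) {
        int start = urlEnd + 1; // skip the 0x0a separator
        if (start > nextUrlStart || nextUrlStart > file.length) {
            throw new IllegalArgumentException("offsets out of order");
        }
        int length = ((nextUrlStart - start) / 8) * 8; // round down to the Blowfish block size
        return Arrays.copyOfRange(file, start, start + length);
    }
}
```

The rounding step is what avoids the IllegalBlockSizeException: Blowfish works on 8-byte blocks, so any slice handed to the cipher must have a length that is a multiple of 8.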

But I still have two questions:

Here is an example of what I decoded from the content between two .../getKeyframe/... RESTful URLs.
```
39117e0cc2f7e4bb1f8b080000000000000bed7d0b5c15d5 ... 7f23a90000
```
I know gzip-compressed data starts with "1f8b08..." according to this RFC doc. Can I just discard "39117e0cc2f7e4bb" and start gzip-decompressing the content that follows? (Actually, I have already tried decoding from "1f8b08...", and at least it decompresses without error.)
After the gzip decompression, the result is still a long sequence of binary data (with some readable strings such as summoner names, champion names, etc.). Looking at the wiki, the format documentation seems far from finished. What I expect is to read every item, rune, or movement as a readable string. How exactly can I read those game events from it? Or do we just need some patience to figure them out ourselves with the community?
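For reference, the "skip to 1f8b08" attempt mentioned above could look roughly like the following. It is a minimal sketch that assumes the decrypted bytes contain exactly one gzip stream and that the first occurrence of the bytes 1f 8b 08 is its real header; the class and method names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

class GzipOffsetSketch {

    // Returns the index of the first "1f 8b 08" sequence (gzip magic + deflate method), or -1.
    static int findGzipHeader(byte[] data) {
        for (int i = 0; i + 2 < data.length; i++) {
            if (data[i] == 0x1f && data[i + 1] == (byte) 0x8b && data[i + 2] == 0x08) {
                return i;
            }
        }
        return -1;
    }

    // Decompresses the gzip stream that starts at 'offset', ignoring everything before it.
    static byte[] gunzipFrom(byte[] data, int offset) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(
                     new ByteArrayInputStream(data, offset, data.length - offset));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        }
    }
}
```

A false match is possible if 1f 8b 08 happens to occur inside the discarded prefix, so the offset found this way should be treated as a guess until the decompressed output looks sane.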

Millions of thanks!

Solution

Repository dev contributor here. According to the wiki, the key is the Base64-encoded, Blowfish ECB-encrypted "encryption_key" (with the game ID as the Blowfish key).

Then, use this decrypted key to decrypt the content (Blowfish ECB too). Finally, gzip-decode the result.

base64decode encryptionkey = decodedKey
blowfishECBdecode decodedKey with (string) gameId as key = decodedKey

blowfishECBdecode content with decodedKey as key = decodedContent
gzipdecode decodedContent = binary
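In Java, the steps above might be sketched as follows. This is a hedged illustration rather than the library's actual code: the class `ReplayDecryptSketch` and its method names are invented for the example, `java.util.Base64` (Java 8+) stands in for whatever Base64 helper you use, and PKCS5Padding is assumed because that is what the question's snippet uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

class ReplayDecryptSketch {

    // Runs Blowfish/ECB/PKCS5Padding; 'mode' is Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE.
    static byte[] blowfish(int mode, byte[] key, byte[] data) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("Blowfish/ECB/PKCS5Padding");
        cipher.init(mode, new SecretKeySpec(key, "Blowfish"));
        return cipher.doFinal(data);
    }

    // Steps 1 and 2: Base64-decode the encryption key, then decrypt it with the game ID string.
    static byte[] decodeChunkKey(String base64Key, String gameId) throws GeneralSecurityException {
        byte[] encryptedKey = Base64.getDecoder().decode(base64Key);
        byte[] gameIdKey = gameId.getBytes(StandardCharsets.US_ASCII);
        return blowfish(Cipher.DECRYPT_MODE, gameIdKey, encryptedKey);
    }

    // Steps 3 and 4: decrypt the chunk with the decoded key, then gunzip the result.
    static byte[] decodeChunk(byte[] chunkKey, byte[] encryptedChunk)
            throws GeneralSecurityException, IOException {
        byte[] gzipped = blowfish(Cipher.DECRYPT_MODE, chunkKey, encryptedChunk);
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        }
    }
}
```

With the values from the question, the calls would be `decodeChunkKey("Sf9c+zGDyyST9DtcHn2zToscfeuN4u3/", "502719605")` followed by `decodeChunk(key, chunkBytes)`, where `chunkBytes` is a correctly sliced chunk whose length is a multiple of 8.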

I made a library to download and decode replay files: https://github.com/EloGank/lol-replay-downloader and a CLI command is also available: https://github.com/EloGank/lol-replay-downloader-cli
Hope it'll help :)