Identify data structure from unknown file (hexdump)


I have a bunch of FOUND.000/FILEnnnn.CHK files that share a common data pattern. I would like to know what kind of file this is and which data fields it contains (that is, how to interpret it).

So far, I have established the following:

The structure consists of two parts. One of them is always the same:

Offset      0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F

00000000   14 CF EE EB FC E7 EE E2  E0 F2 E5 EB FC 20 57 69    Пользователь Wi
00000010   6E 64 6F 77 73 00 00 00  00 00 00 00 00 00 00 00   ndows           
00000020   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00                   
00000030   00 00 00 00 00 00 14 00  1F 04 3E 04 3B 04 4C 04           П о л ь 
00000040   37 04 3E 04 32 04 30 04  42 04 35 04 3B 04 4C 04   з о в а т е л ь 
00000050   20 00 57 00 69 00 6E 00  64 00 6F 00 77 00 73 00     W i n d o w s 

The [0x01..0x14] field is text in Windows-1251 encoding. It says "Windows User" in Russian.

Apparently, the byte at [0x00] is the length of that text field. I can confirm this because I have a .CHK file with a text of a different length, and the first byte still matches its length.

Also, [0x38..0x5f] contains the same Russian string, but this time encoded as UTF-16 LE. The length of this string in characters is stored in [0x36..0x37] (little endian).
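
The layout of the first part, as described so far, can be sketched in Python. This is only an illustration of the offsets observed above: the function name `parse_first_part` and the dictionary keys are made up for this example, the `cp1251` default comes from this particular sample, and it is an assumption that [0x36..0x37] counts UTF-16 code units.

```python
import struct


def parse_first_part(data: bytes, codepage: str = "cp1251") -> dict:
    """Decode the fixed first part of the record (hypothetical helper)."""
    if len(data) < 0x38:
        raise ValueError("buffer is too short to hold the first part")

    # [0x00]: length in bytes of the single-byte (ANSI) string.
    ansi_len = data[0x00]
    # [0x01..]: the ANSI string itself, decoded with the assumed code page.
    ansi_name = data[0x01:0x01 + ansi_len].decode(codepage)

    # [0x36..0x37]: little-endian 16-bit length of the UTF-16 string,
    # assumed here to be counted in 16-bit code units.
    (wide_len,) = struct.unpack_from("<H", data, 0x36)
    # [0x38..]: the UTF-16 LE string, two bytes per code unit.
    wide_name = data[0x38:0x38 + 2 * wide_len].decode("utf-16-le")

    return {
        "ansi_len": ansi_len,
        "ansi_name": ansi_name,
        "wide_len": wide_len,
        "wide_name": wide_name,
    }


# Rebuild the 0x60 bytes shown in the first hexdump and parse them.
name = "Пользователь Windows"
sample = (bytes([len(name)]) + name.encode("cp1251")).ljust(0x36, b"\x00")
sample += struct.pack("<H", len(name)) + name.encode("utf-16-le")

print(parse_first_part(sample))
```

On a real file, the same function could be fed the first 0x60 bytes read in binary mode.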

The second part differs from file to file. Sometimes it holds strings along with some binary values. Sometimes the content is purely binary and its meaning is unclear. Examples:

ANSI string (environment variable?):

Offset      0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F

00000060   00 00 00 00 00 00 3F 96  90 A1 00 10 00 8E 55 53         ?– Ў   ЋUS
00000070   45 52 44 4F 4D 41 49 4E  5F 52 4F 41 4D 49 4E 47   ERDOMAIN_ROAMING
00000080   50 52 4F 46 49 4C 45 3D  44 45 53 4B 54 4F 50 2D   PROFILE=DESKTOP-
00000090   4B 35 42 44 44 35 42 00  00 00 00 00 00 00 36 96   K5BDD5B       6–
000000A0   99 A1 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ™Ў              
000000B0   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00                   

ANSI (codepage 1251) string (environment variable?):

Offset      0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F

00000060   00 00 00 00 00 00 C8 7A  4A 43 00 09 00 8E 4C 4F         ИzJC   ЋLO
00000070   43 41 4C 41 50 50 44 41  54 41 3D 43 3A 5C 55 73   CALAPPDATA=C:\Us
00000080   65 72 73 5C E0 E4 EC E8  ED 5C 41 70 70 44 61 74   ers\админ\AppDat
00000090   61 5C 4C 6F 63 61 6C 00  00 00 00 00 00 00 C7 7A   a\Local       Зz
000000A0   45 43 00 00 00 00 00 00  00 00 00 00 00 00 00 00   EC              
000000B0   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00                   

UTF-16 LE string (font name):

Offset      0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F

00000060   00 00 00 00 00 00 D1 7B  E5 14 00 02 00 8C 15 00         С{е    Њ  
00000070   42 00 61 00 68 00 6E 00  73 00 63 00 68 00 72 00   B a h n s c h r 
00000080   69 00 66 00 74 00 20 00  53 00 65 00 6D 00 69 00   i f t   S e m i 
00000090   4C 00 69 00 67 00 68 00  74 00 00 00 00 00 C8 7B   L i g h t     И{
000000A0   EC 14 00 00 00 00 00 00  00 00 00 00 00 00 00 00   м               
000000B0   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00                   

Unclear data (the 12 bytes at [0x72..0x7d] are repeated on the 0x80 and 0x90 lines):

Offset      0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F

00000060   00 00 00 00 00 00 D3 D3  BE C6 00 09 00 88 00 00         УУѕЖ   €  
00000070   00 00 C0 DF 70 64 00 D5  6E 64 80 D6 6E 64 40 00     АЯpd ХndЂЦnd@ 
00000080   00 00 C0 DF 70 64 00 D5  6E 64 80 D6 6E 64 FF FF     АЯpd ХndЂЦndяя
00000090   FF FF C0 DF 70 64 00 D5  6E 64 80 D6 6E 64 DA D3   яяАЯpd ХndЂЦndЪУ
000000A0   B1 C6 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ±Ж              
000000B0   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00                   

A couple more sparsely populated examples:

Offset      0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F

00000060   00 00 00 00 00 00 D8 43  BB CF 00 1E 00 88 00 00         ШC»П   €  
00000070   00 00 78 5A D1 00 00 00  00 00 00 00 00 00 A8 01     xZС         Ё 
00000080   00 00 00 00 10 00 00 00  00 00 00 00 00 00 00 00                   
00000090   00 00 00 00 00 00 00 00  00 00 00 00 00 00 C3 43                 ГC
000000A0   B2 CF 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ІП              
000000B0   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00                   

Offset      0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F

00000060   00 00 00 00 00 00 B7 86  DC 8C 00 1A 00 88 00 00         ·†ЬЊ   €  
00000070   00 00 18 19 70 01 00 00  00 00 00 00 00 00 00 09       p           
00000080   00 00 00 00 40 00 00 00  00 00 00 00 00 00 00 00       @           
00000090   00 00 00 00 00 00 00 00  00 00 00 00 00 00 BC 86                 ј†
000000A0   A5 8C 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ҐЊ              
000000B0   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00                   

The questions are: What kind of file contains such data? What is the purpose of these files? What kind of data is stored in the second part (how should all the fields/bytes be interpreted)?


Solution

  • Figured it out myself.

    It is Microsoft Office's so-called "owner file": the tiny hidden file that is created when you open an MS Office document. It has the same name as the document, but with the beginning of the name replaced with "~$".

    Knowing this, the rest of the answers can be found here: https://superuser.com/questions/405257/what-type-of-file-is-file
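
For anyone with a similar pile of recovered .CHK files, a rough way to pick out the ones that follow this layout is sketched below. It is a heuristic built only on the offsets observed in the question, not on any official format description: the names `owner_name` and `scan_chk_files` are hypothetical, `cp1251` is an assumption tied to this sample, and a file is accepted only if the single-byte and UTF-16 copies of the name decode to the same text.

```python
import struct
from pathlib import Path


def owner_name(data: bytes, codepage: str = "cp1251"):
    """Return the user name if data matches the observed layout, else None."""
    if len(data) < 0x38:
        return None
    ansi_len = data[0x00]
    # The single-byte string has to fit between 0x01 and the length at 0x36.
    if ansi_len == 0 or ansi_len > 0x35:
        return None
    (wide_len,) = struct.unpack_from("<H", data, 0x36)
    wide_end = 0x38 + 2 * wide_len
    if wide_len == 0 or wide_end > len(data):
        return None
    try:
        ansi = data[0x01:0x01 + ansi_len].decode(codepage)
        wide = data[0x38:wide_end].decode("utf-16-le")
    except UnicodeDecodeError:
        return None
    # Heuristic: both copies of the name should agree.
    return wide if ansi == wide else None


def scan_chk_files(folder) -> dict:
    """Map each matching *.CHK file name in folder to the name it stores."""
    hits = {}
    for path in sorted(Path(folder).glob("*.CHK")):
        with path.open("rb") as fh:
            name = owner_name(fh.read(0x100))
        if name is not None:
            hits[path.name] = name
    return hits


if __name__ == "__main__":
    for file_name, user in scan_chk_files("FOUND.000").items():
        print(file_name, user)
```

A name containing characters outside the assumed code page would fail the equality check, so this sketch can miss genuine owner files.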