It is unclear to me, what is a correct .tar
file format, as I am experiencing proper functionality with three scenarios (see below).
Based on .tar
specification I have been working with, the magic
field (ustar) is null-terminated character string and version
field is octal number with no trailing nulls.
However I've review several .tar
files I found on my server and I found different implementation of magic
and version
field and all three of them seems to work properly, probably because system ignore those fields.
See different (3) bytes between words ustar and root in the following examples >>
Scenario 1 (20 20 00
):
000000F0 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
000000FC 00 00 00 00 | 00 75 73 74 | 61 72 20 20 .....ustar
00000108 00 72 6F 6F | 74 00 00 00 | 00 00 00 00 .root.......
00000114 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
Scenario 2 (00 20 20
):
000000F0 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
000000FC 00 00 00 00 | 00 75 73 74 | 61 72 00 20 .....ustar.
00000108 20 72 6F 6F | 74 00 00 00 | 00 00 00 00 root.......
00000114 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
Scenario 3 (00 00 00
):
000000F0 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
000000FC 00 00 00 00 | 00 75 73 74 | 61 72 00 00 .....ustar..
00000108 00 72 6F 6F | 74 00 00 00 | 00 00 00 00 .root.......
00000114 00 00 00 00 | 00 00 00 00 | 00 00 00 00 ............
Which one is the correct format?
In my opinion none of your examples is the correct one, at least not for the POSIX format.
As you can read here:
/* tar Header Block, from POSIX 1003.1-1990. */
/* POSIX header */
struct posix_header { /* byte offset */
char name[100]; /* 0 */
char mode[8]; /* 100 */
char uid[8]; /* 108 */
char gid[8]; /* 116 */
char size[12]; /* 124 */
char mtime[12]; /* 136 */
char chksum[8]; /* 148 */
char typeflag; /* 156 */
char linkname[100]; /* 157 */
char magic[6]; /* 257 */
char version[2]; /* 263 */
char uname[32]; /* 265 */
char gname[32]; /* 297 */
char devmajor[8]; /* 329 */
char devminor[8]; /* 337 */
char prefix[155]; /* 345 */
};
#define TMAGIC "ustar" /* ustar and a null */
#define TMAGLEN 6
#define TVERSION "00" /* 00 and no null */
#define TVERSLEN 2
The format of your first example (Scenario 1
) seems to be matching with the old GNU header format:
/* OLDGNU_MAGIC uses both magic and version fields, which are contiguous.
Found in an archive, it indicates an old GNU header format, which will be
hopefully become obsolescent. With OLDGNU_MAGIC, uname and gname are
valid, though the header is not truly POSIX conforming */
#define OLDGNU_MAGIC "ustar " /* 7 chars and a null */
In both your second and third examples (Scenario 2
and Scenario 3
), the version
field is set to an unexpected value (according to the above documentation, the correct value should be 00
ASCII or 0x30 0x30
hex), so this field is most likely ignored.