How serializing foreign keyed table works internally in kdb

I have a keyed table(referenced table) linked using foreign key to the referencing table and I serialize both tables using set operator.

q)kt:([sym:`GOOG`AMZN`FB]; px:20 30 40);
q)`:/Users/uts/db/kt set kt
q)t:([] sym:`kt$5?`GOOG`AMZN`FB; vol:5?10000)
q)`:/Users/uts/db/t set t

Then I remove these tables from the memory

q)delete kt,t from `.

Now I deserialize the table t in memory:

t:get `:/Users/uts/db/t

If I do meta t after this it fails, expecting kt as foreign key. If I print t, as expected it shows index values in column sym of table t.

So, the question arises -

As kdb stores the meta of each table(i.e c,t,f,a) and its corresponding values on disk, how does table t serialization works internally?
How(In which form in binary format) are these values stored in file t.

-rw-r--r-- 1 uts staff 100 Apr 13 23:09 t

Solution

tl;dr A foreign key is stored as a vector of 4-byte indices of a key column of a referenced table plus a name of a table a foreign key refers to.

As far as I know kx never documented their file formats, and yet I think some useful information relevant to your question can be deduced right from a q console session.

Let me modify your example a bit to make things simpler.

q)show kt:([sym:`GOOG`AMZN`FB]; px:20 30 40)
sym | px
----| --
GOOG| 20
AMZN| 30
FB  | 40
q)show t:([] sym:`kt$`GOOG`GOOG`AMZN`FB`FB)
sym
----
GOOG
GOOG
AMZN
FB
FB

I left only one column - sym - in t because vol is not relevant to the question. Let's save t without any data first:

q)`:/tmp/t set 0#t
`:/tmp/t
q)hcount `:/tmp/t
30

Now we know that it takes 30 bytes to represent t when it's empty. Let's see if there's a pattern when we start adding rows to t:

q){`:/tmp/t set x#t;`cnt`size!(x;hcount[`:/tmp/t] - 30)} each til[11], 100 1000 1000000
cnt     size
---------------
0       0
1       4
2       8
3       12
4       16
5       20
6       24
7       28
8       32
9       36
10      40
100     400
1000    4000
1000000 4000000

We can see that adding one row increases the size of t by four bytes. What can these 4 bytes be? Can they be a representation of a symbol itself? No, because if they were and we renamed a sym value in kt it would affect the size of t on disk but it doesn't:

q)update sym:`$50#.Q.a from `kt where sym=`GOOG
`kt
q)1#t
sym
--------------------------------------------------
abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwx
q)`:/tmp/t set 1#t
`:/tmp/t
q)hcount `:/tmp/t
34

Still 34 bytes. I think it should be obvious by now that the 4 bytes is an index, but an index of what? Is it an index of a column which must be called sym exactly? Apparently no, it isn't.

q)kt:`foo xcol kt
q)t
sym
--------------------------------------------------
abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwx
abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwx
AMZN
FB
FB

There's no column called sym in kt any longer but t hasn't changed at all! We can go even further and change the type of foo (ex sym) in kt:

q)update foo:-1 -2 -3.0 from `kt
`kt
q)t
sym
---
-1
-1
-2
-3
-3

Not only did it change t, it changed its meta too:

q)meta t
c  | t f  a
---| ------
sym| f kt
q)/  ^------- used to be s

I hope it's clear now that kdb stores a 4-byte index of a key column of a referenced table and a name of a table (but not a key column name!). If a referenced table is missing kdb can't reconstruct the original data and displays the bare index. It a referencing table needs to be sent over the wire then indices are replaced with actual values so that the receiving side can see the real data.