Understanding bits and time in milliseconds


I was reading this page, where it says that 41 bits are used to represent 41 years using a custom epoch.

I am unable to understand the relationship between time in milliseconds, bits and years. Can anyone help?

E.g., in Java, System.currentTimeMillis() returns a long, which is 64 bits. Does that mean it could represent 64 years' worth of unique values if I had to generate 1 per millisecond?

In the above case, what happens after 41 years? Will they have to increase the number of bits used for the timestamp if they keep the same approach?


Solution

  • It's a weird coincidence mixed with a weird way to write it. 41 bits of milliseconds actually gets you a bit over 69 years, not 41. The authors of this documentation messed up or oversimplified; note that 41 is at least in the same ballpark as 69, and that the bit count and the year count are both 41 looks like pure coincidence.

    Let's delve into what we do know a bit:

    They are explicitly calling out that it is some sort of 'milliseconds' value. We also know that it's 41 bits, and the rest, well, the rest we're going to have to guess.

    Let's work on the stuff we do know: 41 bits, and 'milliseconds'.

    41 bits is like 41 separate light switches. Imagine the following game:

    You get to enter a room. It has 1 light switch in it and nothing else of note. You can't leave anything behind, scratch the walls, or otherwise interact with this room, except through the light switch. Then you have to go, and I enter some time later.

    How much information can we communicate?

    With a single light switch, only 1 bit of information: You can leave the light on, or off, and that's all I know. If all you needed to communicate to me was the result of a coinflip that you observed and I didn't, then 1 bit is all we needed. We make an arrangement beforehand: Light switch down means the coin landed tails, light switch up means it landed heads. Voila, we can now communicate 1 coin flip.

    Let's say there are 2 light switches instead. You can now communicate 4 different things. Let's say someone drew a card from a deck and you saw that and I didn't: You could communicate the suit to me, if we arrange a 'code'.

    Treating 'light switch off' as a 0 and 'light switch on' as a 1, then we could pre-arrange this code:

    00 - hearts
    01 - clubs
    10 - spades
    11 - diamonds
    

    So, if I enter the room and I see the left light switch is down and the right light switch is up, I can say: You drew a club! And that'd be right.

    Every light switch you add doubles the number of states you can communicate. So, 1 light switch is good for differentiating 2 things (coin flips, for example), 2 switches can do 4 things, 3 switches can do 8, 4 switches can do 16, et cetera.
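
    As a rough sketch of that pre-arranged code in Java: the class name SuitCode and the method decode are made up for illustration, and the only real content is the 2-bit table above, with the left switch treated as the high bit.

    ```java
    class SuitCode {
        // The pre-arranged code: the array index is the value of the two bits
        // (00 = 0, 01 = 1, 10 = 2, 11 = 3).
        static final String[] SUITS = {"hearts", "clubs", "spades", "diamonds"};

        // Hypothetical helper: turn the two switch positions into a suit.
        // The left switch is the high bit, the right switch is the low bit.
        static String decode(boolean leftUp, boolean rightUp) {
            int value = (leftUp ? 2 : 0) | (rightUp ? 1 : 0);
            return SUITS[value];
        }

        public static void main(String[] args) {
            // Left switch down, right switch up: bits 01.
            System.out.println(decode(false, true)); // prints "clubs"
        }
    }
    ```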

    Here we have 41 light switches. By simple math, that's good for differentiating between 2^41, or 2,199,023,255,552, unique values.
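
    If you want to check the doubling yourself, here is a minimal Java sketch. The class and method names (SwitchStates, states) are just illustrative; the left shift 1L << n is the standard way to compute 2^n for a long.

    ```java
    class SwitchStates {
        // Number of distinct states that n on/off switches can show: 2^n.
        // Only meaningful for 0 <= n <= 62, so the result stays a positive long.
        static long states(int switches) {
            return 1L << switches;
        }

        public static void main(String[] args) {
            for (int n : new int[] {1, 2, 3, 4, 41}) {
                System.out.println(n + " switches -> " + states(n) + " states");
            }
            // The last line printed is: 41 switches -> 2199023255552 states
        }
    }
    ```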

    We also know that this is differentiating between 'milliseconds'. Let's read that as: This mechanism is capable of storing time with a granularity of 1 millisecond. In other words, it can tell any 2 points in time apart as long as those 2 points in time are at least 1 millisecond different.

    Let's work on those milliseconds a bit:

    • Divide by 1000 for seconds.
    • Divide by 60 for minutes.
    • Divide by 60 for hours.
    • Divide by 24 for days.
    • Divide by 365.25 for years.

    So let's do just that. 2,199,023,255,552/1000/60/60/24/365.25 = 69.682842027.
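
    The same chain of divisions, as a small Java sketch. WindowYears and yearsFor41Bits are hypothetical names, and 365.25 days per year is the same averaging assumption used in the list above.

    ```java
    class WindowYears {
        // Convert 2^41 milliseconds into years, one division per step.
        static double yearsFor41Bits() {
            double millis = (double) (1L << 41); // 2,199,023,255,552
            double seconds = millis / 1000;
            double minutes = seconds / 60;
            double hours = minutes / 60;
            double days = hours / 24;
            return days / 365.25;                // average year length, leap days included
        }

        public static void main(String[] args) {
            System.out.printf("%.6f years%n", yearsFor41Bits());
        }
    }
    ```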

    In other words, with 41 light switches, you can communicate a moment in time to me with millisecond granularity, as long as we arrange ahead of time that we're communicating only about a specific range of time, and that range can be no larger than a bit more than 69 and a half years.

    The easiest way to make such an arrangement is to decree a certain point in time as the 'epoch' - the 0 value.

    For example, we can make this pre-arrangement:

    • Let's decree that the very instant the year 1999 becomes the year 2000 in the UTC timezone is 0. The number then represents how many milliseconds later a moment is.

    So, the number 60000 encodes the instant at which, in UTC, the time was 2000-01-01 00:01:00 (1 minute after midnight on the first day of 2000).

    Similarly, say I enter the room and notice that all light switches are down except the 2nd and 3rd from the right: 0000...00110. We arranged beforehand that it's the usual binary counting mechanism, so that's 6. Thus, I know that you were trying to communicate to me that whatever you were timestamping (say, a picture being taken) happened 6 milliseconds after midnight, 2000-01-01, UTC zone.
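
    Here is a minimal sketch of such an arrangement in Java, using java.time.Instant. The epoch of 2000-01-01T00:00:00Z is just the example from above (not the epoch the linked page actually uses), and CustomEpoch, encode and decode are made-up names.

    ```java
    import java.time.Instant;

    class CustomEpoch {
        // Assumed epoch for illustration only: midnight UTC at the start of 2000.
        static final Instant EPOCH = Instant.parse("2000-01-01T00:00:00Z");

        // Milliseconds elapsed between our custom epoch and the given instant.
        static long encode(Instant t) {
            return t.toEpochMilli() - EPOCH.toEpochMilli();
        }

        // The instant that lies the given number of milliseconds after the epoch.
        static Instant decode(long millisSinceEpoch) {
            return EPOCH.plusMillis(millisSinceEpoch);
        }

        public static void main(String[] args) {
            System.out.println(decode(60_000)); // 2000-01-01T00:01:00Z
            System.out.println(decode(0b110));  // 2000-01-01T00:00:00.006Z
            System.out.println(encode(Instant.now())); // "now" in our custom scheme
        }
    }
    ```

    Note that System.currentTimeMillis() works the same way, just with 1970-01-01 UTC as its 0.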

    Our 41 bits get us to 2069-09-06 or so (early September 2069) and then we flat out run out of bits. If you just blindly keep counting while keeping only 41 bits, the value wraps around, so then you get the number 0 again, and we would incorrectly read that as: Juuust at midnight, year 2000.
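
    To see the running-out and the wraparound concretely, here is a hedged Java sketch. It again assumes the example epoch of 2000-01-01T00:00:00Z, and it models "only 41 bits are kept" by masking; Wraparound, store and lastRepresentable are hypothetical names, not anything from the linked page.

    ```java
    import java.time.Instant;

    class Wraparound {
        // Assumed epoch for illustration only.
        static final Instant EPOCH = Instant.parse("2000-01-01T00:00:00Z");

        // 41 one-bits: the largest value that fits in 41 light switches.
        static final long MASK = (1L << 41) - 1;

        // Keep only the low 41 bits, as a fixed-width 41-bit field would.
        static long store(long millisSinceEpoch) {
            return millisSinceEpoch & MASK;
        }

        // The last instant a 41-bit millisecond counter can represent.
        static Instant lastRepresentable() {
            return EPOCH.plusMillis(MASK);
        }

        public static void main(String[] args) {
            System.out.println(lastRepresentable());      // 2069-09-06T15:47:35.551Z
            System.out.println(store(1L << 41));          // 0: one millisecond later we are back at the epoch
            System.out.println(store((1L << 41) + 6));    // 6
        }
    }
    ```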

    In other words, it is 69 and a bit years, and 41 is just complete horse manure. I don't know why they wrote 41. But 41 is at least in the same ballpark as 69, so perhaps it's an oversimplification.

    What happens when they hit 2041 or, fixing the error in their own documentation, 2069? Well, one easy solution to buy, e.g., another 10 years is to decree that 0 is to be read as September 2069 and not as 2000. That is okay because Instagram wasn't around yet in 2000, so no real timestamps exist in that early part of the window. But that only gets you a few more years.

    Then, either really really old Instagram posts all of a sudden look like they are from 2080 (by redefining the 69-and-a-bit window upwards, any timestamp that isn't in the window looks like one that is, and is thus completely wrong), or they change their ID system and e.g. add another few bits. Every bit they add doubles the window size. Even 1 bit is enough for another 69-and-a-bit years.
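
    To put numbers on "every bit doubles the window", and to come back to the 64-bit long from the question, here is a small Java sketch. WindowByBits and windowYears are illustrative names; the 365.25-day year is the same approximation as before, and 63 is used for the long because one of its 64 bits is the sign.

    ```java
    class WindowByBits {
        // Milliseconds in an average (365.25-day) year.
        static final double MS_PER_YEAR = 1000.0 * 60 * 60 * 24 * 365.25;

        // How many years of millisecond timestamps fit in the given number of bits.
        static double windowYears(int bits) {
            return Math.pow(2, bits) / MS_PER_YEAR;
        }

        public static void main(String[] args) {
            for (int bits : new int[] {41, 42, 43, 63}) {
                System.out.printf("%d bits -> %.1f years%n", bits, windowYears(bits));
            }
        }
    }
    ```

    So the non-negative range of a long is not 64 years' worth of milliseconds; it works out to roughly 292 million years.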