I used this ULID example in a project where I not only needed the uniqueness offered by ULID but also its lexicographic sortability.
I discovered, however, that no matter how much I tried, I could not just get the ids generated in a loop sorted.
e.g.
class Test{
public static void main(String[] args) {
ArrayList<String> ulids = new ArrayList<>();
for (int i = 0; i < 10; i++) {
ulids.add(ULID.generate());
}
System.out.println("Original:\n..." + ulids);
Collections.shuffle(ulids);
System.out.println("Shuffled:\n..." + ulids);
ulids.sort(new Comparator<String>() {
@Override
public int compare(String o1, String o2) {
return o1.compareTo(o2);
}
});
System.out.println("Sorted:\n..." + ulids);
}
}
Sample output:
Original:
...[01edrp4ng81d3mvkp8s7z19znm, 01edrp4ng872nwfj6b9fsxjkkd, 01edrp4ng86v07r6c9sh62ghr7, 01edrp4ng8bpfw3m2q8bynd5st, 01edrp4ng896t1qhsngrz3h251, 01edrp4ng8jne084nsw5saesfe, 01edrp4ng8w8qz9qtgy3958r1v, 01edrp4ng8fdn30qnr2ktddyz4, 01edrp4ng8ekj0vt393tw12x8j, 01edrp4ng80wacxxskgej5d8mm]
Shuffled:
...[01edrp4ng896t1qhsngrz3h251, 01edrp4ng8w8qz9qtgy3958r1v, 01edrp4ng86v07r6c9sh62ghr7, 01edrp4ng8bpfw3m2q8bynd5st, 01edrp4ng8fdn30qnr2ktddyz4, 01edrp4ng80wacxxskgej5d8mm, 01edrp4ng872nwfj6b9fsxjkkd, 01edrp4ng81d3mvkp8s7z19znm, 01edrp4ng8jne084nsw5saesfe, 01edrp4ng8ekj0vt393tw12x8j]
Sorted:
...[01edrp4ng80wacxxskgej5d8mm, 01edrp4ng81d3mvkp8s7z19znm, 01edrp4ng86v07r6c9sh62ghr7, 01edrp4ng872nwfj6b9fsxjkkd, 01edrp4ng896t1qhsngrz3h251, 01edrp4ng8bpfw3m2q8bynd5st, 01edrp4ng8ekj0vt393tw12x8j, 01edrp4ng8fdn30qnr2ktddyz4, 01edrp4ng8jne084nsw5saesfe, 01edrp4ng8w8qz9qtgy3958r1v]
I checked the implementation and figured that since time was a core factor in the generation of ULIDs, and also since the sensitivity of time used was the millisecond i.e. (System.currentTimeMillis())
, I could get them sorted by introducing some delay in my id generation loop.
I introduced a delay of about 5 milliseconds and the ids all came out sorted; e.g:
class TestWithMsDelay{
public static void main(String[] args) {
ArrayList<String> ulids = new ArrayList<>();
for (int i = 0; i < 10; i++) {
try {
Thread.sleep(5L);
ulids.add(ULID.generate());
} catch (Exception ex) {
ex.printStackTrace();
}
}
System.out.println("Original:\n..." + ulids);
Collections.shuffle(ulids);
System.out.println("Shuffled:\n..." + ulids);
ulids.sort(new Comparator<String>() {
@Override
public int compare(String o1, String o2) {
return o1.compareTo(o2);
}
});
System.out.println("Sorted:\n..." + ulids);
}
}
Sample output:
Original:
...[2rjdme5a5h2ntcd20xq4z487tx, 2rjdme63a23ddsy0km21n6n34a, 2rjdme6pnrenx79zd3jj18est4, 2rjdme70bv45b648p82dbj584n, 2rjdme7d8gx9v9db66ftsxbmqq, 2rjdme7psqdykt24qfymn2e4ba, 2rjdme80as7t1h1rr00m676718, 2rjdme8rztp50bad6ktkhrfhk8, 2rjdme93ngkxkfmf6aegqxer9e, 2rjdme9ea04x22rpx2f3rp5gez]
Shuffled:
...[2rjdme7psqdykt24qfymn2e4ba, 2rjdme6pnrenx79zd3jj18est4, 2rjdme80as7t1h1rr00m676718, 2rjdme63a23ddsy0km21n6n34a, 2rjdme93ngkxkfmf6aegqxer9e, 2rjdme70bv45b648p82dbj584n, 2rjdme9ea04x22rpx2f3rp5gez, 2rjdme8rztp50bad6ktkhrfhk8, 2rjdme7d8gx9v9db66ftsxbmqq, 2rjdme5a5h2ntcd20xq4z487tx]
Sorted:
...[2rjdme5a5h2ntcd20xq4z487tx, 2rjdme63a23ddsy0km21n6n34a, 2rjdme6pnrenx79zd3jj18est4, 2rjdme70bv45b648p82dbj584n, 2rjdme7d8gx9v9db66ftsxbmqq, 2rjdme7psqdykt24qfymn2e4ba, 2rjdme80as7t1h1rr00m676718, 2rjdme8rztp50bad6ktkhrfhk8, 2rjdme93ngkxkfmf6aegqxer9e, 2rjdme9ea04x22rpx2f3rp5gez]
This is not good enough for my work... I don't want to wait for any length of time to generate ulids(even if it is 10us - 100us), the concept of artificial delay bothers me very much, lol.
So, I modified ULID.java and changed the time source from System.currentTimeMillis()
to System.nanoTime()
To my surprise, I no longer needed any time delay in the loop to get the output ULIDs sortable.
I feel there must be a snag somewhere though; because the Java Spec warns that System.nanoTime()
is not necessarily more accurate than System.currentTimeMillis()
e.g in the Javadoc for System.nanoTime()
, it says:
This method provides nanosecond precision, but not necessarily nanosecond resolution (that is, how frequently the value changes) - no guarantees are made except that the resolution is at least as good as that of currentTimeMillis().
Also, the Javadoc for System.nanoTime()
seems to indicate that it is not relative to the Epoch (as is System.currentTimeMillis()
)
I believe that this may cause bad behaviour(messed up sortability over time and maybe affect uniqueness) in using System.nanoTime()
in ULID.java instead of
System.currentTimeMillis()
QUESTION
The ULID has two parts: a time component and a random component.
The time component is the count of milliseconds since 1970.
The random component is updated in two cases:
The implementation you show here doesn't do the second step.
Maybe you could include some code like this (just an example):
if (timestamp == previousTimestamp) {
randomComponent++;
} else {
randomComponent = RANDOM.nextLong();
}
Another problem that I found is that it uses Math.random(), instead of java.security.SecureRandom
. To fix this, this is a sugestion:
import java.security.SecureRandom;
private static final RANDOM = new SecureRandom();
Finally, it's not recommended to use System.nanoTime()
as it returns the number of nanoseconds since an arbitrary point in time. It's not the day time returned from the real time clock (RTC) in your mother board. This function is used to measure the elapsed time between two points in your code, maybe for benchmarking. Example:
long startNanos = System.nanoTime();
// do some expensive tasks here
long endNanos = System.nanoTime();
long elapsedNanos = endNanos - startNanos;
If you prefer, you can check the library ulid-creator
. Maybe it can help. Example:
// Generate a ULID as UUID
UUID ulid = UlidCreator.getUlid();
// Or generate a ULID as String (Crockford's base32)
String ulid = UlidCreator.getUlidString();
Project page: https://github.com/f4b6a3/ulid-creator
EDIT
Sorry. I didn't answare the questions.
Are my fears legitimate
Yep, your wright.
What can I do if (1.) is true, to improve the ULID's time-sensitivity beyond 1 millisecond without destroying its strong points?
You can increase the ULID resolution, but it won't conform the ULID Spec (which is not a formal standard like RFC-4122 by the way). The resulting UUID is like a COMB GUID, created by Jimmy Wilson. The main idea on both is the same.
You can reserve more bits for timestamp component, but it will have a cost of some bits. For example, if you increase the time component from 48 to 64 bits, it will roll over around the year 2262 AD, but the random component will be reduced from 1208925819614629174706176 (2^80) to 18446744073709551616 (2^64). If the cost affects the strong points of ULID, it depends on your project.
I just implemented a generator for ULIDs with nanosecond resolution. I coincidently was working on it a few days ago. It actually has millisecond accuracy using the method System.currentTimeMillis()
. The nanosecond resulution is simulated using the method System.nanoTime()
between two subsequent calls.
If you still intend to use nanoseconds ULID, feel free to test it:
package your.package.name;
import java.security.SecureRandom;
import java.time.Instant;
import java.util.UUID;
/**
* Utility class that creates a COMB GUID with nanoseconds resolution.
*
* It borrows the main idea from ULID and COMB generators: a concatenation of
* time and random bytes. It is composed of 64 bits for time and 64 for random
* bits.
*
* A Nano COMB has two components:
*
* 1. Time camponent (64 bits): nanoseconds since 1970
*
* 2. Random component (64 bits): a value generated by a secure random
* generator.
*
* Maximum time component year is ~2262 A.D. (2^63/10^9/60/60/24/365.25 + 1970)
*
* @author: Fabio Lima 2020
*/
public final class NanoCombCreator {
private long prevTime = 0;
private long prevNano = 0;
private static final long ONE_MILLION_NANOSECONDS = 1_000_000L;
private static final SecureRandom SECURE_RANDOM = new SecureRandom();
/**
* Returns a time component in nanoseconds.
*
* It uses `System.currentTimeMillis()` to get the system time in milliseconds
* accuracy. The nanoseconds resolution is simulated by calling
* `System.nanoTime()` between subsequent calls within the same millisecond.
* It's not precise, but it provides some monotonicity to the values generates.
*
* @return the current time in nanoseconds
*/
private synchronized long getTimeComponent() {
final long time = System.currentTimeMillis();
final long nano = System.nanoTime();
final long elapsed; // nanoseconds since last call
if (time == prevTime) {
elapsed = (nano - prevNano);
if (elapsed > ONE_MILLION_NANOSECONDS) {
try {
// make the clock to catch up
Thread.sleep(1);
} catch (InterruptedException e) {
System.err.println("something went wrong...");
}
}
} else {
prevTime = time;
prevNano = nano;
elapsed = 0;
}
return (time * ONE_MILLION_NANOSECONDS) + elapsed;
}
/**
* Returns the random component using a secure random generator.
*
* @return a random value.
*/
private synchronized long getRandomComponent() {
return SECURE_RANDOM.nextLong();
}
/**
* Returns a Nano COMB.
*
* A Nano COMB is inspired on ULID and COMB generators.
*
* It is composed of 64 bits for time and 64 for random bits.
*
* @return a UUID
*/
public synchronized UUID create() {
final long timeBits = getTimeComponent();
final long randomBits = getRandomComponent();
return new UUID(timeBits, randomBits);
}
/**
* Test method that generates many Nano COMBs in a loop.
*
* @param args
*/
public static void main(String[] args) {
NanoCombCreator creator = new NanoCombCreator();
for (int i = 0; i < 100; i++) {
// Generate a Nano COMB
UUID uuid = creator.create();
// Extract the milliseconds and nanoseconds
long milliseconds = uuid.getMostSignificantBits() / ONE_MILLION_NANOSECONDS;
long nanoseconds = uuid.getMostSignificantBits() & ONE_MILLION_NANOSECONDS;
// Instantiate an instant using the milliseconds and nanoseconds
Instant time = Instant.ofEpochMilli(milliseconds).plusNanos(nanoseconds);
// Print the UUID and the time it was generated (UTC)
System.out.println("UUID: '" + uuid + "', time: " + time);
}
}
}
OUTPUT:
UUID: '16240ee8-3865-1503-d1fb-b4e85f991c6b', time: 2020-07-22T11:15:58.537327680Z
UUID: '16240ee8-3865-f90a-ca19-3ec529750ef7', time: 2020-07-22T11:15:58.537344064Z
UUID: '16240ee8-3866-dd7c-f32f-7acaebcf7766', time: 2020-07-22T11:15:58.537409664Z
UUID: '16240ee8-3868-0a99-3ead-b114e1d61520', time: 2020-07-22T11:15:58.537524800Z
UUID: '16240ee8-3868-efc8-937d-599c72de71a6', time: 2020-07-22T11:15:58.537541248Z
UUID: '16240ee8-386a-3643-6a5e-e3b5e3b03c71', time: 2020-07-22T11:15:58.537655936Z
UUID: '16240ee8-386b-132f-7016-057ab30a2920', time: 2020-07-22T11:15:58.537721408Z
UUID: '16240ee8-386b-f929-d5b0-f70b68aea3d9', time: 2020-07-22T11:15:58.537737280Z