Python 3: how to format an integer into an IPv6 address using an f-string?


I am trying to format integers into IPv6 addresses.

IPv6 address here means a string that represents an integer between 0 and 2^128 - 1 (340282366920938463463374607431768211455), formatted into 32 hexadecimal digits, separated by colons (':') into 8 fields of 4 digits each.

I know of str(ipaddress.IPv6Address(n)) and int(ipaddress.IPv6Address(s)), but I want to write my own functions for the sake of learning. I have already written them and am now trying to improve them.
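For reference when checking hand-written functions, here is a small sketch of what the standard library returns. One detail worth noting: str() on an IPv6Address gives the compressed form, while the .exploded attribute gives the full 8-field form described above. The sample integer is the one used later in this question.

```python
import ipaddress

n = 260764824896579434326633182196140447999
addr = ipaddress.IPv6Address(n)

full = addr.exploded   # all 8 fields, 4 hex digits each
short = str(addr)      # compressed form (same as addr.compressed)

# Round trip: parsing the text form gives back the original integer.
assert int(ipaddress.IPv6Address(full)) == n

print(full)                                # c42d:7a7d:155b:93f7:658c:20c9:fea5:98ff
print(ipaddress.IPv6Address(1).exploded)   # 0000:0000:0000:0000:0000:0000:0000:0001
print(str(ipaddress.IPv6Address(1)))       # ::1
```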

I am looking for a way to format integers into IPv6 format using either f-strings or str.format. I currently use this one-liner:

ipv6 = ':'.join(hex(n).removeprefix('0x').zfill(32)[i:i+4] for i in range(0, 32, 4))

It is slow because it uses string slicing.

I have already written code to shorten IPv6 addresses using regexes, and I want to replace the one-liner above with a single string-format expression.

I have already implemented the first part (formatting an integer as 32 hexadecimal digits with leading zeros and without the '0x' prefix) using this:

"{0:0>32x}".format(n)

But I cannot implement the second part. I have searched Google for a way to insert a separator every N characters into a string in Python, but most results are irrelevant no matter what keywords I use. I have seen only two relevant methods: the one I am using, which I came up with myself, and this one:

re.sub(r'([\da-fA-F]{4})', r'\1:', s, 7)

But regexes are slow:

In [221]: re.sub('([\da-fA-F]{4})', r'\1:', 'c42d7a7d155b93f7658c20c9fea598ff', 7)
Out[221]: 'c42d:7a7d:155b:93f7:658c:20c9:fea5:98ff'

In [222]: s = 'c42d7a7d155b93f7658c20c9fea598ff'

In [223]: %timeit re.sub('([\da-fA-F]{4})', r'\1:', s, 7)
11.6 µs ± 993 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [224]: %timeit ':'.join(s[i:i+4] for i in range(0, 32, 4))
2.11 µs ± 23.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Then I found the phrase "Python f string thousand separator" in Google's search suggestions, which led me to this syntax: "{:,d}"

In [226]: "{:,d}".format(1234567890)
Out[226]: '1,234,567,890'

It is pretty close to what I seek, but unfortunately it is not what I need: first, it formats the integer as decimal (d); second, it inserts a separator every 3 characters instead of 4; and finally, it uses a comma instead of a colon, and I can't change that without invalidating the syntax...

So what is the correct f-string syntax to insert a colon every 4 digits while formatting an integer as a 32-digit hexadecimal string? Preferably without f-string nesting, and as a one-liner.

By f-string nesting I mean something like this:

f'{f"{n:0>32x}"}'

I don't know if that is valid, and I don't like it.


Currently I have done this:

"{0:0>32_x}".format(n)
In [248]: "{0:0>32_x}".format(260764824896579434326633182196140447999)
Out[248]: 'c42d_7a7d_155b_93f7_658c_20c9_fea5_98ff'

But I can't change that underscore to a colon. I know I can use str.replace, but I still want it done in one command.
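One thing to watch with the "0>32_x" spec: the fill-and-align form pads to 32 characters without adding separators, so a small value such as 1 comes out as 32 plain digits with no underscores. A possible variant, sketched here, writes the width as 039 (32 digits plus 7 separators) so that the zero padding is grouped as well. The helper name int_to_ipv6 is made up for illustration, and as far as I know the format mini-language only offers ',' and '_' as grouping characters, so the .replace step remains.

```python
import ipaddress

def int_to_ipv6(n: int) -> str:
    """Format n as 8 colon-separated fields of 4 hex digits (hypothetical helper)."""
    if not 0 <= n < 2**128:
        raise ValueError("n must be in range(2**128)")
    # Width 39 = 32 hex digits + 7 separators; '_' groups hex digits in fours,
    # and the leading '0' flag makes the zero padding take part in the grouping.
    return f"{n:039_x}".replace("_", ":")

print(int_to_ipv6(260764824896579434326633182196140447999))
# c42d:7a7d:155b:93f7:658c:20c9:fea5:98ff
print(int_to_ipv6(1))
# 0000:0000:0000:0000:0000:0000:0000:0001

# Cross-check against the standard library's full (uncompressed) form.
assert int_to_ipv6(1) == ipaddress.IPv6Address(1).exploded
```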


Performance comparison:

In [253]: %timeit "{0:0>32_x}".format(260764824896579434326633182196140447999).replace('_', ':')
988 ns ± 7.72 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [254]: %timeit ':'.join([hex(260764824896579434326633182196140447999).replace('0x',"").zfill(32)[i:i+4] for i in range(0, 32, 4)])
4.59 µs ± 493 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [255]: %timeit "{0:0>32_x}".format(260764824896579434326633182196140447999)
810 ns ± 48.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

String formatting is several times faster than string slicing here (988 ns versus 4.59 µs).
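For anyone who wants to repeat the comparison outside IPython, a rough sketch using the timeit module follows. The two expressions are copied from In [253] and In [254]; the function names and the compare helper are made up for illustration, the lambda wrapper adds a little overhead to both measurements, and absolute numbers will differ from machine to machine.

```python
import timeit

N = 260764824896579434326633182196140447999  # sample value from the session

def by_format(n):
    # In [253]: format with '_' grouping, then swap the separator
    return "{0:0>32_x}".format(n).replace('_', ':')

def by_slicing(n):
    # In [254]: hex string cut into eight 4-character slices
    return ':'.join([hex(n).replace('0x', "").zfill(32)[i:i+4] for i in range(0, 32, 4)])

def compare(number=100_000):
    """Return average seconds per call for each approach."""
    return {
        f.__name__: timeit.timeit(lambda: f(N), number=number) / number
        for f in (by_format, by_slicing)
    }

if __name__ == "__main__":
    for name, secs in compare().items():
        print(f"{name}: {secs * 1e9:.0f} ns per call")
```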


Solution

  • But regexes are slow:

    Note that you can use re.compile to reduce the time required. Consider the following example:

    import timeit
    timeit.timeit(stmt='re.sub("(....)","\\1:",s)',setup='import re;s = "c42d7a7d155b93f7658c20c9fea598ff"') # 2.398639300000468
    timeit.timeit(stmt='pat.sub("\\1:",s)',setup='import re;s = "c42d7a7d155b93f7658c20c9fea598ff";pat=re.compile("(....)")') # 1.6385453000002599
    

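    I use a different pattern because I assume the input is always 32 hex digits: every 4 characters are replaced with the same 4 characters followed by a colon. Note that without a count argument this also appends a colon after the last group.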
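    A minimal sketch of the compiled-pattern approach that also stops after the seventh group, so no colon is added at the end. The helper name add_colons and the constant GROUP are made up for illustration, and the input is still assumed to be exactly 32 hex digits.

```python
import re

# Four hex digits; compiled once so repeated calls skip the pattern-cache lookup.
GROUP = re.compile(r"[0-9a-f]{4}", re.IGNORECASE)

def add_colons(s: str) -> str:
    """Insert ':' after each of the first 7 four-digit groups (hypothetical helper)."""
    # \g<0> refers to the whole match, so no capturing group is needed.
    return GROUP.sub(r"\g<0>:", s, count=7)

s = "c42d7a7d155b93f7658c20c9fea598ff"
print(add_colons(s))                 # c42d:7a7d:155b:93f7:658c:20c9:fea5:98ff
print(re.sub("(....)", r"\1:", s))   # same groups, plus a trailing ':'
```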