I have a bigint
column or variable. I want to convert it into its bitmap (base 2) representation as a varchar
and additionally to that count the 1s.
I need both components - not just the bit_count
equivalent
Example:
DECLARE @a bigint=-9151314442816847872
,@b bigint=71776119061217280
-9151314442816847872
converts to 1000000100000000000000000000000000000000000000000000000000000000
and there are 2 ones in the bitmap.
71776119061217280
converts to
0000000011111111000000000000000000000000000000000000000000000000
and there are 8 ones in it
The questions are:
There are millions of numbers to process in a very short time. Every millisecond counts!
Solution needs to be compatible with SQL Server 2016.
Here is my code.
My algorithm converts 1mln numbers in 18 seconds. This is slow if millions of numbers have to be converted.
BIT_COUNT()
is magnitude of orders faster, but it is available since SQL 2022. I need a solution compatible with 2016.
Can it be done faster?
DECLARE @a bigint=71776119061217280;
DECLARE @s char(64);
SET @s=
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
CONVERT(varchar,CAST(@a as binary(8)),2)
,'0','0000'),'1','0001'),'2','0010'),'3','0011'),'4','0100'),'5','0101'),'6','0110'),'7','0111')
,'8','1000'),'9','1001'),'A','1010'),'B','1011'),'C','1100'),'D','1101'),'E','1110'),'F','1111');
PRINT 'Bitmap: '+@s;
PRINT '1s count: '+CAST(64-LEN(REPLACE(@s,'1','')) as varchar);
I'd be minded to compute both elements independently rather than trying to combine them.
The below runs fairly quickly for me. I did originally just return the Base2String
and do the REPLACE
and length arithmetic on it but then I was grappling to stop SQL Server re-evaluating the underlying Base2String
string twice. And even when I prevented that doing the separate computations still seemed quicker.
SELECT TOP (1000000) CAST(CRYPT_GEN_RANDOM(8) AS BIGINT) AS val
INTO #BigInts
FROM sys.all_objects o1, sys.all_objects o2
SELECT val, Base2String, BitCount
FROM #BigInts
CROSS APPLY dbo.BigIntToBase2String(val)
Indicative execution times for the above on my machine were 5.3 seconds. I did also try a couple of other options as below
--evaluates Base2String twice - 7.5 seconds
SELECT val, Base2String, 64-LEN(REPLACE(Base2String COLLATE Latin1_General_BIN2,'1','')) AS BitCount
FROM #BigInts
CROSS APPLY dbo.BigIntToBase2String(val)
--evaluates Base2String once but adds a nested loops and is even slower 8.4 seconds
SELECT val, Base2String, 64-LEN(REPLACE(Base2String COLLATE Latin1_General_BIN2,'1','')) AS BitCount
FROM #BigInts
OUTER APPLY dbo.BigIntToBase2String(val)
OPTION (MAXDOP 1)
It has a dependency on the following function that computes both individually and returns them as separate columns. This also has an advantage that if you only require the BitCount
in some future scenario you can just select that column to use this without incurring unnecessary string processing.
For the base2 string conversion my hunch is that potentially a single string concatenation of 64 elements will beat the conversion to hex string and multiple replace. Note that below uses the CONCAT
operator to perform a single concatenation rather than +
that will perform multiple binary concatenation operations.
CREATE FUNCTION dbo.BigIntToBase2String
(
@input bigint
)
RETURNS TABLE
AS
RETURN
(
SELECT Base2String =
CONCAT(
IIF(@input & 0x8000000000000000 = 0, '0','1'),
IIF(@input & 0x4000000000000000 = 0, '0','1'),
IIF(@input & 0x2000000000000000 = 0, '0','1'),
IIF(@input & 0x1000000000000000 = 0, '0','1'),
IIF(@input & 0x0800000000000000 = 0, '0','1'),
IIF(@input & 0x0400000000000000 = 0, '0','1'),
IIF(@input & 0x0200000000000000 = 0, '0','1'),
IIF(@input & 0x0100000000000000 = 0, '0','1'),
IIF(@input & 0x0080000000000000 = 0, '0','1'),
IIF(@input & 0x0040000000000000 = 0, '0','1'),
IIF(@input & 0x0020000000000000 = 0, '0','1'),
IIF(@input & 0x0010000000000000 = 0, '0','1'),
IIF(@input & 0x0008000000000000 = 0, '0','1'),
IIF(@input & 0x0004000000000000 = 0, '0','1'),
IIF(@input & 0x0002000000000000 = 0, '0','1'),
IIF(@input & 0x0001000000000000 = 0, '0','1'),
IIF(@input & 0x0000800000000000 = 0, '0','1'),
IIF(@input & 0x0000400000000000 = 0, '0','1'),
IIF(@input & 0x0000200000000000 = 0, '0','1'),
IIF(@input & 0x0000100000000000 = 0, '0','1'),
IIF(@input & 0x0000080000000000 = 0, '0','1'),
IIF(@input & 0x0000040000000000 = 0, '0','1'),
IIF(@input & 0x0000020000000000 = 0, '0','1'),
IIF(@input & 0x0000010000000000 = 0, '0','1'),
IIF(@input & 0x0000008000000000 = 0, '0','1'),
IIF(@input & 0x0000004000000000 = 0, '0','1'),
IIF(@input & 0x0000002000000000 = 0, '0','1'),
IIF(@input & 0x0000001000000000 = 0, '0','1'),
IIF(@input & 0x0000000800000000 = 0, '0','1'),
IIF(@input & 0x0000000400000000 = 0, '0','1'),
IIF(@input & 0x0000000200000000 = 0, '0','1'),
IIF(@input & 0x0000000100000000 = 0, '0','1'),
IIF(@input & 0x0000000080000000 = 0, '0','1'),
IIF(@input & 0x0000000040000000 = 0, '0','1'),
IIF(@input & 0x0000000020000000 = 0, '0','1'),
IIF(@input & 0x0000000010000000 = 0, '0','1'),
IIF(@input & 0x0000000008000000 = 0, '0','1'),
IIF(@input & 0x0000000004000000 = 0, '0','1'),
IIF(@input & 0x0000000002000000 = 0, '0','1'),
IIF(@input & 0x0000000001000000 = 0, '0','1'),
IIF(@input & 0x0000000000800000 = 0, '0','1'),
IIF(@input & 0x0000000000400000 = 0, '0','1'),
IIF(@input & 0x0000000000200000 = 0, '0','1'),
IIF(@input & 0x0000000000100000 = 0, '0','1'),
IIF(@input & 0x0000000000080000 = 0, '0','1'),
IIF(@input & 0x0000000000040000 = 0, '0','1'),
IIF(@input & 0x0000000000020000 = 0, '0','1'),
IIF(@input & 0x0000000000010000 = 0, '0','1'),
IIF(@input & 0x0000000000008000 = 0, '0','1'),
IIF(@input & 0x0000000000004000 = 0, '0','1'),
IIF(@input & 0x0000000000002000 = 0, '0','1'),
IIF(@input & 0x0000000000001000 = 0, '0','1'),
IIF(@input & 0x0000000000000800 = 0, '0','1'),
IIF(@input & 0x0000000000000400 = 0, '0','1'),
IIF(@input & 0x0000000000000200 = 0, '0','1'),
IIF(@input & 0x0000000000000100 = 0, '0','1'),
IIF(@input & 0x0000000000000080 = 0, '0','1'),
IIF(@input & 0x0000000000000040 = 0, '0','1'),
IIF(@input & 0x0000000000000020 = 0, '0','1'),
IIF(@input & 0x0000000000000010 = 0, '0','1'),
IIF(@input & 0x0000000000000008 = 0, '0','1'),
IIF(@input & 0x0000000000000004 = 0, '0','1'),
IIF(@input & 0x0000000000000002 = 0, '0','1'),
IIF(@input & 0x0000000000000001 = 0, '0','1')
),
BitCount =
IIF(@input & 0x8000000000000000 = 0, 0, 1) +
IIF(@input & 0x4000000000000000 = 0, 0, 1) +
IIF(@input & 0x2000000000000000 = 0, 0, 1) +
IIF(@input & 0x1000000000000000 = 0, 0, 1) +
IIF(@input & 0x0800000000000000 = 0, 0, 1) +
IIF(@input & 0x0400000000000000 = 0, 0, 1) +
IIF(@input & 0x0200000000000000 = 0, 0, 1) +
IIF(@input & 0x0100000000000000 = 0, 0, 1) +
IIF(@input & 0x0080000000000000 = 0, 0, 1) +
IIF(@input & 0x0040000000000000 = 0, 0, 1) +
IIF(@input & 0x0020000000000000 = 0, 0, 1) +
IIF(@input & 0x0010000000000000 = 0, 0, 1) +
IIF(@input & 0x0008000000000000 = 0, 0, 1) +
IIF(@input & 0x0004000000000000 = 0, 0, 1) +
IIF(@input & 0x0002000000000000 = 0, 0, 1) +
IIF(@input & 0x0001000000000000 = 0, 0, 1) +
IIF(@input & 0x0000800000000000 = 0, 0, 1) +
IIF(@input & 0x0000400000000000 = 0, 0, 1) +
IIF(@input & 0x0000200000000000 = 0, 0, 1) +
IIF(@input & 0x0000100000000000 = 0, 0, 1) +
IIF(@input & 0x0000080000000000 = 0, 0, 1) +
IIF(@input & 0x0000040000000000 = 0, 0, 1) +
IIF(@input & 0x0000020000000000 = 0, 0, 1) +
IIF(@input & 0x0000010000000000 = 0, 0, 1) +
IIF(@input & 0x0000008000000000 = 0, 0, 1) +
IIF(@input & 0x0000004000000000 = 0, 0, 1) +
IIF(@input & 0x0000002000000000 = 0, 0, 1) +
IIF(@input & 0x0000001000000000 = 0, 0, 1) +
IIF(@input & 0x0000000800000000 = 0, 0, 1) +
IIF(@input & 0x0000000400000000 = 0, 0, 1) +
IIF(@input & 0x0000000200000000 = 0, 0, 1) +
IIF(@input & 0x0000000100000000 = 0, 0, 1) +
IIF(@input & 0x0000000080000000 = 0, 0, 1) +
IIF(@input & 0x0000000040000000 = 0, 0, 1) +
IIF(@input & 0x0000000020000000 = 0, 0, 1) +
IIF(@input & 0x0000000010000000 = 0, 0, 1) +
IIF(@input & 0x0000000008000000 = 0, 0, 1) +
IIF(@input & 0x0000000004000000 = 0, 0, 1) +
IIF(@input & 0x0000000002000000 = 0, 0, 1) +
IIF(@input & 0x0000000001000000 = 0, 0, 1) +
IIF(@input & 0x0000000000800000 = 0, 0, 1) +
IIF(@input & 0x0000000000400000 = 0, 0, 1) +
IIF(@input & 0x0000000000200000 = 0, 0, 1) +
IIF(@input & 0x0000000000100000 = 0, 0, 1) +
IIF(@input & 0x0000000000080000 = 0, 0, 1) +
IIF(@input & 0x0000000000040000 = 0, 0, 1) +
IIF(@input & 0x0000000000020000 = 0, 0, 1) +
IIF(@input & 0x0000000000010000 = 0, 0, 1) +
IIF(@input & 0x0000000000008000 = 0, 0, 1) +
IIF(@input & 0x0000000000004000 = 0, 0, 1) +
IIF(@input & 0x0000000000002000 = 0, 0, 1) +
IIF(@input & 0x0000000000001000 = 0, 0, 1) +
IIF(@input & 0x0000000000000800 = 0, 0, 1) +
IIF(@input & 0x0000000000000400 = 0, 0, 1) +
IIF(@input & 0x0000000000000200 = 0, 0, 1) +
IIF(@input & 0x0000000000000100 = 0, 0, 1) +
IIF(@input & 0x0000000000000080 = 0, 0, 1) +
IIF(@input & 0x0000000000000040 = 0, 0, 1) +
IIF(@input & 0x0000000000000020 = 0, 0, 1) +
IIF(@input & 0x0000000000000010 = 0, 0, 1) +
IIF(@input & 0x0000000000000008 = 0, 0, 1) +
IIF(@input & 0x0000000000000004 = 0, 0, 1) +
IIF(@input & 0x0000000000000002 = 0, 0, 1) +
IIF(@input & 0x0000000000000001 = 0, 0, 1)
)