To understand the MYSQL's BIT
, I created a table name bit_demo
in mysql and added a few rows in it as shown below :
mysql> CREATE TABLE `bit_demo` (
-> `mybit` bit(10) NOT NULL DEFAULT b'0'
-> );
Query OK, 0 rows affected (0.05 sec)
mysql> insert into bit_demo values(b'1111111111');
Query OK, 1 row affected (0.00 sec)
mysql> insert into bit_demo values(b'0');
Query OK, 1 row affected (0.00 sec)
mysql> insert into bit_demo values(b'1');
Query OK, 1 row affected (0.00 sec)
But when I use the plain SELECT
query, it showed some strange characters on screen:
mysql> select mybit from bit_demo;
+-------+
| mybit |
+-------+
| ♥ |
| |
| ☺ |
+-------+
3 rows in set (0.00 sec)
So I tried using the CONV(value,from_base,to_base)
in MySQL to see them in bits as I entered them. Since I didn't know, which value of base works, I tried different values. To my surprise, any value between 2
and 36
for from_base
works in this case.
Successful results with from_base
or to_base
is 2
, 10
, or 36
:
mysql> select conv(mybit,2,2) mybit from bit_demo;
+------------+
| mybit |
+------------+
| 1111111111 |
| 0 |
| 1 |
+------------+
3 rows in set (0.00 sec)
mysql> select conv(mybit,2,36) mybit from bit_demo;
+-------+
| mybit |
+-------+
| SF |
| 0 |
| 1 |
+-------+
3 rows in set (0.00 sec)
mysql> select conv(mybit,36,2) mybit from bit_demo;
+------------+
| mybit |
+------------+
| 1111111111 |
| 0 |
| 1 |
+------------+
3 rows in set (0.00 sec)
mysql> select conv(mybit,10,2) mybit from bit_demo;
+------------+
| mybit |
+------------+
| 1111111111 |
| 0 |
| 1 |
+------------+
3 rows in set (0.00 sec)
mysql> select conv(mybit,36,36) mybit from bit_demo;
+-------+
| mybit |
+-------+
| SF |
| 0 |
| 1 |
+-------+
3 rows in set (0.00 sec)
mysql> select conv(mybit,10,10) mybit from bit_demo;
+-------+
| mybit |
+-------+
| 1023 |
| 0 |
| 1 |
+-------+
3 rows in set (0.00 sec)
As you can see above, results do not change when from_base
changes between 2
and 36
. But it changes when to_base
is between 2
and 36
.
I also used CAST(value AS UNSIGNED)
, and it worked like CONV(value, <2 to 36>, 10)
:
mysql> select cast(mybit as unsigned) mybit from bit_demo;
+-------+
| mybit |
+-------+
| 1023 |
| 0 |
| 1 |
+-------+
3 rows in set (0.00 sec)
What is the explanation, when any value in from_base
works with CONV
over BIT
?
The answer for all questions and doubts above is: because it's BIT
data type, so it stores bits. Bits are exactly what any data in any storage is. Thus, you can not just look at them as on their representation. Bits are only content and "shape" of what they are depends of what is the context.
BIT
actually is?Some definitions
Well, as it stated in documentation, it stores values as plain bits. What does that mean? That means: data is stored as bit sequence and there is no information about what kind of data is stored. DBMS simply doesn't care about type of data - there is no definition of that and "BIT
" should not confuse you. "BIT
" does not point to any real data type, but instead it claims that data inside is nothing more than sequence of bits.
What "sequence of bits" means
Storing sequence of bits means that real sense of that sequence will depend from context. You can not really say what certain sequence mean without pointing to the context. For instance, integers and strings. What is integer? Well, it's a number which is stored as sequence of bits. What is string? It's .. sequence of bits too. So how to distinct them? That's exactly why do we have data types. Each type is a structure, which determines how to deal with certain value - and that value is always sequence of bits.
Now, "BIT
data-type" is really terrible naming because in fact, there's no "data type" at all. It just tells that it stores data, without binding of what that data means. Let's illustrate with some examples. Let's say, we want to store string "s"
. How will it be interpreted? With bit sequence, and we may restore it's "internal" view:
mysql> SELECT ORD("s");
+----------+
| ORD("s") |
+----------+
| 115 |
+----------+
1 row in set (0.00 sec)
So now we know "numeric" representation. Next:
mysql> SELECT CONV(115, 10, 2);
+------------------+
| CONV(115, 10, 2) |
+------------------+
| 1110011 |
+------------------+
1 row in set (0.01 sec)
Ok, it is our "bits" as we wanted. I enquoted "bits" because it's only visualization, not real data. Finally, we can insert it as a bit-literal:
mysql> INSERT INTO bit_demo (mybit) VALUES (b'1110011');
Query OK, 1 row affected (0.02 sec)
And now, some "magic":
mysql> SELECT * FROM bit_demo;
+-------+
| mybit |
+-------+
| s |
+-------+
1 row in set (0.00 sec)
Tada! As you can see, I didn't make any conversions - but I can see valid "s"
string on the screen. Why? Because when you're "selecting" something and MySQL client displays that, it does it, interpreting incoming data as strings. So that's why "it worked" - we just written bit sequence that may be interpreted as "s"
- and, since client was trying to do it (so, interpret incoming data as string), all went well and we saw our string.
More for strings: encodings
Now, strings are very good sample also because they too has issue of how to interpret symbols because of encodings. Symbol is nothing more as sequence of bits and what you see on the screen when symbols is shown is nothing more than the corresponding graphical shape for chosen encoding. To illustrate:
mysql> insert into bit_demo values(b'0111111111');
Query OK, 1 row affected (0.02 sec)
Let it be our value, and now, first case:
mysql> SET NAMES UTF8;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from bit_demo;
+-------+
| mybit |
+-------+
| � |
+-------+
1 row in set (0.00 sec)
and second case:
mysql> SET NAMES cp1251;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from bit_demo;
+-------+
| mybit |
+-------+
| я |
+-------+
1 row in set (0.00 sec)
As you can see, what certain "symbol" actually means depends from which encoding did we use. Bits, which are just values, knows nothing of what those values should mean.
Integer operations
So, now, this is about same issue with CONV()
. Your values, passed to that functions, are interpreted just as integer values. No information about such things as "radix" is given, and, more, it's just not applicable here - your bits are storing just value, it will be same for any radix, thus, it doesn't matter what radix you will convert "from" - value in bits won't change. That is why for any arbitrary input radix (so, 2..36
) you will see same conversion result if destination radix is same.
Value in bits, when used as an integer, immediately became that "integer" type, but they also will become values, defined by those data types. Again, let's illustrate (for this sample, I'm using 64-bit length):
mysql> INSERT INTO bit_demo VALUES (b'1111111111111111111111111111111111111111111111111111111111111101');
Query OK, 1 row affected (0.07 sec)
Let's see "what" is it:
mysql> SELECT CAST(mybit AS SIGNED) FROM bit_demo;
+-----------------------+
| CAST(mybit AS SIGNED) |
+-----------------------+
| -3 |
+-----------------------+
1 row in set (0.00 sec)
And:
mysql> SELECT CAST(mybit AS UNSIGNED) FROM bit_demo;
+-------------------------+
| CAST(mybit AS UNSIGNED) |
+-------------------------+
| 18446744073709551613 |
+-------------------------+
1 row in set (0.00 sec)
Pretty huge difference, right? Again, that's because with certain data type we've defined rules for stored value, but value itself has no clue how it will be used & represented.
You may think about "BIT
data type" as about "no-type" values. Because it really doesn't specify any information about what value means, it only stores that value. How to work with it and how to interpret it is just another thing. You should keep in mind that when using this type - as well as any value, no matter what it is and where it is - in the end is just bits sequence.