{"id":394,"date":"2013-03-14T04:04:46","date_gmt":"2013-03-14T04:04:46","guid":{"rendered":"http:\/\/ntspl.co.in\/blog\/?p=394"},"modified":"2021-12-16T05:20:13","modified_gmt":"2021-12-16T05:20:13","slug":"why-we-should-not-use-bit-columns-in-mysql-database","status":"publish","type":"post","link":"https:\/\/www.ntspl.co.in\/blog\/why-we-should-not-use-bit-columns-in-mysql-database\/","title":{"rendered":"Why we should not use BIT columns in MySQL Database"},"content":{"rendered":"<p>MySQL implements the\u00a0<code>BIT<\/code>\u00a0data type differently in different versions, and the behavior is not what one might expect. In this article I\u2019ll explain how MySQL\u2019s behavior has changed over time, what strange things can happen as a result, and how to understand and work around display issues. I\u2019ll tell you about a serious bug I\u2019ve found, and discuss differences in the\u00a0<code>BIT<\/code>\u00a0data type between MySQL and Microsoft SQL Server.<\/p>\n<h3>History<\/h3>\n<p>MySQL has supported the\u00a0<code>BIT<\/code>\u00a0data type for a long time, but only as a synonym for\u00a0<code>TINYINT(1)<\/code>\u00a0until version 5.0.3. Once the column was created, MySQL no longer knew it had been created with\u00a0<code>BIT<\/code>\u00a0columns.<\/p>\n<p>In version 5.0.3 a native\u00a0<code>BIT<\/code>\u00a0data type was introduced for MyISAM, and shortly thereafter for other storage engines as well. This type behaves very differently from\u00a0<code>TINYINT<\/code>.<\/p>\n<h3>Changed behavior<\/h3>\n<p>The old data type behaved just like the small integer value it really was, with a range from -128 to 127. The new data type is a totally different animal. It\u2019s not a single bit. It\u2019s a fixed-width \u201cbit-field value,\u201d which can be from 1 to 64 bits wide. This means it doesn\u2019t store a single\u00a0<code>BIT<\/code>\u00a0value; it\u2019s something akin to the\u00a0<code>ENUM<\/code>\u00a0and\u00a0<code>SET<\/code>types. The data seems to be stored as a\u00a0<code>BINARY<\/code>\u00a0value, even though the documentation lists it as a \u201cnumeric type,\u201d in the same category as the other numeric types. The data isn\u2019t treated the same as a numeric value in queries, however. Comparisons to numeric values don\u2019t always work as expected.<\/p>\n<p>This change in behavior means it\u2019s not safe to use the\u00a0<code>BIT<\/code>\u00a0type in earlier versions and assume upgrades will go smoothly.<\/p>\n<h3>Display issues<\/h3>\n<p>The client libraries, including the command-line client and all the\u00a0<abbr title=\"Graphical User Interface\">GUI<\/abbr>\u00a0clients I\u2019ve seen, don\u2019t seem to know how to handle\u00a0<code>BIT<\/code>\u00a0values. They don\u2019t display them as a series of ones and zeroes. For instance, the following code even breaks the alignment of the command-line output!<\/p>\n<p><a href=\"https:\/\/ntspl.co.in\/blog\/wp-content\/uploads\/2013\/03\/ms1.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-395\" src=\"https:\/\/ntspl.co.in\/blog\/wp-content\/uploads\/2013\/03\/ms1.jpg\" alt=\"MYSQL\" width=\"686\" height=\"194\" srcset=\"https:\/\/www.ntspl.co.in\/blog\/wp-content\/uploads\/2013\/03\/ms1.jpg 686w, https:\/\/www.ntspl.co.in\/blog\/wp-content\/uploads\/2013\/03\/ms1-300x85.jpg 300w\" sizes=\"(max-width: 686px) 100vw, 686px\" \/><\/a><\/p>\n<p>As I mentioned above, the data seems to be stored internally, and transmitted through the client libraries, as a<code>BINARY<\/code>\u00a0value, which is actually a string type in MySQL. How it displays depends on the width of the column. For example, if the column is 32 bits wide, it is treated as\u00a0<code>CHAR(4)<\/code>. If it\u2019s 8 bits wide, it is treated as\u00a0<code>CHAR(1)<\/code>:<\/p>\n<p><a href=\"https:\/\/ntspl.co.in\/blog\/wp-content\/uploads\/2013\/03\/ms2.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-396\" src=\"https:\/\/ntspl.co.in\/blog\/wp-content\/uploads\/2013\/03\/ms2.jpg\" alt=\"ms2\" width=\"687\" height=\"144\" srcset=\"https:\/\/www.ntspl.co.in\/blog\/wp-content\/uploads\/2013\/03\/ms2.jpg 687w, https:\/\/www.ntspl.co.in\/blog\/wp-content\/uploads\/2013\/03\/ms2-300x63.jpg 300w\" sizes=\"(max-width: 687px) 100vw, 687px\" \/><\/a><\/p>\n<p>To display the value as an integer, it has to be cast to another type. One way to do this is add 0 to the value:<code>select ch + 0 from test;<\/code>. Another way is\u00a0<code>select cast(ch as unsigned) from test;<\/code><\/p>\n<p>Display width seems to be related to value with\u00a0<code>BIT<\/code>, in contrast to what the manual\u2019s section on\u00a0Overview of Numeric Types\u00a0states: \u201cDisplay width is unrelated to the storage size or range of values a type can contain.\u201d It appears that a field of size\u00a0<em>M<\/em>\u00a0can only store\u00a0<em>M<\/em>\u00a0bits, so it\u2019s the storage size, not the display size, that\u2019s affected. As I mentioned, bit values don\u2019t display as ones and zeroes anyway, so it makes no sense to say an 8-bit wide column \u201cdisplays with a width of 8.\u201d It doesn\u2019t display 8 bits, it stores 8 bits. In fact, inserting<code>b'100000000'<\/code>\u00a0into the table I defined above will store the value 255, demonstrating that the actual value has a maximum capacity of 8 bits. Any bits not set explicitly to 1 are set to 0, so values are left-padded with zeroes (the most significant bits are zeroes).<\/p>\n<h3>Bugs<\/h3>\n<p>I\u2019ve discovered some very strange bugs with\u00a0<code>BIT<\/code>\u00a0columns in MySQL. The issue I noticed was a\u00a0<code>LEFT OUTER JOIN<\/code>\u00a0failing when it should have succeeded. I discovered a combination of factors could cause the bug to appear and vanish. For example, the join will succeed or fail depending on combinations of these factors:<\/p>\n<ol>\n<li>the presence of an additional column, not involved in the query at all<\/li>\n<li>the presence of additional rows which don\u2019t match in the join<\/li>\n<li>the order of columns in the table<\/li>\n<li>the presence of an additional tautology in the join clause<\/li>\n<\/ol>\n<p>I\u2019ve\u00a0<a href=\"http:\/\/bugs.mysql.com\/bug.php?id=18895\">filed a bug with MySQL<\/a>\u00a0about the issues I found, including a\u00a0script which demonstrates several ways to trigger the bug.<\/p>\n<p><strong><span style=\"font-size: 1.17em;\">Why use it?<\/span><\/strong><\/p>\n<p>Given the problems I\u2019ve mentioned, I recommend avoiding it entirely. It provides nothing that\u2019s not already possible with standard numeric types and adds a lot of confusion.<\/p>\n<p><strong><span style=\"font-size: 1.17em;\">Differences from Microsoft SQL Server<\/span><\/strong><\/p>\n<p>Microsoft SQL Server also provides a\u00a0<code>BIT<\/code>\u00a0data type. However, it\u2019s completely different; it\u2019s a\u00a0<em>single-bit column<\/em>. Internally, it is stored as a single bit within an integer data type. As successive\u00a0<code>BIT<\/code>\u00a0columns are added to a table, SQL Server packs them together behind the scenes. This is equivalent to doing bitmask operations on a single column (my previous employer loved bitmask columns!), but it allows the bits to be named explicitly, avoiding the need to pass around named constants (or embed magic numbers) and deal with bitwise arithmetic.<\/p>\n<h3>Pros and cons of bitmask columns<\/h3>\n<p>Bitmask columns (an integer within which each bit is retrieved and set via bitwise arithmetic) can be extremely handy. They\u2019re a very compact way to pack true\/false values together for efficient storage. They can also facilitate certain types of queries; for example, \u201cif any value is set\u201d queries become simple. I\u2019ve used them in ACLs\u00a0stored in a database, for instance. Certain types of problems are just easy to solve with bitwise arithmetic, and for those problems, creating an integer column and declaring \u201cbit 5 is whether the value is [something]\u201d makes a lot of sense.<\/p>\n<p>On the negative side, bitmask columns can be hard to use. For one thing, they\u2019re hard to understand. Without the documentation that says which bit means what, they\u2019re pretty much useless. SQL Server avoids this and the other problems I\u2019ll name by treating each bit as a separate column and naming it, but that\u2019s only if you use that facility, which my previous employer didn\u2019t. Bitwise arithmetic can also be pretty tricky to write, and even harder to read.<\/p>\n<p>Magic numbers in queries are just as meaningless as a column named\u00a0<code>bitcolumn1<\/code>. Declaring and passing around constants to name the magic numbers is a nice thought, but it\u2019s error-prone and it\u2019s extra work. Creating a table to define the bits can be quasi-helpful as well, unless (as often happened at my previous employer) you can\u2019t find the table, or the column is named such that you can\u2019t tell which table defines the values for which column, or the table\u2019s values don\u2019t make any sense for bitwise arithmetic.<\/p>\n<p>Bitmask columns are also not index-friendly, so querying against them isn\u2019t optimal. Of course, any column with only two values is useless to an index anyway, so this is no worse, performance-wise than storing the yes\/no values in columns by themselves. Since there\u2019s fewer data to examine, it can actually be more efficient.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>MySQL implements the\u00a0BIT\u00a0data type differently in different versions, and the behavior is not what one might expect. In this article I\u2019ll explain how MySQL\u2019s behavior has changed over time, what strange things can happen as a result, and how to understand and work around display issues. I\u2019ll tell you about a serious bug I\u2019ve found, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1499,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[29],"tags":[],"class_list":["post-394","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-resellerclub"],"acf":{"custom_meta_title":"Why we should not use BIT columns in MySQL Database | NTSPL","meta_description":"MySQL implements the BIT data type differently in different versions. Here are the reasons why we should not use BIT columns in MySQL Database.","meta_keyword":"bit columns in mysql, bit data type, bit data type in mysql server","other_meta_tag":"<meta property=og:locale content=\"en-IN\" \/>\r\n<meta property=og:type content=\"website\" \/>\r\n<meta property=og:title content=\"Why we should not use BIT columns in MySQL Database | NTSPL\"\/>\r\n<meta property=og:description content=\"MySQL implements the BIT data type differently in different versions. Here are the reasons why we should not use BIT columns in MySQL Database.\"\/>\r\n<meta property=og:url content=\"https:\/\/www.ntspl.co.in\/blog\/why-we-should-not-use-bit-columns-in-mysql-database\"\/>\r\n<meta property=og:site_name content=NTSPL \/>\r\n<meta name=\"twitter:site\" content=\"@NTSPL\">\r\n<meta name=twitter:card content=\"summary\" \/>\r\n<meta name=twitter:description content=\"MySQL implements the BIT data type differently in different versions. Here are the reasons why we should not use BIT columns in MySQL Database.\"\/>\r\n<meta name=twitter:title content=\"Why we should not use BIT columns in MySQL Database | NTSPL\"\/>"},"_links":{"self":[{"href":"https:\/\/www.ntspl.co.in\/blog\/wp-json\/wp\/v2\/posts\/394"}],"collection":[{"href":"https:\/\/www.ntspl.co.in\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ntspl.co.in\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ntspl.co.in\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ntspl.co.in\/blog\/wp-json\/wp\/v2\/comments?post=394"}],"version-history":[{"count":6,"href":"https:\/\/www.ntspl.co.in\/blog\/wp-json\/wp\/v2\/posts\/394\/revisions"}],"predecessor-version":[{"id":4303,"href":"https:\/\/www.ntspl.co.in\/blog\/wp-json\/wp\/v2\/posts\/394\/revisions\/4303"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.ntspl.co.in\/blog\/wp-json\/wp\/v2\/media\/1499"}],"wp:attachment":[{"href":"https:\/\/www.ntspl.co.in\/blog\/wp-json\/wp\/v2\/media?parent=394"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ntspl.co.in\/blog\/wp-json\/wp\/v2\/categories?post=394"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ntspl.co.in\/blog\/wp-json\/wp\/v2\/tags?post=394"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}