Determine whether field in MySql indexed column is numeric using index semantics

I have a MySql table where I want to get the count of rows where a given VARCHAR column has a numeric value (convertible to number, you know). Right now, I'm doing a simple REGEXP check on this field. Since this table is very large, I'm using a series of indexes to REGEXP as few rows as possible.

But this VARCHAR column is also indexed. Is there a clever hack of the MySql indexing algorithm that I can exploit to scan even fewer rows? :-/ This is an InnoDB table.

Solution

You may not like this, as you are probably already trying to avoid it, but rather than trying to do some clever trick, when I have had situations like this, I add an additional column that stores the varchar in an numeric column (updated using a trigger), and query on that.

But, there is a way I can see to do it (though I have never had a reason to do this in production), which is to exploit the fact that indexing will put the values in order, such that all that begin with a number are sequenced together.

Assuming a table like this:

CREATE TABLE `test_1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `text_or_number` varchar(255),
  PRIMARY KEY (`id`),
  KEY `test_1_idx` (`text_or_number`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

You can select only values starting with numbers by exploiting the order of utf8 characters - http://en.wikipedia.org/wiki/UTF-8#Examples

The lowest value before 0 in utf-8 is "/", and the highest after it is ":", so this should extract only values that start with a number:

select   cast(text_or_number as unsigned)
from     test_1
where    text_or_number < ':'
and      text_or_number > '/'
and      cast(text_or_number as unsigned) > 0;

That could still contain values that start with a number, but do not end with one, which is why I have added the cast(...) > 0 clause, but I think mysql will be smart enough to run the where clauses in order, so hopefully it will only run the cast on the subset of rows that start with a numeric char.