Search code examples
regexsqlitepandasql

sqlite regex: How return count of 'X' from values in column


i am using pandasql to transform data. Inside a query i would like to pull out for example the number of periods ('.') in the email address. SQLite does not seem to support regex.

In SQL i could write:

length(regexp_replace(email, '[^.]', '', 'g')) as email_period

*#applying this to the email ([email protected]) would return 2*

Look forward to your expertise for a solution with SQLite. Thank you in advance.


Solution

  • You can use simple REPLACE and LENGTH to calculate number of periods:

    CREATE TABLE tab(email VARCHAR(100));
    
    INSERT INTO tab(email) 
    VALUES ('[email protected]'),('[email protected]');
    
    SELECT email, LENGTH(email) - LENGTH(REPLACE(email, '.', '')) AS  email_period
    FROM tab;
    

    SqlFiddleDemo

    EDIT:

    Counting digits:

    SELECT email,
       LENGTH(email) - 
       LENGTH(REPLACE( REPLACE(REPLACE(REPLACE(REPLACE(
                 REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(email, '0', '')
                 ,'1', ''),'2', ''),'3', ''),'4', '')  
                 ,'5', ''),'6', ''),'7', ''),'8', '')  
                 ,'9', '')) AS email_digits
    FROM tab
    

    SqlFiddleDemo2