Search code examples
mysqlxmlxpathextract-value

Count number of rows where Extract Value found a match


100,000 rows in a table which contains a column that stores a large XML block, I need to check if there is a certain XML tag is filled with data in this column, lets say the column is called test_request and the XML tag is named 'd'. Also I want to make sure that the value inside 'd' doesn't contain a newline /n within the XML tag. So for every row that has a match I want to add 1 to a overall count. Here is my query so far.

SELECT EXTRACTVALUE( UNCOMPRESS(`test__request` ) ,  count('/a/b/c/d') ) 
FROM testTable16
WHERE  `test_created` >  '2014-08-16 10:00:00'
AND  `test_created` <=  '2014-08-16 10:10:00'
AND  `test_client` =  'test2'
AND  `test_user` =  'testuser2'
AND UNCOMPRESS(  `test__request` ) LIKE  '%<testID>test</testID>%'
LIMIT 0 , 30

it doesn't work though as it returns 100,000 rows which I cant obviously sift through. And I am not sure how to do the isnt newline check.


Solution

  • If you only to return rows with a count, you should move your count to the WHERE clause.

    My XPATH is a little rusty, but I believe you can use a predicate with the contains function:

    SELECT *
    FROM testTable16
    WHERE  `test_created` >  '2014-08-16 10:00:00'
    AND  `test_created` <=  '2014-08-16 10:10:00'
    AND  `test_client` =  'test2'
    AND  `test_user` =  'testuser2'
    AND UNCOMPRESS(`test__request`) LIKE  '%<testID>test</testID>%'
    AND EXTRACTVALUE( 
          UNCOMPRESS(`test__request`),  
          'count(/a/b/c/d[contains(text(),"\n")])'
        ) > 0
    LIMIT 0 , 30
    

    If you want to return a count of all rows that have at least one match use SELECT COUNT(*) ...

    If you want a total of all the node counts use:

    SELECT SUM(EXTRACTVALUE( 
             UNCOMPRESS(`test__request`),  
             'count(/a/b/c/d[contains(text(),"\n")])'
           ))
    FROM testTable16
    WHERE  `test_created` >  '2014-08-16 10:00:00'
    AND  `test_created` <=  '2014-08-16 10:10:00'
    AND  `test_client` =  'test2'
    AND  `test_user` =  'testuser2'
    AND UNCOMPRESS(`test__request`) LIKE '%<testID>test</testID>%'