Search code examples
design-patternskdb

Extract rows with column name matching a specific pattern in kdb q


I have a table with multiple patterns of char string. I need to extract the data with specific pattern .

tableName: ([] description: ("ABC 1.45 Apr-24"; "ABC 1.34 1/2/23"; "DE 2.456 Mar-22";"CBD 4/3/2023 5.78"; "XYZ 3.7 Jun-25"; "PQR 4.12 May-21"; "MNO 7.345 Jul-23"))

From the above table i need to extract data which is of pattern "?* .** ???-??"

The pattern "?* .** ???-??" is divided into three parts

  1. first part - i need it to be in letter which is the name(can be any number of letters) (ex. AB or ABC or ABCD etc)
  2. second part - i need it in decimal values format , can be more than one decimal value or max available
  3. third part - date format in the pattern APR-24

i tried the below , but does not show any data

q)pattern: "^[A-Za-z]+ [0-9]+\\.[0-9]+ [A-Za-z]+-[0-9]+$"  / Pattern to match

q)result: select description from tableName where description like\: pattern

q)result

OUTPUT AS

description
------------
"ABC 1.45 Apr-24"
"DE 2.456 Mar-22"
"XYZ 3.7 Jun-25"
"PQR 4.12 May-21"
"MNO 7.345 Jul-23"

Solution

  • Here's a method you could implement entirely within Q:

    tableName:update description:{" " vs x} each description from tableName; // Split into components
    tableName:select from tableName where 3=count each description, all each description[;0] in .Q.A, all each description[;1] in (.Q.n,"."), description[;2] like "[A-Z][a-z][a-z]-[0-3][0-9]"; // Apply clauses
    tableName:update description:{" " sv x}each description from tableName; // Rejoin into strings
    

    This first breaks up the strings by a space delimiter. Then the 4 elements of the where clause are:

    1. Check if there are 3 elements after slitting up by a space delimiter.
    2. Check if all elements in the first text block are Alphabetic Capitalised (.Q.A)
    3. Check if all elements in the second text block are Numeric and Decimal (.Q.n,".")
    4. Use Q's inbuilt regex functionality to check if the last text block adheres to our pattern

    Result:

    description       
    ------------------
    "ABC 1.45 Apr-24" 
    "DE 2.456 Mar-22" 
    "XYZ 3.7 Jun-25"  
    "PQR 4.12 May-21" 
    "MNO 7.345 Jul-23"