coldfusion, coldfusion-10

How to parse out data of a string


I have a function which gets a string from another website. After extracting it, I end up with the following string:

IFX TMP2134567 1433010010 WT33 PARTIAL 2014-11-26 09:43:58 IFX TEMP12345 1433010003 SW80 PARTIAL 2014-11-26 09:43:10 IFX AP RETERM 007 1418310108 MB01 CONFIRMED 2014-07-03 09:48:37

In this case there are 3 records with 6 fields each, all separated by spaces. How can I read the string and put the records into a structure and array so that I can access them?

The fields would be set up like this:

  1. IFX
  2. TMP2134567 (this field may contain a space)
  3. 1433010010
  4. WT33
  5. PARTIAL
  6. 2014-11-26 09:43:58

If we use " " as the separator we get 7 items per record, because the 6th field is a date-time with a space in it. I could also work with 7 items, since I can join items 6 and 7 back together or store the date and time separately.

My question: is there a way to do this with 6 fields, and if I have to use 7, how would I do that? I tried ValueList, but that does not work.

I know a couple of things about my list: the 1st field is always 3 characters, the 4th is always 4 characters, and each record ends with a date-time in the format YYYY-MM-DD HH:MM:SS.

To make it a bit more complicated, I just found that the 2nd field can contain spaces, as in the 3rd record, where it is "AP RETERM 007".


Solution

  • Another option is to use a regular expression to turn your data into a JSON string like this, and then deserialize it.

    <cfsavecontent variable="sampledata">
      IFX TMP2134567 1433010010 WT33 PARTIAL 2014-11-26 09:43:58 IFX TEM P12345 1433010003 SW80 PARTIAL 2014-11-26 09:43:10 IFX AP RETERM 007 1418310108 MB01 CONFIRMED 2014-07-03 09:48:37</cfsavecontent>
    
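    <!--- Capture the six fields of each record and rewrite the record as a JSON array:
          3 chars, free text (may contain spaces), digits, 4 chars, non-space text, date-time --->
    <cfset asJson  = ReReplaceNoCase(sampledata,"\s*(.{3}) (.*?) (\d+) (.{4}) ([^\s]*) (\d+-\d+-\d+ \d+:\d+:\d+)\s*",'["\1","\2","\3","\4","\5","\6"],',"ALL")>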
    
    <!--- Replace the last comma in the generated string with a closing bracket --->
    <cfset asJson = "[" & ReReplace(asJson,",$","]","ALL")>
    
    <cfset result_array = DeSerializeJSON(asJson)>
    
    <cfdump var="#result_array#">
    

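    You can then access the data directly through the resulting array, which holds one inner array of six values per record.

    The regular expression is based on how I understand the six fields: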

    1. 3 characters
    2. Variable string
    3. All digits
    4. 4 characters
    5. I assume this value never contains a space
    6. Date/Time
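
Since the question asks for a structure and an array, here is a minimal sketch of how the deserialized result could be read by position and then copied into an array of structs. The key names (`prefix`, `reference`, `number`, `code`, `status`, `timestamp`) are hypothetical labels chosen for illustration, since the original data does not name its fields. The first line only builds a small stand-in for `result_array` so the snippet runs on its own.

```cfml
<!--- Stand-in for the result_array built above: one inner array per record --->
<cfset result_array = DeserializeJSON('[["IFX","TMP2134567","1433010010","WT33","PARTIAL","2014-11-26 09:43:58"],["IFX","AP RETERM 007","1418310108","MB01","CONFIRMED","2014-07-03 09:48:37"]]')>

<!--- Positional access: second record, second field --->
<cfoutput>#result_array[2][2]#</cfoutput>

<!--- Copy each row into a struct; the key names are assumptions, not part of the data --->
<cfset records = []>
<cfloop array="#result_array#" index="row">
    <cfset record = {}>
    <cfset record.prefix    = row[1]>
    <cfset record.reference = row[2]>
    <cfset record.number    = row[3]>
    <cfset record.code      = row[4]>
    <cfset record.status    = row[5]>
    <cfset record.timestamp = row[6]>
    <cfset ArrayAppend(records, record)>
</cfloop>

<!--- Access by name instead of by position --->
<cfoutput>#records[2].reference# (#records[2].status#)</cfoutput>
<cfdump var="#records#">
```

The date-time is kept as a plain string here. If it needs to be a date object, or stored as separate date and time values, that conversion would be an extra step on `row[6]`.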