I have the text below stored in the variable description:
This is a code update
Official Name: None
Pub: https://content.upcodes.co/viewer/washington/wa-mechanical-code-2021
Agency:
Reference: https://web.archive.org/web/20230226234118/https://lawfilesext.leg.wa.gov/law/wsr/agency/BuildingCodeCouncil.htm
Citation: WAC 51-52 / WSR 23-02-055
Draft Doc Title: WSR 23-02-055 (#1)
Draft Source Doc: https://web.archive.org/web/20230303022030/https://lawfilesext.leg.wa.gov/law/wsr/2023/02/23-02-055.htm (#1)
Draft Drive: https://drive.google.com/file/d/1pYmwQS3t-ZX-Vyg9yBabtIpXZ7By2G6f/view?usp=share_link ( #1)
Final Doc Title:
IECC Com Update(#1)
IECC Res Update (#2)
Final Source Doc: https://web.archive.org/web/20230303022130/https://apps.leg.wa.gov/wac/default.aspx?cite=51-52&full=true&pdf=true (#1)
https://web.archive.org/web/20230303022030/https://lawfilesext.leg.wa.gov/law/wsr/2023/02/23-02-055.htm (#2)
Final Drive: https://web.archive.org/web/20230303022130/https://apps.leg.wa.gov/wac/default.aspx?cite=51-52&full=true&pdf=true (#1)
https://web.archive.org/web/2023030302fdfdfg2130/https://apps.legfdg.gov/wac/default.aspx?cite=51-52&fdsfullfdsf=true&pfdsfdf=true (#2)
Effective Date: January 4, 2023
I want to extract the information after the 'Final Doc Title:' tag. It should give me two values: IECC Com Update(#1) and IECC Res Update (#2). The code below extracts the text after the tag until a newline character is found.
//8. Extract Final Doc Title
var final_doc_title = description.search("Final Doc Title:");
if (final_doc_title != -1) {
  final_doc_title = description.match(/(?<=^Final Doc Title:)[^\n\r]+/m);
  final_doc_title = final_doc_title?.[0].trim();
} else {
  final_doc_title = '';
}
console.log('Final Doc Title: ' + final_doc_title);
The problem with this code is that it returns an empty string, because there is a newline character right after 'Final Doc Title:'.
Final Doc Title:\n
IECC Com Update(#1)\n
IECC Res Update (#2)\n
How can I modify my code to return both lines? Thanks!
You can match those newline characters with \s*, assuming you are not interested in the white space that precedes the text you are looking for.
If the text you want to find ends just before the next line that contains a colon (like in Final Source Doc: https:....), then you could do the following:
const description = "This is a code update\n\nOfficial Name: None\n\nPub: https://content.upcodes.co/viewer/washington/wa-mechanical-code-2021\n\nAgency: \n\nReference: https://web.archive.org/web/20230226234118/https://lawfilesext.leg.wa.gov/law/wsr/agency/BuildingCodeCouncil.htm\n\nCitation: WAC 51-52 / WSR 23-02-055\n\nDraft Doc Title: WSR 23-02-055 (#1)\n\nDraft Source Doc: https://web.archive.org/web/20230303022030/https://lawfilesext.leg.wa.gov/law/wsr/2023/02/23-02-055.htm (#1)\n\nDraft Drive: https://drive.google.com/file/d/1pYmwQS3t-ZX-Vyg9yBabtIpXZ7By2G6f/view?usp=share_link ( #1)\n\nFinal Doc Title: \n\nIECC Com Update(#1)\n\nIECC Res Update (#2)\n\nFinal Source Doc: https://web.archive.org/web/20230303022130/https://apps.leg.wa.gov/wac/default.aspx?cite=51-52&full=true&pdf=true (#1)\n\nhttps://web.archive.org/web/20230303022030/https://lawfilesext.leg.wa.gov/law/wsr/2023/02/23-02-055.htm (#2)\n\nFinal Drive: https://web.archive.org/web/20230303022130/https://apps.leg.wa.gov/wac/default.aspx?cite=51-52&full=true&pdf=true (#1)\n\nhttps://web.archive.org/web/2023030302fdfdfg2130/https://apps.legfdg.gov/wac/default.aspx?cite=51-52&fdsfullfdsf=true&pfdsfdf=true (#2)\n\nEffective Date: January 4, 2023\nI want to extract the information after 'Final Doc Title:' tag. It should give me two values. The first value is IECC Com Update(#1) and IECC Res Update (#2). I have a code below that extracts the text after the tag until a new line character is found.\n\n//8. Extract Final Doc Title";
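// Capture every following line that contains no colon; the capture stops
// at the next tagged line (here "Final Source Doc: ...").
var result = description.match(/^Final Doc Title:\s*((?:\s*^(?:[^:\r\n]*)$)*)/m)?.[1];
// Split the captured block into its non-empty lines.
var parts = result?.match(/.+/gm);
console.log(parts);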
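The regex above relies on the wanted lines containing no colon, so it would not work for a tag such as Final Source Doc:, whose values are URLs, and it skips a value that sits on the same line as the tag. If you need that too, a line-based variant might look like the sketch below. This is a different technique (splitting into lines rather than one regex), and the names extractTagValues and TAG_LINE, the shortened sample text, and the example.invalid URLs are made up for illustration. It assumes that a new tag is a line starting with letters and spaces, then a colon followed by white space or the end of the line.

```javascript
// Assumption: a tag line looks like "Some Words:" followed by white space
// or end of line. "https://..." is not a tag, because "/" follows its colon.
const TAG_LINE = /^[A-Za-z][A-Za-z ]*:(?:\s|$)/;

// Hypothetical helper: return the values that belong to `tag`, whether they
// sit on the tag's own line or on the lines that follow it.
function extractTagValues(text, tag) {
  const lines = text.split(/\r?\n/);
  const start = lines.findIndex(line => line.startsWith(tag));
  if (start === -1) return [];

  const values = [];
  // A value on the same line as the tag, if any.
  const sameLine = lines[start].slice(tag.length).trim();
  if (sameLine !== "") values.push(sameLine);

  // Following lines, up to (not including) the next tag line.
  for (const line of lines.slice(start + 1)) {
    const trimmed = line.trim();
    if (trimmed === "") continue;      // skip blank lines
    if (TAG_LINE.test(trimmed)) break; // next tag reached
    values.push(trimmed);
  }
  return values;
}

// Shortened, made-up sample in the same layout as the description above.
const sample =
  "Draft Doc Title: WSR 23-02-055 (#1)\n\n" +
  "Final Doc Title: \n\nIECC Com Update(#1)\n\nIECC Res Update (#2)\n\n" +
  "Final Source Doc: https://example.invalid/one (#1)\n\n" +
  "https://example.invalid/two (#2)\n\n" +
  "Effective Date: January 4, 2023";

console.log(extractTagValues(sample, "Final Doc Title:"));
// → ["IECC Com Update(#1)", "IECC Res Update (#2)"]
```

A missing tag gives an empty array instead of undefined, so the caller can use values.length or values.join(', ') without an extra check.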