I have written a regular expression to find codes of a certain pattern (for example 008-150A-003E, 9C1-E10-010, etc.) in a paragraph, but it fails: it also returns values that merely match the length of the pattern.
This is the regex I wrote:
[A-z0-9]{3,4}[\S][A-z0-9]{3,4}[\S][A-z0-9]{3,4}
For testing I used the paragraph below:
****Test Paragraph******
PCB ASSLY (9C1-E10-010)
PLEASE ARRANGE QUOTATION FOR THE ABOVE ITEAM AT THE EARLIEST 9C1 E10.010
008-150A-003E
Form BHUVANESWARI COTSPIN INDIA P LTD
asdfghjklpo
***************
I want to find only the patterns 9C1-E10-010, 008-150A-003E, etc. in the test paragraph above. PS: a code can also take the form 9C1 E10.010, so I have included that in the test paragraph as well.
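The problem is that you are using \S, which matches any non-whitespace character, where you mean to match a separator. Use - to match a hyphen, [^\w\s] to match any punctuation character, or plain \W to match any non-word character (which also covers the space in 9C1 E10.010). Also, replace A-z with A-Za-z, because [A-z] matches more than just letters: it also matches [, \, ], ^, _ and the backtick, which lie between Z and a in ASCII. Finally, to enforce the length restriction at the start and end of the match, you need (at least) word boundaries.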
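As a quick check of the two pitfalls described above, here is a minimal Python sketch. The variable names (extra, original) are chosen for illustration only, and Python's re module is assumed as the regex engine; other engines should behave the same way for these constructs.

```python
import re

# [A-z] covers ASCII 65..122, so it also accepts the six characters
# that sit between 'Z' and 'a': [ \ ] ^ _ `
extra = [ch for ch in "[\\]^_`" if re.fullmatch(r"[A-z]", ch)]
print(extra)

# \S accepts any non-whitespace character, including letters, so an
# ordinary 11-letter word satisfies the original pattern (3 + 1 + 3 + 1 + 3).
original = re.compile(r"[A-z0-9]{3,4}[\S][A-z0-9]{3,4}[\S][A-z0-9]{3,4}")
print(original.fullmatch("asdfghjklpo") is not None)  # True
```

This is why the original expression appears to select values by length alone: nothing in it actually requires a separator.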
Use
\b[A-Za-z0-9]{3,4}\W[A-Za-z0-9]{3,4}\W[A-Za-z0-9]{3,4}\b
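To see the corrected expression at work on the test paragraph from the question, here is a small Python sketch. The names TEXT, PATTERN and find_codes are illustrative, and Python's re module is assumed; the pattern itself is the one given above.

```python
import re

# Test paragraph from the question (without the asterisk delimiter lines).
TEXT = """PCB ASSLY (9C1-E10-010)
PLEASE ARRANGE QUOTATION FOR THE ABOVE ITEAM AT THE EARLIEST 9C1 E10.010
008-150A-003E
Form BHUVANESWARI COTSPIN INDIA P LTD
asdfghjklpo"""

# Three groups of 3-4 alphanumerics, each pair separated by exactly one
# non-word character, with word boundaries at both ends.
PATTERN = re.compile(r"\b[A-Za-z0-9]{3,4}\W[A-Za-z0-9]{3,4}\W[A-Za-z0-9]{3,4}\b")

def find_codes(text):
    """Return every substring of text that matches PATTERN."""
    return PATTERN.findall(text)

print(find_codes(TEXT))
# ['9C1-E10-010', '9C1 E10.010', '008-150A-003E']
```

Note that \W also matches a newline, so three short words separated by single spaces or line breaks would match too. If only specific separators are expected, replacing \W with a character class such as [-. ] is one possible tightening (an assumption about the data, not something stated in the question).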