php regex text-processing plaintext

Get list items from plain text using regex


I want to extract all the list items from a plain-text (.txt) file using regex. For example, given:

Books must I read this week before Saturday:
1. Geography
2. Math
3. Biology
The priority book is book 2. This book is borrowed by John.

I use preg_match_all as follows:

$pattern = "/^[0-9]\.(.*)\n/";
preg_match_all($pattern, $filehandler, $matches);

I expect the following result:

1. Geography
2. Math
3. Biology

The fragment "2. This book is borrowed by John." should not end up in $matches. However, this pattern returns no matches at all. Does anyone know what pattern I should use?


Solution

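  • You can try this. Without the m modifier, ^ only matches at the very start of the whole string, and the text starts with "Books", so your pattern finds nothing. The pattern below anchors on the preceding newline instead and allows leading whitespace before the number: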

    $list = 'Books must I read this week before Saturday:
    1. Geography
      2. Math
            3. Biology
    The priority book is book 2. This book is borrowed by John.';
    
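    // Match a newline, optional whitespace, then capture "<digits>." plus the rest of the line.
    // \s already includes tabs, so [\s\t] can be written as \s.
    preg_match_all('/\n\s*(\d+\..*)/', $list, $bullets);
    
    // $bullets[1] holds the captured list items.
    var_dump($bullets);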
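
An alternative is to keep the ^ anchor from the question and add the m (multiline) modifier, so that ^ and $ match at every line boundary. This is only a sketch. It assumes the file contents are already in a string (for example, read with file_get_contents) and not a file handle, and the variable names are illustrative. Unlike the newline-based pattern, it would also catch a list item on the very first line.

```php
<?php
// Sample text from the question, with varying indentation before the items.
$list = "Books must I read this week before Saturday:
1. Geography
  2. Math
        3. Biology
The priority book is book 2. This book is borrowed by John.";

// m modifier: ^ and $ match at the start and end of each line.
// \h* allows leading spaces or tabs; \d+\. requires the number to open the line,
// so the "2." in the middle of the last sentence is not matched.
preg_match_all('/^\h*(\d+\..*)$/m', $list, $matches);

// $matches[1] contains the list items without their leading indentation.
print_r($matches[1]);
```

If the file uses Windows line endings (\r\n), each captured item may end with a stray \r, so running the results through trim() is a reasonable precaution.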