Get paragraphs after a certain symbol: need better output

I'm a Python beginner.

I have this code that gets all paragraphs after the symbols '*****':

import re


file = open('/Users/simon/DRIVE/ARCHIVED/Tools at Hand/PASTE.txt', mode='r')

result = [s.strip() for s in re.findall(r'^\*{4,}((?:\r?\n(?!\s*$|\*{4}).+)*)', file.read(), re.MULTILINE)]



print(result)
file.close()

Input:

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

****
Sed id placerat magna.

*******
Pellentesque in ex ac urna tincidunt tristique. 

Etiam dapibus faucibus gravida.

The output I need:

Sed id placerat magna.

Pellentesque in ex ac urna tincidunt tristique.

The output I get:

['Sed id placerat magna.', 'Pellentesque in ex ac urna tincidunt tristique.']

I can't figure out how to print each match as its own paragraph, without the quotes and commas.

Solution

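re.findall returns a list; because the pattern has one capture group, each item is the text of that group. Printing the list directly shows its repr, which is why you see the brackets, quotes and commas.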
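As a small illustration of that behavior (the sample string and the simplified pattern here are made up for the demo, not taken from the question's file):

```python
import re

# Hypothetical sample text, standing in for the file contents.
text = "****\nfirst\n\n*****\nsecond\n"

# With one capture group, findall returns a list of that group's text.
matches = re.findall(r'^\*{4,}\r?\n(.+)', text, re.MULTILINE)

# print() on a list shows the list's repr: ['first', 'second']
print(matches)
```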

You could, for example, unpack the list by prefixing it with * in the call to print, and set the separator to two newlines.

import re

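# Read the whole file; the with-block closes it automatically.
with open('/Users/simon/DRIVE/ARCHIVED/Tools at Hand/PASTE.txt', mode='r') as file:
    text = file.read()

# findall returns the captured paragraph after each line of 4+ asterisks.
result = [s.strip() for s in re.findall(r'^\*{4,}((?:\r?\n(?!\s*$|\*{4}).+)*)', text, re.MULTILINE)]

# Unpack the list and separate the items with a blank line.
print(*result, sep="\n\n")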

Output

Sed id placerat magna.

Pellentesque in ex ac urna tincidunt tristique.
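
For a version that runs without the file, here is a minimal sketch that inlines the sample input as a string. The names SAMPLE, PATTERN and paragraphs_after_markers are illustrative and do not come from the question. It also shows "\n\n".join(...) as an equivalent way to build the same output as a single string, which can be handy if you want to write it somewhere other than the console.

```python
import re

# Sample input from the question, inlined so the example is self-contained.
SAMPLE = """Lorem ipsum dolor sit amet, consectetur adipiscing elit.

****
Sed id placerat magna.

*******
Pellentesque in ex ac urna tincidunt tristique.

Etiam dapibus faucibus gravida.
"""

# Same pattern as in the question: a line of 4+ asterisks,
# then the non-blank lines that directly follow it.
PATTERN = re.compile(r'^\*{4,}((?:\r?\n(?!\s*$|\*{4}).+)*)', re.MULTILINE)


def paragraphs_after_markers(text):
    """Return the stripped paragraph following each asterisk line."""
    return [s.strip() for s in PATTERN.findall(text)]


result = paragraphs_after_markers(SAMPLE)

# Produces the same text as print(*result, sep="\n\n"),
# but keeps it in a variable for later use.
output = "\n\n".join(result)
print(output)
```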