
Convert a text file into a list using an empty line as the delimiter in Python


I have a text file that contains paragraphs separated by an empty line, as shown below. I am trying to create a list in which each element is one paragraph from the text file.

The Yakutia region, or Sakha Republic, where the Siberian wildfires are mainly 
taking place is one of the most remote parts of Russia. 

The capital city, Yakutsk, recorded one of the coldest temperatures on Earth in 
February 1891, of minus 64.4 degrees Celsius (minus 83.9 degrees Fahrenheit); but 
the region saw record high temperatures this winter. 

The Siberian Times reported in mid-July that residents were breathing smoke from more than 
300 separate wildfires, but that only around half of the forest blazes were being tackled 
by firefighters — including paratroopers flown in by the Russian military — because 
the rest were thought to be too dangerous.

The wildfires have grown in size since then and have engulfed an estimated 62,300 square 
miles (161,300 square km) since the start of the year.

So in the above example there would be 4 elements in the list, one for each paragraph.

I can easily combine the paragraphs into a single string using the following code,

mystr = " ".join([line.strip() for line in lines])

but I have no idea how to use the empty line between the paragraphs as a delimiter to make a list out of the text file. I have tried,

with open('texr.txt', encoding='utf8') as f:
    lines = [line for line in f]

hoping that I could convert every line into a list element and then combine everything between two empty lines into one string. But that doesn't seem to work. I must be missing something very fundamental here.

Thanks


Solution

  • Try reading the whole file and splitting it on the blank line ('\n\n') between paragraphs:

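    # Same file name and encoding as in the question
    with open('texr.txt', encoding='utf8') as fp:
        # A blank line between paragraphs is two consecutive newlines
        lst = [p.strip() for p in fp.read().split('\n\n')]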
    
    >>> lst
    
    ['The Yakutia region, or Sakha Republic, where the Siberian wildfires are mainly \ntaking place is one of the most remote parts of Russia.',
     'The capital city, Yakutsk, recorded one of the coldest temperatures on Earth in \nFebruary 1891, of minus 64.4 degrees Celsius (minus 83.9 degrees Fahrenheit); but \nthe region saw record high temperatures this winter.',
     'The Siberian Times reported in mid-July that residents were breathing smoke from more than \n300 separate wildfires, but that only around half of the forest blazes were being tackled \nby firefighters — including paratroopers flown in by the Russian military — because \nthe rest were thought to be too dangerous.',
     'The wildfires have grown in size since then and have engulfed an estimated 62,300 square \nmiles (161,300 square km) since the start of the year.']
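
Note that each element above still contains the '\n' line breaks inside the paragraph, and that splitting on exactly '\n\n' may leave empty or stray elements if the file has several blank lines in a row or "blank" lines that contain spaces. A minimal sketch of a more tolerant variant, assuming you want each paragraph as a single-line string (as with the " ".join in the question), might look like this. The function name split_paragraphs and the sample text are hypothetical and only for illustration:

```python
import re

def split_paragraphs(text):
    """Split text into paragraphs on runs of blank or whitespace-only lines."""
    # A newline, optional whitespace (including further newlines), then a newline
    chunks = re.split(r'\n\s*\n', text.strip())
    # Collapse line breaks and repeated spaces inside each paragraph
    paragraphs = [" ".join(chunk.split()) for chunk in chunks]
    # Drop anything that ended up empty (e.g. when the input itself is empty)
    return [p for p in paragraphs if p]

# Hypothetical sample: two paragraphs separated by several messy blank lines
sample = "first line \nsecond line\n\n\n   \nnext paragraph\n"
print(split_paragraphs(sample))
# → ['first line second line', 'next paragraph']
```

To use it on the file from the question, pass it the full contents, for example split_paragraphs(fp.read()) inside the same with open(...) block.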