I have a text that consists of several paragraphs separated by double newlines. I'd like to wrap it to a line width of 70, keeping the paragraph breaks, and the whole thing should start with one non-indented line beginning with "Abstract: Lorem ipsum ...".
So the whole thing should look like this:
Abstract: Magna risus nonummy mollis mattis neque commodo mattis fusce
          hendrerit nibh. Lorem massa lorem mauris ad orci quam risus
          viverra aliquet senectus sociis. Donec proin nam dolor neque
          placerat imperdiet eros ullamcorper egestas cum torquent
          habitasse. Risus donec odio nostra ac et pede inceptos
          praesent montes. Neque morbi sit morbi vestibulum
          suspendisse mauris. Lacus massa mollis.

          Donec class integer pede ac sed elit. Fames augue magnis
          sapien natoque nisi. Proin augue mus nisl interdum convallis
          pellentesque conubia.

          Class dolor tempor netus suspendisse odio orci
          vestibulum mus. Netus purus. Lacus metus tempor purus
          adipiscing faucibus eget maecenas. Velit lacus integer
          rhoncus primis nunc quis lorem lacus dictumst hendrerit.
I am trying to use textwrap, but that doesn't produce the desired output. Here's the code:
from loremipsum import get_paragraphs
import textwrap

text = '\n\n'.join(get_paragraphs(3))
item = 'Abstract: '
print(textwrap.fill(item + text, initial_indent='', subsequent_indent=' ' * len(item), replace_whitespace=False))
This works fine for the first paragraph, but the following paragraphs get strange indentation and short lines, like this:
Class vitae
          nonummy imperdiet cras blandit fusce. Massa porta metus
          semper tempor non id viverra eget. Purus morbi lorem semper
          eget. Proin magna tortor metus magnis. Vitae ipsum. Velit
          class aliquet tortor dolor parturient ullamcorper libero ac.
This happens even if I use initial_indent=' '*len(item). Is this a bug? How can I get what I want?
From the documentation:
Note: If replace_whitespace is false, newlines may appear in the middle of a line and cause strange output. For this reason, text should be split into paragraphs (using str.splitlines() or similar) which are wrapped separately.
So you should do something like:
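from loremipsum import get_paragraphs
import textwrap

paragraphs = get_paragraphs(3)
item = 'Abstract: '
paragraphs[0] = item + paragraphs[0]
rest_indent = ' ' * len(item)  # hanging indent, as wide as the label
for idx, paragraph in enumerate(paragraphs):
    # Only the first paragraph starts at column 0 (it carries the label).
    start_indent = '' if idx == 0 else rest_indent
    print(textwrap.fill(paragraph, initial_indent=start_indent, subsequent_indent=rest_indent, replace_whitespace=False))
    print('')  # blank line between paragraphs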
Alternatively, using a generator expression with str.join:
paragraphs = get_paragraphs(3)
item = 'Abstract: '
indent = ' ' * len(item)
text = '\n\n'.join(textwrap.fill(p, initial_indent=indent, subsequent_indent=indent) for p in paragraphs)
# Strip the first paragraph's indent and put the label in its place.
print(item + text.lstrip())
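If the text is already a single string with paragraphs separated by blank lines, as in the question, the same idea can be wrapped in a small helper. This is only a sketch: wrap_with_label is a hypothetical name, the sample text is made up, and it assumes paragraphs are separated by exactly "\n\n". It passes the label itself as initial_indent for the first paragraph, which textwrap accepts because initial_indent is just a string prepended to the first line.

```python
import textwrap


def wrap_with_label(text, label="Abstract: ", width=70):
    """Wrap blank-line-separated paragraphs with a hanging indent under label."""
    indent = " " * len(label)
    # Split on blank lines so that each paragraph is wrapped on its own.
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    wrapped = []
    for idx, paragraph in enumerate(paragraphs):
        # The first paragraph starts with the label, the others with spaces.
        first = label if idx == 0 else indent
        wrapped.append(
            textwrap.fill(paragraph, width=width,
                          initial_indent=first, subsequent_indent=indent)
        )
    # Re-join with a blank line to keep the paragraph breaks.
    return "\n\n".join(wrapped)


# Made-up sample text standing in for the loremipsum output.
sample = ("Lorem ipsum dolor sit amet. " * 6
          + "\n\n"
          + "Donec class integer pede. " * 4)
print(wrap_with_label(sample))
```

Because every paragraph goes through textwrap.fill separately, the default replace_whitespace=True can stay in place, and no newline ever ends up in the middle of a wrapped line.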