I have a text file with the following content:
+----------------------------------------------------------------+
| This is a section |
+----------------------------------------------------------------+
#################### This is a subsection ####################
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
################# This is another subsection #################
I'd like no line to exceed a certain number of characters (66 in this case), so newlines can be inserted where needed; the text should also be justified on both sides, so extra spaces can be added where needed as well. Finally, short lines should not be merged, and lines that contain exactly the desired number of characters should not be modified, as shown below.
+----------------------------------------------------------------+
| This is a section |
+----------------------------------------------------------------+
#################### This is a subsection ####################
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor
in reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident,
sunt in culpa qui officia deserunt mollit anim id est laborum.
################# This is another subsection #################
Unfortunately, fmt cannot justify:
fmt --width=67 in
+----------------------------------------------------------------+
| This is a section |
+----------------------------------------------------------------+
#################### This is a subsection ####################
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor
in reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident,
sunt in culpa qui officia deserunt mollit anim id est laborum.
################# This is another subsection #################
and par gives an error (at least on a recent Ubuntu) when it tries to process that file:
par 66j < in
par error:
Cannot justify.
I also tried fold:
fold -w 66 in
but it breaks words just to reach the line limit, and with the -s option its behaviour is similar to that of fmt (on an older openSUSE it also deletes empty lines).
It seems Vim cannot justify a line that is longer than the specified textwidth (see below), but if I first wrap the lines at spaces (the fmt or fold approach above), save the output, open it in Vim and use the following commands
:runtime macros/justify.vim
:% call Justify(66,3) " 3 is the maximum allowed number of space characters to add
+----------------------------------------------------------------+
| This is a section |
+----------------------------------------------------------------+
#################### This is a subsection ####################
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor
in reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident,
sunt in culpa qui officia deserunt mollit anim id est laborum.
################# This is another subsection #################
I can obtain "almost" the desired result (spaces are added inside the "subsections"). But the worst downside is that a direct interaction is required, while I need a batch approach since the whole procedure needs to be automated.
In short, if there is any solution, I'd strongly prefer standard Unix text tools (maybe piped through each other) or calling Vim macros in "batch mode" (if possible) rather than custom scripts. I'm aware that a Perl program called paradj (not tried yet) has already been suggested in the past, but I'd like to know if standard tools can do it on their own.
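For reference, the three rules above (wrap at 66 columns, pad the gaps of wrapped lines, leave short or exact-width lines alone) can be written down as a short POSIX awk sketch. awk is a standard tool, but this is still a custom script rather than an off-the-shelf formatter, and the function name justify is made up for illustration. It assumes single-byte characters, no tabs, and no leading indentation on long lines; the last line of each wrapped paragraph is left ragged.

```shell
#!/bin/sh
# justify [WIDTH]: hypothetical helper, reads stdin and writes stdout.
# Lines no longer than WIDTH pass through untouched; longer lines are
# wrapped at word boundaries and every wrapped line except the last
# is padded to exactly WIDTH columns.
justify() {
  awk -v width="${1:-66}" '
  # Print n words as one line of exactly `width` columns.
  # len is the length of the line when joined with single spaces.
  function emit(words, n, len,    gaps, extra, i, j, out, pad) {
    gaps = n - 1
    if (gaps < 1) { print words[1]; return }   # a lone word cannot be padded
    extra = width - len                        # spaces still to distribute
    out = words[1]
    for (i = 1; i <= gaps; i++) {
      # every gap gets an equal share; the leftmost gaps absorb the remainder
      pad = 1 + int(extra / gaps) + (i <= extra % gaps ? 1 : 0)
      for (j = 0; j < pad; j++) out = out " "
      out = out words[i + 1]
    }
    print out
  }

  # Short and exact-width lines are printed as they are.
  length($0) <= width { print; next }

  {
    n = 0; len = 0
    for (f = 1; f <= NF; f++) {
      wlen = length($f)
      if (n > 0 && len + 1 + wlen > width) {   # next word does not fit
        emit(line, n, len)
        n = 0; len = 0
      }
      line[++n] = $f
      len += (n > 1 ? 1 : 0) + wlen
    }
    # Last line of the wrapped paragraph: single spaces, no padding.
    out = line[1]
    for (i = 2; i <= n; i++) out = out " " line[i]
    print out
  }'
}
```

With the function defined, `justify 66 < in > out` would process the whole file. Padding the leftmost gaps first is an arbitrary choice made for brevity; par and justify.vim distribute the extra spaces differently.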
EDIT 1
(thanks to Matthew Strawbridge) If I remove the first line with +- ... -+ then par is able to process the file and returns
| This is a section |
+----------------------------------------------------------------+
#################### This is a subsection ####################
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident,
sunt in culpa qui officia deserunt mollit anim id est laborum.
################# This is another subsection #################
It seems to me that par could be a very good tool to solve the problem, which now becomes:
how to get par to ignore the +- ... -+ patterns (by the way, why was the first one an obstacle and the second one not?);
how to get par not to edit the spaces inside "sections" and "subsections". This might translate into "don't touch the lines with exactly the required number of characters in which the last character is not a space" (let's assume I don't use tabs).
(Please note that in general this file could be longer and the "section" and "subsection" patterns could be repeated several times.)
Many thanks to everybody and sorry for the excessive length.
EDIT 2
(thanks to glts) I have tested your suggestions, and both the interactive and the batch approach work well; the only issue with the latter is that a minimal interaction with Vim is still required.
After googling a bit, I found some syntax examples to solve this last task as well.
vim -E -s in <<-EOF
:set textwidth=66
:g/^\a/normal! gqq
:runtime macros/justify.vim
:g/^\a/Justify 66 3
:update
:quit
EOF
or
vim -es -c 'set textwidth=66' -c 'g/^\a/normal! gqq' -c 'runtime macros/justify.vim' -c 'g/^\a/Justify 66 3' -c wq in
At this point, I consider my "problem" solved, but anybody willing to continue with the alternative par approach is welcome!
Thanks again to everybody, and thanks to glts also for the Vim "lesson".
You can do a lot of this in Vim.
For example, here's an interactive approach that will do what you ask for.
Set 'textwidth' to 66 and format your lines into paragraphs with the gq operator.
:set textwidth=66
:g/^\a/normal! gqq
Source macros/justify.vim and justify your paragraphs.
:runtime macros/justify.vim
:g/^\a/Justify 66 3
How well this works depends on how consistent your existing format is. I've identified paragraphs as lines starting with \a, i.e. alphabetic characters (see :h /\a).
In order to make this procedure part of a batch process you could save these commands in a Vim script file called, for example, myformat.vim. You could then repeatedly :source it on a number of text files provided as command-line arguments to Vim.
$ ls
a.txt b.txt c.txt myformat.vim
$ vim *.txt
:argdo source myformat.vim
This is one of those situations where the :argdo command shines.
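If no interactive session is wanted at all, the same steps can be scripted. The following is a hedged sketch rather than a tested recipe: it writes the four commands from above into myformat.vim and then runs Vim in silent Ex mode once per file, reusing the `vim -es ... -c wq` invocation from EDIT 2 of the question. The function name format_all is hypothetical, and the sketch assumes a Vim installation that ships macros/justify.vim.

```shell
#!/bin/sh
# Write the formatting commands to a Vim script file.
# The quoted 'EOF' stops the shell from interpreting the backslash in \a.
cat > myformat.vim <<'EOF'
set textwidth=66
g/^\a/normal! gqq
runtime macros/justify.vim
g/^\a/Justify 66 3
EOF

# format_all FILE...: hypothetical helper that rewrites each file in place.
format_all() {
  for f in "$@"; do
    # -es runs Vim in silent Ex mode; stdin is redirected so that Vim
    # never blocks waiting for Ex commands typed on the terminal.
    vim -es -c 'source myformat.vim' -c 'wq' "$f" </dev/null
  done
}
```

Calling `format_all a.txt b.txt c.txt` would then format the three files from the listing above. Since the files are overwritten, it may be worth trying this on copies first.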