Tags: linux, awk, sed, newline, digits

How do I replace newlines only between numbers and a sentence?


I want to remove newlines in some special cases. I have this text:

0 
15.239 
23.917 
 Reprenem el debat que avui els oferim entorn de les perspectives d'aquest dos mil set. <ehh> Estavem parlant concretament dels temes 
30.027 
de la seguretat mundial 
 una miqueta 
de la intervencio
33.519 
que 

I want to replace the newline between a number and the text that follows it, like so:

0 
15.239 
23.917 Reprenem el debat que avui els oferim entorn de les perspectives d'aquest dos mil set. <ehh> Estavem parlant concretament dels temes 
30.027 de la seguretat mundial una miqueta de la intervencio
33.519 que

I only want to remove the newlines between a number and the sentence that follows it.

Can anyone help me?


Solution

  • An awk solution:

    awk '/^[0-9]+\.[0-9]+/{printf "\n"}{printf "%s", $0}' filename
    

    To handle DOS (CRLF) line breaks as well:

    awk '{sub(/\r$/,"")}/^[0-9]+\.[0-9]+/{printf "\n"}{printf "%s", $0}' filename
    

    Demo:

    $ awk '{sub(/\r$/,"")}/^[0-9]+\.[0-9]+/{printf "\n"}{printf "%s", $0}' filename
    
    0 
    15.239 
    23.917  Reprenem el debat que avui els oferim entorn de les perspectives d'aquest dos mil set. <ehh> Estavem parlant concretament dels temes 
    30.027 de la seguretat mundial  una miqueta de la intervencio
    33.519 que
    

    Explained code:

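    • {sub(/\r$/,"")} : Strips the trailing carriage return that DOS line endings leave on each record.

    • /^[0-9]+\.[0-9]+/{printf "\n"} : When the record starts with a decimal number such as 23.917, print a newline first, then continue processing the same record. The leading 0 has no decimal point and does not match, which is harmless here because it is the first line.

    • The final printf block : Runs for every record, whether or not it starts with a number, and prints $0 without a trailing line break, so consecutive text lines end up joined on one line.

    • In short, a newline is emitted only just before each decimal number, and everything up to the next number stays on the same line. That does the trick. Note that the output does not end with a newline.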
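The explanation above can be turned into a slightly more defensive sketch. This is not the original answer's code but a variant under a few stated assumptions: each record is trimmed so the joined text uses single spaces (as in the output the question asks for), empty lines are skipped, a line counts as a number only if it holds nothing but a decimal number, and the output is finished with a newline. The function name join_timestamped is hypothetical, and the sample sentence is shortened.

```shell
#!/bin/sh
# join_timestamped is a hypothetical helper name, not part of the original answer.
# It reads the named files, or standard input when no file is given.
join_timestamped() {
  awk '
    {
      sub(/\r$/, "")        # drop the CR left by DOS line endings
      sub(/^[ \t]+/, "")    # trim leading blanks
      sub(/[ \t]+$/, "")    # trim trailing blanks
    }
    $0 == "" { next }       # assumption: empty lines carry no content
    /^[0-9]+\.[0-9]+$/ {    # a line holding only a decimal number
      if (started) printf "\n"
      printf "%s", $0       # "%s" keeps any % in the data literal
      started = 1
      next
    }
    {                       # any other line is appended to the current output line
      printf "%s%s", (started ? " " : ""), $0
      started = 1
    }
    END { if (started) printf "\n" }   # finish the last line
  ' "$@"
}

# Shortened version of the sample from the question, including stray blanks.
printf '%s\n' '0 ' '15.239 ' '23.917 ' ' Reprenem el debat' '30.027 ' \
  'de la seguretat mundial ' ' una miqueta ' 'de la intervencio' '33.519 ' 'que ' |
  join_timestamped
# Prints:
# 0
# 15.239
# 23.917 Reprenem el debat
# 30.027 de la seguretat mundial una miqueta de la intervencio
# 33.519 que
```

Anchoring the pattern with $ means a sentence that merely begins with a decimal number (for example "3.5 milions") is still treated as text. If that stricter rule is not wanted, dropping the $ restores the matching used in the one-liner.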