Awk: generating icalendar file. How to print some consecutive lines?


Thanks to Ed Morton's answer, I could do some testing with Thunderbird and icalendar validator. I edited my question adding entries without description and the expected result with precise requirements.

I'm writing a script to generate an iCalendar file from a planning text file. I'd like each event's description to be built from the lines that follow its date line. Say I've got this planning file:

lun 06 05 2019 08 15 09 00 F206
    A descritpion text.
ven 10 05 2019 11 00 11 45 G202
    Another description text
    - on multiple; 
    - lines.
lun 13 05 2019 08 15 09 00 F206
ven 17 05 2019 11 00 11 45 G202
    A long description with more than 75 characters.
    This happen often when multiple lines are
    joined in one. So the program shoud split every lines
    To 75 characters including the word description.
lun 20 05 2019 08 15 09 00 F206
    A description text.

I'm new to awk; my script currently looks like this:

#!/bin/bash
awk ' BEGIN { print "BEGIN:VCALENDAR\r\n\
... some entries here ...\r\n\
END:VTIMEZONE\r" ;}
$1~/^(lun|mar|mer|jeu|ven)$/ { print "BEGIN:VEVENT\r\n\
... some entries here ...\r\n\
DTSTART;TZID=Europe/Zurich:"$4""$3""$2"T"$5""$6"00\r\n\
DTEND;TZID=Europe/Zurich:"$4""$3""$2"T"$7""$8"00\r\n\
TRANSP:OPAQUE\r\n\
DESCRIPTION: >>>HERE I NEED THE DESCRIPTIVE LINES<<<< \r\n\
END:VEVENT\r"}
END { print "END:VCALENDAR\r" } ' < "$1" > "$1.ics"

expected result:

BEGIN:VCALENDAR
BEGIN:VEVENT
DTSTART;TZID=Europe/Zurich:20190506T081500
DTEND;TZID=Europe/Zurich:20190506T090000
DESCRIPTION:A descritpion text.
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Zurich:20190510T110000
DTEND;TZID=Europe/Zurich:20190510T114500
DESCRIPTION:Another description text\n- on multiple;\n- lines.
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Zurich:20190513T081500
DTEND;TZID=Europe/Zurich:20190513T090000
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Zurich:20190517T110000
DTEND;TZID=Europe/Zurich:20190517T114500
DESCRIPTION:A long description with more than 75 characters.\nThis happen
 often when multiple lines are\njoined in one. So the program shoud split 
 every lines\nTo 75 characters including the word description.
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Zurich:20190520T081500
DTEND;TZID=Europe/Zurich:20190520T090000
DESCRIPTION:A description text.
END:VEVENT
END:VCALENDAR

So the exact requirements are:

  1. Lines without description should not print DESCRIPTION:.
  2. Multi-line descriptions should be joined and separated with a literal \n. This already works with printf "%s%s", $0, "\\n".
  3. Lines should be split so that each is fewer than 75 characters long, ending with \r\n.
  4. The additional description lines should begin with a space.

Solution

    $ cat tst.awk
    BEGIN {
        ORS="\r\n"
    
        print "BEGIN:VCALENDAR"
        print "... some entries here ..."
        print "END:VTIMEZONE"
    }
    /^[^[:space:]]/ {
        prtEndVevent()
    
        print "BEGIN:VEVENT"
        print "... some entries here ..."
    
        date = $4 $3 $2
        begt = $5 $6 "00"
        endt = $7 $8 "00"
    
        print "DTSTART;TZID=Europe/Zurich:" date "T" begt
        print "DTEND;TZID=Europe/Zurich:"   date "T" endt
        next
    }
    {
        gsub(/^[[:space:]]+|[[:space:]]+$/,"")
        desc = (desc == "" ? "DESCRIPTION:" : desc RS) $0
    }
    END {
        prtEndVevent()
        print "END:VCALENDAR"
    }
    
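    # Print the pending description, folded across lines, and then close
    # the previous VEVENT if there was one.
    function prtEndVevent(       wid) {
        if ( desc != "" ) {
            wid = 74                # max chars per output line (fewer than 75)
            gsub(RS,"\\n",desc)     # joined input lines become a literal \n
            while ( desc !~ /^ ?$/ ) {
                print substr(desc,1,wid)
                desc = " " substr(desc,wid+1)   # continuation lines start with a blank
            }
            desc = ""
        }
        if ( endVevent != "" ) {
            print endVevent
        }
        endVevent = "END:VEVENT"    # stays empty until the first event has started
    }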
    

    $ awk -f tst.awk file
    BEGIN:VCALENDAR
    ... some entries here ...
    END:VTIMEZONE
    BEGIN:VEVENT
    ... some entries here ...
    DTSTART;TZID=Europe/Zurich:20190506T081500
    DTEND;TZID=Europe/Zurich:20190506T090000
    DESCRIPTION:A descritpion text.
    END:VEVENT
    BEGIN:VEVENT
    ... some entries here ...
    DTSTART;TZID=Europe/Zurich:20190510T110000
    DTEND;TZID=Europe/Zurich:20190510T114500
    DESCRIPTION:Another description text\n- on multiple;\n- lines.
    END:VEVENT
    BEGIN:VEVENT
    ... some entries here ...
    DTSTART;TZID=Europe/Zurich:20190513T081500
    DTEND;TZID=Europe/Zurich:20190513T090000
    END:VEVENT
    BEGIN:VEVENT
    ... some entries here ...
    DTSTART;TZID=Europe/Zurich:20190517T110000
    DTEND;TZID=Europe/Zurich:20190517T114500
    DESCRIPTION:A long description with more than 75 characters.\nThis happen
     often when multiple lines are\njoined in one. So the program shoud split
     every lines\nTo 75 characters including the word description.
    END:VEVENT
    BEGIN:VEVENT
    ... some entries here ...
    DTSTART;TZID=Europe/Zurich:20190520T081500
    DTEND;TZID=Europe/Zurich:20190520T090000
    DESCRIPTION:A description text.
    END:VEVENT
    END:VCALENDAR
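
One way to double-check the line-length requirement on a generated file is a small awk filter like the one below. This is only a sketch and not part of the answer: check_ics_width is a hypothetical helper name, the limit of 75 is the one RFC 5545 recommends for content lines (excluding the line break), and it counts characters, which only equals octets for ASCII text.

```shell
# check_ics_width [FILE...]: hypothetical helper, not part of the answer.
# Reports every line that is longer than 75 characters once the trailing
# CR has been removed, and returns non-zero if any such line exists.
check_ics_width() {
    awk '
    { sub(/\r$/, "") }                    # ignore the CR of the CRLF ending
    length($0) > 75 {
        printf "line %d: %d characters\n", NR, length($0)
        bad = 1
    }
    END { exit bad + 0 }
    ' "$@"
}
```

For example, `awk -f tst.awk file > file.ics && check_ics_width file.ics` should print nothing and succeed if every line fits.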
    

    Note that this wraps at character positions, not word boundaries, so a word that crosses the wrap position will be split. If that's not desirable, you can update prtEndVevent() to print one word at a time, checking whether the total length of the words and blanks printed so far plus the next word would still fit within the limit (and decide how to handle a description string that is 75+ chars long with no spaces!), or call the UNIX command fold to do the wrapping for you.
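
A minimal sketch of the word-boundary idea, under a few assumptions that are not in the answer: fold_words is a hypothetical shell wrapper, it reads already-joined content lines on stdin, the limit is counted in characters (default 75), and a stretch with no blank in it is simply hard-split. The blank it breaks at is kept at the end of the line, so unfolding (dropping each line break plus the single leading blank) gives back the original text. It prints plain newlines; inside tst.awk the ORS="\r\n" setting would take care of the line endings.

```shell
# fold_words [MAX]: hypothetical helper, not part of the original answer.
# Reads unfolded content lines on stdin and folds each one at a blank so
# that no output line is longer than MAX characters (default 75).
fold_words() {
    awk -v max="${1:-75}" '
    function fold(s,    pre, lim, cut, i) {
        pre = ""                          # becomes a single blank on continuation lines
        while (length(pre s) > max) {
            lim = max - length(pre)       # characters of s that still fit
            if (lim < 1) lim = 1          # always make progress
            cut = lim                     # fallback: hard split when no blank is found
            for (i = lim; i > 0; i--)
                if (substr(s, i, 1) == " ") { cut = i; break }
            print pre substr(s, 1, cut)   # the blank stays at the end of the line
            s = substr(s, cut + 1)
            pre = " "
        }
        if (s != "") print pre s
    }
    { fold($0) }
    '
}
```

For example, `printf 'DESCRIPTION:one two three four five\n' | fold_words 20` prints two lines: `DESCRIPTION:one two ` (with the trailing blank) and ` three four five`.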

    If you're ever considering using getline make sure to read and completely understand http://awk.freeshell.org/AllAboutGetline first.