Split paragraphs with extra whitespace between them using Perl's paragraph mode


I'm trying to parse the output of pvdisplay(8), which prints a separate "paragraph" for each physical volume:

  --- Physical volume ---
  PV Name               /dev/sdb
  VG Name               vg_virtual_01
  PV Size               16.37 TiB / not usable 2.25 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              4291533
  Free PE               3830989
  Allocated PE          460544
  PV UUID               zqi1Q6-tIag-ghQy-MmdJ-kyOS-XkmY-HpyQ51

  --- Physical volume ---
  PV Name               /dev/sda
  VG Name               vg_virtual_02
  PV Size               16.37 TiB / not usable 2.25 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              4291533
  Free PE               1525709
  Allocated PE          2765824
  PV UUID               BlYvJW-ieVx-AjRg-p3r6-e4oC-TkPc-u2lf7x

Simple enough, right?

#!/usr/bin/perl

use strict;
use warnings;

my @pvs = do {
    local $/ = '';
    `pvdisplay`;
};

As it turns out, it's not as simple as I thought. After half an hour of bashing my head on the keyboard because my array only ever got a single item, I realized that the "blank" lines between paragraphs actually contain additional whitespace before the EOL. According to perlvar, setting $/ to the null string makes it treat empty lines as the record terminator, with the caveat that an empty line cannot contain any spaces or tabs. And of course, $/ is a string, not a regex, so I can't set it to something like /^\s+$/.
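To make the perlvar caveat concrete, here is a minimal sketch that reads two hypothetical strings through an in-memory filehandle in paragraph mode. The sample strings and the helper name `count_paragraphs` are assumptions made up for illustration, not real pvdisplay output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the records that paragraph mode finds in a string,
# reading it through an in-memory filehandle.
sub count_paragraphs {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = '';          # paragraph mode
    my @records = <$fh>;
    close $fh;
    return scalar @records;
}

# Hypothetical two-record samples; only the separator line differs.
my $truly_empty = "PV Name /dev/sdb\n\nPV Name /dev/sda\n";
my $with_spaces = "PV Name /dev/sdb\n  \nPV Name /dev/sda\n";

my $empty_count  = count_paragraphs($truly_empty);
my $spaces_count = count_paragraphs($with_spaces);

print "empty separator line:      $empty_count record(s)\n";
print "whitespace separator line: $spaces_count record(s)\n";
```

The first sample should come back as two records and the second as one, because a line holding only spaces does not count as empty in paragraph mode.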

The only way I can see around this is to first save the output to a file, strip the extraneous whitespace, and read it back in, but I would hate to create a temporary file for something so simple. Is there a more elegant way to do this?

Edit: I can do this using split(/^\s+$/m), but I'm curious whether there's a way to do it by changing $/.


Solution

  • Huh? Just split the output yourself:

    my @pvs = split /\n\s*\n/, scalar `pvdisplay`;
    

    No, there is no way to do this with $/ unless the exact separator is known character for character (you could try e.g. $/ = "\n  \n" if the separator line holds exactly two spaces). Don't make your job harder than it needs to be.
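As a rough sketch of both options, the snippet below runs the split and an exact-string $/ over the same sample. The sample stands in for the pvdisplay output, and the three-space separator line is an assumption; the real padding may differ, which is exactly why the $/ variant is fragile.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed sample standing in for `pvdisplay` output. The separator
# line holds three spaces here (an assumption for illustration).
my $output = join "\n",
    '  --- Physical volume ---',
    '  PV Name               /dev/sdb',
    '  VG Name               vg_virtual_01',
    '   ',
    '  --- Physical volume ---',
    '  PV Name               /dev/sda',
    '  VG Name               vg_virtual_02',
    '';

# Option 1: split on newline, optional whitespace, newline.
my @by_split = split /\n\s*\n/, $output;

# Option 2: set $/ to the exact separator, character for character.
my @by_rs = do {
    open my $fh, '<', \$output or die "Cannot open in-memory handle: $!";
    local $/ = "\n   \n";   # only works for exactly three spaces
    my @records = <$fh>;
    chomp @records;         # chomp strips the current $/ from each record
    @records;
};

printf "split found %d paragraph(s), \$/ found %d\n",
    scalar @by_split, scalar @by_rs;
```

On this sample both approaches should yield the same two paragraphs. If the separator line had any other amount of whitespace, only the split would still work.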