Splitting a string containing newlines gives unexpected results


In Perl, if I split a string containing newlines, like this:

@fields = split /\n/, "\n\n\n";

@fields is empty.

More examples:

@fields = split /\n/, "A\nB\nC";   # 3 items ("A","B","C") as expected
@fields = split /\n/, "A\n\nC";    # 3 items ("A","","C")  as expected
@fields = split /\n/, "A\nB\n";    # 2 items ("A","B" )    NOT as expected
@fields = split /\n/, "\nB\nC";    # 3 items ("","B","C")  as expected
@fields = split /\n/, "A\n\n";     # 1 item  ("A")         NOT as expected

It's acting like trailing newlines are ignored.

If my string contains 3 newlines, and I'm splitting on the newline character, why doesn't my array always contain 3 items?

Is there a way to make split keep the trailing empty fields?

btw, I get the same results on all versions of Perl that I have (5.34.0, 5.30.3, 5.22.1 and 5.18.2)


Solution

  • It is documented in perlfunc, under split, that trailing empty fields are omitted from the returned list if the LIMIT parameter is omitted:

    If LIMIT is omitted (or, equivalently, zero), then it is usually treated as if it were instead negative but with the exception that trailing empty fields are stripped (empty leading fields are always preserved); if all fields are empty, then all fields are considered to be trailing (and are thus stripped in this case).

    To keep them, pass a nonzero LIMIT. Use a negative value such as -1 if you don't want to limit the number of returned fields.

    Try:

    @fields = split /\n/, "A\nB\n", -1;
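
As a rough check of the above, here is a minimal sketch that reruns some of the question's inputs with a LIMIT of -1. The helper name split_keep_trailing is made up for illustration and is not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical helper (not a built-in): split on newlines and keep
# trailing empty fields by passing a negative LIMIT.
sub split_keep_trailing {
    my ($text) = @_;
    return split /\n/, $text, -1;
}

for my $text ("A\nB\nC", "A\nB\n", "A\n\n", "\n\n\n") {
    my @fields = split_keep_trailing($text);

    # Copy the input and make its newlines visible for printing.
    (my $shown = $text) =~ s/\n/\\n/g;

    printf "%-12s => %d fields: (%s)\n",
        qq{"$shown"}, scalar @fields, join(",", map { qq{"$_"} } @fields);
}

# Field counts with LIMIT -1:
#   "A\nB\nC"  => 3  ("A","B","C")
#   "A\nB\n"   => 3  ("A","B","")
#   "A\n\n"    => 3  ("A","","")
#   "\n\n\n"   => 4  ("","","","")
```

Note that N separators delimit N+1 fields, so the string of three newlines yields four empty strings, not three. If you want exactly one item per newline-terminated line, you would still need to drop the final empty field yourself.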