
Prove returns inconsistent test results with Test::More and .tap extension


Attempting to make a basic test using Test::More, on a file named test.tap:

use Test::More tests => 2;

is( 1, 1 );
is( 2, 2 );

Running prove against this test causes a failure:

$ prove test.tap
test.tap .. No subtests run

Test Summary Report
-------------------
test.tap (Wstat: 0 Tests: 0 Failed: 0)
  Parse errors: No plan found in TAP output
Files=1, Tests=0,  0 wallclock secs ( 0.02 usr +  0.00 sys =  0.02 CPU)

But Perl gives a seemingly valid TAP output:

$ perl test.tap
1..2
ok 1
ok 2

The prove version is:

$ prove --version
TAP::Harness v3.35 and Perl v5.22.1

Additionally, I found that adding a shebang (#!) to the test file causes the test to pass intermittently:

#!/usr/bin/perl

use Test::More tests => 2;

is( 1, 1 );
is( 2, 2 );

On success (passes ~1 time in 4):

t/test.tap .. ok
All tests successful.
Files=1, Tests=2,  0 wallclock secs ( 0.03 usr  0.00 sys +  0.01 cusr  0.00 csys =  0.04 CPU)
Result: PASS

I have also found that renaming the file to test.t causes the test to pass every time.

While checking whether this was a bug in an old version, I reproduced the issue on a fresh DigitalOcean Droplet running Ubuntu 16.04.2, and on a Debian 8 host with TAP::Harness v3.36_01 and Perl v5.24.1.

I'm hoping to avoid "Rename all files to .t extension" being the answer. I am uncertain what TAP::Harness considers to be the difference between those two extensions, and I cannot find any documentation of that distinction or where in the source code it is made.

Any clarification on what is happening is greatly appreciated.


Solution

  • The .tap extension tells prove that test.tap is a text file which contains TAP. It does not execute the file as a Perl program; it just reads the file and tries to parse its contents as TAP. You can see this with prove -v.

    $ prove -v test.tap
    test.tap .. 
    use Test::More tests => 2;
    is(1,1);
    is(2,2);
    
    No subtests run 
    
    Test Summary Report
    -------------------
    test.tap (Wstat: 0 Tests: 0 Failed: 0)
      Parse errors: No plan found in TAP output
    Files=1, Tests=0,  0 wallclock secs ( 0.02 usr +  0.00 sys =  0.02 CPU)
    Result: FAIL
    

    Instead, the convention for test programs to execute is test.t.

    $ mv test.tap test.t
    $ prove -v test.t
    test.t .. 
    1..2
    ok 1
    ok 2
    ok
    All tests successful.
    Files=1, Tests=2,  0 wallclock secs ( 0.03 usr  0.00 sys +  0.04 cusr  0.00 csys =  0.07 CPU)
    Result: PASS
    

    See TAP::Parser::SourceHandler::File for more.
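
    If renaming is not an option, one possible workaround is to tell the harness explicitly which interpreter to run each file with, so the file is executed instead of being read as TAP text. The sketch below uses the exec option of TAP::Harness from a small driver script. It assumes that supplying exec bypasses the per-file handler selection described here, which I have not checked against every TAP::Harness version; the temporary file stands in for test.tap.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);
    use TAP::Harness;

    # Write the test program from the question to a throwaway file that
    # keeps the .tap extension (the generated name itself is not meaningful).
    my ( $fh, $file ) = tempfile( SUFFIX => '.tap', UNLINK => 1 );
    print {$fh} "use Test::More tests => 2;\n", "is( 1, 1 );\n", "is( 2, 2 );\n";
    close $fh or die "close failed: $!";

    # exec => [...] asks the harness to run every test through this command;
    # $^X is the perl binary that is running this driver script.
    # verbosity => -3 silences the harness's own report.
    my $harness   = TAP::Harness->new( { exec => [$^X], verbosity => -3 } );
    my $aggregate = $harness->runtests($file);

    printf "%s: %d tests, %s\n", $file, scalar $aggregate->total,
      $aggregate->all_passed ? 'PASS' : 'FAIL';
    ```

    The command-line counterpart would be prove's --exec option; treat that as a pointer to try, not as a verified fix for this setup.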


    Additionally, I found that adding a shebang to the test file causes test results to pass intermittently:

    What's happening is that the various TAP::Parser::SourceHandler plugins all vote on what kind of source the file is, and the vote is a tie. TAP::Parser::SourceHandler::Perl sees a shebang a la "#!...perl" and votes 0.9. TAP::Parser::SourceHandler::File sees a .tap extension and votes 0.9. You can see this by setting the TAP_HARNESS_SOURCE_FACTORY_VOTES environment variable.

    $ TAP_HARNESS_SOURCE_FACTORY_VOTES=1  prove test.tap
    votes: TAP::Parser::SourceHandler::File: 0.9, TAP::Parser::SourceHandler::Perl: 0.9
    test.tap .. ok   
    All tests successful.
    Files=1, Tests=2,  0 wallclock secs ( 0.03 usr  0.00 sys +  0.05 cusr  0.00 csys =  0.08 CPU)
    Result: PASS
    
    $ TAP_HARNESS_SOURCE_FACTORY_VOTES=1  prove test.tap
    votes: TAP::Parser::SourceHandler::Perl: 0.9, TAP::Parser::SourceHandler::File: 0.9
    test.tap .. No subtests run 
    
    Test Summary Report
    -------------------
    test.tap (Wstat: 0 Tests: 0 Failed: 0)
      Parse errors: No plan found in TAP output
    Files=1, Tests=0,  0 wallclock secs ( 0.02 usr +  0.01 sys =  0.03 CPU)
    Result: FAIL
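
    To make the voting easier to follow, here is a simplified model of the two handlers, not the real TAP::Parser::SourceHandler code. The 0.9 votes are the ones quoted above; the 0.8 vote for .t files and the 0.25 fallback are assumptions added for illustration, and file_vote and perl_vote are hypothetical names.

    ```perl
    use strict;
    use warnings;

    # Simplified stand-ins for the handlers' can_handle() methods; NOT the real code.
    sub file_vote {
        my ($name) = @_;
        return $name =~ /\.tap\z/i ? 0.9 : 0;    # 0.9 is the vote quoted above
    }

    sub perl_vote {
        my ( $name, $first_line ) = @_;
        return 0.9 if $first_line =~ /^#!.*\bperl/;    # vote quoted above
        return 0.8 if $name =~ /\.t\z/i;               # assumed value
        return 0.25;                                   # assumed fallback value
    }

    my @cases = (
        [ 'test.tap', 'use Test::More tests => 2;' ],    # question, no shebang
        [ 'test.tap', '#!/usr/bin/perl' ],               # question, with shebang
        [ 'test.t',   'use Test::More tests => 2;' ],    # renamed file
    );

    for my $case (@cases) {
        my ( $name, $first_line ) = @$case;
        my $file = file_vote($name);
        my $perl = perl_vote( $name, $first_line );
        my $verdict
          = $file > $perl ? 'parsed as TAP text'
          : $perl > $file ? 'run as Perl'
          :                 'tie';
        printf "%-8s %-28s File=%s Perl=%s => %s\n",
          $name, $first_line, $file, $perl, $verdict;
    }
    ```

    Under these assumptions the three cases line up with the behaviour in the question: without a shebang the File handler wins every time, with a shebang the two handlers tie, and with a .t extension only the Perl handler votes.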
    

    The handlers are then sorted by votes. Perl's sort is not guaranteed to be stable (that is, it may not preserve the order of entries that compare equal), so either handler can come out on top. Even with a stable sort, the input comes from keys %handlers, which returns the keys in a different order in each process. Here's the code for that.

    A test harness that is not deterministic in its decisions is bad. It should probably throw an error instead. I also note that it sorts the votes with a string comparison, which is probably wrong.

    I've submitted a patch to make ties an error.
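
    As a rough illustration of treating ties as errors, here is a minimal sketch. It is not the submitted patch and not the TAP::Harness source; pick_handler is a hypothetical helper. It compares votes numerically, breaks the sort order by handler name so the ranking is repeatable, and dies when the top vote is shared.

    ```perl
    use strict;
    use warnings;

    # Pick the highest-voted handler, refusing to guess when the top vote is shared.
    sub pick_handler {
        my (%votes) = @_;

        # Numeric (<=>) descending by vote, then by name so the order is repeatable
        # regardless of the order in which keys %votes returns the handlers.
        my @ranked = sort { $votes{$b} <=> $votes{$a} or $a cmp $b } keys %votes;
        die "no handler voted\n" unless @ranked;

        my @top = grep { $votes{$_} == $votes{ $ranked[0] } } @ranked;
        die "tie between handlers: @top\n" if @top > 1;

        return $ranked[0];
    }

    # Why string comparison is fragile for numbers: as strings, "10" sorts before "9".
    print "cmp puts 10 before 9\n" if ( 10 cmp 9 ) < 0;

    print pick_handler( File => 0.9, Perl => 0.25 ), "\n";    # clear winner: File

    # The situation from the question: both handlers vote 0.9.
    my $winner = eval { pick_handler( File => 0.9, Perl => 0.9 ) };
    print "error: $@" unless defined $winner;
    ```

    Failing loudly on a tie turns an intermittent pass or fail into a consistent, explainable error.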