Tags: forms, perl, form-submit, lwp-useragent

Why does the server return the result for a different submit button than the one selected with Perl's HTML::Form and LWP::UserAgent?


I want to process a number of files with http://2struc.cryst.bbk.ac.uk/twostruc; to automate this I wrote a Perl script using HTML::Form and LWP::UserAgent.

This server has a two-step submit process: first, upload a file or enter an id; second, select the methods to be used and the output (by choosing one of five submit buttons).

The first step works, but in the second step I seem to be unable to choose any submit button other than the first, even though my script's output confirms that I selected the one I want (which is not the first).

The two core parts of the code are below, the request function:

sub create_submit_request
{
  my $form_arrayref = shift;
  my $form_action = shift;
  my $value_hashref = shift;
  my $submit_name = shift;
  my $submit_index = shift;

  my $found_form = 0;
  my $form;
  foreach my $this_form( @$form_arrayref)
  {
    printf( "# Found form with action=%s\n", $this_form->action);
    if( $this_form->action eq $form_action)
    {
      $found_form = 1;
      $form = $this_form;
    }
  }
  die( "# Error: No form with action $form_action") if( $found_form == 0);

  my @inputs = $form->inputs;
  my $inputs_string;
  foreach my $input( @inputs)
  {
    my $input_name = defined( $input->name) ? $input->name : "<unnamed_input>";
    my $input_value = defined( $input->value) ? $input->value : "";
    $inputs_string .= $input_name.( length( $input_value) > 0 ? "=".$input_value : "")." (".$input->type."); ";
  }
  printf( "# Available input names: %s\n", $inputs_string);

  printf( "# Filling in form data\n");
  while( my( $key, $value) = each( %$value_hashref))
  {
    $form->value( $key, $value);
  }

  my @submit_buttons = $form->find_input( $submit_name, "submit", $submit_index); # 1-based counting for the index
  die( "# Error: Can only handle a single submit, but found ".scalar( @submit_buttons)) if( scalar( @submit_buttons) != 1);
  my %submit_hash = %{ $submit_buttons[ 0]};

  # DEBUG
  printf( "# Use submit: %s\n", Data::Dumper->Dump( [ \%submit_hash ]));

  return $form->click( %submit_hash);
}

and the code using it:

my $request = HTTP::Request->new( GET => $url_server);
my $response = $useragent->request( $request);

# the first page contains the pdb id input and file upload inputs
my @forms = HTML::Form->parse( $response);
my %value_hash = ( "file" => $pdb_file);
# the submit buttons have no name, use undef; choose the first one (w/o javascript)
$request = create_submit_request( \@forms, $form_action1, \%value_hash, undef, 1);

printf( "# Submitting to server\n");
$response = $useragent->request( $request);

# the second page contains the method selection and the five submit buttons
@forms = HTML::Form->parse( $response);
%value_hash =( "dsspcont" => "on", "stride" => "on");
# this form has 5 submit buttons; select the 5th
$request = create_submit_request( \@forms, $form_action2, \%value_hash, undef, 5);

printf( "# Submitting to server\n");
$response = $useragent->request( $request);

my $response_content = $response->content;
printf( "# Response content: %s\n", $response_content);

Even though the script prints

# Use submit: $VAR1 = {
      'name' => 'function_sequenceStructureAlignment',
      'onclick' => 'this.form.target=\'_blank\';return true;',
      'type' => 'submit',
      'value' => 'Sequence Structure Alignments',
      'value_name' => ''
    };

which is the 5th submit button in the second step, the response is equivalent to pressing the first submit button.

To test the server itself, the file 1UBI.pdb can be downloaded from http://www.rcsb.org/pdb/files/1UBI.pdb and uploaded to the server. The full script is at http://pastebin.com/bSJLvNfc and can be run with

perl 2struc.pl --pdb 1UBI.pdb

Why is the server returning the output for a different submit button than the one I seem to select in the script?

(It does not seem to depend on cookies, because I can clear them after the first step and still get the correct result for the second step in a web browser.)


Solution

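  • You passed the hash %submit_hash to click, which is wrong: click expects a selector (see the HTML::Form documentation of find_input for how to specify one), not the flattened key/value pairs of the input object, so it does not pick the button you dumped. But because you have already found the correct submit element, you can simply call click directly on it: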

    --- orig.pl
    +++ fixed.pl
    @@ -87,7 +87,7 @@
       # DEBUG
       printf( "# Use submit: %s\n", Data::Dumper->Dump( [ \%submit_hash ]));
    
    -  return $form->click( %submit_hash);
    +  return $submit_buttons[0]->click($form);
     }
    
     sub predict_pdb
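
The fix above can be tried in isolation with a minimal sketch like the following. The HTML snippet, the example.invalid URLs and the button names function_first, function_second and function_third are hypothetical stand-ins for the server's second page, not its real markup. The sketch assumes HTML::Form is installed and only builds the request; it does not send anything.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical stand-in for the second page: one form with a checkbox
# and three named submit buttons (the real page has five).
my $html = <<'HTML';
<form action="http://example.invalid/run" method="post">
  <input type="checkbox" name="stride">
  <input type="submit" name="function_first"  value="First">
  <input type="submit" name="function_second" value="Second">
  <input type="submit" name="function_third"  value="Third">
</form>
HTML

# In scalar context parse() returns the first form of the document.
my $form = HTML::Form->parse($html, 'http://example.invalid/');

# Tick the checkbox the same way the question's script does.
$form->value(stride => 'on');

# Look up the third submit button regardless of its name (1-based index).
my $button = $form->find_input(undef, 'submit', 3)
    or die "no third submit button found";

# Click the input object itself; it needs the form to build the request.
my $request = $button->click($form);

# Alternative: let the form find the button by its name.
my $by_name = $form->click('function_third');

# Inspect which submit field ends up in the request body.
print $request->method, " ", $request->uri, "\n";
print $request->content, "\n";
```

Printing the request content before sending it is also a quick way to check which submit field is actually included, which would have exposed the problem in the original script.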