I am trying to take one set of data and, for each value in it, subtract every value in another set of data.
For example:
Data set one (1, 2, 3)
Data set two (1, 2, 3, 4, 5)
So I should get something like (1 - (1 .. 5))
then (2 - (1..5))
and so on.
I currently have:
#!/usr/bin/perl
use strict;
use warnings;
my $inputfile = $ARGV[0];
open( INPUTFILE, "<", $inputfile ) or die $!;
my @array = <INPUTFILE>;
my $protein = 'PROT';
my $chain = 'P';
my $protein_coords;
for ( my $line = 0; $line <= $#array; ++$line ) {
if ( $array[$line] =~ m/\s+$protein\s+/ ) {
chomp $array[$line];
my @splitline = ( split /\s+/, $array[$line] );
my %coordinates = (
x => $splitline[5],
y => $splitline[6],
z => $splitline[7],
);
push @{ $protein_coords->[0] }, \%coordinates;
}
}
print "$protein_coords->[0]->[0]->{'z'} \n";
my $lipid1 = 'MEM1';
my $lipid2 = 'MEM2';
my $lipid_coords;
for ( my $line = 0; $line <= $#array; ++$line ) {
if ( $array[$line] =~ m/\s+$lipid1\s+/ || $array[$line] =~ m/\s+$lipid2\s+/ ) {
chomp $array[$line];
my @splitline = ( split /\s+/, $array[$line] );
my %coordinates = (
x => $splitline[5],
y => $splitline[6],
z => $splitline[7],
);
push @{ $lipid_coords->[1] }, \%coordinates;
}
}
print "$lipid_coords->[1]->[0]->{'z'} \n";
I am trying to take every value in $protein_coords->[0]->[$ticker]->{'z'} minus each value in $lipid_coords->[1]->[$ticker]->{'z'}.
My overall objective is to find (z2-z1)^2 in the equation d = sqrt((x2-x1)^2+(y2-y1)^2+(z2-z1)^2). I think that if I can do this once then I can do it for X and Y also. Technically I am trying to find the distance between every protein atom in a PDB file and every lipid atom in the same PDB file, and print the ResID for each distance less than 5 Å.
The easiest way to do this is to do your calculations while you're going through the lipid lines in your second loop:
for (my $line = 0; $line <= $#array; ++$line) {
    if ($array[$line] =~ m/\s+$lipid1\s+/ || $array[$line] =~ m/\s+$lipid2\s+/) {
        chomp $array[$line];
        my @splitline = (split /\s+/, $array[$line]);
        my %coordinates = (x => $splitline[5],
                           y => $splitline[6],
                           z => $splitline[7],
                          );
        push @{$lipid_coords->[1]}, \%coordinates;
        # go through each set of protein coordinates in your array...
        for my $p (@{$protein_coords->[0]}) {
            # $p is a hash reference, so read its z value directly;
            # you can store this value however you want...
            my $difference = $p->{z} - $coordinates{z};
        }
    }
}
If I were you, I would use some form of unique identifier to allow me to access the data on each combination, e.g. build a hash of the form $difference->{<protein_id>}{<lipid_id>} = <difference>.
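As a rough sketch of that idea, the snippet below keys each result by a protein ID and a lipid ID and keeps only the pairs closer than the cutoff. The IDs (P1, L1, ...), the sample coordinates, and the helper names distance and close_pairs are all made up for illustration; in your script you would fill the two hashes from the parsed PDB lines, using whatever column holds the ResID as the key.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: the IDs and coordinates are invented for this sketch.
my %protein = (
    P1 => { x => 0,  y => 0, z => 0 },
    P2 => { x => 10, y => 0, z => 0 },
);
my %lipid = (
    L1 => { x => 3,  y => 4, z => 0 },
    L2 => { x => 10, y => 0, z => 2 },
);

my $cutoff = 5;    # distance threshold in angstroms, taken from the question

# Euclidean distance between two coordinate hash references.
sub distance {
    my ($p, $q) = @_;
    return sqrt( ($q->{x} - $p->{x})**2
               + ($q->{y} - $p->{y})**2
               + ($q->{z} - $p->{z})**2 );
}

# Returns a hash reference of the form $close->{protein_id}{lipid_id} = distance,
# containing only the pairs whose distance is below the cutoff.
sub close_pairs {
    my ($protein, $lipid, $cutoff) = @_;
    my %close;
    for my $pid (sort keys %$protein) {
        for my $lid (sort keys %$lipid) {
            my $d = distance($protein->{$pid}, $lipid->{$lid});
            $close{$pid}{$lid} = $d if $d < $cutoff;
        }
    }
    return \%close;
}

# Print every protein/lipid pair that falls under the cutoff.
my $close = close_pairs(\%protein, \%lipid, $cutoff);
for my $pid (sort keys %$close) {
    for my $lid (sort keys %{ $close->{$pid} }) {
        printf "%s %s %.2f\n", $pid, $lid, $close->{$pid}{$lid};
    }
}
```

Keeping the pair lookup in a nested hash means you can later ask for a single combination, such as $close->{P2}{L2}, without rescanning the file. Note that this compares every protein atom with every lipid atom, so the work grows with the product of the two counts.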