I have to write a program that takes a file of DNA sequences and a DNA subsequence as command-line arguments, finds each occurrence of the subsequence, and counts how many times it occurs. I'm having trouble with the two strcmp calls (the one in the outer if and the one in the inner loop). Using GDB, I found that the way I have it now compares the addresses of the strings and not the actual strings. But if I remove the & I get an error. I'm not sure what the correct way to go about this is. TIA
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
    // place subsequence in string
    char *subsequence = argv[2];
    // get length of subsequence
    int seqLength = strlen(subsequence);
    // define file type and open for reading
    FILE *inputFile = fopen(argv[1], "r");
    // get each line using while loop
    char inputLine[200]; // string variable to store each line
    int i, j, lineLength, counter = 0, flag = -1;
    while (fgets(inputLine, 200, inputFile) != NULL) { // loop through each line
        lineLength = strlen(inputLine);
        for (i = 0; i < lineLength; i++) { // loop through each char in the line
            if (strcmp(&inputLine[i], &subsequence[0]) == 0) {
                // if current char matches beginning of sequence loop through
                // each of the remaining chars and check them against
                // corresponding chars in the sequence
                flag = 0;
                for (j = i + 1; j - i < seqLength; j++) {
                    if (strcmp(&inputLine[j], &subsequence[j - i]) != 0) {
                        flag = 1;
                        break;
                    }
                }
                if (flag == 0) {
                    counter++;
                }
            }
        }
    }
    fclose(inputFile);
    printf("%s appears %d time(s)\n", subsequence, counter);
    return 0;
}
dna.txt:
GGAAGTAGCAGGCCGCATGCTTGGAGGTAAAGTTCATGGTTCCCTGGCCC
input:
./dnaSearch dna.txt GTA
expected output:
GTA appears 2 times
Compare the characters directly, like this:
if (inputLine[i] == subsequence[0]) {
if (inputLine[j] != subsequence[j - i]) {
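You do not need library functions to compare single characters. strcmp compares two entire null-terminated strings, so strcmp(&inputLine[i], &subsequence[0]) returns 0 only when everything from position i to the end of the line is equal to the whole subsequence. Removing the & fails because strcmp expects pointers (const char *), while inputLine[i] is a single char.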
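For reference, here is a minimal sketch of the per-line search using plain character comparison. The helper name count_occurrences is hypothetical (it is not in the original program), and the sketch assumes that overlapping matches should be counted, which the question does not specify.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the original program: counts how many
   times `needle` occurs in `line`, comparing one character at a time.
   Assumption: overlapping matches are counted ("AA" occurs 3 times in "AAAA"). */
static int count_occurrences(const char *line, const char *needle) {
    size_t lineLength = strlen(line);
    size_t seqLength = strlen(needle);
    int counter = 0;

    /* An empty needle, or one longer than the line, cannot match. */
    if (seqLength == 0 || seqLength > lineLength) {
        return 0;
    }

    /* Only start positions that leave room for the whole needle are tried. */
    for (size_t i = 0; i + seqLength <= lineLength; i++) {
        size_t j = 0;
        /* Advance while the characters agree; == compares single chars. */
        while (j < seqLength && line[i + j] == needle[j]) {
            j++;
        }
        if (j == seqLength) { /* every character of the needle matched */
            counter++;
        }
    }
    return counter;
}
```

Inside the fgets loop this could be used as counter += count_occurrences(inputLine, subsequence);. The trailing newline that fgets leaves in inputLine is harmless here, as long as the subsequence itself contains no newline. If you would rather keep a library call, strncmp(&inputLine[i], subsequence, seqLength) == 0 compares at most seqLength characters instead of the whole rest of the line.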