I have two character arrays called arraypi and arraye containing numbers that I read from a file. Each has 1,000,000 characters. I need to start from the first character in arraye (in this case, 7) and search for it in arraypi. If 7 exists in arraypi, then I have to search for the next substring of arraye (in this case, 71), then for 718, 7182, and so on until the substring does not exist in arraypi. Then I simply have to put the length of the longest substring in an integer variable and print it.
It is worth mentioning that arraypi contains a newline every 50 characters whereas arraye contains a newline every 80, although I don't think that will be a problem, right?
I tried to think of a way to accomplish this, but so far I haven't come up with anything.
I am not absolutely sure if I got this right. I have something like this in mind:
arraypi is in a browser
ctrl+f for find
arraye letter by letter until you see the red no match
If that's right, then an algorithm like the following should do the trick:
#include <stdio.h>

#define iswhitespace(X) ((X) == '\n' || (X) == ' ' || (X) == '\t')

int main( ) {
    char e[1000] = "somet\n\nhing";
    char pi[1000] = "some other t\nhing\t som\neth\n\ning";
    int longestlen = 0;
    int longestx = 0;
    int pix = 0;
    int ex = 0;
    int piwhitespace = 0; // <-- added
    int ewhitespace = 0;  // <-- these

    while ( pix + ex + piwhitespace < 1000 ) {
        // added the following 4 lines to make it whitespace insensitive
        while ( iswhitespace(e[ex + ewhitespace]) )
            ewhitespace++;
        while ( iswhitespace(pi[pix + ex + piwhitespace]) )
            piwhitespace++;

        if ( e[ex + ewhitespace] != '\0' && pi[pix + ex + piwhitespace] != '\0' && pi[pix + ex + piwhitespace] == e[ex + ewhitespace] ) {
            // the following 4 lines are for obtaining correct longestx value
            if ( ex == 0 ) {
                pix += piwhitespace;
                piwhitespace = 0;
            }
            ex++;
        }
        else {
            if ( ex > longestlen ) {
                longestlen = ex;
                longestx = pix;
            }
            // restart one position after the previous starting point;
            // jumping by piwhitespace + 1 here could skip a valid start
            // when whitespace was crossed in the middle of a partial match
            pix++;
            piwhitespace = 0;
            ex = 0;
            ewhitespace = 0;
        }
    }
    printf( "Longest sqn is %d chars long starting at %d", longestlen, longestx + 1 );
    putchar( 10 );
    return 0;
}
What's happening there is that the loop first searches for a starting point for a match. Until it finds one, it increments the index into the array being examined. Once it finds a starting point, it starts incrementing the index into the array containing the search term instead, keeping the other index constant.
This goes on until the next mismatch, at which point a record check is made, the search term index is reset, and the index into the examined array starts getting incremented once again.
I hope this helps, hopefully with more than just this one-time struggle.
Changed the code to disregard white space characters.