
Algorithm to track a word occurring two words after a word with a particular pattern


I'd greatly appreciate help with this algorithm/pseudocode.

Basically, I'm looking for words that match a particular pattern (it doesn't matter which). I have a function that tests this and returns 1 if the word fulfills the requirements. When it does, the second word after it should be omitted and not saved in the output. I had no problem with that when the "chosen" words are separated by one "not chosen" word. The problem is: what should happen when the "chosen" words appear one after the other?

I've prepared the pseudocode below to clarify the situation a bit. Unfortunately, it doesn't work for all combinations of "chosen" and "not chosen" words.

I introduced three counters/variables to help me track the position I'm currently at.

The following pseudocode isn't in logical order!

if (counter == 2 || in_a_row >= 3) {
    erase = 1;
    counter--;
    yes = 0;
    if (!chosen) 
        counter = 0;
}
if (chosen) {
    counter++;
    yes = 1;
    in_a_row++;
} else {
    if (yes == 1) /* yes is set when the preceding word is chosen and the next is not chosen, in order to keep track of the counter */
        counter++;
}
if (in_a_row == 5)
    in_a_row = 4; /* just to make sure that counter doesn't go too high */
if (erase == 1)
    /*erasing procedure*/

If you have a simpler idea, or see a mistake in this one, PLEASE help me. I've been trying to work this out for 8 hours...


Solution

  • Pardon me for using actual code instead of pseudocode. I hope I now understand the problem well enough that my impression that it isn't very complicated is accurate.

    # include <stdbool.h>
    # include <stdio.h>
    # include <string.h>
    
    
    # define BUFF_SIZE       1024
    # define WORD_DELIM     " "
    # define MATCH_PATT     "barf"
    
    
    int main(  int ac ,  char *av[]  )
    {
        // True when the token 1 position back matched the pattern.
        bool        match_1_back = false ;
        // True when the token 2 positions back matched the pattern.
        bool        match_2_back = false ;
    
        char        line_buff[  BUFF_SIZE  ] ;
        char        *buff_ptr ;
        char        *word_ptr ;
    
    
        while (  fgets( line_buff ,  BUFF_SIZE ,  stdin )  )
        {
            puts(  "\nInput line was:  "  ) ;
            puts(  line_buff  )  ;
    
            puts(  "Output line is:  "  ) ;
    
            buff_ptr = line_buff ;
    
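            // Note: fgets keeps the trailing '\n', and WORD_DELIM is only a space,
            // so the last token on a line still carries the newline and will not
            // compare equal to MATCH_PATT.
            while (  ( word_ptr = strtok( buff_ptr ,  WORD_DELIM )  )  !=  NULL  )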
            {
                buff_ptr = NULL ;
    
                if (  strcmp( word_ptr ,  MATCH_PATT  )  ==  0  )
                {
                    // Set these to what they should be for next iteration.
                    match_2_back = match_1_back ;
                    match_1_back = true ;
    
                    // Don't output matched token.
                }
                else
                {
                    // Don't output token if a token matched 2 tokens back.
                    if (  ! match_2_back  )
                        printf(  "%s " ,  word_ptr  ) ;
    
                    // Set these to what they should be for next iteration.
                    match_2_back = match_1_back ;
                    match_1_back = false ;
                }
            }
    
            printf(  "\n"  ) ;
        }
    }
    

    With this input:

    barf   barf  barf   healthy     feeling     better   barf  barf barf uh oh sick again
    barf   barf  healthy     feeling     better   barf  barf uh oh sick again
    barf   healthy     barf   feeling     better     barf   uh   barf   oh sick again
    barf   healthy     feeling     better   barf uh oh sick again
    

    I got this output:

    Input line was:  
    barf   barf  barf   healthy     feeling     better   barf  barf barf uh oh sick again
    
    Output line is:  
    better sick again
    
    
    Input line was:  
    barf   barf  healthy     feeling     better   barf  barf uh oh sick again
    
    Output line is:  
    better sick again
    
    
    Input line was:  
    barf   healthy     barf   feeling     better     barf   uh   barf   oh sick again
    
    Output line is:  
    healthy feeling uh oh again
    
    
    Input line was:  
    barf   healthy     feeling     better   barf uh oh sick again
    
    Output line is:  
    healthy better uh sick again
    

    I just used a simple comparison, not actual regular expressions. I only wanted to illustrate the algorithm. Does the output conform to the requirements?
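
The same two-flag idea can be sketched as a reusable function. This is only a sketch under assumptions: `is_chosen` is a hypothetical stand-in for the asker's pattern test, `drop_second_after` is a name made up for illustration, and, unlike the program above, chosen words are kept in the output unless they themselves fall two words after a chosen word. It also assumes that a dropped word still counts as "chosen" for the words that follow it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the asker's pattern test. */
static bool is_chosen(const char *word)
{
    return strcmp(word, "barf") == 0;
}

/*
 * Copies pointers to the kept words into out[] (which must have room for
 * n entries) and returns how many were kept. A word is dropped when the
 * word two positions before it was chosen.
 */
static size_t drop_second_after(const char *const words[], size_t n,
                                const char *out[])
{
    bool chosen_1_back = false;   /* was the previous word chosen? */
    bool chosen_2_back = false;   /* was the word before that chosen? */
    size_t kept = 0;

    assert(words != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        if (!chosen_2_back)
            out[kept++] = words[i];

        /* Shift the history by one word. The current word is tested
           even if it was just dropped (an assumption of this sketch). */
        chosen_2_back = chosen_1_back;
        chosen_1_back = is_chosen(words[i]);
    }
    return kept;
}
```

For the words `barf barf a b c barf d e`, this keeps `barf barf c barf d`: `a` and `b` each sit two words after a `barf`, and so does `e`. Because the history is just two flags shifted on every word, consecutive chosen words need no special counters.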