Tokenize Strings using Pointers in ANSI C


This is in ANSI C. I am given a string and have to write a function that returns an array of character pointers, each pointing to the beginning of a word in that string. I am not allowed to use malloc; instead, I was told that the input will be at most 80 characters long.

Also, before anyone flames me for not searching the forum, I can't use strtok :(

char input[80] = "hello world, please tokenize this string";

The output of the function should have 6 elements:

output[0] points to the "h",
output[1] points to the "w",

and so on.

How should I write the method?

Also, I need a similar function to handle input from a file with a maximum of 110 lines.


Solution

  • Pseudocode:

    boolean isInWord = false
    while (*ptr != NUL character) {
       if (!isInWord and isWordCharacter(*ptr)) {
           isInWord = true
           save ptr
       } else if (isInWord and !isWordCharacter(*ptr)) {
           isInWord = false
       }
       increment ptr
    }
    

    isWordCharacter checks whether a character is part of a word. Depending on your definition, it can accept only alphabetic characters (so part-time counts as two words), or it can also accept - (so part-time counts as one word). "save ptr" means storing the current pointer in the next free slot of your output array.
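
The pseudocode above could be turned into ANSI C roughly as follows. This is only a sketch under a few assumptions: the names find_words, is_word_character, MAX_INPUT and MAX_WORDS are made up for illustration, a word character is assumed to be a letter or digit (via isalnum), and the caller supplies the output array, so no malloc is needed.

```c
#include <ctype.h>

#define MAX_INPUT 80
/* A buffer of 80 chars holds at most 79 characters plus the NUL,
   so at most 40 one-letter words separated by single delimiters. */
#define MAX_WORDS (MAX_INPUT / 2)

/* Assumed definition: a word character is a letter or a digit.
   Add (c == '-') here if "part-time" should count as one word. */
static int is_word_character(char c)
{
    return isalnum((unsigned char)c);
}

/* Stores a pointer to the first character of each word of str in out
   (at most max_words entries) and returns the number of pointers stored. */
static int find_words(char *str, char *out[], int max_words)
{
    int count = 0;
    int in_word = 0;
    char *ptr;

    for (ptr = str; *ptr != '\0'; ptr++) {
        if (!in_word && is_word_character(*ptr)) {
            in_word = 1;
            if (count < max_words)
                out[count++] = ptr;   /* "save ptr" from the pseudocode */
        } else if (in_word && !is_word_character(*ptr)) {
            in_word = 0;
        }
    }
    return count;
}
```

Called as find_words(input, output, MAX_WORDS) with char *output[MAX_WORDS] and the input from the question, it returns 6, with output[0] pointing to the "h" and output[1] to the "w". The string itself is not modified, so the words are not NUL-terminated individually. For the file case, one option would be to read each line with fgets into its own buffer and call the same function per line; the pointers remain valid only while that line's buffer is kept.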