c, parsing, wavefront

Extracting face vertex indices from a Wavefront .obj file


I'm trying to extract the faces from a Wavefront .obj file and then get the vertex indices of each face. With the code below I can print each vertex/texture/normal group separately using the strtok() function, but I can't find the correct way to parse the vertex index out of each group.

Here's my code:

#include <stdio.h>
#include <string.h>

//lazy wavefront obj file parser

int main(){
    FILE *fp = fopen("head.obj", "r");

    //find the number of lines in the file
    int no_lines = 0;
    char ch;
    while(ch != EOF){
        ch = getc(fp);
        if(ch == '\n')
            no_lines++;
    }
    printf("number of lines: %d\n", no_lines);

    //set seek point to start of the file
    fseek(fp, 0, SEEK_SET);

    //get the faces and parse them
    char line[100];
    while(fgets(line, sizeof(line), fp) != NULL){
        if(line[0] == 'f'){

            //split the line at spaces
            char *s = strtok(line, " ");
            
            while(s != NULL){
                if(*s != 'f'){
                    /*this will print faces as follows
                        58/35/58
                        59/36/59
                        60/37/60

                        we need to get the first number from each line, i.e., 58, 59, 60
                    */
                    printf("%s\n", s);
                }
                s = strtok(NULL, " ");
            }
        }
    }
    
    fclose(fp);

    return 0;
}

The output looks something like this.

1202/1248/1202
1220/1281/1220
1200/1246/1200

1200/1246/1200
1220/1281/1220
1198/1247/1198

1201/1249/1201
1200/1246/1200
1199/1245/1199

1201/1249/1201
1202/1248/1202
1200/1246/1200

I want to extract numbers from the above output; all I need is the first number on each line.

For the following lines

1202/1248/1202
1220/1281/1220
1200/1246/1200

the output should be 1202, 1220, 1200.


Solution

    A brief overview

    The code below is by no means a complete Wavefront OBJ parser, but it satisfies the requirements in the question. First, it checks whether the first character of the line is an 'f'. If it is not, the line is skipped; otherwise, the line is parsed.
    We first skip past the 'f' and then repeat two calls to strtok with alternating delimiters. In the first one, we read until '/' for the first vertex index. In the second one, we read until the next space character (and ignore the result). The pointer is now at the start of the second vertex index. This process is repeated until the end of the line.

    Some technical details

    According to this source, blank space can freely be added to the OBJ file, although this source says that "no spaces are permitted before or after the slash". Skipping whitespace is in itself not difficult. My default approach would be to use this "state machine" to read a single line:

    1. Skip ahead until the current character is whitespace.
    2. Skip ahead until the current character is not whitespace.
    3. Read until the '/' character. This is the vertex index.
    4. Back to Step 1.
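
    The four steps above can be sketched roughly as follows. This is only an illustration: the function name parseFaceLine and its output-array interface are hypothetical and not part of the program further down, and it reads the number with strtol rather than sscanf. It assumes the caller has already checked that the line starts with 'f'.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: stores up to `max` vertex indices from one face line
 * in `out` and returns how many were found. */
static size_t parseFaceLine(const char *line, long *out, size_t max)
{
    size_t count = 0U;
    const char *p = line;

    while (*p != '\0' && count < max)
    {
        /* Step 1: skip ahead until the current character is whitespace. */
        while (*p != '\0' && !isspace((unsigned char)*p))
            ++p;

        /* Step 2: skip ahead until the current character is not whitespace. */
        while (isspace((unsigned char)*p))
            ++p;
        if (*p == '\0')
            break; /* End of line reached, no further group. */

        /* Step 3: read the number that ends at '/' (or at the end of the group). */
        char *end;
        long index = strtol(p, &end, 10);
        if (end == p)
            break; /* Not a number: stop instead of guessing. */
        out[count++] = index;

        /* Step 4: back to step 1, which skips the rest of this group. */
        p = end;
    }

    return count;
}
```

    Called with the example line f 2/1/1 4/2/1 1/3/1, this stores 2, 4 and 1 in the array and returns 3. Because it never writes to the line, the same buffer can still be used afterwards.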

    Manually advancing a pointer and reading the vertex index with sscanf is a valid approach. It is tempting to use strtok here; however, this makes parsing more difficult, as strtok

    • has a different initial call,
    • keeps the current position as internal state,
    • is destructive,
    • skips runs of consecutive delimiters, i.e. it never returns a token of length 0, so an empty field cannot be detected.
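
    To make the last two points concrete, here is a small sketch. The helper name splitAtSlash is made up for this illustration. It splits one group at '/' and counts the tokens strtok reports; the "58//58" form (vertex and normal without a texture index) is used as the test case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `text` in place at '/' and returns the number
 * of tokens strtok reports, storing up to `max` token pointers in `tokens`. */
static size_t splitAtSlash(char *text, char *tokens[], size_t max)
{
    size_t count = 0U;

    /* The first call takes the buffer; every later call takes NULL and
     * continues from the position strtok remembered internally. */
    for (char *t = strtok(text, "/"); t != NULL && count < max;
         t = strtok(NULL, "/"))
    {
        tokens[count++] = t;
    }

    return count;
}
```

    For a buffer holding "58/35/58" this reports three tokens. For "58//58" it reports only two, "58" and "58", so the empty texture field is silently lost, and the first '/' in the buffer has been overwritten with a terminating null character.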

    Since the question indicates that the implementation should be "simple" and that usually a single space is used as a whitespace delimiter, the following implementation is possible:

    Code

    Example input line: f 2/1/1 4/2/1 1/3/1.
    Example output line: 2 4 1 .

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    #define LINE_SIZE_MAX 128U
    
    static size_t fileToLines(FILE *fp);
    static bool parseFacesFromFile(FILE *fp);
    
    int main(int argc, char *argv[])
    {
        if (argc != 2)
        {
            fprintf(stderr, "Usage: %s <obj file name>\n", argv[0]);
            exit(EXIT_FAILURE);
        }
    
        FILE *fp = fopen(argv[1], "r");
        if (fp == NULL)
        {
            fprintf(stderr, "Error opening file\n");
            exit(EXIT_FAILURE);
        }
    
        printf("Number of lines: %zu\n", fileToLines(fp));
    
        printf("Parsing %s\n", parseFacesFromFile(fp) ? "succeeded" : "failed");
    
        fclose(fp);
    }
    
    static size_t fileToLines(FILE *fp)
    {
        size_t numberOfLines = 0U;
        int ch;
        while ((ch = getc(fp)) != EOF)
            if (ch == '\n')
                ++numberOfLines;
    
        fseek(fp, 0L, SEEK_SET);
        
        return numberOfLines;
    }
    
    static bool parseFacesFromFile(FILE *fp)
    {
        char line[LINE_SIZE_MAX];
        while (fgets(line, sizeof(line), fp) != NULL)
        {
            if (line[0] != 'f')
                continue;
    
            char *tokenPtr;
            strtok(line, " "); // Read past the initial "f"
            while ((tokenPtr = strtok(NULL, "/")) != NULL)
            {
                printf("%s ", tokenPtr);
                strtok(NULL, " "); // Return value checking omitted, assume correct input
            }
            putchar('\n');
        }
    
        return true;
    }
    

    Additional comments:

    • The typical signature of main, using no parameters, is int main(void).
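    • In the question's code, char ch is read uninitialized the first time the loop condition is evaluated.
    • The function getc returns an int, and its result should not be stored in (or cast to) a char. Once it has been converted to a char, EOF cannot be reliably distinguished from a valid character.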
    • If you are only after the end result, consider using an OBJ-loading library and discarding the normals and texture data.