Search code examples
cbuffer-overflowsigabrtdynamic-allocation

Abort trap 6 upon function return with dynamic memory allocation


I have trouble debugging an "Abort trap 6" error in C using Apple LLVM version 5.1 (clang-503.0.40) on OS X Mavericks. The goal is to construct a data structure word2class which contains for each word in a vocabulary of num_words words an array of class ids to which that word belongs (arbitrary, but for example: Noun, Verb, ..., as integer ids) and also its inverse, class2word which contains for each class 0 <= class < num_class an array of the words it contains.

To do this, I am maintaining global arrays of pointers to arrays of long longs. The forwards data structure (word2class) and backwards data structure (class2words) are constructed in the same manner, call this A for now. Both A and each array pointed to by each element A[i] is dynamically allocated. At construction we know the maximum number of elements that A may contain (num_words or num_class). A is constructed by reading index,id pairs from a file (index is a word id, and id is a class id). Each id should be inserted into the array pointed to by A[index]. Words that belong to more than one class are stored as a separate line for each class id (e.g. 0<tab>0<newline>0<tab>1). During construction, I maintain an array of the current maximum size of each array A_max_size[i], and the current number of elements in each array A_cur_size[i], and lazily realloc() each *A[i] if it gets too small. I amortize memory allocation costs by resizing A[i] by factor 2 whenever it gets too small. Below is the actual relevant parts of the code:

#include <stdio.h>
#include <stdlib.h>

// global arrays of pointers to long longs
long long **word2class = NULL, **class2word = NULL;
// bounds for each array pointed to by word2class[i] / class2word[j]
long long *w2c_cur_size = NULL, *c2w_cur_size = NULL;
// max number of words, passed as first argument argv[1]
// max number of classes, read from file
long long num_words, num_classes = 0;

void AllocateAndInsert(long long **A, long long *A_cur_size, long long *A_max_size,
    long long index, long long item) {
 if (A == NULL || A_cur_size == NULL || A_max_size == NULL) {
  printf("NULL pointer passed.\n"); exit(1);
 }
 // Check if A[index] is big enough, realloc if needed
 if (A_cur_size[index] + 1 >= A_max_size[index]) {
   if (A_max_size[index] == 0) A_max_size[index] = 1;
   else A_max_size[index] *= 2;  // amortize memory allocation
   // dynamically allocate a large enough array of long longs
   A[index] = (long long*)realloc(A[index], A_max_size[index] * sizeof(long long));
   if (A[index] == NULL) { 
    printf("(Re)allocation error.\n");
    exit(1);
   }
 }
 // insert item at end of array pointed to by A[index]
 A[index][A_cur_size[index]] = item;
 A_cur_size[index]++;
}

void PrintDS(long long **A, long long max_index, long long *A_size) {
 int i, j;
 for (i = 0; i < max_index; i++) {
  printf("Index %d:\n", i);
  for (int j = 0; j < A_size[i]; j++) {
   printf("%lld ", A[i][j]);
  }
  printf("\n");
 }
}

void BuildDataStruct(char *fname) {
 FILE *fin = fopen(fname, "r");
 long long index, id;
 long long *w2c_max_size, *c2w_max_size;     // temp variables used during construction

 if (fin == NULL) {
   printf("Error opening file.\n"); 
   exit(1);
 }

 // read max number of classes as first line from file
 if (fscanf(fin, "%lld\n", &num_classes) != 1) { 
   printf("File format error.\n"); 
   exit(1); 
 }
 // word2class
 w2c_max_size = calloc(num_words, sizeof(long long));  // initialized to 0s
 w2c_cur_size = calloc(num_words, sizeof(long long));
 word2class = calloc(num_words, sizeof(long long *));
 // class2word
 c2w_max_size = calloc(num_classes, sizeof(long long));
 c2w_cur_size = calloc(num_classes, sizeof(long long));
 class2word = calloc(num_classes, sizeof(long long *));
 if (w2c_max_size == NULL || w2c_cur_size == NULL || word2class == NULL ||
     c2w_max_size == NULL || c2w_cur_size == NULL || class2word == NULL) {
  printf("Allocation error.\n"); 
  exit(1);
 }

 while (!feof(fin)) {
   if (fscanf(fin, "%lld\t%lld\n", &index, &id) != 2) {
     printf("Format error.\n");
     exit(1);
   }
   if (index < 0 || index >= num_words || id < 0 || id >= num_classes) { 
    printf("Bounds error.\n"); 
    exit(1); 
   }
   AllocateAndInsert(word2class, w2c_cur_size, w2c_max_size, index, id);
   AllocateAndInsert(class2word, c2w_cur_size, c2w_max_size, id, index);
 }

 fclose(fin);
 free(w2c_max_size);
 free(c2w_max_size);
 // DATA STRUCTURE PRINTS FINE HERE
 PrintDS(word2class, num_words, w2c_cur_size);
 PrintDS(class2word, num_classes, c2w_cur_size);
}

int main(int argc, char *argv[]) {
  if (argc <= 2) { 
    printf("Usage: %s num_words word2class_file\n", argv[0]); 
    exit(1); 
  }
  num_words = atoi(argv[1]);
  BuildDataStruct(argv[2]);
  // <-- 'Abort Trap 6' RAISED HERE
  return 0;
}

For those interested, test data can be generated using the following script as python script.py num_words max_class_id max_class_per_word out_file and consumed using the above C code as ./build_ds num_words out_file:

import sys
from random import sample, randint

num_words = int(sys.argv[1])
max_class_id = int(sys.argv[2])
max_class_per_word = int(sys.argv[3])
f = open(sys.argv[4], 'w')

f.write("%d\n" % (max_class_id,))
for w in xrange(num_words):
    num_classes = randint(1, max_class_per_word)
    classes = sample(xrange(max_class_id), num_classes)
    for c in classes:
        f.write("%d\t%d\n" % (w, c))

f.close()

This code works (even on the 'real' data), however, once I use this exact code as part of a larger application, the program terminates with the cryptic "Abort trap 6" error. BuildDataStruct() is called from another function and seems to work fine for the construction part. One can print out both data structures before returning from BuildDataStruct() with no problem. However, immediately after the function returns the error is generated.

From reading various other answers, the Abort Trap: 6 error seems to be raised on OS X when a buffer is overwritten. However, I have added checks on index to ensure that does not happen, so I am not sure what else might be the problem? Am I perhaps overwriting the stack somehow? Does anyone have any ideas where or how I can start hunting for the issue?


Solution

  • The problem was actually with the fscanf() line for reading the word and class. I made one simplification to the code I posted: In my actual code, word2class is stored as a <word_str> <tab> <class_int> <newline> entry per line. I.e. I store words as strings instead of long long ids. I then read the word into a char word[MAX_STRING] buffer and then lookup its long long word_id.

    The problem was that some words were longer than MAX_STRING-1, which overwrites the null-terminated string word, which caused a seg fault. The problem was further compounded since I used the -Ofast optimization flag, which somehow changed the segfault error to the Abort trap 6 error. Once I compiled without the optimization flag, I realized it was a seg-fault, which I was able to trace to the fscanf() function using valgrind and gdb.

    The solution was to add a maximum length to the %s fscanf format string, as follows:

    #define MAX_STRING 100
    #define MAX_STRING_S "99"  // careful not to overwrite terminating null
    
    ...
    
    // in BuildDataStruct()
    char word[MAX_STRING];
    if (fscanf(fin, "%" MAX_STRING_S "s\t%lld", word, &class) != 2) {
      printf("Format error in class_file\n");
      // continue or exit
    }
    

    This code now correctly identifies the formatting error in class_file, and furthermore works correctly as expected.