My professor showed us an example of a program that reads in particle structure objects and prints the details of each particle. I understand how the C program works, but I am confused about the "filea" binary file that contains the "structure objects". How is the data automatically assigned to the members of the structs in the C program? Because filea is binary, I can't read its contents, so I'm not sure exactly how this works, and when I asked him about it I didn't get a clear answer.
Here is the program:
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
struct vector{
float x;
float y;
float z;
};
struct particle {
float mass;
struct vector pos;
struct vector vel;
};
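int main(int argc, char *argv[]) {
    int cnt = 0;
    int fd;
    ssize_t nbytes;
    /* One particle-sized buffer; every read() call overwrites it. */
    struct particle *buf = malloc(sizeof(struct particle));
    if (buf == NULL) {
        perror("malloc");
        return 1;
    }
    fd = open("filea", O_RDONLY);
    if (fd < 0) {
        perror("open");
        free(buf);
        return 1;
    }
    /* Each read() copies the next sizeof(struct particle) bytes of the file into buf. */
    while ((nbytes = read(fd, buf, sizeof(struct particle))) > 0) {
        printf("Particle %d\n", cnt++);
        printf("\tmass\t%.1f\n", buf->mass);
        printf("\tpos\t(%.1f,%.1f,%.1f)\n", buf->pos.x, buf->pos.y, buf->pos.z);
        printf("\tvel\t(%.1f,%.1f,%.1f)\n", buf->vel.x, buf->vel.y, buf->vel.z);
    }
    close(fd);
    free(buf);
    return 0;
}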
The slide said "Each particle is represented by the structures:"
struct vector {
float x;
float y;
float z;
};
struct particle {
float mass;
struct vector pos;
struct vector vel;
};
The two structures, vector and particle, are fixed-length structures. A vector is 3 floats, so if we assume a 4-byte float, that structure is 12 bytes. A particle is made up of 1 float and 2 vectors, so 4 bytes + 2 * 12 bytes, for a total of 28 bytes (assuming the compiler inserts no padding, which is the usual case when every member is a float).
read takes a file descriptor (the integer returned by open, not a FILE stream pointer), a memory address (in this case a buffer the size of a particle), and a size (again, the size of a particle). It returns the number of bytes actually read; it is fread, the stdio counterpart, that returns a count of items instead.
So read literally transfers the bytes from the file into the buffer pointed at by buf. buf happens to be typed as a pointer to particle, so all of the structure operators conveniently work (as seen in the printf statements).
When read reaches the end of the file, it returns 0 instead of a byte count, and that terminates the loop. A genuine failure is reported as -1, which also ends the loop because the condition tests for a value greater than 0.
The data on the disk must match the internal, binary layout of the structures and of the floats within those structures, otherwise you will get garbage data. For example, if you wrote the file on a "little endian" machine and read it on a "big endian" machine, the data would very likely be corrupted, because the bytes of each float would be interpreted in the opposite order.
This technique is an efficient and simple mechanism for storing and reading data, but it is not portable.