[Solved] The Strangest Bug I've Ever Seen
I spent two hours finding it and two seconds fixing it. There was this gigantic piece of C code that created a report by putting together data from several files. The code seemed to execute correctly (debug prints showed exactly what should have been written to the file); however, the report was empty. Zero bytes.
At first glance, the code was correct, even though it mixed file descriptors (for the input files) and FILE * (for the output). I could not explain that strange behavior. I also searched Google for similar cases, but found nothing. So I fell back on the old-style method: find the last point where things still work, then proceed line by line until the issue shows up.
The Hunt
I started closing the output file after each fprintf() until I found the point of failure: a cleanup function. The cleanup consisted of closing some file descriptors with the standard sequence:
    if (fd != -1) {
        close(fd);
        fd = -1;
    }
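To keep that sequence from being retyped (and mistyped) at every call site, it can be wrapped in a small helper. This is only a sketch: the name close_fd is hypothetical and does not come from the original code base.

```c
#include <unistd.h>

/* Hypothetical helper (not from the original code): close *fd if it is
 * open and mark it as closed. -1 is never a valid descriptor, so calling
 * this a second time on the same variable does nothing. */
void close_fd(int *fd)
{
    if (*fd != -1) {
        close(*fd);
        *fd = -1;
    }
}
```

A call site would then read close_fd(&fd); and there is a single place where the "closed" marker is chosen.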
Everything seemed correct, so I went looking for where the file descriptors closed in that function had been opened and... I found another place where one of them was closed. This was the code:
    if (fd != -1) {
        close(fd);
        fd = 0; // <-- this is BAD!!
    }
Then, in another part of the code, there was an lseek(fd, 0, SEEK_SET).
Putting It All Together
On Linux, file descriptor 0 is standard input. It's OK to close it, but open() always returns the lowest descriptor that is not currently in use, so the next file opened may well get 0 as its file descriptor. And handling the same file both as a FILE * and as a raw file descriptor may lead to undefined behavior.
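The descriptor reuse is easy to see in isolation. Here is a minimal sketch; the function name is made up for the demo, and the caller picks the path.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo helper: closes standard input, opens `path`, and
 * returns the descriptor that open() hands back. POSIX requires open()
 * to return the lowest-numbered descriptor not currently open, so once
 * 0 is free it is the one that gets reused. */
int open_after_closing_stdin(const char *path)
{
    close(STDIN_FILENO);            /* descriptor 0 is now free */
    return open(path, O_RDONLY);    /* lowest free descriptor is returned */
}
```

Calling open_after_closing_stdin("/dev/null") should return 0, which means any variable still holding a stale 0 now refers to that newly opened file.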
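What happened was that, with fd left at 0 instead of -1, the fd != -1 checks still passed, so the later calls on fd hit descriptor 0, which by then presumably belonged to the output file. The output file was therefore accessed both ways, and the seek probably messed up the length of the file. Once the bug was fixed, the report was generated correctly.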
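The exact mechanism was never pinned down. For illustration only, here is one plausible way a stale 0 can leave a FILE * report empty. It is a sketch under my own assumptions, not the original code: standard input has already been closed, the report is small enough to stay in the stdio buffer, and the function name is hypothetical. Note that it uses a stray close() rather than the lseek() mentioned above.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical reproduction: returns the final size of `path` in bytes,
 * or -1 if the demo could not be set up. */
long report_size_after_stray_close(const char *path)
{
    int fd = 0;                       /* stale marker left by "fd = 0" */
    struct stat st;

    close(STDIN_FILENO);              /* assumption: descriptor 0 is free */

    FILE *out = fopen(path, "w");     /* expected to reuse descriptor 0 */
    if (out == NULL)
        return -1;
    if (fileno(out) != 0) {           /* demo only applies if 0 was reused */
        fclose(out);
        return -1;
    }

    fprintf(out, "report line\n");    /* still sitting in the stdio buffer */

    if (fd != -1) {                   /* the cleanup check: 0 passes it */
        close(fd);                    /* closes the descriptor under `out` */
        fd = -1;
    }

    fclose(out);                      /* the flush fails, the data is lost */

    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_size;
}
```

With the stale marker set to -1 instead of 0, the cleanup block is skipped and the buffered line reaches the file when fclose() flushes it.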
Image credits: gratisography.com