FIO19-C. Do not use fseek() and ftell() to compute the size of a regular file
Subclause 7.21.9.2 of the C Standard [ISO/IEC 9899:2011] specifies the following behavior for fseek() when opening a binary file in binary mode:
A binary stream need not meaningfully support fseek calls with a whence value of SEEK_END.
In addition, a footnote in subclause 7.21.3 states:
Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream
(because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the
initial shift state.
Seeking to the end of a binary stream in binary mode with fseek() is not meaningfully supported and is not a recommended method for computing the
size of a file.
Subclause 7.21.9.4 of the C Standard [ISO/IEC 9899:2011] specifies the following behavior for ftell() when opening a text file in text mode:
For a text stream, its file position indicator contains unspecified information, usable by the fseek function for returning the file
position indicator for the stream to its position at the time of the ftell call.
Consequently, the return value of ftell() for streams opened in text mode should never be used for offset calculations other than in calls to fseek().
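The one portable use of a text-stream ftell() value, handing it back to fseek(), can be sketched as follows. This is a minimal illustration, not part of the rule; peek_line() is a hypothetical helper name, and it treats the value returned by ftell() as an opaque token rather than as a byte count.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: reads the next line from fp into line without
 * consuming it. The position is saved with ftell() and restored with
 * fseek(); the saved value is never used for arithmetic.
 * Returns 0 on success, -1 on error or end-of-file.
 */
int peek_line(FILE *fp, char *line, int line_size) {
  assert(fp != NULL && line != NULL && line_size > 0);

  long pos = ftell(fp);  /* opaque token when fp is a text stream */
  if (pos == -1L) {
    return -1;
  }
  if (fgets(line, line_size, fp) == NULL) {
    return -1;
  }
  /* Return to the position recorded at the time of the ftell() call. */
  if (fseek(fp, pos, SEEK_SET) != 0) {
    return -1;
  }
  return 0;
}
```

After a successful call, the next read from fp returns the same line again, because the stream has been returned to the recorded position.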
POSIX [IEEE Std 1003.1:2013] provides several guarantees that the problems described in the C Standard do not occur on POSIX systems.
First, the fopen() specification states:
The character 'b' shall have no effect, but is allowed for ISO C standard conformance.
This guarantees that binary files are treated the same as text files in POSIX.
Second, the fwrite() specification states:
For each object, size calls shall be made to the fputc() function, taking the values (in order) from an array of unsigned char exactly
overlaying the object. The file-position indicator for the stream (if defined) shall be advanced by the number of bytes successfully
written.
This means that the file position indicator, and consequently the file size, is directly based on the number of bytes actually written to a file.
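The POSIX guarantee can be observed with a small experiment. The sketch below assumes a POSIX system and an initially empty stream; position_after_write() is a hypothetical helper introduced only for illustration. It writes some bytes, then reports both the file position from ftello() and the size from fstat() so the two can be compared.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: writes len bytes to fp, then stores the file
 * position reported by ftello() in *pos_out and the size reported by
 * fstat() in *size_out. Returns 0 on success, -1 on any failure.
 */
int position_after_write(FILE *fp, const void *data, size_t len,
                         off_t *pos_out, off_t *size_out) {
  assert(fp != NULL && data != NULL);
  assert(pos_out != NULL && size_out != NULL);

  if (fwrite(data, 1, len, fp) != len) {
    return -1;
  }
  off_t pos = ftello(fp);
  if (pos == (off_t)-1) {
    return -1;
  }
  /* Flush buffered bytes so that fstat() sees them. */
  if (fflush(fp) != 0) {
    return -1;
  }
  struct stat st;
  if (fstat(fileno(fp), &st) != 0) {
    return -1;
  }
  *pos_out = pos;
  *size_out = st.st_size;
  return 0;
}
```

On a POSIX system, writing the four bytes "x\ny\n" to an empty stream should leave both values at 4, because no newline translation takes place.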
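Noncompliant Code Example (Binary File)
This noncompliant code example opens a binary file in binary mode and attempts to use fseek() and ftell() to obtain the file size:
FILE *fp;
long file_size;
char *buffer;

fp = fopen("foo.bin", "rb");
if (fp == NULL) {
  /* Handle error */
}

/* SEEK_END need not be meaningfully supported for a binary stream */
if (fseek(fp, 0, SEEK_END) != 0) {
  /* Handle error */
}

file_size = ftell(fp);
if (file_size == -1) {
  /* Handle error */
}

buffer = (char*)malloc(file_size);
if (buffer == NULL) {
  /* Handle error */
}

/* ... */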
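Compliant Solution (POSIX ftello())
On POSIX systems, where the guarantees quoted above hold, fseeko() and ftello() can be used to obtain the size of a regular file. The result is stored in
an off_t so that large offsets are not truncated:
FILE *fp;
int fd;
off_t file_size;
char *buffer;

fd = open("foo.bin", O_RDONLY);
if (fd == -1) {
  /* Handle error */
}

fp = fdopen(fd, "r");
if (fp == NULL) {
  /* Handle error */
}

if (fseeko(fp, 0, SEEK_END) != 0) {
  /* Handle error */
}

file_size = ftello(fp);
if (file_size == -1) {
  /* Handle error */
}

buffer = (char*)malloc(file_size);
if (buffer == NULL) {
  /* Handle error */
}

/* ... */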
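Compliant Solution (POSIX fstat())
This compliant solution uses the POSIX fstat() function to obtain the size of the file after confirming that it is a regular file:
off_t file_size;
char *buffer;
struct stat stbuf;
int fd;

fd = open("foo.bin", O_RDONLY);
if (fd == -1) {
  /* Handle error */
}

/* Fill in stbuf and reject anything that is not a regular file */
if ((fstat(fd, &stbuf) != 0) || (!S_ISREG(stbuf.st_mode))) {
  /* Handle error */
}

file_size = stbuf.st_size;

buffer = (char*)malloc(file_size);
if (buffer == NULL) {
  /* Handle error */
}

/* ... */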
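The fstat() approach can be packaged as a reusable function. The following is a sketch under stated assumptions rather than a definitive implementation: regular_file_size() is a hypothetical name, the system is assumed to be POSIX, and the extra range check before converting off_t to size_t is an addition that the listing above leaves to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: stores the size of the regular file at path in
 * *size_out. Returns 0 on success. Returns -1 if the file cannot be
 * opened, is not a regular file, or is too large for a size_t.
 */
int regular_file_size(const char *path, size_t *size_out) {
  assert(path != NULL && size_out != NULL);

  int fd = open(path, O_RDONLY);
  if (fd == -1) {
    return -1;
  }

  struct stat st;
  int rc = -1;
  if (fstat(fd, &st) == 0 &&
      S_ISREG(st.st_mode) &&               /* regular files only */
      st.st_size >= 0 &&
      (uintmax_t)st.st_size <= SIZE_MAX) { /* fits in a size_t */
    *size_out = (size_t)st.st_size;
    rc = 0;
  }

  close(fd);
  return rc;
}
```

A caller would typically pass the reported size to malloc() and then read the file through a separately opened descriptor or stream. Because the file can change between the fstat() call and the read, the number of bytes actually read should still be checked.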
Compliant Solution (Windows)
This compliant solution uses the Windows _filelength() function to determine the size of the file on a 32-bit operating system. For a 64-bit operating
system, consider using _filelengthi64 instead.
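int fd;
long file_size;
char *buffer;

/* Open the file so that fd refers to a valid descriptor */
_sopen_s(&fd, "foo.bin", _O_RDONLY, _SH_DENYRW, _S_IREAD);
if (fd == -1) {
  /* Handle error */
}

file_size = _filelength(fd);
if (file_size == -1) {
  /* Handle error */
}

buffer = (char*)malloc(file_size);
if (buffer == NULL) {
  /* Handle error */
}

/* ... */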
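Alternatively, the Windows GetFileSizeEx() function can be used to determine the size of the file:
HANDLE file;
LARGE_INTEGER file_size;
char *buffer;

/* Obtain a valid handle before querying the size */
file = CreateFile(TEXT("foo.bin"), GENERIC_READ, 0, NULL,
                  OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (INVALID_HANDLE_VALUE == file) {
  /* Handle error */
}

if (!GetFileSizeEx(file, &file_size)) {
  /* Handle error */
}

/*
 * Note: 32-bit portability issue with LARGE_INTEGER
 * truncating to a size_t.
 */
buffer = (char*)malloc((size_t)file_size.QuadPart);
if (buffer == NULL) {
  /* Handle error */
}

/* ... */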
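Noncompliant Code Example (Text File)
This noncompliant code example opens a text file in text mode and attempts to use fseek() and ftell() to obtain the file size:
FILE *fp;
long file_size;
char *buffer;

fp = fopen("foo.txt", "r");
if (fp == NULL) {
  /* Handle error */
}

if (fseek(fp, 0, SEEK_END) != 0) {
  /* Handle error */
}

/* For a text stream, this value is meaningful only to fseek() */
file_size = ftell(fp);
if (file_size == -1) {
  /* Handle error */
}

buffer = (char*)malloc(file_size);
if (buffer == NULL) {
  /* Handle error */
}

/* ... */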
However, the file position indicator returned by ftell() with a file opened in text mode is useful only in calls to fseek(). As such, the value of file_size
may not necessarily be a meaningful measure of the number of characters in the file, and consequently, the amount of memory allocated may be incorrect,
leading to a potential vulnerability.
The MSDN documentation for ftell() [MSDN] makes a similar statement:
The value returned by ftell may not reflect the physical byte offset for streams opened in text mode, because text mode causes
carriage return-linefeed translation. Use ftell with fseek to return to file locations correctly.
Again, this indicates that the return value of ftell() for streams opened in text mode is useful only in calls to fseek() and should not be used for any
other purpose.
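When a program needs the number of characters that a text stream will actually deliver, one portable option is to read the stream and count. This is not one of the solutions listed in this rule; it is a minimal sketch, and count_remaining_chars() is a hypothetical helper name. It costs a full pass over the file, but it relies only on fread() and never interprets a value returned by ftell().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: counts the characters remaining on fp by reading
 * them in chunks. Stores the count in *count_out and returns 0, or
 * returns -1 if a read error occurs.
 */
int count_remaining_chars(FILE *fp, uintmax_t *count_out) {
  assert(fp != NULL && count_out != NULL);

  char chunk[4096];
  uintmax_t total = 0;
  size_t n;

  /* fread() reports characters after any text-mode translation. */
  while ((n = fread(chunk, 1, sizeof chunk, fp)) > 0) {
    total += n;
  }
  if (ferror(fp)) {
    return -1;
  }
  *count_out = total;
  return 0;
}
```

The count reflects what the program will see when it reads the stream in the same mode, which is the quantity a buffer for the file's contents has to accommodate.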
Risk Assessment
Understanding the difference between text mode and binary mode with file streams is critical when working with functions that operate on them. Setting the
file position indicator to end-of-file with fseek() has undefined behavior for a binary stream. In addition, the return value of ftell() for streams opened
in text mode is useful only in calls to fseek(), not for determining file sizes or for any other use. As such, fstat() or other platform-equivalent functions
should be used to determine the size of a file.
Bibliography
[IEEE Std 1003.1:2013] XSH, System Interfaces, fopen
XSH, System Interfaces, fwrite
[MSDN] "ftell"