Writing cross-platform huge file I/O code is tricky because the 64-bit offset versions of fseek and ftell are not standard across compilers and platforms. This article documents the CMarkup experience with this issue and related discoveries made along the way; it may be useful to anyone dealing with the same problem.

32-bit and 64-bit file offsets

The limit of signed 32-bit integer offsets in ftell and fseek is 2^31-1 = 2147483647 which is around 2GB. For example, a common cross-platform way of getting file size is fseek with SEEK_END and then ftell, but ftell cannot return a number higher than the limit of its return type.
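As a minimal sketch of the fseek/ftell idiom just described, the function below returns the size of an open file as a long. The name file_size_long is hypothetical and is not part of CMarkup. Where long is 32 bits, the result cannot exceed 2147483647, which is exactly the limitation discussed above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the size of an open file in bytes,
   or -1L on failure. The result is limited by the range of long,
   which is 2147483647 where long is 32 bits. */
long file_size_long(FILE *fp)
{
    long saved, size;

    assert(fp != NULL);
    saved = ftell(fp);                  /* remember the current position */
    if (saved < 0)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0)   /* move to the end of the file */
        return -1L;
    size = ftell(fp);                   /* end offset is the file size */
    if (fseek(fp, saved, SEEK_SET) != 0) /* restore the position */
        return -1L;
    return size;
}
```

A caller would open the file in binary mode, for example with fopen(path, "rb"), and treat a negative return value as "unknown or too large".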

The 64-bit versions of these functions deal in offsets of billions of gigabytes (8 exabytes) which should be enough for single file sizes in the next decade or two ;).

There are no 64-bit offset versions of fread and fwrite functions. This is because they do not deal explicitly with file offsets, though they do operate relative to the current file pointer. The underlying file pointer can generally handle the huge file, so in theory you can read multiple 2GB blocks using fread even if you don't have a 64-bit ftell function available to query the current offset.
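To illustrate the point about fread, here is a rough sketch that measures how much data remains in a file by reading it in blocks and keeping its own 64-bit running total, without calling ftell at all. The function name count_bytes_by_reading and the block size are assumptions for illustration, and long long is assumed to be available (older compilers such as VC++ 6.0 would likely need __int64 instead).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads from the current position to end of file
   in fixed-size blocks and returns the number of bytes read,
   or -1 if a read error occurred. */
long long count_bytes_by_reading(FILE *fp)
{
    char block[16384];
    long long total = 0;   /* 64-bit counter kept by the caller, not by stdio */
    size_t n;

    assert(fp != NULL);
    while ((n = fread(block, 1, sizeof block, fp)) > 0)
        total += (long long)n;
    return ferror(fp) ? -1LL : total;
}
```

Reading the whole file just to learn its size is slow, so this is only a fallback for when no 64-bit ftell is available.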

So even as huge file support was added to the file I/O functions in C/C++ libraries by the early 90s, there was no graceful way to promote the integer types of offset variables used in existing programs. The fread and fwrite functions were upgraded under the covers, while new versions of ftell and fseek were added because they dealt explicitly with the offset.

I've seen #ifdef WIN64 used in conjunction with _fseeki64. I am not sure why they were doing that, but don't be confused. A 64-bit operating system is not required for 64-bit file offsets; you can usually use 64-bit offsets in a 32-bit operating system.

Huge files and CMarkup

The 64-bit offset versions of fseek and ftell are used to support huge files with the CMarkup release 11.0 file mode methods. The file mode methods give read and write access to files without loading the entire document into memory. File mode does not require 64-bit offsets, but 64-bit offsets are needed if you are dealing with files over 2GB.

off_t

Most UNIX-flavor compilers such as gcc have standardized on ftello and fseeko, which use the off_t integer type whose size depends on compiler setup. The functions resolve to the old fseek/ftell or to the huge fseeko64/ftello64 according to the _FILE_OFFSET_BITS and _LARGEFILE_SOURCE macro defines.
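The following sketch shows how the off_t functions might be used on a system that provides them, assuming a gcc-style toolchain where defining _FILE_OFFSET_BITS as 64 before the first system header makes off_t a 64-bit type. The function name file_size_off_t is hypothetical.

```c
/* These defines must come before any system header is included. */
#define _FILE_OFFSET_BITS 64
#define _LARGEFILE_SOURCE

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: returns the size of an open file as an off_t,
   or (off_t)-1 on failure. */
off_t file_size_off_t(FILE *fp)
{
    off_t saved, size;

    assert(fp != NULL);
    saved = ftello(fp);                   /* remember the current position */
    if (saved == (off_t)-1)
        return (off_t)-1;
    if (fseeko(fp, 0, SEEK_END) != 0)     /* move to the end of the file */
        return (off_t)-1;
    size = ftello(fp);                    /* end offset is the file size */
    if (fseeko(fp, saved, SEEK_SET) != 0) /* restore the position */
        return (off_t)-1;
    return size;
}
```

The defines can also be passed on the compiler command line (for example -D_FILE_OFFSET_BITS=64), which avoids depending on include order.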

Unfortunately VC++ doesn't use this system.

CMarkup in Visual Studio

Visual C++ provides its huge versions, _fseeki64 and _ftelli64, inconsistently. There is no clean way to always know whether the 64-bit versions are available and to automatically compile accordingly.

In CMarkup release 11.0 the 64-bit functions are used by default in Visual Studio 2005. This is not correct for target platforms where the 64-bit functions are deliberately excluded such as for Windows CE.

Since the 64-bit functions are covertly distributed with Visual Studio 6.0 (see below), in CMarkup release 11.1 I declared prototypes for the 64-bit functions by default in Visual Studio 6.0. This caused more problems because the functions are not available in some build configurations of VC++ 6.0.

In CMarkup release 11.2 with any version of Visual Studio you will need to define MARKUP_HUGEFILE in the project definitions if you want huge file access. This will eliminate these linker errors for first time users, while putting the burden on those who need huge file access to specify the define.
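An opt-in define of this kind could be wired up roughly as follows. This is a sketch only and is not copied from Markup.h: the names sketch_offset_t, SKETCH_FSEEK, SKETCH_FTELL and read_byte_at are hypothetical, and the non-Visual C++ branch simply falls back to fseek and ftell to keep the example short.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical selection logic: use the 64-bit functions only when the
   project explicitly asks for them with MARKUP_HUGEFILE under Visual C++. */
#if defined(MARKUP_HUGEFILE) && defined(_MSC_VER)
typedef __int64 sketch_offset_t;
#define SKETCH_FSEEK _fseeki64
#define SKETCH_FTELL _ftelli64
#else
typedef long sketch_offset_t;
#define SKETCH_FSEEK fseek
#define SKETCH_FTELL ftell
#endif

/* Hypothetical helper: seeks to an absolute offset and returns the byte
   found there, or EOF if the seek or the read fails. */
int read_byte_at(FILE *fp, sketch_offset_t pos)
{
    assert(fp != NULL);
    if (SKETCH_FSEEK(fp, pos, SEEK_SET) != 0)
        return EOF;
    return fgetc(fp);
}
```

With this arrangement a project that never defines MARKUP_HUGEFILE never references _fseeki64 or _ftelli64, so it cannot hit the unresolved symbol errors.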

One alternative would be to use the Win32 File APIs and the SetFilePointerEx available on Win2K+, but this would greatly increase the code differences between the different platforms.

 

comment posted 11.1 Issue: linking fseeki64 and ftelli64

David 17-Jun-2009

I got an error in my application which is a windows mobile program, developing in VS 2008. I created the application in the following steps:

a) Create a smartdevice project: New project -> smart device ->Win32 Smart Device project
b) Add markup files: copy the two files into the project directory, add them to the project, select "cmarkup.cpp", and set its properties to "Not Using Precompiled Headers"
c) Add some code in the testmarkup.cpp: #include "Markup.h" ... CMarkup xml; xml.Load(_T("UserInfo.xml"));
d) Compile the project, then I got the following error [configuration Debug Windows Mobile 5.0 Pocket PC SDK (ARMV4I)]:

...
WINVER not defined. Defaulting to 0x0400,
  which is appropriate for all supported Windows CE versions
...
.\Markup.cpp(1498) : error C3861: '_fseeki64': identifier not found
.\Markup.cpp(1499) : error C3861: '_ftelli64': identifier not found
...

CMarkup started using _fseeki64 and _ftelli64 in release 11, but they are optional. In the next release I will not use them unless you set a define. To turn them off, do something like this:

Old line 223:

#elif _MSC_VER >= 1000 // VC++

New line 223:

#elif _MSC_VER > 4000 // never

 

comment posted 11.1 Issue: linking fseeki64 and ftelli64

Robin Hilliard 12-Jun-2009

I'm still on VC++ 6.0 (don't ask) and the functions _fseeki64 and _ftelli64 don't exist. To fix this, you just need to bump the compiler version directive as follows:

Old line 223:
#elif _MSC_VER >= 1000 // VC++

New line 223:
#elif _MSC_VER > 1200 // > VC++ 6.0?

In CMarkup release 11.1 I implemented a workaround to offer huge file (>2GB) access by default in Visual C++ 6.0 and Visual Studio .NET versions before VC++ 2005. Huge file access is only useful for the developer version file mode (see Open) methods.

However, I did not realize that whether these functions are available in the VC++ libraries you link with depends on a project setting. If in your project settings you have the default Microsoft Foundation Classes setting "Use MFC in a Shared DLL", you will get the following problem when linking CMarkup release 11.1:

Linking...
Markup.obj : error LNK2001: unresolved external symbol __ftelli64
Markup.obj : error LNK2001: unresolved external symbol __fseeki64

If you are able to change your Microsoft Foundation Classes project setting to "Use MFC in a Static Library" that is one way to alleviate the linker error and keep huge file access. Otherwise, you can change the compiler version directive as shown above and you cannot use file mode with files over 2GB.

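The functions are present in the Visual Studio 6.0 CRT source files and in the static library:

Microsoft Visual Studio\VC98\CRT\SRC\FSEEKI64.C
Microsoft Visual Studio\VC98\CRT\SRC\FTELLI64.C
Microsoft Visual Studio\VC98\Lib\LIBCMT.LIB

The following command does not show them:

dumpbin /exports libcmt.lib

but this one does: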

dumpbin /all libcmt.lib

 

comment posted Linking Problem of CMarkup V11.1

Kang Yiqi 27-Jul-2009

I'm using the CMarkup (V11.1) class in my laboratory project (Windows XP & Visual C++ 6.0) for parsing xml documents. I've downloaded Markup.h & Markup.cpp from www.firstobject.com and added them to my VC++ project.

There's a problem in Markup.h, line 223: "#elif _MSC_VER >= 1000". As the version of VC++ 6.0 is 1200, MCD_FSEEK and MCD_FTELL are defined as _fseeki64 and _ftelli64. This causes 2 linking errors, e.g. "unresolved external symbol __ftelli64"

However, if I choose "Use MFC in a Static Library", the same as the sample project does, it links without errors. But the size of the executable file seems to be too big. So I changed line 223 like this: "#elif _MSC_VER > 1200", and the problem is solved. I also tried version 11.0 of CMarkup and there is no such problem, because it defines MCD_FSEEK as _fseeki64 only when _MSC_VER is bigger than 1400 (VC++ 2005).

I would suggest that for VC++ 6.0, it's better to define MCD_FSEEK and MCD_FTELL as fseek and ftell, rather than _fseeki64 and _ftelli64.

Thank you for the excellent feedback and research. Release 11.2 is fixed according to your suggestion.