Monday, November 22, 2010

Porting x86 programs to x86_64

Refer to:
http://www.physics.uq.edu.au/people/foster/amd64_porting.html


Since buying my amd64 system in 2004 I have been running a 64-bit Linux kernel, and I often want to run programs which were developed for 32-bit x86 processors. Porting code from x86 to amd64 is usually a fairly straightforward task involving fixing a few type problems. The following describes some of the issues I've encountered when porting to x86_64, in the hope that it will save someone some time.
Sizes of integer and pointer types

One of the most common problems in programs written for x86 is that they make assumptions about the size of pointer and integer types. Typically this leads to truncation; a common C++ error looks like the following:

string str = get_some_string();
unsigned int delimPos = str.find(':');
if(delimPos != string::npos)
    handle_delimiter(str);

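However, on x86_64 an unsigned int covers a smaller range than string::size_type, so when find() returns string::npos the value is truncated to fit into delimPos. The truncated value never compares equal to string::npos, and the body of the conditional is always executed.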

Actually, using the unsigned int above is just bad practice and should be avoided in favour of the proper type provided by the standard library, string::size_type. These kinds of errors easily slip by on a 32-bit system, however, where unsigned int and string::size_type typically have the same size. Luckily they're easily found by a compiler: study your compiler warnings carefully, since they are usually there for a reason!
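As a minimal sketch of the fix, the snippet below keeps the result of find() in string::size_type. The helper names (has_delimiter, has_delimiter_truncating, check_delimiter_helpers) are made up for illustration and stand in for the get_some_string() / handle_delimiter() calls above; the truncating variant is included only for comparison.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not from the original program: reports whether
// the delimiter ':' occurs anywhere in str.
bool has_delimiter(const std::string& str)
{
    // size_type can represent npos exactly, so this comparison
    // behaves the same on x86 and x86_64.
    std::string::size_type delimPos = str.find(':');
    return delimPos != std::string::npos;
}

// The pattern from the snippet above, kept for comparison. The explicit
// cast only makes the narrowing visible; the behaviour is unchanged.
bool has_delimiter_truncating(const std::string& str)
{
    unsigned int delimPos = static_cast<unsigned int>(str.find(':'));
    return delimPos != std::string::npos;
}

// Small self-check showing how the two helpers differ.
void check_delimiter_helpers()
{
    assert(has_delimiter("key:value"));
    assert(!has_delimiter("no delimiter here"));

    // The truncating version wrongly reports a delimiter exactly when
    // unsigned int is narrower than size_type (as on x86_64 with g++).
    const bool narrower = sizeof(unsigned int) < sizeof(std::string::size_type);
    assert(has_delimiter_truncating("no delimiter here") == narrower);
}
```

Calling check_delimiter_helpers() from main() should pass on both kinds of system, because the last assertion adapts to the relative sizes of the two types.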
Pointers

A bunch of nastier errors occur because pointers on a 64-bit system are, naturally enough, 64 bits wide. Any code which does horrible low-level things involving casting pointers to integers will probably break in a rather nasty fashion, since an int is still only 32 bits wide.
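If a pointer really does have to be stored in an integer, a safer pattern is to use an integer type that is meant to hold one. The sketch below assumes a C++11 compiler, where std::uintptr_t is available from <cstdint> on mainstream platforms (older compilers offer uintptr_t through stdint.h); the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Round-trips a pointer through std::uintptr_t, an integer type intended
// to be wide enough to hold any object pointer.
bool round_trip_is_lossless(const void* p)
{
    const std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<const void*>(bits) == p;
}

// What code written for x86 often does instead: store the address in an
// unsigned int. Returns true only if no address bits were lost.
bool survives_unsigned_int(const void* p)
{
    const std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    const unsigned int narrowed = static_cast<unsigned int>(bits);
    return static_cast<std::uintptr_t>(narrowed) == bits;
}

// Small self-check for the two helpers.
void check_pointer_round_trip()
{
    int value = 0;
    assert(round_trip_is_lossless(&value));

    // When unsigned int is as wide as a pointer (typical x86), nothing is lost.
    if (sizeof(unsigned int) >= sizeof(void*))
        assert(survives_unsigned_int(&value));
}
```

On x86_64, survives_unsigned_int() will typically return false for stack and heap addresses, although that depends on where the operating system happens to place them.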
Sizes of types, for the record

The trivial piece of C++ code here shows the relevant sizes of various types. For a 64-bit amd64 system using g++, I have the following:

$ ./test_data_sizes
Integer types:
sizeof(char) = 1
sizeof(short) = 2
sizeof(int) = 4
sizeof(long) = 8
sizeof(long long) = 8

Pointers:
sizeof(void*) = 8

Floating point types:
sizeof(float) = 4
sizeof(double) = 8
sizeof(long double) = 16

Sizes from stddef.h:
sizeof(size_t) = 8
sizeof(ptrdiff_t) = 8

On the other hand, when compiled with g++ on a 32-bit system (i686) we get the following:

$ ./test_data_sizes
Integer types:
sizeof(char) = 1
sizeof(short) = 2
sizeof(int) = 4
sizeof(long) = 4
sizeof(long long) = 8

Pointers:
sizeof(void*) = 4

Floating point types:
sizeof(float) = 4
sizeof(double) = 8
sizeof(long double) = 12

Sizes from stddef.h:
sizeof(size_t) = 4
sizeof(ptrdiff_t) = 4

Of particular relevance, the sizes of long, void* and the types from stddef.h differ between the two systems (as does long double). Note that these results may also depend on the compiler and operating system, not just the architecture. I've been told that VC++ differs from g++ with regard to sizeof(long): 64-bit Windows keeps long at 4 bytes.
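The original test program is not reproduced on this page. A sketch along the following lines should produce a report in the same layout as the listings above; the names report, describe_sizes and check_sizes_report are my own, and long long assumes C++11 or a compiler extension such as g++'s.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Appends one "sizeof(name) = N" line to the report.
template <typename T>
void report(std::ostringstream& out, const char* name)
{
    out << "sizeof(" << name << ") = " << sizeof(T) << "\n";
}

// Builds a report in the same layout as the listings above.
std::string describe_sizes()
{
    std::ostringstream out;
    out << "Integer types:\n";
    report<char>(out, "char");
    report<short>(out, "short");
    report<int>(out, "int");
    report<long>(out, "long");
    report<long long>(out, "long long");

    out << "\nPointers:\n";
    report<void*>(out, "void*");

    out << "\nFloating point types:\n";
    report<float>(out, "float");
    report<double>(out, "double");
    report<long double>(out, "long double");

    out << "\nSizes from stddef.h:\n";
    report<std::size_t>(out, "size_t");
    report<std::ptrdiff_t>(out, "ptrdiff_t");
    return out.str();
}

// sizeof(char) is 1 by definition, so this line is the same everywhere.
void check_sizes_report()
{
    assert(describe_sizes().find("sizeof(char) = 1\n") != std::string::npos);
}
```

Writing describe_sizes() to std::cout from main() gives a quick way to compare a new compiler or platform against the two listings.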
Position Independent Code and Non-x86 Architectures; -fPIC

Another problem which often rears its ugly head is related to the difference in how the two platforms handle position independent code (PIC). It usually indicates that the build system, rather than the source code, needs to be modified. Since there's been a distinct lack of information on the web about this, here's my take on the subject. I'm not an expert, so take it with a grain of salt; constructive feedback is always welcome.

Briefly, position independent code is object code which may be put at an arbitrary position in memory and executed without modification. This is in contrast to (relocatable) position dependent code which needs an extra translation step. One upshot is that PIC is appropriate for use in shared objects.

In order to generate PIC with gcc, the option -fPIC should be used. Portability issues arise when a build system assumes that position dependent code can be linked into a shared library. Because of architectural peculiarities this assumption is fine on an x86 system, but invalid on many others, including amd64.
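When debugging a build system it can help to check whether a given object file was really compiled as PIC. The sketch below relies on the __PIC__ macro, which gcc predefines when -fpic or -fPIC is in effect; the function names are hypothetical. Toolchains that build position independent executables by default may define the macro even without an explicit -fPIC, so treat the result as a hint only.

```cpp
#include <cassert>
#include <cstring>

// Reports how this translation unit was compiled, based on the __PIC__
// macro that gcc predefines when -fpic or -fPIC is in effect.
const char* pic_mode()
{
#if defined(__PIC__)
    return "PIC";
#else
    return "non-PIC";
#endif
}

// The result depends on the compiler flags, so only the set of
// possible answers is checked here.
void check_pic_mode()
{
    const char* mode = pic_mode();
    assert(std::strcmp(mode, "PIC") == 0 || std::strcmp(mode, "non-PIC") == 0);
}
```

Compiling the same file with and without -fPIC and printing pic_mode() is a quick way to confirm that a flag set in the build system actually reaches the compiler.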
An Example

Suppose I decide to write a library containing a single function, writeHello(), the source files being as follows:

// hello.h
#ifndef HELLO_H
#define HELLO_H

void writeHello();

#endif // HELLO_H

// hello.cpp
#include <iostream>
#include "hello.h"

void writeHello()
{
    std::cout << "Hello World!\n";
}

If we attempt to compile these into a shared library without using -fPIC, we get the following mess:

$ g++ -c -o hello.o hello.cpp
$ g++ -shared -o libhello.so hello.o
/usr/lib/gcc/x86_64-pc-linux-gnu/3.4.4/../../../../x86_64-pc-linux-gnu/bin/ld:
hello.o: relocation R_X86_64_32 against `a local symbol' can not be used when
making a shared object; recompile with -fPIC
hello.o: could not read symbols: Bad value
collect2: ld returned 1 exit status
$

On the other hand, if we include an -fPIC in the compile options, everything is fine, and out pops a shiny new shared object:

$ g++ -c -fPIC -o hello.o hello.cpp
$ g++ -shared -o libhello.so hello.o

Granted it doesn't do much... but not to worry ;)
Shared and static libraries

The example above is a very simple case of where you can get into trouble with non-PIC object code. A more complicated case involves attempting to link static libraries into shared libraries. Suppose I created a bunch of static libraries, libfoo.a, libbar.a and libbaz.a and I wanted to link them into a shared object libfred.so. On x86 it is permissible to create the static libraries out of non-PIC objects. However, the following will fail on the amd64 architecture unless the static libraries are built with -fPIC:

$ g++ -shared -o libfred.so $fred_objects -lfoo -lbar -lbaz

Relevance to automated build tools
Autoconf et al.

The nasty mess that is the GNU autotools has a uniform way of dealing with position independent code: passing the flag --with-pic to configure will cause static libraries to be built as PIC, which avoids the library issue mentioned above. The autobook briefly mentions this, though I haven't played with it myself, so I'm not sure whether the option is relevant if libtool is not being used.
SCons

SCons automatically determines whether to use -fPIC when creating libraries using the StaticLibrary or SharedLibrary builder methods. However, you've still got to be careful with linking static libraries made from non-PIC objects into shared libraries. This page resulted from my search for a clean solution to this problem for use with the Aqsis renderer project.

After learning much more about SCons than I'd originally intended, I finally found the following reasonably clean solution. The idea is to override the default Library and StaticLibrary builders of the build environment so that they compile PIC code. This may be done by passing src_builder = 'SharedObject' rather than 'StaticObject' to the Builder factory function:

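import platform

# env is assumed to be an existing SCons construction environment.
if platform.machine() == 'x86_64':
    # Same as the stock static library builder, except that sources are
    # compiled with the SharedObject builder, i.e. as PIC.
    picLibBuilder = Builder(action = Action('$ARCOM'),
                            emitter = '$LIBEMITTER',
                            prefix = '$LIBPREFIX',
                            suffix = '$LIBSUFFIX',
                            src_suffix = '$OBJSUFFIX',
                            src_builder = 'SharedObject')
    env['BUILDERS']['StaticLibrary'] = picLibBuilder
    env['BUILDERS']['Library'] = picLibBuilder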

For some projects it may be desirable to build static libraries from both PIC and non-PIC objects. In that case, instead of overriding the Builder objects as above, you'd probably want to define some new ones and call them explicitly to produce PIC libraries. You'd also want to modify the library name so that both flavours of static library can coexist without name clashes. I gather the standard way to do this is to append _pic to the library name. Putting this together, you'd have something like:

if platform.machine() == 'x86_64':
    picLibBuilder = Builder(action = Action('$ARCOM'),
                            emitter = '$LIBEMITTER',
                            prefix = '$LIBPREFIX',
                            suffix = '_pic$LIBSUFFIX',
                            src_suffix = '$OBJSUFFIX',
                            src_builder = 'SharedObject')
    env['BUILDERS']['StaticPicLibrary'] = picLibBuilder
    env['BUILDERS']['PicLibrary'] = picLibBuilder

Relevant Links

* A helpful discussion about -fPIC on the FreeBSD ports list
* Shared objects for the object disoriented: an introduction to shared objects at IBM developerWorks.
