Tuesday, December 16, 2008

Using CodeSourcery to build a toolchain

Though it looks quite promising, building a cross toolchain with CodeSourcery turned out to be very tough.

They provide prebuilt binary toolchains, but none of them includes the gcj compiler, so I had to use their build scripts to build my own toolchain.

The script they provide wasn't designed to run on an arbitrary machine; in fact, it does not run on any machine other than the author's. There are many hardwired directories, which makes a custom build very tough. Though I tried to understand it, in the end some unexplainable errors appeared.

In conclusion, CodeSourcery is not currently an option for building custom toolchains.

Testing the MIKA JVM

Downloading Mika requires Subversion, as there seem to be no tarballs available for download.
Subversion needs outbound access on port 3690. Some other tools are required as well.

Mika's repository is:

The dependencies are: jikes and jam.

Jikes is a Java compiler from IBM. It is available as an RPM for FC4; no problems installing it.

Jam is a set of Ant scripts for building Java applications. Just run configure, make, and make install. However, my FC4 has a broken link: /etc/alternatives/java_sdk_exports.
I had to edit the link to point to the installed JDK: /usr/lib/jvm/java-1.4.2-gcj
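Relinking can be done with ln -sfn. The sketch below runs in a scratch directory so it can be tried anywhere; on the real FC4 box LINK would be /etc/alternatives/java_sdk_exports and JDK_DIR would be /usr/lib/jvm/java-1.4.2-gcj:

```shell
# Scratch-dir sketch of re-pointing the broken alternatives link.
# On the real system: LINK=/etc/alternatives/java_sdk_exports,
#                     JDK_DIR=/usr/lib/jvm/java-1.4.2-gcj
SANDBOX=$(mktemp -d)
JDK_DIR=$SANDBOX/java-1.4.2-gcj
LINK=$SANDBOX/java_sdk_exports
mkdir -p "$JDK_DIR"
ln -sfn "$JDK_DIR" "$LINK"   # -f replaces any stale link, -n avoids descending into it
readlink "$LINK"             # verify: prints the JDK directory path
```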

To build Mika, I had to edit the Configuration/cpu/arm file, where the toolchain executables are defined. It wrongly assumes that the compiler is arm-linux-gcc, ignoring the CC variable.

To run the build process, type:
ant -DPLATFORM=default -DJAM.PLATFORM=arm-linux

Results are similar to Skelmir's VM, but with a different balance: setVisible is 3 times faster, but screen creation is 3 times slower. The net performance is more or less the same.

Mika looks quite lightweight in terms of the flash footprint required.

Using crosstool-ng

The objective is to build a valid toolchain for ARM that supports gcj with the GTK peer classes.
The latest gcc (version 4.3) is required, as it bundles the latest GNU Classpath version, which is still evolving.

I tried several options to build a toolchain (crosstool, CodeSourcery, etc.), but crosstool-ng is the only one that was able to build a 4.3 toolchain without problems. The original crosstool could build a toolchain only up to gcc 4.1.1/glibc 2.3.2; newer versions of gcc or glibc did not compile.

Building and installing crosstool-ng is very straightforward. Create a temporary directory, untar the tarball (crosstool-ng-1.3.0) and run:

mkdir /usr/crosstool-ng (as root)
chown <your-user> /usr/crosstool-ng (as root)

./configure --prefix=/usr/crosstool-ng
make install

and then add /usr/crosstool-ng/bin to the PATH variable.

The toolchain to be built is configured by running:
ct-ng menuconfig

There are some important hints:

1. The build tuple must be:
This does not mean that links like i686-pc-linux-gcc -> gcc must be created: crosstool-ng already creates its own wrapper scripts for this.

2. EABI
This defines how code is generated according to specific binary-interface conventions. gcc 4.3 is able to generate EABI-compliant code. All the user-space code must be compiled with the same EABI version, and the kernel must be compiled with EABI support if the user-space code is EABI.
A toolchain may be created with EABI support, but this does not necessarily mean that the code it generates is EABI; some gcc flags control this. The glibc library was compiled using EABI, therefore the kernel must be compiled with EABI support, otherwise the 'init' program would not be loaded by the kernel.

3. Kernel headers
If we are using a new kernel (i.e. 2.6.1 or greater), then the best option is to use the sanitized kernel headers provided by the kernel's makefile. Crosstool-ng automatically invokes this makefile to generate the sanitized header set. By the way, using the latest gcc (4.3) requires a late glibc (2.6 or 2.7), which in turn requires a 2.6 kernel.

4. glibc ports
If the option "use glibc ports" is not enabled, the glibc compilation may fail with an error like "target arm is not supported". As of glibc 2.5, the architecture-dependent support for some targets (including ARM) is distributed as a separate "ports" tarball.

5. Dependencies
gcc 4.3 requires the MPFR library version 2.3.2 or greater, which in turn requires automake > 1.10 and autoconf > 2.60. There are no RPMs of these versions for FC4, so I had to download the sources and install them manually.

6. Enabling gcj/java support
The Java language must be enabled.
Also, the specific Java/GTK support must be enabled in the gcc build flags:


--enable-java-awt=gtk --enable-gtk-cairo --disable-gtktest
--x-libraries=/lib --x-includes=/include/X11

Note that the GTK peer classes provided by GNU Classpath require some X libraries, even though the GTK build used here has the DirectFB backend, not the Xlib one. It seems GTK still uses some X libraries for rendering-related functionality. Hopefully, these libraries are not required on the target system. In particular, the libXtst, libart, libXrender and libXrandr libraries are needed.
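For reference, in my crosstool-ng setup these flags went in through the compiler's extra-config option. A sketch of the resulting .config fragment (option names quoted from memory and may differ between crosstool-ng versions):

```
CT_CC_LANG_JAVA=y
CT_CC_EXTRA_CONFIG="--enable-java-awt=gtk --enable-gtk-cairo --disable-gtktest --x-libraries=/lib --x-includes=/include/X11"
```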

It is important to set the pkg-config variables, so that the configure scripts of gcc and its subprojects are able to find the GTK libraries:

export PKG_CONFIG_PATH=/usr/lib/pkgconfig
export PKG_CONFIG_LIBDIR=/usr/lib/pkgconfig

The pkgconfig files (.pc) must be placed in these directories. Note that the 'prefix' keys in these files must be set to the location of the installed GTK files (/usr here).
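A quick way to retarget the prefix is sed, shown here on a scratch copy (on the real system the file would live in /usr/lib/pkgconfig, e.g. gtk+-2.0.pc, and prefix would become /usr):

```shell
# Scratch copy of a minimal .pc file with a wrong prefix
PC=$(mktemp --suffix=.pc)
printf 'prefix=/opt/old\nlibdir=${prefix}/lib\nincludedir=${prefix}/include\n' > "$PC"
# Point 'prefix' at the installed GTK location
sed -i 's|^prefix=.*|prefix=/usr|' "$PC"
grep '^prefix=' "$PC"   # prints: prefix=/usr
```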

There are GTK include files in several different locations, such as /usr/lib/glib-2.0/include.

7. Restarting crosstool-ng
Crosstool-ng has a nice feature that allows the build to be restarted at a specific stage, which speeds up the trial-and-error build process. However, this functionality must be enabled in the menuconfig, and it has some limitations:
1. Changing some configuration items (e.g. gcc build flags) has no effect if the process is restarted; the entire build process must be started from the beginning.
2. Do not interrupt crosstool-ng while it is downloading the tarballs. If you do, clean the tarballs directory, as some files may be corrupted.
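For reference, the restart machinery is driven from the command line. Assuming the "save intermediate steps" debug option has been enabled in menuconfig, the usage is roughly as follows (step names vary between versions and configurations, so list them first):

```shell
ct-ng list-steps                   # print the step names valid for this configuration
ct-ng build RESTART=libc_headers   # restart a previous build at the given step
ct-ng build STOP=cc_core_pass1     # stop the build right after the given step
```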

8. X libraries
Some GTK peer classes cannot be compiled because they require a GTK library built with Xlib support. For instance, the cairo-xlib.h and gdk/gdkx.h include files are required.

After all these issues were solved, a toolchain with gcj/GTK support was built without errors.

Thursday, November 6, 2008

SIGIO terminates program

I have compiled the IPC/nt library (inter-process communications) for the i686 architecture, on a CentOS 5.2 release. Compilation was ok, but at runtime, the programs that link with this library die unexpectedly when receiving data from the IPC layer. The message shown is "I/O possible".

Of course, this did not happen on the original embedded systems this library was originally developed for, so I assume something has changed. There are two possibilities:

1. getpid() no longer returns the pid of the calling thread, but the pid of the entire process.
2. the default action of the SIGIO signal has changed, causing the program to terminate rather than ignoring the signal.

The mechanism that IPC/nt uses to work with asynchronous sockets is the following: a thread listening for SIGIO signals is created. Any asynchronous socket that is opened sets the owner of its SIGIO signals to that thread's pid (hence the getpid issue). When data arrives (I/O is possible), a signal is sent to the thread, which then wakes up the other threads listening on the sockets using a mutex/condition variable.

Then I came across this note that explains what is happening (though it does not explain exactly what has changed):
If a nonzero value is given to F_SETSIG in a multi-threaded process running with 
a threading library that supports thread groups (e.g., NPTL), then a positive value
given to F_SETOWN has a different meaning: instead of being a process ID identifying
a whole process, it is a thread ID identifying a specific thread within a process.
Consequently, it may be necessary to pass F_SETOWN the result of gettid instead
of getpid(2) to get sensible results when F_SETSIG is used. (In current Linux threading
implementations, a main thread’s thread ID is the same as its process ID. This means that
a single-threaded program can equally use gettid(2) or getpid(2) in this scenario.)

So, I replaced the call to getpid() with a call to gettid(), which I have verified is also backwards compatible. It works fine.

Tuesday, July 22, 2008

Testing qemu

The goal is to run ARM software on a Linux PC.
I download the latest qemu, which is 0.9.1 today.

There isn't much fun here, just the usual stages:

./configure --target-list=arm-linux-user
make install (as root)

Qemu requires the target's run-time libraries to run the binaries. The quickest way is to explode an entire package release into a root directory:

# tpkg-explode main tempdir

Warning: the postinstall scripts of the packages are not executed. This is not very important for most of the packages, except for the library cache file. So I manually edit etc/ld.so.conf and add the library paths there (/usr/TCSL/lib).

Then I rebuild the library cache:
# ldconfig -r tempdir

and now I can run any ARM program:

# qemu-arm -L tempdir tempdir/usr/TCSL/bin/unsls

It works great, but it does not detect memory overwrites or leaks (who told me that it could?).
It's fine, but not as a memory checker for ARM, as I intended.

Cross-compiling Valgrind

Valgrind is available for architectures other than the i386/PC. Unfortunately it has not been ported to the ARM processor family, but it can still be useful for the PPC targets I use.

First, I download the latest version of valgrind (today is 3.3.1).

Before configuring, a word of advice: the prefix option does not work as usual. The absolute prefix path gets hardwired into the valgrind code so that valgrind is able to find its other tools, such as memcheck. So, the prefix path must be chosen with the target distribution in mind. As I plan to install valgrind on a network directory and mount it via NFS from the target unit, I set 'prefix' to the name of the mountpoint.

The reason for not installing valgrind on the target is that it is quite big (more than 50 MB), as valgrind requires that its own binary files are not stripped. This is explained in the README_PACKAGERS file.

Apart from setting the usual CC variable:
./configure --host=ppc-linux --prefix=/ppc/tools/valgrind --disable-tls
make install

TLS is not provided by my kernel version, so it is disabled here.

At this point valgrind runs, but it reports an error that it is unable to do some binary code substitutions (e.g. of the strlen function). It suggests that the dynamic linker (/lib/ld-x.y.z.so) may be stripped, which is true.

Getting an unstripped version of 'ld' requires rebuilding the entire glibc RPM. For my platform, glibc is installed as an RPM, so I download and install the source RPM.

The binary files get automatically stripped by rpmbuild at the end of the %install stage. The "__os_install_post" macro contains the script that runs at that point. This scriptlet calls the brp-strip-shared script, which lives under /opt/eldk/usr/lib/rpm and strips all the shared libraries found in the installation directory. Unfortunately, my installation of ppc-rpm does not allow this scriptlet to be edited, so the only option is to add a condition at the top of the brp-strip-shared script: if the DONT_STRIP_SHARED variable is set, exit immediately. The spec file must be edited so that this variable is defined.
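The guard itself is just a few lines of shell (the variable name DONT_STRIP_SHARED is my own choice; the scratch file below stands in for the real script so the behaviour can be exercised anywhere):

```shell
# Stand-in for /opt/eldk/usr/lib/rpm/brp-strip-shared with the guard added
GUARD=$(mktemp)
cat > "$GUARD" <<'EOF'
# Guard: skip stripping entirely when the spec file asks us not to
if [ -n "$DONT_STRIP_SHARED" ]; then
    exit 0
fi
echo "stripping shared libraries..."   # stands in for the original body
EOF
DONT_STRIP_SHARED=1 sh "$GUARD"   # prints nothing: stripping skipped
sh "$GUARD"                       # prints the stripping message
```

In the spec file, defining the variable (for instance with export DONT_STRIP_SHARED=1 near the top of %install) then turns the scriptlet into a no-op.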

Then I modify the glibc.spec file: at the end of the %install section there is a piece of code that strips all shared libraries except the 'libpthread' library. I just add the 'ld' library as an exception to the strip process, and then I rebuild the RPM.

export PATH=/opt/eldk/usr/ppc-linux/bin/:$PATH
ppc-rpmbuild -ba glibc.spec

Do not forget to increase or change the version of the RPM so that it installs on the target successfully.

A simpler workaround is to copy the ld-x.y.z.so file manually; in that case, use 'cp -f', as the file is always in use.