4.12 Binary Formats

To understand why FreeBSD uses the elf(5) format, the three currently “dominant” executable formats for UNIX® must be described: a.out(5), COFF, and ELF.

FreeBSD comes from the “classic” camp and used the a.out(5) format, a technology tried and proven through many generations of BSD releases, until the beginning of the 3.X branch. Though it was possible to build and run native ELF binaries and kernels on a FreeBSD system for some time before that, FreeBSD initially resisted the “push” to switch to ELF as the default format. Why? When Linux made its painful transition to ELF, it did so because of its inflexible jump-table based shared library mechanism, which made the construction of shared libraries difficult for vendors and developers. Since the ELF tools offered a solution to the shared library problem and were generally seen as “the way forward”, the migration cost was accepted as necessary and the transition was made. FreeBSD's shared library mechanism is modeled more closely on the SunOS™ style shared library mechanism and is easy to use, so FreeBSD did not face the same pressure to switch.

So, why are there so many different formats? Back in the PDP-11 days when simple hardware supported a simple, small system, a.out was adequate for the job of representing binaries. As UNIX was ported, the a.out format was retained because it was sufficient for the early ports of UNIX to architectures like the Motorola 68k or VAXen.

Then some hardware engineer decided that if he could force software to do some sleazy tricks, a few gates could be shaved off the design and the CPU core could run faster. a.out was ill-suited for this new kind of hardware, known as RISC. Many formats were developed to get better performance from this hardware than the limited, simple a.out format could offer. COFF, ECOFF, and a few others were invented and their limitations explored before the industry settled on ELF.

In addition, program sizes were getting huge while disks and physical memory were still relatively small, so the concept of a shared library was born. The virtual memory system became more sophisticated. Although each advancement was implemented using the a.out format, the format was stretched further with each new feature. People also wanted to dynamically load things at run time, or to junk parts of their program after the init code had run in order to save core memory and swap space. Languages became more sophisticated and people wanted code called automatically before the main() function. Lots of hacks were done to the a.out format to allow all of these things to happen, and they basically worked for a time. Eventually, a.out was not up to handling all these problems without an ever-increasing overhead in code and complexity. While ELF solved many of these problems, it would be painful to switch from a system that basically worked. So ELF had to wait until it was more painful to remain with a.out than it was to migrate to ELF.

As time passed, the tools from which FreeBSD derived its build tools, especially the assembler and loader, evolved in two parallel trees. The FreeBSD tree added shared libraries and fixed some bugs. The GNU folks who originally wrote these programs rewrote them and added simpler support for building cross compilers and plugging in different formats. Those who wanted to build cross compilers targeting FreeBSD were out of luck, since the older sources that FreeBSD had for as and ld were not up to the task. The new GNU tool chain (binutils) supports cross compiling, ELF, shared libraries, and C++ extensions. In addition, many vendors release ELF binaries, and FreeBSD should be able to run them.

ELF is more expressive than a.out and allows more extensibility in the base system. The ELF tools are better maintained and offer cross compilation support. ELF may be a little slower than a.out, but trying to measure it can be difficult. There are also numerous details that are different between the two such as how they map pages and handle init code. In time, support for a.out will be moved out of the GENERIC kernel, and eventually removed from the kernel once the need to run legacy a.out programs is past.
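To make the difference between the two formats concrete, the following is a minimal sketch of how a binary's format might be identified from its first few bytes. It is not how the FreeBSD kernel or file(1) does it. The function names `classify_header` and `classify_file` are hypothetical. The ELF check relies on the standard identification bytes (the magic `\x7fELF`, followed by the class and data encoding bytes). The a.out check is only a heuristic: it assumes the classic magic numbers sit in the low 16 bits of a little-endian first word, which will not hold for every a.out variant.

```python
import struct

# Standard ELF identification bytes: 0x7f followed by "ELF".
ELF_MAGIC = b"\x7fELF"

# Classic a.out magic numbers (octal). Matching these is a heuristic only.
AOUT_MAGICS = {0o407: "OMAGIC", 0o410: "NMAGIC", 0o413: "ZMAGIC", 0o314: "QMAGIC"}


def classify_header(header: bytes) -> str:
    """Guess the executable format from the leading bytes of a file."""
    if header[:4] == ELF_MAGIC:
        if len(header) < 6:
            return "ELF (truncated ident)"
        # Byte 4 is the ELF class, byte 5 is the data encoding.
        bits = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")
        order = {1: "little-endian", 2: "big-endian"}.get(header[5], "unknown byte order")
        return f"ELF, {bits}, {order}"
    if len(header) >= 4:
        # Assumption: magic in the low 16 bits of a little-endian word.
        (word,) = struct.unpack("<I", header[:4])
        name = AOUT_MAGICS.get(word & 0xFFFF)
        if name:
            return f"possibly a.out ({name})"
    return "unknown"


def classify_file(path: str) -> str:
    """Read the start of a file and classify it."""
    with open(path, "rb") as f:
        return classify_header(f.read(16))
```

Calling `classify_file("/bin/ls")` on a current system would be expected to report an ELF binary. A "possibly a.out" result should be treated as a hint only, since the two-byte magic values can also appear by chance at the start of unrelated files.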