Download a prebuilt system image, boot it up under the emulator, and
compile stuff natively for a target.
Go here and download the appropriate
system-image-$ARCH.tar.bz2 for your $TARGET, extract it, cd into it,
and ./run-emulator.sh to boot it under qemu.
Alternately, you can run the script ./dev-environment.sh,
which is a wrapper around run-emulator.sh that feeds QEMU extra options to add
memory (256 megs) and writeable disk space (a blank 2 gigabyte disk image
mounted on /home) to provide a more capable development environment.
The system images contain native compiler toolchains, but if you install
distccd on the host and add the appropriate cross compiler to your host's
$PATH, the ./run-emulator.sh script will detect this and set up the system
image to automatically use distcc to call out to the cross compiler through
the virtual network, speeding up native builds significantly.
Build your own cross compilers and system images from source, using
the build scripts.
Go to the downloads directory, grab the most recent
release tarball, extract it, and run ./build.sh to list
the available targets. Then run ./build.sh $TARGET to compile
the one you like. The results wind up in the "build" directory.
The build scripts are written in bash, and fairly extensively commented.
All the scripts at the top level are designed to be run directly, and
build.sh is just a wrapper script that calls them in order. The less commonly
used scripts in sources/more are also designed to be run directly.
A large number of variables can be set to configure the build, either
by modifying the file "config" (which documents them all) or by exporting
them as environment variables.
To grab the latest development version of the build scripts out of the
source control system, go to the
mercurial archive.
If you don't want to install mercurial, you can grab a
tarball of the current code at
any time.
Building System Images
The build scripts compile the system images from source code. Along
the way, they create the cross compilers and root filesystem tarballs too.
If you just want to use the prebuilt binary tarballs to mess around with
native environments for various targets, you don't need to care about the
build scripts.
But if you want to understand how it all works, and how to reproduce it,
then you do.
Start by running (or reading) "build.sh"; it calls everything else.
Q: How do I add $PACKAGE to my system image's root filesystem?
A: Either use setup-chroot to copy the root filesystem into a writeable
chroot, or run the build scripts with SYSIMAGE_TYPE=ext2 (and probably
SYSIMAGE_HDA_MEGS=2048) to create a writeable ext2 system image instead of the default
read-only squashfs.
The setup-chroot command is a shell script in each system image's /sbin
directory which copies the squashfs contents into a writeable chroot
directory, and chroots into that directory. Since dev-environment.sh
creates a 2 gigabyte ext3 image and mounts it on /home, you should have
plenty of space under there to do:
setup-chroot /home/work /bin/ash
The first time you run this (I.E. when the directory you want to chroot into
doesn't exist), setup-chroot copies the root filesystem into it.
Afterwards, setup-chroot uses "mount --bind" to copy the host filesystem's
mounts (/proc, /sys, /tmp, and so on), then chroots into the new directory
to run your command. When the chroot exits, setup-chroot calls "zapchroot"
to unmount all those sub-mounts.
If you don't specify which command to run, chroot runs /bin/sh, which by
default points to bash 2.05b built without ncurses. This is good for running
scripts but is not the world's friendliest interactive shell.
The other thing you could do is go back to the build scripts and
build a writeable system image by specifying the environment variable
"SYSIMAGE_TYPE=ext2" instead of the default squashfs. You may also want
to set "SYSIMAGE_HDA_MEGS=2048".
Aboriginal Linux builds squashfs images by default, and the prebuilt binary
tarballs in
the downloads/binaries directory are built with the default values. Squashfs
is a read-only compressed filesystem, which means it's pretty durable (you
never need to fsck it), but also a bit limiting. The dev-environment.sh
script attaches a 2 gigabyte ext2 image to /dev/hdb (which is mounted on
/home) so you always have writeable space to build stuff in, but that doesn't
let you modify the root filesystem on /dev/hda: you can't install packages
you build into /bin and such on a read-only root filesystem.
The "SYSIMAGE_TYPE" and "SYSIMAGE_HDA_MEGS" config entries let you change
the default system image type generated by the system-image.sh script. You
can edit the file "config" or specify them as environment variables, ala:
SYSIMAGE_TYPE=ext2 SYSIMAGE_HDA_MEGS=2048 ./build.sh $TARGET
That creates a 2 gigabyte ext2 image, which you can boot into and install
packages natively under, using the "./run-from-build.sh $TARGET" script.
If you've already built a system image, you can repackage the existing root
filesystem by re-running system-image.sh (instead of the whole build.sh).
As always, your new system image is created in the "build" subdirectory.
Note: since this is a writeable image, you'll have to fsck it. You can
use "tune2fs -j" to turn it into an ext3 image to reduce the need for this.
Q: Why did the $NAME package build die
with a complaint that it couldn't find $PREREQUISITE, even though that's
installed on the host? (For example, distcc and python.)
Because you skipped the host-tools.sh step, and because installing a package
on the host isn't the same as installing it on the target.
Even though host-tools.sh is technically an optional step, your host has to
be carefully set up to work without it.
Not only does host-tools.sh add prerequisite packages your build requires,
it _removes_ everything else from the $PATH that might change the behavior of
the build. Without this, the ./configure stages of various packages will
detect that libtool exists, or that the host has Python or Perl installed,
and configure the packages to make use of things that the cross compiler's
headers and libraries don't have, and that the target root filesystem
may not have installed.
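To make the effect concrete, here is a minimal sketch of a sanitized $PATH.
It assumes the sanitized directory is simply a set of symlinks to approved
commands; the temporary directory, the "visible" helper, and the
three-command list are made up for illustration and are not what
host-tools.sh actually installs.

```shell
demo=$(mktemp -d)
mkdir "$demo/host"

# Expose only an explicit (here: made-up) list of commands.
for cmd in sh ls cat; do
  ln -s "$(command -v "$cmd")" "$demo/host/$cmd"
done

# Hypothetical helper: report whether a command is reachable
# when $PATH contains nothing but the sanitized directory.
visible() {
  ( PATH="$demo/host"; command -v "$1" >/dev/null 2>&1 && echo yes || echo no )
}

visible ls     # linked in, so a ./configure run with this $PATH can find it
visible perl   # not linked in, so ./configure cannot detect the host's copy
```

With a $PATH like this, a ./configure script can only detect what was
deliberately put there, no matter what else the host has installed.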
Q: How do I get better log output from the build?
Get a verbose, single-processor log of the build output.
When something goes wrong, re-run your build with a couple extra variables,
and log the output with "tee":
BUILD_VERBOSE=1 CPUS=1 ./build.sh 2>&1 | tee out.txt
The shell has a nice syntax for exporting variables just for a single
command, by putting the command to run after the assignment. Doing
that doesn't pollute your environment by leaving CPUS or BUILD_VERBOSE
exported, but it exports them just for the new "build.sh" process it
launches. And redirecting stderr to stdout and piping the result into "tee"
captures the output so you can examine it with less or vi.
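If the per-command assignment syntax is unfamiliar, this small sketch shows
both halves of the trick without running a real build. The "sh -c" command
is only a stand-in for build.sh, and the temporary log file replaces out.txt.

```shell
unset CPUS BUILD_VERBOSE
log=$(mktemp)

# Stand-in for ./build.sh: prints one line to stdout and one to stderr.
CPUS=1 BUILD_VERBOSE=1 sh -c '
  echo "stdout: CPUS=$CPUS"
  echo "stderr: BUILD_VERBOSE=$BUILD_VERBOSE" >&2
' 2>&1 | tee "$log"

# The assignments applied only to that one command.
echo "afterwards: CPUS=${CPUS:-unset}"
```

Both lines land in the log because stderr was merged into stdout before the
pipe, and the calling shell's environment is left untouched.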
BUILD_VERBOSE undoes the "pretty printing" of the linux kernel and uClibc,
and makes a few other build steps produce more explicit output.
CPUS controls the number of tasks make should run in parallel. The default
value is the number of processors on the system, times 1.5. (So a 4 processor
system runs 6 processes.) Making it single processor gives you much more
readable output, because a single-processor build stops more reliably at the
point where it hit a problem, rather than at some random later point forcing
you to scroll back quite a ways to find the error. It also shouldn't
interleave the output of multiple parallel commands.
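The default works out like this in shell integer arithmetic. This is only a
sketch of the rule as described here: "default_cpus" is a made-up helper
name, and the build scripts may count processors and round differently.

```shell
# Hypothetical helper: "processors times 1.5", rounded down.
default_cpus() {
  echo $(( $1 * 3 / 2 ))
}

default_cpus 4    # prints 6, matching the example above

# To apply it to the local machine (nproc comes from GNU coreutils):
#   CPUS=$(default_cpus "$(nproc)")
```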
Use the command logging wrapper
If you need more logging detail, run more/record-commands.sh, then re-run
the build and look at the output in build/logs. (A similar "record-commands"
wrapper is available in each system image's /usr/sbin directory, to
log the commands of native builds.)
more/record-commands.sh sets up a wrapper which logs every command (and
all its arguments) run out of $PATH. It populates build/wrapdir with
symlinks for every command name currently in $PATH, all pointing to the
"wrappy" binary (built from sources/toys/wrappy.c). If you run record-commands
before running host-tools.sh it wraps the host $PATH; if you run it after
host-tools.sh it wraps the sanitized $PATH in build/host.
The wrappy binary depends on two environment variables (set up by
sources/include.sh): $WRAPPY_LOGPATH is an absolute path to the current
log file (updated by the "setupfor" function) and $OLDPATH is the $PATH to
exec the real command out of after appending the current command line to
the log.
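The real wrapper is a C program, but the idea fits in a few lines of shell.
The following is a rough sketch of the mechanism, not a copy of wrappy.c:
it wraps a single command (uname) in a temporary directory, and the
one-line-per-command log format is an assumption.

```shell
demo=$(mktemp -d)
mkdir "$demo/wrapdir"

# Shell stand-in for the wrappy binary.
cat > "$demo/wrapdir/wrappy" << 'EOF'
#!/bin/sh
# Log the command line, then exec the real command out of $OLDPATH.
name=$(basename "$0")
echo "$name $*" >> "$WRAPPY_LOGPATH"
PATH="$OLDPATH"
export PATH
exec "$name" "$@"
EOF
chmod +x "$demo/wrapdir/wrappy"

# One symlink per wrapped command name, all pointing at the wrapper.
ln -s wrappy "$demo/wrapdir/uname"

export WRAPPY_LOGPATH="$demo/log.txt" OLDPATH="$PATH"

# Run a command with the wrapper directory first in $PATH.
( PATH="$demo/wrapdir:$OLDPATH"; uname -s )

cat "$WRAPPY_LOGPATH"
```

The wrapper learns which command was requested from the name it was invoked
under, which is why a directory of symlinks to one binary is enough.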
The script "more/report-recorded-commands.sh" prints out a list of all
commands used by each build stage. (Comparing the host-tools version
to a run without host-tools can be instructive; that's the extra stuff
./configure is picking up out of the host environment.)
The record-commands wrapper is also available in the target root
filesystem's /usr/sbin directory. Run "record-commands /path/to/script"
and when it exits /tmp/record-commands-log.txt should list all the
command lines run by the script, in order.
Q: How do I run my own build snippets without editing the build scripts?
A: Use the more/test.sh script
This wrapper runs a command line in build context: the first argument
is the target to build for, and the rest of its arguments are a command line
to run as if building for that target.
Examples:
more/test.sh armv5l build_section busybox
more/test.sh mips getconfig linux
The wrapper imports sources/include.sh and calls load_target (with
NO_CLEANUP so it doesn't blank an existing output directory). This sets up
the same context for building for a given $ARCH that the build scripts use:
it adds the appropriate cross compiler to the $PATH (if it's already been
built), sets all the shell functions and environment variables,
creates the temporary directory, and so on. The wrapper then runs the rest of
the command line in the resulting context.
By default, more/test.sh acts as its own build stage called "test"
(because include.sh uses the name of the script file you're running to set a
default STAGE_NAME), so output winds up in build/test-armv5l and such. You
can override this by setting STAGE_NAME yourself, for example:
# rebuild uClibc without redoing binutils/gcc/kernel headers stages:
STAGE_NAME=simple-cross-compiler more/test.sh sparc build_section uClibc
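The naming rule can be sketched as follows. This illustrates the behavior
described above and is not the code in include.sh; "stage_dir" is a
hypothetical helper, and stripping the ".sh" suffix with basename is an
assumption about how the default name is derived.

```shell
# Hypothetical helper: $1 is the script being run, $2 is the target.
stage_dir() {
  stage=${STAGE_NAME:-$(basename "$1" .sh)}
  echo "build/$stage-$2"
}

unset STAGE_NAME
stage_dir more/test.sh armv5l     # build/test-armv5l

STAGE_NAME=simple-cross-compiler
stage_dir more/test.sh sparc      # build/simple-cross-compiler-sparc
unset STAGE_NAME
```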
Q: How do I play around with package source code?
The source code used by package builds lives in several directories, each
with a different purpose:
packages - vanilla upstream source tarballs (populated by download.sh).
sources/patches - local patches to apply to the vanilla packages.
build/packages - the package cache, clean copies of the extracted and patched source.
build/temp-$ARCH - working copies of the source configured and built for the given architecture.
Downloading
The list of source URLs is in the script download.sh, along with a list
of mirrors to check if the original URL isn't available. Those URLs are
the only place that specifies version numbers for packages, so if you want
to switch versions just point to a new URL and re-run download.sh. (You can
set SHA1= blank for the first download, and it will output the sha1sum for
the file it downloads. Cut and paste that into the download script and
re-run to confirm.)
Extracting and patching
Each script to build a package calls the shell function "setupfor"
before building the package, and "cleanup" afterwards. Conceptually,
"setupfor" extracts a tarball (from the "packages" directory),
patches it if necessary (applying all the files in "sources/patches" that
start with that package's name, which come from the aboriginal linux
repository), and cd's into the resulting directory. The function "cleanup"
does an "rm -rf" on that directory when you're done.
In practice, the infrastructure behind the scenes caches the extracted
tarballs. This optimization saves disk space, CPU time, and I/O bandwidth,
speeding up builds considerably (especially when you do a lot of them in
parallel). This optimization is designed to be easily ignored, but
understanding the infrastructure can be useful for debugging.
There are two places to look for extracted source packages: the package
cache and the working copy. The package cache (in "build/packages")
contains clean copies of all the previously extracted source tarballs, with
patches already applied. Each working copy (in an architecture's
temporary directory, "build/temp-$ARCH") is a tree of hardlinks to the
package cache that provides a directory in which to configure, build, and
install that package for a specific target.
The source in the package cache stays clean, can be re-used across multiple
builds, and is only used to create working copies. Working copies fill up
with temporary files from configure/make/install, and are normally deleted
after each successful build. If you want to look at clean source, you
want the package cache. If you want to look at the state of a failed
build to see how it was configured or re-run portions of it, you want the
working copy.
Q: What's the package cache for?
The package cache contains clean architecture-independent source code,
which you can edit, use to run modified builds and create patches, and easily
revert to its original condition. The package cache avoids re-extracting the
same tarballs over and over, but also provides a place you can make temporary
modifications to that source behind the build system's back without having to
mess around with tarballs or patch files.
The setupfor function calls "extract_package" to populate the package
cache. First extract_package checks for an existing copy of the appropriate
source directory, and when it doesn't find one it extracts the source tarballs
from the "packages" directory, applies the appropriate patches from
"sources/patches/$PACKAGENAME-*.patch", and saves the results into its own
directory (named after the package) under "build/packages".
When the package cache has an existing copy of the package, extract_package
checks the list of sha1sums in that copy's "sha1-for-source.txt" file against
the sha1sums for the tarball and for each of the patch files it needs to apply.
If the list matches, it uses the existing copy. If it doesn't match, it
deletes the existing copy out of the package cache, re-extracts the tarball,
and reapplies each patch to it.
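The check amounts to something like the sketch below. The layout of
sha1-for-source.txt is an assumption here (one hash per line, tarball first,
then each patch in order), and "cache_is_current" is a made-up name rather
than a function from the build scripts.

```shell
# Hypothetical helper: $1 is the cached source directory, the remaining
# arguments are the tarball followed by the patches that apply to it.
# Succeeds only if the recorded hashes match the files as they are now.
cache_is_current() {
  dir=$1
  shift
  [ -f "$dir/sha1-for-source.txt" ] || return 1
  expected=$(cat "$dir/sha1-for-source.txt")
  actual=$(for f in "$@"; do sha1sum < "$f" | cut -d' ' -f1; done)
  [ "$expected" = "$actual" ]
}

# Usage: if the check fails, the cached copy would be deleted and rebuilt.
#   cache_is_current build/packages/$PACKAGE "$TARBALL" $PATCHES || echo stale
```

Because only the tarball and the patches are hashed, edits made inside the
cached directory itself do not invalidate it.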
This means you can edit the copy under build/packages all you like,
and as long as you don't modify sha1-for-source.txt, replace the
tarball, or add/remove/edit any of the patches to apply to it, it
will re-use that source for subsequent builds. So go ahead and fill it
full of printf()s and test code, then when you want to go back to a clean
copy, delete the build/packages directory (either one package or the whole
thing) and let setupfor recreate it.
If you come up with changes you want to keep, you can create a patch from
the package cache this way:
# Rename the modified package directory
cd $TOP
cd build/packages
mv $PACKAGE $PACKAGE.bak
# Extract a clean copy
cd $TOP
more/test.sh host extract_package $PACKAGE
# Diff the two and write out the patch to sources/patches
cd build/packages
diff -ruN $PACKAGE $PACKAGE.bak > \
../../sources/patches/$PACKAGE-$NAME.patch
rm -rf $PACKAGE
# Run a clean test build
cd $TOP
rm -rf build/packages/$PACKAGE
./build.sh $ARCH
Where $TOP is your top level Aboriginal Linux directory, $PACKAGE is the
name of the package you're modifying, and $NAME is some unique name for your
patch. Don't forget to delete the $PACKAGE.bak directory to reclaim its disk
space when you're satisfied with your patch (or "rm -rf build/packages" to
zap the entire package cache, or just "rm -rf build" to clean
up all the temporary files).
If the environment variable EXTRACT_ALL is set, download.sh will
call extract_package on each package as soon as it confirms the tarball's
sha1sum. (The environment variable FORK makes each package download happen
in parallel, including the call to extract_package if any.) Prepopulating
the package cache this way is useful before running different architecture
builds in parallel, or when testing that new patches (added to the
sources/patches directory) apply correctly to the relevant package(s).
This means you can do the following to get a freshly extracted and patched
clean copy of all packages:
rm -rf build/packages
EXTRACT_ALL=1 ./download.sh
Q: What are working copies for?
Working copies are target-specific copies of package source where builds
actually happen. The build scripts clone a fresh working copy for each build,
then run configure, make, and install commands in the new copy. They leave the
aftermath of failed builds lying around for analysis; to keep the working
copies of successful builds around too, set the NO_CLEANUP environment
variable. If you want to cd into a source directory and re-run bits of a
previous build, use the working copy of a package's source. (You'll probably
have to add the appropriate cross compiler's bin directory to your $PATH, but
otherwise it'll usually just work.)
Working copies of source packages are cloned from the package cache
by the function "setupfor", which first calls extract_package to ensure the
package cache is up to date, then creates a directory of hardlinks to the
package cache via "cp -l" (or symlinks via "cp -s" if $SNAPSHOT_SYMLINK is
set).
The working copies use hardlinks to avoid creating redundant copies of the
file contents, which would waste I/O bandwidth and eat lots of disk space
and disk cache memory. Using hardlinks instead of symlinks for the working
copies also saves inodes and dentry cache, since each symlink consumes an
inode, but that optimization requires that the package cache and working
copies be on the same filesystem.
Linking to the package cache instead of copying it doesn't cause problems
for most packages, because most methods of modifying files used by package
builds break hardlinks or symlinks by first creating a temporary copy with
the modifications, then deleting the original and moving the copy into its
place. Modifying files that are tracked by source control also creates
spurious noise for the package's developers. Occasionally a package will
make a mistake (such as zlib 1.2.5 shipping a Makefile which is
generated by configure, and modified in place), in which case the build
has to break the link itself. (Note that editing the working copies of
source files in build/temp-$ARCH can modify the cached copy if your editor
isn't configured to break hardlinks. Usually you edit the package cache
version and let setupfor create a new working copy.)
If you want to search just the generated files and not the snapshot of
the source, use "find $PACKAGE -links 1". If you want to search just
the source files and not the generated files, that's what the package
cache is for.
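A throwaway experiment shows why the link count separates the two kinds of
file. It assumes GNU (or busybox) cp for the "-l" option; the "hello"
package and the file names are invented for the demonstration.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/packages/hello" "$demo/temp-armv5l"

# A "package cache" entry with one source file.
echo 'int main(void) { return 0; }' > "$demo/packages/hello/main.c"

# Clone a working copy as a tree of hardlinks.
cp -rl "$demo/packages/hello" "$demo/temp-armv5l/hello"

# Simulate the build generating a new file in the working copy.
echo 'object code' > "$demo/temp-armv5l/hello/main.o"

# Source files have two links (cache plus working copy); generated files
# have one, so this lists only main.o.
find "$demo/temp-armv5l/hello" -type f -links 1
```

If several architectures have working copies of the same package at once,
the source files simply have more than two links, and the test still works.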
Q: Can I use source code from repositories instead of tarballs?
Sure. Check them out into the packages directory with the name of the
package you want. The more/repo.sh script provides an example for several
packages.
If a directory such as "packages/linux" exists, the build uses that
(instead of the package cache) for the appropriate package. Note that it
will use this directory verbatim; if you want any of the patches from
sources/patches you'll have to apply them yourself.
When you'd like to build from vanilla tarballs again, either build with
IGNORE_REPOS=all or delete the directory out of packages.
Q: What's a miniconfig?
Aboriginal Linux uses "miniconfig" format for Linux and uClibc config
files.
A miniconfig is a list of interesting symbols to switch on. To create a
miniconfig, start with "allnoconfig", go into "menuconfig" to switch on all the
symbols you want, and add a "SYMBOLNAME=y" line for each symbol you had to
manually set. (You don't need to record symbols set by dependency
resolution, just the ones you'd have to set yourself to get from
allnoconfig to the config you want.)
Since the vast majority of these symbols are common between platforms, we
split our miniconfigs for linux and uClibc into a "baseconfig" file
(in the sources directory) and a list of target-specific symbols in each
target's settings file. We append these two together to get our miniconfig.
To use a miniconfig:
make allnoconfig KCONFIG_ALLCONFIG=filename
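Putting the pieces together looks roughly like this. The file names and the
four symbols are illustrative stand-ins, not the project's real baseconfig
or target settings, and the make step is left as a comment because it needs
a kernel source tree.

```shell
demo=$(mktemp -d)

# Stand-ins for the shared baseconfig and one target's extra symbols.
printf '%s\n' CONFIG_EXT2_FS=y CONFIG_TMPFS=y > "$demo/baseconfig-linux"
printf '%s\n' CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y > "$demo/target-symbols"

# Append the two to get the miniconfig.
cat "$demo/baseconfig-linux" "$demo/target-symbols" > "$demo/miniconfig"
cat "$demo/miniconfig"

# In a kernel source tree, this expands it into a full .config:
#   make allnoconfig KCONFIG_ALLCONFIG="$demo/miniconfig"
```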
The sources/toys/miniconfig.sh script compresses a full .config into
a miniconfig. To use, "cp .config tempname; ARCH=x86 $PATHTO/miniconfig.sh
tempname" and the result winds up in mini.conf.
The kernel's new defconfig format is similarly filtered to remove
uninteresting symbols, but miniconfig has several advantages over
savedefconfig:
Miniconfig is human readable.
Each miniconfig file is self-contained: it lists all the symbols we
explicitly care about enabling. The compressed defconfig files are offsets
against an external "default configuration" that changes from platform to
platform and from version to version.
Miniconfig may rely on dependency resolution to
switch on whatever other symbols are necessary to make this configuration
work, but we don't have to care what those are. We list all the symbols
we care about, in one place, where we can easily see all the features enabled
by this configuration.
Miniconfig doesn't have to switch any symbols off.
Lots of symbols default to y, and the compressed defconfig files have
to switch off symbols that are enabled by default but which this configuration
doesn't want. To do so it uses "magic comments". (The
config file format doesn't say "SYMBOL=n", it says "# SYMBOL is not set".
Despite most things starting with a # being comments, that one isn't.)
Miniconfig doesn't silently bloat over time
In each new release, new symbols show up defaulting to "y". For example,
between linux 2.6.38 and 2.6.39 the symbol "CONFIG_SUSPEND=y" showed up on all
platforms, and i686 grew CONFIG_PNP_DEBUG_MESSAGES=y and seven different
CONFIG_ACPI_* symbols all defaulting to y. A compressed defconfig switches
all these on by default, because the delta against defconfig it records
doesn't switch them off.
In miniconfig, you only get the features you requested.
The disadvantages of miniconfig are that miniconfig.sh is really slow,
and that if new required symbols show up you have to add them to the
miniconfig yourself.
Q: Didn't this used to be called Firmware Linux?
A: Yup. The name changed shortly before the 1.0 release in 2010.
The name "Aboriginal Linux" is based on a synonym for "native", as in
native compiling. It implies it's the first Linux on a new system, and also
that it can be replaced. It turns a system into something you can do
native development in, terraforming your environment so you can use it
to natively build your deployment environment (which may be something else
entirely).
Aboriginal Linux is cross compiled, but after it boots you shouldn't need
to do any more cross compiling. (Except optionally using the cross compiler
as a native building accelerator via distcc.) Hence our motto,
"We cross compile so you don't have to".
Q: ./run-emulator.sh says qemu-system-$TARGET isn't found, but I installed the qemu package and the executable "qemu" is there. Why isn't this working?
A: You're using Ubuntu, aren't you? You need to install
"qemu-kvm-extras" to get the non-x86 targets.
The Ubuntu developers have packaged qemu in an "interesting" (that is,
actively misleading) way. They've confused the emulator QEMU
with the virtualizer KVM.
QEMU is an emulator that supports multiple hardware
targets, translating the target code into host code a page at a time. KVM
stands for Kernel-based Virtual Machine, a kernel module which allows newer x86
chips with support for the "VT" extension to run x86 code in a virtual
container.
The KVM project started life as a fork of QEMU (replacing QEMU's CPU
emulation with a kernel module providing VT virtualization support, but
using QEMU's device emulation for I/O), but KVM only ever offered a
small subset of the functionality of QEMU, and current versions of QEMU have
merged KVM support into the base package. (QEMU 0.11.0 can automatically
detect and use the KVM module as an accelerator, where appropriate.)
It's a bit like the X11 project providing a "drm" module (for 3D acceleration
and such), which was integrated upstream into the Linux kernel. The Linux
kernel was never part of the X11 project, and vice versa, and pretending the
two projects were the same thing would be wrong.
That said, on Ubuntu the "qemu" package is an alias for "qemu-kvm", a
package which only supports i386 and x86_64 (because that's all KVM supports
when running on an x86 PC). In order to install the rest of qemu (support
for emulating arm, mips, powerpc, sh4, and so on), you need to install
the "qemu-kvm-extras" package (which despite the name has nothing whatsoever
to do with KVM).
Support for non-x86 targets is part of the base package when you build QEMU
from source. If you ignore Ubuntu's packaging insanity and build QEMU
from source, you shouldn't have to worry about this strangely named
artificial split.
If you want to cross compile from Cygwin or mingw or something, you're on
your own. Emulating a Linux system (thereby bypassing Windows entirely) is
fairly straightforward, assuming somebody else has already done the work of
porting the emulator. Trying to make Windows run posix apps is an unnatural
act involving ceremonial headgear and animal sacrifice just to get it to
fail the same way twice.
Q: What if I want to play with android?
The Aboriginal Linux root filesystem should work just fine under Android's
proprietary Linux kernel fork: you can extract the root-filesystem-armv5l
tarball and chroot on most android hardware and life is good.
Integrating Android userspace with Linux userspace is a bit more
complicated: Google decided they didn't want any GPL code in userspace, so they
rewrote the whole root filesystem from scratch. (The end result is missing
many features, and in doing so they opened themselves to a Java
patent lawsuit from Oratroll. We never said it was a _good_ decision.)
This means that Android userspace doesn't use glibc or uClibc, it uses
an incompatible BSD-derived library called "bionic". Think "klibc with
threading support" and you're not far off: it's missing a lot of stuff
needed to build most conventional Linux userspace packages against it.
However, the Android _kernel_ is mostly Linux. It's a fragmented mix
of several different obsolete forks with lots of garbage added, but Google's
idea of "embedded development" focused on adding stuff to the kernel rather
than removing stuff, so you can mostly ignore the differences. This means
binaries built against uClibc should run on the android kernel just fine:
assuming they're statically linked, or that you install the uClibc shared
libraries (possibly alongside the bionic ones).
The other major deficiency of Android is "toolbox", which is their
clone of busybox. (It has nothing to do with toybox, either: that's also
GPL. About half the code and ideas of toybox went upstream into busybox
anyway, the rest is mothballed.)
Android's toolbox is crap, and the first thing any serious developer
does is install busybox. Here's the easy way to do that.
This file is statically linked against uClibc, so it doesn't require
any external dependencies, meaning it should run on an Android system.
Now let's install it:
Make a /busybox directory, move the busybox-armv5l binary to
/busybox/busybox (this will both move it and rename it), and "chmod 700
/busybox/busybox". (The toolbox chmod doesn't understand "u+x", you have
to give it numbers. This is one of the many, many things this procedure
fixes.)
Now run "PATH=/busybox busybox sh" to get a real shell prompt with command
history. In that command prompt run this:
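A loop along the following lines does the job. Treat it as a sketch: it
assumes a busybox recent enough to support "--list", and the "link_applets"
helper name is made up for illustration.

```shell
# Hypothetical helper: for every applet the busybox binary in directory $1
# reports, create a symlink of that name next to it.
link_applets() {
  for applet in $("$1/busybox" --list); do
    # Skip names that already exist (including busybox itself).
    [ -e "$1/$applet" ] || ln -s busybox "$1/$applet"
  done
}

# On the device, with the binary installed as /busybox/busybox:
#   link_applets /busybox
```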
That gives you a /busybox directory full of symlinks to busybox. You're
running in a shell with a $PATH looking at those busybox commands, so any
command you type should run the busybox version.
You should be able to take it from there.
You can run the build against an older
kernel (such as 2.6.35) and then run ./native-build.sh static-tools.hdc in
the resulting system-image-powerpc to get dropbearmulti and busybox binaries
that restrict themselves to the old system calls.