Gentoo-Fu: Optimizing the build process

This is another Gentoo post, not strictly related to KDE, but a follow-up to my previous post on building KDE branches (such as 4.9 and master). Ccache can greatly reduce the build time for these constantly updated packages, but here I’ll present some other tips & tricks. These optimizations are applied system-wide and will therefore improve the build process for all packages, not just live/KDE-specific ones.

These optimizations are backed up by testing data from my own system (Intel Core i5 3570K @ 3.40 GHz, 8 GB DDR3 RAM @ 1333 MHz), gathered in a controlled environment (each run was performed after a cold boot to avoid any unwanted caching by the OS). The following optimizations were tested:

  • Moving the build directory to RAM (tmpfs)
  • Quiet build (redirect build output to log files)
  • Building multiple packages at once
  • Performing debug-less builds

Another possible optimization is setting emerge’s niceness to a more appropriate level, but I didn’t get around to testing it.
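For completeness, the niceness tweak would be a one-line addition to /etc/make.conf (untested here; 15 is just an illustrative value):

```shell
# /etc/make.conf
# Run emerge and all its build processes at a lower scheduling priority,
# keeping the desktop responsive during long builds. Higher values mean
# "nicer", i.e. lower priority.
PORTAGE_NICENESS="15"
```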

Building from RAM & Quiet Builds

The first test was timing the build process for kdelibs (without ccache and after a cold boot as explained above). It includes the first two optimizations in order to create a nice comparison table (HDD/RAM building and --quiet-build=y/n). Furthermore this test was conducted both in a non-accelerated KMS console (booting in single-user mode) and in an accelerated X-console (Konsole in a KDE session).

Before dumping the results, I will explain how the used optimizations were achieved:

Moving the build directory to RAM (tmpfs):

This is easily done with the following command:

mount tmpfs /var/tmp/portage/ -t tmpfs -o uid=250,gid=250,mode=0775,size=75%

To make it permanent add the following line in /etc/fstab:

tmpfs /var/tmp/portage tmpfs uid=250,gid=250,mode=0775,size=75% 0 0

So we actually mount a tmpfs instance over /var/tmp/portage (portage’s default build path), assign proper read/write privileges to the ‘portage’ user and group (preserving the original mode bits) and set a maximum size of 75% of your RAM. As a side note, tmpfs mounts act as a stack: when you umount this tmpfs instance, the original /var/tmp/portage contents are revealed underneath (more information in this excellent article). Setting the appropriate permissions is straightforward; I have simply retained the original values.

Lastly, setting a size of 75% of your RAM raises the default value of 50%, which is useful for packages with a large code base (e.g. chromium). You really shouldn’t set ‘size=0’ (i.e. disable size restrictions): a memory-hungry build will irreversibly hog and crash your machine instead of being killed for running out of memory (more information in the above article and in the kernel’s Documentation/filesystems/tmpfs.txt). I learned this the hard way: I tried a chromium debug build on RAM without a size limit and the kernel eventually crashed!

I would also advise you to enable the “fail-clean” FEATURE in your /etc/make.conf to free valuable space in your RAM when a build fails.
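In /etc/make.conf that is a one-word addition (a sketch; assuming you keep your other FEATURES intact):

```shell
# /etc/make.conf
# Clean up a failed build's temporary files automatically, so a failed
# build does not keep occupying the tmpfs until the next emerge.
FEATURES="${FEATURES} fail-clean"
```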

Quiet build (redirect build output to log files):

You can enable quiet builds by appending --quiet-build=y to emerge’s arguments. By default it is set to ‘n’. To make it permanent, just append it to the EMERGE_DEFAULT_OPTS variable in /etc/make.conf.

Before proceeding, you should also set PORT_LOGDIR="/var/log/portage" in /etc/make.conf to redirect the build output to log files in this directory, otherwise the build output is simply lost. This way you lose no functionality: you can always monitor the build output from another virtual console.

Lastly, the “clean-logs” FEATURE will periodically delete old build logs (older than one week by default).
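Putting the quiet-build pieces together, the relevant /etc/make.conf fragment looks like this (a sketch; make sure /var/log/portage exists and is writable by the portage user):

```shell
# /etc/make.conf
# Build quietly by default, writing per-package output to log files.
EMERGE_DEFAULT_OPTS="${EMERGE_DEFAULT_OPTS} --quiet-build=y"
# Directory where portage stores the build logs.
PORT_LOGDIR="/var/log/portage"
# Periodically delete old build logs (older than one week by default).
FEATURES="${FEATURES} clean-logs"
```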

So, having explained the optimizations used, here is the test data. Each configuration was run only once, so treat the numbers as indicative rather than statistically rigorous:

Non-accelerated console:

  • HDD build: 10m 3.707s
  • RAM build: 9m 52.246s
  • HDD+quiet build: 8m 35.364s
  • RAM+quiet build: 8m 22.819s

Accelerated console:

  • HDD build: 8m 47.189s
  • RAM build: 8m 33.321s
  • HDD+quiet build: 8m 35.187s
  • RAM+quiet build: 8m 23.945s

Obviously the non-accelerated console lags considerably when printing the build output. Other than that, a RAM build saves you ~12s in this case, plus considerably less HDD noise. A quiet build gives another ~10s speedup. Neither speedup is dramatic, but both optimizations are easily applied, so why not have them?

Building multiple packages at once

For this test, I timed the build process for amarok, k3b, kdevelop and ktorrent in a single run. Same conditions as above, with RAM+quiet build enabled in a KMS console. This test was quite straightforward: the only thing changing was the number of parallel jobs that portage was processing (-j emerge argument):
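For reference, each run was an invocation along these lines (the category names are from memory; only the -j value changed between runs):

```shell
# --jobs (-j) sets how many *packages* portage builds in parallel;
# it is independent of MAKEOPTS, which sets make's compile jobs per package.
emerge --jobs=3 media-sound/amarok app-cdr/k3b dev-util/kdevelop net-p2p/ktorrent
```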

  • 1 parallel job (-j1): 9m 9.950s
  • 2 parallel jobs (-j2): X 8m 9.904s
  • 3 parallel jobs (-j3): 8m 7.315s
  • Unlimited (= 4) parallel jobs (-j): X 8m 10.261s

X denotes a build failure due to the compiler crashing. So, although parallel jobs brought some nice speed improvements, I wouldn’t recommend them due to instability. I guess this feature is more oriented towards distcc/multiprocessor-powered systems.

Performing debug-less builds

This is not really an optimization; I included it mainly for comparison to the debug builds above and to lay out the procedure of enabling per-package debug-less builds. This can be really useful for certain packages (such as chromium) which fail to build in-RAM due to memory shortage (debug builds are extremely memory-consuming).

So the only test done here was building amarok, k3b, kdevelop and ktorrent (as above), with RAM+quiet build enabled in a KMS console, one parallel job, after disabling debug symbols generation (explained below). The result is 7m 47.164s. I actually expected this to be lower, but it seems that debug builds are not that time-consuming after all.

To enable debug symbol generation (extensive article here) I have “-ggdb” in my CFLAGS variable in /etc/make.conf and “compressdebug splitdebug” in the FEATURES variable. This generates debug symbols for all installed packages (located under /usr/lib/debug).
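For reference, the corresponding /etc/make.conf fragment on my system looks like this:

```shell
# /etc/make.conf -- system-wide debug symbol generation
CFLAGS="-march=native -O2 -pipe -ggdb"   # -ggdb emits gdb-friendly debug info
CXXFLAGS="${CFLAGS}"
# splitdebug: move debug symbols to separate files under /usr/lib/debug
# compressdebug: compress the split debug sections to save disk space
FEATURES="${FEATURES} compressdebug splitdebug"
```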

To disable debug symbol generation you have to make a per-package configuration (a list of excluded packages from the debug configuration mentioned above). Create /etc/portage/package.env and add the following line (chromium used as an example):

www-client/chromium nodebug.conf

We inform portage that the www-client/chromium package will be installed with the configuration variables listed in /etc/portage/env/nodebug.conf. The contents of this file are the following:

CFLAGS="-march=native -O2 -pipe"
CXXFLAGS="${CFLAGS}"
FEATURES="${FEATURES} -compressdebug -splitdebug"

CFLAGS here is exactly the same as in /etc/make.conf, excluding “-ggdb”. So, you get the idea. Another good use of per-package environment variables, related to this post, is specifying a different build directory (which is not on RAM). Instructions here.
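As a sketch of that idea (the file name is my own; assuming your portage version honors PORTAGE_TMPDIR set via package.env), you would create /etc/portage/env/notmpfs.conf:

```shell
# /etc/portage/env/notmpfs.conf
# Build this package on disk instead of the tmpfs mounted over the default
# /var/tmp/portage; the directory must exist and be writable by portage.
PORTAGE_TMPDIR="/var/tmp/notmpfs"
```

and then point the package at it in /etc/portage/package.env, e.g. `www-client/chromium notmpfs.conf`.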

Conclusion

All in all, no optimization will yield more than a few seconds speedup. However, after running all these tests, I have permanently enabled RAM+quiet builds. Performing the build from RAM makes way more sense: you keep HDD activity at a minimum so that other I/O processes can run in parallel without considerable lagging, you get a lot less noise, plus a small boost in the build times. Same goes for quiet builds which bring only advantages (you can always monitor the build output from another virtual console as explained above).

7 thoughts on “Gentoo-Fu: Optimizing the build process”

  1. Well,
    nice comparison. Although I do not understand why your compiler crashes on parallel builds. I run:

    MAKEOPTS="-j2" emerge -Dau world --newuse --autounmask-keep-masks -j3 --keep-going

    and

    MAKEOPTS="-j8" revdep-rebuild

    without any problems.

      • I bet it’s a lack-of-memory problem. Doing a parallel build in RAM is insane; you will consume your RAM very fast.

        What impresses me more are your time results: memory builds should be A LOT faster. The only explanation I can imagine is that the Linux kernel is VERY GOOD at caching things from the HD. Is there any possibility of not letting the kernel cache files from the HD and repeating your test?

        • I could try the ‘sync’ mount option, but in my experience it’s extremely slow. I too expected better times with RAM, but it seems that compilation is indeed CPU-bottlenecked.

          Edit: “sync” is insanely slow. After half an hour it had only finished configuring the source and was about to start compiling.

  2. RAM compile made a much more visible (although not measured) difference in my case. Are you using a fast SSD to begin with?
