Launchpad Buildout

Launchpad uses the buildout (or “zc.buildout”) build system. Buildout’s biggest strength is managing Python packages. That is also our focus for it.

We have at least two other ways of managing dependencies. Apt manages our Python language installation, as well as many of our non-Python system-level dependencies, such as PostgreSQL. The sourcecode directory is our other way of managing dependencies. It is supposed to only contain dependencies that are not standard Python packages. bzr plugins and Javascript libraries are existing examples.

If you are not interested in our Motivations or in an Introduction to zc.buildout, all developers will at least want to read the very brief sections about the Set Up and the Everyday Usage.

Developers who manage source dependencies probably should read the general information about Managing Dependencies and Scripts, but will also find detailed instructions to Add a Package, to Upgrade a Package, to Add a Script, to Add a File Modified By Buildout, and to Work with Unreleased or Forked Packages.

Motivations

These motivations are labeled as “[INTERNAL]” or “[EXTERNAL]” to indicate whether it is pertinent for internal dependencies, which we on the Launchpad team create and release ourselves; or external dependencies, which other parties, in and out of Canonical, create and release.

  • We want more careful specification of our dependencies across branches. [INTERNAL] [EXTERNAL]

    This is a real concern pertinent both for our “trunks” (devel, stable, db-devel, db-stable) and for our development boxes. For instance, before incorporating buildout, in our trunks, when we want to update a dependency, we needed to make sure that all the current Launchpad trunks work with the dependency initially; then submit a new Launchpad branch that uses the change dependency. A mistake can even potentially break one or both of the db-* trunks, since PQM only tests against one branch (usually devel), and sourcecode changes affect all branches at once. For simplicity, speed, and safety, we want to be able to submit a single branch that incorporates the source dependencies and the associated changes at once.

    This is also true, if less critical and easier to work around, on developer boxes. Without care, changes to sourcecode when working on dependencies will affect all a developer’s branches, polluting test results with false negatives or false positives.

  • We want to default to using released versions of our software dependencies. [EXTERNAL]

    A significant number of projects do not always have a pristine trunk, and many also spend extra effort on polish, bug fixes, and compatibility before a release. If we do not desperately need a new feature on trunk, using a release is generally regarded as a safer, better practice. Our earlier usage of bzr branches of the development trunks does not encourage this practice.

  • We want to be encouraged to make the effort to interact with upstream projects to have our patches integrated. [EXTERNAL]

    Interacting and negotiating with upstream is undeniably more time-consuming than our previous practice of maintaining local bzr branches with our patches, especially short-term. But our previous use of bzr branches is not good open-source community behaviour–an ironic characteristic for a project like Launchpad. It also can cause problems down the road, for instance, if the patch becomes stale and we want to migrate to new releases.

  • We want to be protected from changes and differences in our operating system. [INTERNAL] [EXTERNAL]

    This is a concern both over time and across different Launchpad environments.

    First, our operating system, Ubuntu, is driven by many needs and goals. Launchpad is among them, but generally Launchpad serves Ubuntu, not the reverse. For instance, Jaunty dropped Launchpad’s Python version. The Ubuntu developers had good reason–Python 2.4 has not been supported by the Python developers for some time–but it caused a significant inconvenience to the Launchpad team. Managing our dependencies, particularly the Python library dependencies, can help alleviate these problems.

    Second, Launchpad developers run a significantly different version of the operating system than that run in production. Maintaining our Python library dependencies ourselves can also help alleviate these concerns.

  • We want to be able to easily use standard packages of our primary programming language, Python. [EXTERNAL]

    Our Python library dependencies are distributed for many operating systems– Windows, Mac, and other flavors of Linux–in a unified location and format: PyPI, using distutils. Using Python library dependencies in their standard distributions makes it easier for us to reuse code.

  • We want to be encouraged to release Python packages of our open-source code. [INTERNAL]

    We are beginning to realize our aspirations of abstracting and releasing some of our code. Much of that code is in Python. Consuming standard Python packages encourages us to follow good practice in releasing our own Python packages. Dogfooding can help keep us honest, and good official releases can help us support and encourage a community of users.

Meanwhile, we want to be able to keep certain aspects of the legacy story.

  • We need deployment to not need network access.
  • We need to be able to use custom releases of our Python dependencies, if absolutely necessary.
  • We would like to be able to follow security releases in our operating system.

We will be able to support the first two of these aspects entirely.

As to the third, we will continue to follow operating system security releases for most or all of the dependencies that are not Python packages–a very similar situation to the one before Buildout integration.

Introduction to zc.buildout

Buildout is a relatively simple system that increases in complexity as it is extended via “recipes”. It works on top of distutils and setuptools. It uses declarative ini-style files to direct its work. The Buildout site points to a variety of documents describing and documenting its use.

For Launchpad, Buildout is hidden behind a Makefile as of this writing. If the Makefile is removed, developers will typically run bootstrap.py in the branch, and then run bin/buildout. After that, the bin directory will have the commands to start, stop, test, and debug the software. If you are interested in example direct usage of Buildout, you may want to read the “Hacking” document in the Launchpad wiki that describes the usage patterns in lazr.* packages.

Launchpad’s Buildout usage is roughly of medium complexity. It is more complex than that needed by a package with few dependencies and simple usage (see lazr.uri, for instance), but less complex than that of other large applications that use the buildout system. More complexity can come by building more non-Python tools and by having multiple configuration variations, for instance.

The documentation below will focus on using Launchpad’s buildout. See the links given above for a more thorough general review.

Set Up

If you use the rocketfuel-get script, run that, and you will be done.

If you don’t, you have just a bit more initial set up. I’ll assume you maintain a pattern similar to what the rocketfuel-* scripts use: you have a local, pristine branch of trunk from which you make your other branches.  You manually update the trunk and rsync sourcecode when necessary.  When you make a branch, you use utilities/link-external-sourcecode.

Developers that take this approach should do the following, where trunk is the trunk branch from which you make local branches.

bzr co lp:lp-source-dependencies download-cache

Then run make in the trunk.

See the Everyday Usage: Manual section for further instructions on how to work without the rocketfuel-* scripts.

Everyday Usage

rocketfuel Scripts

If you typically use rocketfuel-get, and you don’t change source dependencies, you should not have any further changes, except that bin/test has replaced test.py. rocketfuel-branch and link-external-dependencies will Do the Right Thing.

Manual

If you don’t use the rocketfuel scripts, you will still use link-external-dependencies as before. When a buildout complains that it cannot find a version of a dependency, do the following, from within the branch:

bzr up download-cache

After this, retry your make (or run bin/buildout from the branch).

That’s it for everyday usage.

Managing Dependencies and Scripts

What if you need to change or add dependencies or scripts? As you might expect, you need to know a bit more about what’s going on, although we can still keep this at a fairly high level.

First let’s talk a little about the anatomy of what we have set up. To be clear, much of this is based on our own decisions of what to do. If you see something problematic, bring it up with the Foundations team. Maybe together we can come up with another approach that meets our needs better.

If you saw the top-level Launchpad directory before we started using Buildout, you might notice seven new items in the checkout.

bootstrap.py

This is the standard bootstrapping file provided by the Buildout distribution. As of this writing, the Makefile uses this file, and you do not have to modify it or call it directly.

Without a Makefile, the first step of developing a source tree that uses Buildout is to run this file in the top level directory with the Python executable that you wish to use for the source tree. It creates a bin directory, if it does not already exist, and puts a buildout executable there. The next step is to run bin/buildout, which, unless you supply a -c option to specify a configuration file, will look for a buildout.cfg file by default to discover what to do.

Again, as of this writing, the Makefile uses this file, and it does not need to be modified, so you need not concern yourself with it further at this time.

ez_setup.py
This is another standard file from another project. In this case, it is the file provided by the setuptools project to install setuptools. It is used by the setup.py file, described below. It does not need to be modified or called directly.
setup.py

This is the file that uses distutils, extended by setuptools, to specify direct dependencies, scripts, and other elements of the local source tree.

Buildout uses it, but the reverse is not true: setup.py does not know about Buildout. This means that packages that use Buildout for development do not have to require it when they are being installed in other software as a dependency.

Describing this file in full is well beyond the scope of this document. We will give recipes for modifying it for certain tasks below. For more information beyond these recipes, see the setuptools and distutils documentation.

buildout.cfg

This is the default configuration file that bin/buildout will look to for instructions.

Describing it in full is well beyond the scope of this document. However, we will give an overview here.

Configuration files for Buildout are comprised of sections with key-value pairs.

The key-value pairs are separated with new lines, when the subsequent line is not indented. The key and value are each separated with an equals sign.

foo = bar
baz = bing
      bah
      boo
sha = zam

That example shows three keys, ‘foo’, ‘baz’, and ‘sha’. The ‘baz’ key has a string with two new lines (which might be interpreted one several ways, as defined for that key).

The [buildout] section is the starting point for Buildout to determine what to do. It looks for an extends key to find any additional files to merge in; we use this for versions.cfg, discussed below.

In addition to general configuration and initialization such as this, it looks in the develop key to find source trees to develop as part of the buildout. In the standard Launchpad configuration, we develop only Launchpad itself (the current directory, or ‘.’). This means that the local setup.py will be run. If you want to develop Launchpad while you develop another dependency, you can link another source tree in, and specify an additional develop directory in another line:

[buildout]
develop = .
          lazr_uri_branch

See Developing a Dependent Library In Parallel for more on this.

The other basic key in the [buildout] section that we’ll highlight here is parts. This key identifies the other sections in buildout.cfg that will be processed. A section that is not identified in the [buildout] sections parts key will usually be ignored (unless chosen for another role by another key elsewhere).

Sections other than [buildout] that are specified as parts always must specify a recipe: an identifier that determines what code should process that section. You’ll see a variety of recipes in Launchpad’s buildout.cfg, including z3c.recipe.filetemplate, zc.recipe.egg, and others.

versions.cfg
As mentioned above, buildout.cfg extends versions.cfg by specifying it in the extends key of the [buildout] section. Versions.cfg specifies the precise versions of the dependencies we use. This means that we can have several versions of a dependency available locally, but we only build the precise one we specify. We give an example of its use below. To read about the mechanism used, see the zc.buildout documentation of the versions option in the [buildout] section.
eggs
The eggs directory holds the eggs built from the downloaded distributions. Unless you set it up differently yourself, this directory is shared by all your branches. This directory is local to your system–we do not manage it in a branch. One reason for this is that eggs are often platform-specific.
download-cache

The download-cache directory is a set of downloaded distributions–that is, exact copies of the items that would typically be obtained from the Python Package Index (“PyPI”), or another download source. We manage the download cache as a shared resource across all of our developers with a bzr branch in a Launchpad project called lp-source-dependencies.

When we run buildout, Buildout reads a special key and value in the [buildout] section: install-from-cache = true. This means that, by default, Buildout will not use network access to find packages, but only look in the download cache. This has many advantages.

  • First, it helps us keep our deployment boxes from needing network access out to PyPI and other download sites.
  • Second, it makes the buildout much faster, because it does not have to look out on the net for every dependency.
  • Third, it makes the buildout more repeatable, because we are more insulated from outages at download sites such as PyPI, and poor release management.
  • Fourth, it makes our deployments more auditable, because we can tell exactly what we are deploying.
  • Fifth, it gives us a single obvious place to put custom package distributions, as we’ll discuss below.

The downside is that adding and upgrading packages takes a small additional step, as we’ll see below.

buildout-templates
The last additional item in the checkout is the buildout-templates directory. This is used to hold templates that are used by the section in buildout.cfg that uses the z3c.recipe.filetemplate recipe. This can be used for many things, but we are using it as an alternate way for producing scripts when the zc.recipe.egg approach is insufficient.

In addition to these seven listings, after you have run the Makefile (or bin/buildout), you will see an additional listing:

bin
The bin directory has already been discussed many times. After running the bootstrap.py, it holds the buildout script which can be used to process Buildout configuration files. In Launchpad’s case, after running the buildout, it also holds many executables, including scripts to test Launchpad; to run it; to run Python or IPython with Launchpad’s sourcetree and dependencies available; to run harness or iharness (with IPython) with the sourcetree, dependencies, and database connections; or to perform several other tasks. For now, the Makefile provides aliases for many of these.

Now that you have an introduction to the pertinent files and directories, we’ll move on to trying to perform tasks in the buildout. We’ll discuss adding a dependency, upgrading a dependency, adding a script, adding an arbitrary file, and working with unreleased packages.

Add a Package

Let’s suppose that we want to add the “lazr.foo” package as a dependency.

  1. Add the new package to the setup.py file in the install_requires list.

    Generally, our policy is to only set minimum version numbers in this file, or none at all. It doesn’t really matter for an application like Launchpad, but it a good rule for library packages, so we follow it for consistency. Therefore, we might simply add 'lazr.foo' to install_requires, or 'lazr.foo >= 1.1' if we know that we are depending on features introduced in version 1.1 of lazr.foo.

  2. [OPTIONAL] If you know it, add the desired version number to versions.cfg now.

    For instance, if you know you want lazr.foo 1.1.2, add this line to the [versions] section of versions.cfg:

    lazr.foo = 1.1.2
    
  3. [OPTIONAL] Add the desired distribution of lazr.foo 1.1.2 to the download-cache/dist directory.

  4. Run the following command (or your variation):

    ./bin/buildout -v buildout:install-from-cache=false | tee out.txt | grep 'Picked'
    

    The first part (./bin/buildout -v buildout:install-from-cache=false) will run buildout, allowing it to download source packages from the Internet to download-cache/dist. The second part (tee out.txt) will dump the full output to the out.txt file in case you need to debug a problem. The last part (grep 'Picked') will filter the output so that only additional packages–dependencies of your dependency–will be listed. You need to see if it means that you have dependencies, some of which may be indirect dependencies. We’ll see how to do this with an example. Here’s an imaginary output:

    Picked: ipython = 0.9.1
    Picked: lazr.foom = 1.4
    Picked: zope.bar = 3.6.1
    Picked: z3c.shazam = 2.0.1
    

    In our example, this means that these packages have been added to your download-cache/dist directory. You now need to add those versions to the versions.cfg file:

    ipython = 0.9.1
    lazr.foom = 1.4
    zope.bar = 3.6.1
    z3c.shazam = 2.0.1
    

    Note that the output might include at least one, and possibly more, spurious “Picked:” listings. ipython, in particular, has shown up in the past incorrectly–that is, when you try to add the file to the download-cache/dist directory, you’ll discover that it is already there; and you’ll see that versions.cfg already specifies the version. As long as the test passes (see step 5, below), you can ignore this.

  5. Test.

    Note that you can tell ec2 test to include all uncommitted distributions from the local download-cache in its tests with the --include-download-cache-changes flag (or -c). If you do this, you cannot use the ec2 test feature to submit on test success. Also, if you have uncommitted distributions and you do not explicitly tell ec2 test to include or ignore the uncommitted distributions, it will refuse to start an instance.

  6. Check old versions in the download-cache. If you are sure that they are not in use any more, anywhere, then remove them to save checkout space. More explicitly, check with the LOSAs to see if they are in use in production and send an email to launchpad-dev@lists.launchpad.net before deleting anything if you are unsure. A rule of thumb is that it’s worth starting this investigation if the replacement has already been in use by the Launchpad tree for more than a month. You can approximate this information by using bzr log on the newer (replacement) download-cache/dist file for the particular package.

  7. Now you need to share your package changes with the rest of the team. You must do this before submitting your Launchpad branch to PQM or else your branch will not build properly anywhere else, including buildbot. Commit the changes (cd download-cache, bzr add the needed files, bzr up, bzr commit -m 'Add lazr.foom 1.1.2 and depdendencies to the download cache') to the shared download cache when you are sure it is what you want.

Never modify a package in the download-cache. A change in code must mean a change in version number, or else very bad inconsistencies and confusion across build environments will happen.

Upgrade a Package

Sometimes you need to upgrade a dependency. This may require additional dependency additions or upgrades.

If you already know what version you want, the simplest thing to try is to modify versions.cfg to specify the new version and run steps 4, 5, and 6 of the Add a Package instructions.

If you don’t know what version you want, but just want to see what happens when you upgrade to the most recent revision, you can clear out the versions of the packages for upgrade and give it a try (that is, run steps 4, 5, and 6 of the Add a Package instructions). Note that, when not given an explicit version number, our buildout is set to prefer final releases over alpha and beta releases. If you want to temporarily override this behaviour, include buildout:prefer-final=false as another argument to bin/buildout.

Add a Script

We often need scripts that are run in a certain environment defined by Python dependencies, and sometimes even different Python executables. Several of the scripts we have are specified using the setuptools-based spelling that the zc.recipe.egg recipe supports.

For the common case, in setup.py, add a string in the console_scripts list of the entry_points argument. Here’s an example string:

'run = lp.scripts.runlaunchpad:start_launchpad'

This will create a script named run in the bin directory that calls the start_launchpad function in the lp.scripts.runlaunchpad module.

See the zc.recipe.egg documentation for more information on how to add scripts using this method.

Add a File Modified By Buildout

Sometimes we need more control for the way our scripts are generated, or we need other files processed during a buildout. Writing a custom zc.buildout recipe is one way, but well out of the scope of this document. Read the zc.buildout documentation for direction.

A much easier, and more limited approach is to use z3c.recipe.filetemplate to build the file. The recipe uses the buildout-templates directory, which is a mirror of the Launchpad source tree. The recipe searches the tree for files ending in ‘.in’, performs variable substitution on them, and then be copies them into the Launchpad source tree.

To add a file using the recipe, simply create mirrors of the source tree directories that you need under buildout-templates/, and create a .in file template at the desired location. Take a look at buildout-templates/bin/ for examples of what is possible.

Work with Unreleased or Forked Packages

Sometimes you need to work with unreleased or forked packages. FeedValidator, for instance, makes nightly zip releases but other than that only offers svn access. Similarly, we may require a patched or unreleased version of a package for some purpose. Hopefully, these situations will be rare, but they do occur.

While other answers are available for Buildout, our solution is to use the download-cache. Basically, make a custom source distribution with a unique suffix in the name, and use it (and its version name) for the normal process of adding or updating a package, as described above. Because the custom package is in the download-cache, it will be found and used.

Here’s an example of making a custom distribution of FeedValidator.

FeedValidator is a Subversion project. We check it out:

svn co http://feedvalidator.googlecode.com/svn/trunk/feedvalidator/src feedvalidator

Next, we cd feedvalidator, and, using a Python that has setuptools installed, we run the following command:

python setup.py egg_info -r -bDEV sdist

For this example, imagine that the current revision of the repository is 1049. Because setuptools has built-in Subversion support, the command above will create a tar.gz in the dist directory named feedvalidator-0.0.0DEV-r1049.tar.gz. The -r option specifies that the subversion revision should be part of the package name. The -bDEV option specifies that the ‘DEV’ suffix should be added to the version number.

We could then put the tar.gz file in Launchpad’s download-cache/dist directory, specify feedvalidator = 0.0.0DEV-r1049 in the versions.cfg file, and proceed with the usual steps to update or add a new package.

If you use a bzr branch, you might use the -d option instead of the -r option when you create the distribution. This will add the date instead of the revision:

python setup.py egg_info -d -bDEV sdist

For instance, this might produce a distribution for the lazr.restful project with a name like this: lazr.restful-0.9.1DEV-20090512.tar.gz.

See the setuptools documentation for more information about the egg_info command.

It is also possible that setup.py does not support the egg_info command and it is sufficient to just run the sdist command. It adds the current revision of the bzr branch to the version number:

python setup.py sdist

For instance, this might produce a distribution for the testtools project with a name like this: testtools-0.9.12-r228.tar.gz.

Developing a Dependent Library In Parallel

Sometimes you need to iterate on change to a library used by Launchpad that is managed by buildout as an egg. You could just edit what it is in the eggs directory, but it is harder to produce a patch while doing this. You could instead grab a branch of the libarary and produce an egg everytime you make a change and make buildout use the new egg, but this is slow.

buildout defaults to only using the current directory as code that will be used without creating a distribution. We can arrange for it to use other paths as well, so we can use a checkout of any code we like, with changes being picked up instantly without us having to create a distribution.

To do this add the extra paths to the develop key in the [buildout] section of buildout.cfg:

[buildout]
develop = .
          path/to/branch

and re-run make.

Now any changes you make in that path will be picked up, and you are free to make the changes you need and test them in the Launchpad environment.

Once you are finished you can produce a distribution as above for inclusion in to Launchpad, as well as sending your patch upstream. At that point you are free to revert the configuration to only develop Launchpad. You should not submit your branch with this change in the configuration as it will not work for others.

Be aware that you may have to change the version for the package in versions.cfg if there is a difference between the version in the eggs directory and the one in the setup.py that you pointed to in the develop key.

One thing to be wary of is that setuptools treats “develop eggs” created by this process with the same precedence as system packages. That means that if the version in the setup.py at the path that you put in the develop key is the same as the version installed system wide, setuptools may pick the wrong one. If that happens then increase the version in setup.py and setuptools will take it.

Possible Future Goals

  • No longer use system site-packages.
  • No longer use make.
  • Get rid of the sourcecode directory.