===================================================================
How To Add Your Build Configuration To LLVM Buildbot Infrastructure
===================================================================

Introduction
============

This document contains information about adding a build configuration and
buildbot worker to the LLVM Buildbot Infrastructure.

.. note:: The term "buildmaster" is used in this document to refer to the
  server that manages which builds are run and where. Though we would not
  normally choose to use "master" terminology, it is used in this document
  because it is the term that the Buildbot package currently
  `uses <https://github.com/buildbot/buildbot/issues/5382>`_.

Buildmasters
============

There are two buildmasters running.

* The main buildmaster at `<https://lab.llvm.org/buildbot>`_. All builders
  attached to this machine will notify commit authors every time they break
  the build.
* The staging buildmaster at `<https://lab.llvm.org/staging>`_. All builders
  attached to this machine will be completely silent by default when the build
  is broken. This buildmaster is reconfigured every two hours with any new
  commits from the llvm-zorg repository.

In order to remain connected to the main buildmaster (and thus notify
developers of failures), a buildbot must:

* Be building a supported configuration. Builders for experimental backends
  should generally be attached to the staging buildmaster.
* Be able to keep up with new commits to the main branch, or at a minimum
  recover to tip of tree within a couple of days of falling behind.

Additionally, we encourage all bot owners to point their bots towards the
staging master during maintenance windows, instability troubleshooting, and
such.

Roles & Expectations
====================

Each buildbot has an owner who is the responsible party for addressing problems
which arise with said buildbot. We generally expect the bot owner to be
reasonably responsive.

For some bots, the ownership responsibility is split between a "resource owner"
who provides the underlying machine resource, and a "configuration owner" who
maintains the build configuration. Generally, operational responsibility lies
with the "config owner". We do expect "resource owners" - who are generally
the contact listed in a worker's attributes - to proxy requests to the relevant
"config owner" in a timely manner.

Most issues with a buildbot should be addressed directly with a bot owner
via email. Please CC `Galina Kistanova <mailto:gkistanova@gmail.com>`_.

Steps To Add Builder To LLVM Buildbot
=====================================

Volunteers can provide their build machines to work as build workers for the
public LLVM Buildbot.

Here are the steps you can follow to do so:

#. Check the existing build configurations to make sure the one you are
   interested in is not already covered, or that it builds much faster on
   your computer than on the existing one. We prefer faster builds so
   developers will get feedback sooner after changes get committed.

#. The computer you will be registering with the LLVM buildbot
   infrastructure should have all dependencies installed and be able to
   build your configuration successfully. Please check what degree
   of parallelism (``-j`` parameter) would give the fastest build. You can
   build multiple configurations on one computer.

#. Install buildbot-worker (currently we are using buildbot version 2.8.4).
   This specific version can be installed using ``pip``, with a command such
   as ``pip3 install buildbot-worker==2.8.4``.

#. Create a designated user account for your buildbot-worker to run under,
   and set appropriate permissions.

#. Choose the buildbot-worker root directory (all builds will be placed under
   it), and the buildbot-worker access name and password the buildmaster will
   use to authenticate your buildbot-worker.

#. Create a buildbot-worker in the context of that buildbot-worker account.
   Point it to the **lab.llvm.org** port **9994** (see `Buildbot documentation,
   Creating a worker
   <http://docs.buildbot.net/current/tutorial/firstrun.html#creating-a-worker>`_
   for more details) by running the following command:

   .. code-block:: bash

      $ buildbot-worker create-worker <buildbot-worker-root-directory> \
                   lab.llvm.org:9994 \
                   <buildbot-worker-access-name> \
                   <buildbot-worker-access-password>

   Only once a new worker is stable, and approval from Galina has been
   received (see last step), should it be pointed at the main buildmaster.

   Now start the worker:

   .. code-block:: bash

      $ buildbot-worker start <buildbot-worker-root-directory>

   This will cause your new worker to connect to the staging buildmaster
   which is silent by default.

   Try this once then check the log file
   ``<buildbot-worker-root-directory>/worker/twistd.log``. If your settings
   are correct you will see a refused connection. This is good and expected,
   as the credentials have not been established on both ends. Now stop the
   worker and proceed to the next steps.

#. Fill in the buildbot-worker description and admin name/e-mail. Here is an
   example of the buildbot-worker description::

       Windows 7 x64
       Core i7 (2.66GHz), 16GB of RAM

       g++.exe (TDM-1 mingw32) 4.4.0
       GNU Binutils 2.19.1
       cmake version 2.8.4
       Microsoft(R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86

   See `here <http://docs.buildbot.net/current/manual/installation/worker.html>`_
   for which files to edit.

#. Send a patch which adds your build worker and your builder to
   `zorg <https://github.com/llvm/llvm-zorg>`_. Use the typical LLVM
   `workflow <https://llvm.org/docs/Contributing.html#how-to-submit-a-patch>`_.

   * workers are added to ``buildbot/osuosl/master/config/workers.py``
   * builders are added to ``buildbot/osuosl/master/config/builders.py``

   Please make sure your builder name and its builddir are unique throughout
   the file.

   All new builders should default to using the ``'collapseRequests': False``
   configuration. This causes the builder to build each commit individually
   and not merge build requests. To maximize quality of feedback to developers,
   we *strongly prefer* builders to be configured not to collapse requests.
   This flag should be removed only after all reasonable efforts have been
   exhausted to improve build times such that the builder can keep up with
   commit flow.

   It is possible to allow email addresses to unconditionally receive
   notifications on build failure; for this you'll need to add an
   ``InformativeMailNotifier`` to ``buildbot/osuosl/master/config/status.py``.
   This is particularly useful for the staging buildmaster which is silent
   otherwise.

#. Send the buildbot-worker access name and the access password directly to
   `Galina Kistanova <mailto:gkistanova@gmail.com>`_, and wait until she
   lets you know that your changes are applied and the buildmaster is
   reconfigured.

#. Make sure you can start the buildbot-worker and successfully connect
   to the silent buildmaster. Then set up your buildbot-worker to start
   automatically at startup. See the buildbot documentation
   for help. You may want to restart your computer to see if it works.

#. Check the status of your buildbot-worker on the `Waterfall Display (Staging)
   <http://lab.llvm.org/staging/#/waterfall>`_ to make sure it is
   connected, and the `Workers Display (Staging)
   <http://lab.llvm.org/staging/#/workers>`_ to see if administrator
   contact and worker information are correct.

#. At this point, you have a working builder connected to the staging
   buildmaster. You can now make sure it is reliably green and keeps
   up with the build queue. No notifications will be sent, so you can
   keep an unstable builder connected to staging indefinitely.

#. (Optional) Once the builder is stable on the staging buildmaster with
   several days of green history, you can choose to move it to the production
   buildmaster to enable developer notifications. Please email `Galina
   Kistanova <mailto:gkistanova@gmail.com>`_ for review and approval.

   To move a worker to production (once approved), stop your worker, edit the
   ``buildbot.tac`` file to change the port number from 9994 to 9990 and start
   it again.

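
The worker-side commands from the steps above can be collected into a few
shell functions. This is a minimal sketch and not an official setup procedure:
the function names are hypothetical, and the ``port = 9994`` line format
assumed for ``buildbot.tac`` should be checked against the file that
``create-worker`` actually generated before relying on the ``sed`` pattern.

.. code-block:: bash

  # Install the worker version mentioned above into a Python venv.
  install_worker() {
      local venv_dir=$1
      python3 -m venv "$venv_dir"
      "$venv_dir/bin/pip" install 'buildbot-worker==2.8.4'
  }

  # Create a worker that connects to the staging buildmaster (port 9994).
  create_staging_worker() {
      local venv_dir=$1 worker_dir=$2 name=$3 password=$4
      "$venv_dir/bin/buildbot-worker" create-worker "$worker_dir" \
          lab.llvm.org:9994 "$name" "$password"
  }

  # Rewrite the port in buildbot.tac, e.g. 9994 (staging) -> 9990 (production).
  # ASSUMPTION: the file contains a line of exactly the form "port = 9994".
  # A backup copy is kept as buildbot.tac.bak.
  switch_port() {
      local worker_dir=$1 from=$2 to=$3
      sed -i.bak -e "s/^port = ${from}\$/port = ${to}/" "$worker_dir/buildbot.tac"
  }

Once approval has been received, stopping the worker, running
``switch_port <buildbot-worker-root-directory> 9994 9990`` and starting the
worker again would correspond to the move to production described in the last
step.
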
Testing a Builder Config Locally
================================

It is possible to test a builder running against a local version of LLVM's
buildmaster setup. This allows you to test changes to builder, worker, and
buildmaster configuration. A buildmaster launched in this "local testing" mode
will:

* Bind only to local interfaces.
* Use SQLite as the database.
* Use a single fixed password for workers.
* Disable extras like GitHub authentication.

In order to use this "local testing" mode:

* Create and activate a Python `venv
  <https://docs.python.org/3/library/venv.html>`_ and install the necessary
  dependencies. This step can be run from any directory.

  .. code-block:: bash

     python -m venv bbenv
     source bbenv/bin/activate
     pip install buildbot{,-console-view,-grid-view,-waterfall-view,-worker,-www}==3.11.7 urllib3

* If your system has Python 3.13 or newer you will need to additionally
  install ``legacy-cgi`` and make a minor patch to the installed buildbot
  package. This step does not need to be followed for earlier Python versions.

  .. code-block:: bash

     pip install legacy-cgi
     sed -i \
       -e 's/import pipes/import shlex/' \
       -e 's/pipes\.quote/shlex.quote/' \
       bbenv/lib/python3.13/site-packages/buildbot_worker/runprocess.py

* Initialise the necessary buildmaster files, link to the configuration in a
  local checkout of `llvm-zorg <https://github.com/llvm/llvm-zorg>`_, and
  ask ``buildbot`` to check the configuration. This step can be run from any
  directory.

  .. code-block:: bash

     buildbot create-master llvm-testbbmaster
     cd llvm-testbbmaster
     ln -s /path/to/checkout/of/llvm-zorg/buildbot/osuosl/master/master.cfg .
     ln -s /path/to/checkout/of/llvm-zorg/buildbot/osuosl/master/config/ .
     ln -s /path/to/checkout/of/llvm-zorg/zorg/ .
     BUILDBOT_TEST=1 buildbot checkconfig

* Start the buildmaster.

  .. code-block:: bash

     BUILDBOT_TEST=1 buildbot start --nodaemon .

* After waiting a few seconds for startup to complete, you should be able to
  open the web UI at ``http://localhost:8011``. If there are any errors or
  this isn't working, check ``twistd.log`` (within the current directory) for
  more information.

* You can now create and start a buildbot worker. Ensure you pick the correct
  name for the worker associated with the build configuration you want to test
  in ``buildbot/osuosl/master/config/builders.py``.

  .. code-block:: bash

     buildbot-worker create-worker <buildbot-worker-root-directory> \
       localhost:9990 \
       <buildbot-worker-name> \
       test
     buildbot-worker start --nodaemon <buildbot-worker-root-directory>

* Either wait until the poller sets off a build, or alternatively force a
  build to start in the web UI.

* Review the progress and results of the build in the web UI.

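
The ``sed`` command in the Python 3.13 step above hard-codes the
``python3.13`` directory name. As a hedged alternative, the path to
``runprocess.py`` can be derived from whichever interpreter is active in the
venv. The helper names ``module_file`` and ``patch_runprocess`` are
hypothetical, and the sketch assumes that ``python3`` on ``PATH`` is the
venv's interpreter.

.. code-block:: bash

  # Print the path of a file inside an installed Python package.
  # usage: module_file <package> <relative-path>
  module_file() {
      python3 -c 'import importlib.util, os, sys; print(os.path.join(os.path.dirname(importlib.util.find_spec(sys.argv[1]).origin), sys.argv[2]))' "$1" "$2"
  }

  # Apply the same two substitutions as the step above, without naming the
  # Python version in the path.
  patch_runprocess() {
      sed -i \
          -e 's/import pipes/import shlex/' \
          -e 's/pipes\.quote/shlex.quote/' \
          "$(module_file buildbot_worker runprocess.py)"
  }

Running ``patch_runprocess`` inside the activated venv would then stand in for
the explicit ``sed`` invocation. As with the original step, it is only needed
on Python 3.13 or newer.
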
This local testing configuration defaults to binding only to the loopback
interface for security reasons.

If you want to run the test worker on a different machine, or to run the
buildmaster on a remote server, ssh port forwarding can be used to make the
connection possible. For instance, if running the buildmaster on a remote
server the following command will suffice to make the web UI accessible via
``http://localhost:8011`` and make it possible for a local worker to connect
to the remote buildmaster by connecting to ``localhost:9990``:

.. code-block:: bash

  ssh -N -L 8011:localhost:8011 -L 9990:localhost:9990 username@buildmaster_server_address

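
Whether the buildmaster runs locally or is reached through the tunnel, it can
be handy to wait until the web UI answers before starting a worker. The
following is a minimal sketch, assuming ``curl`` is installed. The function
name ``wait_for_ui`` and the default of 30 attempts are illustrative choices
and not part of Buildbot.

.. code-block:: bash

  # Poll a URL once per second until it responds or the attempts run out.
  # usage: wait_for_ui <url> [attempts]
  wait_for_ui() {
      local url=$1 attempts=${2:-30} i=0
      while [ "$i" -lt "$attempts" ]; do
          if curl -fsS -o /dev/null "$url"; then
              return 0
          fi
          i=$((i + 1))
          sleep 1
      done
      echo "no response from $url after $attempts attempts" >&2
      return 1
  }

For example, ``wait_for_ui http://localhost:8011`` could be run before
``buildbot-worker start --nodaemon <buildbot-worker-root-directory>``. If it
fails, ``twistd.log`` on the buildmaster side is the first place to look.
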
Be aware that some build configurations may check out the current upstream
``llvm-zorg`` repository in order to retrieve additional scripts used during
the build process, meaning any local changes will not be reflected in this
part of the build. If you wish to test changes to any of these scripts without
committing them upstream, you will need to temporarily patch the builder logic
in order to instead check out your own branch.
Typically, ``addGetSourcecodeForProject`` from
``zorg/buildbot/process/factory.py`` is used for this and you can edit the
caller to specify your own ``repourl`` and/or ``branch`` keyword argument.

Best Practices for Configuring a Fast Builder
=============================================

As mentioned above, we generally have a strong preference for
builders which can build every commit as they come in. This section
includes best practices and some recommendations as to how to achieve
that end.

The goal
  In 2020, the monorepo had just under 35 thousand commits. This works
  out to an average of 4 commits per hour. Already, we can see that a
  builder must cycle in less than 15 minutes to have a hope of being
  useful. However, those commits are not uniformly distributed. They
  tend to cluster strongly during US working hours. Looking at a couple
  of recent (Nov 2021) working days, we routinely see ~10 commits per
  hour during peak times, with occasional spikes as high as ~15 commits
  per hour. Thus, as a rule of thumb, we should plan for our builder to
  complete ~10-15 builds an hour.

Resource Appropriately
  At 10-15 builds per hour, we need to complete a new build on average every
  4 to 6 minutes. For anything except the fastest of hardware/build configs,
  this is going to be well beyond the ability of a single machine. In buildbot
  terms, we are likely going to need multiple workers to build requests in
  parallel under a single builder configuration. For some rough back of the
  envelope numbers, if your build config takes e.g. 30 minutes, you will need
  something on the order of 5-8 workers. If your build config takes ~2 hours,
  you'll need something on the order of 20-30 workers. The rest of this
  section focuses on how to reduce cycle times.

Restrict what you build and test
  Think hard about why you're setting up a bot, and restrict your build
  configuration as much as you can. Basic functionality is probably
  already covered by other bots, and you don't need to duplicate that
  testing. You only need to be building and testing the *unique* parts
  of the configuration. (e.g. for a multi-stage clang builder, you probably
  don't need to be enabling every target or building all the various utilities.)

  It can sometimes be worthwhile splitting a single builder into two or more,
  if you have multiple distinct purposes for the same builder. As an example,
  if you want to both a) confirm that all of LLVM builds with your host
  compiler, and b) do a multi-stage clang build on your target, you
  may be better off with two separate bots. Splitting increases resource
  consumption, but makes it easy for each bot to keep up with commit flow.
  Additionally, splitting bots may assist in triage by narrowing attention to
  relevant parts of the failing configuration.

  In general, we recommend Release build types with Assertions enabled. This
  generally provides a good balance between build times and bug detection for
  most buildbots. There may be room for including some debug info (e.g. with
  ``-gmlt``), but in general the balance between debug info quality and build
  times is a delicate one.

Use Ninja & LLD
  Ninja really does help build times over Make, particularly for highly
  parallel builds. LLD helps to reduce both link times and memory usage
  during linking significantly. With a build machine with sufficient
  parallelism, link times tend to dominate the critical path of the build,
  and are thus worth optimizing.

Use CCache and NOT incremental builds
  Using ccache materially improves average build times. Incremental builds
  can be slightly faster, but introduce the risk of build corruption due to
  e.g. state changes, etc. At this point, the recommendation is not to
  use incremental builds and instead use ccache as the latter captures the
  majority of the benefit with less risk of false positives.

  One of the non-obvious benefits of using ccache is that it makes the
  builder less sensitive to which projects are being monitored vs built.
  If a change triggers a build request, but doesn't change the build output
  (e.g. doc changes, python utility changes, etc.), the build will entirely
  hit in cache and the build request will complete in just the testing time.

  With multiple workers, it is tempting to try to configure a shared cache
  between the workers. Experience to date indicates this is difficult to
  do well, and that having local per-worker caches gets most of the benefit
  anyways. We don't currently recommend shared caches.

  CCache does depend on the builder hardware having sufficient IO to access
  the cache with reasonable access times - i.e. a fast disk, or enough memory
  for a RAM cache, etc. For builders without, incremental builds may be your
  best option, but are likely to require higher ongoing involvement from the
  sponsor.

Enable batch builds
  As a last resort, you can configure your builder to batch build requests.
  This makes the build failure notifications markedly less actionable, and
  should only be done once all other reasonable measures have been taken.

Leave it on the staging buildmaster
  While most of this section has been biased towards builders intended for
  the main buildmaster, it is worth highlighting that builders can run
  indefinitely on the staging buildmaster. Such a builder may still be
  useful for the sponsoring organization, without concern of negatively
  impacting the broader community. The sponsoring organization simply
  has to take on the responsibility of all bisection and triage.

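
Two of the recommendations in this section lend themselves to a short sketch:
the worker-count arithmetic from *Resource Appropriately*, and a CMake
invocation reflecting the advice on Release builds with assertions, Ninja, LLD
and ccache. The function names are hypothetical. The CMake options shown are
one possible way to express that advice and are not the configuration of any
particular existing builder.

.. code-block:: bash

  # Workers needed so that builds finish as fast as commits arrive:
  # ceil(build_minutes * commits_per_hour / 60), in integer arithmetic.
  workers_needed() {
      local build_minutes=$1 commits_per_hour=$2
      echo $(( (build_minutes * commits_per_hour + 59) / 60 ))
  }

  # Configure an LLVM build along the lines recommended in this section.
  # usage: configure_fast_build <llvm-project checkout> <build dir>
  configure_fast_build() {
      local src=$1 build=$2
      cmake -G Ninja -S "$src/llvm" -B "$build" \
          -DCMAKE_BUILD_TYPE=Release \
          -DLLVM_ENABLE_ASSERTIONS=ON \
          -DLLVM_USE_LINKER=lld \
          -DLLVM_CCACHE_BUILD=ON
  }

With the figures quoted above, ``workers_needed 30 10`` and
``workers_needed 30 15`` give 5 and 8, and a two-hour build
(``workers_needed 120 10`` and ``workers_needed 120 15``) gives 20 and 30,
which matches the rough ranges in the text.
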
Managing a Worker From The Web Interface
========================================

Tasks such as clearing pending build requests can be done using
the Buildbot web interface. To do this you must be recognised as an admin
of the worker:

* Set your public GitHub profile email to one that was included in the
  ``admin`` information you set up on the worker. It does not matter if this
  is your primary account email or a "verified email". To confirm this has been
  done correctly, go to ``github.com/<your GitHub username>`` and you should
  see the email address listed there.

  A worker can have many admins, if they are listed in the form
  ``First Last <first.last@example.com>, First2 Last2 <first2.last2@example.com>``.
  You only need to have one of those addresses in your profile to be recognised
  as an admin.

  If you need to add an email address, you can edit the ``admin`` file and
  restart the worker. You should see the new admin details in the web interface
  shortly afterwards.

* Connect GitHub to Buildbot by clicking on the "Anonymous" button on the
  top right of the page, then "Login with GitHub" and authorise the app.

Some tasks don't give immediate feedback, so if nothing happens within a short
time, try again with the browser's web console open. Sometimes you will see
403 errors and other messages that might indicate you don't have the correct
details set up.

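
The admin list described above can also be updated from a script. This is a
minimal sketch, assuming the ``admin`` file lives at ``info/admin`` inside the
worker root directory, which is where ``buildbot-worker create-worker``
normally places it. The function name ``set_worker_admins`` is hypothetical.

.. code-block:: bash

  # Write a comma-separated admin list to <worker-dir>/info/admin.
  # usage: set_worker_admins <worker-dir> "Name <email>" ["Name <email>" ...]
  set_worker_admins() {
      local worker_dir=$1 admins=$2 admin
      shift 2
      for admin in "$@"; do
          admins="$admins, $admin"
      done
      printf '%s\n' "$admins" > "$worker_dir/info/admin"
  }

After calling it, for example with
``set_worker_admins <buildbot-worker-root-directory> 'First Last <first.last@example.com>'``,
restart the worker with ``buildbot-worker restart <buildbot-worker-root-directory>``
so that the new details reach the buildmaster.
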