cmd / cibot

cibot is a self-hosted Continuous Integration (CI) tool for private GitHub monorepos, designed for the workflow described in git / monorepo.

It runs checks within seconds of opening a pull request, avoiding the multi-tenant queues and cold container startups typical of hosted CI platforms.

Architecture

cibot has two entrypoints:

cibot farmer
cibot worker

The farmer is an HTTP server backed by Postgres. It receives GitHub webhooks for pull request events (opened, closed, reopened, synchronize), identifies which directories changed, walks up the directory hierarchy to find Testfiles, and saves test jobs to its database.
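
In sketch form, the webhook side could look like this in Go; the handler, port, and choice of payload fields are illustrative assumptions, not cibot's actual code:

package main

import (
    "encoding/json"
    "log"
    "net/http"
)

// pullRequestEvent picks out the few fields of GitHub's pull_request
// webhook payload that this sketch needs.
type pullRequestEvent struct {
    Action      string `json:"action"`
    PullRequest struct {
        Head struct {
            SHA string `json:"sha"`
        } `json:"head"`
    } `json:"pull_request"`
}

func handleWebhook(w http.ResponseWriter, r *http.Request) {
    var ev pullRequestEvent
    if err := json.NewDecoder(r.Body).Decode(&ev); err != nil {
        http.Error(w, "bad payload", http.StatusBadRequest)
        return
    }
    switch ev.Action {
    case "opened", "reopened", "synchronize":
        // List changed files, find Testfiles, insert jobs
        // (sketched in the following sections).
    case "closed":
        // Cancel any pending jobs for this pull request.
    }
    w.WriteHeader(http.StatusAccepted)
}

func main() {
    http.HandleFunc("/webhook", handleWebhook)
    log.Fatal(http.ListenAndServe(":8080", nil))
}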

The worker long polls the farmer for jobs. When it receives one, it checks out the commit, finds the Testfile in the job's directory, and runs the matching command. It reports the result (success, failure, or error) back to the farmer, which posts a GitHub status check.
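
A minimal sketch of that loop, assuming a /jobs/next endpoint and a JSON job shape invented for illustration:

package worker

import (
    "encoding/json"
    "net/http"
    "time"
)

// job mirrors a hypothetical JSON shape for assignments; cibot's
// actual protocol is not documented here.
type job struct {
    ID      int    `json:"id"`
    Commit  string `json:"commit"`
    Dir     string `json:"dir"`
    Name    string `json:"name"`
    Command string `json:"command"`
}

// Run long polls the farmer and executes jobs one at a time.
func Run(farmerURL string, run func(job)) {
    for {
        // The farmer holds this request open until a job is
        // assigned to us or a timeout elapses.
        resp, err := http.Get(farmerURL + "/jobs/next")
        if err != nil {
            time.Sleep(time.Second) // farmer unreachable; retry
            continue
        }
        if resp.StatusCode == http.StatusNoContent {
            resp.Body.Close() // poll timed out with no job
            continue
        }
        var j job
        err = json.NewDecoder(resp.Body).Decode(&j)
        resp.Body.Close()
        if err != nil {
            continue
        }
        // Check out the commit, run the Testfile command, and report
        // success/failure/error back to the farmer (omitted here).
        run(j)
    }
}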

Run more worker instances to increase test parallelism.

Testfile

A Testfile defines test jobs for its directory. Each line contains a name and command separated by a colon:

gotest: go test -race -cover ./...
lint: goimports -l . | grep . && exit 1 || true

The name appears as a GitHub status check. The command is run by a worker in /bin/bash -eo pipefail.

Comments start with #. Blank lines are ignored.
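
For concreteness, a parser for this format might look like the sketch below; the type and function names are mine, not cibot's:

package testfile

import (
    "bufio"
    "fmt"
    "io"
    "strings"
)

// Entry is one "name: command" line of a Testfile.
type Entry struct {
    Name    string
    Command string
}

// Parse reads entries from a Testfile, skipping blank lines and
// lines starting with #.
func Parse(r io.Reader) ([]Entry, error) {
    var entries []Entry
    sc := bufio.NewScanner(r)
    for n := 1; sc.Scan(); n++ {
        line := strings.TrimSpace(sc.Text())
        if line == "" || strings.HasPrefix(line, "#") {
            continue
        }
        name, cmd, ok := strings.Cut(line, ":")
        if !ok {
            return nil, fmt.Errorf("line %d: missing ':'", n)
        }
        entries = append(entries, Entry{
            Name:    strings.TrimSpace(name),
            Command: strings.TrimSpace(cmd),
        })
    }
    return entries, sc.Err()
}

A worker would then run each entry with something like exec.Command("/bin/bash", "-eo", "pipefail", "-c", entry.Command).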

Place a Testfile at the root of the repo to run jobs on every pull request. Place Testfiles in subdirectories to run jobs only when files in those directories change.

When a pull request is opened, the farmer:

  1. Gets the list of changed files from the GitHub API
  2. Collects the directories containing those files
  3. Walks up each directory to its parents
  4. Fetches Testfiles at each level
  5. Creates jobs for each entry

This means a change to sdk/go/account.go will run Testfiles in sdk/go/, sdk/, and /.
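
Steps 2 and 3 reduce to a small walk over the changed paths; here is a sketch assuming slash-separated paths relative to the repo root:

package farmer

import "path"

// affectedDirs returns each changed file's directory and all of its
// parents up to the repo root ("."), deduplicated.
func affectedDirs(changedFiles []string) []string {
    seen := map[string]bool{}
    var dirs []string
    for _, f := range changedFiles {
        for d := path.Dir(f); !seen[d]; d = path.Dir(d) {
            seen[d] = true
            dirs = append(dirs, d)
            if d == "." {
                break // reached the repo root
            }
        }
    }
    return dirs
}

Calling it with sdk/go/account.go returns sdk/go, sdk, and "." (the repo root written as / above).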

Why it's fast

Hosted CI services often start checks 30-60 seconds after a pull request is opened. Common causes are multi-tenant queues (noisy neighbors) and containerization (cache misses on every run).

cibot workers run on dedicated hosts with all dependencies pre-installed. The repo is already cloned; the worker does a git fetch and git reset --hard to the target commit. Jobs typically begin within 1-2 seconds of a webhook.
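
That checkout step can be as small as the following sketch; the function name and flag choices are assumptions:

package worker

import (
    "fmt"
    "os/exec"
)

// checkout updates an existing clone to the target commit instead of
// cloning from scratch.
func checkout(repoDir, commit string) error {
    for _, args := range [][]string{
        {"fetch", "--quiet", "origin"},
        {"reset", "--hard", commit},
    } {
        cmd := exec.Command("git", args...)
        cmd.Dir = repoDir
        if out, err := cmd.CombinedOutput(); err != nil {
            return fmt.Errorf("git %v: %w\n%s", args, err, out)
        }
    }
    return nil
}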

Job scheduling

Postgres triggers assign jobs to workers. When new jobs appear or workers become available, a resolve trigger finds an unassigned job and an idle worker, then inserts an assignment into the run table. Foreign keys cascade deletes: if a worker stops pinging, its assignment is removed and the job returns to the queue.
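
A sketch of how such a trigger could be installed from Go, with the job, worker, and run tables and their columns guessed from the description above (a real version would assign more than one pair per firing):

package farmer

import "database/sql"

// schedulerSQL is a guess at what the resolve trigger could look
// like; the table and column names are assumptions.
var schedulerSQL = []string{
    `CREATE OR REPLACE FUNCTION resolve() RETURNS trigger AS $$
     BEGIN
         -- Pair one unassigned job with one idle worker, if any.
         INSERT INTO run (job_id, worker_id)
         SELECT j.id, w.id
         FROM job j CROSS JOIN worker w
         WHERE NOT EXISTS (SELECT 1 FROM run WHERE run.job_id = j.id)
           AND NOT EXISTS (SELECT 1 FROM run WHERE run.worker_id = w.id)
         LIMIT 1;
         RETURN NULL;
     END;
     $$ LANGUAGE plpgsql`,
    `CREATE TRIGGER resolve_on_job AFTER INSERT ON job
         FOR EACH STATEMENT EXECUTE FUNCTION resolve()`,
    `CREATE TRIGGER resolve_on_worker AFTER INSERT OR UPDATE ON worker
         FOR EACH STATEMENT EXECUTE FUNCTION resolve()`,
}

func installScheduler(db *sql.DB) error {
    for _, stmt := range schedulerSQL {
        if _, err := db.Exec(stmt); err != nil {
            return err
        }
    }
    return nil
}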

Workers ping the farmer every second. If a worker stops pinging for 5 seconds, it is garbage collected.
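
The garbage collection can then be a single statement, with the cascade doing the rest (again with assumed names):

package farmer

import "database/sql"

// gcWorkers deletes workers whose heartbeat is stale. The ON DELETE
// CASCADE foreign key on run.worker_id drops their assignments, which
// puts those jobs back in the queue.
func gcWorkers(db *sql.DB) error {
    _, err := db.Exec(`DELETE FROM worker
        WHERE last_ping < now() - interval '5 seconds'`)
    return err
}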

Output

When a job completes, its output is uploaded to S3. Clicking the "Details" link on a GitHub status check shows the output in the browser, with cancel and retry buttons.
