From 2420f8286e22409c6f170e8a498f14aaa6ffebdd Mon Sep 17 00:00:00 2001
From: 3405691582
Date: Tue, 4 Nov 2025 23:09:37 -0500
Subject: [PATCH] Add a GitHub workflow running OpenBSD.

## Background

CI solutions are a well-known mechanism for ensuring that code gets exercised and tested and that problems are detected as early as possible. Swift uses a Jenkins-based solution, and if additional nodes are to test additional configurations, this must be done by the community, which requires careful integration with the rest of the existing CI infrastructure, complex software setups, large dependencies, and, most importantly, costly cloud resources.

GitHub workflows are an alternative mechanism that enjoys some support on Swift satellite projects. Crucially, these can use software containers to provide clean and reproducible environments in which to run code targeting a specific userspace distribution for a given kernel. Concretely, this means that a container host running Ubuntu can test in a Debian environment, for example, since the Debian container shares the same kernel as the Ubuntu host.

This is complicated when we want to test platforms that do not share the same kernel. Cross-compilation is one approach, but it is incomplete: for example, the target may have platform-specific runtime requirements that would not be exercised when cross-compiling.

The obvious solution to reach for is virtualization: a container that runs a virtual machine with our alternate kernel. If we have a container that runs a virtual machine for our target containing a Swift toolchain, can pass it the code that we have checked out, and can get back the results, then we can simply run that container as a GitHub workflow and achieve our goal.

There are some difficulties in this, naturally. We need an operating system image and a regular scheme for providing the VM with inputs and outputs. While there have been many advances in `virtio`-based schemes for transferring data efficiently between host and guest, such as `virtio-9p-pci`, `virtio-vsock`, or even `virtio-pmem`, these require support from within the guest. Disk devices enjoy more robust guest support.

While we can use Containerfiles to orchestrate the behavior of the Linux-based container running the virtual machine, we also need a way to orchestrate the inner VM's behavior without requiring user input. The `cloud-init` instance initialization system, used with virtual machines running in cloud infrastructure, provides a natural solution here. cloud-init allows virtual machines to be initialized from an HTTP server or from local media (referred to as NoCloud). Running an HTTP server isolated to a container and scoped solely to one VM can be tedious to get right. Here, we use local media: if a virtual machine with cloud-init installed boots with a FAT or ISO9660 image labeled CIDATA that contains the two files `meta-data` and `user-data`, cloud-init will use the data within them to set up the instance as described in those files (a sketch of building such an image follows at the end of this section).

The operating system running in the virtual machine should ideally have minimal dependencies, so that an accurate accounting can be made of any additional dependencies required. cloud-init, however, has several dependencies of its own, chiefly on Python. pyinstaller can be used to prepackage cloud-init and those dependencies into a single standalone binary.
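To make the NoCloud mechanism concrete, here is a minimal sketch of building a CIDATA seed image with xorriso. The directory layout and xorriso flags are illustrative assumptions; the actual invocation baked into the `openbsd-swift` container may differ.

```sh
# Sketch only: build a NoCloud seed image that cloud-init will recognize.
mkdir -p cidata
cat > cidata/meta-data <<'EOF'
{ "instance-id": "iid-local01", "dsmode": "local" }
EOF
cat > cidata/user-data <<'EOF'
#cloud-config
timezone: UTC
EOF

# ISO9660 with Rock Ridge/Joliet extensions and the volume label CIDATA; any
# other read-only inputs (such as a repository checkout) can go in cidata/ too.
xorriso -as mkisofs -output seed.iso -volid CIDATA -joliet -rock cidata/
```

cloud-init's NoCloud datasource detects the volume by its CIDATA label at boot, which is why no network or HTTP metadata service is needed inside the container.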
## Implementation

To run Swift inside a VM in a GitHub workflow, we need the following: an operating system disk image, a pyinstaller-prepared version of cloud-init, a toolchain, and the commands to run `swift build` or `swift test` against a Swift project. The toolchain and its dependencies could be installed by prepending extra package installation steps before running `swift` in the VM environment, but this costs time and network resources every time the VM executes. Ideally, the operating system image should already have everything minimally required preinstalled.

As OpenBSD does not support the newer virtio features for passing data between the host and the guest VM, we need to use disk images. For input data, we have two options: providing the VM with a single disk that serves both for external inputs and for additional scratch space, or supplying a read-only disk image with the external inputs and a separate read-write image for scratch. The latter approach turns out to be more natural. The CIDATA volume must be either FAT or ISO9660, but FAT has many limitations, particularly on file names. If we want to share additional data on the volume, ISO9660 (with extensions) is more attractive, but it is read-only inside the VM.

Output data must be readable by the Linux container. While OpenBSD and Linux both support e2fsprogs, which allows ext2 volumes to be minted without mounting them, extracting files from an ext2 image without mounting is more difficult, especially since containers cannot easily mount disks. Instead, we exploit tar archives: when the VM wants to transmit data back to the host, it writes the tape archive directly to a virtualized disk drive. The host converts that disk image from qcow2 back to raw format and reads it as an ordinary tape archive. Some care may be required to specify the correct disk image size to ensure proper format blocking. We also need to know which disk inside the VM corresponds to which disk outside the VM, so disks are specified in the same predictable order: OS image, "scratch" disk, "tape" disk, then CIDATA (a sketch of this arrangement follows at the end of this section).

The steps we want the VM to take need to occur automatically when the operating system boots. For OpenBSD, rc.firsttime(8) is already reserved to start cloud-init, but rc.local(8) is available for site-specific commands. This script runs as root during boot, which leads to some quirks, but it keeps the cloud-init configuration simple: our commands need only be specified as an ordinary script installed to rc.local via the `user-data` cloud-init configuration, one that performs our necessary tasks, writes any necessary output to tape, and then powers off the VM. Since qemu will still exit successfully even if commands running in the VM fail, we write the exit code from within the VM to tape and exit the container explicitly with that exit code, so that success or failure is properly communicated.

I have already constructed the initial OpenBSD disk images, the pyinstaller version of cloud-init, and a build of the Swift 6.2 toolchain, installed them alongside the toolchain's necessary dependencies, and prepared a disk image for use in the container referenced in this commit, `openbsd-swift`. This container is Alpine Linux-based, chosen for low overhead, with qemu installed; qemu-img, to create the additional scratch and tape disks; and xorriso, to create the ISO9660 image with the cloud-init files and any other read-only inputs.
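As a rough illustration of the disk ordering and tape convention described above, the following is a minimal sketch of what the container-side wrapper might do. The disk sizes, file names, and qemu flags are assumptions for illustration; the actual `cmd.sh` shipped in the `openbsd-swift` image may differ in its details.

```sh
#!/bin/sh
# Illustrative sketch only, not the wrapper shipped in the openbsd-swift image.
set -eu

# Scratch and "tape" disks; the tape is kept small and fixed in size so that
# reading it back as a tar archive blocks correctly.
qemu-img create -f qcow2 scratch.qcow2 8G
qemu-img create -f qcow2 tape.qcow2 64M

# Read-only inputs: meta-data, user-data, and the repository checkout, packed
# into an ISO9660 volume labeled CIDATA (as in the earlier sketch).
xorriso -as mkisofs -output cidata.iso -volid CIDATA -joliet -rock \
  /usr/local/share/cidata/

# Attach the disks in a fixed, predictable order (OS, scratch, tape, CIDATA)
# so the guest's rc.local script knows which sd(4) device is which.
qemu-system-x86_64 ${KVM:-} -smp "${CPU:-2}" -m "${MEM:-4G}" -nographic \
  -drive file=openbsd.qcow2,format=qcow2 \
  -drive file=scratch.qcow2,format=qcow2 \
  -drive file=tape.qcow2,format=qcow2 \
  -drive file=cidata.iso,format=raw,media=cdrom

# The guest wrote a tar archive straight onto its tape disk; convert the qcow2
# image back to raw and unpack it like any other tape archive.
qemu-img convert -O raw tape.qcow2 tape.tar
mkdir -p /usr/local/share/tape
tar xf tape.tar -C /usr/local/share/tape
```

Because the extracted files land under `/usr/local/share/tape`, the workflow's final step can simply exit with the value of the `result` file to propagate the guest's exit code, as the workflow below does.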
When the container is run, the qemu wrapper script uses these tools to enforce the above requirements, runs the VM (optionally with KVM acceleration) until it exits, and then extracts the output tape. Thankfully, Linux GitHub runners are reported to consistently support KVM, which means the performance impact is limited. The image is also configured with a volume mount point at `/usr/local/share/cidata` for potential use outside of this GitHub workflow; here, we initialize that location from within the workflow via environment variables, so that neither a volume nor additional files are necessary.

It is important to note that while this commit and approach are intended for OpenBSD, they are certainly not limited to it. The same approach could be used for other platforms, and may even be more efficient if those platforms support additional virtio features.

## Caveats

The toolchain used in this image is still a work in progress. As the toolchain is refined and eventually upstreamed, this container can be updated without needing the workflow to change. The base image, pyinstaller, and toolchain containers will be described elsewhere; the `Containerfile` to create the `openbsd-swift` image may eventually end up on swiftlang/swift-docker. This means that, for now, the container image is relatively opaque.

The workflow does not yet exist on swiftlang/github-workflows; we use swift-testing as a pilot for this workflow before making it available more widely there.

Part of the motivation for introducing this workflow is to detect platform support bugs faster. This does mean, however, that platform support bugs may still be present, and blocking commits unnecessarily may be undesirable. To mitigate this, the workflow is configured to run only on demand, rather than triggering on every pull request. The workflow can still be used to test a pull request manually by specifying the pull request's branch when the workflow is invoked.
---
 .github/workflows/vm.yml | 70 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 .github/workflows/vm.yml

diff --git a/.github/workflows/vm.yml b/.github/workflows/vm.yml
new file mode 100644
index 000000000..0d74ea224
--- /dev/null
+++ b/.github/workflows/vm.yml
@@ -0,0 +1,70 @@
+name: Manual VM run (OpenBSD)
+
+on:
+  workflow_dispatch:
+
+env:
+  META_DATA_CONTENT: |
+    {
+      "instance-id": "iid-local01",
+      "dsmode": "local"
+    }
+  USER_DATA_CONTENT: |
+    #cloud-config
+    timezone: UTC
+    write_files:
+    - content: |
+        set -ex
+        function atexit {
+          echo 1 > /tmp/result
+          tar cvf /dev/sd1c -C /tmp result
+          halt -p;
+        }
+        trap atexit EXIT
+        printf '\033\143'
+        export PATH=/usr/local/bin:$PATH
+        mount /dev/sd3c /mnt
+        cp -r /mnt/repo /home/repo/
+        cd /home/repo/
+        swift test
+        echo $? > /tmp/result
+        tar cvf /dev/sd1c -C /tmp result
+        halt -p
+      path: /etc/rc.local
+      permissions: '0755'
+
+jobs:
+  openbsd:
+    name: OpenBSD
+    runs-on: ubuntu-latest
+    timeout-minutes: 30
+    container:
+      image: ghcr.io/3405691582/openbsd-swift:latest
+      env:
+        CPU: "4"
+        MEM: "16G"
+        KVM: "-enable-kvm"
+      options: --device /dev/kvm
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v5
+
+      - name: Write cloud-init files
+        run: |
+          echo "$META_DATA_CONTENT" > /usr/local/share/cidata/meta-data
+          echo "$USER_DATA_CONTENT" > /usr/local/share/cidata/user-data
+
+      - name: Prepare cloud-init
+        run: |
+          cp -r $GITHUB_WORKSPACE /usr/local/share/cidata/repo/ && \
+          cat /usr/local/share/cidata/meta-data && \
+          cat /usr/local/share/cidata/user-data && \
+          ls /usr/local/share/cidata
+
+      - name: Run
+        run: /usr/local/bin/cmd.sh
+
+      - name: Report
+        run: |
+          ls -l /usr/local/share/tape && \
+          exit $(cat /usr/local/share/tape/result)