build-node-image: add a skip tests flag #1268
base: main
Allow skipping the node image tests. Allow unblocking the nodeimage pipeline when the QEMU artifact for the latest RHCOS build is not available. Patch best viewed without whitespace change.
In what case does that happen? Did someone run with |
|
@dustymabe Looks like we have encountered this issue in the |
|
https://jenkins-rhcos--prod-pipeline.apps.int.prod-stable-spoke1-dc-iad2.itup.redhat.com/job/build/2573/parameters/ had Rather than skipping tests here, we can simply rerun the build-node-image job later. I don't see much value in skipping tests. |
|
Actually, it looks like the real problem with https://jenkins-rhcos--prod-pipeline.apps.int.prod-stable-spoke1-dc-iad2.itup.redhat.com/job/build-node-image/1235/ is that x86_64 is trying to download different images than the other arches (again, because of EARLY_ARCH_JOBS). But EARLY_ARCH_JOBS isn't the entire problem here; it just exposes it. I think the real problem is that we're not enforcing that the tests run against the same RHCOS build for all arches. We should probably enforce that we download the same RHCOS qemu image that the node image is based on, i.e. we need to update fedora-coreos-pipeline/jobs/build-node-image.Jenkinsfile (lines 198 to 200 in 9cf0417) to replace --find-build-for-arch with --build=$BUILDID, where $BUILDID is the RHCOS build ID we found to have been used to build the node image.
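To make the "same RHCOS build for all arches" check concrete, here is a minimal Python sketch. It is not taken from the pipeline: the function name, the per-arch mapping, and the build IDs are all hypothetical, and it only illustrates comparing each arch's resolved build ID against the one the node image was built from.

```python
def find_build_mismatches(build_ids_by_arch, expected_build_id):
    """Return {arch: build_id} for every arch whose resolved RHCOS build
    differs from the build the node image was based on.

    Both arguments are illustrative assumptions, not pipeline data structures.
    """
    return {
        arch: build_id
        for arch, build_id in sorted(build_ids_by_arch.items())
        if build_id != expected_build_id
    }


# Hypothetical build IDs: x86_64 has resolved a newer, in-progress build.
resolved = {
    "x86_64": "build-B",
    "aarch64": "build-A",
    "s390x": "build-A",
    "ppc64le": "build-A",
}
mismatches = find_build_mismatches(resolved, "build-A")
if mismatches:
    print(f"arches not testing against build-A: {mismatches}")
```

Passing an explicit build ID to the download step, as proposed above, would make such a mismatch impossible by construction; a check like this is only useful as a guard.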
|
The issue @Roshan-R hit when working on this is that we upload incomplete builds for rhcos. So you'd have to parse the meta file first to find a build with all the arches. |
If the
👍 |
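The "parse the meta file first to find a build with all the arches" step could look roughly like the Python sketch below. It assumes the build metadata has already been loaded into a list of entries, newest first, each with an "id" and a list of "arches"; that layout, the function name, and the build IDs are assumptions for illustration, not something confirmed in this thread.

```python
def latest_complete_build(builds, required_arches):
    """Return the ID of the first (assumed newest) build that has all
    required arches, or None if no build is complete.

    `builds` is assumed to be a newest-first list of dicts with
    "id" and "arches" keys (an illustrative layout).
    """
    required = set(required_arches)
    for build in builds:
        if required.issubset(build.get("arches", [])):
            return build["id"]
    return None


# Hypothetical metadata: the newest build is still missing three arches.
builds = [
    {"id": "build-B", "arches": ["x86_64"]},
    {"id": "build-A", "arches": ["x86_64", "aarch64", "s390x", "ppc64le"]},
]
arches = ["x86_64", "aarch64", "s390x", "ppc64le"]
print(latest_complete_build(builds, arches))
```

With the sample data, the incomplete newest build is skipped and the older complete one is selected, which is the behavior needed while incomplete builds are still being uploaded.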
|
I don't have the understanding of the pipeline that anyone else here does, but it looks like a change triggered ART automation to kick off build-node-image independent of the rhel-9.6 build, build-arch, and release job completion. The build-node-image jobs that were subsequently triggered by the successful release job were just fine. Could the build, build-arch, and release jobs relevant to downstream build-node-image jobs just hold a lock that prevents starting new instances until complete? Leveraging the early arch builds flag is highly valuable as it trims the overall pipeline duration by at least an hour. |
…uild This ensures we don't somehow pick up a different base qemu image than what we were built on. It also eliminates some awkward race conditions where a newer in progress RHCOS build was causing node image tests to fail. xref: coreos#1268
|
I opened #1279 |
Not necessarily. ART triggers the build-node-image job multiple times a day.
Yes, but back when this job was written, we were allowing incomplete builds to be released, which is why we had to use |