We only use DIB-based IPA ramdisks and have changed the bifrost job
names.
Depends-On: I569a766826405513f7beab5d45a52a8bbf42ddfd
Change-Id: I8dc17087d595872d660c9a90c8dbafef268ad02a
Signed-off-by: Riccardo Pittau <elfosardo@gmail.com>
... which were somehow overlooked in the previous attempt.
Change-Id: I242baa622079a3a4facde4cf19fb1818593fb668
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
... because ironic-inspector has been retired.
Change-Id: Id5568cbac8f559821dffd004ab9b6db3e4f4bca6
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
Our most basic images require a minimum of 2500 MB of RAM for fake nodes.
Signed-off-by: Jay Faulkner <jay@jvf.cc>
Change-Id: I0939f6fbb8dfd91c4e3f20b3a785e6acc9feb9bb
It's complicated, but basically, because we run full-size VMs which
take a while to boot, the multinode tests need a bit more than 600
seconds to deploy a node. They can get there in just about that time
and sometimes even beat the time window, but sometimes the job
times out internally and kills the test run.
This changes the timeout to 2000 seconds, which is more consistent
with other jobs. Independently, the defaults in the tempest plugin
will need to be made sane.
Change-Id: I890d551122489e5a0b3162f08dbc10270968fb00
Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com>
A recent image update has caused CI to begin failing against anaconda. This change is required to unblock it, and must be backported to unblock ironic-tempest-plugin merges.
Change-Id: I6a8a7baf54f7c0718b897f490671e8c3ac946e45
Signed-off-by: Jay Faulkner <jay@jvf.cc>
Console containers are run as systemctl --user units with the stack
user. Unlike in the locally running case, in a job there may be no
active user session running to allow these units to run. This change
ensures there is a stack user service running, and "loginctl
enable-linger" will start one again at boot time. These actions are only
taken when ir-novnc is enabled.
This change also installs the package slirp4netns for the required
user-mode networking, and adds fake-graphical to the list of enabled
console interfaces when ir-novnc is enabled. enabled_console_interfaces
is passed to tempest.conf so that tempest can run tests or not based on
whether fake-graphical is enabled.
Additionally, the console container will bind to a high port on localhost
instead of a high port on the host IP. This still allows
ironic-novncproxy to connect to the vnc endpoint while avoiding iptables
rules.
Change-Id: Ibcd5b7b05c466d898ba69bff35a1e767be3699a3
Signed-off-by: Steve Baker <sbaker@redhat.com>
They are permafailing; the Neutron fix is not ready yet.
Change-Id: Ie5d9f76c97fb08edcd295fdfa82bd0b4539ff410
Signed-off-by: Dmitry Tantsur <dtantsur@protonmail.com>
Removing the snmp CI job, as it doesn't make sense to execute as we're
going to remove it.
Change-Id: I3da676da959fde5d4c858538888ccd7b0682cb3b
Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com>
This is a good canary for schema-related issues. It seems reasonable to
only run on direct API-related changes. Anything that breaks on the SDK
side should be caught by the SDK job.
Change-Id: I2d6ba3666e569f867fd13b695d16d13e44e3fd44
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Unlike the existing Metal3 job, this one covers a large number of
different Ironic configurations and is also sensitive to performance
regressions on the API layer.
Claude Code was used for the initial pass of converting the existing
GitHub workflow to Ansible.
Assisted-By: Claude Code
Change-Id: I80490c4ca89ab40d3cdc4ced7964d3dc06cd9a05
Signed-off-by: Dmitry Tantsur <dtantsur@protonmail.com>
This reverts commit 907df2c40a
as the CentOS Stream maintainers have indicated that yesterday's
push to the CDN fixed the build issues related to mirroring the
images.
Change-Id: I8cfcbe83267c3fc28bec117248b7cb3caa42197f
Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com>
It appears we have an address conflict involving the ir-novnc
service, which we don't need for these jobs, so disable the
extra service.
Change-Id: I28fc766f62d9dda93f2d3469eaaec73e63057415
Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com>
This reverts commit bd69ef1c57
which was a temporary change in order to facilitate the switch-over
from using eventlet to using threading.
Change-Id: I41efb4ed7c63d67fc1f709055727e624717c91eb
Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com>
In order to merge eventlet removal patches and fixes related to
metal3, we need to merge four separate changes in series to get
the metal3 CI job back into a happy state. This is because:
* Eventlet removal changes the process model to sub-processes.
* Metal3 Integration uses sqlite, which we've learned in the past
  can have locking issues between processes.
* The fix requires removal of direct database calls from the
  API surface, and instead for calls to be routed through the RPC
  layer.
Once the changes are merged together, the metal3 job has been shown
to work in other test runs, so we have high confidence overall. We
just unfortunately need to give ourselves a window where the job is
not passing in order to merge the sequence of changes which will be
stacked after this change.
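
For illustration only, the kind of sqlite lock contention described
above can be sketched as follows. This is a minimal sketch, not code
from Ironic or Metal3: the function name, the table, and the file path
are hypothetical, and two connections in one process stand in for the
separate sub-processes. It only shows that a second writer fails
immediately while another connection holds the write lock.

```python
import os
import sqlite3
import tempfile


def demonstrate_lock_contention():
    """Return the error a second writer sees while the first holds the lock.

    Hypothetical helper: two connections in one process stand in for two
    separate processes sharing a single sqlite database file.
    """
    path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")

    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN/ROLLBACK statements below control the transactions.
    writer = sqlite3.connect(path, isolation_level=None)
    writer.execute("CREATE TABLE nodes (uuid TEXT)")
    writer.execute("BEGIN IMMEDIATE")  # take the write lock and hold it
    writer.execute("INSERT INTO nodes VALUES ('a')")

    # timeout=0 makes the second connection fail at once instead of waiting.
    other = sqlite3.connect(path, timeout=0, isolation_level=None)
    try:
        other.execute("BEGIN IMMEDIATE")
        return "acquired"
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        other.close()
        writer.execute("ROLLBACK")
        writer.close()


if __name__ == "__main__":
    print(demonstrate_lock_contention())
```

Routing database access through a single owner, as the RPC-layer fix
does, avoids having several processes compete for that write lock.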
Change-Id: I3e6cb1c25b04ff965fa40ff6dbac9bd1bb53c44b
Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com>
The metal3 job has operated under very tight memory conditions for
quite some time. Specifically, running minikube, ironic, and two VMs
in 8 GB of memory also influenced the job design.
However, with the removal of eventlet, the memory footprint and
process model swell a little, creating conditions where the
job is essentially guaranteed to fail. Further discussion with
Riccardo, one of the other Ironic/Metal3 contributors, revealed
that they were already thinking of increasing the size of the
VM because they had been encountering memory issues.
Given that this job already runs lean, albeit with no swap,
which is intentional to prevent noisy neighbor conditions
generated by disk IO, the only real choice is to increase the VM
size to the next realistic size in OpenDev CI.
Change-Id: I87f9c94e6585347d8a35a1d04dd7d101a9e68261
Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com>
Log steps performed during step-based flows in Node History
at the beginning and at completion (or abort).
Closes-Bug: #2106758
Change-Id: Ieffacf174180036d6a2418a8faf72a94eea74fb8
Signed-off-by: Afonne-CID <afonnepaulc@gmail.com>
The snmp job has started failing because we don't have
an event reconciliation loop for asyncio in the main service.
It seems the asyncio version of pysnmp is somehow slipping into
the CI job, which is ultimately what breaks us at this point, although
I see no actual sign of it in the logs. It's just weird.
In any event, it is a known issue.
The other issue is the grenade jobs are failing on neutron upgrades.
This appears to be due to legacy names being used in the CI job
configuration.
We should fix those, but we've got an open bug for that now:
https://bugs.launchpad.net/ironic/+bug/2118780
Change-Id: I1fbe4b0c519b5911db6f92e2963df99a882fa317
Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com>
While looking at issue reports, I noticed we are likely stressing
the ramdisk too much and running it out of space.
Anyhow, one of our CI jobs fails more than others, and it needs to
be swapped around so the image is not downloaded, then converted in
the ramdisk. It is the only job which does it, and we should keep that
behavior, but we need to get CI in a happier place first.
Related-Bug: #2116135
Change-Id: I77c30c370cf5288703663e495ab9e60f3e8a7b2e
Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com>
We're seeing some CI jobs fail due to the ramdisk running out of
storage. This increases the memory allocation slightly, which
should hopefully let CI jobs pass cleanly.
Change-Id: Iec639cfc029065e378eb69f09200bf92d2313ee0
Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com>
By putting CI job descriptions into the place they are defined, it will
be much more difficult to forget to update the documentation.
Change-Id: I7836fa3d2f6adf6a97762a6cd13b92177a2cd12e
Adds an advanced operations standalone test which uses a newly proposed
tempest test to execute against the API. The test exercises the dhcp-less
virtual media path AND passes the node through the rebuild scenario case
which has been identified as problematic in the past.
With this, we know it works, so \o/.
Presently non-voting.
Change-Id: Ibb6f9228672966c3708227e37bead6a45648e177
This change moves multinode jobs to run across multiple
"compute" nodes with an increased amount of memory, which increases
the overall test resources available and limits controller node
hot spotting for deployment operations.
This effectively changes multinode jobs from being a single
compute node with a single controller node, to two compute
nodes and a single controller node. The number of virtual machines
hosted on the controller node is also dialed back.
This was done to eliminate usage of tinyipa in favor of a more
realistic CentOS-based IPA ramdisk, and also removes fallback
logic to use tinyipa on more limited resource nodes.
Change-Id: Ib52f7039072901ce72ac96e660d35a10cca59737
Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com>
Our CI broke this morning with an error about the start_neutron_api
function not existing. That function was removed from devstack
in Nov 2022 in a52041cd3f067156e478e355f5712a60e12ce649.
Upon further research, is_service_enabled neutron-api appears to have
been returning false, even for neutron-enabled jobs, for a long time. A
recent fix to this behavior in devstack exposed this dead code and the
lurking breakage which we've now experienced.
I'm making the assumption that since our CI has been fine for 2.5 years
without this code block running, it'll be fine for now too. We likely
want to follow up on whether missing these calls has a side effect.
This change also disables voting for the
ironic-tempest-ovn-uefi-ipxe-ipv6 CI job which is in a similar state
as the other CI jobs which use networking-generic-switch.
This is a side effect of the uwsgi-launched neutron lacking the
configuration files to load plugins. That issue is being worked
separately, and once networking-generic-switch is fixed, the job will
be returned to voting status.
Change-Id: If47e74751ba66a1296f16d9c43433033c04beffb
This change:
- Moves ironic-standalone jobs to use 32GB nodes, which is a
  relatively simple change.
- Changes other jobs, excluding multinode jobs, to use DIB image
  builds by default.
- Changes one of the job names to remove tinyipa from the name.
- Notes a job which can be removed, but removal will be in
  a later change... and adds a release note in case anyone looks.
Change-Id: If9110c8f5041428df3e59f40fe0cb71bcf8580a8