7.x (2025.2 “Flamingo”)
v7.0.0-beta.3-2
Other Notes
Add the OctaviaLoadBalancerMultipleMaster alert to detect when an Octavia load balancer has multiple MASTER Amphorae for more than 15 minutes. Having multiple masters may cause frequent port revision and rebinding across different hosts, resulting in performance degradation.
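The condition behind this alert can be illustrated with a small sketch. This is not the actual Prometheus rule shipped with Atmosphere; the sample layout (timestamp, load balancer ID, number of MASTER Amphorae) and the function name are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta

THRESHOLD = timedelta(minutes=15)  # duration taken from the release note


def multi_master_alerts(samples, now):
    """Return IDs of load balancers that have had more than one MASTER
    Amphora continuously for longer than THRESHOLD.

    samples: iterable of (timestamp, lb_id, master_count), sorted by time.
    This tuple layout is hypothetical, not an Octavia or Prometheus format.
    """
    since = {}  # lb_id -> time the current multi-master streak started
    for ts, lb_id, master_count in samples:
        if master_count > 1:
            since.setdefault(lb_id, ts)  # keep the start of the streak
        else:
            since.pop(lb_id, None)  # streak broken, reset
    return sorted(lb for lb, start in since.items() if now - start > THRESHOLD)
```

A load balancer that briefly drops back to a single master has its streak reset, so only a sustained condition would be reported.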
Updated Ceph monitoring with new dashboards for Non-Volatile Memory Express (NVMe) over Fabrics, multi-cluster overview, Server Message Block (SMB), and application overview, along with improved alerts and metrics.
Updated node exporter monitoring with improved alerts and dashboards for system monitoring.
Updated CoreDNS monitoring with refined alerts for DNS query performance and forwarding issues.
Fixed alert manager configuration for alert routing.
v7.0.0-beta.3
New Features
Added support for ovsinit, which coordinates the roll-out of Open vSwitch pods to minimize network disruption during upgrades.
Upgrade kubectl to version v1.34.1, mod_auth_openidc to version 2.4.18.1, and noVNC to version v1.6.0 in the respective container images.
v7.0.0-beta.2
Bug Fixes
Fixed the image publishing jobs so that they correctly build and publish images again.
v7.0.0-beta.1
New Features
The Valkey service is now available in Atmosphere. It is a required service for introducing Octavia Amphora V2 support.
Add a specific helm-toolkit patch on top of 0.2.78. This makes the DB drop and init jobs compatible with SQLAlchemy 2.0.
Octavia Amphora V2 is now supported and enabled by default in Atmosphere. The Amphora V2 provider driver improves control plane resiliency. Should a control plane host go down during a load balancer provisioning operation, an alternate controller can resume the in-process provisioning and complete the request. This solves the issue of resources stuck in PENDING_* states by writing information about task states to a persistent backend and monitoring job claims via a jobboard.
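The resume behaviour described above can be pictured with a toy model. The sketch below is an in-memory stand-in, not Octavia's implementation: the JobBoard class, its methods, and the task names are all hypothetical and exist only to show how persisted progress lets a second controller finish a job.

```python
class JobBoard:
    """Toy in-memory stand-in for a persistent jobboard (hypothetical API)."""

    def __init__(self):
        # job_id -> {"tasks": [...], "done": int, "owner": str or None}
        self.jobs = {}

    def post(self, job_id, tasks):
        self.jobs[job_id] = {"tasks": list(tasks), "done": 0, "owner": None}

    def claim(self, job_id, controller):
        job = self.jobs[job_id]
        if job["owner"] is None:
            job["owner"] = controller
            return True
        return False

    def abandon(self, job_id):
        # Models a claim expiring after its owner went down.
        self.jobs[job_id]["owner"] = None


def run(board, job_id, controller, crash_after=None):
    """Run the remaining tasks; optionally 'crash' after N tasks in this run."""
    if not board.claim(job_id, controller):
        return []
    job = board.jobs[job_id]
    executed = []
    while job["done"] < len(job["tasks"]):
        if crash_after is not None and len(executed) == crash_after:
            board.abandon(job_id)  # simulate a controller failure
            return executed
        executed.append(job["tasks"][job["done"]])
        job["done"] += 1  # progress is recorded after every task
    del board.jobs[job_id]  # job complete
    return executed
```

Because progress is stored with the job rather than in the controller, the second controller picks up at the first unfinished task instead of leaving the resource in a pending state.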
Add the confluent-kafka Python package to OpenStack images to enable the use of Kafka for notifications.
The Keystone role now supports additional parameters when creating the Keycloak realm to allow for the configuration of options such as password policy, brute force protection, and more.
Added support for deploying the frr-k8s chart for BGP routing with OVN. Introduced the ovn_bgp_agent_enabled flag. When set to true, the frr-k8s chart will be automatically installed before OVN deployment.
Add the glance_image_tempfile_path variable to allow users to change the temporary path used for downloading images before uploading them to Glance.
Keycloak is now configured to have the token-exchange and the admin-fine-grained-authz features enabled to allow for use of the OAuth Token Exchange protocol.
The Keystone role now supports configuring multi-factor authentication for the users within the Atmosphere realm.
Add Neutron plugins for neutron-dynamic-routing and networking-generic-switch. These modules enable support for Neutron BGP agents and Ironic networking.
Add support for a Neutron policy check when performing a port update that adds address pairs. This adds a POST method /address-pair, which checks whether both ports (to be paired) were created within the same project. With this check, non-admin users can be allowed to manage address pair bindings without the risk of exposing resources to other projects.
The ovn-bgp-agent has been added to the chart. The ovn-bgp-agent is deployed as a DaemonSet within the OVN Helm chart.
Add OVN BGP Agent image build.
Introduced a new Rust-based binary ovsinit, which focuses on handling the migration of IP addresses from a physical interface to an OVS bridge during the Neutron or OVN initialization process.
Added udev rules for Pure Storage devices to optimize iSCSI LUN performance. The rules:
- Set the I/O scheduler to none for improved throughput.
- Reduce CPU usage by disabling entropy collection.
- Balance CPU load by directing I/O completions to the originating CPU.
- Increase the HBA timeout to 60 seconds for reliable I/O operations.
Added a basic Atmosphere upgrade process.
It is now possible to configure DPDK interfaces using interface names, in addition to the pci_id, to ease deploying in heterogeneous environments.
All roles that deploy Ingress resources as part of the deployment process now support the ability to specify the class name to use for the Ingress resource. This is done by setting the <role>_ingress_class_name variable to the desired class name.
Introduced the ability to specify a prefix for image names. This allows for easier integration with image proxies and caching mechanisms, eliminating the need to maintain separate inventory overrides for each image.
It’s now possible to use the default TLS certificates configured within the ingress by using the ingress_use_default_tls_certificate variable, which will omit the tls section from any Ingress resources managed by Atmosphere.
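To make the effect of the ingress class name and default TLS certificate options concrete, here is a minimal sketch that builds a networking.k8s.io/v1 Ingress manifest as a plain dictionary. The function, its parameters, and the secret naming are hypothetical; Atmosphere renders its Ingress resources through its own roles, so treat this only as an illustration of which fields are affected.

```python
def build_ingress(name, host, service, port, class_name=None,
                  use_default_tls_certificate=False):
    """Build a networking.k8s.io/v1 Ingress manifest as a plain dict."""
    spec = {
        "rules": [{
            "host": host,
            "http": {"paths": [{
                "path": "/",
                "pathType": "Prefix",
                "backend": {"service": {"name": service,
                                        "port": {"number": port}}},
            }]},
        }],
    }
    if class_name is not None:
        # Corresponds to setting a <role>_ingress_class_name style variable.
        spec["ingressClassName"] = class_name
    if not use_default_tls_certificate:
        # Per-Ingress certificate; the secret name here is hypothetical.
        spec["tls"] = [{"hosts": [host], "secretName": f"{name}-certs"}]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": spec,
    }
```

With use_default_tls_certificate enabled, the manifest carries no tls section at all, leaving certificate selection to the ingress controller's default.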
Barbican now supports multiple KEKs in configuration. The config value .conf.simple_crypto_plugin_rewrap.old_kek now accepts comma-separated strings for KEK lists, and multiple .conf.barbican.simple_crypto_plugin.kek values can now be specified. The first key in the comma-separated .conf.simple_crypto_plugin_rewrap.old_kek string is used for encrypting new data, while additional keys are used for decrypting existing data. This behavior is consistent with .conf.barbican.simple_crypto_plugin.kek.
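The "first key encrypts, every key can decrypt" rule can be sketched as follows. This is a toy model built on the standard library only: the XOR keystream is not secure and is not how Barbican encrypts secrets, and the function names are hypothetical. It only demonstrates the key selection order for a comma-separated KEK list.

```python
import hashlib
import hmac


def _keystream(key, length):
    """Derive a toy keystream from the key (NOT secure, illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def _xor(data, key):
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def encrypt(keks, plaintext):
    """Encrypt with the first KEK in the comma-separated list."""
    key = keks.split(",")[0].encode()
    body = _xor(plaintext, key)
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def decrypt(keks, token):
    """Try every KEK in order; the tag identifies the matching key."""
    tag, body = token[:32], token[32:]
    for kek in keks.split(","):
        key = kek.encode()
        expected = hmac.new(key, body, hashlib.sha256).digest()
        if hmac.compare_digest(tag, expected):
            return _xor(body, key)
    raise ValueError("no configured KEK can decrypt this data")
```

Placing a new key at the front of the list while keeping the old one behind it lets existing data remain readable while new data uses the new key.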
The Barbican role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
Bump pxc-operator to 1.17.0, which improves observability, reliability, and monitoring. New features include HAProxy/ProxySQL stats endpoints, automatic backup queuing and suspension during cluster recovery, readiness/liveness probes, and Prometheus metrics for backups.
Bump pxc-operator to 1.18.0, which improves observability, reliability, and monitoring. It also improves backup retention for streamlined management of scheduled backups in cloud storage.
The Storpool driver has been updated from the Bobcat release to the Caracal release.
Upgraded OpenStack service containers from Ubuntu 22.04 (Jammy) to Ubuntu 24.04 (Noble). All images now run on the latest Ubuntu LTS release with improved security and enhanced system libraries.
Upgraded OpenStack service containers from Python 3.10 to 3.12, delivering significant performance improvements and better memory management while maintaining backward compatibility.
The Cinder role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
The Designate role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
Atmosphere previously deactivated the Keystone auth token cache due to bug https://tracker.ceph.com/issues/64094. This issue is now resolved upstream, making it safe to reactivate the cache in the new version of Ceph which includes the fix (18.2.7).
The Atmosphere project now includes the Tap-as-a-Service (TaaS) extension for the OpenStack Neutron networking service. This feature introduces local and remote port mirroring capabilities, enabling tenants and cloud administrators to monitor and debug complex virtual networks by capturing and analyzing network traffic associated with virtual machines.
Applied the same pod affinity rules used for the OVN NB/SB StatefulSets to the northd deployment and changed the default pod affinity rules from preferred during scheduling to required during scheduling.
The ovn-northd service did not have liveness probes enabled, which can result in the pod failing readiness checks but not being automatically restarted. The liveness probe is now enabled by default, which will restart any stuck ovn-northd processes.
The Glance role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
The Heat role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
The Horizon role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
The Ironic role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
The Keystone role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
The OpenStack database exporter has been updated and the collection of Octavia metrics happens through it only.
Added alerting for Amphorae to cover cases where an Amphora enters the ERROR state or is not ready for an unexpected duration.
The Magnum role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
The Manila role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
Adjust Neutron policy server to network scope checks for port update or delete operations. This will improve scope check when Neutron goes through policy for port update or delete when allowed-address-pair binding exists.
The Neutron role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
The Nova role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
The Octavia role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
The Open vSwitch container image now uses a more centralized location at ghcr.io/vexxhost/docker-openvswitch. This provides better maintainability and a dedicated repository for the Open vSwitch container images. The image now uses a specific version tag (v3.3.6-2) for better reproducibility and stability.
Neutron now supports using the built-in DHCP agent when using OVN (Open Virtual Network) for cases when DHCP relay is necessary.
Updated Open vSwitch images to use AVX-512 optimized builds for better performance on supported hardware.
The Placement role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
The ovn-controller image is now being pre-pulled on the nodes prior to the Helm chart being deployed. This will help reduce the time it takes to switch over to the new version of the ovn-controller image.
The Staffeln role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
Add the required Neutron plugin to support port mirroring with the OVN backend.
Update the frr-k8s webhook server to run on the control plane.
Upgrade Percona XtraDB Cluster operator from 1.14.0 to 1.16.1 and Percona XtraDB Cluster from 8.0.36-28.1 to 8.0.41-32.1. This update includes performance improvements and bug fixes.
Known Issues
The MTU for the metadata interfaces for OVN was not being set correctly, leading to a mismatch between the MTU of the metadata interface and the MTU of the network. This has been fixed with a Neutron change to ensure the neutron:mtu value in external_ids is set correctly.
Upgrade Notes
Bump Cert-Manager from v1.12.10 to v1.12.17 to address a breaking change in Cloudflare’s API which impacted ACME DNS-01 challenges using Cloudflare.
Bump Kubernetes collection from 2.0.1 to 2.3.2 to fix bugs and add new features.
Bump the Cluster API driver for Magnum from 0.31.2 to 0.33.0 to improve stability, fix bugs and add new features.
Bump the Cluster API driver for Magnum from 0.30.0 to 0.31.2 to improve stability, fix bugs and add new features.
Bump OVN from 24.03.1-44 to 24.03.2.34.
Upgraded Portworx CSI operator to version 25.2.1 from 23.10.5 for improved stability and performance.
Updated Portworx OCI monitor to version 25.4.0 from 3.1.1 to support the latest operator features.
Upgraded RabbitMQ operator to version 2.16.1 from 2.9.0 for improved stability and performance.
Upgraded RabbitMQ server to version 4.1.4 from 3.13.3 for improved stability and performance.
RabbitMQ 4.1.x supports upgrades from 3.13.x and 4.0.x versions.
The max_allowed_packet setting increased from 4M (the default in MySQL 5.x) to 16M to support larger queries. Because MySQL 8.x uses a new default of 64M, the configuration no longer specifies this setting.
Upgrade Cluster API driver for Magnum to 0.26.0.
Upgrade CAPI and CAPO version to 1.10.5 and 0.12.4 respectively.
Security Issues
The Horizon service now runs as the non-privileged user horizon in the container.
The Horizon service ALLOWED_HOSTS setting is now configured to point to the configured endpoints for the service.
The CORS headers are now configured to only allow requests from the configured endpoints for the service.
Set libvirt’s TLS remote API port 16514 to use TLS 1.3 only to improve service security.
Upgrade nginx ingress controller from 1.10.1 to 1.12.1 to fix CVE-2025-1097, CVE-2025-1098, CVE-2025-1974, CVE-2025-24513, and CVE-2025-24514.
Bug Fixes
Applied patch 948053 to resolve database synchronization issues between Neutron and Open Virtual Network (OVN) for log resources. This patch addresses bug 2107925 where the neutron_pg_drop table could be incorrectly deleted during synchronization when existing log resources are present. The fix also updates the Access Control List (ACL) table to maintain proper synchronization of log resources between the Neutron and OVN databases.
Add the missing mdevctl package for the vGPU feature.
The [privsep_osbrick]/helper_command configuration value was not configured in either the Cinder or the Nova service, which led to the inability to run certain CLI commands because they fell back to a plain sudo instead. This has been fixed by adding the missing helper command configuration to both services.
The dmidecode package, which is required by the os-brick library for certain operations, was not installed on the images that needed it, which can cause NVMe-oF discovery issues. The package has been added to all images that require it.
The [cinder]/auth_type configuration value was not set, resulting in the entire Cinder section not being rendered in the configuration file. It is now set to password, which will fully render the Cinder section for OpenStack Nova.
The nova user within the nova-ssh image was missing the SHELL build argument, which would cause live and cold migrations to fail. This has been resolved by adding the missing build argument.
The generic switch networking driver now uses a coordination backend to enable a distributed lock on switches.
During a Neutron or OVN initialization process, the routes assigned to the physical interface are now removed and added to the OVS bridge to maintain the connectivity of the host.
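The move of addresses and routes from the physical interface to the bridge can be pictured with a small in-memory model. This is only a sketch of the idea, not how ovsinit is implemented: the dictionary layout, the function name, and the interface names in the usage below are hypothetical.

```python
def migrate_to_bridge(host, nic, bridge):
    """Move addresses and routes from nic to bridge in an in-memory model.

    host: {"addresses": {ifname: [cidr, ...]},
           "routes": [{"dst": ..., "via": ..., "dev": ifname}, ...]}
    This layout is hypothetical and only serves the illustration.
    """
    addresses = host["addresses"]
    # Addresses leave the physical interface and land on the bridge.
    addresses.setdefault(bridge, []).extend(addresses.pop(nic, []))
    # Routes that pointed at the physical interface now point at the bridge.
    for route in host["routes"]:
        if route["dev"] == nic:
            route["dev"] = bridge
    return host
```

Moving the routes together with the addresses is what keeps the host reachable: a default route still bound to the now address-less physical interface would no longer be usable.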
The Cluster API driver for Magnum has been bumped to 0.28.0 to improve stability, fix bugs and add new features.
The Cluster API driver for Magnum has been bumped to 0.27.0 to improve stability, fix bugs and add new features.
The Cluster API driver for Magnum has been bumped to 0.26.2 to address bugs around cluster deletion.
The Open vSwitch version has been bumped to 3.3.0 in order to resolve packet drops, including the “Packet dropped. Max recirculation depth exceeded.” log messages in the Open vSwitch log.
This change fixes a regression where Cinder volume creation fails with the error FailedToDropPrivileges. Since the update to Cinder 24.0.0, the Cinder-Ceph container needs access to more capabilities for operations such as booting from a volume or creating a volume from an image.
This fix introduces a kernel option to adjust aio-max-nr, ensuring that the system can handle more asynchronous I/O events, preventing VM startup failures related to AIO limits.
Fixed containers failing to validate TLS certificates on Red Hat-based systems. The issue occurred when mounting the OpenSSL trusted certificate bundle (/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt) which uses the “TRUSTED CERTIFICATE” format that’s incompatible with Go applications. The configuration now uses the standard PEM format bundle (/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem) on Red Hat systems, which resolves certificate validation errors.
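The difference between the two bundle formats is visible in the block headers of the files. As a rough sketch (the helper names are hypothetical and this is not part of Atmosphere), a bundle can be checked for non-standard block types like this:

```python
import re

# Matches the label of every "-----BEGIN <LABEL>-----" header in a bundle.
_BLOCK = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")


def bundle_block_types(text):
    """Return the set of block labels found in a CA bundle."""
    return set(_BLOCK.findall(text))


def is_plain_pem_bundle(text):
    """True when every block in the bundle is a plain CERTIFICATE block."""
    return bundle_block_types(text) == {"CERTIFICATE"}
```

A file such as ca-bundle.trust.crt would report the TRUSTED CERTIFICATE label, while tls-ca-bundle.pem is expected to contain only CERTIFICATE blocks.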
Added a custom build of Cluster API driver for OpenStack which includes fixes unblocking upgrades of Magnum clusters created using a specific network or subnet configuration.
Corrected Cinder authentication configuration handling in Nova. Nova now respects authentication overrides defined in OpenStack Helm endpoints, such as openstack_helm_endpoints_nova_region_name.
In an OVN deployment where external (baremetal) ports connect to VLAN networks, you need to bind the internal router port associated with the network to the same ha_chassis_group as the network. This setup mimics how the external port of the router functions in relation to the upstream gateway.
In essence, the baremetal ports aren’t able to communicate with their default gateway if either the internal router port is unbound or if the vrouter doesn’t have an external gateway set, with the external router port bound to the same exact chassis and with the same exact priorities as the ha_chassis_group of the VLAN network.
The Ironic agent for Neutron uses the internal API endpoint by default to avoid hitting the public endpoint unnecessarily.
Manila now uses Nova micro-version 2.60 by default. This change enables support for attaching multiple volumes to an instance.
Manila now connects to the internal Nova and Glance endpoints instead of the public ones. This improves performance and reduces reliance on external network paths.
Fixed an issue in the Manila service image where the fetch-public-ssh-keys systemd service could start too early in the boot process, before the instance metadata service or network was fully available. This caused failures to retrieve and install SSH public keys.
Fixed an issue where the neutron-ironic-agent service failed to start.
Fixed the node-exporter Prometheus monitoring configuration by setting the nodeExporterSelector to filter metrics by job="node-exporter" label. This ensures that node-exporter dashboards and alerts correctly reference the appropriate metrics.
Addressed an issue where instances not booted from volume would fail to resize. This issue was caused by a missing trailing newline in the SSH key, which led to misinterpretation of the key material during the resize operation. Adding proper handling of SSH keys ensures that the resize process works as intended for all instances.
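As an illustration of the kind of normalization involved, key material can be forced to end with exactly one newline before it is written out. This is a sketch with a hypothetical helper name, not the actual change made in Atmosphere:

```python
def normalize_key_material(key_text):
    """Return the key with exactly one trailing newline."""
    # Strip any trailing CR/LF characters, then add a single newline back.
    return key_text.rstrip("\r\n") + "\n"
```

The function is idempotent, so applying it to a key that already ends with a newline leaves it unchanged.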
Fixed the OAuth2 Proxy configuration to enable API access using valid JWT tokens without requiring interactive login. Previously, OAuth2 Proxy enforced login for all requests by default. This change lets the Alertmanager API and other services behind OAuth2 Proxy support programmatic access via JWT tokens.
Fix OctaviaAmphoraNotOperational monitoring rule to exclude DELETED Amphora status.
Fix OctaviaAmphoraNotReady monitoring rule to recognize both READY and ALLOCATED as valid Amphora statuses. Previously, the monitoring rule fired for Amphora instances in ALLOCATED status, which is a normal operational state. The monitoring rule now uses the name OctaviaAmphoraNotOperational to better reflect its purpose of detecting non-operational Amphora instances.
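Taken together, the two Amphora rule fixes amount to the following classification. This is a sketch in plain Python rather than the actual alert expression; the record layout and function name are assumptions for illustration.

```python
OPERATIONAL = {"READY", "ALLOCATED"}  # normal states, must not fire
IGNORED = {"DELETED"}                 # excluded from the rule entirely


def non_operational(amphorae):
    """Return the IDs of Amphorae the rule would flag.

    amphorae: iterable of {"id": ..., "status": ...} dicts (hypothetical).
    """
    return [a["id"] for a in amphorae
            if a["status"] not in OPERATIONAL | IGNORED]
```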
Improve alert generation for load balancers that have a non-ACTIVE provisioning state despite an ONLINE operational state. Previously, if a load balancer was in a transitional state such as PENDING_UPDATE (provisioning_state) while still marked as ONLINE (operational_state), the gauge metric openstack_loadbalancer_loadbalancer_status{provisioning_status!="ACTIVE"} did not trigger an alert. This update addresses the issue by ensuring that alerts are properly generated in these scenarios.
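The corrected behaviour can be expressed as a simple predicate. This is an illustrative sketch and not the actual alert expression; in particular, treating any operating status other than ONLINE as alert-worthy is an assumption made here for completeness.

```python
def should_alert(provisioning_status, operating_status):
    """Decide whether a load balancer state should raise an alert."""
    if provisioning_status != "ACTIVE":
        # The fixed case: fires even when the load balancer is ONLINE.
        return True
    # Assumption: a non-ONLINE operating status is also alert-worthy.
    return operating_status != "ONLINE"
```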
Add required OVN VPN configuration files to Neutron server so VPN features behave as expected. The Neutron server receives RPC calls from the Neutron OVN VPN agent and executes VPN operations. Therefore, VPN configurations must be present on the Neutron server.
When using OVS with DPDK, both OVS and OVN previously ran as the root user by default, which could prevent QEMU from writing the vhost-user socket file in the Open vSwitch runtime directory (/run/openvswitch). This has been fixed by configuring the Open vSwitch and OVN components to run as the non-root user ID 42424, which is the same as the one used by QEMU and other OpenStack services inside the container.
The CI tooling for pinning images has been fixed to properly work after a regression caused by the introduction of the atmosphere_image_prefix variable.
Increased the liveness probe timeouts for the Percona XtraDB Cluster. The configuration now sets timeoutSeconds to 60 and failureThreshold to 100. This change helps the cluster remain responsive and prevents unnecessary restarts during prolonged operations.
Changed the liveness check from the MySQL exporter sidecar to a readiness check. The sidecar should wait indefinitely for the main containers and shouldn’t terminate database pods, especially during long SST operations. This change improves the cluster’s stability during extended operations.
Resolve the issue where the QEMU VNC and API TLS certificate fails to renew, preventing access to the virtual machine (VM) console via the dashboard and causing live migration failures.
Make sure that Staffeln Cinder policy honors the atmosphere_staffeln_enabled setting with boolean values.
The documentation for using the vTPM was pointing to the incorrect metadata properties for images. This has been corrected to point to the correct metadata properties.
Fix two redundant securityContext problems in statefulset-compute-ironic.yaml template.
The Barbican KEK rewrap now checks whether a DB transaction has already started, and uses a nested transaction if the DB session has already started its root transaction.
Fixed an issue preventing automatic certificate renewal for Octavia load balancers. The fix ensures proper TLS certificate mounting for job board communication between Octavia components and Valkey, enabling certificates to renew correctly.
Fixed type errors in networking-generic-switch when users pass numeric configuration values as strings. The driver now automatically converts port numbers and timeout values to their correct types (int or float), preventing ConnectHandler failures when establishing connections to network devices.
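The kind of conversion described here can be sketched as follows. The option names and the helper are hypothetical and do not reflect the driver's actual code or its full option list; the point is only that string values are coerced before being handed to the connection layer.

```python
# Hypothetical option names, chosen only for this illustration.
INT_OPTIONS = {"port"}
FLOAT_OPTIONS = {"timeout", "conn_timeout"}


def coerce_device_config(config):
    """Return a copy of config with numeric strings converted."""
    result = dict(config)
    for key, value in config.items():
        if not isinstance(value, str):
            continue  # already typed, leave untouched
        if key in INT_OPTIONS:
            result[key] = int(value)
        elif key in FLOAT_OPTIONS:
            result[key] = float(value)
    return result
```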
Switched Valkey and Redis exporter images to Bitnami legacy repository due to Bitnami retiring their main registry. The upstream Valkey images don’t work out of the box, so this serves as a temporary workaround.
The designate-producer service runs a single replica instead of three to avoid issues with DNS zone serial updates. This is a workaround until the service has proper centralized locking.
Upgrade the libvirt Helm chart from 0.1.27 to 1.1.0 to address critical issues with pod termination on systems using newer kernels. The updated chart includes proper mounting of the misc cgroup controller, which resolves failures where pods were unable to terminate correctly. This fix ensures stable pod lifecycle management in environments with modern kernel versions.
The Cluster API driver for Magnum is now configured to use the internal endpoints by default in order to avoid going through the ingress and leverage client-side load balancing.
Other Notes
Add documentation about database backup and restore procedures.
The documentation has been updated to include release notes for all of the current supported Atmosphere releases.
Updated Helm Toolkit dependency from version 0.2.69 to 2025.1.8. This update includes improved template consistency, enhanced support for newer Kubernetes versions, and updated helper functions for better maintainability.
The Atmosphere collection now uses the new major version of the OpenStack collection as a dependency.
The libvirt exporter image has switched to ghcr.io/inovex/prometheus-libvirt-exporter, offering greater stability and performance for libvirt metrics collection.
The upload jobs have been removed from the gate pipeline and replaced by the same build jobs since we use the intermediate registry to store the images.
The project has adopted reno for release notes, and all changes are expected to include a release note from now on.
The heavy CI jobs are now skipped when release notes are changed.
The image build process has been refactored to use docker-bake, which allows us to use context/built images from one target to another, allowing for a much easier local building experience. There is no functional change in the images.
The images now use the uv tool to create the virtual environment which is faster and more reliable than the previous method.