2023.2 Series Release Notes

v3.5.0

New Features

  • The Open vSwitch container image is now hosted in a centralized location at ghcr.io/vexxhost/docker-openvswitch, providing better maintainability and a dedicated repository for the Open vSwitch container images. The image is now pinned to a specific version tag (v3.3.6-2) for better reproducibility and stability.

Bug Fixes

  • Fixed the node-exporter Prometheus monitoring configuration by setting the nodeExporterSelector to filter metrics by the job="node-exporter" label (see the illustrative snippet after this list). This ensures that node-exporter dashboards and alerts correctly reference the appropriate metrics.

  • Fixed the OctaviaAmphoraNotOperational monitoring rule to exclude the DELETED Amphora status.

  • Fixed an issue preventing automatic certificate renewal for Octavia load balancers. The fix ensures proper TLS certificate mounting for job board communication between Octavia components and Valkey, enabling certificates to renew correctly.
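
    For reference, the selector mentioned above follows the convention used by the
    node-exporter monitoring mixin; an illustrative value is shown below (the exact
    values key path depends on the monitoring chart in use, so treat this as a
    sketch only):

      nodeExporterSelector: 'job="node-exporter"'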

Other Notes

  • The libvirt exporter image has been switched to ghcr.io/inovex/prometheus-libvirt-exporter, offering greater stability and performance for libvirt metrics collection.

v3.4.7

Bug Fixes

  • Fixed containers failing to validate TLS certificates on Red Hat-based systems. The issue occurred when mounting the OpenSSL trusted certificate bundle (/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt) which uses the “TRUSTED CERTIFICATE” format that’s incompatible with Go applications. The configuration now uses the standard PEM format bundle (/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem) on Red Hat systems, which resolves certificate validation errors.

v3.4.6

Bug Fixes

  • Fixed the OctaviaAmphoraNotReady monitoring rule to recognize both READY and ALLOCATED as valid Amphora statuses. Previously, the rule fired for Amphora instances in the ALLOCATED status, which is a normal operational state. The rule is now named OctaviaAmphoraNotOperational to better reflect its purpose of detecting non-operational Amphora instances.

  • Switched the Valkey and Redis exporter images to the Bitnami legacy repository because Bitnami is retiring its main registry. The upstream Valkey images do not work out of the box, so this serves as a temporary workaround.

  • Upgraded the libvirt Helm chart from 0.1.27 to 1.1.0 to address critical issues with pod termination on systems using newer kernels. The updated chart includes proper mounting of the misc cgroup controller, which resolves failures where pods were unable to terminate correctly. This fix ensures stable pod lifecycle management in environments with modern kernel versions.

Other Notes

  • Updated Helm Toolkit dependency from version 0.2.69 to 2025.1.8. This update includes improved template consistency, enhanced support for newer Kubernetes versions, and updated helper functions for better maintainability.

v3.4.5

Upgrade Notes

  • Upgrade CAPI and CAPO to versions 1.10.5 and 0.12.4, respectively.

v3.4.4

New Features

  • Add the confluent-kafka Python package to OpenStack images to enable the use of Kafka for notifications (a configuration sketch follows below).
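
    As a rough sketch (the broker address is a placeholder, not something
    Atmosphere configures for you), pointing oslo.messaging notifications at a
    Kafka broker typically looks like this:

      [oslo_messaging_notifications]
      driver = messagingv2
      # kafka.example.com:9092 is a placeholder broker address
      transport_url = kafka://kafka.example.com:9092/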

Upgrade Notes

  • Bump the Kubernetes collection from 2.0.1 to 2.3.2 to fix bugs and add new features.

  • Bump the Cluster API driver for Magnum from 0.30.0 to 0.31.2 to improve stability, fix bugs and add new features.

Security Issues

  • Set libvirt’s TLS remote API port 16514 to use TLS 1.3 only to improve service security.
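
    As an illustration, restricting the libvirt TLS listener to TLS 1.3 is normally
    expressed through the GnuTLS priority string in libvirtd.conf; the exact string
    Atmosphere renders may differ, so treat this as a sketch:

      # /etc/libvirt/libvirtd.conf
      tls_priority = "NORMAL:-VERS-ALL:+VERS-TLS1.3"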

Bug Fixes

  • The designate-producer service runs a single replica instead of three to avoid issues with DNS zone serial updates. This is a workaround until the service has proper centralized locking.

v3.4.3

Upgrade Notes

  • Upgraded the Portworx CSI operator from 23.10.5 to 25.2.1 for improved stability and performance.

  • Updated the Portworx OCI monitor from 3.1.1 to 25.4.0 to support the latest operator features.

Bug Fixes

  • Corrected Cinder authentication configuration handling in Nova. Nova now respects authentication overrides defined in OpenStack Helm endpoints, such as openstack_helm_endpoints_nova_region_name.

v3.4.2

New Features

  • Added udev rules for Pure Storage devices to optimize iSCSI LUN performance. The rules:

      - Set the I/O scheduler to none for improved throughput.
      - Reduce CPU usage by disabling entropy collection.
      - Balance CPU load by directing I/O completions to the originating CPU.
      - Increase the HBA timeout to 60 seconds for reliable I/O operations.
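
    The rules follow the commonly recommended tuning for Pure Storage LUNs; the
    sketch below illustrates their general shape and may differ slightly from the
    rules that are actually shipped:

      # Illustrative udev rules for Pure Storage LUNs
      ACTION=="add|change", KERNEL=="sd*[!0-9]", SUBSYSTEM=="block", ENV{ID_VENDOR}=="PURE", ATTR{queue/scheduler}="none"
      ACTION=="add|change", KERNEL=="sd*[!0-9]", SUBSYSTEM=="block", ENV{ID_VENDOR}=="PURE", ATTR{queue/add_random}="0"
      ACTION=="add|change", KERNEL=="sd*[!0-9]", SUBSYSTEM=="block", ENV{ID_VENDOR}=="PURE", ATTR{queue/rq_affinity}="2"
      ACTION=="add|change", KERNEL=="sd*[!0-9]", SUBSYSTEM=="block", ENV{ID_VENDOR}=="PURE", ATTR{device/timeout}="60"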

Upgrade Notes

  • Bump Cert-Manager from v1.12.10 to v1.12.17 to address a breaking change in Cloudflare’s API which impacted ACME DNS-01 challenges using Cloudflare.

v3.4.1

New Features

  • Atmosphere previously deactivated the Keystone auth token cache due to a Ceph bug (https://tracker.ceph.com/issues/64094). This issue is now resolved upstream, making it safe to re-enable the cache now that the Ceph version in use (18.2.7) includes the fix.

  • Upgrade Percona XtraDB Cluster operator from 1.14.0 to 1.16.1 and Percona XtraDB Cluster from 8.0.36-28.1 to 8.0.41-32.1. This update includes performance improvements and bug fixes.

Upgrade Notes

  • The max_allowed_packet setting was previously raised from 4M (the MySQL 5.x default) to 16M to support larger queries. Because MySQL 8.x ships with a larger default of 64M, the configuration no longer specifies this setting.
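
    For illustration only, an explicit override of this setting in a my.cnf-style
    configuration would look like the snippet below; Atmosphere no longer sets it
    and simply relies on the MySQL 8.x default:

      [mysqld]
      # No longer set by Atmosphere; MySQL 8.x defaults to 64M.
      max_allowed_packet = 64M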

Bug Fixes

  • The [cinder]/auth_type configuration value was not set, which resulted in the entire Cinder section not being rendered in the configuration file. It is now set to password, which fully renders the Cinder section for OpenStack Nova (see the snippet after this list).

  • Added a custom build of Cluster API driver for OpenStack which includes fixes unblocking upgrades of Magnum clusters created using a specific network or subnet configuration.

  • Manila now uses Nova micro-version 2.60 by default. This change enables support for attaching multiple volumes to an instance.

  • Manila now connects to the internal Nova and Glance endpoints instead of the public ones. This improves performance and reduces reliance on external network paths.

  • Fixed the OAuth2 Proxy configuration to enable API access using valid JWT tokens without requiring interactive login. Previously, OAuth2 Proxy enforced login for all requests by default. This change lets the Alertmanager API and other services behind OAuth2 Proxy support programmatic access via JWT tokens.

  • Increased the liveness probe timeouts for the Percona XtraDB Cluster. The configuration now sets timeoutSeconds to 60 and failureThreshold to 100 (see the sketch after this list). This change helps the cluster remain responsive and prevents unnecessary restarts during prolonged operations.

  • Changed the liveness check from the MySQL exporter sidecar to a readiness check. The sidecar should wait indefinitely for the main containers and should not cause database pods to be terminated, especially during long SST operations. This change improves the cluster's stability during extended operations.

  • Resolved an issue where the QEMU VNC and API TLS certificate failed to renew, preventing access to the virtual machine (VM) console via the dashboard and causing live migration failures.
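
    For the [cinder]/auth_type fix above, the relevant portion of the rendered
    nova.conf now minimally looks like the following (other options in the section
    are omitted here):

      [cinder]
      auth_type = password

    For the Percona XtraDB Cluster probe change above, the values map to standard
    Kubernetes probe fields; a sketch of the resulting settings (the surrounding
    structure depends on the chart) is:

      livenessProbe:
        timeoutSeconds: 60
        failureThreshold: 100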

Other Notes

  • Add documentation about database backup and restore procedures.

v3.4.0

New Features

  • The Valkey service is now available in Atmosphere. It is a required service for introducing Octavia Amphora V2 support.

  • Octavia Amphora V2 is now supported and enabled by default in Atmosphere. The Amphora V2 provider driver improves control plane resiliency: should a control plane host go down during a load balancer provisioning operation, an alternate controller can resume the in-process provisioning and complete the request. This solves the issue of resources stuck in PENDING_* states by persisting task state information and monitoring job claims via the jobboard (see the configuration sketch after this list).

  • The OpenStack database exporter has been updated, and Octavia metrics are now collected exclusively through it.

  • Added alerting for amphoras to cover cases where an Amphora enters the ERROR state or remains not ready for an unexpected duration.
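
    As a rough sketch of what the Amphora V2 jobboard support mentioned above
    involves on the Octavia side (option names come from upstream Octavia; the
    exact values Atmosphere renders, including the Valkey connection details, are
    omitted here):

      [task_flow]
      jobboard_enabled = True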

Security Issues

  • Upgrade the nginx ingress controller from 1.10.1 to 1.12.1 to fix CVE-2025-1097, CVE-2025-1098, CVE-2025-1974, CVE-2025-24513 and CVE-2025-24514.

Bug Fixes

  • Backport fixes for the Octavia Redis driver to support authentication and SSL for Redis Sentinel, as well as multiple Sentinel servers.

  • The Cluster API driver for Magnum has been bumped to 0.28.0 to improve stability, fix bugs and add new features.

  • Addressed an issue where instances not booted from volume would fail to resize. This issue was caused by a missing trailing newline in the SSH key, which led to misinterpretation of the key material during the resize operation. Adding proper handling of SSH keys ensures that the resize process works as intended for all instances.

  • Improved alert generation for load balancers that have a non-ACTIVE provisioning status despite an ONLINE operating status. Previously, if a load balancer was in a transitional state such as PENDING_UPDATE (provisioning status) while still marked as ONLINE (operating status), the gauge metric openstack_loadbalancer_loadbalancer_status{provisioning_status!="ACTIVE"} did not trigger an alert. This update addresses the issue by ensuring that alerts are properly generated in these scenarios.
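
    An illustrative Prometheus alert rule for the scenario described above (the
    rule name and duration are placeholders, not the exact rule that ships with
    Atmosphere):

      - alert: OctaviaLoadBalancerNotActive   # placeholder name
        expr: openstack_loadbalancer_loadbalancer_status{provisioning_status!="ACTIVE"} > 0
        for: 5m   # placeholder duration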

v3.3.1

New Features

  • The Keystone role now supports additional parameters when creating the Keycloak realm to allow for the configuration of options such as password policy, brute force protection, and more.

  • Add the glance_image_tempfile_path variable to allow users to change the temporary path used for downloading images before uploading them to Glance (see the inventory sketch at the end of this list).

  • The Keystone role now supports configuring multi-factor authentication for the users within the Atmosphere realm.

  • It is now possible to configure DPDK interfaces using the interface names in addition to the pci_id, which eases deployments in heterogeneous environments.

  • All roles that deploy Ingress resources as part of the deployment process now support the ability to specify the class name to use for the Ingress resource. This is done by setting the <role>_ingress_class_name variable to the desired class name (see the inventory sketch at the end of this list).

  • It's now possible to use the default TLS certificates configured within the ingress by using the ingress_use_default_tls_certificate variable, which will omit the tls section from any Ingress resources managed by Atmosphere (see the inventory sketch at the end of this list).

  • The Barbican role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • The Cinder role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • The Designate role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • Applied the same pod affinity rules used for the OVN NB/SB StatefulSets to the northd deployment, and changed the default pod affinity rules from preferred during scheduling to required during scheduling.

  • The Glance role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • The Heat role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • The Horizon role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • The Ironic role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • The Keystone role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • The Magnum role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • The Manila role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • The Neutron role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • The Nova role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • The Octavia role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • The Placement role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.

  • The Staffeln role now allows users to configure the priorityClassName and the runtimeClassName for all of the different components of the service.
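
    A sketch of how the inventory-level options mentioned earlier in this list
    (glance_image_tempfile_path, <role>_ingress_class_name and
    ingress_use_default_tls_certificate) might be set; the path and class name
    below are placeholders, not defaults:

      glance_image_tempfile_path: /var/tmp/glance        # placeholder path
      horizon_ingress_class_name: openstack              # placeholder class name
      ingress_use_default_tls_certificate: true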

Security Issues

  • The Horizon service now runs as the non-privileged user horizon in the container.

  • The Horizon service ALLOWED_HOSTS setting is now configured to point to the configured endpoints for the service.

  • The CORS headers are now configured to only allow requests from the configured endpoints for the service.

Bug Fixes

  • The Cluster API driver for Magnum has been bumped to 0.27.0 to improve stability, fix bugs and add new features.

  • Fix two redundant securityContext problems in the statefulset-compute-ironic.yaml template.

Other Notes

  • The Atmosphere collection now uses the new major version of the OpenStack collection as a dependency.

v3.3.0

New Features

  • Introduced a new Rust-based binary, ovsinit, which focuses on handling the migration of IP addresses from a physical interface to an OVS bridge during the Neutron or OVN initialization process.

  • The StorPool driver has been updated from the Antelope release to the Bobcat release.

Known Issues

  • The MTU for the metadata interfaces for OVN was not being set correctly, leading to a mismatch between the MTU of the metadata interface and the MTU of the network. This has been fixed with a Neutron change to ensure the neutron:mtu value in external_ids is set correctly.

Upgrade Notes

  • Upgrade Cluster API driver for Magnum to 0.26.0.

Bug Fixes

  • During the Neutron or OVN initialization process, the routes assigned to the physical interface are now removed from it and added to the OVS bridge to maintain the connectivity of the host.

  • The Cluster API driver for Magnum has been bumped to 0.26.2 to address bugs around cluster deletion.

  • Updated Manila to utilize device UUIDs instead of device names for mounting operations. This change ensures consistent device identification and prevents device name conflicts that could occur after rebooting the Manila server.

v3.2.12

New Features

  • The ovn-northd service did not have liveness probes enabled, which could result in the pod failing readiness checks without being automatically restarted. The liveness probe is now enabled by default, which restarts any stuck ovn-northd processes.

  • Neutron now supports using the built-in DHCP agent when using OVN (Open Virtual Network) for cases when DHCP relay is necessary.

Upgrade Notes

  • Bump OVN from 24.03.1-44 to 24.03.2.34.

Bug Fixes

  • The [privsep_osbrick]/helper_command configuration value was not configured in either the Cinder or Nova services, which led to the inability to run certain CLI commands since a plain sudo was attempted instead. This has been fixed by adding the missing helper command configuration to both services (see the snippet after this list).

  • The dmidecode package which is required by the os-brick library for certain operations was not installed on the images that needed it, which can cause NVMe-oF discovery issues. The package has been added to all images that require it.

  • The nova user within the nova-ssh image was missing the SHELL build argument, which would cause live and cold migrations to fail. This has been resolved by adding the missing build argument.

  • This fix introduces a kernel option to adjust aio-max-nr, ensuring that the system can handle more asynchronous I/O events and preventing VM startup failures related to AIO limits (see the sysctl sketch after this list).

  • The Cluster API driver for Magnum is now configured to use the internal endpoints by default in order to avoid going through the ingress and leverage client-side load balancing.
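
    For the [privsep_osbrick] fix above, the helper command typically follows the
    usual rootwrap-based privsep pattern; the Nova form is sketched below, and the
    exact paths and wrapper names rendered by Atmosphere may differ:

      [privsep_osbrick]
      helper_command = sudo nova-rootwrap /etc/nova/rootwrap.conf privsep-helper --config-file /etc/nova/nova.conf

    For the aio-max-nr change above, the option is a standard sysctl; the value
    shown below is a common example and not necessarily the one Atmosphere applies:

      # sysctl-style sketch
      fs.aio-max-nr = 1048576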

v3.2.11

New Features

  • Add a specific helm-toolkit patch on top of 0.2.78. This allows the database drop and init jobs to be compatible with SQLAlchemy 2.0.

Bug Fixes

  • The Open vSwitch version has been bumped to 3.3.0 in order to resolve packet drops that surface as "Packet dropped. Max recirculation depth exceeded." messages in the Open vSwitch log.

Other Notes

  • The image build process has been refactored to use docker-bake, which allows contexts and built images to be shared from one target to another, allowing for a much easier local building experience. There is no functional change in the images.

v3.2.10

New Features

  • Add support for a Neutron policy check when performing a port update that adds address pairs. This adds a POST method on /address-pair which checks whether both ports to be paired were created within the same project. With this check, non-admin users can manage address pair bindings without the risk of exposing resources to other projects.

  • Introduced the ability to specify a prefix for image names. This allows for easier integration with image proxies and caching mechanisms, eliminating the need to maintain separate inventory overrides for each image (see the example after this list).

  • The ovn-controller image is now being pre-pulled on the nodes prior to the Helm chart being deployed. This will help reduce the time it takes to switch over to the new version of the ovn-controller image.
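
    An illustrative inventory override for the image prefix feature mentioned
    above; the registry hostname and path are placeholders:

      # Placeholder prefix; prepended to image names so they resolve via a proxy/cache
      atmosphere_image_prefix: harbor.example.com/proxy/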

Security Issues

  • Update the update_port:fixed_ips policy for the Neutron policy server check to stay consistent with the RBAC rules. This issue does not significantly affect service security, as the update_port:fixed_ips policy is always evaluated alongside update_port, but we should still honor the secure RBAC design and add the member role check.

Bug Fixes

  • Fixed an issue where the neutron-ironic-agent service failed to start.

  • When using OVS with DPDK, both OVS and OVN run as the root user by default, which may cause an issue where QEMU cannot write the vhost-user socket file in the Open vSwitch runtime directory (/run/openvswitch). This has been fixed by configuring the Open vSwitch and OVN components to run with the non-root user id 42424, which is the same as QEMU and other OpenStack services inside the container.

  • The CI tooling for pinning images has been fixed to properly work after a regression caused by the introduction of the atmosphere_image_prefix variable.

  • The documentation for using the vTPM was pointing to the incorrect metadata properties for images. This has been corrected to point to the correct metadata properties.

Other Notes

  • The project has adopted the use of reno for release notes, and all changes are now expected to include one going forward so that release notes remain complete.

  • The heavy CI jobs are now skipped when release notes are changed.