Fix automatic SystemVM template download to S3 secondary storage #12426
Damans227 wants to merge 9 commits into apache:4.20
|
@blueorangutan package |
|
@nvazquez a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16368 |
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## 4.20 #12426 +/- ##
============================================
+ Coverage 16.23% 16.26% +0.02%
- Complexity 13382 13434 +52
============================================
Files 5657 5661 +4
Lines 498999 500054 +1055
Branches 60566 60723 +157
============================================
+ Hits 81035 81356 +321
- Misses 408928 409626 +698
- Partials 9036 9072 +36
|
|
@blueorangutan test |
|
@nvazquez a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
[SF] Trillian Build Failed (tid-15187) |
|
|
client.setEndpoint(clientOptions.getEndPoint());
// Enable path-style access for S3-compatible storage
client.setS3ClientOptions(com.amazonaws.services.s3.S3ClientOptions.builder().setPathStyleAccess(true).build());
While debugging the issue, I noticed that the connection to MinIO failed at template upload time with an error like:
UnknownHostException: cloudstack-secondary.10.0.34.157
That is, the SDK was trying to connect to http://cloudstack-secondary.10.0.34.157:9000/..., which is the virtual-hosted style (see: virtual-hosted style vs path style syntax for S3).
Looking at other S3-compatible plugins in CloudStack, I found that both CephObjectStoreDriverImpl and CloudianHyperStoreUtil use enablePathStyleAccess() to get path-style URLs such as http://10.0.34.157:9000/cloudstack-secondary/..., i.e.:
AmazonS3 client = AmazonS3ClientBuilder.standard()
        .enablePathStyleAccess()
        .withCredentials(new AWSStaticCredentialsProvider(new BasicAWSCredentials(accessKey, secretKey)))
        .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(url, "auto"))
        .build();
Applying the same fix here worked. The AWS SDK documentation confirms that path-style access must be explicitly enabled for S3-compatible stores.
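To make the difference between the two addressing styles concrete, here is a minimal sketch that builds both URL forms for the endpoint and bucket mentioned above. The class and method names are hypothetical and the object key is a placeholder; nothing here is part of CloudStack or the AWS SDK, it only illustrates why the virtual-hosted form produces a host name that cannot be resolved when the endpoint is a bare IP address.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not CloudStack or AWS SDK code.
class S3AddressingStyles {

    // Virtual-hosted style: the bucket name is prepended to the endpoint host.
    static String virtualHostedUrl(String endpoint, String bucket, String key) {
        URI uri = URI.create(endpoint);
        return uri.getScheme() + "://" + bucket + "." + uri.getHost() + portSuffix(uri) + "/" + key;
    }

    // Path style: the host stays as configured and the bucket is the first path segment.
    static String pathStyleUrl(String endpoint, String bucket, String key) {
        URI uri = URI.create(endpoint);
        return uri.getScheme() + "://" + uri.getHost() + portSuffix(uri) + "/" + bucket + "/" + key;
    }

    // Returns ":port" when the endpoint specifies a port, otherwise an empty string.
    private static String portSuffix(URI uri) {
        return uri.getPort() == -1 ? "" : ":" + uri.getPort();
    }

    public static void main(String[] args) {
        String endpoint = "http://10.0.34.157:9000";
        String bucket = "cloudstack-secondary";
        String key = "example-object"; // placeholder object key
        System.out.println(virtualHostedUrl(endpoint, bucket, key));
        System.out.println(pathStyleUrl(endpoint, bucket, key));
    }
}
```

With an IP-based endpoint, the first form asks DNS to resolve cloudstack-secondary.10.0.34.157, which matches the UnknownHostException reported above, while the second form keeps the host as 10.0.34.157.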
Pull request overview
This PR fixes an issue where SystemVM templates fail to automatically download to S3 secondary storage when adding it to a CloudStack zone. The root cause was that S3 stores use REGION scope, but the endpoint selector only returned LocalHostEndpoint for ZONE-scoped stores with null scope IDs.
Changes:
- Modified endpoint selection logic to support REGION-scoped stores for SYSTEM template downloads
- Added null safety checks for data stores without URLs (e.g., S3 object stores)
- Enabled path-style access for S3-compatible storage systems like MinIO
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| engine/storage/src/main/java/org/apache/cloudstack/storage/endpoint/DefaultEndPointSelector.java | Extended condition to allow LocalHostEndpoint for REGION-scoped stores with SYSTEM templates |
| services/secondary-storage/controller/src/main/java/org/apache/cloudstack/secondarystorage/SecondaryStorageManagerImpl.java | Added null checks to skip data stores without URLs when building secondary storage addresses |
| services/secondary-storage/controller/src/test/java/org/apache/cloudstack/secondarystorage/SecondaryStorageManagerImplTest.java | Added comprehensive test coverage for null handling in data store processing |
| utils/src/main/java/com/cloud/utils/storage/S3/S3Utils.java | Enabled path-style access for S3-compatible storage systems |
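
As a rough illustration of the null-handling change listed for SecondaryStorageManagerImpl, the sketch below collects host names from store URLs and skips stores that have none. The class name, method name, and the use of plain URL strings are assumptions made for this example; the actual CloudStack code works with its own data store objects and may differ in detail.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; not the CloudStack implementation.
class SecondaryStorageAddresses {

    // Collects the host part of each store URL, skipping stores without a URL
    // (for example S3 object stores, which are addressed by endpoint and bucket).
    static List<String> hostsOf(List<String> storeUrls) {
        List<String> hosts = new ArrayList<>();
        for (String url : storeUrls) {
            if (url == null || url.isBlank()) {
                continue; // no URL to parse, so there is no address to add
            }
            String host = URI.create(url).getHost();
            if (host != null) {
                hosts.add(host);
            }
        }
        return hosts;
    }

    public static void main(String[] args) {
        // The NFS URL is a made-up example; the null entry models a store without a URL.
        List<String> urls = Arrays.asList("nfs://10.0.32.4/export/secondary", null);
        System.out.println(hostsOf(urls));
    }
}
```

Without the guard, parsing a null URL would throw before any address is collected, which is the kind of failure the null checks are meant to avoid.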
// Enable path-style access for S3-compatible storage
client.setS3ClientOptions(com.amazonaws.services.s3.S3ClientOptions.builder().setPathStyleAccess(true).build());
Path-style access is being enabled unconditionally for all S3 endpoints, including AWS S3 which deprecated path-style access in favor of virtual-hosted-style. This could cause compatibility issues with AWS S3. Consider making path-style access configurable through ClientOptions, or only enabling it when a custom endpoint is detected (non-AWS S3).
Hmm, the path-style access code is already inside the if (StringUtils.isNotBlank(clientOptions.getEndPoint())) block, which means path-style access is only enabled when a custom endpoint is specified. The code already does what Copilot is asking for.
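The guard described in this reply can be sketched in isolation as follows. The class and method names are hypothetical, and String.isBlank() (Java 11+) is used here as an approximation of StringUtils.isNotBlank; the real code configures an AWS SDK client inside the if block rather than returning a boolean.

```java
// Hypothetical stand-in for illustration; not the code in S3Utils.
class PathStyleGuard {

    // Path-style access is chosen only when a custom endpoint is configured.
    // A null or blank endpoint means the SDK default endpoint is used,
    // so the default virtual-hosted addressing is left untouched.
    static boolean usePathStyle(String endPoint) {
        return endPoint != null && !endPoint.isBlank();
    }

    public static void main(String[] args) {
        System.out.println(usePathStyle("http://10.0.34.157:9000")); // custom endpoint
        System.out.println(usePathStyle(null));                      // no endpoint configured
    }
}
```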
if (tmplInfo.getTemplateType() == TemplateType.SYSTEM &&
        (store.getScope().getScopeType() == ScopeType.REGION ||
        (store.getScope().getScopeType() == ScopeType.ZONE && store.getScope().getScopeId() == null))) {
The modified endpoint selection logic for REGION-scoped SYSTEM templates lacks test coverage. Consider adding unit tests in the engine/storage module to verify that LocalHostEndpoint is correctly returned for REGION-scoped stores with SYSTEM templates, similar to the existing test coverage in SecondaryStorageManagerImplTest.
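One way to reason about the cases such a test would need to cover is to restate the condition from the diff as a standalone predicate. The sketch below does that with simplified, hypothetical enums and a plain Long for the scope ID; it is not the DefaultEndPointSelector code and does not use CloudStack's own ScopeType or TemplateType classes.

```java
// Hypothetical, simplified restatement of the condition for illustration only.
class LocalHostEndpointRule {

    enum ScopeType { REGION, ZONE, CLUSTER, HOST }

    enum TemplateType { SYSTEM, USER }

    // True when a SYSTEM template targets a REGION-scoped store,
    // or a ZONE-scoped store whose scope ID is null.
    static boolean useLocalHostEndpoint(TemplateType templateType, ScopeType scopeType, Long scopeId) {
        return templateType == TemplateType.SYSTEM
                && (scopeType == ScopeType.REGION
                    || (scopeType == ScopeType.ZONE && scopeId == null));
    }

    public static void main(String[] args) {
        // REGION-scoped store (the S3 case discussed in this PR) with a SYSTEM template.
        System.out.println(useLocalHostEndpoint(TemplateType.SYSTEM, ScopeType.REGION, null));
        // ZONE-scoped store with a concrete zone ID.
        System.out.println(useLocalHostEndpoint(TemplateType.SYSTEM, ScopeType.ZONE, 1L));
    }
}
```

The four interesting combinations are SYSTEM with REGION, SYSTEM with ZONE and a null scope ID, SYSTEM with ZONE and a non-null scope ID, and a non-SYSTEM template with REGION; only the first two should select the local host endpoint.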
|
@blueorangutan package |
|
@kiranchavala a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16615 |
Here are the issues I observed.
Tested with the PR packages on Oracle Linux 8.6.
- Unable to use Ceph S3 storage; got the following exception in the logs:
2026-01-30 07:52:30,770 DEBUG [o.a.c.s.r.NfsSecondaryStorageResource] (pool-11-thread-1:[ctx-a125d9e1]) (logid:7259f280) Executing command "DownloadCommand" [{"hvm":false,"description":"SystemVM Template (KVM)","checksum":"c059b0d051e0cd6fbe9d5d4fc40c7e5d","maxDownloadSizeInBytes":53687091200,"id":3,"resourceType":"TEMPLATE","installPath":"template/tmpl/1/3/routing-3","_store":{"id":1,"uuid":"fdb66906-6b57-4e32-a7df-cbb93a917fce","accessKey":"AFYT2BKNI1U8T6DY6435","secretKey":"kia5kyDAZuP7QjwxNmhVAEE5l5dzsSJWbxtSXCIA","endPoint":"https://10.0.33.100","bucketName":"testbucket","httpsFlag":true,"created":"Jan 30, 2026, 7:52:20 AM","enableRRS":false,"maxSingleUploadSizeInBytes":5368709120},"followRedirects":false,"url":"http://download.cloudstack.org/systemvm/4.6/systemvm64template-4.6.0-kvm.qcow2.bz2","format":"QCOW2","accountId":1,"name":"routing-3","contextMap":{},"wait":0,"bypassHostMaintenance":false}].
2026-01-30 07:52:30,798 DEBUG [c.c.u.n.HTTPUtils] (pool-11-thread-1:[ctx-a125d9e1]) (logid:7259f280) Initializing new HttpMethodRetryHandler with retry count 5
2026-01-30 07:52:30,892 INFO [c.c.s.t.S3TemplateDownloader] (pool-10-thread-1:[ctx-2cad9dd2]) (logid:1b4a38df) Starting download from http://download.cloudstack.org/systemvm/4.6/systemvm64template-4.6.0-kvm.qcow2.bz2 to S3 bucket testbucket and size (304.60 MB) 319401369 bytes
2026-01-30 07:52:30,897 DEBUG [c.c.u.s.S.S3Utils] (pool-10-thread-1:[ctx-2cad9dd2]) (logid:1b4a38df) Sending stream as S3 object template/tmpl/1/3/routing-3/systemvm64template-4.6.0-kvm.qcow2.bz2 in bucket testbucket using PutObjectRequest
2026-01-30 07:52:31,217 DEBUG [c.c.u.s.S.S3Utils] (pool-10-thread-1:[ctx-2cad9dd2]) (logid:1b4a38df) Creating S3 client with configuration: [protocol: https, signer: null, connectionTimeOut: 10000, maxErrorRetry: -1, socketTimeout: 50000, useTCPKeepAlive: null, connectionTtl: null]
2026-01-30 07:52:31,468 DEBUG [c.c.u.s.S.S3Utils] (pool-10-thread-1:[ctx-2cad9dd2]) (logid:1b4a38df) Setting the end point for S3 client with access key AFYT2BKNI1U8T6DY6435 to https://10.0.33.100.
2026-01-30 07:52:33,803 INFO [o.a.c.s.i.BaseImageStoreDriverImpl] (pool-11-thread-1:[ctx-a125d9e1]) (logid:7259f280) Updating store ref entry for template Template {"format":"QCOW2","id":3,"name":"SystemVM Template (KVM)","uniqueName":"routing-3","uuid":"56911227-fd0c-11f0-9d05-1e00e00002fb"}
2026-01-30 07:52:33,817 WARN [c.c.a.AlertManagerImpl] (pool-11-thread-1:[ctx-a125d9e1]) (logid:7259f280) alertType=[28] dataCenterId=[1] podId=[null] clusterId=[null] message=[Failed to register template: 56911227-fd0c-11f0-9d05-1e00e00002fb with error: ].
2026-01-30 07:52:33,825 WARN [c.c.a.AlertManagerImpl] (pool-11-thread-1:[ctx-a125d9e1]) (logid:7259f280) No recipients set in global setting 'alert.email.addresses', skipping sending alert with subject [Failed to register template: 56911227-fd0c-11f0-9d05-1e00e00002fb with error: ] and content [Failed to register template: 56911227-fd0c-11f0-9d05-1e00e00002fb with error: ].
2026-01-30 07:52:33,825 ERROR [o.a.c.s.i.BaseImageStoreDriverImpl] (pool-11-thread-1:[ctx-a125d9e1]) (logid:7259f280) Failed to register template: 56911227-fd0c-11f0-9d05-1e00e00002fb
with error:
- Used MinIO S3 storage: the SystemVM template was registered successfully, but it was "systemvm64template-4.6.0-kvm.qcow2.bz2" and the SystemVMs were stuck in the Starting state.
- When the primary storage is zone-scoped, the logs show:
2026-01-30 09:32:23,838 DEBUG [o.a.c.s.v.VolumeServiceImpl] (Work-Job-Executor-7:[ctx-45ba2dab, job-40/job-47, ctx-3f578420]) (logid:17e47580) Found template Template {"format":"QCOW2","id":3,"name":"SystemVM Template (KVM)","uniqueName":"routing-3","uuid":"56911227-fd0c-11f0-9d05-1e00e00002fb"} in storage pool StoragePool {"id":1,"name":"pri","poolType":"NetworkFilesystem","uuid":"cfc7f591-fc1e-36dd-b2c5-dc6712acf57e"} with VMTemplateStoragePool: TmplPool[3-3-1-null]
2026-01-30 09:32:23,839 DEBUG [o.a.c.s.v.VolumeServiceImpl] (Work-Job-Executor-7:[ctx-45ba2dab, job-40/job-47, ctx-3f578420]) (logid:17e47580) Acquire lock on VMTemplateStoragePool 3 with timeout 3600 seconds
- When the primary storage is cluster-scoped, the logs show:
2026-01-30 09:47:14,659 DEBUG [o.a.c.s.c.m.StorageCacheManagerImpl] (Work-Job-Executor-6:[ctx-fdef51b0, job-62/job-64, ctx-9c2c6347]) (logid:11a5fb7e) waiting cache copy completion type: template, id: 3, lock: 638897157
2026-01-30 09:47:24,659 DEBUG [o.a.c.s.c.m.StorageCacheManagerImpl] (Work-Job-Executor-6:[ctx-fdef51b0, job-62/job-64, ctx-9c2c6347]) (logid:11a5fb7e) waken up
2026-01-30 09:47:24,661 DEBUG [o.a.c.s.c.m.StorageCacheManagerImpl] (Work-Job-Executor-6:[ctx-fdef51b0, job-62/job-64, ctx-9c2c6347]) (logid:11a5fb7e) waiting cache copy completion type: template, id: 3, lock: 638897157
Reproduced this issue with Ceph S3 used as secondary storage. I also get the empty error; the logs show: |
|
@kiranchavala Regarding the Ceph issue, your logs show an HTTPS endpoint, whereas I tested with HTTP.
Could you try with HTTP instead? Is there a reason HTTPS was configured? |
|
Thanks @Damans227. The issue is solved when I point it to an HTTP S3 link. Steps: create a zone from scratch and, in the zone creation wizard, add S3 as the secondary storage.
Will check again with a fresh deployment. |
Got it. Thanks for checking. |
|
@Damans227 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 16690 |
|
@blueorangutan package |
|
@Damans227 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16691 |
|
@kiranchavala Bucket details shown in the secondary storage details: |
|
@blueorangutan package |
|
@Damans227 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16717 |
|
Thanks @Damans227 for the fixes. Is the scope set to Region by default, or can Zone be set? |
No, S3 is hardcoded to REGION scope only and can't be set to ZONE. |
|
@blueorangutan package |
|
@Damans227 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16730 |
|
@kiranchavala I tested the fix. On adding S3 as secondary storage, the SystemVM template URL in |
|
@Damans227 I see the zone creation is stuck even though the correct template was downloaded to S3 storage. Logs:
|
|
[SF] Trillian Build Failed (tid-15446) |
|
[SF] Trillian Build Failed (tid-15458) |
|
[SF] Trillian Build Failed (tid-15460) |
|
[SF] Trillian Build Failed (tid-15462) |
|
[SF] Trillian Build Failed (tid-15463) |
|
[SF] Trillian Build Failed (tid-15465) |







Description
This PR fixes an issue where the SystemVM template is not automatically downloaded to S3 secondary storage when adding it to a CloudStack zone.
Root Cause:
S3 stores use REGION scope, but DefaultEndPointSelector only returned LocalHostEndpoint for ZONE scope, so no endpoint was found to download the SystemVM template.
Fix:
Allow LocalHostEndpoint to handle SYSTEM template downloads for REGION-scoped stores, plus added null checks for S3 stores without URLs and enabled path-style access for S3-compatible storage.
Fixes: #9002
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
Broken:
Fixed:
Screencast.from.2026-01-14.13-52-40.mp4
How Has This Been Tested?
Test Environment:
Test Steps: