Optimize data cluster format for VHD and QCOW #6895
Open
last-genius wants to merge 9 commits into xapi-project:26.1-lcm from
Conversation
Contributor
Author
Opened the corresponding xs-opam PR: xapi-project/xs-opam#755
lindig
approved these changes
Feb 18, 2026
    | x :: y :: _ ->
        (to_int x, to_int y)
    | _ ->
        raise (Invalid_argument "Invalid JSON")
Contributor
You might want to report the JSON.
Contributor
Author
It can be rather large and pollute the logs... This shouldn't happen unless you have a version incompatibility, really
…rmat Qcow_stream now uses Qcow_mapping to store information on allocated clusters, which offers .to_interval_seq, outputting a list of pairs representing intervals of allocated virtual clusters. Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
This allows switching on the more efficient interval format later. (QCOW always uses the new format.) Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
This is just an easy way to make sure the semantics are preserved in any future refactorings, without having to run full VHD exports. Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
This command returns a more efficient representation of allocated clusters (when compared to read_headers), utilizing a sparse interval format instead of returning every single allocated cluster. This is the more efficient option, decreasing the filesize and memory usage in vhd-tool, but it's currently under a feature flag, so it's added as a new command instead of replacing read_headers immediately. Cram test for read_headers is still passing, so this refactoring has preserved the legacy format. Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
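To illustrate why the sparse interval format is smaller, here is a minimal sketch of collapsing a sorted list of allocated cluster indices into intervals. The name `to_intervals` is hypothetical and is not taken from vhd-tool; the sketch also assumes intervals are inclusive `(first, last)` pairs, which the commit message does not state.

```ocaml
(* Hypothetical helper, not from vhd-tool: collapse a sorted,
   duplicate-free list of cluster indices into intervals.
   Assumption: intervals are inclusive (first, last) pairs. *)
let to_intervals (clusters : int list) : (int * int) list =
  let rec go acc current = function
    | [] -> (
      match current with
      | None -> List.rev acc
      | Some iv -> List.rev (iv :: acc))
    | c :: rest -> (
      match current with
      (* The index extends the interval being built. *)
      | Some (lo, hi) when c = hi + 1 -> go acc (Some (lo, c)) rest
      (* There is a gap: close the current interval, start a new one. *)
      | Some iv -> go (iv :: acc) (Some (c, c)) rest
      | None -> go acc (Some (c, c)) rest)
  in
  go [] None clusters
```

With this sketch, a fully allocated disk of any size collapses to a single pair, whereas the legacy format lists every cluster index.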
Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
Since the runtime feature flag vhd_legacy_blocks_format determines which block format is used to describe allocated VHD clusters, this requires duplicate parse_header_interval functions for VHD and QCOW. The right functions are selected in stream_vdi based on the feature flag. Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
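The flag-based selection might look roughly like the sketch below. Everything except the flag name `vhd_legacy_blocks_format` and the `x :: y :: _` pattern is hypothetical: the `decoded` type stands in for already-decoded JSON, and the parser names are illustrative rather than the ones used in `stream_vdi`.

```ocaml
(* Hypothetical stand-in for the two decoded JSON layouts. *)
type decoded =
  | Legacy_blocks of int list (* every allocated cluster index *)
  | Interval_blocks of int list list (* [[first; last]; ...] *)

(* Mirrors the pattern match quoted in the review thread. *)
let pair_of_list = function
  | x :: y :: _ -> (x, y)
  | _ -> raise (Invalid_argument "Invalid JSON")

(* Legacy format: treat each cluster as a one-element interval. *)
let parse_legacy = function
  | Legacy_blocks cs -> List.map (fun c -> (c, c)) cs
  | Interval_blocks _ -> raise (Invalid_argument "Invalid JSON")

(* Interval format: each entry is already a pair. *)
let parse_interval = function
  | Interval_blocks ps -> List.map pair_of_list ps
  | Legacy_blocks _ -> raise (Invalid_argument "Invalid JSON")

(* Pick the parser once, based on the runtime feature flag. *)
let select_parser ~vhd_legacy_blocks_format =
  if vhd_legacy_blocks_format then parse_legacy else parse_interval
```

Normalizing both formats to the same interval list here is a design choice of the sketch; it keeps the downstream allocation check independent of the flag.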
…er allocation Instead of using a set with every individual allocated cluster index as a member, use a sorted list of intervals to check whether a cluster is allocated. This uses much less memory and follows directly from the JSON format that qcow-stream-tool and vhd-tool now output. Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
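A membership check against a sorted interval list can be sketched as follows. The function name `is_allocated` is hypothetical, and the sketch assumes the intervals are inclusive, sorted, and non-overlapping; the actual code in the PR may differ.

```ocaml
(* Hypothetical sketch: is [cluster] covered by one of the intervals?
   Assumptions: inclusive (first, last) pairs, sorted by [first],
   non-overlapping. *)
let rec is_allocated (intervals : (int * int) list) (cluster : int) : bool =
  match intervals with
  | [] -> false
  (* Sorted input: every remaining interval starts after [cluster]. *)
  | (lo, _) :: _ when cluster < lo -> false
  (* This interval ends before [cluster]; try the next one. *)
  | (_, hi) :: rest when cluster > hi -> is_allocated rest cluster
  (* Otherwise lo <= cluster <= hi. *)
  | _ -> true
```

Memory use is proportional to the number of intervals rather than the number of allocated clusters, which is where the saving over a per-cluster set comes from.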
…_clusters nonzero_clusters no longer contains every single allocated cluster; it now holds intervals of allocated clusters. Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
… files Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
d41347a to
bcf47a8
Contributor
Author
Changed the approach:
This way, XS (which only uses VHD) can keep using the legacy format until they can ensure the new format has not accidentally broken anything else. XCP-ng will flip the feature flag to make VHD consistent with QCOW; we will use the interval-based format.
mirage/ocaml-qcow#134 changes the type of the data structure containing info on allocated data clusters, returning allocated intervals instead of all the virtual cluster addresses.

Change qcow2-to-stdout to the new interval-based format.

Add a vhd-tool read_headers_interval command, which also conforms to this new format, and change the parsing code in stream_vdi to accept both formats depending on a feature flag.

Add cram tests verifying that the legacy format is preserved as-is.

I've run the VM export and VDI integrity quicktests and tested this extensively locally. The PR will only build once the new ocaml-qcow version is packaged into xs-opam, so I'm keeping this as a draft for now.