Feature Description
Hi team, in the metrics layer we currently collect and report two tiers of metrics:
- operation level (metrics prefixed with opendal_operation_), tracked at the Access trait boundary
- HTTP request level (metrics prefixed with opendal_http_requests_), tracking each individual HTTP request
Both tiers correspond to internal mechanisms of the IO library. What's missing is a user-facing, object-level tier that captures the complete lifecycle of a user's API call (e.g., read(file)).
Pain point
Today, when a user calls op.read("large_file.bin") with concurrent(4) and chunk(8MB), OpenDAL internally splits this into multiple concurrent chunk reads. The metrics layer wraps each Access::read() invocation independently via MetricsWrapper, so a single user-level object download appears as N separate operation-level read metrics, each measuring a chunk. The OperationDurationSeconds reflects chunk read time, not the end-to-end object download latency.
In other words, I think "operation" is an OpenDAL-internal concept, and what it measures differs from one operation to another:
- For multipart concurrent write, OperationDurationSeconds records the latency of the whole object upload
- For concurrent read, OperationDurationSeconds records the latency of a single chunk read
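To make the mismatch concrete, here is a rough model of how many operation-level read observations a single user-level read could produce. It assumes one observation per chunk and that the chunk count is the object size divided by the chunk size, rounded up. The function name is hypothetical and this is not OpenDAL's actual splitting logic.

```rust
/// Illustrative model only (an assumption, not OpenDAL code): the number of
/// chunk reads needed to cover `object_size` bytes with `chunk_size`-byte chunks.
fn chunk_reads(object_size: u64, chunk_size: u64) -> u64 {
    assert!(chunk_size > 0, "chunk size must be non-zero");
    // Round up so a trailing partial chunk still counts as one read.
    object_size.div_ceil(chunk_size)
}
```

Under this model, a 32 MiB object read with 8 MiB chunks shows up as 4 separate read observations, none of which measures the end-to-end download.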
Proposal: add object-level metrics
Object-level metrics are important. There are usually three layers in the IO path: the library user, the IO library (like OpenDAL), and the storage service (like S3), each belonging to a different company or department.
Object-level metrics and dashboards are useful for demonstrating the effectiveness of the IO library, and they are different from server-side metrics.
A few object-level metrics I would expect, similar to what we already have for operation and HTTP metrics:
- counter for requests
- gauge for ongoing requests
- histogram for access latency
- counter for errors encountered
Problem and Solution
"Object" is a term commonly seen in object storage.
I understand that this library aims to support all storage backends; by "object" I generally mean "whatever is accessed through OpenDAL's APIs".
Today, to get object-level metrics, I have to add a thin layer around OpenDAL and collect the metrics myself.
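For reference, a minimal sketch of what such a thin layer might look like, using only the Rust standard library. Everything here is hypothetical (the ObjectMetrics type, its fields, and the observe method are not OpenDAL APIs), and a Vec of durations stands in for a real histogram. It covers the four metrics listed above for a blocking call; an async variant would need to await the call between the start and end bookkeeping.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical user-side metrics holder; not part of OpenDAL.
#[derive(Default)]
pub struct ObjectMetrics {
    requests_total: AtomicU64,       // counter for requests
    errors_total: AtomicU64,         // counter for errors encountered
    ongoing: AtomicI64,              // gauge for ongoing requests
    durations: Mutex<Vec<Duration>>, // stand-in for a latency histogram
}

impl ObjectMetrics {
    /// Runs one user-level call and records it as a single object-level
    /// observation, regardless of how many chunks the library uses inside.
    /// Note: if `call` panics, the gauge is not decremented in this sketch.
    pub fn observe<T, E>(&self, call: impl FnOnce() -> Result<T, E>) -> Result<T, E> {
        self.requests_total.fetch_add(1, Ordering::Relaxed);
        self.ongoing.fetch_add(1, Ordering::Relaxed);
        let start = Instant::now();

        let result = call();

        let elapsed = start.elapsed();
        self.ongoing.fetch_sub(1, Ordering::Relaxed);
        if result.is_err() {
            self.errors_total.fetch_add(1, Ordering::Relaxed);
        }
        self.durations.lock().unwrap().push(elapsed);
        result
    }

    pub fn requests_total(&self) -> u64 {
        self.requests_total.load(Ordering::Relaxed)
    }

    pub fn errors_total(&self) -> u64 {
        self.errors_total.load(Ordering::Relaxed)
    }

    pub fn ongoing(&self) -> i64 {
        self.ongoing.load(Ordering::Relaxed)
    }

    /// Number of latency samples recorded so far.
    pub fn observed_durations(&self) -> usize {
        self.durations.lock().unwrap().len()
    }
}
```

The closure passed to observe would wrap the whole user-level call, so the recorded duration is the end-to-end latency of the object access rather than the latency of one chunk. Having this tier inside the metrics layer would remove the need for every user to maintain a wrapper like this.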
Additional Context
No response
Are you willing to contribute to the development of this feature?