DAOS-18891 object: retry if vos_update_end return -DER_AGAIN #18245

Draft
Nasf-Fan wants to merge 1 commit into master from Nasf-Fan/DAOS-18891

@Nasf-Fan
Contributor

On the server side, an update operation may yield the CPU between the paired vos_update_begin() and vos_update_end() calls. During that interval, the object held via vos_update_begin() may be evicted by others, for example by another failed modification against the same object shard, or by eviction under md-on-ssd mode. So vos_update_end() checks for this case and returns -DER_AGAIN instead of -DER_TX_RESTART to notify the caller. The caller then needs to retry the update instead of failing out.
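The retry behavior described above can be sketched as follows. This is a minimal illustration and not the code in this patch: the `sk_` functions, the `sk_update_ctx` struct, the simulated eviction counter, and the `max_retries` bound are all hypothetical stand-ins, and the value of `SK_DER_AGAIN` is a placeholder (only -2025 for DER_TX_RESTART is taken from the ticket title).

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder error codes for this sketch only. The real definitions live in
 * the DAOS headers; 2025 matches the ticket title, SK_DER_AGAIN is made up. */
#define SK_DER_AGAIN      1
#define SK_DER_TX_RESTART 2025

/* Hypothetical stand-in for the state of one update operation. */
struct sk_update_ctx {
	int evictions_left; /* how many times the held object gets evicted */
	int begin_calls;    /* how many times the "begin" step was run */
};

/* Stand-in for vos_update_begin(): takes a hold on the object. */
static int sk_update_begin(struct sk_update_ctx *ctx)
{
	ctx->begin_calls++;
	return 0;
}

/* Stand-in for vos_update_end(): reports -SK_DER_AGAIN when the object was
 * evicted during the yield window, so the caller knows a retry is safe. */
static int sk_update_end(struct sk_update_ctx *ctx)
{
	if (ctx->evictions_left > 0) {
		ctx->evictions_left--;
		return -SK_DER_AGAIN;
	}
	return 0;
}

/* Caller side: redo the whole begin/end pair while "end" asks for a retry.
 * The retry bound is an assumption of this sketch, not taken from the patch. */
int sk_update_with_retry(struct sk_update_ctx *ctx, int max_retries)
{
	int rc;
	int attempt = 0;

	assert(ctx != NULL && max_retries >= 0);
	do {
		rc = sk_update_begin(ctx);
		if (rc != 0)
			break;
		/* ... the operation may yield the CPU here ... */
		rc = sk_update_end(ctx);
	} while (rc == -SK_DER_AGAIN && attempt++ < max_retries);

	return rc;
}
```

In this sketch the retry restarts from the "begin" step, because the object hold taken there is what the eviction invalidated.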

The patch also initializes some local variables in the object module to avoid random corruption when handling some failure cases.
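As a generic illustration of why such initialization matters (not the actual code touched by this patch), consider a function with a shared cleanup label. The function name `sk_copy_len` and its `fail_early` parameter are hypothetical; the point is only that a pointer left uninitialized would hold garbage when an early failure jumps to the cleanup path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies 'src' into a scratch buffer and returns its
 * length, or -1 on failure. 'fail_early' simulates an early error path. */
int sk_copy_len(const char *src, int fail_early)
{
	char   *buf = NULL; /* initialized: cleanup is safe on early failure */
	int     rc  = -1;   /* default to failure until success is explicit */
	size_t  len;

	if (fail_early || src == NULL)
		goto out;   /* buf is still NULL here, not stack garbage */

	len = strlen(src);
	buf = malloc(len + 1);
	if (buf == NULL)
		goto out;

	memcpy(buf, src, len + 1);
	rc = (int)len;
out:
	/* Invariant: success implies the buffer was allocated. */
	assert(rc < 0 || buf != NULL);
	free(buf);          /* free(NULL) is a no-op */
	return rc;
}
```

Without the `= NULL` initializer, the early `goto out` would pass an indeterminate pointer to `free()`, which is the kind of random corruption in failure paths that the description refers to.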

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

On the server side, an update operation may yield the CPU between the
paired vos_update_begin() and vos_update_end() calls. During that
interval, the object held via vos_update_begin() may be evicted by
others, for example by another failed modification against the same
object shard, or by eviction under md-on-ssd mode. So vos_update_end()
checks for this case and returns -DER_AGAIN instead of -DER_TX_RESTART
to notify the caller. The caller then needs to retry the update instead
of failing out.

The patch also initializes some local variables in the object module to
avoid random corruption when handling some failure cases.

Signed-off-by: Fan Yong <fan.yong@hpe.com>
@github-actions

Ticket title is 'osa/online_extend.py:OSAOnlineExtend.test_osa_online_extend_drain_after_rebuild - DER_TX_RESTART(-2025)'
Status is 'In Progress'
Labels: 'ci_master_weekly,weekly_test'
https://daosio.atlassian.net/browse/DAOS-18891
