Skip to content

Update QoLA/AITER #599

Open
Micky774 wants to merge 12 commits into
devfrom
zain/qola/aiter-update
Open

Update QoLA/AITER #599
Micky774 wants to merge 12 commits into
devfrom
zain/qola/aiter-update

Conversation

@Micky774
Copy link
Copy Markdown
Contributor

Description

Updates QoLA as well as moves up the pinned AITER commit

Fixes # (issue)

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

Please list the changes introduced in this PR:

  • Change A
  • Change B

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@Micky774 Micky774 added the ci-level 3 CI test level 3 label May 28, 2026
COMMAND sh -c
"tmp=$(mktemp /tmp/gitconfig.XXXXXX) || exit 1; \
GIT_CONFIG_GLOBAL=$tmp git config --global --add safe.directory '*' >/dev/null 2>&1; \
GIT_CONFIG_GLOBAL=$tmp PYTHONPATH=\"${__QOLA_DIR}:$PYTHONPATH\" '${Python_EXECUTABLE}' -m qola.cli checkout \
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make sense to integrate safe.directory overriding to qola? The pattern with dubious ownership is probably not TE specific

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry that such behavior is a bit too authoritative for qola if that makes sense? My reasoning is that the permission scope here seems to be outside of qola and hence qola should not be the one in charge of it. I'm open to reconsidering that, but it's just my initial position.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is valid concern bearing in mind QoLA is intended to be reused by different components. May be make this behavior controllable then. It is OK to keep the things in TE, it will just require doing things this overriding twice - here and when build is called - BTW, there are comments there but not actual code change

Comment thread transformer_engine/common/ck_fused_attn/src/ck_fused_attn_bwd.cpp
# commit QoLA will actually check out and build, not whatever happens to be
# the submodule's current HEAD at configure time.
set(__QOLA_MANIFEST "${CMAKE_CURRENT_LIST_DIR}/qola_manifest.toml")
set_property(DIRECTORY APPEND PROPERTY CMAKE_CONFIGURE_DEPENDS "${__QOLA_MANIFEST}")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AITER_SHA is not cached variable. Why is CMAKE_CONFIGURE_DEPENDS needed?

if(NOT AITER_CHECKOUT_RESULT EQUAL 0)
message(FATAL_ERROR
"Failed to sync AITER source tree at ${__AITER_SOURCE_DIR} to "
"manifest-pinned commit ${AITER_SHA}.\n"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it also validate that actual commit matches one detected by prebuilt.cmake? If QoLA checkout AITER unconditionally, may be keep prebuilt.cmake as-is and where it is now? I.e. QoLA fetches AITER, prebuit.cmake checks for git commit as before. It will only loose AITER_SHA value in this error message

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci-level 3 CI test level 3

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants