Skip to content

Conversation

@lcruz99
Copy link
Contributor

@lcruz99 lcruz99 commented Nov 17, 2025

Problem

Some local checks require a virgin natlab environment with all the VMs running. This barely happens locally.

Solution

This PR:

  • skips checking duplicate IPs for interfaces that still have virtual VPN mesh addresses (100.64.0.1)
  • allows duplicate checks failures for stopped VMs
  • skips arp cache checks for local setups where many other interfaces might be running.
  • simplifies pytest marks checking

☑️ Definition of Done checklist

  • Commit history is clean (requirements)
  • README.md is updated
  • Functionality is covered by unit or integration tests

@lcruz99 lcruz99 requested a review from a team as a code owner November 17, 2025 14:25
new_connection_raw(conn_tag)
)
except Exception as e: # pylint: disable=broad-exception-caught
log.warning("Failed to check MAC address for %s: %s", conn_tag.name, e)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: should we add a check if it's CI? Without a check we will need to backtrack to this place from the consequences that are caused by it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean if we should swallow the exception in CI? I can re raise it if we're on CI

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I wonder if it's worth to re-raise in CI since then we could find the root cause sooner

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree and think it wouldn't worth to re-raise it. Do you agree?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My wording might have been imprecise - I think it's worth to re-raise. This should allow to detect the connection failures closer to the root cause :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

"""
if "GITLAB_CI" not in os.environ:
log.info("setup_check: skipping ARP cache validation on non-CI environment")
return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not all VMs are started should we still continue for the ones that are present instead of skipping all of them?

Copy link
Contributor Author

@lcruz99 lcruz99 Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Instead of skipping the check, I can reduce the timeout which currently is 300s/5min or set a number of retries for each ARP entry. At the moment the address is pinged until it gets one of the acceptable states: ["REACHABLE", "STALE", "DELAY", "PROBE", "PERMANENT"]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively I can check which VMs are running and if the session marks requires them. Then run the checks to only the ones that are running

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of only checking the containers we care about but maybe the easier solution is to just reduce the timeout as you say

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the reduced timeout the test will still fail, it's easier but definitely not the best. IMHO would be great if we check that the required containers for the session are running and healthy.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my experience running natlab locally is quite a challenge already especially when things don't work so ignoring ARPs locally is not a good choice unless I miss something.

The idea of checking only the containers needed for the test is a nice one

Copy link
Collaborator

@jjanowsk jjanowsk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I don't have other remarks then Lukas, but there is one place where you could improve the experience. In helpers.py in the if prepare_vpn block we should not prepare nlx if it is not needed. Otherwise I get failures there if I run vpn test without nlx container running

@lcruz99
Copy link
Contributor Author

lcruz99 commented Jan 6, 2026

@jjanowsk

In helpers.py in the if prepare_vpn block we should not prepare nlx if it is not needed. Otherwise I get failures there if I run vpn test without nlx container running

Added commit to only prepare the server containers/VM that are required for the given pytest run.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants