Skip to content

chore: optimize localdns provisioning polling and iptables setup#8401

Open
jingwenw15 wants to merge 9 commits intomainfrom
jingwenwu/localdns-perf-optimizations
Open

chore: optimize localdns provisioning polling and iptables setup#8401
jingwenw15 wants to merge 9 commits intomainfrom
jingwenwu/localdns-perf-optimizations

Conversation

@jingwenw15
Copy link
Copy Markdown
Member

@jingwenw15 jingwenw15 commented Apr 24, 2026

Summary

  • Reduce polling intervals from 1s to 0.1s in start_localdns() and wait_for_localdns_ready() since CoreDNS typically starts in <100ms
  • Batch iptables rules using iptables-restore instead of individual calls to avoid repeated xtables lock acquisition
  • Remove now-unused IPTABLES variable

Estimated total savings: ~1.3–2.7s during node provisioning.

Telescope perf runs:
Screenshot 2026-04-24 at 1 36 06 PM

Copilot AI review requested due to automatic review settings April 24, 2026 20:34
@jingwenw15 jingwenw15 changed the title perf: optimize localdns provisioning polling and iptables setup feat: optimize localdns provisioning polling and iptables setup Apr 24, 2026
@jingwenw15 jingwenw15 changed the title feat: optimize localdns provisioning polling and iptables setup chore: optimize localdns provisioning polling and iptables setup Apr 24, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets node provisioning latency in the Linux localdns bootstrap path by tightening readiness polling and batching iptables rule installation to reduce xtables lock contention.

Changes:

  • Reduced polling sleep interval in start_localdns() and wait_for_localdns_ready() from 1s to 0.1s, adjusting attempt counts accordingly.
  • Switched pod-traffic conntrack-skip rule installation to a single iptables-restore call instead of multiple iptables invocations.
  • Removed the previously used IPTABLES command wrapper variable.

Comment thread parts/linux/cloud-init/artifacts/localdns.sh Outdated
Comment thread parts/linux/cloud-init/artifacts/localdns.sh
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

Comment thread spec/parts/linux/cloud-init/artifacts/localdns_spec.sh
Comment thread parts/linux/cloud-init/artifacts/localdns.sh Outdated
Comment thread spec/parts/linux/cloud-init/artifacts/localdns_spec.sh Outdated
Comment thread spec/parts/linux/cloud-init/artifacts/localdns_spec.sh
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread e2e/validators.go Outdated
Comment on lines +1548 to +1568
saved_rules=$(sudo iptables -w -t raw -S | grep "localdns: skip conntrack")
sudo iptables -w -t raw -S | grep "localdns: skip conntrack" | while IFS= read -r rule; do
# Convert -A to -D to delete the rule
sudo iptables -w -t raw $(echo "$rule" | sed 's/^-A/-D/') 2>/dev/null || true
done

# Flush any leftover conntrack entries before the negative test
for ip in 169.254.10.10 169.254.10.11; do
sudo conntrack -D -d "$ip" -p udp --dport 53 2>/dev/null || true
done

# Do a DNS lookup without NOTRACK — this SHOULD create conntrack entries
dig bing.com @169.254.10.10 +short +timeout=2 +tries=1 > /dev/null 2>&1 || true

ct_dns_neg=$(sudo conntrack -L -d 169.254.10.10 -p udp --dport 53 2>/dev/null | wc -l)
echo "Conntrack entries for 169.254.10.10:53 without NOTRACK: $ct_dns_neg"

# Restore NOTRACK rules
echo "$saved_rules" | while IFS= read -r rule; do
sudo iptables -w -t raw $rule 2>/dev/null || true
done
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The negative-test restore path builds iptables commands by expanding the iptables -S output into $rule unquoted (sudo iptables ... $rule). Since iptables -S prints the comment as --comment "localdns: skip conntrack", unquoted expansion will split that into multiple argv tokens (quotes become literal characters), so both deletion and restoration are likely to fail and make this validator flaky (and may leave rules removed). Consider deleting by rule number (using iptables -L --line-numbers like localdns.sh cleanup does) and restoring rules via a known-safe mechanism (e.g., iptables-restore payload or re-adding rules with properly quoted args), and don’t ignore restore errors.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer applicable — the negative test was removed entirely in ceade3b. The validator now only checks rule presence.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comment on lines +470 to +474
restore_input="${restore_input}
-A ${chain} -m comment --comment \"localdns: skip conntrack\" ${rule_rest}"
done
restore_input="${restore_input}
COMMIT"
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restore_input is constructed with a leading space before -A and COMMIT (note the newline literal followed by a space on the next line). That whitespace becomes part of the iptables-restore payload, which is non-standard and makes the script depend on iptables-restore being permissive about leading spaces. Consider building the payload without leading whitespace (e.g., via printf/$'\n' concatenation) so the generated restore script matches iptables-save format exactly.

Suggested change
restore_input="${restore_input}
-A ${chain} -m comment --comment \"localdns: skip conntrack\" ${rule_rest}"
done
restore_input="${restore_input}
COMMIT"
printf -v restore_input '%s\n-A %s -m comment --comment "localdns: skip conntrack" %s' "${restore_input}" "${chain}" "${rule_rest}"
done
printf -v restore_input '%s\nCOMMIT' "${restore_input}"

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified — no leading whitespace in the generated payload. Bash multiline string assignment (restore_input="${restore_input} -A ...") does not inherit shell indentation. The -A and COMMIT lines start at column 1 as required by iptables-restore. Confirmed by inspecting actual output on prod VMs via e2e tests.

Comment thread e2e/validators.go Outdated
Comment on lines +1494 to +1500
# Verify NOTRACK rules exist in both chains for both protocols with the comment tag
for chain in OUTPUT PREROUTING; do
for proto in tcp udp; do
if ! echo "$rules" | grep -q -- "-A ${chain}.*-p ${proto}.*localdns: skip conntrack.*--dport 53.*NOTRACK"; then
echo "FAIL: missing NOTRACK rule for $proto in $chain chain"
exit 1
fi
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new validation can produce false positives because it doesn’t assert the destination IPs for the NOTRACK rules (it only checks chain/proto/comment/port). Since localdns programs rules for both 169.254.10.10 and 169.254.10.11, consider aligning this check with the existing raw-table pattern used by getIPTablesRulesCompatibleWithEBPFHostRouting (includes -d 169.254.10.(10|11)/32 and the expected comment/module ordering) so the test fails if rules are applied to the wrong IPs or in an unexpected shape.

Suggested change
# Verify NOTRACK rules exist in both chains for both protocols with the comment tag
for chain in OUTPUT PREROUTING; do
for proto in tcp udp; do
if ! echo "$rules" | grep -q -- "-A ${chain}.*-p ${proto}.*localdns: skip conntrack.*--dport 53.*NOTRACK"; then
echo "FAIL: missing NOTRACK rule for $proto in $chain chain"
exit 1
fi
# Verify NOTRACK rules exist in both chains, for both protocols, and for both localdns listener IPs.
# Match a stricter rule shape so the test fails if rules are attached to the wrong destination IP
# or emitted in an unexpected form.
for chain in OUTPUT PREROUTING; do
for proto in tcp udp; do
for ip in 169.254.10.10 169.254.10.11; do
pattern="^-A ${chain} .* -d ${ip}/32 .* -p ${proto} .* -m comment --comment \\\"localdns: skip conntrack\\\" .* -m ${proto} --dport 53 .* (NOTRACK|CT --notrack)$"
if ! echo "$rules" | grep -Eq -- "$pattern"; then
echo "FAIL: missing localdns NOTRACK rule for chain=${chain} proto=${proto} ip=${ip}"
exit 1
fi
done

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The full rule shape including destination IPs and match extension ordering is already validated by ValidateIPTablesCompatibleWithCiliumEBPF in the same test suite (regex: ^-A (PREROUTING|OUTPUT) -d 169\.254\.10\.(10|11)\/32 -p (tcp|udp) -m comment --comment "localdns: skip conntrack" -m (tcp|udp) --dport 53 -j NOTRACK$). Adding the same checks here would be redundant.

jingwenw15 and others added 8 commits April 27, 2026 13:19
Reduce polling intervals from 1s to 0.1s in start_localdns() and
wait_for_localdns_ready() since CoreDNS typically starts in <100ms.
Batch iptables rules using iptables-restore instead of individual
calls to avoid repeated xtables lock acquisition. Estimated total
savings: ~1.3–2.7s during node provisioning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update add_iptable_rules_to_skip_conntrack_from_pods tests to mock
iptables-restore via PATH instead of the old IPTABLES variable, and
assert on the new iptables-restore input format (*raw, -A rules, COMMIT).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
iptables requires -j (jump target) to be last in the rule. The comment
match module must come before it, otherwise iptables-restore rejects
the rule as invalid syntax.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Place -m comment immediately after chain name in iptables-restore input
so that iptables -S displays comment before protocol match, matching the
Cilium eBPF host routing regex. The previous ordering placed comment
after --dport which caused nft backend to display it after the protocol
match extension.

Add ValidateLocalDNSIptablesRules e2e validator that checks:
- localdns.sh uses iptables-restore (batched rules)
- NOTRACK rules exist in both OUTPUT and PREROUTING chains
- Comment tag is present for cleanup logic
- NOTRACK is functional (no conntrack entries for localdns DNS traffic)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Drop NOTRACK rules temporarily, do a DNS lookup, and verify conntrack
entries appear. This proves the conntrack check is actually capable of
detecting entries and isn't silently passing. Rules are restored after.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use unfiltered conntrack lookup (no --dport) for the negative test since
conntrack entry direction fields don't always match the query direction.
TCP queries with retries and sleep for reliable detection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strip down to core validation: verify NOTRACK rules exist in both
OUTPUT and PREROUTING chains for tcp and udp with the localdns comment
tag. Remove iptables-restore file check and conntrack enforcement tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The SSH/bastion transport can inject \r characters into command output,
causing grep to fail matching even when rules are visibly present.
Remove 2>&1 (stderr warnings can corrupt rule output) and pipe through
tr -d '\r' to normalize line endings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 27, 2026 20:19
@jingwenw15 jingwenw15 force-pushed the jingwenwu/localdns-perf-optimizations branch from 23dfc20 to 64622f9 Compare April 27, 2026 20:19
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comment on lines 747 to 751
setup() {
Include "./parts/linux/cloud-init/artifacts/localdns.sh"
LOCALDNS_NODE_LISTENER_IP="10.0.0.1"
LOCALDNS_CLUSTER_LISTENER_IP="10.0.0.2"
IPTABLES_RULES=("raw -t raw -p udp --dport 53 -j NOTRACK" "raw -t raw -p tcp --dport 53 -j NOTRACK")
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This spec uses IPTABLES_RULES entries that start with raw -t raw ..., but add_iptable_rules_to_skip_conntrack_from_pods now parses each entry by treating the first token as the chain name. With the current test data the generated iptables-restore payload contains -A raw ... -t raw ... (table name used as a chain), so the test isn’t exercising the parsing logic against the production rule shape (OUTPUT/PREROUTING + -d <ip>). Consider generating IPTABLES_RULES via build_localdns_iptable_rules (or at least using OUTPUT/PREROUTING rules) in this Describe block so the test can catch regressions in the new batching/parsing logic.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines 783 to +790
When call add_iptable_rules_to_skip_conntrack_from_pods
The output should include "Adding iptables rules to skip conntrack for queries to localdns."
The output should include "iptables -A raw -t raw -p udp --dport 53 -j NOTRACK"
The output should include "iptables -A raw -t raw -p tcp --dport 53 -j NOTRACK"
The output should include "*raw"
The output should include "-A raw -m comment --comment"
The output should include "-p udp"
The output should include "-p tcp"
The output should include "-j NOTRACK"
The output should include "COMMIT"
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iptables-restore mock prints the invoked args, but the test doesn’t assert that add_iptable_rules_to_skip_conntrack_from_pods uses --noflush (and -w). Since omitting --noflush would flush the raw table and be a serious regression, it’s worth adding an assertion that the output includes the expected iptables-restore called with args: ... --noflush string.

Copilot uses AI. Check for mistakes.
Avoid storing iptables output in a shell variable and grepping from
echo, which caused intermittent failures. Instead pipe directly from
iptables -S per chain, matching the pattern used by other e2e validators.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants