Troubleshooting Guide

Overview

This guide helps diagnose and resolve common issues with AI CLI Preparation, including network timeouts, rate limiting, version detection failures, classification errors, and performance problems.

Quick Diagnostics

Enable Debug Mode

CLI_AUDIT_DEBUG=1 python3 cli_audit.py --only problematic-tool

Shows:

Suppressed exceptions
Cache operations
Classification decisions
Best-effort fallbacks

Enable Trace Mode

CLI_AUDIT_TRACE=1 python3 cli_audit.py --only problematic-tool

Shows:

Detailed execution flow
Function entry/exit
Timing breakdowns
Slow operation warnings

Network Trace

CLI_AUDIT_TRACE_NET=1 python3 cli_audit.py --only problematic-tool

Shows:

HTTP request URLs
Response codes
Retry attempts
Error details

Combined Diagnostics

CLI_AUDIT_DEBUG=1 CLI_AUDIT_TRACE=1 CLI_AUDIT_TRACE_NET=1 python3 cli_audit.py --only tool 2>&1 | tee debug.log

Common Issues

1. Network Timeouts

Symptoms:

Empty latest_upstream column
Slow audit execution (>30s for 50 tools)
Timeout messages in debug output

Causes:

Slow network connection
Upstream API latency
DNS resolution issues
Firewall blocking requests

Diagnosis:

# Test network reachability
curl -I https://api.github.com
curl -I https://registry.npmjs.org
curl -I https://pypi.org

# Check DNS resolution
dig api.github.com
dig registry.npmjs.org

# Test with network trace
CLI_AUDIT_TRACE_NET=1 python3 cli_audit.py --only ripgrep

Solutions:

Increase Timeout:

CLI_AUDIT_TIMEOUT_SECONDS=10 python3 cli_audit.py

Increase Retries:

CLI_AUDIT_HTTP_RETRIES=5 CLI_AUDIT_BACKOFF_BASE=0.5 python3 cli_audit.py

Use Offline Mode:

CLI_AUDIT_OFFLINE=1 python3 cli_audit.py

Use Manual-First Mode:

CLI_AUDIT_MANUAL_FIRST=1 python3 cli_audit.py

Configure Proxy (if needed):

export HTTP_PROXY=http://proxy.example.com:8080
export HTTPS_PROXY=http://proxy.example.com:8080
python3 cli_audit.py

2. GitHub Rate Limiting

Symptoms:

Empty upstream versions for GitHub tools
HTTP 403 errors in trace output
Message: "API rate limit exceeded"

Causes:

Unauthenticated requests: 60/hour limit
Authenticated requests: 5000/hour limit
Too many concurrent workers

Diagnosis:

# Check current rate limit status
curl -H "Authorization: Bearer $GITHUB_TOKEN" \
  https://api.github.com/rate_limit

# Test with trace
CLI_AUDIT_TRACE_NET=1 python3 cli_audit.py --only ripgrep 2>&1 | grep "403"

Solutions:

Add GitHub Token:

# Generate token at https://github.com/settings/tokens
export GITHUB_TOKEN=ghp_your_token_here
python3 cli_audit.py

Reduce Concurrency:

CLI_AUDIT_HOST_CAP_GITHUB_API=2 python3 cli_audit.py

Use Offline Mode:

# When rate limited, fall back to manual cache
CLI_AUDIT_OFFLINE=1 python3 cli_audit.py

Check Rate Limit in Script:

import os, json, urllib.request

token = os.environ.get("GITHUB_TOKEN", "")
headers = {"Authorization": f"Bearer {token}"} if token else {}
req = urllib.request.Request("https://api.github.com/rate_limit", headers=headers)
resp = urllib.request.urlopen(req)
data = json.loads(resp.read())
print(f"Remaining: {data['rate']['remaining']}/{data['rate']['limit']}")
print(f"Reset: {data['rate']['reset']}")

3. Version Detection Failures

Symptoms:

Tool shows as NOT INSTALLED but is on PATH
Version column shows "X" or empty
Classification shows as unknown

Causes:

Tool uses non-standard version flag
Version output filtered by error detection
Tool binary not executable
PATH issues

Diagnosis:

# Check if tool is on PATH
which ripgrep || which rg

# Test version flag manually
rg --version
rg -V
rg version

# Check permissions
ls -l $(which rg)

# Test with debug
CLI_AUDIT_DEBUG=1 python3 cli_audit.py --only ripgrep

Solutions:

Add Custom Version Detection:

Edit cli_audit.py function get_version_line():

# Around line 1037
def get_version_line(path: str, tool_name: str) -> str:
    # Add custom handling for your tool
    if tool_name == "your-tool":
        return run_with_timeout([path, "version"])  # Custom flag

    # Try common flags
    for flag in ["--version", "-V", "version", "-v"]:
        output = run_with_timeout([path, flag])
        if output and not is_error_output(output):
            return output

    return ""

Fix PATH Issues:

# Check PATH
echo $PATH | tr ':' '\n'

# Add missing directory
export PATH="$HOME/.local/bin:$PATH"

# Verify tool is found
which your-tool

Fix Permissions:

# Make tool executable
chmod +x ~/.local/bin/your-tool

4. Classification Errors

Symptoms:

Wrong installed_method (e.g., ~/.local/bin instead of uv tool)
Inconsistent classifications for same tool
Missing or generic classifications

Causes:

Ambiguous installation paths
Multiple installation methods present
Shebang parsing failures
Environment detection issues

Diagnosis:

# Check tool path and classification
CLI_AUDIT_JSON=1 python3 cli_audit.py --only ripgrep | jq '.[] | {tool, installed_method, installed_path_resolved, classification_reason}'

# Check shebang
head -1 $(which your-tool)

# Check symlink resolution
ls -l $(which your-tool)
readlink -f $(which your-tool)

# Check environment hints
env | grep -E "(UV_|VIRTUAL_ENV|CONDA|NVM_)"

Solutions:

Refine Classification Rules:

Edit cli_audit.py function _classify_install_method():

# Around line 900
def _classify_install_method(path: str, tool_name: str) -> tuple[str, str]:
    # Add more specific path patterns
    if "/.mymanager/bin/" in path:
        return "mymanager", "path-under-~/.mymanager/bin"

    # Read shebang for ~/.local/bin disambiguation
    if "/.local/bin/" in path:
        shebang = _read_first_line(path)
        if "uv" in shebang:
            return "uv tool", "shebang-contains-uv"
        if "pipx" in shebang:
            return "pipx/user", "shebang-contains-pipx"

    # ... existing logic

Set Classification Hints:

Edit upstream_versions.json:

{
  "__methods__": {
    "your-tool": "uv tool"
  }
}

Resolve Path Ambiguity:

# Check for multiple installations
which -a ripgrep
which -a rg

# Prioritize by PATH order
export PATH="$HOME/.local/bin:/usr/bin"

5. HTTP Retry Failures

Symptoms:

Persistent empty upstream versions
Network trace shows all retries failing
Specific hosts always fail

Causes:

SSL/TLS certificate issues
Proxy interference
Host-specific blocking
Outdated Python urllib

Diagnosis:

# Test with curl
curl -v https://api.github.com/repos/BurntSushi/ripgrep/releases/latest

# Check Python SSL
python3 -c "import ssl; print(ssl.OPENSSL_VERSION)"

# Test with network trace
CLI_AUDIT_TRACE_NET=1 python3 cli_audit.py --only ripgrep 2>&1 | grep -A5 "http_exc"

Solutions:

Update SSL Certificates:

# Debian/Ubuntu
sudo apt-get update
sudo apt-get install --reinstall ca-certificates

# macOS
brew upgrade openssl@3

Configure Proxy:

export HTTP_PROXY=http://proxy.example.com:8080
export HTTPS_PROXY=http://proxy.example.com:8080
export NO_PROXY=localhost,127.0.0.1

Increase Retry Attempts:

CLI_AUDIT_HTTP_RETRIES=10 CLI_AUDIT_BACKOFF_BASE=1.0 python3 cli_audit.py

Use Alternative Host (if available):

Edit cli_audit.py function latest_github():

# Use Atom feed instead of API
def latest_github(owner: str, repo: str) -> tuple[str, str]:
    # Try feed first
    feed_url = f"https://github.com/{owner}/{repo}/releases.atom"
    # ... existing logic

6. Cache Corruption

Symptoms:

JSON parsing errors
Missing cache entries after update
Inconsistent behavior between runs
Snapshot fails to load

Causes:

Interrupted write operations
Concurrent modifications
Disk space issues
File permission problems

Diagnosis:

# Validate JSON files
jq '.' upstream_versions.json
jq '.' tools_snapshot.json

# Check file permissions
ls -l upstream_versions.json tools_snapshot.json

# Check disk space
df -h .

# Look for temp files
ls -la .tmp_*

Solutions:

Recover from Backup:

# Restore from git
git checkout upstream_versions.json tools_snapshot.json

# Rebuild snapshot
make update

Fix Permissions:

chmod 644 upstream_versions.json tools_snapshot.json

Clean Temp Files:

rm -f .tmp_*.json

Prevent Corruption:

# Atomic writes are built-in, but ensure:
# 1. Sufficient disk space (check with df -h)
# 2. Write permissions (check with ls -l)
# 3. No concurrent processes writing same files

Manual Cache Repair:

# Validate and pretty-print
jq '.' upstream_versions.json > upstream_versions.json.tmp
mv upstream_versions.json.tmp upstream_versions.json

# Refresh from upstream
python audit.py --update-baseline

7. Version Parsing Edge Cases

Symptoms:

Status shows UNKNOWN despite valid versions
Version comparison incorrect
Semantic versioning failures

Causes:

Non-standard version formats
Prefixed versions (e.g., jq-1.8.1)
Date-based versions (e.g., 20250822)
Pre-release tags (e.g., 1.0.0-beta)

Diagnosis:

# Check raw version strings
CLI_AUDIT_JSON=1 python3 cli_audit.py --only jq | jq '.[] | {tool, installed, latest_upstream, installed_version, latest_version}'

# Test version parsing
python3 -c "
import re
version = 'jq-1.8.1'
match = re.search(r'(\d+\.\d+\.\d+)', version)
print(match.group(1) if match else 'FAIL')
"

Solutions:

Enhance Version Extraction:

Edit cli_audit.py function extract_version_number():

# Around line 1100
def extract_version_number(s: str) -> str:
    # Add custom patterns

    # Handle prefix formats (e.g., jq-1.8.1)
    match = re.search(r'[a-z]+-(\d+\.\d+\.\d+)', s)
    if match:
        return match.group(1)

    # Handle go versions (e.g., go1.25.1)
    match = re.search(r'go(\d+\.\d+\.\d+)', s)
    if match:
        return match.group(1)

    # ... existing logic

Manual Version Normalization:

Edit upstream_versions.json:

{
  "jq": "1.8.1",
  "parallel": "20250822"
}

Add Comparison Override:

Edit cli_audit.py function audit_tool():

# Special case for date-based versions
if tool.name == "parallel":
    # Compare as integers
    installed_int = int(installed_version) if installed_version.isdigit() else 0
    latest_int = int(latest_version) if latest_version.isdigit() else 0
    status = "UP-TO-DATE" if installed_int >= latest_int else "OUTDATED"

8. Performance Issues

Symptoms:

Audit takes >30s for 50 tools
High CPU usage
Memory growth over time
Slow rendering

Causes:

Too many workers (contention)
Slow upstream APIs
Large number of tools
Inefficient subprocess calls

Diagnosis:

# Profile execution
time make update

# Identify slow tools
CLI_AUDIT_TRACE=1 CLI_AUDIT_SLOW_MS=1000 make update 2>&1 | grep "slow"

# Monitor resources
top -p $(pgrep -f cli_audit.py)

# Test render performance
time make audit

Solutions:

Optimize Worker Count:

# Too many workers cause contention
CLI_AUDIT_MAX_WORKERS=12 make update

# Find optimal value for your system
for workers in 4 8 12 16 20 24; do
  echo "Testing $workers workers:"
  time CLI_AUDIT_MAX_WORKERS=$workers make update
done

Reduce Timeout:

# Fail fast on slow tools
CLI_AUDIT_TIMEOUT_SECONDS=2 make update

Use Fast Mode:

CLI_AUDIT_FAST=1 make update

Enable Manual-First:

CLI_AUDIT_MANUAL_FIRST=1 make update

Limit Host Concurrency:

CLI_AUDIT_HOST_CAP_GITHUB_API=2 CLI_AUDIT_HOST_CAP_NPM=2 make update

Use Snapshot Rendering:

# Collect once
make update

# Render many times (fast)
make audit
make audit
make audit

9. Docker Detection Issues

Symptoms:

Docker version shows as "X" or empty
Slow docker inspection
Docker info missing

Causes:

Docker daemon not running
Permission issues (not in docker group)
Docker info inspection enabled but slow

Diagnosis:

# Check docker status
docker version
docker ps

# Check permissions
groups | grep docker

# Test with docker info disabled
CLI_AUDIT_DOCKER_INFO=0 python3 cli_audit.py --only docker

Solutions:

Fix Permissions:

# Add user to docker group
sudo usermod -aG docker $USER
newgrp docker

# Verify
docker ps

Disable Docker Info:

CLI_AUDIT_DOCKER_INFO=0 make update

Start Docker Daemon:

# systemd
sudo systemctl start docker

# Docker Desktop
# Start from GUI

10. Environment Variable Conflicts

Symptoms:

Unexpected behavior
Settings not applying
Conflicting modes

Causes:

Multiple .env files
Shell environment overrides
Typos in variable names

Diagnosis:

# Check all CLI_AUDIT_* variables
env | grep CLI_AUDIT

# Check for .env files
ls -a .env*

# Check shell config
grep CLI_AUDIT ~/.bashrc ~/.zshrc

Solutions:

Clear Environment:

# Unset all CLI_AUDIT variables
unset $(env | grep '^CLI_AUDIT_' | cut -d= -f1)

# Verify clean state
env | grep CLI_AUDIT

Source Correct .env:

# Load specific profile
set -a
source .env.production
set +a

# Verify
env | grep CLI_AUDIT

Debug Variable Precedence:

# Print all settings at startup
CLI_AUDIT_DEBUG=1 python3 cli_audit.py --only python 2>&1 | head -20

Debugging Workflows

Workflow 1: Diagnose Single Tool

# 1. Test tool manually
which ripgrep
ripgrep --version

# 2. Run audit with full debugging
CLI_AUDIT_DEBUG=1 \
CLI_AUDIT_TRACE=1 \
CLI_AUDIT_TRACE_NET=1 \
python3 cli_audit.py --only ripgrep 2>&1 | tee ripgrep_debug.log

# 3. Check classification
CLI_AUDIT_JSON=1 python3 cli_audit.py --only ripgrep | jq '.[] | {installed_method, classification_reason, installed_path_resolved}'

# 4. Test upstream lookup
python3 -c "
from cli_audit import latest_github
tag, method = latest_github('BurntSushi', 'ripgrep')
print(f'Tag: {tag}, Method: {method}')
"

Workflow 2: Diagnose Network Issues

# 1. Test network reachability
for host in api.github.com registry.npmjs.org pypi.org crates.io; do
  echo "Testing $host:"
  curl -I https://$host
done

# 2. Test with retries
CLI_AUDIT_TRACE_NET=1 \
CLI_AUDIT_HTTP_RETRIES=5 \
python3 cli_audit.py --only ripgrep 2>&1 | grep -E "(http_|retry)"

# 3. Test offline fallback
CLI_AUDIT_OFFLINE=1 python3 cli_audit.py --only ripgrep

# 4. Check cache state
jq '.versions.ripgrep' upstream_versions.json

Workflow 3: Diagnose Performance

# 1. Baseline timing
time CLI_AUDIT_COLLECT=1 python3 cli_audit.py

# 2. Identify slow tools
CLI_AUDIT_TRACE=1 CLI_AUDIT_SLOW_MS=1000 make update 2>&1 | grep "slow"

# 3. Test different worker counts
for workers in 8 12 16 20; do
  echo "Testing $workers workers:"
  time CLI_AUDIT_MAX_WORKERS=$workers make update
done

# 4. Test with optimizations
time CLI_AUDIT_FAST=1 \
CLI_AUDIT_MANUAL_FIRST=1 \
CLI_AUDIT_TIMEOUT_SECONDS=2 \
make update

Workflow 4: Diagnose Cache Issues

# 1. Validate JSON
jq '.' upstream_versions.json
jq '.' local_state.json

# 2. Check metadata
jq '.__meta__' upstream_versions.json
jq '.versions | length' upstream_versions.json

# 3. Rebuild cache
git checkout upstream_versions.json
python audit.py --update-baseline

# 4. Verify consistency
CLI_AUDIT_OFFLINE=1 make audit

Advanced Debugging

Python Debugger (pdb)

# Add breakpoint in cli_audit.py
def audit_tool(tool: Tool) -> tuple:
    import pdb; pdb.set_trace()  # Breakpoint
    # ... rest of function

python3 cli_audit.py --only ripgrep
# (Pdb) commands available:
# n - next line
# s - step into
# c - continue
# p variable - print variable
# l - list code

Logging to File

# Capture all output
make update 2>&1 | tee update.log

# Filter for errors
make update 2>&1 | grep -i "error\|exception\|fail" | tee errors.log

# Extract timing info
CLI_AUDIT_TRACE=1 make update 2>&1 | grep "slow\|ms)" | tee timing.log

Interactive Python Testing

# Launch Python REPL
python3

# Import and test functions
from cli_audit import *

# Test version extraction
extract_version_number("ripgrep 14.1.1 (rev abc123)")

# Test tool audit
tool = Tool("ripgrep", ("rg",), "gh", ("BurntSushi", "ripgrep"))
result = audit_tool(tool)
print(result)

# Test classification
detect_install_method("/home/user/.cargo/bin/rg", "ripgrep")

# Test upstream lookup
latest_github("BurntSushi", "ripgrep")

Environment Variable Reference

Debugging Variables

Variable	Purpose	Example
`CLI_AUDIT_DEBUG`	Print debug messages	`CLI_AUDIT_DEBUG=1`
`CLI_AUDIT_TRACE`	Detailed execution trace	`CLI_AUDIT_TRACE=1`
`CLI_AUDIT_TRACE_NET`	Network call tracing	`CLI_AUDIT_TRACE_NET=1`
`CLI_AUDIT_PROGRESS`	Show progress updates	`CLI_AUDIT_PROGRESS=1`
`CLI_AUDIT_SLOW_MS`	Slow operation threshold	`CLI_AUDIT_SLOW_MS=1000`

Performance Variables

Variable	Purpose	Default	Range
`CLI_AUDIT_MAX_WORKERS`	Thread pool size	16	1-32
`CLI_AUDIT_TIMEOUT_SECONDS`	Operation timeout	3	1-30
`CLI_AUDIT_HTTP_RETRIES`	Retry attempts	2	0-5
`CLI_AUDIT_BACKOFF_BASE`	Retry backoff base	0.2	0.1-2.0
`CLI_AUDIT_FAST`	Skip slow operations	0	0-1

Network Variables

Variable	Purpose	Default
`CLI_AUDIT_OFFLINE`	Force offline mode	0
`CLI_AUDIT_MANUAL_FIRST`	Try cache first	0
`GITHUB_TOKEN`	GitHub API token	""
`HTTP_PROXY`	HTTP proxy URL	""
`HTTPS_PROXY`	HTTPS proxy URL	""

Host Concurrency Caps

Variable	Purpose	Default
`CLI_AUDIT_HOST_CAP_GITHUB`	github.com cap	4
`CLI_AUDIT_HOST_CAP_GITHUB_API`	api.github.com cap	4
`CLI_AUDIT_HOST_CAP_NPM`	registry.npmjs.org cap	4
`CLI_AUDIT_HOST_CAP_CRATES`	crates.io cap	4
`CLI_AUDIT_HOST_CAP_GNU`	GNU FTP cap	2

Getting Help

Documentation Resources

API_REFERENCE.md - Function signatures and parameters
ARCHITECTURE.md - System design and data flow
DEVELOPER_GUIDE.md - Contributing and development
DEPLOYMENT.md - Operations and configuration
TOOL_ECOSYSTEM.md - Tool catalog

Support Channels

GitHub Issues: Report bugs and request features
GitHub Discussions: Ask questions and share solutions
Pull Requests: Contribute fixes and improvements

Reporting Bugs

When reporting issues, include:

Environment:

python3 --version
uname -a
env | grep CLI_AUDIT

Command:

# Exact command that failed
CLI_AUDIT_DEBUG=1 python3 cli_audit.py --only tool

Output:

# Full debug output
CLI_AUDIT_DEBUG=1 CLI_AUDIT_TRACE=1 python3 cli_audit.py --only tool 2>&1 | tee debug.log

Cache State:

jq '.__meta__' upstream_versions.json
jq '.versions | keys' upstream_versions.json

Expected vs Actual Behavior:

What you expected to happen
What actually happened
Steps to reproduce

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Troubleshooting Guide

Overview

Quick Diagnostics

Enable Debug Mode

Enable Trace Mode

Network Trace

Combined Diagnostics

Common Issues

1. Network Timeouts

2. GitHub Rate Limiting

3. Version Detection Failures

4. Classification Errors

5. HTTP Retry Failures

6. Cache Corruption

7. Version Parsing Edge Cases

8. Performance Issues

9. Docker Detection Issues

10. Environment Variable Conflicts

Debugging Workflows

Workflow 1: Diagnose Single Tool

Workflow 2: Diagnose Network Issues

Workflow 3: Diagnose Performance

Workflow 4: Diagnose Cache Issues

Advanced Debugging

Python Debugger (pdb)

Logging to File

Interactive Python Testing

Environment Variable Reference

Debugging Variables

Performance Variables

Network Variables

Host Concurrency Caps

Getting Help

Documentation Resources

Support Channels

Reporting Bugs

See Also

FilesExpand file tree

TROUBLESHOOTING.md

Latest commit

History

TROUBLESHOOTING.md

File metadata and controls

Troubleshooting Guide

Overview

Quick Diagnostics

Enable Debug Mode

Enable Trace Mode

Network Trace

Combined Diagnostics

Common Issues

1. Network Timeouts

2. GitHub Rate Limiting

3. Version Detection Failures

4. Classification Errors

5. HTTP Retry Failures

6. Cache Corruption

7. Version Parsing Edge Cases

8. Performance Issues

9. Docker Detection Issues

10. Environment Variable Conflicts

Debugging Workflows

Workflow 1: Diagnose Single Tool

Workflow 2: Diagnose Network Issues

Workflow 3: Diagnose Performance

Workflow 4: Diagnose Cache Issues

Advanced Debugging

Python Debugger (pdb)

Logging to File

Interactive Python Testing

Environment Variable Reference

Debugging Variables

Performance Variables

Network Variables

Host Concurrency Caps

Getting Help

Documentation Resources

Support Channels

Reporting Bugs

See Also