Skip to content

[refactor] Semantic function clustering: 2 near-duplicates and 1 misplaced utility found #6499

@github-actions

Description

@github-actions

Analysis of repository: github/gh-aw-mcpg

Executive Summary

A semantic function clustering analysis was performed across all 129 non-test Go source files in internal/, cataloging 821 functions and methods. The codebase is generally well-organized with clear package boundaries. Three meaningful refactoring opportunities were identified: a near-duplicate rate-limit time-parsing pair, a session-ID truncation helper that lives in the wrong package, and a pattern of logging-only truncation utilities that overlap with strutil's general-purpose truncation primitives.

Function Inventory

By Package

Package Files Primary Purpose
internal/auth 1 Auth header parsing and session ID extraction
internal/cmd 9 CLI flags and Cobra root command
internal/config 13 TOML/JSON config loading and validation
internal/difc 9 Decentralized Information Flow Control
internal/envutil 4 Environment variable helpers
internal/guard 11 Security guards (Noop, WASM, WriteSink)
internal/httputil 4 Shared HTTP utilities
internal/launcher 4 Backend process management
internal/logger 16 Debug + file logging framework
internal/mcp 8 MCP protocol types and connection management
internal/middleware 1 jq schema processing middleware
internal/oidc 1 GitHub Actions OIDC token provider
internal/proxy 7 Filtering HTTP proxy for GitHub API
internal/server 22 HTTP server, routing, session, tool registry
internal/strutil 4 String and formatting utilities
internal/syncutil 1 Concurrency utilities
internal/sys 2 System/Docker detection
internal/testutil 4 Test helpers
internal/tracing 4 OpenTelemetry tracing setup
internal/tty 1 Terminal detection
internal/version 1 Version management

Total files analyzed: 129 · Total functions cataloged: 821 · Packages: 21


Identified Issues

1. Near-Duplicate Rate-Limit Reset Parsing Functions

Issue: Two functions with similar purpose but different input formats are defined in different packages without sharing any common logic or cross-referencing each other.

Occurrence 1 internal/httputil/github_http.go ParseRateLimitResetHeader(value string) time.Time
Occurrence 2 internal/server/circuit_breaker.go parseRateLimitResetFromText(text string) time.Time

Both functions:

  • Return a time.Time representing when a GitHub API rate limit resets
  • Return the zero value time.Time{} on parse failure
  • Apply the same RFC 3339 formatting for debug logging
// httputil/github_http.go — parses the X-RateLimit-Reset Unix-timestamp header
func ParseRateLimitResetHeader(value string) time.Time {
    unix, err := strconv.ParseInt(strings.TrimSpace(value), 10, 64)
    ...
    return time.Unix(unix, 0)
}

// server/circuit_breaker.go — parses "rate reset in Ns" from error message text
func parseRateLimitResetFromText(text string) time.Time {
    secs, err := strconv.ParseInt(strings.TrimSpace(rest[:end]), 10, 64)
    ...
    return time.Now().Add(time.Duration(secs) * time.Second)
}

Recommendation: The functions solve the same conceptual problem (determine when a rate limit expires) from two different sources. Consider moving parseRateLimitResetFromText into internal/httputil/github_http.go (or a new internal/httputil/ratelimit.go) so all rate-limit time-parsing logic is co-located. circuit_breaker.go can call the shared function. This makes it easy to extend (e.g., add a third parsing strategy) in one place.

Estimated Impact: Low effort (move + update one call site), improves discoverability.


2. TruncateSessionID Misplaced in auth Package

Issue: TruncateSessionID is a string-formatting utility (first 8 chars + ...) that lives inside internal/auth/header.go. It is not related to authentication logic; it is used exclusively for safe log output of session IDs.

  • File: internal/auth/header.go:179
  • Function: func TruncateSessionID(sessionID string) string
  • Used in: internal/server/session.go, internal/server/http_helpers.go, internal/server/middleware.go, internal/server/session_auto_init.go, internal/server/routed.go (via auth.TruncateSessionID)

The function is a thin wrapper around strutil.Truncate(sessionID, 8) with a special "(none)" sentinel for empty strings:

// internal/auth/header.go
func TruncateSessionID(sessionID string) string {
    if sessionID == "" {
        return "(none)"
    }
    return strutil.Truncate(sessionID, 8)
}

This creates an awkward import dependency: server package imports auth solely for this formatting helper, even in files that have no other authentication logic.

Recommendation: Move TruncateSessionID to internal/strutil/truncate.go (or a new internal/strutil/session.go) as a general session-ID formatting utility. Update the ~10 call sites in internal/server/ to import strutil instead of auth. This removes a cross-package concern from auth and gives strutil all string-shaping utilities.

Estimated Impact: ~10 call-site updates, cleaner auth package boundary, no behavior change.


3. Logging-Specific Truncation Helper Overlapping with strutil

Issue: internal/server/http_helpers.go defines truncateCacheKeyForLog, a private function that constructs a log-safe representation of a backendID/sessionID cache key. It duplicates the session-truncation pattern already provided by auth.TruncateSessionID.

// server/http_helpers.go
func truncateCacheKeyForLog(key string) string {
    backendID, sessionID, found := strings.Cut(key, "/")
    if !found {
        return key
    }
    // effectively: backendID + "/" + auth.TruncateSessionID(sessionID)
    ...
}

This function is only called inside server/routed.go for debug log lines. It is a natural companion to TruncateSessionID (issue 2 above) and would become trivial once TruncateSessionID is in strutil.

Recommendation: After moving TruncateSessionID to strutil (see issue 2), inline or simplify truncateCacheKeyForLog in server/routed.go using strutil.Truncate + auth.TruncateSessionID. Alternatively, expose a combined strutil.TruncateCacheKey(key string) string if the pattern is reused.

Estimated Impact: Minor cleanup, follows naturally from issue 2.


Function Clusters — Summary

Clusters in Good Shape ✓

Cluster Files Notes
Configuration validation config/validation*.go Well-decomposed across 4 focused files
Guard policy parsing config/guard_policy*.go 3 files cleanly split by concern
HTTP transport mcp/http_transport.go Large but cohesive; all HTTP MCP logic together
WASM guard logic guard/wasm*.go 4 files split by responsibility (wasm, parse, payload, validate)
Logger framework logger/*.go 16 files, each with clear single purpose
DIFC evaluation difc/evaluator.go, difc/labels.go Distinct concerns, well-separated
Session management server/session.go, server/session_auto_init.go Clear split
Response writer httputil/response_writer.go + server/response_writer.go Correct composition pattern (base + extension)

Refactoring Recommendations

Priority 1 — Medium Impact, Low Effort

Move TruncateSessionID out of auth

  • Move internal/auth/header.go:TruncateSessionIDinternal/strutil/truncate.go (or new internal/strutil/ids.go)
  • Update ~10 import sites in internal/server/
  • Estimated effort: ~30 min
  • Benefit: Cleaner auth package, strutil becomes the single home for all string-shaping utilities

Priority 2 — Low Impact, Low Effort

Co-locate rate-limit reset parsing in httputil

  • Move parseRateLimitResetFromText from internal/server/circuit_breaker.gointernal/httputil/ratelimit.go
  • Export as ParseRateLimitResetFromText for consistency with ParseRateLimitResetHeader
  • One call-site update in circuit_breaker.go
  • Estimated effort: ~20 min
  • Benefit: All rate-limit time-parsing in one place, consistent export style

Priority 3 — Follow-on from Priority 1

Simplify truncateCacheKeyForLog

  • After Priority 1 is done, simplify or inline this single-use private function
  • Estimated effort: ~10 min

Implementation Checklist

  • Move TruncateSessionID to internal/strutil/ and update all call sites
  • Move parseRateLimitResetFromText to internal/httputil/ratelimit.go (exported)
  • Simplify truncateCacheKeyForLog in server/routed.go
  • Run make test-all to verify no regressions
  • Update any relevant doc comments referencing old locations

Analysis Metadata

  • Total Go Files Analyzed: 129
  • Total Functions Cataloged: 821
  • Function Clusters Identified: 18
  • Outliers Found: 1 (TruncateSessionID in wrong package)
  • Near-Duplicates Detected: 1 pair (rate-limit reset parsers)
  • Detection Method: Static pattern analysis + naming convention clustering + import graph inspection
  • Analysis Date: 2026-05-25

Generated by Semantic Function Refactoring · sonnet46 2.1M ·

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions