93 changes: 93 additions & 0 deletions packages/telemetry/telemetry.go

🟡 IdentifyUser does not alias the new identity-<id> distinctId to the user's email

IdentifyUser (line 68) only creates a PostHog alias from anonymous_cli_ + machineId → email. After this PR, if a machine-identity token is present in the environment (e.g. INFISICAL_TOKEN), GetDistinctId() returns identity-<id> instead of anonymous_cli_<machineId> for pre-login events. When the user later logs in interactively, IdentifyUser aliases the now-unused anonymous_cli_<machineId> to their email but never aliases identity-<id>. As a result, any CLI telemetry events captured before login under the identity-<id> distinctId remain orphaned in PostHog and are not merged into the logged-in person record, which is the exact scenario IdentifyUser was designed to handle.

(Refers to lines 82-89)
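
One possible way to close this gap is to alias every distinctId the CLI may have used before login, not only the anonymous one. The sketch below is hypothetical: `preLoginDistinctIDs` and its parameters are illustrative names that do not exist in the PR, and the actual PostHog alias call is left out. Only the `anonymous_cli_` and `identity-` prefixes come from the diff. Whether a machine identity should be merged into a human user's person record at all is a product decision this sketch does not settle.

```go
package main

import "fmt"

// preLoginDistinctIDs is a hypothetical helper (not part of the PR). It lists
// every distinctId that pre-login events may have been captured under, so the
// caller can alias each one to the user's email.
func preLoginDistinctIDs(machineID, identityID string) []string {
	var ids []string
	if identityID != "" {
		ids = append(ids, "identity-"+identityID)
	}
	if machineID != "" {
		ids = append(ids, "anonymous_cli_"+machineID)
	}
	return ids
}

func main() {
	// Example values are made up for illustration.
	for _, id := range preLoginDistinctIDs("abc123", "42") {
		fmt.Printf("alias %s -> user@example.com\n", id)
	}
}
```

IdentifyUser could loop over the returned slice and enqueue one alias per entry, which keeps the existing anonymous alias and adds the identity one only when a token was present.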

@@ -1,7 +1,11 @@
package telemetry

import (
"encoding/base64"
"encoding/json"
"errors"
"os"
"strings"

"github.com/Infisical/infisical-merge/packages/util"
"github.com/denisbrodbeck/machineid"
@@ -89,6 +93,76 @@ func (t *Telemetry) IdentifyUser(email string) {
	// all enqueued events (Identify, Alias, and Capture).
}

// getMachineIdentityIdFromEnv inspects the environment variables that the
// CLI uses to receive machine-identity access tokens (the same set checked
// by util.GetInfisicalToken, minus the `--token` flag which is per-command
// and not visible to the telemetry layer) and, if a machine-identity JWT
// is present, returns the `identityId` claim from its payload.
//
// The function is intentionally best-effort and silent on failure:
//   - returns "" if no token is set
//   - returns "" for service tokens (`st.` prefix), which carry no JWT
//     payload and represent the deprecated service-token auth mode
//   - returns "" if the JWT is malformed or missing the `identityId` claim
//
// The token's signature is not verified; the value is only used to derive
// a PostHog distinctId, never for authorization. The same token has already
// been (or is about to be) sent to the Infisical API where its signature is
// verified server-side.
func getMachineIdentityIdFromEnv() string {
	// Mirror the env-var precedence in util.GetInfisicalToken so that the
	// telemetry distinctId aligns with the credential the API call will
	// actually use:
	//   1. INFISICAL_UNIVERSAL_AUTH_ACCESS_TOKEN
	//   2. INFISICAL_TOKEN
	//   3. TOKEN (legacy gateway env var)
	envVars := []string{
		util.INFISICAL_UNIVERSAL_AUTH_ACCESS_TOKEN_NAME,
		util.INFISICAL_TOKEN_NAME,
		util.INFISICAL_GATEWAY_TOKEN_NAME_LEGACY,
	}

	var token string
	for _, name := range envVars {
		if v := os.Getenv(name); v != "" {
			token = v
			break
		}
	}

	if token == "" {
		return ""
	}

	// Service tokens are deprecated and not JWTs, so there is no identityId to extract.
	if strings.HasPrefix(token, "st.") {
		return ""
	}

	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return ""
	}

	payloadBytes, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		// Some JWT issuers pad the payload with `=`; tolerate that variant.
		payloadBytes, err = base64.URLEncoding.DecodeString(parts[1])
		if err != nil {
			return ""
		}
	}

	var claims struct {
		IdentityID string `json:"identityId"`
	}
	if err := json.Unmarshal(payloadBytes, &claims); err != nil {
		return ""
	}

	return claims.IdentityID
}

func (t *Telemetry) GetDistinctId() (string, error) {
	var distinctId string

@@ -102,8 +176,27 @@ func (t *Telemetry) GetDistinctId() (string, error) {
		log.Debug().Err(err).Msg("failed to get config file for telemetry")
	}

	// Resolution priority:
	//   1. Logged-in user email from the persisted config. A logged-in user
	//      takes precedence over any machine-identity token that happens to
	//      be exported in the shell, because some commands never authenticate
	//      against the backend at all (e.g. `infisical user switch`, the
	//      local-config branch of `infisical login`) and others authenticate
	//      with the user's session JWT rather than the env-token. Attributing
	//      those events to a stale `identity-<id>` would corrupt person-level
	//      analytics, while attributing them to the logged-in email is always
	//      correct.
	//   2. Machine-identity access token from env. This is the dominant case
	//      in CI / containers / Kubernetes pods, where there is no logged-in
	//      user and the only credential is `INFISICAL_TOKEN` (or the UA-scoped
	//      env var). Aligns with the `identity-<id>` distinctId the backend
	//      uses for MachineIdentityLogin and other identity-scoped events,
	//      so CLI events flow into the same person record.
	//   3. Anonymous fallback keyed by the local machine ID.
	if infisicalConfig.LoggedInUserEmail != "" {
		distinctId = infisicalConfig.LoggedInUserEmail
	} else if identityId := getMachineIdentityIdFromEnv(); identityId != "" {
		distinctId = "identity-" + identityId
	} else if machineId != "" {
		distinctId = "anonymous_cli_" + machineId
	}
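
The resolution priority in the hunk above can be restated as a small pure function, which makes the ordering easy to unit-test without touching the config file, the environment, or the machine ID lookup. This is only a sketch: `resolveDistinctID` and its three string parameters are hypothetical names, not code from the PR, and the inputs are assumed to have been gathered by the caller.

```go
package main

import "fmt"

// resolveDistinctID is a hypothetical, side-effect-free restatement of the
// priority order used in GetDistinctId: logged-in email first, then the
// machine identity from the env token, then the anonymous machine-ID fallback.
func resolveDistinctID(email, identityID, machineID string) string {
	switch {
	case email != "":
		return email
	case identityID != "":
		return "identity-" + identityID
	case machineID != "":
		return "anonymous_cli_" + machineID
	default:
		// Nothing to key on; the caller decides how to handle this.
		return ""
	}
}

func main() {
	// Example values are made up for illustration.
	fmt.Println(resolveDistinctID("dev@example.com", "42", "abc123"))
	fmt.Println(resolveDistinctID("", "42", "abc123"))
	fmt.Println(resolveDistinctID("", "", "abc123"))
}
```

Keeping the decision separate from the lookups would also let a test cover the case the comment calls out, where a logged-in email and an exported machine-identity token are both present.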