Implement time-boxed CLI telemetry with opt-out controls #4927

Draft
arturcic wants to merge 5 commits into GitTools:next/v7 from arturcic:feature/telemetry

@arturcic
Member

Description

Introduces an optional telemetry pipeline to collect CLI usage data, featuring automated redaction of paths and sensitive values. Collection is restricted to a 3-month window following the release date and supports multiple opt-out mechanisms including environment variables and a CLI flag.

Related Issue

Resolves #XYZ

Motivation and Context

This change helps maintainers make data-driven OSS design decisions by understanding common CLI patterns and environment contexts while preserving user privacy through strict redaction and transparency notices.

How Has This Been Tested?

Tested via new unit test suites covering argument parsing, redaction logic, environment-based opt-out, and release window validation.

Screenshots (if appropriate):

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@arturcic arturcic requested review from HHobeck, asbjornu and gep13 April 29, 2026 15:41
@arturcic
Member Author

Here is a bit more context:

Implement time-boxed, privacy-focused telemetry for the GitVersion CLI.

Introduce a telemetry pipeline that collects narrow CLI usage data for a 3-month window following a release. This system includes automated redaction of sensitive paths and credentials while providing multiple opt-out mechanisms via environment variables and CLI flags.

Changes

  • Add CommandLineTelemetry model and TelemetryReporter to handle data collection and user disclosure.
  • Integrate telemetry tracking into both ArgumentParser and LegacyArgumentParser with automated PII redaction.
  • Update MSBuild targets to identify internal tool invocations via the GITVERSION_INTERNAL_CALLER environment variable.
  • Embed release date metadata into assemblies during build to enforce the three-month collection window.

Impact

  • Users receive a one-time disclosure notice in the console upon the first telemetry-eligible execution.
  • Enables maintainers to analyze CLI argument usage and CI provider distribution without collecting repository content.
  • Provides three opt-out methods: DO_NOT_TRACK, GITVERSION_TELEMETRY_OPTOUT, or the --telemetry-opt-out flag.
  • Introduces new dependencies on local application data storage for tracking the disclosure notice state.

What is left is to decide on the backend where all the usage data is saved, preferably with a dashboard we can use to see the different usage types.

This is a follow-up to the refactoring done for v7, where the new POSIX-compliant CLI was introduced. @asbjornu @gep13 @HHobeck, would you mind having a look and suggesting a backend solution for this? Thank you.

arturcic added 5 commits May 5, 2026 11:03
Implement a telemetry pipeline to collect redacted CLI usage data for OSS design decisions. Users can opt out via environment variables `DO_NOT_TRACK`, `GITVERSION_TELEMETRY_OPTOUT`, or the `--telemetry-opt-out` flag.
Embed release dates in assembly metadata during build and disable telemetry if the window has expired or metadata is missing.
Detect the current CI environment and identify whether the CLI was invoked directly or via GitVersion.MsBuild to better understand usage patterns.
…older

Organize the GitVersion.App project structure by grouping telemetry components into their own directory.
…amespaces

Remove fully qualified names for build agent interfaces and assembly reflection types to improve code readability.
@arturcic arturcic force-pushed the feature/telemetry branch from 86aadfa to 7305c88 Compare May 5, 2026 09:03
@sonarqubecloud

sonarqubecloud Bot commented May 5, 2026

Member

@asbjornu asbjornu left a comment

Great stuff! Where and how are you (or we) going to monitor the data collected?

@@ -0,0 +1,163 @@
using System.Globalization;

namespace GitVersion;
Member

Should the Telemetry stuff be inside a GitVersion.Telemetry namespace, perhaps?

public const string MetadataKey = "GitVersionReleaseDate";
public const string Format = "yyyy-MM-dd";

public static bool TryGetReleaseDate(Assembly assembly, out DateOnly releaseDate)
Member

Why isn't the out parameter of type TelemetryReleaseDate? That would simplify the IsWithinWindow method so it can work on the TelemetryReleaseDate instance instead of being static and taking the release date as a DateOnly argument.

Suggested change
- public static bool TryGetReleaseDate(Assembly assembly, out DateOnly releaseDate)
+ public static bool TryGetReleaseDate(Assembly assembly, out TelemetryReleaseDate releaseDate)

return DateOnly.TryParseExact(value, Format, CultureInfo.InvariantCulture, DateTimeStyles.None, out releaseDate);
}

public static bool IsWithinWindow(DateOnly releaseDate, DateOnly utcToday) =>
Member

As per the previous suggestion, this would be easier to work with rather than having to go through DateOnly for every interaction.

Suggested change
- public static bool IsWithinWindow(DateOnly releaseDate, DateOnly utcToday) =>
+ public bool IsWithinWindow(DateOnly utcToday) =>

public const string Verbosity = "verbosity";
}

internal static class TelemetryReleaseDate
Member

Why not make TelemetryReleaseDate a proper value object rather than just a collection of static methods?

Suggested change
- internal static class TelemetryReleaseDate
+ internal class TelemetryReleaseDate
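
To make the three suggestions above concrete, here is one way the value object could look. It is a sketch under stated assumptions rather than the PR's implementation: it uses a `readonly record struct` instead of the suggested `class`, parses from a string instead of reading assembly metadata so that the example stays self-contained, and the `TryParse` name, the `WindowMonths` constant, and the exact window boundaries (release day inclusive, the day three months later exclusive) are all hypothetical. The `Format` value and the three-month length come from the PR.

```csharp
#nullable enable
using System;
using System.Globalization;

// Sketch of TelemetryReleaseDate as a value object (hypothetical shape).
public readonly record struct TelemetryReleaseDate(DateOnly Value)
{
    // Date format used in the PR's metadata.
    public const string Format = "yyyy-MM-dd";

    // Three months per the PR description; the constant name is an assumption.
    public const int WindowMonths = 3;

    // Parses the metadata string; the PR's TryGetReleaseDate takes an Assembly instead.
    public static bool TryParse(string? text, out TelemetryReleaseDate releaseDate)
    {
        if (DateOnly.TryParseExact(text, Format, CultureInfo.InvariantCulture, DateTimeStyles.None, out var date))
        {
            releaseDate = new TelemetryReleaseDate(date);
            return true;
        }

        releaseDate = default;
        return false;
    }

    // Instance method, as suggested in the review: no DateOnly release date argument needed.
    // Assumption: the window starts on the release day and ends before the same day three months later.
    public bool IsWithinWindow(DateOnly utcToday) =>
        utcToday >= Value && utcToday < Value.AddMonths(WindowMonths);
}
```

A caller would then write `TelemetryReleaseDate.TryParse(value, out var releaseDate) && releaseDate.IsWithinWindow(DateOnly.FromDateTime(DateTime.UtcNow))`, treating a missing or unparsable date as "telemetry disabled", which matches the behavior described in the commit messages.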


public void AddFlag(string name) => AddValues(name, ["true"]);

public void AddValue(string name, string? value, TelemetryValueKind kind = TelemetryValueKind.Plain)
Member

If it isn't much work, I'm thinking it would be better to have a one-time immutable mapping of all arguments we want to collect and their kind here inside TelemetryModels.cs, so that it's impossible to pass the wrong TelemetryValueKind. That would also make invoking the AddValue() and AddValues() methods simpler.

Suggested change
- public void AddValue(string name, string? value, TelemetryValueKind kind = TelemetryValueKind.Plain)
+ public void AddValue(string name, string? value)
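
The mapping idea above might look something like the following. This is only an illustrative sketch: the `CommandLineTelemetrySketch` class, the `Path` and `Secret` enum members, the `target-path` and `password` argument names, the redaction placeholders, and the choice to silently drop unmapped arguments are assumptions. Only `TelemetryValueKind.Plain`, the `verbosity` name, and the `AddFlag`/`AddValue`/`AddValues` method names appear in the diff.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Only Plain is visible in the PR excerpt; Path and Secret are hypothetical members.
public enum TelemetryValueKind { Plain, Path, Secret }

// Hypothetical stand-in for the PR's telemetry model.
public sealed class CommandLineTelemetrySketch
{
    // One-time mapping from argument name to kind, so callers never pass a kind themselves.
    private static readonly IReadOnlyDictionary<string, TelemetryValueKind> Kinds =
        new Dictionary<string, TelemetryValueKind>(StringComparer.OrdinalIgnoreCase)
        {
            ["verbosity"] = TelemetryValueKind.Plain,   // name from the PR
            ["target-path"] = TelemetryValueKind.Path,  // hypothetical argument name
            ["password"] = TelemetryValueKind.Secret,   // hypothetical argument name
        };

    private readonly Dictionary<string, IReadOnlyList<string>> collected =
        new(StringComparer.OrdinalIgnoreCase);

    public IReadOnlyDictionary<string, IReadOnlyList<string>> Collected => collected;

    public void AddFlag(string name) => AddValues(name, new[] { "true" });

    public void AddValue(string name, string? value)
    {
        if (value is not null)
        {
            AddValues(name, new[] { value });
        }
    }

    public void AddValues(string name, IEnumerable<string>? values)
    {
        // Assumption: arguments missing from the mapping are never collected.
        if (values is null || !Kinds.TryGetValue(name, out var kind))
        {
            return;
        }

        collected[name] = values.Select(v => Redact(v, kind)).ToList();
    }

    // Placeholder redaction; the PR's actual redaction rules are not shown in this thread.
    private static string Redact(string value, TelemetryValueKind kind) => kind switch
    {
        TelemetryValueKind.Plain => value,
        TelemetryValueKind.Path => "<path>",
        _ => "<redacted>",
    };
}
```

With this shape, a call such as `telemetry.AddValue("target-path", "/home/me/repo")` can only ever record the redacted placeholder, because the kind is looked up rather than supplied at the call site.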

AddValues(name, [value], kind);
}

public void AddValues(string name, IEnumerable<string>? values, TelemetryValueKind kind = TelemetryValueKind.Plain)
Member

Suggested change
- public void AddValues(string name, IEnumerable<string>? values, TelemetryValueKind kind = TelemetryValueKind.Plain)
+ public void AddValues(string name, IEnumerable<string>? values)

}

- private void MapParsedValues(Arguments arguments, ParseResult parseResult, CommandOptions options)
+ private void MapParsedValues(
Member

It would be nice to reduce the complexity of MapParsedValues from 84 as suggested by SonarCloud. Perhaps not in this PR, but it might be relevant for what this PR wants to do to map arguments to TelemetryValueKind. So perhaps something that could be done first to make the implementation of Telemetry simpler? Make the change easy, then make the easy change. :)

@arturcic
Member Author

arturcic commented May 5, 2026

> Great stuff! Where and how are you (or we) going to monitor the data collected?

That's exactly why I tagged everybody: we need to find a solution for that, and it might require setting up some infrastructure for the telemetry. If any of you have past experience or best practices to share, that would be great.

@HHobeck
Contributor

HHobeck commented May 5, 2026

FYI: I have seen that reqnroll provides some reports with statistics as well. Maybe we can draw on the experience of the project members, or reuse part of the source code.

https://reqnroll.net/news/2026/03/monthly-stats-2026-02/

@asbjornu
Member

asbjornu commented May 6, 2026

> That's exactly why I tagged everybody: we need to find a solution for that, and it might require setting up some infrastructure for the telemetry. If any of you have past experience or best practices to share, that would be great.

Having been the maintainer of Sentry's C# SDK for a while, I have good experience with Sentry. We can apply for an open source (free) license if we want. Sentry is at its core geared more towards error monitoring, but it can be used for analytics as well. And I think it would be useful to also integrate its error reporting mechanisms into GitVersion if we choose to go in this direction.

> FYI: I have seen that reqnroll provides some reports with statistics as well. Maybe we can draw on the experience of the project members, or reuse part of the source code.
>
> https://reqnroll.net/news/2026/03/monthly-stats-2026-02/

Very interesting @HHobeck! I really like the transparency. This is something we could (mostly) automate with a GitHub action to extract data from the backend we choose and create a post that can be published monthly on gitversion.net. Do you know which analytics backend Reqnroll uses?

@arturcic
Member Author

arturcic commented May 6, 2026

> > That's exactly why I tagged everybody: we need to find a solution for that, and it might require setting up some infrastructure for the telemetry. If any of you have past experience or best practices to share, that would be great.
>
> Having been the maintainer of Sentry's C# SDK for a while, I have good experience with Sentry. We can apply for an open source (free) license if we want. Sentry is at its core geared more towards error monitoring, but it can be used for analytics as well. And I think it would be useful to also integrate its error reporting mechanisms into GitVersion if we choose to go in this direction.
>
> > FYI: I have seen that reqnroll provides some reports with statistics as well. Maybe we can draw on the experience of the project members, or reuse part of the source code.
> > https://reqnroll.net/news/2026/03/monthly-stats-2026-02/
>
> Very interesting @HHobeck! I really like the transparency. This is something we could (mostly) automate with a GitHub action to extract data from the backend we choose and create a post that can be published monthly on gitversion.net. Do you know which analytics backend Reqnroll uses?

I've been researching and found https://github.com/aptabase/aptabase and https://github.com/aptabase/self-hosting, but then we would need to host it ourselves.

@HHobeck
Contributor

HHobeck commented May 7, 2026

Please review my proposal. I want to ask the maintainer of reqnroll:

Hi!

I really like the transparency reports/statistics you publish for Reqnroll. We’re discussing something similar for GitVersion and I was wondering: how did you implement the reporting pipeline?

Mainly curious about:

  • which analytics backend(s) you use,
  • how the data is collected,
  • where you host your infrastructure (self-hosted or third party),
  • whether the reports are generated automatically, and
  • whether any part of the implementation is reusable or open source.

We’re especially interested in automated monthly report generation (possibly via GitHub Actions) containing the following information:

  • .NET Framework Usage
  • CI/CD Server Usage
  • Visual Studio version usage

We’d love to learn from your experience before reinventing the wheel 🙂

Cheers!

@gep13
Member

gep13 commented May 7, 2026

Working with telemetry is not something that I have any experience with, so I am not sure how much help, if any, I can offer here.

The only thing I would ask for clarification on is: is the default to opt in to telemetry, or to opt out? I have seen some backlash recently about decisions made to make the default opt-in for things.

@gep13
Member

gep13 commented May 7, 2026

@HHobeck said...
> Please review my proposal. I want to ask the maintainer of reqnroll:

This looks good to me!

@gep13
Member

gep13 commented May 7, 2026

This was the recent change (not linking directly):

https://github.com/cli/cli/pull/13254

@arturcic
Member Author

arturcic commented May 7, 2026

> Working with telemetry is not something that I have any experience with, so I am not sure how much help, if any, I can offer here.
>
> The only thing I would ask for clarification on is: is the default to opt in to telemetry, or to opt out? I have seen some backlash recently about decisions made to make the default opt-in for things.

At least for OSS projects, the common approach is that telemetry is collected by default and the user has to opt out (I've seen a couple of CLI tools do that, including dotnet). The reason is that, as an OSS project, we want to understand the usage, and that will drive the decisions we make regarding the CLI in this case. The idea is to keep it time-boxed so that we collect enough usage data for a specific version. We could also enable the time-boxed telemetry only for major (and minor) versions, for example, but not for patch versions. We don't want to keep it enabled for too long, as we only need a snapshot of the usage.

As for opt-in versus opt-out, I prefer that users have to opt out: most users probably will not know that there is telemetry they could opt in to, and even if they know, they might not want to.

@arturcic
Member Author

arturcic commented May 7, 2026

> https://github.com/cli/cli/pull/13254

thanks for sharing

@arturcic
Member Author

arturcic commented May 7, 2026

> @HHobeck said...
>
> > Please review my proposal. I want to ask the maintainer of reqnroll:
>
> This looks good to me!

same

@arturcic
Member Author

arturcic commented May 7, 2026

We could also take inspiration from how Homebrew does it.

What is really important is to be fully transparent about what we want to collect and, as seen in the brew case, to make the analytics publicly available.

4 participants