[o365] Keep numbers as strings through to Elasticsearch#19245

Open. chrisberkhout wants to merge 2 commits into elastic:main from chrisberkhout:o365-decode_json_string_numbers

@chrisberkhout
Contributor

Proposed commit message

[o365] Keep numbers as strings through to Elasticsearch

By using `decode_json_string_numbers` instead of `decode_json` in the
CEL program, numbers will arrive at the ingest pipeline as strings
matching how they were formatted in the API's JSON string.
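
As a rough analogy only (this is Python's standard `json` module, not the CEL runtime, and the payload below is hypothetical), the difference between decoding numbers into numeric types and keeping them as their original string form might look like this:

```python
import json

# Hypothetical payload; field names and values are illustrative only.
raw = '{"RecordType": 15, "Size": 1.5e3}'

# Default decoding converts JSON numbers to int/float,
# so the original formatting of the literal is lost.
as_numbers = json.loads(raw)

# Using str as the number hooks keeps each numeric literal
# exactly as it was written in the JSON text.
as_strings = json.loads(raw, parse_int=str, parse_float=str)

print(as_numbers)
print(as_strings)
```

In the second result the `Size` value keeps its scientific notation, which is why some fields still need normalization later in the pipeline.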

In most cases those values can be passed straight through to
Elasticsearch to be indexed as `long` or as `keyword`, so most
conversion processing has been removed.

There are two types of exceptions to this, handled by a normalization
script early in the pipeline:
- The `o365audit` input will still provide numbers, so where values need
  to be compared, we need to ensure we have a consistently formatted
  string. That is the case for the `RecordType` field.
- Some fields arrive from the API as numbers formatted with scientific
  notation, but they are indexed as `keyword` and should not use
  scientific notation there. That is the case for the other fields
  processed by the normalization script.
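
The kind of normalization described in the two cases above can be sketched as follows. This is an illustration in Python, not the pipeline's actual script; `normalize_number` is a hypothetical helper and it assumes finite numeric input:

```python
from decimal import Decimal


def normalize_number(value):
    """Return a plain decimal string without an exponent (hypothetical helper).

    Accepts a number or a numeric string, so values from either input
    end up in one consistent string form. Assumes a finite value.
    """
    d = Decimal(str(value))
    # Whole values are rendered without a fractional part, e.g. 15.0 -> "15".
    if d == d.to_integral_value():
        return str(int(d))
    # The "f" format expands any exponent into fixed-point digits.
    return format(d, "f")


print(normalize_number(15))        # number from one input
print(normalize_number("15"))      # string from the other input
print(normalize_number("1.5E+3"))  # scientific notation from the API
```

With this approach a numeric `15` and a string `"15"` compare equal after normalization, and a value written in scientific notation is stored as plain digits.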

To use `decode_json_string_numbers`, the minimum agent versions needed
to be updated from "^8.18.0 || ^9.0.0" to "^8.19.0 || ^9.1.0".

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices

Related issues

@chrisberkhout chrisberkhout self-assigned this May 27, 2026
@chrisberkhout chrisberkhout requested review from a team as code owners May 27, 2026 15:46
@chrisberkhout chrisberkhout added the bugfix, Integration:o365, and Team:Security-Service Integrations labels May 27, 2026
@infra-vault-gh-plugin-prod

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

@elasticmachine

💚 Build Succeeded

cc @chrisberkhout

Contributor

@efd6 efd6 left a comment

There are some cases here where the fields should be converted to numbers in the ingest pipeline because their semantics are numeric, for example ….ConditionsMatched.SensitiveInformation.{Confidence,Count,UniqueCount}, o365audit.ExchangeMetaData.{FileSize,RecipientCount} and file.size (there are others).


3 participants