Skip to content

Conversation

@baxeaz
Copy link
Contributor

@baxeaz baxeaz commented May 17, 2025

What was the problem/requirement? (What/Why)

We want to support redacted environment variables through openjd_redacted_env from RFC 0003

What was the solution? (How)

Add support for the new token.

What is the impact of this change?

Environment variables may be set similarly to openjd_env while their values become redacted strings in the logs through openjd_redacted_env:

How was this change tested?

All existing tests pass

All new unit tests pass

Running with the following new sample template:

specificationVersion: "jobtemplate-2023-09"
extensions:
  - REDACTED_ENV_VARS
name: Env Enter Setting Vars
description: |
  Setting environment variables in a way which won't expose secret values to logs

jobEnvironments:
  - name: JobEnvVars
    variables:
      PYTHONUNBUFFERED: "True"
    script:
      actions:
        onEnter:
          command: python
          args: ["{{Env.File.Enter}}"]
        onExit:
          command: python
          args: ["{{Env.File.Exit}}"]
      embeddedFiles:
        - name: Enter
          type: TEXT
          data: |
            print("Entering environment and setting vars..")
            print(f"openjd_redacted_env: SECRETVAR=SECRETVAL")
            print(f"openjd_redacted_env: INNERVAL=RET")
            print(f"openjd_redacted_env: KEYSPACE =SECRETVAL")
            print(f"openjd_redacted_env: VALSPACE= SPACEVAL")
            print(f"openjd_redacted_env: DOUBLEEQUALS=SECRETVAL=SECRETVAL")
            print(f"openjd_redacted_env: SECRETVAL")
        - name: Exit
          type: TEXT
          data: |
            import os
            print(f"SECRETVAR is {os.environ.get('SECRETVAR')}")
            print(f"KEYSPACE is {os.environ.get('KEYSPACE')}")
            print(f"VALSPACE is {os.environ.get('VALSPACE')}")
            print(f"DOUBLEEQUALS is {os.environ.get('DOUBLEEQUALS')}")
            print(f"INNERVAL is {os.environ.get('INNERVAL')}")
steps:
  - name: ProcessWithPython
    script:
      actions:
        onRun:
          command: python
          args: ["{{Task.File.Run}}"]
      embeddedFiles:
        - name: Run
          type: TEXT
          data: |
            import os

            print(f"SECRETVAR is {os.environ.get('SECRETVAR')}")
            print(f"KEYSPACE is {os.environ.get('KEYSPACE')}")
            print(f"VALSPACE is {os.environ.get('VALSPACE')}")
            print(f"DOUBLEEQUALS is {os.environ.get('DOUBLEEQUALS')}")
            print(f"INNERVAL is {os.environ.get('INNERVAL')}")
            print(f"{os.environ.get('SECRETVAR')}")
            print(f" {os.environ.get('SECRETVAR')}")
            print(f".{os.environ.get('SECRETVAR')}")
            print(f".{os.environ.get('SECRETVAR')} ;")
            print("DONE")
            # Run Application Consuming SECRETVAR..

Produces the following output:

openjd run ./redaction_template.yaml
0:00:00.000039  Open Job Description CLI: Session start 2025-05-17T01:14:54.283239+00:00
0:00:00.000104  Open Job Description CLI: Running job 'Env Enter Setting Vars'
0:00:00.000177
0:00:00.000218  ==============================================
0:00:00.000253  --------- Entering Environment: JobEnvVars
0:00:00.000285  ==============================================
0:00:00.000329  Setting: PYTHONUNBUFFERED=True
0:00:00.000623  ----------------------------------------------
0:00:00.000674  Phase: Setup
0:00:00.000710  ----------------------------------------------
0:00:00.000749  Writing embedded files for Environment to disk.
0:00:00.000949  Mapping: Env.File.Enter -> /tmp/OpenJD/CLI-sessionaxj378ht/embedded_files_1iiomay/tmpng1nezxq
0:00:00.001011  Mapping: Env.File.Exit -> /tmp/OpenJD/CLI-sessionaxj378ht/embedded_files_1iiomay/tmpprsjxh2m
0:00:00.001159  Wrote: Enter -> /tmp/OpenJD/CLI-sessionaxj378ht/embedded_files_1iiomay/tmpng1nezxq
0:00:00.001316  Wrote: Exit -> /tmp/OpenJD/CLI-sessionaxj378ht/embedded_files_1iiomay/tmpprsjxh2m
0:00:00.001620  ----------------------------------------------
0:00:00.001669  Phase: Running action
0:00:00.001701  ----------------------------------------------
0:00:00.002198  Running command /tmp/OpenJD/CLI-sessionaxj378ht/tmp4a52d3ge.sh
0:00:00.003410  Command started as pid: 11390
0:00:00.003663  Output:
0:00:00.014806  Entering environment and setting vars..
0:00:00.014953  openjd_redacted_env: SECRETVAR=********
0:00:00.015435  openjd_redacted_env: INNERVAL=********
0:00:00.015578  Malformed openjd_redacted_env command: invalid variable name. No environment variable will be set.
0:00:00.015546  openjd_redacted_env: KEYSPACE =********
0:00:00.015649  openjd_redacted_env: VALSPACE=********
0:00:00.015714  openjd_redacted_env: DOUBLEEQUALS=********
0:00:00.015800  Malformed openjd_redacted_env command: missing equals sign. No environment variable will be set.
0:00:00.015779  openjd_redacted_env: ********
0:00:00.017008  Process pid 11390 exited with code: 0 (unsigned) / 0x0 (hex)
0:00:00.017214  Open Job Description CLI: Running step 'ProcessWithPython'
0:00:00.017402
0:00:00.017502  ==============================================
0:00:00.017545  --------- Running Task
0:00:00.017577  ==============================================
0:00:00.017979  ----------------------------------------------
0:00:00.018034  Phase: Setup
0:00:00.018070  ----------------------------------------------
0:00:00.018108  Writing embedded files for Task to disk.
0:00:00.018241  Mapping: Task.File.Run -> /tmp/OpenJD/CLI-sessionaxj378ht/embedded_files_1iiomay/tmpaeh3ghtx
0:00:00.018400  Wrote: Run -> /tmp/OpenJD/CLI-sessionaxj378ht/embedded_files_1iiomay/tmpaeh3ghtx
0:00:00.018680  ----------------------------------------------
0:00:00.018729  Phase: Running action
0:00:00.018775  ----------------------------------------------
0:00:00.018971  Running command /tmp/OpenJD/CLI-sessionaxj378ht/tmp6ke7zb1n.sh
0:00:00.019735  Command started as pid: 11393
0:00:00.019903  Output:
0:00:00.030240  SEC********VAR is ********
0:00:00.030392  KEYSPACE is None
0:00:00.030461  VALSPACE is ********
0:00:00.030523  DOUBLEEQUALS is ********
0:00:00.030583  INNERVAL is ********
0:00:00.030638  ********
0:00:00.030691   ********
0:00:00.030742  .********
0:00:00.030792  .******** ;
0:00:00.030842  DONE
0:00:00.033111  Process pid 11393 exited with code: 0 (unsigned) / 0x0 (hex)
0:00:00.033390
0:00:00.033469  ==============================================
0:00:00.033515  --------- Exiting Environment: JobEnvVars
0:00:00.033562  ==============================================
0:00:00.033895  ----------------------------------------------
0:00:00.033956  Phase: Setup
0:00:00.033997  ----------------------------------------------
0:00:00.034039  Writing embedded files for Environment to disk.
0:00:00.034282  Mapping: Env.File.Enter -> /tmp/OpenJD/CLI-sessionaxj378ht/embedded_files_1iiomay/tmphvxftul3
0:00:00.034355  Mapping: Env.File.Exit -> /tmp/OpenJD/CLI-sessionaxj378ht/embedded_files_1iiomay/tmpay2mgfoj
0:00:00.034511  Wrote: Enter -> /tmp/OpenJD/CLI-sessionaxj378ht/embedded_files_1iiomay/tmphvxftul3
0:00:00.034652  Wrote: Exit -> /tmp/OpenJD/CLI-sessionaxj378ht/embedded_files_1iiomay/tmpay2mgfoj
0:00:00.035059  ----------------------------------------------
0:00:00.035115  Phase: Running action
0:00:00.035150  ----------------------------------------------
0:00:00.035338  Running command /tmp/OpenJD/CLI-sessionaxj378ht/tmpe8qpbfyp.sh
0:00:00.036338  Command started as pid: 11397
0:00:00.036535  Output:
0:00:00.047704  SEC********VAR is ********
0:00:00.047820  KEYSPACE is None
0:00:00.047882  VALSPACE is ********
0:00:00.047941  DOUBLEEQUALS is ********
0:00:00.048005  INNERVAL is ********
0:00:00.050284  Process pid 11397 exited with code: 0 (unsigned) / 0x0 (hex)
0:00:00.050511
0:00:00.050599  Open Job Description CLI: All actions completed successfully!
0:00:00.050649  Open Job Description CLI: Local Session ended! Now cleaning up Session resources.

Running with the following sample template:

specificationVersion: "jobtemplate-2023-09"
extensions:
  - REDACTED_ENV_VARS
name: Env Enter Setting Vars
description: |
  Setting environment variables in a way which won't expose secret values to logs

jobEnvironments:
  - name: JobEnvVars
    variables:
      PYTHONUNBUFFERED: "True"
    script:
      actions:
        onEnter:
          command: python
          args: ["{{Env.File.Enter}}"]
        onExit:
          command: python
          args: ["{{Env.File.Exit}}"]
      embeddedFiles:
        - name: Enter
          type: TEXT
          data: |
            import json
            print('openjd_redacted_env: \"SECRETVAR2=line\\nbreak\"')
            print('openjd_redacted_env: "SECRETVAR3=first\\nsecond"')
        - name: Exit
          type: TEXT
          data: |
            import os
            print(f"SECRETVAR2 is {os.environ.get('SECRETVAR2')}")
            print(f"SECRETVAR3 is {os.environ.get('SECRETVAR3')}")
steps:
  - name: ProcessWithPython
    script:
      actions:
        onRun:
          command: python
          args: ["{{Task.File.Run}}"]
      embeddedFiles:
        - name: Run
          type: TEXT
          data: |
            print("DONE")
            # Run Application Consuming SECRETVAR..

Produces the following output:

0:00:00.000038  Open Job Description CLI: Session start 2025-05-29T23:52:35.091735+00:00
0:00:00.000097  Open Job Description CLI: Running job 'Env Enter Setting Vars'
0:00:00.000168
0:00:00.000209  ==============================================
0:00:00.000244  --------- Entering Environment: JobEnvVars
0:00:00.000278  ==============================================
0:00:00.000325  Setting: PYTHONUNBUFFERED=True
0:00:00.000605  ----------------------------------------------
0:00:00.000657  Phase: Setup
0:00:00.000694  ----------------------------------------------
0:00:00.000736  Writing embedded files for Environment to disk.
0:00:00.000967  Mapping: Env.File.Enter -> /tmp/OpenJD/CLI-sessionowy82bbc/embedded_files3igg4xdh/tmpuh6nc77a
0:00:00.001038  Mapping: Env.File.Exit -> /tmp/OpenJD/CLI-sessionowy82bbc/embedded_files3igg4xdh/tmpfr2h5h4f
0:00:00.001183  Wrote: Enter -> /tmp/OpenJD/CLI-sessionowy82bbc/embedded_files3igg4xdh/tmpuh6nc77a
0:00:00.001335  Wrote: Exit -> /tmp/OpenJD/CLI-sessionowy82bbc/embedded_files3igg4xdh/tmpfr2h5h4f
0:00:00.001648  ----------------------------------------------
0:00:00.001698  Phase: Running action
0:00:00.001734  ----------------------------------------------
0:00:00.002107  Running command /tmp/OpenJD/CLI-sessionowy82bbc/tmp3kbs_ddp.sh
0:00:00.003223  Command started as pid: 41921
0:00:00.003338  Output:
0:00:00.021855  openjd_redacted_env: "SECRETVAR2=********
0:00:00.022163  openjd_redacted_env: "SECRETVAR3=********
0:00:00.024670  Process pid 41921 exited with code: 0 (unsigned) / 0x0 (hex)
0:00:00.024926  Open Job Description CLI: Running step 'ProcessWithPython'
0:00:00.025035
0:00:00.025089  ==============================================
0:00:00.025137  --------- Running Task
0:00:00.025180  ==============================================
0:00:00.025591  ----------------------------------------------
0:00:00.025660  Phase: Setup
0:00:00.025708  ----------------------------------------------
0:00:00.025758  Writing embedded files for Task to disk.
0:00:00.025891  Mapping: Task.File.Run -> /tmp/OpenJD/CLI-sessionowy82bbc/embedded_files3igg4xdh/tmp944v7cz2
0:00:00.026043  Wrote: Run -> /tmp/OpenJD/CLI-sessionowy82bbc/embedded_files3igg4xdh/tmp944v7cz2
0:00:00.026326  ----------------------------------------------
0:00:00.026389  Phase: Running action
0:00:00.026441  ----------------------------------------------
0:00:00.026686  Running command /tmp/OpenJD/CLI-sessionowy82bbc/tmp6c7ovo4f.sh
0:00:00.027449  Command started as pid: 41924
0:00:00.027585  Output:
0:00:00.039175  DONE
0:00:00.041628  Process pid 41924 exited with code: 0 (unsigned) / 0x0 (hex)
0:00:00.042090
0:00:00.042242  ==============================================
0:00:00.042345  --------- Exiting Environment: JobEnvVars
0:00:00.042454  ==============================================
0:00:00.042981  ----------------------------------------------
0:00:00.043035  Phase: Setup
0:00:00.043074  ----------------------------------------------
0:00:00.043114  Writing embedded files for Environment to disk.
0:00:00.043307  Mapping: Env.File.Enter -> /tmp/OpenJD/CLI-sessionowy82bbc/embedded_files3igg4xdh/tmpn6doj3_f
0:00:00.043390  Mapping: Env.File.Exit -> /tmp/OpenJD/CLI-sessionowy82bbc/embedded_files3igg4xdh/tmptvqj4yzg
0:00:00.043552  Wrote: Enter -> /tmp/OpenJD/CLI-sessionowy82bbc/embedded_files3igg4xdh/tmpn6doj3_f
0:00:00.043703  Wrote: Exit -> /tmp/OpenJD/CLI-sessionowy82bbc/embedded_files3igg4xdh/tmptvqj4yzg
0:00:00.044148  ----------------------------------------------
0:00:00.044205  Phase: Running action
0:00:00.044242  ----------------------------------------------
0:00:00.044556  Running command /tmp/OpenJD/CLI-sessionowy82bbc/tmpspi3x026.sh
0:00:00.045568  Command started as pid: 41928
0:00:00.045695  Output:
0:00:00.057254  SECRETVAR2 is ********
0:00:00.057450  ********
0:00:00.057566  SECRETVAR3 is ********
0:00:00.057686  ********
0:00:00.060073  Process pid 41928 exited with code: 0 (unsigned) / 0x0 (hex)
0:00:00.060376
0:00:00.060639  Open Job Description CLI: All actions completed successfully!
0:00:00.060762  Open Job Description CLI: Local Session ended! Now cleaning up Session resources.

Was this change documented?

Yes

Is this a breaking change?

No

Does this change impact security?

Yes - it has been through a security review and approved.


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@baxeaz baxeaz requested a review from a team as a code owner May 17, 2025 01:29
@baxeaz baxeaz force-pushed the mainline branch 4 times, most recently from f5485d3 to 50db13f Compare May 18, 2025 22:11
Comment on lines 508 to 510
h = sha256()
h.update(message.encode("utf-8"))
logger_name = "redacted" + h.hexdigest()[0:32]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the logger names need to be these hashes? Hard to tell if it's a functional part of the logger or just a way to make a random name for the test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah kind of ugly having this in the tests. Basically we're just setting up unique logger names. I've moved all of these to be dealt with inside the helper I made based on your other suggestion so they aren't cluttering up these tests anymore

mock_log.warning.assert_called_once()
assert "REDACTED_ENV_VARS extension is not enabled" in mock_log.warning.call_args[0][0]

# The callback should NOT be called since our implementation changed

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"since our implementation changed" seems like a leftover AI comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, removed, thanks!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the code in these tests looks like it gets repeated a lot, like setting up the logger. Is there a good way to parametrize the tests or make some helpers? (There might not be a great way.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good call, it sure does! I've created a helper that reduces the boilerplate in these tests

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One higher level test that would be easy to read and help explain the feature at the same time could be having an input log file and output file, and assert on the output contents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a similar test in here like what you're thinking? I see some similar ones in openjd-cli and did add a more long form higher level test like what I think you might be describing, that will be in the next PR

@baxeaz baxeaz force-pushed the mainline branch 2 times, most recently from 3aa697e to 7f3a1e1 Compare May 20, 2025 02:47
crowecawcaw
crowecawcaw previously approved these changes May 20, 2025
# On Windows, we need to handle redaction in command strings
cmd_line = list2cmdline(self._args)
# Pre-redact any sensitive information in the command string
cmd_line_for_logger = pre_redact_command(cmd_line)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the redacted value gets escaped as a result of the list2cmdline call, it would probably slip through here. Should this instead redact the _args list to in parallel construct the cmd line for logging?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure what case you're thinking here - pre redact will redact everything after the openjd_redacted_env: token appears (We don't try to figure out where the value(s) begin and end in what might be a complicated multi line string) - we just want to be sure we're redacting something that is not yet but is likely about to become a redacted value.

@baxeaz baxeaz force-pushed the mainline branch 3 times, most recently from ae54d72 to 4ada5c9 Compare May 23, 2025 21:20
# Split into name and value
parts = message.split("=", 1)
name = parts[0].rstrip() # Remove trailing spaces
value = parts[1] # Keep leading spaces in value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this doesn't handle the JSON case of _handle_env. Can you add tests for a case like:

openjd_redacted_env: "MY_VAR=abc\ndef"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the JSON case was missed, thanks! I've added JSON/multi line support with testing and a second openjd run template example in the description to show the current behavior.

or "REDACTED_ENV_VARS" in self._revision_extensions.extensions
)

def apply_message_redaction(self, record: logging.LogRecord):

Check notice

Code scanning / CodeQL

Explicit returns mixed with implicit (fall through) returns Note

Mixing implicit and explicit returns may indicate an error, as implicit returns always return None.

# If we have an original value (from JSON parsing), add it to redaction set too
if original_value is not None:
self._redacted_values.add(original_value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will have the trailing quote, is that what you intended here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is yeah - not completely necessary though

… openjd_redacted_env

Signed-off-by: Brian Axelson <86568017+baxeaz@users.noreply.github.com>
@sonarqubecloud
Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
3 Security Hotspots
4.7% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

@baxeaz baxeaz merged commit b949a16 into OpenJobDescription:mainline Jun 2, 2025
18 of 19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants