3 changes: 2 additions & 1 deletion .gitignore
@@ -10,6 +10,8 @@ Chart.lock
**/secrets.yml
values-local.yaml
values-local.yml
values-*.yaml
values-*.yml

# Helm output and temporary files
*.tmp
@@ -31,4 +33,3 @@ test-output/
manifests/
rendered/
debug/

154 changes: 154 additions & 0 deletions braintrust/examples/google-autopilot-cel/values.yaml
@@ -0,0 +1,154 @@
# Sample values for GKE Autopilot deployment with CEL policy compliance

global:
orgName: "<your Braintrust org name>"
namespace: "braintrust"

cloud: "google"

google:
mode: "autopilot"
autopilotMachineFamily: "c4"

objectStorage:
google:
brainstoreBucket: "<your brainstore bucket name>"
apiBucket: "<your api bucket name>"

api:
name: "braintrust-api"
# Uncomment the following section to use a different image or tag from the version in the Helm release
#image:
#repository: public.ecr.aws/braintrust/standalone-api
#tag: "<your image tag>"
annotations:
service:
networking.gke.io/load-balancer-type: "Internal"
replicas: 4
service:
type: LoadBalancer
port: 8000
portName: http
serviceAccount:
name: "braintrust-api"
googleServiceAccount: "<your Braintrust API Google service account>"
enableGcsAuth: false
resources:
requests:
cpu: "4"
memory: "4Gi"
limits:
cpu: "4"
memory: "8Gi"
securityContext:
readOnlyRootFilesystem: true
Review comment (Contributor):

Can we smoke-test this with product-created custom code scorers rather than only pod-level checks? For example:

  1. a trivial Python/TypeScript scorer that returns 1.0
  2. a trace-level scorer that calls trace.get_spans() / trace.getSpans()

This should exercise the actual scorer sandbox startup path and the trace/object-fetch path, which is where runtime filesystem assumptions would surface under readOnlyRootFilesystem.
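
A minimal sketch of the first kind of scorer, plus a variant that probes temp-file writes from inside the scorer process. The function names and the `(input, output, expected)` signature are assumptions for illustration, not the Braintrust SDK's registration API; how the scorer gets created in the product is not shown.

```python
import tempfile


def always_one(input, output, expected=None):
    """Trivial scorer: ignores its arguments and returns 1.0."""
    return 1.0


def tmp_write_probe(input, output, expected=None):
    """Hypothetical probe scorer: 1.0 if a temp file can be written, else 0.0."""
    try:
        # NamedTemporaryFile writes under tempfile.gettempdir() (usually /tmp).
        with tempfile.NamedTemporaryFile(mode="w") as f:
            f.write("smoke")
            f.flush()
        return 1.0
    except OSError:
        # EROFS, EACCES and ENOSPC all surface as OSError subclasses.
        return 0.0
```

If the probe returns 0.0 in the sandbox while the trivial scorer still returns 1.0, the sandbox starts but cannot write temp files, which points at the tmpVolume mount rather than at scorer startup.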

allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
tmpVolume:
enabled: true
sizeLimit: "1Gi"
extraEnvVars:
- name: AWS_REGION
value: "us-central1"

brainstore:
serviceAccount:
name: "brainstore"
googleServiceAccount: "<your Braintrust Brainstore Google service account>"
# Uncomment the following section to use a different image or tag from the version in the Helm release
#image:
#repository: public.ecr.aws/braintrust/brainstore
#tag: "<your image tag>"
locksBackend: "objectStorage"

reader:
name: "brainstore-reader"
replicas: 2
service:
name: ""
type: ClusterIP
port: 4000
portName: http
resources:
requests:
cpu: "16"
memory: "32Gi"
limits:
cpu: "16"
memory: "32Gi"
cacheDir: "/mnt/tmp/brainstore"
objectStoreCacheMemoryLimit: "1Gi"
objectStoreCacheFileSize: "900Gi"
verbose: true
securityContext:
readOnlyRootFilesystem: true
Review comment (Contributor):

The cache path is still writable via /mnt/tmp/brainstore, so this looks structurally good. Can we confirm with a runtime smoke test that Brainstore does not write temp/cache files outside cacheDir when readOnlyRootFilesystem is enabled?

For example:

kubectl -n braintrust exec deploy/brainstore-reader -- sh -c 'touch /mnt/tmp/brainstore/smoke && rm /mnt/tmp/brainstore/smoke'
kubectl -n braintrust exec deploy/braintrust-api -- sh -c 'touch /tmp/smoke && rm /tmp/smoke'

Then run one product-level flow through the API:

  1. create/write one trace
  2. read/query it back
  3. run one eval or scorer/code-function path if code execution is enabled
  4. check API logs/events for EROFS, read-only file system, permission denied, or No space left on device

That should catch whether Brainstore writes anywhere outside cacheDir at runtime.
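
The log check in the last step could be scripted roughly as follows. This is a sketch only: the function name and the sample log lines are made up for illustration, and the real log format of the API or Brainstore pods may differ.

```python
import re

# Substrings the review comment asks to look for, matched case-insensitively.
FS_ERROR_PATTERN = re.compile(
    r"EROFS|read-only file system|permission denied|no space left on device",
    re.IGNORECASE,
)


def find_fs_errors(log_text):
    """Return the log lines that mention a filesystem write failure."""
    return [line for line in log_text.splitlines() if FS_ERROR_PATTERN.search(line)]


# Hypothetical log excerpt, invented for this example.
sample = "\n".join([
    "INFO request ok",
    "ERROR open /var/cache/x: Read-only file system (os error 30)",
    "WARN write failed: No space left on device",
])
print(find_fs_errors(sample))
```

The function takes plain text, so it can be fed whatever `kubectl logs` captured for the API and Brainstore deployments after the smoke run.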

allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
volume:
size: "1000Gi"
sizeLimit: "900Gi"
Review comment (Contributor):

Question on sizing:

objectStoreCacheFileSize is also 900Gi above, so this leaves no headroom inside the emptyDir limit for filesystem overhead, temp files, partial writes, or Brainstore metadata. Should sizeLimit match the requested volume.size (1000Gi), or should objectStoreCacheFileSize be set lower than sizeLimit?

The same applies to fastreader/writer below.
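
The headroom arithmetic behind this question, as a small sketch. It assumes objectStoreCacheFileSize is an upper bound on the bytes the file cache keeps on disk, which the diff does not confirm; the helper names are hypothetical and the parser handles only integer binary suffixes.

```python
# Binary suffixes used by Kubernetes resource quantities (subset).
_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(value):
    """Parse a quantity such as "900Gi" into bytes (integer binary suffixes only)."""
    for suffix, factor in _UNITS.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)  # plain byte count


def headroom_gi(size_limit, cache_file_size):
    """GiB left in the emptyDir once the file cache is full."""
    return (parse_quantity(size_limit) - parse_quantity(cache_file_size)) / 2**30


print(headroom_gi("900Gi", "900Gi"))   # values from this sample: 0.0
print(headroom_gi("1000Gi", "900Gi"))  # if sizeLimit matched volume.size: 100.0
```

With the sample values the headroom is zero, so any write beyond the cache itself could push the pod past the emptyDir limit.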

extraEnvVars:

fastreader:
name: "brainstore-fastreader"
replicas: 2
service:
name: ""
type: ClusterIP
port: 4000
portName: http
resources:
requests:
cpu: "16"
memory: "32Gi"
limits:
cpu: "16"
memory: "32Gi"
cacheDir: "/mnt/tmp/brainstore"
objectStoreCacheMemoryLimit: "1Gi"
objectStoreCacheFileSize: "900Gi"
verbose: true
securityContext:
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
volume:
size: "1000Gi"
sizeLimit: "900Gi"
extraEnvVars:

writer:
name: "brainstore-writer"
replicas: 1
service:
name: ""
type: ClusterIP
port: 4000
portName: http
resources:
requests:
cpu: "32"
memory: "64Gi"
limits:
cpu: "32"
memory: "64Gi"
cacheDir: "/mnt/tmp/brainstore"
objectStoreCacheMemoryLimit: "1Gi"
objectStoreCacheFileSize: "900Gi"
verbose: true
securityContext:
readOnlyRootFilesystem: true
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
volume:
size: "1000Gi"
sizeLimit: "900Gi"
extraEnvVars:
28 changes: 25 additions & 3 deletions braintrust/templates/api-deployment.yaml
@@ -44,6 +44,10 @@ spec:
{{- end }}
spec:
serviceAccountName: {{ .Values.api.serviceAccount.name }}
{{- with .Values.api.podSecurityContext }}
securityContext:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.api.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
@@ -60,6 +64,10 @@ spec:
- name: api
image: "{{ .Values.api.image.repository }}:{{ .Values.api.image.tag }}"
imagePullPolicy: {{ .Values.api.image.pullPolicy }}
{{- with .Values.api.securityContext }}
securityContext:
{{- toYaml . | nindent 12 }}
{{- end }}
ports:
- containerPort: {{ .Values.api.service.port }}
resources:
@@ -122,17 +130,32 @@ spec:
{{- if .Values.api.extraEnvVars }}
{{- toYaml .Values.api.extraEnvVars | nindent 12 }}
{{- end }}
{{- if and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver }}
{{- if or .Values.api.tmpVolume.enabled (and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver) }}
volumeMounts:
{{- if .Values.api.tmpVolume.enabled }}
- name: tmp-volume
mountPath: /tmp
{{- end }}
{{- if and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver }}
- name: secrets-store-inline
mountPath: "/mnt/secrets-store"
readOnly: true
{{- end }}
{{- end }}
{{- with .Values.api.extraContainers }}
{{- toYaml . | nindent 8 }}
{{- end }}
volumes:
{{- if or (and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver) .Values.api.extraVolumes }}
{{- if or .Values.api.tmpVolume.enabled (and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver) .Values.api.extraVolumes }}
{{- if .Values.api.tmpVolume.enabled }}
- name: tmp-volume
emptyDir:
{{- if .Values.api.tmpVolume.sizeLimit }}
sizeLimit: {{ .Values.api.tmpVolume.sizeLimit | quote }}
{{- else }}
{}
{{- end }}
{{- end }}
{{- if and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver }}
- name: secrets-store-inline
csi:
@@ -147,4 +170,3 @@ spec:
{{- else }}
[]
{{- end }}

15 changes: 14 additions & 1 deletion braintrust/templates/brainstore-fastreader-deployment.yaml
@@ -44,6 +44,10 @@ spec:
{{- end }}
spec:
serviceAccountName: {{ .Values.brainstore.serviceAccount.name }}
{{- with .Values.brainstore.fastreader.podSecurityContext }}
securityContext:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if or .Values.brainstore.fastreader.nodeSelector (and (eq .Values.cloud "google") (eq .Values.google.mode "autopilot")) }}
nodeSelector:
{{- with .Values.brainstore.fastreader.nodeSelector }}
@@ -67,6 +71,10 @@ spec:
- name: brainstore-fastreader
image: "{{ .Values.brainstore.image.repository }}:{{ .Values.brainstore.image.tag }}"
imagePullPolicy: {{ .Values.brainstore.image.pullPolicy }}
{{- with .Values.brainstore.fastreader.securityContext }}
securityContext:
{{- toYaml . | nindent 12 }}
{{- end }}
command: ["brainstore"]
args: ["web"]
ports:
@@ -155,7 +163,12 @@ spec:
requests:
storage: {{ required "brainstore.fastreader.volume.size must be set" .Values.brainstore.fastreader.volume.size | quote }}
{{- else }}
emptyDir: {}
emptyDir:
{{- if .Values.brainstore.fastreader.volume.sizeLimit }}
sizeLimit: {{ .Values.brainstore.fastreader.volume.sizeLimit | quote }}
{{- else }}
{}
{{- end }}
{{- end }}
{{- if and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver }}
- name: secrets-store-inline
15 changes: 14 additions & 1 deletion braintrust/templates/brainstore-reader-deployment.yaml
@@ -44,6 +44,10 @@ spec:
{{- end }}
spec:
serviceAccountName: {{ .Values.brainstore.serviceAccount.name }}
{{- with .Values.brainstore.reader.podSecurityContext }}
securityContext:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if or .Values.brainstore.reader.nodeSelector (and (eq .Values.cloud "google") (eq .Values.google.mode "autopilot")) }}
nodeSelector:
{{- with .Values.brainstore.reader.nodeSelector }}
@@ -67,6 +71,10 @@ spec:
- name: brainstore-reader
image: "{{ .Values.brainstore.image.repository }}:{{ .Values.brainstore.image.tag }}"
imagePullPolicy: {{ .Values.brainstore.image.pullPolicy }}
{{- with .Values.brainstore.reader.securityContext }}
securityContext:
{{- toYaml . | nindent 12 }}
{{- end }}
command: ["brainstore"]
args: ["web"]
ports:
@@ -155,7 +163,12 @@ spec:
requests:
storage: {{ required "brainstore.reader.volume.size must be set" .Values.brainstore.reader.volume.size | quote }}
{{- else }}
emptyDir: {}
emptyDir:
{{- if .Values.brainstore.reader.volume.sizeLimit }}
sizeLimit: {{ .Values.brainstore.reader.volume.sizeLimit | quote }}
{{- else }}
{}
{{- end }}
{{- end }}
{{- if and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver }}
- name: secrets-store-inline
15 changes: 14 additions & 1 deletion braintrust/templates/brainstore-writer-deployment.yaml
@@ -44,6 +44,10 @@ spec:
{{- end }}
spec:
serviceAccountName: {{ .Values.brainstore.serviceAccount.name }}
{{- with .Values.brainstore.writer.podSecurityContext }}
securityContext:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if or .Values.brainstore.writer.nodeSelector (and (eq .Values.cloud "google") (eq .Values.google.mode "autopilot")) }}
nodeSelector:
{{- with .Values.brainstore.writer.nodeSelector }}
@@ -67,6 +71,10 @@ spec:
- name: brainstore-writer
image: "{{ .Values.brainstore.image.repository }}:{{ .Values.brainstore.image.tag }}"
imagePullPolicy: {{ .Values.brainstore.image.pullPolicy }}
{{- with .Values.brainstore.writer.securityContext }}
securityContext:
{{- toYaml . | nindent 12 }}
{{- end }}
command: ["brainstore"]
args: ["web"]
ports:
@@ -155,7 +163,12 @@ spec:
requests:
storage: {{ required "brainstore.writer.volume.size must be set" .Values.brainstore.writer.volume.size | quote }}
{{- else }}
emptyDir: {}
emptyDir:
{{- if .Values.brainstore.writer.volume.sizeLimit }}
sizeLimit: {{ .Values.brainstore.writer.volume.sizeLimit | quote }}
{{- else }}
{}
{{- end }}
{{- end }}
{{- if and (eq .Values.cloud "azure") .Values.azure.enableAzureKeyVaultDriver }}
- name: secrets-store-inline
9 changes: 9 additions & 0 deletions braintrust/values.yaml
@@ -116,6 +116,9 @@ api:
limits:
cpu: "4"
memory: "8Gi"
tmpVolume:
enabled: false
sizeLimit: ""
# Allow running user generated code functions (e.g. scorers/tools)
allowCodeFunctionExecution: true
# Brainstore backfill configuration. These defaults are fine for most cases.
@@ -232,6 +235,8 @@ brainstore:
volume:
# Storage size for ephemeral storage requests (used with GKE Autopilot local SSDs)
size: ""
# Optional emptyDir size limit for CEL policy compliance
sizeLimit: ""
extraEnvVars: []
nodeSelector: {}
tolerations: []
@@ -271,6 +276,8 @@ brainstore:
volume:
# Storage size for ephemeral storage requests (used with GKE Autopilot local SSDs)
size: ""
# Optional emptyDir size limit for CEL policy compliance
sizeLimit: ""
extraEnvVars: []
nodeSelector: {}
tolerations: []
@@ -311,6 +318,8 @@ brainstore:
# Storage size for ephemeral storage requests
# Used with GKE Autopilot local SSDs and Azure Container Storage CSI
size: ""
# Optional emptyDir size limit for CEL policy compliance
sizeLimit: ""
extraEnvVars: []
# Example:
# - name: MY_ENV_VAR