Add CI validation workflow, copilot instructions, and collection lifecycle #75

Closed
diberry wants to merge 10 commits into Azure-Samples:main from diberry:ci/validation-workflow-and-lifecycle

diberry commented Apr 30, 2026

Summary

Adds infrastructure for sample validation and developer guidance:

GitHub Actions Workflow (validate-samples.yml)

  • Dual-mode: Build-only for PR/push triggers, full-run on manual dispatch
  • Preflight job: Validates SAMPLES_ENV_FILE secret exists before running
  • Security: Uses env: block for secret injection, never direct interpolation
  • Concurrency: Cancels redundant runs on the same branch

Copilot Instruction Files

  • Main .github/copilot-instructions.md with shared conventions
  • Per-language files (TypeScript, Python, Go, Java, .NET) with:
    • Bulk insert patterns (specific method per language)
    • Env var loading approach
    • Collection lifecycle patterns

Collection Lifecycle Standardization

  • Start: Conditional drop (only if collection exists)
  • End: Always drop in finally/defer (cleanup for next run)
  • Prevents name conflicts during parallel CI execution
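
The lifecycle above could be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `DbLike`, `CollectionLike`, `listCollectionNames`, and `withFreshCollection` are hypothetical names, and the interfaces only approximate what a MongoDB driver exposes so that the sketch runs without a database.

```typescript
// Hypothetical structural types; a real driver's Db/Collection would be adapted to these.
interface CollectionLike {
  drop(): Promise<boolean>;
}

interface DbLike {
  listCollectionNames(): Promise<string[]>; // assumed helper, not a driver method
  createCollection(name: string): Promise<CollectionLike>;
  collection(name: string): CollectionLike;
}

async function withFreshCollection<T>(
  db: DbLike,
  name: string,
  work: (collection: CollectionLike) => Promise<T>,
): Promise<T> {
  // START: drop only if a collection with this name already exists.
  const existing = await db.listCollectionNames();
  if (existing.some((n) => n === name)) {
    await db.collection(name).drop();
  }

  const collection = await db.createCollection(name);
  try {
    return await work(collection);
  } finally {
    // END: always drop, even when work() throws, so the next run starts clean.
    await db.collection(name).drop();
  }
}
```

In Go the same shape would use `defer` for the final drop. Giving each sample its own collection name is what keeps parallel runs from dropping each other's data.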

select-algorithm-typescript Updates

  • Removed IP metric (redundant with COS for normalized vectors)
  • Added multi-query support (5 diverse default queries)
  • Added proper collection cleanup in finally block
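
To illustrate why IP adds nothing over COS once vectors are normalized, here is a small self-contained check. The helper names (`dot`, `normalize`, `cosine`) are illustrative and not taken from the sample. For unit-length vectors the denominator of cosine similarity is 1, so the two metrics produce the same value.

```typescript
// Inner product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Scale a vector to unit length.
function normalize(v: number[]): number[] {
  const length = Math.sqrt(dot(v, v));
  return v.map((x) => x / length);
}

// Cosine similarity: inner product divided by the product of the lengths.
function cosine(a: number[], b: number[]): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

const a = normalize([3, 4]);
const b = normalize([1, 2]);

// For normalized inputs the two values agree up to floating-point rounding.
console.log(Math.abs(dot(a, b) - cosine(a, b)) < 1e-12);
```

For unnormalized inputs the two metrics diverge, which is the only case where comparing IP separately would tell you something new.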

Testing

Workflow can be tested via Actions tab > Run workflow (requires SAMPLES_ENV_FILE repo secret).

diberry and others added 10 commits April 29, 2026 12:19
Implement vector index algorithm comparison samples (IVF, HNSW, DiskANN)
for Python, TypeScript, Go, Java, and C#/.NET.

Each sample demonstrates:
- IVF index creation (numLists=10) for <10K documents
- HNSW index creation (m=16, efConstruction=64) for 10K-50K documents
- DiskANN index creation (maxDegree=20, lBuild=10) for 50K+ documents
- Vector search using the $search aggregation stage with cosmosSearch
- Passwordless auth via DefaultAzureCredential/OIDC
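
As a rough sketch of how the size thresholds and parameters above map to index definitions: the function `chooseVectorIndex` is hypothetical, and the option field names (`kind`, `similarity`, `dimensions`, and the `vector-*` kind strings) reflect my understanding of the cosmosSearch index options, so verify them against the service documentation. Treating exactly 10K as HNSW and exactly 50K as DiskANN is also an assumption, because the commit message leaves the boundaries open.

```typescript
type VectorIndexKind = "vector-ivf" | "vector-hnsw" | "vector-diskann";

interface VectorIndexSpec {
  name: string;
  key: { [field: string]: string };
  cosmosSearchOptions: {
    kind: VectorIndexKind;
    similarity: string;
    dimensions: number;
    [param: string]: string | number;
  };
}

// Pick an algorithm from the document count, using the parameters from the commit message.
function chooseVectorIndex(docCount: number, field: string, dimensions: number): VectorIndexSpec {
  const base = { similarity: "COS", dimensions };
  let options: VectorIndexSpec["cosmosSearchOptions"];
  if (docCount < 10000) {
    options = { kind: "vector-ivf", numLists: 10, ...base };
  } else if (docCount < 50000) {
    options = { kind: "vector-hnsw", m: 16, efConstruction: 64, ...base };
  } else {
    options = { kind: "vector-diskann", maxDegree: 20, lBuild: 10, ...base };
  }
  return {
    name: `${options.kind}_${field}`, // index name format is an arbitrary choice
    key: { [field]: "cosmosSearch" },
    cosmosSearchOptions: options,
  };
}

console.log(JSON.stringify(chooseVectorIndex(25000, "contentVector", 3), null, 2));
```

The returned object is meant to be passed to whatever index-creation call the driver provides. The sketch stops short of that call so it stays runnable without a cluster.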

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Java: Fix TOKEN_RESOURCE from cosmos.azure.com to ossrdbms-aad.database.windows.net
- TypeScript IVF: Remove inconsistent returnStoredSource field
- .NET .env.example: Fix vector field name to contentVector, remove unused AZURE_TENANT_ID
- Java .env.example: Remove unused AZURE_MANAGED_IDENTITY_PRINCIPAL_ID
- Python .env.example: Fix API version to 2023-05-15 for consistency

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…onBuilder

- Remove DotNetEnv package, add Microsoft.Extensions.Configuration packages
- Add appsettings.json with strongly-typed config sections
- Add Models/Configuration.cs with AppConfiguration classes
- Update Program.cs to use ConfigurationBuilder (json + env var override)
- Update Utils.cs to accept AppConfiguration parameter
- Update all demo Run() methods to receive config from Program.cs
- Delete .env.example (no longer needed)
- Update README to reference appsettings.json + azd env get-values

Matches Article 1 (vector-search-dotnet) configuration pattern.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
All non-.NET Article 2 READMEs now show azd env get-values > .env
as the primary config method after azd up, with manual cp .env.example
as fallback. Matches Article 1 README pattern.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Runs all 9 combinations (3 algorithms x 3 metrics) in a single
execution with formatted comparison output.
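
One way to enumerate the nine runs is a plain nested loop over algorithms and metrics. This is a sketch, not the sample's code: the commit message does not list the three metrics, so `COS`, `L2`, and `IP` here are an assumption, and `allCombinations` is a hypothetical name.

```typescript
const ALGORITHMS = ["IVF", "HNSW", "DiskANN"] as const;
const METRICS = ["COS", "L2", "IP"] as const; // assumed metric set

type Algorithm = (typeof ALGORITHMS)[number];
type Metric = (typeof METRICS)[number];

interface Combination {
  algorithm: Algorithm;
  metric: Metric;
}

// Cartesian product: every algorithm paired with every metric.
function allCombinations(): Combination[] {
  const combos: Combination[] = [];
  for (const algorithm of ALGORITHMS) {
    for (const metric of METRICS) {
      combos.push({ algorithm, metric });
    }
  }
  return combos;
}

for (const { algorithm, metric } of allCombinations()) {
  console.log(`${algorithm} / ${metric}`);
}
```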

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- All 5 runners now: drop collection → create fresh → upload data →
  create indexes → run comparisons → drop collection on exit
- Removed 15 individual algorithm files (ivf/hnsw/diskann per language)
- Updated entry points (main.go, Main.java, Program.cs) to only run compare-all
- Simplified package.json scripts (TypeScript)
- All languages use DefaultAzureCredential for auth

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rop at end

All 10 sample directories now follow the same pattern:
- START: conditionally drop collection only if it exists
- END: always drop collection for cleanup (in finally/defer block)

Languages updated: TypeScript, Python, Go, Java, .NET

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…le standardization

- Add dual-mode GitHub Actions workflow (build-only for PR/push, full-run for manual dispatch)
- Add copilot instruction files for all 5 languages (TypeScript, Python, Go, Java, .NET)
- Document collection lifecycle convention (conditional drop-at-start, always-drop-at-end)
- Document naming convention requirement for parallel CI safety
- Update select-algorithm-typescript: remove IP metric, add multi-query support, add cleanup
- Add bulk insert and env var documentation per language

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Serialize run jobs (TS→Py→Go→Java→.NET) to prevent collection collisions
- Add ::add-mask:: for secret values, fix IFS parsing for connection strings
- Fix Go version 1.24→1.23
- Add timeout-minutes: 2 to preflight job
- TypeScript: add MongoSearchResult interface, env validation, safe cleanup
- Add MONGO_CLUSTER_NAME to env var table (required for passwordless auth)
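
The connection-string parsing fix and the env validation can be illustrated together. The workflow itself does this in shell; the TypeScript below is only a sketch of the same idea, and `parseEnvFile`, `missingKeys`, and `MONGO_CONNECTION_STRING` are hypothetical names. The key point is to split each line on the first `=` only, because connection strings usually contain further `=` characters in their query parameters.

```typescript
// Parse KEY=VALUE lines, keeping everything after the first "=" as the value.
function parseEnvFile(text: string): { [key: string]: string } {
  const result: { [key: string]: string } = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key present
    result[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return result;
}

// Return the required keys that are absent or empty.
function missingKeys(env: { [key: string]: string }, required: string[]): string[] {
  return required.filter((key) => !env[key]);
}

const sample = [
  "# hypothetical values for illustration",
  "MONGO_CLUSTER_NAME=my-cluster",
  "MONGO_CONNECTION_STRING=mongodb+srv://host/?tls=true&retryWrites=false",
].join("\n");

const env = parseEnvFile(sample);
console.log(missingKeys(env, ["MONGO_CLUSTER_NAME", "MONGO_CONNECTION_STRING"]).length === 0);
```

Failing fast on a non-empty `missingKeys` result gives a clearer error than a connection failure later in the run.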

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

diberry commented Apr 30, 2026

Splitting into two focused PRs for cleaner review

diberry closed this Apr 30, 2026