135 changes: 135 additions & 0 deletions .github/copilot-instructions-dotnet.md
@@ -0,0 +1,135 @@
# .NET (C#) Specific Instructions

## Stack

- .NET 8+
- `MongoDB.Driver` for DocumentDB access
- `Azure.Identity` for DefaultAzureCredential
- `Azure.AI.OpenAI` for Azure OpenAI

## File Structure

```
ai/select-algorithm-dotnet/
├── src/
│ ├── CompareAll.cs
│ └── Utils.cs
├── select-algorithm-dotnet.csproj
└── README.md

ai/vector-search-dotnet/
├── src/
│ ├── Ivf.cs
│ ├── Hnsw.cs
│ ├── Diskann.cs
│ └── Utils.cs
├── vector-search-dotnet.csproj
└── README.md
```

## Naming Conventions

- Files: `PascalCase.cs`
- Methods: `PascalCase`
- Constants: `PascalCase`
- Private fields: `_camelCase`
- Local variables: `camelCase`
- Namespaces: `Azure.DocumentDB.Samples`

## Authentication Pattern

```csharp
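using Azure.Core;
using Azure.Identity;
using MongoDB.Driver;
using MongoDB.Driver.Authentication.Oidc;

// The driver takes an IOidcCallback implementation, not a lambda.
internal sealed class AzureTokenCallback(TokenCredential credential) : IOidcCallback
{
    private static readonly string[] Scopes = ["https://ossrdbms-aad.database.windows.net/.default"];

    public OidcAccessToken GetOidcAccessToken(OidcCallbackParameters parameters, CancellationToken cancellationToken)
    {
        var token = credential.GetToken(new TokenRequestContext(Scopes), cancellationToken);
        // OidcAccessToken expects the remaining lifetime, not an absolute expiry.
        return new OidcAccessToken(token.Token, token.ExpiresOn - DateTimeOffset.UtcNow);
    }

    public async Task<OidcAccessToken> GetOidcAccessTokenAsync(OidcCallbackParameters parameters, CancellationToken cancellationToken)
    {
        var token = await credential.GetTokenAsync(new TokenRequestContext(Scopes), cancellationToken);
        return new OidcAccessToken(token.Token, token.ExpiresOn - DateTimeOffset.UtcNow);
    }
}
var oidcCallback = new AzureTokenCallback(new DefaultAzureCredential());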
```

## $search Syntax

```csharp
// CORRECT
var searchStage = new BsonDocument("$search",
    new BsonDocument("cosmosSearch",
        new BsonDocument
        {
            { "vector", new BsonArray(queryVector) },
            { "path", embeddedField },
            { "k", topK }
        }));

// WRONG — do NOT add cosmosSearchOptions to the $search stage
```

## Bulk Insert

Use `collection.InsertManyAsync()` with `InsertManyOptions { IsOrdered = false }`:

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

try
{
    await collection.InsertManyAsync(batch, new InsertManyOptions { IsOrdered = false });
    insertedCount += batch.Count;
}
catch (MongoBulkWriteException<BsonDocument> e)
{
    // Partial failure: some docs were inserted, the rest were rejected
    insertedCount += (int)e.Result.InsertedCount;
    failedCount += batch.Count - (int)e.Result.InsertedCount;
}
```

- Batch size configurable via `LOAD_SIZE_BATCH` env var (default: 100)
- 200ms delay between batches (`await Task.Delay(200)`)
- Catch `MongoBulkWriteException` for partial failure handling
- Always use the async variant (`InsertManyAsync`)

## Key Patterns

- Use `Environment.GetEnvironmentVariable("VAR") ?? "default"` for config
- Use `using` statements for disposable resources
- Use `try/finally` for collection cleanup
- Async/await throughout (use `Async` suffix on method names)
- Match TypeScript output format exactly

## Environment Variables

- Use `IConfiguration` with layered sources: `appsettings.json` → environment variables
- Provide `appsettings.json` with placeholder structure (committed) and gitignore `appsettings.local.json`
- Environment variables override JSON config values
- Bind to strongly-typed configuration classes (`AppConfiguration`, `AzureOpenAIConfiguration`, etc.)

```csharp
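using Microsoft.Extensions.Configuration;

var configuration = new ConfigurationBuilder()
    .AddJsonFile("appsettings.json", optional: false, reloadOnChange: true)
    // Optional, gitignored overrides for local development
    .AddJsonFile("appsettings.local.json", optional: true, reloadOnChange: true)
    // Added last so environment variables override JSON values
    .AddEnvironmentVariables()
    .Build();

var appConfig = configuration.Get<AppConfiguration>()
    ?? throw new InvalidOperationException("Failed to load configuration");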
```

- Configuration class hierarchy:
  - `AppConfiguration` → root
  - `AzureOpenAIConfiguration` → endpoint, model, apiVersion
  - `MongoDBConfiguration` → connectionString, clusterName, loadBatchSize
  - `EmbeddingConfiguration` → fieldToEmbed, embeddedField, dimensions, batchSize
  - `VectorSearchConfiguration` → query, databaseName, topK

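- Include the `Microsoft.Extensions.Configuration.Json`, `Microsoft.Extensions.Configuration.EnvironmentVariables`, and `Microsoft.Extensions.Configuration.Binder` packages in `.csproj`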

## Build & Run

```bash
dotnet run
```
133 changes: 133 additions & 0 deletions .github/copilot-instructions-go.md
@@ -0,0 +1,133 @@
# Go-Specific Instructions

## Stack

- Go 1.21+
- `go.mongodb.org/mongo-driver/v2` for DocumentDB access
- `github.com/Azure/azure-sdk-for-go/sdk/azidentity` for DefaultAzureCredential
- `github.com/openai/openai-go` for Azure OpenAI

## File Structure

```
ai/select-algorithm-go/
├── src/
│ ├── compare_all.go # Multi-query comparison runner
│ └── utils.go # Shared utilities
├── go.mod
├── go.sum
└── README.md

ai/vector-search-go/
├── src/
│ ├── ivf.go
│ ├── hnsw.go
│ ├── diskann.go
│ └── utils.go
├── go.mod
├── go.sum
└── README.md
```

## Naming Conventions

- Files: `snake_case.go`
- Functions: `PascalCase` (exported), `camelCase` (unexported)
- Constants: `PascalCase` or `camelCase`
- Packages: `lowercase`

## Authentication Pattern

```go
import (
	"fmt"

	"github.com/Azure/azure-sdk-for-go/sdk/azidentity"
	"go.mongodb.org/mongo-driver/v2/mongo"
	"go.mongodb.org/mongo-driver/v2/mongo/options"
)

credential, err := azidentity.NewDefaultAzureCredential(nil)
if err != nil {
	return fmt.Errorf("create Azure credential: %w", err)
}
// Use OIDC callback with DocumentDB scope
```

## $search Syntax

```go
// CORRECT
searchStage := bson.D{{Key: "$search", Value: bson.D{
	{Key: "cosmosSearch", Value: bson.D{
		{Key: "vector", Value: queryVector},
		{Key: "path", Value: embeddedField},
		{Key: "k", Value: topK},
	}},
}}}

// WRONG — do NOT include cosmosSearchOptions in the $search stage
```

## Bulk Insert

Use `collection.InsertMany()` with `SetOrdered(false)` and handle `BulkWriteException`:

```go
result, err := collection.InsertMany(ctx, batch, options.InsertMany().SetOrdered(false))
if err != nil {
	if bulkErr, ok := err.(mongo.BulkWriteException); ok {
		// Partial failure: some docs inserted, some failed
		failed := len(bulkErr.WriteErrors)
		insertedCount += len(batch) - failed
	} else {
		return fmt.Errorf("batch insert failed: %w", err)
	}
} else {
	insertedCount += len(result.InsertedIDs)
}
```

- Batch size configurable via `LOAD_SIZE_BATCH` env var (default: 100)
- 200ms delay between batches (`time.Sleep(200 * time.Millisecond)`)
- Type-assert `mongo.BulkWriteException` for partial failure handling
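
The batching rules above can be sketched without a database connection. This is a minimal illustration, not the sample's actual loader: `insertInBatches` and its `insert` callback are hypothetical names, and the callback stands in for the `InsertMany` call by reporting how many documents of each batch were stored.

```go
package main

import (
	"fmt"
	"time"
)

// insertInBatches splits docs into batches of at most size items, calls
// insert once per batch, and sleeps for delay between batches.
// insert is a hypothetical stand-in for collection.InsertMany; it returns
// the number of documents from the batch that were stored.
func insertInBatches[T any](docs []T, size int, delay time.Duration, insert func(batch []T) (int, error)) (inserted, failed int, err error) {
	if size <= 0 {
		return 0, 0, fmt.Errorf("batch size must be positive, got %d", size)
	}
	for start := 0; start < len(docs); start += size {
		end := start + size
		if end > len(docs) {
			end = len(docs)
		}
		batch := docs[start:end]

		n, insertErr := insert(batch)
		if insertErr != nil {
			return inserted, failed, fmt.Errorf("batch starting at index %d: %w", start, insertErr)
		}
		inserted += n
		failed += len(batch) - n

		// Pause only when another batch follows.
		if end < len(docs) {
			time.Sleep(delay)
		}
	}
	return inserted, failed, nil
}

func main() {
	docs := []string{"a", "b", "c", "d", "e"}
	inserted, failed, err := insertInBatches(docs, 2, 200*time.Millisecond,
		func(batch []string) (int, error) {
			return len(batch), nil // pretend every document was stored
		})
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	fmt.Printf("inserted=%d failed=%d\n", inserted, failed)
}
```

In a real loader the callback would wrap the `InsertMany` call and translate a `mongo.BulkWriteException` into a partial count instead of an error.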

## Key Patterns

- Use `os.Getenv("VAR")` with fallback helper for config
- Always check errors explicitly — no panic in sample code
- Use `context.Background()` or appropriate timeout contexts
- Use `defer` for cleanup (drop collections)
- Match TypeScript output format exactly

## Environment Variables

- Use `github.com/joho/godotenv` to load from `.env` file at startup
- Provide a `.env.example` file in each sample directory
- Access pattern: `os.Getenv("VAR")` with a helper function for defaults
- Call `godotenv.Load()` early — log a warning if `.env` is missing but don't fail (env vars may be set externally)

```go
import (
	"fmt"
	"os"

	"github.com/joho/godotenv"
)

func init() {
	err := godotenv.Load()
	if err != nil {
		fmt.Println("No .env file found, using environment variables")
	}
}

func getEnvOrDefault(key, defaultValue string) string {
	if value := os.Getenv(key); value != "" {
		return value
	}
	return defaultValue
}
```

- Include `github.com/joho/godotenv` in `go.mod`

## Build & Run

```bash
cd src
go run .
```
122 changes: 122 additions & 0 deletions .github/copilot-instructions-java.md
@@ -0,0 +1,122 @@
# Java-Specific Instructions

## Stack

- Java 17+
- MongoDB Java Driver (`org.mongodb:mongodb-driver-sync`)
- Azure Identity (`com.azure:azure-identity`)
- Azure OpenAI (`com.azure:azure-ai-openai`)

## File Structure

```
ai/select-algorithm-java/
├── src/main/java/com/azure/documentdb/sample/
│ ├── CompareAll.java
│ └── Utils.java
├── pom.xml
└── README.md

ai/vector-search-java/
├── src/main/java/com/azure/documentdb/sample/
│ ├── Ivf.java
│ ├── Hnsw.java
│ ├── Diskann.java
│ └── Utils.java
├── pom.xml
└── README.md
```

## Naming Conventions

- Files: `PascalCase.java`
- Methods: `camelCase`
- Constants: `UPPER_SNAKE_CASE`
- Classes: `PascalCase`
- Packages: `com.azure.documentdb.sample`

## Authentication Pattern

```java
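import com.azure.core.credential.AccessToken;
import com.azure.core.credential.TokenRequestContext;
import com.azure.identity.DefaultAzureCredential;
import com.azure.identity.DefaultAzureCredentialBuilder;
import com.mongodb.MongoClientSettings;
import com.mongodb.MongoCredential;
import com.mongodb.MongoCredential.OidcCallback;
import com.mongodb.MongoCredential.OidcCallbackResult;

DefaultAzureCredential credential = new DefaultAzureCredentialBuilder().build();
// Declare the callback with an explicit type: withMechanismProperty is generic,
// so a bare lambda has no target type to bind to.
OidcCallback callback = context -> {
    AccessToken token = credential.getToken(
        new TokenRequestContext().addScopes("https://ossrdbms-aad.database.windows.net/.default")
    ).block();
    return new OidcCallbackResult(token.getToken());
};
MongoCredential mongoCredential = MongoCredential.createOidcCredential(null)
    .withMechanismProperty("OIDC_CALLBACK", callback);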
```

## $search Syntax

```java
// CORRECT
Document searchStage = new Document("$search",
    new Document("cosmosSearch",
        new Document("vector", queryVector)
            .append("path", embeddedField)
            .append("k", topK)));

// WRONG — do NOT add cosmosSearchOptions to the $search stage
```

## Bulk Insert

Use `collection.insertMany()` with `InsertManyOptions().ordered(false)`:

```java
import com.mongodb.client.model.InsertManyOptions;
import com.mongodb.MongoBulkWriteException;

try {
    collection.insertMany(documents, new InsertManyOptions().ordered(false));
    insertedCount += documents.size();
} catch (MongoBulkWriteException e) {
    // Partial failure: some docs were inserted, the rest were rejected
    insertedCount += e.getWriteResult().getInsertedCount();
    failedCount += documents.size() - e.getWriteResult().getInsertedCount();
}
```

- Batch size configurable via `LOAD_SIZE_BATCH` env var (default: 100)
- 200ms delay between batches (`Thread.sleep(200)`)
- Catch `MongoBulkWriteException` for partial failure handling

## Key Patterns

- Use `System.getenv("VAR")` with null check for config
- Use try-with-resources for MongoClient
- Use `try/finally` for collection cleanup
- Match TypeScript output format exactly

## Environment Variables

- Read directly via `System.getenv("VAR")` — **no dotenv library**
- Provide a `.env.example` file in each sample directory for documentation purposes
- Access pattern: `System.getenv("VAR")` with null check or ternary for defaults
- Validate required vars early and fail with a clear message

```java
var clusterName = System.getenv("MONGO_CLUSTER_NAME");
var endpoint = System.getenv("AZURE_OPENAI_EMBEDDING_ENDPOINT");
var model = System.getenv("AZURE_OPENAI_EMBEDDING_MODEL");
var batchSizeStr = System.getenv("LOAD_SIZE_BATCH");
var batchSize = batchSizeStr != null ? Integer.parseInt(batchSizeStr) : 100;

if (clusterName == null || endpoint == null) {
    throw new IllegalStateException("Missing required environment variables: MONGO_CLUSTER_NAME, AZURE_OPENAI_EMBEDDING_ENDPOINT");
}
```

- Users set env vars via shell export, IDE run configuration, or azd-provided `.env`

## Build & Run

```bash
mvn compile exec:java -Dexec.mainClass="com.azure.documentdb.sample.CompareAll"
```