[MEDI] Use MEVD AutoGenerated Guid key feature; update MEVD/SK to latest versions #7423
Copilot wants to merge 20 commits into data-ingestion-preview2
…tedChunkRecord base type Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
…toreWriter refactoring Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
…ion property names Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
…al, throw in SetMetadata, make consts protected, remove JSON attributes Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
…oc reference, remove JsonPropertyName, add custom schema test Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
… custom storage names in test Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
… for consistency Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
Co-authored-by: Adam Sitnik <adam.sitnik@gmail.com>
Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
…ot IngestedChunk.cs files Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
…n, move SetMetadata to non-sealed VectorStoreWriter Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
…n with provided dimension count
…ied overload, fix typo, update README
- Renamed IngestedChunkRecord<TChunk> to IngestionChunkVectorRecord<TChunk> across all source, test, template, and snapshot files
- Fixed typo 'additiveDefintion' -> 'additiveDefinition' in VectorStoreExtensions.cs
- Added simplified non-generic GetIngestionRecordCollection overload for the common IngestionChunkVectorRecord<string> case
- Added XML remarks to generic GetIngestionRecordCollection method documenting usage patterns (simple, custom metadata, custom schema)
- Extended README.md with code samples for basic usage, custom metadata, and custom collection schema
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com> Agent-Logs-Url: https://github.com/dotnet/extensions/sessions/4dce2c93-8680-43ae-a663-bcd22316c92b
… to latest Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com> Agent-Logs-Url: https://github.com/dotnet/extensions/sessions/45713a5e-1067-42e6-a6e2-a172b8bef7ee
adamsitnik left a comment
@copilot address my feedback
    new VectorStoreKeyProperty(nameof(IngestionChunkVectorRecord<>.Key), typeof(Guid))
    {
        IsAutoGenerated = true,
    },
We don't need it here, it should be set using the attribute here
    new VectorStoreKeyProperty(nameof(IngestionChunkVectorRecord<>.Key), typeof(Guid))
    {
        IsAutoGenerated = true,
    },
Unfortunately, VectorStoreKeyAttribute.IsAutoGenerated is of type Nullable<bool> in MEVD 10.1.0, which cannot be used as a C# named attribute argument. The compiler rejects it with CS0655: 'IsAutoGenerated' is not a valid named attribute argument because it is not a valid attribute parameter type.
Because of this constraint, IsAutoGenerated = true must be set via VectorStoreKeyProperty in the collection definition (the current approach). Once MEVD changes IsAutoGenerated from bool? to bool, we can move this to the attribute.
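For readers unfamiliar with CS0655, the constraint can be reproduced without MEVD. The sketch below is illustrative only: DemoKeyAttribute, DemoRecord, and the AutoGenerate property are hypothetical names invented for this example and are not part of MEVD or SK. It assumes nothing beyond the C# rule that named attribute arguments must have a valid attribute parameter type, which excludes Nullable<bool>.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute (not the MEVD type); it only mirrors the shape of the problem.
[AttributeUsage(AttributeTargets.Property)]
public sealed class DemoKeyAttribute : Attribute
{
    // Nullable<bool> is not a valid attribute parameter type, so this property
    // may exist on the attribute but cannot be set as a named argument.
    public bool? IsAutoGenerated { get; set; }

    // A plain bool is a valid attribute parameter type.
    public bool AutoGenerate { get; set; }
}

public sealed class DemoRecord
{
    // [DemoKey(IsAutoGenerated = true)] // does not compile: error CS0655
    [DemoKey(AutoGenerate = true)]
    public Guid Key { get; set; }
}

public static class Program
{
    // Reads the attribute back through reflection to show the bool value was applied.
    public static bool ReadAutoGenerate()
    {
        PropertyInfo key = typeof(DemoRecord).GetProperty(nameof(DemoRecord.Key));
        if (key == null)
        {
            throw new InvalidOperationException("Key property not found.");
        }

        DemoKeyAttribute attribute = key.GetCustomAttribute<DemoKeyAttribute>();
        if (attribute == null)
        {
            throw new InvalidOperationException("DemoKeyAttribute not found.");
        }

        return attribute.AutoGenerate;
    }

    public static void Main()
    {
        Console.WriteLine(ReadAutoGenerate());
    }
}
```

Uncommenting the IsAutoGenerated line reproduces the compiler error, which is why the flag is set on VectorStoreKeyProperty in the collection definition rather than on the attribute.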
I've provided a fix in the SK repo (microsoft/semantic-kernel#13698). For now, let's stick with what we have here.
…pilot/remove-key-setting-requirement
# Conflicts:
#   src/Libraries/Microsoft.Extensions.DataIngestion/README.md
#   src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestionChunkVectorRecord.cs
#   src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreExtensions.cs
#   src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreWriter.cs
#   test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Writers/VectorStoreWriterTests.cs
{
    new VectorStoreKeyProperty(nameof(IngestionChunkVectorRecord<>.Key), typeof(Guid))
    {
        IsAutoGenerated = true,
Technically this should not be needed, as all providers are expected to support Guid auto-generation. But it's a good idea to have it (a) for clarity, and (b) in case some provider out there decides not to support this; setting it to true here should cause that provider to throw.
VectorStoreWriter was manually assigning Key = Guid.NewGuid() on each record before upsert. MEVD 10.1.0 added IsAutoGenerated support on key properties, making this unnecessary.

Changes
- VectorStoreWriter.cs: Remove manual Key = Guid.NewGuid() assignment
- VectorStoreExtensions.cs: Add VectorStoreKeyProperty with IsAutoGenerated = true to the GetIngestionRecordCollection collection definition; the SK connector now auto-generates the Guid on upsert
- VectorStoreWriterTests.cs: Add IsAutoGenerated = true to the key property in the custom-definition test so key generation still works for user-supplied schemas

Dependency updates
- MicrosoftExtensionsVectorDataAbstractionsVersion: 9.7.0 -> 10.1.0
- MicrosoftSemanticKernelConnectorsVersion: 1.67.0-preview -> 1.74.0-preview
- ProjectTemplates.props: 1.67.1-preview -> 1.74.0-preview

NuGet.config adds nuget.org as a source scoped exclusively to Microsoft.SemanticKernel.* packages because 1.74.0-preview is not yet mirrored to the internal dotnet-public feed.