Implement multimodal request support for Gemini API (#2) #3
This pull request introduces multimodal support for the Google Gemini API within the `ChatAIze.GenerativeCS` library, addressing issue #2. Users can now send requests combining text with various file types (PDF, DOC, TXT, images, audio, video).

**Key Changes Implemented:**

**Gemini File Service (`FileService.cs`, `IFileService.cs`):**
- Class and interface names (`FileService`, `IFileService`) are aligned with existing provider naming conventions (e.g., `ChatCompletion.cs`).

**Enhanced Chat Message Structure (`ChatMessage.cs`, `ChatContentPart.cs`):**
- `ChatMessage.cs` now uses an `ICollection<IChatContentPart> Parts` property to hold different content types within a single message.
- Introduces the `IChatContentPart` interface and the concrete `TextPart` and `FileDataPart` classes.
- `FileDataPart` encapsulates `FileDataSource` (MIME type and file URI) for referencing uploaded files.
- The `ChatMessage.Content` property has been marked `[Obsolete]` and now acts as a getter/setter for the first `TextPart` in the `Parts` collection to maintain backward compatibility.

**Updated Gemini Chat Provider (`ChatCompletion.cs`):**
- The `CreateChatCompletionRequest` method now iterates through `message.Parts`.
- It serializes `TextPart` and `FileDataPart` (including `mime_type` and `file_uri`) into the JSON payload for the Gemini API's `generateContent` endpoint.
- Obsolete warnings for `ChatMessage.Content` usage (kept as a backward-compatibility fallback) have been suppressed with `#pragma`.

**Client and DI Integration (`GeminiClient.cs`, `GeminiClientExtension.cs`):**
- `GeminiClient.cs` now instantiates and exposes an `IFileService` through a public `Files` property.
- `GeminiClientExtension.cs` has been updated to register `IFileService` as a singleton, resolving its instance from the `GeminiClient.Files` property. This ensures that a single, consistent `IFileService` instance is used.

**Model Updates (`Models/Gemini/`):**
- Adds `GeminiFile.cs`, `GeminiFileUploadRequest.cs`, and `GeminiListFilesResponse.cs` to represent data structures for the Gemini Files API.
- Uses the `required` modifier for non-nullable properties expected from the API and initializes collections.

**Documentation & Packaging:**
- Updates `README.md` with a new section explaining how to use the multimodal features, including accessing `IFileService`, uploading files, and sending chat messages with file references.
- Bumps the version in `ChatAIze.GenerativeCS.csproj` to `0.15.0`.
- Updates the `.csproj` file to reflect the new multimodal capabilities.

**How to Test:**
1. Create a `GeminiClient`.
2. Get the file service from `geminiClient.Files`.
3. Upload a file with `fileService.UploadFileAsync(...)`.
4. Create a `Chat` object and add a `ChatMessage`.
5. In the `ChatMessage.Parts` collection, add a `TextPart` and a `FileDataPart` using the `MimeType` and `Uri` from the uploaded file.
6. Call `geminiClient.CompleteAsync(chat)` and observe the model's response, which should take the content of the uploaded file into account.

**Future Considerations (Not in this PR):**
- Add a convenience method to `GeminiClient.cs` to simplify the process of sending a message with a local file (e.g., a method that handles both upload and message creation).

This implementation adheres to the existing coding patterns and architectural style of the library.
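
For reviewers who want a concrete picture of the message shape and payload described above, here is a minimal, self-contained sketch. It is not the code from this PR: the types below are stand-ins that only mirror the names mentioned in the description (`IChatContentPart`, `TextPart`, `FileDataPart`, `FileDataSource`, `ChatMessage.Parts`, `ChatMessage.Content`). `PayloadSketch`, `BuildRequest`, the `role` parameter, and the placeholder file URI are hypothetical, and the exact JSON layout produced by `CreateChatCompletionRequest` is assumed rather than copied.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using System.Text.Json.Nodes;

// Stand-in types mirroring the names in the description; NOT the library's source.
public interface IChatContentPart { }
public sealed record TextPart(string Text) : IChatContentPart;
public sealed record FileDataSource(string MimeType, string FileUri);
public sealed record FileDataPart(FileDataSource FileData) : IChatContentPart;

public sealed class ChatMessage
{
    private readonly List<IChatContentPart> _parts = new();

    public ICollection<IChatContentPart> Parts => _parts;

    // Backward-compatible view over the first TextPart (assumed behavior).
    [Obsolete("Use Parts instead.")]
    public string? Content
    {
        get => _parts.OfType<TextPart>().FirstOrDefault()?.Text;
        set
        {
            var replacement = new TextPart(value ?? string.Empty);
            var index = _parts.FindIndex(p => p is TextPart);
            if (index >= 0) _parts[index] = replacement; // overwrite the first text part
            else _parts.Insert(0, replacement);          // or create one at the front
        }
    }
}

public static class PayloadSketch // hypothetical helper, for illustration only
{
    // Maps one content part to a JSON node; field names follow the description.
    private static JsonNode ToJson(IChatContentPart part) => part switch
    {
        TextPart text => new JsonObject { ["text"] = text.Text },
        FileDataPart file => new JsonObject
        {
            ["file_data"] = new JsonObject
            {
                ["mime_type"] = file.FileData.MimeType,
                ["file_uri"] = file.FileData.FileUri
            }
        },
        _ => throw new NotSupportedException(part.GetType().Name)
    };

    // Builds a generateContent-style body with a single "contents" entry.
    public static JsonObject BuildRequest(ChatMessage message, string role = "user")
    {
        var parts = new JsonArray();
        foreach (var part in message.Parts)
        {
            JsonNode node = ToJson(part);
            parts.Add(node);
        }

        var content = new JsonObject { ["role"] = role, ["parts"] = parts };
        return new JsonObject { ["contents"] = new JsonArray(content) };
    }
}

public static class Program
{
    public static void Main()
    {
        var message = new ChatMessage();

#pragma warning disable CS0618 // legacy path, kept only to show the compatibility shim
        message.Content = "Summarize the attached document.";
#pragma warning restore CS0618

        // Placeholder URI; in practice it would come from the uploaded file.
        message.Parts.Add(new FileDataPart(
            new FileDataSource("application/pdf", "https://example.invalid/files/placeholder")));

        var options = new JsonSerializerOptions { WriteIndented = true };
        Console.WriteLine(PayloadSketch.BuildRequest(message).ToJsonString(options));
    }
}
```

With the real library, the `FileDataPart` values would presumably be taken from the `MimeType` and `Uri` of the file returned by the upload step in "How to Test"; the actual method signatures are not shown in this description and are not assumed here.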
Fixes #2