@changjian-wang
Member

This pull request introduces the initial release of the Azure Content Understanding Java SDK by adding the new azure-ai-contentunderstanding package and integrating it into the build system. The changes include all necessary configuration, metadata, and source files to support client creation, versioning, and customization for the new service.

Key changes in this pull request:

New SDK Package Addition:

  • Introduced the azure-ai-contentunderstanding Java SDK, including client builder, service version enum, and internal helper classes for the Content Understanding service. [1] [2] [3]

Build System and Versioning Integration:

  • Registered the new module in the root pom.xml and versioning files to ensure the SDK is included in builds and version tracking. [1] [2]

Project Metadata and Configuration:

  • Added Maven project files (pom.xml), changelog, spell-check configuration, and asset tracking for the new SDK package. [1] [2] [3] [4] [5]

Development Environment Setup:

  • Included a .gitignore tailored for local development and security best practices in the new SDK directory.

azure-sdk and others added 30 commits December 3, 2025 22:19
… from .NET SDK

- Sample00_ConfigureDefaults: Demonstrates configuration management (get/update defaults)
- Sample01_AnalyzeBinary: Binary PDF analysis from local file
- Sample02_AnalyzeUrl: Analyze documents from URL
- Sample03_AnalyzeInvoice: Extract structured invoice fields with nested objects and arrays
- Sample04_CreateAnalyzer: Create and use custom analyzer with field schema (Extract/Generate/Classify methods)

Key features:
- All samples use DefaultAzureCredentialBuilder for authentication
- Environment variable based configuration (ENDPOINT)
- Comprehensive JUnit 5 tests with assertions
- GitHub public URLs for test data
- Proper field access patterns with type casting (ContentField, StringField, NumberField, ObjectField, ArrayField)
- All tests passing (6/6 = 100% success rate)

Technical implementation:
- Fixed API differences from C# SDK (ContentSpan, ContentField, 5-parameter beginAnalyze)
- Proper null checking and type casting for all field access
- Detailed validation assertions for all document properties
- Clean resource management with @AfterEach cleanup

Module-info.java formatting cleanup included.
- Sample05_CreateClassifier: Create classifier analyzer with multiple classification fields (document_type, industry, urgency)
- Sample06_GetAnalyzer: Get analyzer information including configuration and field schema

Key features:
- Sample05: Demonstrates classification-only analyzer with 3 classifiers
- Sample06: Shows how to retrieve and inspect analyzer properties including prebuilt analyzers
- Fixed API usage: getAnalyzerId(), getCreatedAt(), getLastModifiedAt() instead of getId(), getCreatedDateTime(), getUpdatedDateTime()
- Comprehensive field schema inspection with all 31 prebuilt-invoice fields
- All tests passing with real Azure service
- Sample07_ListAnalyzers: List and filter all available analyzers (prebuilt and custom)
  * testListAnalyzersAsync: Lists all 134 analyzers (87 prebuilt, 47 custom)
  * testListReadyAnalyzersAsync: Filters for ready analyzers only
- Sample08_UpdateAnalyzer: Update existing analyzer properties
  * Demonstrates updating description, configuration, and field schema
  * Uses @BeforeEach to create test analyzer and @AfterEach for cleanup
  * Shows how to add new fields while preserving existing ones

All tests passing with real Azure service
Fixed Sample08_UpdateAnalyzer to avoid 409 conflict error:
- Delete existing analyzer before recreating with updated configuration
- Added note about using updateAnalyzerWithResponse for atomic updates in production
- All 12 tests now passing (Sample00-08 with multiple test methods)

Test results: 12/12 passed (100% success rate)
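
The delete-before-recreate workaround described in this commit can be sketched roughly as follows. This is only an illustration: `FakeAnalyzerStore` is a hypothetical in-memory stand-in, not an SDK type, and its `create` method merely imitates a service that rejects duplicate IDs with a 409 conflict.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch only; none of these types come from the SDK. */
final class RecreateAnalyzerSketch {

    /** Hypothetical in-memory stand-in for the service. */
    static final class FakeAnalyzerStore {
        private final Map<String, String> analyzers = new ConcurrentHashMap<>();

        /** Fails when the ID is already taken, imitating a 409 conflict. */
        void create(String analyzerId, String description) {
            if (analyzers.putIfAbsent(analyzerId, description) != null) {
                throw new IllegalStateException("409 Conflict: analyzer already exists: " + analyzerId);
            }
        }

        /** Returns true if an analyzer was removed, false if none existed. */
        boolean delete(String analyzerId) {
            return analyzers.remove(analyzerId) != null;
        }

        String get(String analyzerId) {
            return analyzers.get(analyzerId);
        }
    }

    /** Deletes any existing analyzer with this ID, then creates it again with the new description. */
    static void recreate(FakeAnalyzerStore store, String analyzerId, String newDescription) {
        store.delete(analyzerId); // harmless when the analyzer does not exist yet
        store.create(analyzerId, newDescription);
    }
}
```

The two steps are not atomic: another caller could create the same ID between the delete and the create. That gap is presumably why the commit notes that updateAnalyzerWithResponse is preferable in production.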
…iables and Improve Test Patterns

- Updated environment variable names from "ENDPOINT" and "CONTENTUNDERSTANDING_API_KEY" to "CONTENTUNDERSTANDING_ENDPOINT" and "AZURE_CONTENT_UNDERSTANDING_KEY" across multiple sample test files.
- Modified sample tests to load local files instead of using publicly accessible URLs for document analysis.
- Enhanced assertions and logging for better clarity and debugging.
- Improved API usage patterns in tests for creating, copying, and deleting analyzers, including async patterns.
- Added model mappings for analyzers in relevant samples to demonstrate configuration capabilities.
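
As a rough illustration of the environment-variable configuration named above, a sample could resolve its settings along these lines. `SampleAuthConfig` and `AuthMode` are hypothetical names, not SDK types, and the fallback to a default credential when no key is set is an assumption about how a sample might behave. The environment is passed in as a map so the logic can be exercised without touching the real process environment.

```java
import java.util.Map;

/** Hypothetical helper, not part of the SDK: resolves sample settings from environment variables. */
final class SampleAuthConfig {
    enum AuthMode { API_KEY, DEFAULT_AZURE_CREDENTIAL }

    final String endpoint;
    final AuthMode mode;
    final String apiKey; // null when no key is configured

    private SampleAuthConfig(String endpoint, AuthMode mode, String apiKey) {
        this.endpoint = endpoint;
        this.mode = mode;
        this.apiKey = apiKey;
    }

    /** Variable names are the ones quoted in the commit message above. */
    static SampleAuthConfig fromEnvironment(Map<String, String> env) {
        String endpoint = env.get("CONTENTUNDERSTANDING_ENDPOINT");
        if (endpoint == null || endpoint.isBlank()) {
            throw new IllegalStateException("CONTENTUNDERSTANDING_ENDPOINT must be set");
        }
        String key = env.get("AZURE_CONTENT_UNDERSTANDING_KEY");
        if (key == null || key.isBlank()) {
            // Assumption: without a key, the sample falls back to a default credential chain.
            return new SampleAuthConfig(endpoint, AuthMode.DEFAULT_AZURE_CREDENTIAL, null);
        }
        return new SampleAuthConfig(endpoint, AuthMode.API_KEY, key);
    }
}
```

A sample would typically call `SampleAuthConfig.fromEnvironment(System.getenv())` and then branch on `mode` when building its client.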
…e validation of source and copied analyzers
… Azure Credential Authentication

- Updated Sample03_AnalyzeInvoice, Sample04_CreateAnalyzer, Sample05_CreateClassifier, Sample06_GetAnalyzer, Sample07_ListAnalyzers, Sample08_UpdateAnalyzer, Sample09_DeleteAnalyzer, Sample10_AnalyzeConfigs, Sample11_AnalyzeReturnRawJson, Sample12_GetResultFile, Sample13_DeleteResult, Sample14_CopyAnalyzer, Sample15_GrantCopyAuth, and Sample16_CreateAnalyzerWithLabels to include logic for initializing the Content Understanding client with either an API key or the Default Azure Credential.
- Added assertions to verify client initialization in each sample.
- Improved code readability and maintainability by consolidating client creation logic.
- Sample12_GetResultFile: Demonstrates how to retrieve keyframe images from video analysis operations.
- Sample13_DeleteResult: Shows how to delete analysis results after they are no longer needed.
- Sample14_CopyAnalyzer: Illustrates how to copy an analyzer within the same resource.
- Sample15_GrantCopyAuth: Demonstrates granting copy authorization for cross-resource analyzer copying.
- Sample16_CreateAnalyzerWithLabels: Shows how to create an analyzer with labeled training data from Azure Blob Storage.
- Delete 13 @Disabled test files (replaced by Sample tests)
- Modify Sample00-Sample16 to extend ContentUnderstandingClientTestBase
- Add testResourceNamer for reproducible random IDs in PLAYBACK mode
- Remove problematic sanitizers (AZSDK2003, AZSDK2030, AZSDK3423, AZSDK3430, AZSDK3493)
- Configure maven-surefire-plugin to include Sample*.java
- Use AZURE_CONTENT_UNDERSTANDING_ENDPOINT env var (matches .NET naming)
Exclude src/samples/.../samples/Sample*.java standalone examples from test execution.
- Fixed URI mismatch issue where recorded URLs had double slashes (//contentunderstanding)
- Updated assets.json to point to new recordings tag (3de1635cfc)
- All 23 tests pass in PLAYBACK mode
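
The commit does not say how the double-slash mismatch was resolved. Purely as an illustration of the symptom, the sketch below collapses repeated slashes in the path of a URL while leaving the scheme, authority, query, and fragment untouched. `UriPathNormalizer` is a hypothetical helper and is not taken from the PR.

```java
import java.net.URI;

/** Hypothetical helper, not from the PR: collapses repeated slashes in a URL path. */
final class UriPathNormalizer {

    static String collapsePathSlashes(String url) {
        URI uri = URI.create(url);
        String rawPath = uri.getRawPath();
        if (rawPath == null || rawPath.isEmpty()) {
            return url; // opaque or path-less URI: nothing to normalize
        }
        // Only the path is rewritten, so the "//" after the scheme is preserved.
        String collapsed = rawPath.replaceAll("/{2,}", "/");

        StringBuilder result = new StringBuilder();
        if (uri.getScheme() != null) {
            result.append(uri.getScheme()).append(':');
        }
        if (uri.getRawAuthority() != null) {
            result.append("//").append(uri.getRawAuthority());
        }
        result.append(collapsed);
        if (uri.getRawQuery() != null) {
            result.append('?').append(uri.getRawQuery());
        }
        if (uri.getRawFragment() != null) {
            result.append('#').append(uri.getRawFragment());
        }
        return result.toString();
    }
}
```

With this helper, a recorded URL such as `https://example.invalid//contentunderstanding/analyzers` and its single-slash form compare equal after normalization.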
@changjian-wang changjian-wang marked this pull request as ready for review February 10, 2026 09:03
…ple reliability (#47959)

* Initial plan

* Address review comments: fix error message formatting, unused imports, missing @Override, and use deterministic waiting in async sample

Co-authored-by: changjian-wang <15209050+changjian-wang@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: changjian-wang <15209050+changjian-wang@users.noreply.github.com>
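
"Deterministic waiting" in an async sample usually means blocking on a completion signal rather than sleeping for a fixed time. The sketch below shows one common way to do that with a CountDownLatch. It uses CompletableFuture as a stand-in for the SDK's async types, so it is an assumption about the general pattern and not the code from this commit.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative sketch; CompletableFuture stands in for the SDK's async return types. */
final class DeterministicWaitSketch {

    /** Waits for the async work to finish (or time out) instead of sleeping for a guessed duration. */
    static String runAndWait(CompletableFuture<String> asyncWork, long timeoutSeconds) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<String> result = new AtomicReference<>();
        AtomicReference<Throwable> failure = new AtomicReference<>();

        asyncWork.whenComplete((value, error) -> {
            if (error != null) {
                failure.set(error);
            } else {
                result.set(value);
            }
            done.countDown(); // signal completion on both success and failure
        });

        try {
            if (!done.await(timeoutSeconds, TimeUnit.SECONDS)) {
                throw new IllegalStateException("Timed out waiting for the async operation");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("Interrupted while waiting", e);
        }
        if (failure.get() != null) {
            throw new IllegalStateException("Async operation failed", failure.get());
        }
        return result.get();
    }
}
```

Because failures surface as exceptions from `runAndWait`, a sample's `main` can end normally or propagate the exception without calling System.exit.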
@changjian-wang changjian-wang marked this pull request as draft February 10, 2026 09:04
- Add fixGeneratedPollingStrategyCode to customization: replaces
  String.valueOf(PollingUtils.OPERATION_LOCATION_HEADER) with
  PollingUtils.OPERATION_LOCATION_HEADER.getCaseSensitiveName()
- Fix error message comma in OperationLocationPollingStrategy
- Remove @Override from getResult in SyncOperationLocationPollingStrategy
- Fix ObjectField NoSuchElementException formatting
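
The first item in the list above describes a source-level substitution. As a minimal sketch of its effect, the same substitution can be written as a plain text replacement. The real customization may well operate on a parsed syntax tree through the SDK's customization tooling, so the class below is illustrative only and `PollingStrategyFixSketch` is a hypothetical name.

```java
/** Illustrative text-level version of the substitution described in the commit message. */
final class PollingStrategyFixSketch {
    private static final String OLD_EXPRESSION =
        "String.valueOf(PollingUtils.OPERATION_LOCATION_HEADER)";
    private static final String NEW_EXPRESSION =
        "PollingUtils.OPERATION_LOCATION_HEADER.getCaseSensitiveName()";

    /** Returns the source with every occurrence of the old expression replaced. */
    static String fixGeneratedPollingStrategyCode(String generatedSource) {
        return generatedSource.replace(OLD_EXPRESSION, NEW_EXPRESSION);
    }
}
```

Source that does not contain the old expression passes through unchanged, so applying the fix twice gives the same result as applying it once.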
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 110 out of 180 changed files in this pull request and generated 5 comments.

changjian-wang and others added 3 commits February 10, 2026 19:03
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…n/java/com/azure/ai/contentunderstanding/models/KnowledgeSource.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…n/java/com/azure/ai/contentunderstanding/models/DocumentParagraph.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@changjian-wang
Member Author

@copilot open a new pull request to apply changes based on the comments in this thread

Contributor

Copilot AI commented Feb 10, 2026

@changjian-wang I've opened a new pull request, #47964, to work on those changes. Once the pull request is ready, I'll request review from you.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 110 out of 180 changed files in this pull request and generated 6 comments.

@yungshinlintw yungshinlintw marked this pull request as ready for review February 10, 2026 14:19
Contributor

Copilot AI commented Feb 10, 2026

@yungshinlintw I've opened a new pull request, #47966, to work on those changes. Once the pull request is ready, I'll request review from you.

yungshinlintw and others added 2 commits February 10, 2026 09:43
…ples/java/com/azure/ai/contentunderstanding/samples/Sample06_GetAnalyzerAsync.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ples/java/com/azure/ai/contentunderstanding/samples/Sample00_UpdateDefaultsAsync.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@yungshinlintw yungshinlintw changed the title [Draft] Content Understanding GA SDK for Java Content Understanding GA SDK for Java Feb 10, 2026
@changjian-wang changjian-wang requested a review from a team as a code owner February 11, 2026 02:23
* Initial plan

* Remove System.exit(1) from async samples

Co-authored-by: yungshinlintw <14239352+yungshinlintw@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: yungshinlintw <14239352+yungshinlintw@users.noreply.github.com>
Changjian Wang and others added 4 commits February 11, 2026 15:04
The method matched the 4-parameter beginAnalyze and 5-parameter beginAnalyzeBinary
overloads, which no longer exist in the generated code, so it was a no-op. The actual
utf16 hardcoding is handled by addBeginAnalyzeConvenienceOverloads and
addBeginAnalyzeBinaryConvenienceOverloads.
6 participants