fix(ingestion/extractor): accept bool type in json-schema extractor #15437
base: master
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Bundle Report: Bundle size has no change ✅
@btkcodedev Thanks for the contribution
    def _get_type_from_schema(schema: Dict) -> str:
        """Returns a generic json type from a schema."""
        # Handle boolean schemas per JSON Schema spec: true accepts any JSON, false never validates
        if isinstance(schema, bool):
The method signature expects schema as Dict, but here it is being checked against bool.
This boolean check prevents unexpected crashes like the one in the issue. The user reported a crash from
if Ellipsis in schema: TypeError: argument of type 'bool' is not iterable
The check returns early for boolean schemas so that this membership test is never reached.
Then should we fix the signature? def _get_type_from_schema(schema: Dict) -> str:
schema: Union[Dict, bool]
    @staticmethod
    def _get_type_from_schema(schema: Dict) -> str:
        """Returns a generic json type from a schema."""
        # Handle boolean schemas per JSON Schema spec: true accepts any JSON, false never validates
Is there a public reference to the JSON Schema spec that we could add here as a comment?
Hello, you can find the reference here:
https://json-schema.org/understanding-json-schema/basics#hello-world!
Thanks

closes #15421
Fix for Boolean Schema Properties in JSON Schema Ingestion
What: Fixed TypeError: argument of type 'bool' is not iterable when JSON schema properties are set to boolean values.
Root cause
Methods expected Dict but received bool when properties were set to true/false per JSON Schema spec.
Code Analysis
Main file: metadata-ingestion/src/datahub/ingestion/extractor/json_schema_util.py
-> Code expected all schemas to be Dict objects
-> Crashed at Ellipsis in schema and "key" in schema checks when schema was boolean
-> Boolean schemas are valid per JSON Schema: true = accepts any JSON, false = never validates
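The failure mode in the analysis above can be reproduced in isolation. This is a minimal sketch that does not use DataHub's code: `describe_schema` is a hypothetical stand-in for any method that assumes its argument is a dict and runs a membership test on it.

```python
from typing import Any, Dict


def describe_schema(schema: Dict[str, Any]) -> str:
    # Hypothetical stand-in for a method that assumes a dict schema.
    # The membership test works on a dict but raises TypeError on a bool.
    if "type" in schema:
        return str(schema["type"])
    return "object"


# A property set to `true` is a valid JSON Schema, but it parses to a Python bool.
document = {"type": "object", "properties": {"anything": True}}

errors = []
for name, prop in document["properties"].items():
    try:
        describe_schema(prop)
    except TypeError as exc:
        errors.append((name, type(exc).__name__))

print(errors)
```

The type hint alone does not stop a bool from reaching the membership test at runtime, which is why the crash only appeared once a real schema used boolean properties.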
Solution
-> Added isinstance(schema, bool) checks before Ellipsis in schema
-> Converted boolean → {} dict at property iteration
This PR prevents crashes throughout the call chain when a schema is a boolean.
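The two steps in the solution can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: `get_type_from_schema` and `iter_properties` are hypothetical simplified helpers, and returning "object" for a boolean schema is an assumed placeholder, since the value the real method returns is not shown here.

```python
from typing import Any, Dict, Iterator, Tuple, Union

# Widened type, as suggested in the review: a schema is a dict or a bool.
JsonSchema = Union[Dict[str, Any], bool]


def get_type_from_schema(schema: JsonSchema) -> str:
    """Hypothetical simplified version of a guarded type lookup."""
    # Guard first: a bool has no keys, so any membership test would raise.
    if isinstance(schema, bool):
        # Assumption: boolean schemas are reported as a generic "object".
        return "object"
    return str(schema.get("type", "object"))


def iter_properties(schema: Dict[str, Any]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (name, sub-schema) pairs, converting boolean schemas to {}."""
    for name, prop in schema.get("properties", {}).items():
        yield name, ({} if isinstance(prop, bool) else prop)


example = {
    "type": "object",
    "properties": {"id": {"type": "string"}, "extra": True, "never": False},
}
resolved = {name: get_type_from_schema(prop) for name, prop in iter_properties(example)}
print(resolved)
```

Normalizing at property iteration keeps downstream code dict-only, while the isinstance guard covers callers that pass a boolean schema directly. Note that converting false to {} loses the "never validates" meaning, which may be acceptable when the goal is only to describe fields.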