Skip to content

CLI: add stdin support for instance validation #614

Open
Vaibhav701161 wants to merge 7 commits intosourcemeta:mainfrom
Vaibhav701161:feat/stdin-validate
Open

CLI: add stdin support for instance validation #614
Vaibhav701161 wants to merge 7 commits intosourcemeta:mainfrom
Vaibhav701161:feat/stdin-validate

Conversation

@Vaibhav701161
Copy link

Adds stdin support for instance input to enable real-time editor diagnostics.
Fixes(#537)

  • Centralized stdin parsing in src/input.h (JSON-first, YAML fallback).
  • validate accepts instances from stdin only; schemas must be file paths.
  • Stdin inputs resolve relative $ref against the current working directory.
  • Includes integration test test/validate/pass_stdin.sh.
  • Unblocks sourcemeta/studio#61; addresses sourcemeta/jsonschema#537.

Refs: sourcemeta/studio#61, #537

*Build and test:
Screenshot 2026-01-15 145159

@Vaibhav701161 Vaibhav701161 force-pushed the feat/stdin-validate branch 2 times, most recently from ddb8312 to 9a040d7 Compare January 15, 2026 10:07
@Vaibhav701161
Copy link
Author

Hi @jviotti , I’ve opened this PR and would appreciate a review when you have time. Thanks!

@jviotti
Copy link
Member

jviotti commented Jan 19, 2026

Hey @Vaibhav701161 , sorry for the delay. It has been a hectic week. I'll try to get into it today

@jviotti
Copy link
Member

jviotti commented Jan 19, 2026

I like the idea but I think there are some things to iron out:

  • If we accept standard input, we should do so in ALL commands, otherwise it gets weird
  • Some commands require a schema as input but for others, passing nothing means "the current directory", which introduces some ambiguity. For example, jsonschema fmt formats all schemas in the current directory, while jsonschema inspect <schema.json> requires a single schema argument
  • Furthermore, some commands require 2 schemas, like validate

I think for consistency, we can require the - to signify standard input in every command. However, for commands take potentially take directories as input, we should transparently implement this logic on src/input.h. So it happens out of the box.

The other problem here will be line numbers. We should make sure all relevant commands handle standard input while still reporting line/column information and potentially some "special" file path like <stdin>?

@Vaibhav701161
Copy link
Author

Vaibhav701161 commented Jan 20, 2026

Hey @jviotti , thanks a lot for the feedback.

I’ve pushed an updated iteration that tries to address the concerns you mentioned:

  • - is now the explicit opt-in for stdin.
  • The stdin logic is centralized in src/input.h, so it’s not handled ad-hoc in individual commands.
  • Ambiguous cases (schema via stdin, multiple stdin inputs) are explicitly rejected.
  • Line/column information is preserved, and stdin is handled as a special input (covered via tests).

Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
@Vaibhav701161
Copy link
Author

Hi @jviotti , just a quick follow-up on this.
Whenever you get a chance, I’d appreciate a review to confirm whether this direction looks good or if there’s anything you’d like adjusted before moving forward. Thanks

inline auto read_from_stdin() -> ParsedJSON {
std::string buffer;
{
std::ostringstream ss;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't need this. sourcemeta::core::parse_json should already be able to read from any standard C++ input stream without needing to buffer it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This too is still valid


struct InputJSON {
std::filesystem::path first;
std::filesystem::path resolution_base;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is just a display name, maybe make a std::string? Otherwise <stdin> is not really a path

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this still needs to be addressed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please split this test into multiple tests. If you see the other tests in the project, we have specific conventions and we always test a single thing per file. Plus you also want to add test for every other applicable command.

Finally, you didn't register this test in test/CMakeLists.txt, so it won't run at all

copy, sourcemeta::core::schema_walker, custom_resolver,
get_lint_callback(errors_array, entry, output_json), dialect,
sourcemeta::jsonschema::default_id(entry.first),
sourcemeta::core::URI::from_path(entry.resolution_base)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still definitely want to use sourcemeta::jsonschema::default_id here, to concentrate that logic in a single place. Maybe make such function take entry directly, so that inside we can decide to use entry.resolution_base or whatever?

custom_resolver,
get_lint_callback(errors_array, entry, output_json), dialect,
sourcemeta::jsonschema::default_id(entry.first),
sourcemeta::core::URI::from_path(entry.resolution_base)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

// TODO: Print a verbose message for what is getting parsed
auto parsed{read_file(canonical)};
result.push_back({std::move(canonical), std::move(parsed.document),
auto canonical_copy{canonical};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

}
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this deletion was intended?

}
}
if (stdin_count > 1) {
throw std::runtime_error{"Standard input (-) can only be specified once"};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
throw std::runtime_error{"Standard input (-) can only be specified once"};
throw std::runtime_error{"The standard input position argument `-` can only be specified once"};

std::size_t stdin_count{0};
for (const auto &arg : arguments) {
if (arg == "-") {
stdin_count++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think counting all occurrences makes sense if you only want up to two. Maybe check that if you already got at least 2, then its an error and you can break?

set +o errexit
cat "$TMP/schema.json" > "$TMP/schema_stdin.json"

OUTPUT=$("$1" validate - "$TMP/schema.json" < "$TMP/schema_stdin.json" 2>&1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the other shell scripts for the conventions on how we test stuff. Some obvious ones:

  • We never capture into a variable. We pipe into a file and then diff it
  • We never enable set +o errexit
  • We never test more than one thing at once

Also your test is not registered in test/CMakeLists.txt so it won't run at all. But overall, split it so that there are multiple tests testing each one thing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants