Constraining UI variability, or enforcing a restricted UI/component set? #467

@domfarolino

Content in an MCP App is kind of a free-for-all, since it's just a web app. The spec makes no mention of host-required style variables, interaction patterns, blessed UI components, etc., so anything goes. This level of UI variability/flexibility is generally unacceptable to hosts, who have evolved their own "UI design principles" and requirements to impose on UI inside an MCP Apps iframe, so that it conforms to the host's native styling and doesn't break their brand UX too much.

Style guidelines and Figma component templates have been published for both Claude and ChatGPT so far:

Claude:

ChatGPT:

If I understand correctly, failure to conform to a host's injected styles or UI patterns will likely cause an MCP App to be rejected by an "app-store-like review" process. All of this comes from a very understandable place. After all, agent hosts are embedding 3p iframe content directly in their UI, and it's reasonable for them to retain some control over the UX of their product.

Unfortunately, the more host-specific requirements get imposed on developers, the less predictable and portable MCP Apps become. MCP Apps get less "write once, run everywhere" and more "write some abstraction, and keep up with an ever-growing list of host-specific requirements for my UX", and over time I think this will cause significant fracturing.

It's probably not possible to solve this whole problem—there will always be some host-specific constraints we don't want to bake directly into the MCP Apps standard. However, as agent hosts like Gemini and others are looking to implement the MCP Apps standard, I'm repeatedly bumping up against the same problem of hosts (and sometimes 3p devs!) wanting an easily enforceable, restricted set of UI components that can be used inside the frame.

Some hosts are looking at https://a2ui.org/ as a path forward. It lets developers express a declarative JSON UI structure describing elements of a finite component library, which gets rendered in the host iframe without 3p script needing to be involved. I'm not proposing this specific approach as a concrete solution, but I'd like to offer it as a jumping-off point for discussion.
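To make the "finite component library" idea concrete, here is a minimal sketch of how a host-side renderer might reject any declarative tree that uses components outside an allowed set. The component vocabulary (`text`, `button`, `column`), the `UINode` type, and `validateTree` are all hypothetical names made up for illustration; none of them come from A2UI or the MCP Apps spec.

```typescript
// Hypothetical component vocabulary; NOT taken from A2UI or the MCP Apps spec.
type UINode =
  | { type: "text"; value: string }
  | { type: "button"; label: string; action: string }
  | { type: "column"; children: UINode[] };

// Walks an untrusted JSON value and returns a list of problems.
// An empty list means the tree only uses components from the allowed set.
function validateTree(node: unknown, path: string = "root"): string[] {
  if (typeof node !== "object" || node === null || Array.isArray(node)) {
    return [path + ": expected a component object"];
  }
  const n = node as Record<string, unknown>;
  switch (n.type) {
    case "text":
      return typeof n.value === "string"
        ? []
        : [path + ": text needs a string value"];
    case "button":
      return typeof n.label === "string" && typeof n.action === "string"
        ? []
        : [path + ": button needs a string label and action"];
    case "column": {
      const children = n.children;
      if (!Array.isArray(children)) {
        return [path + ": column needs a children array"];
      }
      let problems: string[] = [];
      for (let i = 0; i < children.length; i++) {
        problems = problems.concat(
          validateTree(children[i], path + ".children[" + i + "]")
        );
      }
      return problems;
    }
    default:
      // Anything outside the finite vocabulary is rejected, never rendered.
      return [path + ": component type " + JSON.stringify(n.type) + " is not allowed"];
  }
}

// A tree a 3p server might send: data only, no script.
const exampleTree: UINode = {
  type: "column",
  children: [
    { type: "text", value: "Pick a seat" },
    { type: "button", label: "Confirm", action: "confirm-seat" },
  ],
};
const exampleProblems = validateTree(exampleTree);
```

Because the payload is data rather than script, the host owns the renderer and can enforce its own styling; the trade-off is that developers are limited to whatever the vocabulary can express.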

There are tons of possible solutions here, ranging from heavy-handed ones like enforcing A2UI-like component libraries backed by renderers that 3Ps don't control, to lightweight ones like specifying how the host injects styles or otherwise constrains UI variability. (The latter could be done by MCP Apps bootstrapping some closed-root shadow DOM that contains host styles/markup that's isolated from the 3p content inside the iframe, for example...)
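As a sketch of the lightweight end of that range, the snippet below shows one way a host could hand an app a set of style tokens that get serialized into CSS custom properties. Everything here is an assumption for illustration: the `HostTheme` shape, the `themeToCss` helper, the `--host-` prefix, and the token names are hypothetical, and the spec does not currently define any of them.

```typescript
// Hypothetical token map a host might provide; names and shape are illustrative only.
type HostTheme = { [token: string]: string };

// Serializes tokens into a CSS rule declaring custom properties on :root.
// Tokens whose name or value could break out of the declaration are skipped.
function themeToCss(theme: HostTheme): string {
  const lines: string[] = [];
  const names = Object.keys(theme).sort(); // sorted for deterministic output
  for (let i = 0; i < names.length; i++) {
    const name = names[i];
    const value = theme[name];
    const safeName = /^[a-z][a-z0-9-]*$/.test(name);
    const safeValue = !/[;{}<>]/.test(value);
    if (safeName && safeValue) {
      lines.push("  --host-" + name + ": " + value + ";");
    }
  }
  return ":root {\n" + lines.join("\n") + "\n}";
}

const hostCss = themeToCss({
  "font-family": "system-ui, sans-serif",
  "color-bg": "#ffffff",
});
// hostCss is:
// :root {
//   --host-color-bg: #ffffff;
//   --host-font-family: system-ui, sans-serif;
// }
```

An app stylesheet could then read the values with `var(--host-color-bg)`. The shadow DOM variant mentioned above would presumably place host-owned styles and markup inside a root created with `attachShadow({ mode: "closed" })`, but how that bootstrapping would work is left open here.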

The goal is to improve interoperability of MCP Apps across agent hosts and make the platform we're specifying more predictable and less implementation-defined, so developers don't have to write 3 different MCP App View UIs for 3 different agent platforms.

With that, are any hosts interested in firming up some kind of way to constrain UI variability of MCP Apps? Would love thoughts from other implementers here to kick things off. /cc @ochafik @aharvard @idosal @liady @chrishtr @bengreenstein @AbhiGemTest @bricedp.
