Audit: improve web, environment, asset, and modality coverage #220

@neubig

Dataset quality audit category: Web, environment, asset, and modality issues

The May 2026 dataset audit found 28 issues in this class:

  • web_observation_completeness: 14
  • environment_modeling: 6
  • missing_assets_or_portability: 4
  • modality_underuse: 4

Problem

Several web, GUI, database, and multimodal datasets flatten rich environment state into text or reference assets that are not available. As a result, ADP structures such as WebObservation, ImageObservation, CodeAction, and environment TextObservation go underused.

Examples

  • agenttuning_mind2web: web state and actions are flattened into a multiple-choice text prompt and MessageAction; the sample does not use WebObservation or ApiAction.
  • agenttuning_webshop: web pages are represented as flattened [SEP] text instead of WebObservation page state.
  • go-browse-wa: all 28 WebObservation records have html: null and an empty URL, so raw page state is not fully preserved.
  • agenttuning_db: there are no database execution observations, despite the prompt describing an iterative MySQL environment.
  • android_in_the_wild: image paths are absolute local paths under a developer workspace and are not present in this repository.
  • llava_plus: generated segmentation/inpainting artifacts are not represented as image observations; text says results are displayed, but no output image is recorded.
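
For the android_in_the_wild case, one possible fix is to rewrite absolute paths relative to the workspace they were generated in. The sketch below is only an illustration: `to_repo_relative` and the example workspace root are hypothetical names, not part of the ADP codebase, and it assumes POSIX-style paths.

```python
from pathlib import PurePosixPath
from typing import Optional


def to_repo_relative(path: str, workspace_root: str) -> Optional[str]:
    """Rewrite an absolute asset path relative to workspace_root.

    Returns the path unchanged if it is already relative, and None if it
    is absolute but lies outside workspace_root (such a path needs manual
    handling, e.g. a documented external reference).
    """
    candidate = PurePosixPath(path)
    if not candidate.is_absolute():
        return candidate.as_posix()
    try:
        # relative_to raises ValueError when candidate is not under the root.
        return candidate.relative_to(PurePosixPath(workspace_root)).as_posix()
    except ValueError:
        return None
```

A path that resolves to None is a signal that the asset cannot be made portable by rewriting alone and must either be added to the repository or documented as an external reference.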

Suggested work

  • Use WebObservation for browser/page state when raw data includes URLs, HTML, accessibility trees, screenshots, or viewport information.
  • Use environment observations for execution outputs from databases, shells, browsers, and simulators.
  • Replace absolute local asset paths with portable repo-relative paths, documented external references, or generated sample assets that are actually available.
  • Preserve screenshots and multimodal outputs as ImageObservation where possible.
  • Add tests or audits for empty WebObservation.url, missing web HTML/axtree when available upstream, and non-portable absolute image paths.
  • Document cases where the upstream dataset does not provide richer state, so consumers can distinguish source limitations from converter omissions.
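
The audit checks suggested above could start from something like the following sketch. It works on plain dictionaries; the field names (`class_`, `url`, `html`, `axtree`, `image_path`) and the class labels are assumptions made for illustration and should be mapped to the actual ADP schema. It also cannot tell whether HTML or an accessibility tree was available upstream, so a "missing" flag is a prompt for review rather than a confirmed converter omission.

```python
import re
from typing import Any, Dict, List

# Matches Windows drive prefixes such as "C:\" or "D:/".
_WINDOWS_DRIVE = re.compile(r"^[A-Za-z]:[\\/]")


def is_absolute_local_path(path: str) -> bool:
    """Heuristic: True for POSIX-absolute, home-relative, file:// or drive paths."""
    return (
        path.startswith("/")
        or path.startswith("~")
        or path.startswith("file://")
        or bool(_WINDOWS_DRIVE.match(path))
    )


def audit_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of problem labels for one observation record.

    All field names here are hypothetical stand-ins for the real schema.
    """
    problems: List[str] = []
    kind = record.get("class_")
    if kind == "web_observation":
        if not (record.get("url") or "").strip():
            problems.append("empty_url")
        if record.get("html") is None and record.get("axtree") is None:
            problems.append("missing_html_and_axtree")
    elif kind == "image_observation":
        if is_absolute_local_path(record.get("image_path") or ""):
            problems.append("absolute_image_path")
    return problems
```

Running `audit_record` over every record in a sample and counting the labels per dataset would give a rough regression signal for the issue classes listed in this audit.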
