Table scan rejects current-schema column names after `UpdateSchemaAction` commit

# Table scan rejects current-schema column names after `UpdateSchemaAction` commit

**Label:** `bug`

## Is your feature request related to a problem or challenge?

A default `TableScanBuilder::build()` validates caller-supplied column names against the *snapshot's* schema, not the *table's current* schema. After an `UpdateSchemaAction` commit changes the current schema (rename / add / delete column), pre-existing snapshots still point at the pre-evolution `schema_id`, so the scan rejects names that are valid against the post-evolution schema.

### Reproducer

Setup: any Iceberg table with at least one snapshot. Apply a schema-evolution transaction using `UpdateSchemaAction` (shipped in #2120):

```rust
let tx = Transaction::new(&table);
let action = tx.update_schema()
    .add_column(AddColumn::optional("note", Type::Primitive(PrimitiveType::String)));
let tx = action.apply(tx)?;
let table = tx.commit(&catalog).await?;
```

The catalog now reports the post-evolution schema (verified via `catalog.load_table().metadata().current_schema()`). But a scan over the same `Table`:

```rust
table.scan().select(["note"]).build()
```

returns:

```
DataInvalid => Column note not found in table. Schema: table {
  1: id: optional long
  2: name: optional string
  3: tmp: optional double
}
```

The schema dump is the **snapshot's** schema — the column added a moment ago is missing.

### Root cause

[`crates/iceberg/src/scan/mod.rs:221`](https://github.com/apache/iceberg-rust/blob/main/crates/iceberg/src/scan/mod.rs#L221):

```rust
let schema = snapshot.schema(self.table.metadata())?;
```

`snapshot.schema(metadata)` resolves the snapshot's `schema_id` against `metadata.schemas` and returns *the schema the snapshot was written under*. For time-travel scans (`.snapshot_id(...)`) that's exactly right — the caller is asking for "the table as it existed at this snapshot." But for a default scan, the caller is asking for "the table as it is now," and the post-evolution columns are legitimately part of that vocabulary.
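To make the resolution difference concrete, here is a minimal sketch with toy types. `ToySchema`, `ToySnapshot`, `ToyMetadata`, `evolve` and `example_metadata` are all hypothetical stand-ins invented for illustration, not the iceberg-rust API; the sketch only assumes that a snapshot stores a `schema_id` and that schema evolution is a metadata-only change that leaves existing snapshots untouched.

```rust
use std::collections::HashMap;

// Toy stand-ins for table metadata; these are NOT the iceberg-rust types.
#[derive(Debug, Clone, PartialEq)]
struct ToySchema {
    schema_id: i32,
    columns: Vec<String>,
}

struct ToySnapshot {
    // ID of the schema that was current when the snapshot was written.
    schema_id: i32,
}

struct ToyMetadata {
    schemas: HashMap<i32, ToySchema>,
    current_schema_id: i32,
    snapshots: Vec<ToySnapshot>,
}

impl ToyMetadata {
    /// Schema the snapshot was written under (the analogue of `snapshot.schema(metadata)`).
    fn snapshot_schema(&self, snapshot: &ToySnapshot) -> Option<&ToySchema> {
        self.schemas.get(&snapshot.schema_id)
    }

    /// Schema the table has now.
    fn current_schema(&self) -> Option<&ToySchema> {
        self.schemas.get(&self.current_schema_id)
    }

    /// Metadata-only evolution: register a new schema and make it current.
    /// Existing snapshots keep their old `schema_id`.
    fn evolve(&mut self, new_schema: ToySchema) {
        self.current_schema_id = new_schema.schema_id;
        self.schemas.insert(new_schema.schema_id, new_schema);
    }
}

/// One snapshot written under schema 0 with columns `id`, `name`, `tmp`.
fn example_metadata() -> ToyMetadata {
    let v0 = ToySchema {
        schema_id: 0,
        columns: vec!["id".into(), "name".into(), "tmp".into()],
    };
    ToyMetadata {
        schemas: HashMap::from([(0, v0)]),
        current_schema_id: 0,
        snapshots: vec![ToySnapshot { schema_id: 0 }],
    }
}
```

After calling `evolve` with a schema that adds `note`, `current_schema()` contains `note` while `snapshot_schema(&snapshots[0])` does not, which is the mismatch the reproducer hits.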

The downstream Parquet projection (`crates/iceberg/src/arrow/reader/projection.rs::get_arrow_projection_mask_with_field_ids`) already maps field IDs to on-disk column names via `PARQUET:field_id` metadata, so resolving names against the current schema is safe end-to-end: field IDs are stable across schema versions, and the file's original column names live in the Parquet metadata until the file is rewritten. PyIceberg's reader (`pyiceberg/io/pyarrow.py::_task_to_record_batches`) implements exactly this pattern: project by field ID, then rename the Arrow batch on the way out.
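The field-ID indirection can be sketched in a few lines. This is a simplified illustration, not the reader's actual code: the schema is modelled as a plain name-to-field-ID map, a data file as a list of (on-disk name, field ID) pairs, and `field_id_by_name`, `file_column_index` and `project` here are hypothetical helpers written for this example.

```rust
use std::collections::HashMap;

/// Looks up the field ID for a column name in a toy schema (name -> field ID).
/// Hypothetical helper; not the iceberg-rust `Schema` method.
fn field_id_by_name(schema: &HashMap<&str, i32>, name: &str) -> Option<i32> {
    schema.get(name).copied()
}

/// Finds the index of the on-disk column that carries `field_id`.
/// `file_columns` models a data file as (on-disk name, field ID) pairs.
fn file_column_index(file_columns: &[(&str, i32)], field_id: i32) -> Option<usize> {
    file_columns.iter().position(|(_, id)| *id == field_id)
}

/// Resolves a caller-supplied name to a file column by going through the field ID.
/// The on-disk name is never compared against the caller's name.
fn project(
    schema: &HashMap<&str, i32>,
    file_columns: &[(&str, i32)],
    name: &str,
) -> Option<usize> {
    let field_id = field_id_by_name(schema, name)?;
    file_column_index(file_columns, field_id)
}
```

With a file written as `[("id", 1), ("name", 2), ("tmp", 3)]` and a current schema in which field 2 is now called `full_name`, `project` resolves `full_name` to file column index 1 even though the file still says `name`. A column added after the file was written has no matching field ID in the file, so `project` returns `None`; a real reader would typically materialize such a column as nulls.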

### Why this wasn't caught upstream

`UpdateSchemaAction` (#2120) shipped with metadata-only tests in `crates/catalog/loader/tests/schema_update_suite.rs`; none of them call `table.scan().select(...)` after the schema commit. The pre-existing `crates/integration_tests/tests/read_evolved_schema.rs` only uses `table.scan().build()` with no `select(...)` call, which bypasses the column-name validation loop entirely (it falls through to `column_names.unwrap_or_else(|| schema.as_struct().fields()...)`).

So a column-name lookup combined with a schema-evolved table is the gap. Both `add_column` and `delete_column` (already in `main`) trigger it; `rename_column` (#2563) trips it even more cleanly because the old name continues to exist on disk.

## Describe the solution you'd like

Branch on whether the caller asked for a specific snapshot:

```rust
let schema = if self.snapshot_id.is_some() {
    snapshot.schema(self.table.metadata())?
} else {
    self.table.metadata().current_schema().clone()
};
```

- Explicit `snapshot_id` (time-travel): keep the snapshot-time vocabulary. A caller asking "what existed at snapshot N" should see schema N's columns.
- Default scan (no `snapshot_id`): use the table's current schema. Field IDs are stable across schemas, so the downstream projection still finds the right on-disk columns.

Both the column-name validation loop and the subsequent `field_id_by_name` lookup share the same `schema` variable, so the fix is one assignment.
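The intended behaviour of the branch can be checked against a small model. The sketch below uses a hypothetical `ToySchema` plus two free functions, `scan_schema` and `validate_columns`, written only for this illustration; they mimic the shape of the proposed change and of the validation loop, and are not the code in `scan/mod.rs`.

```rust
// Toy schema: just an ordered list of column names. NOT the iceberg-rust `Schema`.
#[derive(Debug, Clone, PartialEq)]
struct ToySchema {
    columns: Vec<String>,
}

/// Picks the schema used for name validation, mirroring the proposed branch:
/// an explicit snapshot ID keeps the snapshot-time schema, otherwise the
/// table's current schema is used.
fn scan_schema<'a>(
    requested_snapshot_id: Option<i64>,
    snapshot_schema: &'a ToySchema,
    current_schema: &'a ToySchema,
) -> &'a ToySchema {
    if requested_snapshot_id.is_some() {
        snapshot_schema
    } else {
        current_schema
    }
}

/// Validates the selected names against the chosen schema.
/// Returns an error naming the first column that is not present.
fn validate_columns(schema: &ToySchema, selected: &[&str]) -> Result<(), String> {
    for name in selected {
        if !schema.columns.iter().any(|c| c.as_str() == *name) {
            return Err(format!("Column {name} not found in table"));
        }
    }
    Ok(())
}
```

Under this model, with a snapshot schema of `id, name, tmp` and a current schema where `name` was renamed to `full_name`: a default scan accepts `full_name` and rejects `name`, while a scan with an explicit snapshot ID accepts `name` and rejects `full_name`. Those are the three cases the regression tests are meant to cover.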

## Willingness to contribute

I can contribute this independently. I have a working branch with the fix + three regression tests (rename-then-read works, old-name-after-rename errors, time-travel still uses snapshot schema), all 1299 iceberg lib tests passing, clippy + rustfmt clean. PR ready to open once this issue is filed for reference.


Table scan rejects current-schema column names after `UpdateSchemaAction` commit #2565
