### Is your feature request related to a problem? Please describe.
The StarRocks connector currently uses only the MySQL wire protocol (port 9030) for all query execution. The MySQL protocol serializes data row-by-row in a text-based format, which becomes a significant performance bottleneck for large result sets commonly encountered in dashboard queries:
- **10K+ rows:** MySQL serialization overhead dominates query latency (~54ms to transfer 100K rows in a full table scan).
- **High memory allocations:** The row-based deserialization path produces ~1.2M allocations per 100K-row query, increasing GC pressure in the Rill runtime process.
- **Throughput ceiling:** The MySQL protocol tops out at ~1.8M rows/sec for typical analytical queries, limiting dashboard responsiveness for large datasets.
StarRocks 3.0+ natively supports Apache Arrow Flight SQL, a columnar transport protocol designed for high-throughput analytical workloads, but Rill does not currently leverage it.
### Describe the solution you'd like
Add Arrow Flight SQL as an opt-in query transport for the StarRocks connector, configurable via a `transport: "flight_sql"` property. The implementation should:
- Route `Query()` and `QuerySchema()` through Arrow Flight SQL when configured, while keeping `Exec()` and `DryRun` on MySQL (DDL/DML operations don't benefit from columnar transfer).
- Automatically fall back to MySQL for parameterized queries (`stmt.Args`), since Arrow Flight SQL does not support query parameter binding at the protocol level.
- Handle StarRocks' FE/BE split architecture: the FE handles `Execute()` (query planning), while BE nodes serve `DoGet()` (data retrieval). The connector must parse endpoint `Location` URIs from `FlightInfo` to route `DoGet` to the correct BE node.
- Pool BE clients to avoid per-query gRPC connection overhead, with eviction on connection failure for resilience during rolling restarts.
- Limit concurrency via a configurable semaphore (`flight_sql_max_conns`) to prevent exhausting the StarRocks FE's per-user connection limit.
- Support `ExecutionTimeout` via `context.WithTimeout`, consistent with other OLAP drivers (ClickHouse, Druid, Pinot).
- Provide a `flight_sql_be_addr` override for environments where BE nodes advertise internal addresses not reachable from the client (e.g., Docker, NAT).
### Expected configuration

```yaml
type: connector
driver: starrocks
host: "starrocks-fe.example.com"
port: 9030
username: "analyst"
password: "{{ .env.STARROCKS_PASSWORD }}"
database: "my_database"
transport: "flight_sql"          # "mysql" (default) or "flight_sql"
flight_sql_port: 9408            # FE Arrow Flight SQL port (default: 9408)
flight_sql_be_addr: "host:port"  # Optional: override BE address for NAT/Docker
flight_sql_max_conns: 100        # Optional: max concurrent Flight SQL queries (default: 100)
```
### Expected performance improvement

| Query Size | MySQL | Flight SQL | Improvement |
|---|---|---|---|
| 1 row | 0.67ms | 2.28ms | MySQL 3.4x faster (gRPC overhead) |
| 100 rows | 2.75ms | 2.94ms | ~same |
| 1K rows | 3.69ms | 2.97ms | Flight SQL 1.2x faster |
| 10K rows | 9.55ms | 4.87ms | Flight SQL 2.0x faster |
| 100K rows | 54.5ms | 19.5ms | Flight SQL 2.8x faster |
The crossover point is ~1K rows; below this threshold, MySQL is faster due to per-query gRPC connection overhead.
### Describe alternatives you've considered
- **MySQL protocol with compression:** MySQL supports `zlib` and `zstd` compression, which could reduce network overhead. However, this does not address the row-by-row serialization cost on the client side, nor the high allocation count. Benchmarks with compression showed <10% improvement for typical analytical queries.
- **JDBC with Arrow extensions:** Some JDBC drivers (e.g., Snowflake's) support fetching results as Arrow batches. StarRocks' JDBC driver does not currently support this, and JDBC adds Java dependency overhead.
### Additional context

#### Requirements
- **StarRocks 4.0.4+:** Earlier versions have Arrow Flight serialization bugs for certain types (e.g., `VARBINARY`). See StarRocks PR #65889.
- **FE/BE config:** `arrow_flight_port` must be set in both `fe.conf` (default 9408) and `be.conf` (default 9419).
- **Network:** The FE Arrow Flight port must be accessible from the Rill runtime. The BE Arrow Flight port must also be accessible unless FE proxy mode is enabled (`arrow_flight_proxy_enabled = true` in `fe.conf`).
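Put together, the server-side settings listed above would look roughly like this (using the default port values from the requirements; verify against the StarRocks docs for your version):

```
# fe.conf
arrow_flight_port = 9408
# Optional: proxy DoGet traffic through the FE so only the FE port
# needs to be reachable from the client
arrow_flight_proxy_enabled = true

# be.conf
arrow_flight_port = 9419
```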
#### Architecture

```
Query() / QuerySchema()
  |
  +-- ExecutionTimeout -> context.WithTimeout (if set)
  |
  +-- transport=mysql (default) -> MySQL protocol (port 9030)
  |
  +-- transport=flight_sql
        +-- stmt.Args non-empty? -> fallback to MySQL
        +-- flightSem.Acquire() -> concurrency limit
        +-- Execute() -> FE (port 9408)
        +-- DoGet() -> BE (via endpoint Location URI, with pooled clients)
              +-- on failure: evict cached client, next query reconnects

Exec() / DryRun -> Always MySQL
```
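The BE-routing leg of the diagram (Location URI parsing, `flight_sql_be_addr` override, pooled clients with eviction) could look like the sketch below. Arrow Flight location URIs use schemes such as `grpc+tcp://host:port`; everything else here (`beTarget`, `bePool`, the `dial` stand-in for Flight SQL client construction) is a hypothetical outline, not the proposed code.

```go
package main

import (
	"fmt"
	"net/url"
	"sync"
)

// beTarget extracts the host:port a DoGet call should hit from a FlightInfo
// endpoint Location URI (e.g. "grpc+tcp://10.0.0.5:9419"). If beAddrOverride
// (flight_sql_be_addr) is set, it wins — for Docker/NAT setups where the
// advertised BE address is unreachable from the client.
func beTarget(location, beAddrOverride string) (string, error) {
	if beAddrOverride != "" {
		return beAddrOverride, nil
	}
	u, err := url.Parse(location)
	if err != nil {
		return "", fmt.Errorf("invalid endpoint location %q: %w", location, err)
	}
	return u.Host, nil
}

// bePool caches one client per BE address so repeated queries reuse the
// underlying gRPC connection. The dial function stands in for Flight SQL
// client construction in this sketch.
type bePool struct {
	mu      sync.Mutex
	clients map[string]any
	dial    func(addr string) (any, error)
}

func (p *bePool) get(addr string) (any, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if c, ok := p.clients[addr]; ok {
		return c, nil
	}
	c, err := p.dial(addr)
	if err != nil {
		return nil, err
	}
	p.clients[addr] = c
	return c, nil
}

// evict drops a cached client after a DoGet failure so the next query
// reconnects — the resilience needed during BE rolling restarts.
func (p *bePool) evict(addr string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.clients, addr)
}

func main() {
	addr, _ := beTarget("grpc+tcp://10.0.0.5:9419", "")
	fmt.Println(addr) // 10.0.0.5:9419
}
```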
#### Full benchmark results (100K rows, full table scan)

| Metric | MySQL | Flight SQL | Improvement |
|---|---|---|---|
| Throughput | 1.79M rows/sec | 4.77M rows/sec | 2.7x faster |
| Memory | 17.2MB/op | 12.7MB/op | 26% less |
| Allocations | 1.2M allocs/op | 469K allocs/op | 61% fewer |
#### Scope

In scope:

- `transport` config property (`mysql` / `flight_sql`)
- Flight SQL client initialization with Basic Auth over gRPC
- DoGet BE routing via endpoint Location URIs
- BE client pooling with eviction on failure
- Arrow RecordBatch to `drivers.Result` type conversion
- `flightRows` implementing the `Rows` interface (`Next`/`MapScan`/`Scan`/`Close`)
- Concurrency limiter (`flight_sql_max_conns` semaphore)
- `ExecutionTimeout` support
- Automatic MySQL fallback for parameterized queries
- `flight_sql_be_addr` override for Docker/NAT environments
- Unified integration tests (MySQL + Flight SQL, single container)
- Performance benchmarks
Out of scope:

- Routing `Exec()` through Flight SQL (no performance benefit for DDL/DML)
- Multi-endpoint consumption (StarRocks returns a single endpoint for normal SQL queries)
- TLS certificate pinning for BE connections (follows the FE SSL setting)
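For the `flightRows` item listed in scope, a minimal cursor shape could look like this. For brevity the sketch iterates batches pre-decoded into plain Go slices; a real implementation would hold Arrow records from the `DoGet` stream and decode columns lazily, and the exact `Rows` method signatures are assumptions.

```go
package main

import (
	"fmt"
	"io"
)

// flightRows sketches a Rows-style cursor over result batches. Each batch
// is pre-decoded into rows of column values here; the real type would wrap
// the Arrow record stream returned by DoGet.
type flightRows struct {
	cols    []string
	batches [][][]any // batch -> row -> column values
	batch   int
	row     int
}

// Next advances to the next row, moving across batch boundaries.
func (r *flightRows) Next() bool {
	for r.batch < len(r.batches) {
		if r.row < len(r.batches[r.batch]) {
			return true
		}
		r.batch++
		r.row = 0
	}
	return false
}

// Scan copies the current row's values into dest and advances the cursor.
func (r *flightRows) Scan(dest ...*any) error {
	cur := r.batches[r.batch][r.row]
	if len(dest) != len(cur) {
		return fmt.Errorf("expected %d destinations, got %d", len(cur), len(dest))
	}
	for i, v := range cur {
		*dest[i] = v
	}
	r.row++
	return nil
}

// MapScan returns the current row keyed by column name and advances.
func (r *flightRows) MapScan() (map[string]any, error) {
	if r.batch >= len(r.batches) {
		return nil, io.EOF
	}
	cur := r.batches[r.batch][r.row]
	m := make(map[string]any, len(r.cols))
	for i, c := range r.cols {
		m[c] = cur[i]
	}
	r.row++
	return m, nil
}

// Close would release the underlying DoGet stream; a no-op in this sketch.
func (r *flightRows) Close() error { return nil }

func main() {
	rows := &flightRows{
		cols:    []string{"id", "name"},
		batches: [][][]any{{{1, "a"}}, {{2, "b"}}},
	}
	defer rows.Close()
	for rows.Next() {
		m, _ := rows.MapScan()
		fmt.Println(m["id"], m["name"])
	}
}
```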