Problem Statement
The `cohort_heterozygosity()` method in `malariagen_data/anoph/heterozygosity.py` computes heterozygosity for each sample in a cohort sequentially using a for-loop. For large cohorts (100+ samples), this approach is unnecessarily slow because:
- Repeated Data Loading: Each `sample_count_het()` call independently loads the same SNP data for the region from disk via `snp_calls()`. For a 100-sample cohort, the SNP data is loaded 100 times unnecessarily.
- Sequential Processing: The inner loop processes samples one at a time, preventing any vectorization benefit.
Current Implementation (Lines 798-809)

```python
het_values = []
for sample_id in df_cohort_samples["sample_id"]:
    df_het = self.sample_count_het(
        sample=sample_id,
        region=region_prepped,
        window_size=window_size,
        site_mask=site_mask,
        chunks=chunks,
        inline_array=inline_array,
    )
    het_values.append(df_het["heterozygosity"].mean())
```
Root Cause
- `snp_calls()` is called O(N) times, where N = number of samples in the cohort
- Each call involves expensive disk I/O to load variant position and genotype data
- Data loading, not the heterozygosity computation itself, dominates execution time for large cohorts
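The O(N) versus O(1) I/O pattern can be made concrete with a small sketch. The `load_snp_data()` stand-in below is hypothetical (it only counts invocations, in place of the real `snp_calls()` disk read), but it shows how the current loop multiplies load cost by cohort size while a single batched load does not:

```python
import numpy as np

load_calls = 0

def load_snp_data():
    """Hypothetical stand-in for snp_calls(): counts how often data is loaded."""
    global load_calls
    load_calls += 1
    # Fake genotype array with shape (variants, samples, ploidy).
    return np.zeros((100, 5, 2), dtype=int)

n_samples = 5

# Sequential pattern: one load per sample -> O(N) I/O.
for s in range(n_samples):
    _ = load_snp_data()[:, s]
assert load_calls == n_samples

# Batched pattern: a single load serves every sample -> O(1) I/O.
load_calls = 0
gt = load_snp_data()
for s in range(n_samples):
    _ = gt[:, s]
assert load_calls == 1
```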
Proposed Solution
Implement a vectorized approach that:
- Loads SNP data once per cohort, using a single `snp_calls()` call instead of N separate calls
- Computes heterozygosity across all samples in one batch using `GenotypeDaskArray.is_het()`, which produces shape (variants, samples) instead of per-sample (variants,)
- Applies per-sample windowing in a loop (unavoidable, since `allel.moving_statistic()` is 1D-only)
- Returns a results dict mapping sample_id → (windows, counts) tuples
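The steps above can be sketched in plain NumPy. This is a minimal illustration, not the proposed implementation: the real code would call `GenotypeDaskArray.is_het()` and `allel.moving_statistic()`, whereas here the is-het test is written out directly and windowing is simplified to non-overlapping blocks. The function name `cohort_windowed_het_counts` is hypothetical:

```python
import numpy as np

def cohort_windowed_het_counts(gt, pos, window_size):
    """Batched windowed het counts (illustrative sketch).

    gt: int array, shape (variants, samples, 2), genotype calls, -1 = missing.
    pos: int array, shape (variants,), variant positions.
    Returns (windows, counts): window start/stop positions, shape
    (n_windows, 2), and per-window het counts, shape (n_windows, samples).
    """
    # Single batched is-het computation across ALL samples at once:
    # True where the two alleles differ and both are called.
    called = np.all(gt >= 0, axis=2)          # (variants, samples)
    is_het = (gt[:, :, 0] != gt[:, :, 1]) & called

    # Non-overlapping windows over variants; the reshape applies the
    # 1D windowing to every sample column simultaneously.
    n_windows = len(pos) // window_size
    n = n_windows * window_size
    counts = is_het[:n].reshape(n_windows, window_size, -1).sum(axis=1)

    pw = pos[:n].reshape(n_windows, window_size)
    windows = np.column_stack([pw[:, 0], pw[:, -1]])
    return windows, counts
```

Only the final windowing over positions remains per-window; the expensive per-sample pass of the original loop is gone.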
Key Optimization Points
- Data Loading: O(N) → O(1) disk I/O operations
- Heterozygosity Computation: Single batch operation across all samples
- Per-Sample Aggregation: Manual loop for windowing (required by API constraints)
Testing Requirements
- Regression test: Verify the vectorized method produces results identical to the sequential approach
- Test all cohort sizes: 1, 10, 100+ samples
- Test all sample_set, site_mask, window_size combinations
- Numerical tolerance: Within floating-point precision (rtol=1e-10)
- Integration test: Confirm existing tests still pass
- Edge cases: Single-sample cohorts, missing sites, window_size > region size
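The regression-test idea can be sketched as a direct sequential-versus-vectorized comparison. This is a self-contained NumPy illustration, assuming synthetic genotypes rather than the real `sample_count_het()` / `snp_calls()` pipeline; the tolerance mirrors the `rtol=1e-10` requirement above:

```python
import numpy as np

rng = np.random.default_rng(42)
n_variants, n_samples = 1000, 10
# Synthetic diploid genotypes: shape (variants, samples, 2), alleles 0/1.
gt = rng.integers(0, 2, size=(n_variants, n_samples, 2))

# Sequential reference: one pass per sample, mirroring the original loop.
seq = np.array(
    [np.mean(gt[:, s, 0] != gt[:, s, 1]) for s in range(n_samples)]
)

# Vectorized: one batched comparison across all samples at once.
vec = (gt[:, :, 0] != gt[:, :, 1]).mean(axis=0)

# Both paths must agree to within floating-point precision.
assert np.allclose(seq, vec, rtol=1e-10)
```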