Merged
72 changes: 28 additions & 44 deletions .github/workflows/autotests.yml
@@ -1,50 +1,34 @@
name: Generate latest builds
name: Python CI Tests

on:
  push:
    branches: ["master"]
    branches: ["master", "dev*", "jvm-api-v1"]
  pull_request:
    branches: ["master", "dev*"]
    branches: ["master", "dev*", "jvm-api-v1"]
  workflow_dispatch:

jobs:
  run_pytest:
    name: HiCT Library autotests
    runs-on: [ "ubuntu-latest" ]

  tests:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout sources
        uses: actions/checkout@v3
        with:
          submodules: recursive
      - name: Setup Python
        uses: actions/setup-python@v4.3.1
        with:
          # Version range or exact version of Python or PyPy to use, using SemVer's version range syntax. Reads from .python-version if unset.
          python-version: '>=3.9 <3.11'
          # Used to specify a package manager for caching in the default directory. Supported values: pip, pipenv, poetry.
          cache: pip
          # The target architecture (x86, x64) of the Python or PyPy interpreter.
          architecture: x64
          # Set this option if you want the action to update environment variables.
          update-environment: true
      - name: Install HDF5 library
        uses: awalsh128/cache-apt-pkgs-action@latest
        with:
          packages: libhdf5-dev
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install -r requirements-dev.txt
        continue-on-error: true
      - name: Install dependencies
        run: |
          pip install pylint
      - name: Analysing the code with pylint
        run: |
          pylint $(git ls-files '*.py')
        continue-on-error: true
      - name: Analysing the code with mypy
        run: |
          mypy -p hict
        continue-on-error: true
      - name: Launch PyTest
        run: pytest -v .
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: pip

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install -r requirements-dev.txt
          pip install -e .

      - name: Run mypy (JVM API package)
        run: mypy hict_jvm_api

      - name: Run tests (JVM API-first suite)
        run: ./run_tests.sh
68 changes: 65 additions & 3 deletions README.md
@@ -1,6 +1,7 @@
# HiCT library for interactive manual scaffolding using Hi-C contact maps
# HiCT Python library (JVM API-first)

**Note**: this version is preliminary but provides an overview of essential implementation details for HiCT model.
This repository now provides a JVM-backed Python API as the primary and maintained interface.
Heavy operations are executed in `HiCT_JVM`; Python acts as a fast typed client layer.

## Overview

@@ -24,7 +25,68 @@ It is recommended to use virtual environments provided by `venv` module to simpl
This library uses the HiCT format for Hi-C data; you can convert Cooler's `.cool` or `.mcool` files to it using [HiCT utils](https://github.com/ctlab/HiCT_Utils).

## Documentation
This library has ContactMatrixFacet as the main interaction point. It hides all the internal methods, exposing only simple ones. Documentation for this module could be found at [doc directory](https://github.com/ctlab/HiCT/blob/master/doc/hict.api.ContactMatrixFacet.html) (download this file and open it using your web browser).
- JVM API client docs: [`doc/jvm_api_v1.md`](./doc/jvm_api_v1.md)
- Legacy `ContactMatrixFacet` docs (compatibility only):
[`doc/hict.api.ContactMatrixFacet.html`](./doc/hict.api.ContactMatrixFacet.html)

## Building from source
You can run the `rebuild.sh` script in the source directory. It performs static type checking of the module with mypy (which may produce error messages), builds the library from source, and reinstalls it, replacing the currently installed version.

## JVM API client (v1)

Use `hict.HiCTClient` (alias of `hict_jvm_api.HiCTJVMClient`) as the default entry point.

### Key capabilities
* Open/attach/close sessions in HiCT_JVM;
* Fetch Hi-C map regions as numpy RGBA arrays (`PNG_BY_PIXELS`) for ML pipelines;
* Fetch numeric submatrices directly as dense arrays/tensors (`/matrix/query`);
* Run scaffolding operations via API (reverse/move/split/group/ungroup/debris);
* Run converter jobs (single and batch) and monitor status;
* Link FASTA, export FASTA selections/assembly, import/export AGP;
* Convert coordinates between BP/BINS/PIXELS with hidden-contig awareness.
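
Monitoring a converter job usually comes down to polling its status until it reaches a terminal state. The sketch below shows that loop in plain Python under stated assumptions: `wait_for_job`, the `get_status` callable, and the status strings `"DONE"` and `"FAILED"` are hypothetical stand-ins for illustration, not the client's actual polling helpers or the server's status names.

```python
import time
from typing import Callable, Iterable


def wait_for_job(
    get_status: Callable[[], str],
    terminal_states: Iterable[str] = ("DONE", "FAILED"),  # hypothetical status names
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns a terminal state or the timeout expires."""
    terminal = set(terminal_states)
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still in state {status!r} after {timeout_s} s")
        # Wait before asking again so the server is not flooded with requests.
        sleep(interval_s)
```

In practice `get_status` would wrap whatever status call the client exposes for a given job; passing `sleep` as a parameter keeps the loop easy to test without real delays.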

### Install

```bash
pip install -e .
```

### Quick start

```python
from hict import HiCTClient, Unit

client = HiCTClient("http://localhost:5000")
session = client.open_file("build/quad/combined_ind2_4DN.hict.hdf5")
resolution = session.resolutions[0]
tile = client.fetch_region_pixels(
    start_row_px=0,
    start_col_px=0,
    rows=256,
    cols=256,
    bp_resolution=resolution,
)
px = client.convert_units(1_000_000, from_unit=Unit.BP, to_unit=Unit.PIXELS, bp_resolution=resolution)
signal = client.fetch_region_signal(
    start_row=0,
    start_col=0,
    rows=256,
    cols=256,
    bp_resolution=resolution,
    unit=Unit.PIXELS,
    signal_mode="TRADITIONAL_NORMALIZED",
    dtype="float32",
)
```

### Quick links
* API docs: [`doc/jvm_api_v1.md`](./doc/jvm_api_v1.md)
* Notebooks:
  * [`notebooks/jvm_api_quickstart.ipynb`](./notebooks/jvm_api_quickstart.ipynb)
  * [`notebooks/jvm_api_pytorch_dataloader.ipynb`](./notebooks/jvm_api_pytorch_dataloader.ipynb)

### Tests
* Unit tests (mocked HTTP transport):
  * `./run_jvm_api_tests.sh`
* Optional integration tests against a real running HiCT_JVM:
  * `./run_jvm_api_optional_data_tests.sh`
107 changes: 107 additions & 0 deletions doc/jvm_api_v1.md
@@ -0,0 +1,107 @@
# HiCT JVM API v1 Python Library

`hict` now defaults to a JVM-backed API client (`hict.HiCTClient`) that uses a running
`HiCT_JVM` server as the execution backend.

## Design goals

- Keep heavy matrix/assembly logic in JVM.
- Expose a Python API suitable for bioinformatics scripting and ML pipelines.
- Keep I/O efficient through pooled HTTP connections and direct region extraction (`PNG_BY_PIXELS`).

## Main classes

- `hict.HiCTClient` / `hict_jvm_api.client.HiCTJVMClient`
  - session management (`open_file`, `attach_session`, `close_session`)
  - map region fetch (`fetch_region_pixels`, `fetch_tile_png`, `fetch_tile_with_ranges`)
  - numeric matrix fetch (`fetch_region_signal`, `fetch_region_signal_torch`) with
    `RAW_COUNTS`, `COOLER_WEIGHTED`, `TRADITIONAL_NORMALIZED`, `PIPELINE_SIGNAL`
  - scaffolding operations (`reverse_selection_range`, `move_selection_range`, `split_contig_at_bin`, etc.)
  - conversion jobs (`start_conversion_job`, `start_batch_conversion_jobs`, polling helpers)
  - FASTA/AGP operations (`link_fasta`, `export_fasta_for_selection`, `load_agp`)
- `hict_jvm_api.units.UnitConverter`
  - fast local conversion between `BP`, `BINS`, `PIXELS` at a selected resolution,
    respecting hidden contigs via `contigPresenceAtResolution`.
- `hict_jvm_api.dataloader.HiCTRegionDataset`
  - PyTorch-friendly random-access dataset fetching regions from a live session.
- `hict_jvm_api.dataloader.HiCTSignalDataset`
  - PyTorch/NumPy-friendly dataset fetching scalar matrix windows from `/matrix/query`.
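
To make the hidden-contig point concrete, here is a minimal sketch of a BP to PIXELS conversion. It is not the `UnitConverter` implementation: the `Contig` class and `bp_to_pixel` function are hypothetical, and the sketch assumes that BP coordinates run over all contigs in order, that every contig starts on a fresh bin, and that pixel coordinates count only the bins of visible contigs.

```python
from dataclasses import dataclass
from math import ceil
from typing import Optional, Sequence


@dataclass(frozen=True)
class Contig:
    """Hypothetical contig record: length in base pairs and visibility at one resolution."""
    length_bp: int
    visible: bool


def bp_to_pixel(contigs: Sequence[Contig], bp: int, resolution: int) -> Optional[int]:
    """Map an assembly-wide BP coordinate to a pixel index, or None if it lies in a hidden contig."""
    if bp < 0:
        raise ValueError("bp coordinate must be non-negative")
    bp_offset = 0  # BP consumed by all preceding contigs
    px_offset = 0  # pixels consumed by preceding *visible* contigs only
    for contig in contigs:
        if bp < bp_offset + contig.length_bp:
            if not contig.visible:
                return None  # the coordinate has no on-screen position
            return px_offset + (bp - bp_offset) // resolution
        bp_offset += contig.length_bp
        if contig.visible:
            px_offset += ceil(contig.length_bp / resolution)
    raise ValueError("bp coordinate is past the end of the assembly")
```

The package exports a `HiddenCoordinateError`; returning `None` here is only a simplification for the sketch, and how the real converter reports hidden coordinates should be checked against the library itself.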

## Installation

From repository root:

```bash
pip install -e .
```

With PyTorch extras:

```bash
pip install -e '.[torch]'
```

## Quick start

```python
from hict import HiCTClient, Unit

client = HiCTClient("http://localhost:5001")
open_resp = client.open_file("build/quad/combined_ind2_4DN.hict.hdf5")

resolution = open_resp.resolutions[0] # coarse level
img = client.fetch_region_pixels(
    start_row_px=0,
    start_col_px=0,
    rows=256,
    cols=256,
    bp_resolution=resolution,
)

# Convert BP -> visible pixel coordinate
px = client.convert_units(1_000_000, from_unit=Unit.BP, to_unit=Unit.PIXELS, bp_resolution=resolution)

signal = client.fetch_region_signal(
    start_row=0,
    start_col=0,
    rows=256,
    cols=256,
    bp_resolution=resolution,
    unit=Unit.PIXELS,
    signal_mode="TRADITIONAL_NORMALIZED",
    dtype="float32",
)
```
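
The quick start fetches a single 256x256 window. To cover a larger area, one option is to enumerate window origins on a grid and fetch each window in turn. The helper below is a hypothetical sketch (`iter_windows` is not part of the library); it only computes the `start_row`, `start_col`, `rows`, and `cols` values, using the keyword names from the `fetch_region_signal` call above.

```python
from typing import Dict, Iterator


def iter_windows(total_rows: int, total_cols: int, window: int = 256) -> Iterator[Dict[str, int]]:
    """Yield window coordinates that tile a total_rows x total_cols area in row-major order."""
    if window <= 0:
        raise ValueError("window must be positive")
    for start_row in range(0, total_rows, window):
        for start_col in range(0, total_cols, window):
            yield {
                "start_row": start_row,
                "start_col": start_col,
                # Windows on the bottom and right edges are clipped to the area.
                "rows": min(window, total_rows - start_row),
                "cols": min(window, total_cols - start_col),
            }
```

Assuming the method accepts these keywords as in the quick start, each dictionary could be passed as `client.fetch_region_signal(**w, bp_resolution=resolution, unit=Unit.PIXELS, signal_mode="TRADITIONAL_NORMALIZED", dtype="float32")`.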

## Testing

- Unit tests (mocked transport):

```bash
./run_jvm_api_tests.sh
```

- Optional integration tests against a real server and optional files:

```bash
export HICT_JVM_API_BASE_URL=http://localhost:5001
export HICT_DATASET_FILE=build/quad/combined_ind2_4DN.hict.hdf5
# Optional:
# export HICT_FASTA_FILE=build/quad/quad_combined_ind2.fasta
# export HICT_AGP_FILE=build/quad/some.agp
# export HICT_JVM_API_ALLOW_MUTATION=true
./run_jvm_api_optional_data_tests.sh
```
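
As an illustration of how optional tests can be gated on these variables, the sketch below reads them into a small configuration dictionary and skips a test class when the required ones are missing. This is not the repository's test code: `integration_config` and `OptionalIntegrationTest` are hypothetical names, and treating `HICT_JVM_API_ALLOW_MUTATION=true` as an opt-in flag is an assumption.

```python
import os
import unittest
from typing import Mapping, Optional


def integration_config(env: Optional[Mapping[str, str]] = None) -> Optional[dict]:
    """Return integration settings, or None when the required variables are unset."""
    env = os.environ if env is None else env
    base_url = env.get("HICT_JVM_API_BASE_URL")
    dataset = env.get("HICT_DATASET_FILE")
    if not base_url or not dataset:
        return None
    return {
        "base_url": base_url,
        "dataset": dataset,
        "fasta": env.get("HICT_FASTA_FILE"),  # optional
        "agp": env.get("HICT_AGP_FILE"),      # optional
        # Assumption: mutating tests run only when explicitly enabled.
        "allow_mutation": env.get("HICT_JVM_API_ALLOW_MUTATION", "").lower() == "true",
    }


@unittest.skipUnless(integration_config() is not None, "HiCT_JVM integration settings are not configured")
class OptionalIntegrationTest(unittest.TestCase):
    def test_config_is_complete(self) -> None:
        config = integration_config()
        assert config is not None
        self.assertTrue(config["base_url"].startswith("http"))
```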

## Notebook examples

See:

- `notebooks/jvm_api_quickstart.ipynb`
- `notebooks/jvm_api_pytorch_dataloader.ipynb`

## OpenAPI docs endpoint

When `HiCT_JVM` is running, interactive API documentation is available at:

- `http://localhost:5000/api/v1/`
90 changes: 57 additions & 33 deletions hict/__init__.py
@@ -1,33 +1,57 @@
# MIT License
#
# Copyright (c) 2021-2026. Aleksandr Serdiukov, Anton Zamyatin, Aleksandr Sinitsyn, Vitalii Dravgelis and Computer Technologies Laboratory ITMO University team.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# MIT License
#
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
#
"""HiCT Python API facade.

This package now defaults to the JVM-backed API client for production usage.
Legacy pure-Python modules remain importable from their original subpackages,
but new code should use :class:`HiCTClient`.
"""

from __future__ import annotations

import warnings

from hict_jvm_api.client import HiCTJVMClient
from hict_jvm_api.dataloader import HiCTRegionDataset, HiCTSignalDataset
from hict_jvm_api.exceptions import HiCTAPIError, HiCTClientStateError, HiddenCoordinateError
from hict_jvm_api.models import (
    AssemblyInfo,
    ContigDescriptor,
    OpenFileResponse,
    ScaffoldDescriptor,
    TileRanges,
    TileWithRanges,
    Unit,
)
from hict_jvm_api.units import UnitConverter

# Backward-compatible alias for the primary entrypoint.
HiCTClient = HiCTJVMClient

__all__ = [
    "HiCTClient",
    "HiCTJVMClient",
    "HiCTRegionDataset",
    "HiCTSignalDataset",
    "HiCTAPIError",
    "HiCTClientStateError",
    "HiddenCoordinateError",
    "AssemblyInfo",
    "ContigDescriptor",
    "OpenFileResponse",
    "ScaffoldDescriptor",
    "TileRanges",
    "TileWithRanges",
    "Unit",
    "UnitConverter",
]


def _warn_legacy_import(path: str) -> None:
    warnings.warn(
        (
            f"The legacy pure-Python API module '{path}' is kept for compatibility "
            "but is no longer the recommended API. Use 'hict.HiCTClient' "
            "(JVM-backed) for maintained functionality."
        ),
        category=DeprecationWarning,
        stacklevel=2,
    )