Draft
36 commits
b47dbde
put in the outline for the joss publication
skeating Oct 29, 2024
ba6d0c0
bib file should also be called paper
skeating Oct 29, 2024
b76b14d
Merge remote-tracking branch 'origin/main' into sk/joss-publication
skeating Nov 11, 2024
c8b6483
typo
skeating Nov 11, 2024
a5b4908
added either a readme or toc with a list of the directories files pre…
skeating Nov 15, 2024
0fbf696
making read me and table of contents consistent
skeating Nov 19, 2024
acdda77
making readme and table of contents consistent
skeating Nov 19, 2024
d60c426
making readme and table of contents consistent
skeating Nov 19, 2024
2cf4837
making read me and table of contents consistent
skeating Nov 19, 2024
0e77999
deleting this as it becomes the default
skeating Nov 26, 2024
1e033e1
a public repo cannot access a private repo
skeating Nov 26, 2024
8d93e28
highlight UCLH specific instructions
skeating Nov 26, 2024
b020ac8
trying to highlight specific
skeating Nov 26, 2024
90363ab
still trying to highlight UCLH specific information
skeating Nov 26, 2024
8ca4e90
make the read me not specific to UCLH
skeating Nov 26, 2024
7fa6df1
I don't think this is relevant
skeating Nov 26, 2024
b623274
testing a theory ifthe syntax works
skeating Nov 26, 2024
6b6ff98
make files a neat list
skeating Nov 27, 2024
e4f3fcc
make the headers bold
skeating Nov 27, 2024
ae46eb3
Merge remote-tracking branch 'origin/main' into sk/joss-publication
skeating Dec 24, 2024
5f53f5f
Merge remote-tracking branch 'origin/main' into sk/joss-publication
skeating Jan 13, 2025
35fa6ef
files added by mistake
skeating Jan 27, 2025
664f477
Merge remote-tracking branch 'origin/main' into sk/joss-publication
skeating Jan 27, 2025
8007b79
removed text that went in by mistake
skeating Jan 27, 2025
11b8cfc
delete files committed by mistake
skeating Jan 28, 2025
a4f6358
synchronizing documentation from the-rolling-skeleton repository
skeating Feb 4, 2025
f6c4940
updating links and including references from the book of flower which…
skeating Feb 5, 2025
e819679
updating links
skeating Feb 6, 2025
6972590
Merge remote-tracking branch 'origin/main' into sk/doc_dir_sync
skeating Feb 6, 2025
160de83
updating links
skeating Feb 7, 2025
dc5cae0
updating links
skeating Feb 10, 2025
b92523e
adding read mes
skeating Feb 11, 2025
629035d
Merge remote-tracking branch 'origin/sk/doc_dir_sync' into sk/doc_dir…
skeating Feb 11, 2025
26b6e2d
adding read mes
skeating Feb 11, 2025
56d3477
Merge branch 'main' into sk/joss-publication
tomaroberts Mar 3, 2025
dad57db
Merge branch 'sk/doc_dir_sync' into sk/joss-publication
skeating Jun 4, 2025
94 changes: 78 additions & 16 deletions README.md
@@ -6,13 +6,19 @@
PIXL Image eXtraction Laboratory

`PIXL` is a system for extracting, linking and de-identifying DICOM imaging data, structured EHR data and free-text data from radiology reports at UCLH.
Please see the [rolling-skeleton](https://github.com/SAFEHR-data/the-rolling-skeleton/blob/main/docs/design/100-day-design.md) for more details.

PIXL is intended to run on one of the [GAEs (General Application Environments)](https://github.com/SAFEHR-data/Book-of-FlowEHR/blob/main/glossary.md#gaes) and comprises
several services orchestrated by [Docker Compose](https://docs.docker.com/compose/).
It comprises several services orchestrated by [Docker Compose](https://docs.docker.com/compose/).

<details><summary>UCLH SPECIFIC</summary>

PIXL is intended to run on one of the [GAEs (General Application Environments)](https://github.com/SAFEHR-data/Book-of-FlowEHR/blob/main/glossary.md#gaes).

To get access to the GAE, [see the documentation on Slab](https://uclh.slab.com/posts/gae-access-7hkddxap).
Please request access to Slab and add further details in a [new blank issue](https://github.com/SAFEHR-data/PIXL/issues/new).

Please request access to Slab and add further details in a [new blank issue](https://github.com/SAFEHR-data/PIXL/issues/new).

</details>


## Installation in production

@@ -66,7 +72,7 @@ destination.

Provides helper functions for de-identifying DICOM data

### PostgreSQL
### [PostgreSQL](./postgres/README.md)

RDBMS which stores DICOM metadata, application data and anonymised patient record data.

@@ -78,7 +84,7 @@ HTTP API to export files (parquet and DICOM) from UCLH to endpoints.

HTTP API to process messages from the `imaging` queue and populate the raw orthanc instance with images from PACS/VNA.

## Setup `PIXL` in GAE
## Setup `PIXL`

<details>
<summary>Click here to expand steps and configurations</summary>
@@ -208,7 +214,7 @@ These variables can be set in the `.env` file.
For testing, they can be set in the `test/.secrets.env` file.
For dev purposes find the `pixl-dev-secrets.env` note on LastPass for the necessary values.

If an Azure Keyvault hasn't been set up yet, follow [these instructions](./docs/setup/azure-keyvault.md).
At UCLH, if an Azure Keyvault hasn't been set up yet, follow [these instructions](./docs/setup/azure-keyvault.md).

A second Azure Keyvault is used to store hashing keys and salts for the `hasher` service.
This keyvault is configured with the following environment variables:
@@ -224,7 +230,7 @@ See the [hasher documentation](./hasher/README.md) for more information.

</details>

## Run `PIXL` in GAE
## Run `PIXL`

<details>
<summary>Click here to view detailed steps</summary>
@@ -284,6 +290,9 @@ test/resources/omop/public /*.parquet

### OMOP ES extract dir (input to PIXL)

>[!NOTE]
> OMOP ES is the tool used to extract Electronic Health Records that may be linked to images.

EXTRACT_DIR is the directory passed to `pixl populate` as the input `PARQUET_PATH` argument.

```
@@ -294,8 +303,8 @@ EXTRACT_DIR/public /*.parquet

### PIXL Export dir (PIXL intermediate)

The directory where PIXL will copy the public OMOP extract files (which now contain
the radiology reports) to.
The directory where PIXL will copy the public OMOP extract files and the radiology reports.
These files will subsequently be uploaded to the `parquet` destination specified in the
[project config](#3-configure-a-new-project).

@@ -316,10 +325,63 @@ FTPROOT/PROJECT_SLUG/EXTRACT_DATETIME/parquet/radiology/radiology.parquet
..............................................omop/public/*.parquet
```
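
As a rough illustration of the layout listed above, the following Python sketch builds the two destination paths. The function name `export_paths` and its parameters are hypothetical and not part of the PIXL code base. It assumes that the dotted line in the listing repeats the `FTPROOT/PROJECT_SLUG/EXTRACT_DATETIME/parquet/` prefix, and it treats `EXTRACT_DATETIME` as an opaque string because its format is not shown here.

```python
from pathlib import PurePosixPath


def export_paths(
    ftp_root: PurePosixPath, project_slug: str, extract_datetime: str
) -> dict[str, PurePosixPath]:
    """Build the destination paths for one export (hypothetical helper).

    Assumption: both the radiology file and the public OMOP files live under
    a shared `parquet` directory, as the listing above suggests.
    """
    parquet_dir = ftp_root / project_slug / extract_datetime / "parquet"
    return {
        # Single parquet file holding the radiology reports
        "radiology": parquet_dir / "radiology" / "radiology.parquet",
        # Directory receiving the public OMOP extract (*.parquet) files
        "omop_public": parquet_dir / "omop" / "public",
    }


paths = export_paths(PurePosixPath("FTPROOT"), "PROJECT_SLUG", "EXTRACT_DATETIME")
print(paths["radiology"])
```

`PurePosixPath` is used so that the sketch produces forward-slash paths on any platform; real code that touches the filesystem would more likely use `pathlib.Path`.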

## :octocat: Cloning repository
* Generate your SSH keys as suggested [here](https://docs.github.com/en/github/authenticating-to-github/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent)
* Clone the repository by typing (or copying) the following lines in a terminal
```
git clone git@github.com:SAFEHR-data/PIXL.git
```
## 'PIXL' Directory Contents

<details>
<summary>

<h3> Subdirectories with links to the relevant README </h3>

</summary>


[bin](./bin/README.md)

[cli](./cli/README.md)

[docker](./docker/README.md)

[docs](./docs/README.md)

[hasher](./hasher/README.md)

[orthanc](./orthanc/README.md)

[pixl_core](./pixl_core/README.md)

[pixl_dcmd](./pixl_dcmd/README.md)

[pixl_export](./pixl_export/README.md)

[pixl_imaging](./pixl_imaging/README.md)

[postgres](./postgres/README.md)

[projects](./projects/README.md)

[pytest-pixl](./pytest-pixl/README.md)

[schemas](./schemas/README.md)

[scripts](./scripts/README.md)

[test](./test/README.md)
</details>
<details>
<summary>

### Files

</summary>

| **Configuration** | **User docs** | **Housekeeping** |
| :--- | :--- | :--- |
| .env.sample | CODE_OF_CONDUCT.md | .renovaterc.json5 |
| .pre-commit-config.yaml | CONTRIBUTING.md | codecov.yml |
| docker-compose.yml | LICENSE | |
| mypy.ini | NOTICE | |
| pytest.ini | README.md | |
| ruff.toml | | |
| template_config.yaml | | |

</details>
18 changes: 18 additions & 0 deletions docs/README.md
@@ -0,0 +1,18 @@
## 'PIXL/docs' Directory Contents

<details>
<summary>
<h3> Subdirectories with links to the relevant README </h3>

</summary>

[archive](./archive/README.md)

[design](./design/README.md)

[developer](./developer/README.md)

[joss-publication](./joss-publication/README.md)

</details>

15 changes: 15 additions & 0 deletions docs/archive/PIXLv1/Considerations.md
@@ -0,0 +1,15 @@
## Risks and Considerations

### Technical Risks
The primary technical risk is overburdening the PACS & VNA and causing an adverse impact on the operational performance of these systems.
To mitigate this risk, queries will be managed with a task queue. The system will enforce rate limiting of any commands sent to the PACS & VNA with an adapted [token bucket](https://en.wikipedia.org/wiki/Token_bucket) algorithm which can be adjusted at runtime in response to system load. A [circuit breaker](https://en.wikipedia.org/wiki/Circuit_breaker_design_pattern) will wrap the retrieval processes and act as a fail-safe. Individual request retries will be subject to an [exponential backoff](https://en.wikipedia.org/wiki/Exponential_backoff) strategy.
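
The rate-limiting and retry ideas above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not PIXL's actual implementation: the class `TokenBucket`, the functions `backoff_delay` and `jittered_delay`, and all parameter names and default values are hypothetical. The "adapted" aspect is modelled here simply as a refill rate that can be changed while the bucket is in use. The circuit breaker is omitted for brevity.

```python
import random
import time


class TokenBucket:
    """Token bucket whose refill rate can be changed at runtime (illustrative)."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum number of stored tokens (burst size)
        self._clock = clock       # injectable clock, which makes testing easier
        self._tokens = capacity
        self._last = clock()

    def _refill(self) -> None:
        # Credit the tokens earned since the last update, up to the capacity.
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now

    def set_rate(self, rate: float) -> None:
        """Change the refill rate, e.g. to slow down when the PACS/VNA is busy."""
        self._refill()  # settle tokens earned at the old rate first
        self.rate = rate

    def try_acquire(self, tokens: float = 1.0) -> bool:
        """Consume `tokens` if available; return False rather than block."""
        self._refill()
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Capped exponential delay in seconds before retry `attempt` (0-based)."""
    return min(cap, base * 2**attempt)


def jittered_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Random delay between 0 and the capped exponential bound."""
    return random.uniform(0.0, backoff_delay(attempt, base, cap))
```

A worker would call `try_acquire()` before each command sent to the PACS or VNA and requeue the task when it returns `False`. A monitoring process could call `set_rate()` to throttle or relax the limit as system load changes.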


### Security Considerations
#### Inbound access to the Cloud Environment in Azure
It is expected that a VPN connection (or ExpressRoute connection) between the on-prem UCLH estate and Azure will not initially be available.
Point-to-point firewall restrictions and Azure access tokens will manage secure communication between PIXL and the DICOM service.

#### Outbound access
All outbound connections will be over HTTPS.
The existing proxy service will be relied upon to manage outbound access from the GAE.
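
One way client code could enforce the HTTPS-only rule is to check each URL before any request is made. The sketch below is an assumption for illustration only; the function name `require_https` is hypothetical and not taken from PIXL. Proxy configuration is deliberately left out because it depends on the environment.

```python
from urllib.parse import urlsplit


def require_https(url: str) -> str:
    """Return `url` unchanged if it is an HTTPS URL; raise ValueError otherwise."""
    parts = urlsplit(url)
    # Reject any other scheme, and URLs without a host component.
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"outbound URL must use HTTPS: {url!r}")
    return url


print(require_https("https://example.org/upload"))
```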