14 changes: 0 additions & 14 deletions .github/workflows/pull_request.yaml
@@ -32,20 +32,6 @@ jobs:
eslint_extensions: ts
tsc: true

openapi-lint:
name: Run OpenAPI lint Check
runs-on: ubuntu-latest

steps:
- name: Check out TS Project Git repository
uses: actions/checkout@v5

- name: Init Nodejs
uses: MapColonies/shared-workflows/actions/init-npm@init-npm-v1

- name: OpenAPI Lint Checks
run: npx @redocly/cli lint --format=github-actions openapi3.yaml

helm-lint:
name: Run Helm lint Check
runs-on: ubuntu-latest
2 changes: 0 additions & 2 deletions .vscode/extensions.json
@@ -5,10 +5,8 @@
// List of extensions which should be recommended for users of this workspace.
"recommendations": [
"redhat.vscode-yaml",
"Redocly.openapi-vs-code",
"esbenp.prettier-vscode",
"Tim-Koehler.helm-intellisense",
"42Crunch.vscode-openapi",
"dbaeumer.vscode-eslint",
"ms-azuretools.vscode-docker",
"ms-kubernetes-tools.vscode-kubernetes-tools",
1 change: 0 additions & 1 deletion .vscode/launch.json
@@ -27,7 +27,6 @@
"TELEMETRY_METRICS_ENABLED": "false",
"TELEMETRY_TRACING_URL": "http://localhost:55681/v1/trace",
"TELEMETRY_METRICS_URL": "http://localhost:55681/v1/metrics"
// "OPENAPI_FILE_PATH": "./openapi3.yaml"
}
}
]
24 changes: 7 additions & 17 deletions README.md
@@ -1,26 +1,21 @@
# Map Colonies typescript service template
# Sync Layer Server

----------------------------------

This is a basic repo template for building new MapColonies web services in Typescript.

> [!IMPORTANT]
> To regenerate the types on openapi change run the command `npm run generate:openapi-types`.

> [!WARNING]
> After creating a new repo based on this template, you should delete the CODEOWNERS file.
Service that continuously synchronizes geospatial layer data from a third-party GraphQL API into a remote PostgreSQL database.

See [SYNC.md](./SYNC.md) for a detailed description of the sync module architecture and lifecycle.

## Development
When in development you should use the command `npm run start:dev`. The main benefits are that it enables offline mode for the config package, and source map support for NodeJS errors.

### Template Features:
### Features:

- eslint configuration by [@map-colonies/eslint-config](https://github.com/MapColonies/eslint-config)

- prettier configuration by [@map-colonies/prettier-config](https://github.com/MapColonies/prettier-config)

- jest
- vitest

- .nvmrc

@@ -32,9 +27,7 @@ When in development you should use the command `npm run start:dev`. The main ben

- logging by [@map-colonies/js-logger](https://github.com/MapColonies/js-logger)

- OpenAPI request validation

- config load with [node-config](https://www.npmjs.com/package/node-config)
- config load with [@map-colonies/config](https://www.npmjs.com/package/@map-colonies/config)

- Tracing and metrics by [@map-colonies/telemetry](https://github.com/MapColonies/telemetry)

@@ -58,9 +51,6 @@ When in development you should use the command `npm run start:dev`. The main ben

- snyk

## API
Checkout the OpenAPI spec [here](/openapi3.yaml)

## Installation

Install deps with npm
@@ -83,7 +73,7 @@ Go to the project directory

```bash

cd my-project
cd sync-layer-server

```

112 changes: 112 additions & 0 deletions SYNC.md
@@ -0,0 +1,112 @@
# Sync Layer Module

## Overview

The sync module continuously synchronizes geospatial layer data from a third-party GraphQL API into a remote PostgreSQL database. It fetches data in pages, inserts new objects, updates deprecated ones, and tracks progress per layer using offset-based pagination.

## Architecture

```
┌──────────────────┐
│   SyncManager    │  Scheduler loop (min-heap by next run time)
│                  │  Manages lifecycle: start / stop
└───────┬──────────┘
        │
        ▼
┌──────────────────┐
│ LayerSyncHandler │  Orchestrates a single page fetch + process cycle
│                  │  Coordinates all repositories and the client
└──┬──────┬────────┘
   │      │
   ▼      ▼
┌──────┐ ┌───────────────────┐ ┌───────────────────┐
│Third │ │ SyncState         │ │ LayerData         │
│Party │ │ Repository        │ │ Repository        │
│Client│ │ (offset, status)  │ │ (insert/update)   │
└──────┘ └───────────────────┘ └───────────────────┘
```
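
The "min-heap by next run time" ordering in the diagram could look roughly like the sketch below. It is not the module's actual code: `ScheduleHeap` and the `layerName` / `nextRunAt` fields are assumed names, and the real `ScheduleEntry` interface in `src/types/scheduler.ts` may be shaped differently.

```typescript
// Hypothetical shape; the real ScheduleEntry in src/types/scheduler.ts may differ.
interface ScheduleEntry {
  layerName: string;
  nextRunAt: number; // epoch milliseconds
}

// Binary min-heap keyed on nextRunAt, so the earliest-due layer is always at the root.
class ScheduleHeap {
  private readonly items: ScheduleEntry[] = [];

  public get size(): number {
    return this.items.length;
  }

  public push(entry: ScheduleEntry): void {
    this.items.push(entry);
    let i = this.items.length - 1;
    // Sift the new entry up until its parent is due no later than it is.
    while (i > 0) {
      const parent = Math.floor((i - 1) / 2);
      if (this.items[parent].nextRunAt <= this.items[i].nextRunAt) break;
      [this.items[parent], this.items[i]] = [this.items[i], this.items[parent]];
      i = parent;
    }
  }

  public pop(): ScheduleEntry | undefined {
    if (this.items.length === 0) return undefined;
    const top = this.items[0];
    const last = this.items.pop() as ScheduleEntry;
    if (this.items.length > 0) {
      // Move the last entry to the root and sift it down.
      this.items[0] = last;
      let i = 0;
      for (;;) {
        const left = 2 * i + 1;
        const right = left + 1;
        let smallest = i;
        if (left < this.items.length && this.items[left].nextRunAt < this.items[smallest].nextRunAt) {
          smallest = left;
        }
        if (right < this.items.length && this.items[right].nextRunAt < this.items[smallest].nextRunAt) {
          smallest = right;
        }
        if (smallest === i) break;
        [this.items[smallest], this.items[i]] = [this.items[i], this.items[smallest]];
        i = smallest;
      }
    }
    return top;
  }
}
```

With this structure, both `push` and `pop` cost O(log n) in the number of layers, and the scheduler only ever needs to sleep until the root entry's `nextRunAt`.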

## File Structure

```
src/
├── scheduler/
│   └── syncManager.ts              # Scheduler loop with min-heap priority queue
├── handler/
│   └── layerSyncHandler.ts         # Single-page fetch and process orchestration
├── externalClients/
│   ├── layersClient/
│   │   └── layersClient.ts         # GraphQL client for the third-party API
│   └── layersClientModel.ts        # GetLayerPage query string
├── dal/
│   └── repositories/
│       ├── syncStateRepository.ts  # Tracks sync offset and status per layer
│       └── layerDataRepository.ts  # Inserts new objects / updates deprecated ones
├── common/
│   ├── syncConfig.ts               # Static config (layers, intervals, page size, URL)
│   └── ...                         # Shared infra (config, constants, DI, tracing)
└── types/
    ├── index.ts                    # Barrel export
    ├── syncConfig.ts               # SyncConfig interface
    ├── syncState.ts                # SyncStatus enum + SyncStateEntry interface
    ├── scheduler.ts                # ScheduleEntry interface
    └── thirdParty.ts               # LayerObject, DeprecatedObject, ThirdPartyResponse
```

## How It Works

### Sync Lifecycle

1. **Startup** - `SyncManager.start()` reads the configured layers, initializes sync state for each (status: `SYNCING`, offset: `0`), and pushes them into a min-heap scheduler.

2. **Scheduler Loop** - The loop pops the next due layer, sleeps until its scheduled time, then calls `fetchAndSyncLayerPage()`.

3. **Page Fetch** - `layerClient.fetchPage()` sends a GraphQL query to the third-party API requesting up to `pageSize` objects starting from the current offset.

4. **Data Processing** - The handler processes the response and delegates the writes to `layerDataRepository`:
   - `insertObjects()` - Batch upserts new/updated geospatial objects into the layer's table in the remote DB.
   - `updateDeprecatedObjects()` - Batch merges updated fields into existing objects in the remote DB.

5. **State Update** - `syncStateRepository` advances the offset to `nextRecord`.

6. **Status Transition** - When a page returns 0 objects during `SYNCING`, the layer transitions to `READY` (initial sync complete).

7. **Re-schedule** - The layer is pushed back into the heap with:
   - `syncIntervalMs` (500 ms) while `SYNCING` (fast initial catch-up)
   - `pollIntervalMs` (10 min) once `READY` (periodic polling for changes)
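
Steps 5 to 7 can be sketched as a single pure function. This is only an illustration under stated assumptions: `applyPage`, `PageResult`, and `objectCount` are hypothetical names, a string union stands in for the `SyncStatus` enum, and it assumes an empty page reports a `nextRecord` equal to the current offset.

```typescript
// Stand-in for the SyncStatus enum described in src/types/syncState.ts.
type SyncStatus = 'SYNCING' | 'READY';

interface SyncStateEntry {
  status: SyncStatus;
  offset: number;
}

// Hypothetical summary of one fetched page.
interface PageResult {
  objectCount: number; // number of objects returned in the page
  nextRecord: number; // offset to request on the next run
}

interface Intervals {
  syncIntervalMs: number;
  pollIntervalMs: number;
}

// Advances the offset, flips SYNCING -> READY on an empty page, and returns
// the delay before the layer should be pushed back into the heap.
function applyPage(
  state: SyncStateEntry,
  page: PageResult,
  intervals: Intervals
): { state: SyncStateEntry; delayMs: number } {
  const status: SyncStatus = state.status === 'SYNCING' && page.objectCount === 0 ? 'READY' : state.status;
  const delayMs = status === 'SYNCING' ? intervals.syncIntervalMs : intervals.pollIntervalMs;
  return { state: { status, offset: page.nextRecord }, delayMs };
}
```

Keeping the transition logic free of I/O like this makes it easy to unit test separately from the client and the repositories.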

### Configuration

| Property            | Default                           | Description                                      |
|---------------------|-----------------------------------|--------------------------------------------------|
| `layers`            | `['obstacles']`                   | Layer names to sync                              |
| `syncIntervalMs`    | `500`                             | Delay between pages during initial sync          |
| `pollIntervalMs`    | `600000` (10 min)                 | Delay between polls after initial sync completes |
| `pageSize`          | `500`                             | Max records requested per page                   |
| `thirdPartyBaseUrl` | `http://mock-third-party/graphql` | GraphQL endpoint URL                             |
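
As a rough illustration of the table above, the config could be typed and validated as follows. The field names and defaults come from the table; `DEFAULT_SYNC_CONFIG` and `resolveSyncConfig` are hypothetical helpers, not part of the module.

```typescript
interface SyncConfig {
  layers: string[];
  syncIntervalMs: number;
  pollIntervalMs: number;
  pageSize: number;
  thirdPartyBaseUrl: string;
}

// Defaults copied from the configuration table.
const DEFAULT_SYNC_CONFIG: SyncConfig = {
  layers: ['obstacles'],
  syncIntervalMs: 500,
  pollIntervalMs: 600000,
  pageSize: 500,
  thirdPartyBaseUrl: 'http://mock-third-party/graphql',
};

// Hypothetical helper: merges overrides onto the defaults and rejects
// values that would make the scheduler or the paging misbehave.
function resolveSyncConfig(overrides: Partial<SyncConfig> = {}): SyncConfig {
  const config: SyncConfig = { ...DEFAULT_SYNC_CONFIG, ...overrides };
  if (config.layers.length === 0) {
    throw new Error('sync.layers must contain at least one layer');
  }
  for (const key of ['syncIntervalMs', 'pollIntervalMs', 'pageSize'] as const) {
    if (!Number.isInteger(config[key]) || config[key] <= 0) {
      throw new Error(`sync.${key} must be a positive integer`);
    }
  }
  return config;
}
```

For example, `resolveSyncConfig({ pageSize: 100 })` keeps every default except the page size, while `resolveSyncConfig({ pageSize: 0 })` throws.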

## What Still Needs to Be Done

### Remote Database Integration
- [ ] **syncStateRepository** - Persist sync state (offset, status) to a `sync_state` table in the remote PostgreSQL database instead of the in-memory `Map`. Currently resets on restart.
- [ ] **layerDataRepository** - Implement actual SQL queries for `INSERT ... ON CONFLICT` (upsert) and `UPDATE` with JSONB merge against the remote DB. Tables should match layer names.
- [ ] **DB connection** - Set up a connection pool (e.g., `pg` / `knex` / `typeorm`) to the remote PostgreSQL instance with connection string from config/environment.
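
One possible shape for the pending SQL is sketched below as pure query builders that return PostgreSQL-style positional parameters. Everything here is an assumption for illustration: the `id` and `properties` (JSONB) columns, the `LayerObject` fields, and the function names are not taken from the repository. The table name is validated because identifiers cannot be passed as bound parameters.

```typescript
// Hypothetical object shape; the real LayerObject type may differ.
interface LayerObject {
  id: string;
  properties: Record<string, unknown>;
}

interface ParameterizedQuery {
  text: string;
  values: unknown[];
}

// Table names are interpolated into the SQL text, so only plain identifiers are allowed.
function assertSafeIdentifier(name: string): void {
  if (!/^[a-z_][a-z0-9_]*$/.test(name)) {
    throw new Error(`invalid table name: ${name}`);
  }
}

// Builds one multi-row INSERT ... ON CONFLICT statement for a page of objects.
function buildUpsertQuery(table: string, objects: LayerObject[]): ParameterizedQuery {
  assertSafeIdentifier(table);
  if (objects.length === 0) throw new Error('nothing to upsert');
  const values: unknown[] = [];
  const rows = objects.map((obj, i) => {
    values.push(obj.id, JSON.stringify(obj.properties));
    return `($${2 * i + 1}, $${2 * i + 2}::jsonb)`;
  });
  const text =
    `INSERT INTO "${table}" (id, properties) VALUES ${rows.join(', ')} ` +
    'ON CONFLICT (id) DO UPDATE SET properties = EXCLUDED.properties';
  return { text, values };
}

// Merges changed fields into the stored JSONB using the || concatenation operator.
function buildDeprecatedUpdateQuery(table: string, id: string, changes: Record<string, unknown>): ParameterizedQuery {
  assertSafeIdentifier(table);
  return {
    text: `UPDATE "${table}" SET properties = properties || $2::jsonb WHERE id = $1`,
    values: [id, JSON.stringify(changes)],
  };
}
```

The returned `text` and `values` pair can be handed to whichever client or pool is eventually chosen, which keeps the SQL testable without a database connection.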

### Configuration
- [ ] **syncConfig** - Load config from the application config provider (e.g., `@map-colonies/config` / environment variables) instead of hardcoded values.
- [ ] **thirdPartyBaseUrl** - Set the real third-party GraphQL endpoint URL.

### GraphQL
- [ ] **layersClientModel.ts** - Verify and adjust the `GetLayerPage` GraphQL query to match the actual third-party API contract.

### Error Handling & Resilience
- [ ] Handle partial page failures (some objects succeed, some fail).

### Observability
- [ ] Add metrics (pages fetched, objects inserted, errors) via `prom-client`.
- [ ] Add OpenTelemetry spans for tracing sync operations.

### Testing
- [ ] Unit tests for `layerSyncHandler` (mock the repositories and client).
- [ ] Unit tests for `syncManager` scheduling logic.
- [ ] Integration tests with a real database.
13 changes: 7 additions & 6 deletions config/default.json
@@ -1,10 +1,4 @@
{
"openapiConfig": {
"filePath": "./openapi3.yaml",
"basePath": "/docs",
"rawPath": "/api",
"uiPath": "/api"
},
"telemetry": {
"metrics": {},
"tracing": {
@@ -32,5 +26,12 @@
"options": null
}
}
},
"sync": {
"layers": ["obstacles"],
"syncIntervalMs": 500,
"pollIntervalMs": 600000,
"pageSize": 500,
"thirdPartyBaseUrl": "http://mock-third-party/graphql"
}
}
106 changes: 0 additions & 106 deletions openapi3.yaml

This file was deleted.
