Introduces a PostgreSQL data-access layer based on TypeORM and wires its lifecycle into the existing DI container and graceful-shutdown flow. Before this PR, the sync state and layer data were held in memory and reset on every restart; now they are persisted in PostgreSQL.
The layer data schema uses native PostgreSQL LIST partitioning: there is a single logical parent table layer_objects partitioned by layer_name, and each configured layer gets its own physical partition table (layer_<layerName>) created automatically at startup. The application code is layer-agnostic - one entity, one repository - and Postgres routes rows to the correct partition based on the layer_name column.
**migrations/001_create_tables.sql** - bootstrap SQL creating:sync_state- shared table tracking per-layer sync status (layer_name,status,last_sequence,updated_at).layer_objects- LIST-partitioned parent table keyed bylayer_name, with composite PK(layer_name, id)and columnsgeom(PostGISgeometry(Polygon, 4326) NOT NULL) andproperties(JSONB).geomhas a GiST spatial index andCHECKconstraints enforcing validity (ST_IsValid) and world-extent (Box2D ... @ Box2D(...)). Requires thepostgisextension. The parent stores no rows - each layer's data lives in its own partition.- Per-layer partitions (
layer_<name>) are not in the migration; they are created at runtime byensureLayerPartitions()based onsync.layers.
**config/default.json** - newdbsection (host, port, database, username, password, ssl).**src/types/dbConfig.ts**-DbConfiginterface.**src/common/dbConfig.ts**-getDbConfig()helper, mirrors the existinggetSyncConfig()pattern on top of@map-colonies/config.
**src/dal/connection.ts**createDataSource()- builds a TypeORMDataSourcefromgetDbConfig()with two entities:SyncStateEntryandLayerObjectEntity.initializeDb(layers)- idempotent connect, called once on startup. Also runsensureLayerPartitions()so every layer declared in config has a physical partition ready.ensureLayerPartitions(ds, layers)- runsCREATE TABLE IF NOT EXISTS "layer_<name>" PARTITION OF layer_objects FOR VALUES IN ('<name>')for each configured layer.getDataSource()- safe accessor used by repositories.closeDb()- graceful disconnect, wired into the existingonSignalshutdown hook.
**src/dal/entities/syncState.ts**-SyncStateEntry@Entity('sync_state')class withlayerName,status,lastSequence,updatedAtcolumns (plus the existingSyncStatusenum).**src/dal/entities/layerObject.ts**- a singleLayerObjectEntitymapped to the partitioned parentlayer_objects:- Composite primary key
(layer_name, id)(required becauselayer_nameis the partition key). - Columns:
geom geometry(Polygon, 4326) NOT NULL(PostGIS) with a GiST spatial index and validity / world-extentCHECKconstraints,properties JSONB NOT NULL DEFAULT '{}',created_at TIMESTAMPTZ NOT NULL DEFAULT now(). getLayerPartitionName(layerName)helper returns the child-partition name (layer_<layerName>), used byensureLayerPartitions().LayerObjectdomain type for the external API.
- Composite primary key
**src/dal/entities/index.ts** - re-exports the entity class, the partition-name helper, and the types.
**src/dal/repositories/syncStateRepository.ts**- unchanged shape, backed by the sharedsync_statetable.**src/dal/repositories/layerDataRepository.ts**- layer-aware via thelayer_namecolumn, not via dynamic entities or table names:insertObjects(layerName, objects)- bulk insert intolayer_objectswithlayer_namestamped on every row; usesorIgnore()(ON CONFLICT DO NOTHING) so sync retries/replays are idempotent. Postgres routes each row to thelayer_<layerName>partition automatically;geomvalues coming in as GeoJSON Polygons are converted to PostGIS geometry by thepgdriver.deleteDeprecatedObjects(layerName, deletedIds)- batchDELETE FROM layer_objects WHERE layer_name = :layerName AND id IN (:...ids). Partition pruning limits the delete to the matching partition.
**src/containerConfig.ts** - readssync.layersviagetSyncConfig(), callsawait initializeDb(syncConfig.layers)during bootstrap, logs the connection target + active layer partitions, and callscloseDb()alongsidegetTracing().stop()in theonSignalshutdown hook.**src/handler/layerSyncHandler.ts**- awaits all now-async repository calls.**src/scheduler/syncManager.ts**-start()is nowasyncand awaits state initialization / reads.**src/index.ts**-void syncManager.start()to keep the fire-and-forget semantics.**src/types/syncState.ts**+**src/types/index.ts**- re-exportSyncStateEntryas a value (class) instead of a type, since TypeORM needs the class at runtime.
package.json- added:pg@^8.20.0typeorm@^0.3.28
To onboard a new layer (e.g. roads):
- Add it to
sync.layersinconfig/default.json(or the env-specific config):
"sync": { "layers": ["obstacles", "roads"], ... }- Restart the service. On startup,
ensureLayerPartitions()will run:
CREATE TABLE IF NOT EXISTS "layer_roads" PARTITION OF layer_objects FOR VALUES IN ('roads');- No code changes are required -
insertObjects('roads', ...)anddeleteDeprecatedObjects('roads', ...)already takelayerNameas a parameter, and Postgres routes writes tolayer_roadsbased on thelayer_namecolumn.
- Per-layer tables (one
@Entityeach): would force dynamic schema registration in TypeORM and dynamic table-name resolution in every query. Rejected - too much code complexity for an operational gain we can get for free. - Flat single table with a discriminator column: simple code, but one shared heap means one VACUUM cycle, shared bloat, shared indexes, and no per-layer
TRUNCATE/DROP. Hot layers starve cold ones in the page cache. - LIST-partitioned parent + per-layer partitions (this PR): one logical entity and one code path, but each layer is a real separate physical table on disk with its own heap, indexes, statistics, and VACUUM cycle. Queries that filter by
layer_nameget partition pruning;TRUNCATE layer_roads/DETACH PARTITION layer_trees/ per-partition backups are all trivial.
This matches the documented best-practice profile for LIST partitioning: a small, bounded set of discrete values (our layer names), a shared schema, and an access pattern that always filters on the partition key.
| File | Change |
|---|---|
migrations/001_create_tables.sql |
sync_state + LIST-partitioned layer_objects parent (partitions created at runtime) |
config/default.json |
+ db section |
src/types/dbConfig.ts |
new - DbConfig interface |
src/types/index.ts |
re-export DbConfig, SyncStateEntry as value |
src/types/syncState.ts |
re-export SyncStateEntry as value |
src/common/dbConfig.ts |
new - getDbConfig() |
src/dal/connection.ts |
new - DataSource lifecycle + ensureLayerPartitions() |
src/dal/entities/syncState.ts |
interface → @Entity class |
src/dal/entities/layerObject.ts |
single @Entity('layer_objects') with composite PK (layer_name, id) + getLayerPartitionName() helper |
src/dal/entities/index.ts |
export entity + partition-name helper + types |
src/dal/repositories/syncStateRepository.ts |
poolSizein-memory → TypeORM, async |
src/dal/repositories/layerDataRepository.ts |
partition-aware writes via layer_name column; parameterized JSONB merge |
src/handler/layerSyncHandler.ts |
await async repo calls |
src/scheduler/syncManager.ts |
start() → async |
src/index.ts |
void syncManager.start() |
src/containerConfig.ts |
initializeDb(syncConfig.layers) on boot, closeDb() on signal |
package.json |
+ pg, typeorm |
- Apply
migrations/001_create_tables.sqlagainst the target PostgreSQL database (createssync_stateand the partitionedlayer_objectsparent). Per-layer partitions are created automatically on first startup. - Populate the
dbsection ofconfig/default.json(or the environment-specific config file) with the target host, port, database, username, password, ssl. - Set
sync.layersto the list of layers you want to sync. - Deploy -
initializeDb()runs on startup, ensures each layer's partition exists, and the service fails fast if the DB is unreachable. OnSIGINT/SIGTERM, Terminus drains the sync loop and closes theDataSourceviacloseDb().
- Connection settings are read through
@map-colonies/config(same mechanism assync) - no newprocess.envreads were introduced. SyncStateEntryis exported as a value (class) because TypeORM needs the class reference at runtime (e.g.getRepository(SyncStateEntry)).- The partition key
layer_nameis part of the primary key (required by Postgres for partitioned tables), so(layer_name, id)is the effective uniqueness constraint across the whole logical dataset. - Layer names come from trusted config (
sync.layers); they are interpolated into theCREATE TABLE … PARTITION OF …DDL inensureLayerPartitions()- keepsync.layersout of any user-controlled input path. - No HTTP routes were added; the service remains a background sync worker behind Terminus and Express middleware.