5 changes: 4 additions & 1 deletion appinfo/routes.php
@@ -295,7 +295,10 @@
['name' => 'transition#transition', 'url' => '/api/objects/{id}/transition', 'verb' => 'POST', 'requirements' => ['id' => '[^/]+']],
['name' => 'transition#availableActions', 'url' => '/api/objects/{id}/available-actions', 'verb' => 'GET', 'requirements' => ['id' => '[^/]+']],

// Aggregations sugar endpoint.
// Aggregations — ad-hoc time-bucket primitive (must be ordered
// BEFORE the {name} wildcard so /timeseries literal matches first).
['name' => 'aggregation#timeseries', 'url' => '/api/objects/aggregations/{register}/{schema}/timeseries', 'verb' => 'GET'],
// Aggregations sugar endpoint — named annotation surface.
['name' => 'aggregation#aggregate', 'url' => '/api/objects/aggregations/{register}/{schema}/{name}', 'verb' => 'GET'],

// Contacts matching API — used by ContactsMenuProvider + mail-sidebar.
150 changes: 150 additions & 0 deletions docs/technical/aggregation-api.md
@@ -0,0 +1,150 @@
# Aggregation API

OpenRegister exposes **two** aggregation surfaces. Pick the right one for your use case:

| Surface | Surface owner | When to use |
|---|---|---|
| **Named declarative** — `x-openregister-aggregations` schema annotation | App author / schema author | KPI tiles, business-rule counts, anything the app owns and ships with its register. Cached for 60s. |
| **Runtime ad-hoc** — REST `/api/objects/aggregations/{register}/{schema}/timeseries` + GraphQL `groupBy` | Client (per-request) | Dashboard charts, ad-hoc bucketing, "let the user pick a date range". No cache. |

This page documents the **ad-hoc primitive** (added by the `add-time-bucket-aggregation` change). For the named surface, see the `x-openregister-aggregations` documentation.

## When to use each

The named declarative surface is the right home for behaviours the **schema author** controls — KPIs, counts, business-rule rollups. Those are part of the app's contract and live in `lib/Settings/{app}_register.json`.

The ad-hoc primitive is the right home for behaviours the **client** controls — the user picks a date range, the dashboard widget picks the bucketing interval, the chart picks the metric. None of that belongs in the schema register; it's request-scoped.

A rule of thumb: if you'd hard-code the metric and field in the dashboard's source code, use the named surface. If the user gets to pick them at runtime, use the ad-hoc surface.

## REST surface

### Endpoint

```
GET /api/objects/aggregations/{register}/{schema}/timeseries
```

### Query parameters

| Param | Required | Notes |
|---|---|---|
| `field` | yes | The field to group / bucket on. MUST be a declared property of `{schema}` OR one of `_created`, `_updated`, `_deleted_at`. |
| `interval` | no | One of `MINUTE`, `HOUR`, `DAY`, `WEEK`, `MONTH`, `QUARTER`, `YEAR`. When set, the field is time-bucketed via Postgres `date_trunc()`. When absent, the field is grouped categorically. |
| `from` | required when `interval` set | ISO-8601 lower bound, inclusive. |
| `to` | required when `interval` set | ISO-8601 upper bound, exclusive. |
| `metric` | no | One of `count`, `sum`, `avg`, `min`, `max`. Default `count`. |
| `metricField` | required when `metric != count` | Field to aggregate over. MUST be a declared schema property. |
| `filter[...]` | no | Reuses the existing object-collection filter vocabulary (`filter[status]=active`, `filter[duration][gte]=10`). |

### Sub-day intervals require date-time fields

Bucketing by `MINUTE` or `HOUR` requires the field's JSON-Schema `format` to be `date-time` (not `date`). A `date`-only field can only be bucketed by `DAY`, `WEEK`, `MONTH`, `QUARTER`, or `YEAR`. The endpoint returns `400 Bad Request` if the constraint is violated.
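
The parameter rules above can be pre-checked on the client before a request is sent. The following is a minimal sketch, not the server's `TimeseriesRequestValidator`: the function name `check_timeseries_params` and the `field_format` argument are hypothetical, and the check that `field` is a declared schema property is left to the server.

```python
from typing import List, Optional

SUB_DAY = {"MINUTE", "HOUR"}
INTERVALS = SUB_DAY | {"DAY", "WEEK", "MONTH", "QUARTER", "YEAR"}
METRICS = {"count", "sum", "avg", "min", "max"}


def check_timeseries_params(params: dict, field_format: Optional[str] = None) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid.

    `field_format` is the JSON-Schema `format` of the bucketing field
    ("date", "date-time" or None). Looking it up is the caller's job
    (hypothetical helper, not part of the documented API).
    """
    problems = []
    if not params.get("field"):
        problems.append("field is required")

    interval = params.get("interval")
    if interval is not None:
        if interval not in INTERVALS:
            problems.append(f"unknown interval: {interval}")
        # Both bounds become mandatory as soon as an interval is given.
        for bound in ("from", "to"):
            if not params.get(bound):
                problems.append(f"{bound} is required when interval is set")
        # MINUTE and HOUR only make sense on date-time fields.
        if interval in SUB_DAY and field_format != "date-time":
            problems.append(f"{interval} needs a date-time field")

    metric = params.get("metric", "count")
    if metric not in METRICS:
        problems.append(f"unknown metric: {metric}")
    elif metric != "count" and not params.get("metricField"):
        problems.append("metricField is required when metric is not count")
    return problems
```

A non-empty result corresponds to the cases the endpoint rejects with `400 Bad Request`; the server remains the authority on what is valid.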

### Response shape

```json
{
"groups": [
{ "key": "2026-05-21T00:00:00Z", "value": 42 },
{ "key": "2026-05-22T00:00:00Z", "value": 17 }
],
"backend": "postgres",
"cached": false
}
```

- `key`: bucket label. For `interval`-bucketed queries this is an ISO-8601-UTC string at the start of the bucket. For categorical groupBy it's the value of the groupBy field.
- `value`: the aggregated metric (always a number; an integer for `count`, a float for other metrics).
- `backend`: `"postgres"` (native `date_trunc` path) or `"php-fallback"` (non-Postgres environments).
- `cached`: always `false` on the ad-hoc path. Caching is tracked in [issue #1610](https://github.com/ConductionNL/openregister/issues/1610).

### Empty buckets

Buckets with zero rows are **omitted** from the response — `GROUP BY` does not emit empty groups. The client fills empties at render time. See [issue #1607](https://github.com/ConductionNL/openregister/issues/1607) for cumulative / windowed series.
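
One way a client might fill the omitted buckets is sketched below for the `DAY` interval. The helper names `parse_utc` and `fill_daily_gaps` are hypothetical, and the sketch assumes `from` is already aligned to a UTC day boundary. Filling with `0` suits `count`; for `avg`, `min` and `max` a gap marker such as `None` is probably more honest.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(value: str) -> datetime:
    # Replace the trailing "Z" so fromisoformat() also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00")).astimezone(timezone.utc)


def fill_daily_gaps(groups: list, start: str, end: str, empty=0) -> list:
    """Return one bucket per UTC day in [start, end), using `empty` where the API omitted a bucket."""
    seen = {parse_utc(g["key"]): g["value"] for g in groups}
    cursor = parse_utc(start)
    stop = parse_utc(end)
    filled = []
    while cursor < stop:  # `to` is exclusive, matching the endpoint
        filled.append({
            "key": cursor.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "value": seen.get(cursor, empty),
        })
        cursor += timedelta(days=1)
    return filled
```

Passing the same `from` and `to` values that were sent in the request keeps the filled series aligned with the requested window.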

### Status codes

| Code | When |
|---|---|
| `200` | Happy path. |
| `400` | Validation failure (unknown field, bad interval, missing bounds, etc.). |
| `403` | Caller lacks `list` permission on the schema. |
| `404` | Register or schema not found. |

### Example

```bash
curl -s 'http://localhost:8080/index.php/apps/openregister/api/objects/aggregations/openconnector/calllogs/timeseries?field=created&interval=DAY&from=2026-05-01T00:00:00Z&to=2026-05-22T00:00:00Z' \
-u admin:admin \
-H 'OCS-APIRequest: true' \
| jq .
```
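
The same request URL can be assembled programmatically. The sketch below only builds the URL and sends nothing; `timeseries_url` and `BASE` are illustrative names, and the base URL is the local development address from the curl example. Nested filters are flattened into the `filter[field][op]=value` form described in the parameter table.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Local development base URL taken from the curl example above.
BASE = "http://localhost:8080/index.php/apps/openregister"


def timeseries_url(register: str, schema: str, params: dict, filters: Optional[dict] = None) -> str:
    """Build the ad-hoc timeseries URL, flattening filters into filter[...] keys."""
    query = dict(params)
    for name, condition in (filters or {}).items():
        if isinstance(condition, dict):
            # {"duration": {"gte": 10}} -> filter[duration][gte]=10
            for op, value in condition.items():
                query[f"filter[{name}][{op}]"] = value
        else:
            # {"status": "active"} -> filter[status]=active
            query[f"filter[{name}]"] = condition
    path = "/api/objects/aggregations/{}/{}/timeseries".format(
        quote(register, safe=""), quote(schema, safe="")
    )
    return f"{BASE}{path}?{urlencode(query)}"
```

`urlencode` percent-encodes the square brackets and the colons in timestamps; PHP decodes them back into the nested `filter` array on the server side.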

## GraphQL surface

Every auto-generated list query accepts an optional `groupBy: GroupByInput` argument. When supplied, the connection result includes a non-null `groups: [GroupBucket!]` field.

### Types (auto-generated)

```graphql
input GroupByInput {
field: String!
interval: TimeInterval
from: String # required when interval is set
to: String # required when interval is set
metric: AggregationMetric = COUNT
metricField: String # required when metric != COUNT
}

enum TimeInterval { MINUTE HOUR DAY WEEK MONTH QUARTER YEAR }
enum AggregationMetric { COUNT SUM AVG MIN MAX }
type GroupBucket { key: String! value: Float! }
```

### Example query

```graphql
query CallsPerDay {
calllogs(
filter: { status: "error" }
groupBy: {
field: "created"
interval: DAY
from: "2026-05-01T00:00:00Z"
to: "2026-05-22T00:00:00Z"
}
) {
totalCount
groups {
key
value
}
}
}
```

`totalCount` is the size of the filtered set; the sum of `groups[*].value` equals `totalCount` when `metric: COUNT`.

When the client does not request `groupBy`, the `groups` field is `null` (not an empty array — `null` means "no aggregation requested").
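
A client consuming the connection therefore has to treat `null` and an empty list differently. The sketch below, with hypothetical helper names, works on the already-decoded `calllogs` object from a response and also checks the `COUNT` invariant stated above.

```python
from typing import Optional


def count_buckets(connection: dict) -> Optional[dict]:
    """Map bucket key -> count, or None when no groupBy was requested."""
    groups = connection.get("groups")
    if groups is None:
        return None  # no aggregation requested; distinct from "no rows matched"
    # GroupBucket.value is declared as Float, so COUNT values may arrive as 42.0.
    return {bucket["key"]: int(bucket["value"]) for bucket in groups}


def counts_match_total(connection: dict) -> bool:
    """Check that the per-bucket counts add up to totalCount (metric COUNT only)."""
    counts = count_buckets(connection)
    return counts is not None and sum(counts.values()) == connection["totalCount"]
```

An empty dictionary from `count_buckets` means an aggregation ran and matched nothing, which a chart can render as an empty series rather than as "no data requested".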

### Validation errors

Validation problems surface as GraphQL field-errors on the `groups` field. The rest of the connection (edges, pageInfo, totalCount, facets) still resolves normally.
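
On the client, this partial-success shape might be handled as sketched below. It assumes the server follows the GraphQL specification's response format, where each entry in `errors` carries a `path` such as `["calllogs", "groups"]`; the helper name and the error message used in the usage note are illustrative and not taken from the implementation.

```python
from typing import List, Optional, Tuple


def split_group_errors(response: dict, query_field: str) -> Tuple[Optional[dict], List[str]]:
    """Return (connection, messages of errors raised on <query_field>.groups)."""
    errors = response.get("errors") or []
    group_errors = [
        err.get("message", "")
        for err in errors
        # Compare only the first two path segments: the list query and `groups`.
        if list(err.get("path") or [])[:2] == [query_field, "groups"]
    ]
    connection = (response.get("data") or {}).get(query_field)
    return connection, group_errors
```

For a response whose `errors` array holds one entry with path `["calllogs", "groups"]`, the function returns the still-usable connection (with `totalCount` and the other fields intact) together with that entry's message, so a widget can show the list while flagging the failed aggregation.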

## Performance notes

- **Postgres index**: for any field commonly used as a bucketing target (`created`, `updated`, custom date columns), declare a btree index on the magic-table column. `date_trunc()` against an indexed timestamp column is sub-50ms on tens of millions of rows.
- **Row-level RBAC**: the multi-tenant predicate (`_organisation = ?`) and the schema's `PermissionHandler::canRead()` verdict both apply BEFORE bucketing. Aggregations cannot leak rows the caller could not read row-by-row.
- **Non-Postgres fallback**: SQLite and MySQL fall through to the PHP-side bucketer (`backend: "php-fallback"`). Correct, but slow on tables > 10k rows — the row cap is 10 000 and `truncated: true` is set when exceeded. Native MySQL / SQLite bucketing is tracked in [issue #1609](https://github.com/ConductionNL/openregister/issues/1609).
- **No cache**: ad-hoc queries hit the database on every request. Caching is tracked in [issue #1610](https://github.com/ConductionNL/openregister/issues/1610).
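
To make the bucketing step concrete, the sketch below emulates what `date_trunc()` does for each supported interval and then counts rows per bucket in application code. It illustrates the general technique only and is not the PHP fallback shipped in this change; the function names are hypothetical, and all timestamps are assumed to be timezone-aware and bucketed in UTC.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def bucket_start(moment: datetime, interval: str) -> datetime:
    """Truncate an aware datetime to the start of its UTC bucket."""
    moment = moment.astimezone(timezone.utc)
    if interval == "MINUTE":
        return moment.replace(second=0, microsecond=0)
    if interval == "HOUR":
        return moment.replace(minute=0, second=0, microsecond=0)
    midnight = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    if interval == "DAY":
        return midnight
    if interval == "WEEK":
        # date_trunc('week', ...) snaps to Monday; weekday() is 0 on Monday.
        return midnight - timedelta(days=midnight.weekday())
    if interval == "MONTH":
        return midnight.replace(day=1)
    if interval == "QUARTER":
        first_month = 3 * ((midnight.month - 1) // 3) + 1
        return midnight.replace(month=first_month, day=1)
    if interval == "YEAR":
        return midnight.replace(month=1, day=1)
    raise ValueError(f"unknown interval: {interval}")


def count_by_bucket(timestamps, interval: str) -> list:
    """Group timestamps by bucket start and emit {key, value} pairs in time order."""
    counts = Counter(bucket_start(ts, interval) for ts in timestamps)
    return [
        {"key": start.strftime("%Y-%m-%dT%H:%M:%SZ"), "value": n}
        for start, n in sorted(counts.items())
    ]
```

Because every row has to be loaded and truncated in application code, the cost grows linearly with the row count, which is consistent with the row cap described for the fallback path.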

## Non-goals (deferred)

| Topic | Issue |
|---|---|
| Multi-field groupBy (`groupBy: [status, priority]`) | [#1606](https://github.com/ConductionNL/openregister/issues/1606) |
| Running / cumulative series | [#1607](https://github.com/ConductionNL/openregister/issues/1607) |
| Multi-metric in one request (`count` + `sum`) | [#1608](https://github.com/ConductionNL/openregister/issues/1608) |
| Native MySQL / SQLite bucketing | [#1609](https://github.com/ConductionNL/openregister/issues/1609) |
| Caching of ad-hoc queries | [#1610](https://github.com/ConductionNL/openregister/issues/1610) |
98 changes: 93 additions & 5 deletions lib/Controller/AggregationController.php
@@ -3,11 +3,27 @@
/**
* OpenRegister AggregationController
*
* Sugar HTTP entry point for the x-openregister-aggregations annotation.
 * HTTP entry point for the two aggregation surfaces OpenRegister exposes:
*
* - {@see aggregate()} — named-annotation surface backed by the
* `x-openregister-aggregations` block on a schema. Schema-author
* declared, immutable per release. Original surface.
* - {@see timeseries()} — ad-hoc surface where the client supplies
* the field, optional bucketing interval, and bounds at request
* time. Added by `add-time-bucket-aggregation`. Backs the
* nextcloud-vue `CnChartWidget.dataSource` bucket shorthand.
*
* Both paths share `AggregationRunner` for RBAC + multi-tenant
* gating + Postgres / fallback dispatch. The ad-hoc path does not
* consult `AggregationCache` (its key shape is keyed on the named
* annotation — extending it is tracked in issue #1610).
*
* @category Controller
* @package OCA\OpenRegister\Controller
*
* SPDX-License-Identifier: EUPL-1.2
* SPDX-FileCopyrightText: 2026 Conduction B.V. <dev@conduction.nl>
*
* @author Conduction Development Team <dev@conduction.nl>
* @copyright 2026 Conduction B.V.
* @license EUPL-1.2 https://joinup.ec.europa.eu/collection/eupl/eupl-text-eupl-12
@@ -21,8 +37,10 @@

namespace OCA\OpenRegister\Controller;

use InvalidArgumentException;
use OCA\OpenRegister\Exception\NotAuthorizedException;
use OCA\OpenRegister\Service\Aggregation\AggregationRunner;
use OCA\OpenRegister\Service\Aggregation\TimeseriesRequestValidator;
use OCP\AppFramework\Controller;
use OCP\AppFramework\Http;
use OCP\AppFramework\Http\JSONResponse;
@@ -34,14 +52,16 @@ class AggregationController extends Controller
/**
* Constructor.
*
* @param string $appName The application name.
* @param IRequest $request The current request.
* @param AggregationRunner $runner The aggregation runner.
* @param string $appName The application name.
* @param IRequest $request The current request.
* @param AggregationRunner $runner The aggregation runner.
* @param TimeseriesRequestValidator $validator Ad-hoc request validator.
*/
public function __construct(
string $appName,
IRequest $request,
private readonly AggregationRunner $runner
private readonly AggregationRunner $runner,
private readonly TimeseriesRequestValidator $validator
) {
parent::__construct(appName: $appName, request: $request);
}//end __construct()
@@ -82,4 +102,72 @@ public function aggregate(string $register, string $schema, string $name): JSONR
);
return $response;
}//end aggregate()

/**
* Ad-hoc time-bucket aggregation entry point.
*
* Accepts query params:
* - field (required)
* - interval (optional — MINUTE|HOUR|DAY|WEEK|MONTH|QUARTER|YEAR)
* - from, to (required when interval set; ISO-8601)
* - metric (optional, default `count`)
* - metricField (required when metric != count)
* - filter[...] (optional, reuses the existing filter vocabulary)
*
* Returns `{ groups: [{ key, value }], backend, cached }` matching the
* GraphQL `groups` field shape so `CnChartWidget` can normalise once.
*
* @param string $register Register reference.
* @param string $schema Schema reference.
*
* @return JSONResponse JSON response with bucketed groups.
*
* @NoAdminRequired
* @NoCSRFRequired
*/
public function timeseries(string $register, string $schema): JSONResponse
{
// Resolve schema first so the validator can consult the
// declared property list. A missing schema is a 404; a bad
// query-param shape is a 400.
try {
$schemaEntity = $this->runner->findSchema(schemaRef: $schema);
} catch (RuntimeException $e) {
return new JSONResponse(['error' => $e->getMessage()], Http::STATUS_NOT_FOUND);
}

// Pull the request shape from the active IRequest. The filter
// map comes through as a nested array because PHP parses
// `filter[x][op]=y` into `$_GET['filter']['x']['op']='y'`.
$input = [
'field' => $this->request->getParam('field', ''),
'interval' => $this->request->getParam('interval'),
'from' => $this->request->getParam('from'),
'to' => $this->request->getParam('to'),
'metric' => $this->request->getParam('metric', 'count'),
'metricField' => $this->request->getParam('metricField'),
'filter' => (array) ($this->request->getParam('filter', [])),
];

try {
$query = $this->validator->validate(input: $input, schema: $schemaEntity);
} catch (InvalidArgumentException $e) {
return new JSONResponse(['error' => $e->getMessage()], Http::STATUS_BAD_REQUEST);
}

try {
$result = $this->runner->runAdhocByRef(
registerRef: $register,
schemaRef: $schema,
query: $query
);
} catch (NotAuthorizedException $e) {
return new JSONResponse(['error' => $e->getMessage()], Http::STATUS_FORBIDDEN);
} catch (RuntimeException $e) {
return new JSONResponse(['error' => $e->getMessage()], Http::STATUS_NOT_FOUND);
}

return new JSONResponse($result);

}//end timeseries()
}//end class