Merged
64 commits
1586fcb
create logbatcher parser
viktorbeck98 Mar 7, 2026
51f464c
create logbatcher parser
viktorbeck98 Mar 7, 2026
e5c00f3
update dependencies
viktorbeck98 Mar 7, 2026
96b42f6
sort arguments in CoreComponent
ipmach Mar 13, 2026
e782111
move fitlogic outside core component
ipmach Mar 13, 2026
402d261
split core component into two, the empty methods and the rest
ipmach Mar 13, 2026
45defce
Move schema pipeline to a different file
ipmach Mar 13, 2026
be8f911
move fit logic to a new file
ipmach Mar 13, 2026
ec74369
rename files
ipmach Mar 16, 2026
3a5f8be
minor unittest issue corrected
ipmach Mar 16, 2026
a2d3573
add missing dependencies
ipmach Mar 17, 2026
e313dde
add dependencies that generate merge conflict
ipmach Mar 17, 2026
b890a85
Merge branch 'development' into feat/LLM_parser
ipmach Mar 17, 2026
fcfdc53
solve lock issue
ipmach Mar 17, 2026
bdbf6b8
rename file
ipmach Mar 17, 2026
51ecf0f
Merge pull request #86 from ait-detectmate/refact/core
viktorbeck98 Mar 17, 2026
fc7cc18
update gitignore
viktorbeck98 Mar 17, 2026
f5bdc8c
add option for combo and value detector to use static or stable varia…
viktorbeck98 Mar 17, 2026
a52aa36
fix prek issues
viktorbeck98 Mar 17, 2026
f5e4801
fix prek issues
viktorbeck98 Mar 17, 2026
db55467
adapt LogBatcher to be gpt agnostic and remove dead code
viktorbeck98 Mar 17, 2026
e8ab855
add documentation
viktorbeck98 Mar 17, 2026
92eb298
Merge pull request #73 from ait-detectmate/feat/LLM_parser
viktorbeck98 Mar 17, 2026
46c1b14
update config docs
viktorbeck98 Mar 17, 2026
b1f02b7
move events and global parameters to core detector
viktorbeck98 Mar 17, 2026
5f53b97
add inline comments for config
viktorbeck98 Mar 17, 2026
b3504fd
Merge pull request #93 from ait-detectmate/config_docs
ipmach Mar 19, 2026
afee30c
Merge pull request #96 from ait-detectmate/development
ipmach Mar 19, 2026
1ea810d
Bump version from 0.1.0 to 0.2.0
whotwagner Mar 19, 2026
2a9db88
Solve bug of workspace json files from pip
ipmach Mar 24, 2026
398d34f
propose server solution
ipmach Mar 24, 2026
16783fc
minor correction
ipmach Mar 24, 2026
25d9632
rename component type
ipmach Mar 24, 2026
95731d0
Bump prek from 0.3.6 to 0.3.8
dependabot[bot] Mar 25, 2026
0d5d4ef
Bump protobuf from 7.34.0 to 7.34.1
dependabot[bot] Mar 25, 2026
8ba2229
Bump polars from 1.39.0 to 1.39.3
dependabot[bot] Mar 25, 2026
fab4a9b
Bump openai from 2.28.0 to 2.29.0
dependabot[bot] Mar 25, 2026
4b780f4
Bump pytest-cov from 7.0.0 to 7.1.0
dependabot[bot] Mar 25, 2026
33468dc
Merge pull request #102 from ait-detectmate/hotfix/main
ipmach Mar 25, 2026
080ca8e
Bump requests from 2.32.5 to 2.33.0
dependabot[bot] Mar 26, 2026
0a55953
path_templates is not longer required by the MatcherParser
viktorbeck98 Mar 30, 2026
faea48c
Merge pull request #109 from ait-detectmate/small_fixes
ipmach Mar 30, 2026
b01cfa1
Merge pull request #108 from ait-detectmate/dependabot/uv/requests-2.…
ipmach Mar 30, 2026
359df50
Merge pull request #107 from ait-detectmate/dependabot/uv/pytest-cov-…
ipmach Mar 30, 2026
9f2970a
Merge pull request #106 from ait-detectmate/dependabot/uv/openai-2.29.0
ipmach Mar 30, 2026
df2bc12
Merge pull request #103 from ait-detectmate/dependabot/uv/prek-0.3.8
ipmach Mar 30, 2026
0641866
Merge branch 'main' into development
ipmach Mar 30, 2026
b2e00b3
add security file
ipmach Mar 30, 2026
76d6ec6
Merge pull request #105 from ait-detectmate/dependabot/uv/polars-1.39.3
ipmach Mar 30, 2026
04cb190
Merge branch 'development' into fix/new_value_detector
viktorbeck98 Mar 30, 2026
67dc28b
Warn when detector config doesn't match training data
viktorbeck98 Mar 30, 2026
7ff1817
Bump pygments from 2.19.2 to 2.20.0
dependabot[bot] Mar 30, 2026
fe9f7c8
fix minor issue
viktorbeck98 Mar 30, 2026
3e73ae8
Merge pull request #110 from ait-detectmate/feat/security
viktorbeck98 Mar 30, 2026
f410dff
add issue and PR templates
viktorbeck98 Mar 31, 2026
f574b55
feat(config): support named wildcards and named event IDs in template…
viktorbeck98 Mar 31, 2026
1190139
Merge pull request #92 from ait-detectmate/fix/new_value_detector
ipmach Mar 31, 2026
d0d24cf
Merge pull request #112 from ait-detectmate/dependabot/uv/pygments-2.…
ipmach Mar 31, 2026
b082403
Merge pull request #104 from ait-detectmate/dependabot/uv/protobuf-7.…
ipmach Mar 31, 2026
f159791
Merge branch 'main' into development
ipmach Mar 31, 2026
467f50d
Merge pull request #113 from ait-detectmate/feat/templates
ipmach Mar 31, 2026
e7f0bf7
Merge pull request #114 from ait-detectmate/feat/configuration2.0
ipmach Mar 31, 2026
3ad3c58
Merge pull request #111 from ait-detectmate/feat/warnings
ipmach Mar 31, 2026
3b21465
Merge branch 'feature/new-event-detector' into development
ernstleierzopf Apr 1, 2026
21 changes: 21 additions & 0 deletions .github/ISSUE_TEMPLATE/01_bug_report.md
@@ -0,0 +1,21 @@
---
name: 🐜 Bug report
about: If something isn't working 🔧
---

### Subject of the issue
Describe your issue here.

### Your environment
* Version of detectmate
* Version of python
* Docker or manual installation?

### Steps to reproduce
Tell us how to reproduce this issue.

### Expected behaviour
Tell us what should happen

### Actual behaviour
Tell us what happens instead
20 changes: 20 additions & 0 deletions .github/ISSUE_TEMPLATE/02_feature_request.md
@@ -0,0 +1,20 @@
---
name: 🚀 Feature request
about: If you have a feature request 💡
---

**Context**

What are you trying to do and how would you want to do it differently? Is it something you currently cannot do? Is this related to an issue/problem?

**Alternatives**

Can you achieve the same result doing it in an alternative way? Is the alternative considerable?

**Has the feature been requested before?**

Please provide a link to the issue.

**If the feature request is approved, would you be willing to submit a PR?**

Yes / No _(Help can be provided if you need assistance submitting a PR)_
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1 @@
blank_issues_enabled: false
19 changes: 19 additions & 0 deletions .github/pull_request_template.md
@@ -0,0 +1,19 @@
# Task
<!-- Please add a link to a relevant issue or task -->

# Description
<!-- Please include a summary of the change -->
<!-- Any details that you think are important to review this PR? -->
<!-- Are there other PRs related to this one? -->

# How Has This Been Tested?
<!-- Please describe how you tested your changes -->

# Checklist
<!-- Go over all the following points, and put an `x` in all the boxes that apply -->

- [ ] This Pull-Request goes to the **development** branch.
- [ ] I have successfully run prek locally.
- [ ] I have added tests to cover my changes.
- [ ] I have linked the issue-id to the task-description.
- [ ] I have performed a self-review of my own code.
3 changes: 3 additions & 0 deletions .gitignore
@@ -199,3 +199,6 @@ cython_debug/
local/
test.ipynb
test.py

# claude code
CLAUDE.md
102 changes: 102 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,102 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

DetectMateLibrary is a Python library for log processing and anomaly detection. It provides composable, stream-friendly components (parsers and detectors) that communicate via Protobuf-based schemas. The library is designed for both single-process and microservice deployments.

## Development Commands

```bash
# Install dependencies and pre-commit hooks
uv sync --dev
uv run prek install

# Run tests
uv run pytest -q
uv run pytest -s # verbose with stdout
uv run pytest --cov=. --cov-report=term-missing # with coverage
uv run pytest tests/test_foo.py # single test file

# Run linting/formatting (all pre-commit hooks)
uv run prek run -a

# Recompile Protobuf (only if schemas.proto is modified)
protoc --proto_path=src/detectmatelibrary/schemas/ \
--python_out=src/detectmatelibrary/schemas/ \
src/detectmatelibrary/schemas/schemas.proto

# Scaffold a new component workspace
mate create --type <parser|detector> --name <name> --dir <target_dir>
```

## Architecture

### Data Flow

```
Raw Logs → Parser → ParserSchema → Detector → DetectorSchema (Alerts)
```

All data flows through typed Protobuf-backed schema objects. Components are stateful and support an optional training phase before detection.

### Core Abstractions (`src/detectmatelibrary/common/`)

- **`CoreComponent`** — base class managing buffering, ID generation, and training state
- **`CoreParser(CoreComponent)`** — parse raw logs into `ParserSchema`
- **`CoreDetector(CoreComponent)`** — detect anomalies in `ParserSchema`, emit `DetectorSchema`
- **`CoreConfig`** / **`CoreParserConfig`** / **`CoreDetectorConfig`** — Pydantic-based configuration hierarchy

### Schema System (`src/detectmatelibrary/schemas/`)

- `BaseSchema` wraps generated Protobuf messages with dict-like access (`schema["field"]`)
- Key schemas: `LogSchema`, `ParserSchema`, `DetectorSchema`
- Support serialization to/from bytes for transport and persistence
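
As a rough illustration of the dict-like access and byte round trip described above, here is a minimal sketch. `SchemaSketch`, `serialize`, and `deserialize` are hypothetical names, and JSON stands in for Protobuf so the example runs without generated message classes; the real `BaseSchema` may differ.

```python
import json
from typing import Any


class SchemaSketch:
    """Hypothetical stand-in for a BaseSchema-like wrapper (JSON replaces Protobuf)."""

    def __init__(self, **fields: Any) -> None:
        self._fields = dict(fields)

    def __getitem__(self, key: str) -> Any:
        # Dict-like read access: schema["field"]
        return self._fields[key]

    def __setitem__(self, key: str, value: Any) -> None:
        # Dict-like write access: schema["field"] = value
        self._fields[key] = value

    def serialize(self) -> bytes:
        # Encode all fields to bytes for transport or persistence.
        return json.dumps(self._fields, sort_keys=True).encode("utf-8")

    @classmethod
    def deserialize(cls, data: bytes) -> "SchemaSketch":
        # Rebuild the wrapper from bytes produced by serialize().
        return cls(**json.loads(data.decode("utf-8")))


schema = SchemaSketch(detectorID="MyDetector", score=0.0)
schema["score"] = 0.5
restored = SchemaSketch.deserialize(schema.serialize())
```

The point of the round trip is that a component in another process can rebuild the same field values from the bytes alone.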

### Buffering Modes (`src/detectmatelibrary/utils/data_buffer.py`)

Three modes via `ArgsBuffer` config:
- **NO_BUF** — one item at a time (default)
- **BATCH** — accumulate N items, process as batch
- **WINDOW** — sliding window of size N
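
The three modes can be sketched as follows. This is an illustration only: `BufferMode` and `SketchBuffer` are hypothetical names rather than the library's `ArgsBuffer` or `DataBuffer`, and the assumption that a window emits nothing until it is full may not match the real implementation.

```python
from collections import deque
from enum import Enum
from typing import Any, Optional


class BufferMode(Enum):
    NO_BUF = "no_buf"
    BATCH = "batch"
    WINDOW = "window"


class SketchBuffer:
    """Hypothetical buffer illustrating the three modes."""

    def __init__(self, mode: BufferMode, size: int = 1) -> None:
        self.mode = mode
        self.size = size
        # In WINDOW mode, maxlen makes the deque drop its oldest item automatically.
        self.items: deque = deque(maxlen=size if mode is BufferMode.WINDOW else None)

    def add(self, item: Any) -> Optional[Any]:
        """Return the payload to process, or None if the buffer is not ready yet."""
        if self.mode is BufferMode.NO_BUF:
            return item  # pass each item straight through
        self.items.append(item)
        if len(self.items) < self.size:
            return None  # still filling up
        payload = list(self.items)
        if self.mode is BufferMode.BATCH:
            self.items.clear()  # a batch is consumed; a window keeps sliding
        return payload


batch = SketchBuffer(BufferMode.BATCH, size=2)
batch_out = [batch.add(i) for i in (1, 2, 3, 4)]

window = SketchBuffer(BufferMode.WINDOW, size=2)
window_out = [window.add(i) for i in (1, 2, 3)]
```

With size 2, the batch buffer emits disjoint pairs while the window buffer emits overlapping pairs, which is the practical difference between the two modes.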

### Implementations

- **Parsers** (`src/detectmatelibrary/parsers/`): `JsonParser`, `DummyParser`, `TemplateMatcherParser` (uses Drain3 for template mining)
- **Detectors** (`src/detectmatelibrary/detectors/`): `NewValueDetector`, `NewValueComboDetector`, `RandomDetector`, `DummyDetector`
- **Utilities** (`src/detectmatelibrary/utils/`): `DataBuffer`, `EventPersistency`, `KeyExtractor`, `TimeFormatHandler`, `IdGenerator`

## Extending the Library

Implement a custom detector by subclassing `CoreDetector`:

```python
class MyDetectorConfig(CoreDetectorConfig):
    method_type: str = "my_detector"
    my_param: int = 10

class MyDetector(CoreDetector):
    def __init__(self, name="MyDetector", config=MyDetectorConfig()):
        super().__init__(name=name, config=config)

    def train(self, input_: ParserSchema) -> None:
        pass # optional

    def detect(self, input_: ParserSchema, output_: DetectorSchema) -> bool:
        output_["detectorID"] = self.name
        output_["score"] = 0.0
        return False # True = anomaly detected
```

Same pattern applies for `CoreParser` — implement `parse(input_: LogSchema, output_: ParserSchema) -> bool`.

## Code Quality

Pre-commit hooks enforce:
- **mypy** strict mode
- **flake8** linting, **autopep8** formatting (max line 110)
- **bandit** security checks, **vulture** dead-code detection (70% threshold)
- **docformatter** docstring style

Python 3.12 is required (see `.python-version`).
32 changes: 32 additions & 0 deletions SECURITY.md
@@ -0,0 +1,32 @@
# Security Policy

## Supported Versions

| Version | Supported |
| ------- | ------------------ |
| 1.x.x | :white_check_mark: |
| < 1.0.0 | :x: |

> [!IMPORTANT]
> DetectMateService is currently a work in progress and under heavy development. Possible vulnerabilities will not receive special treatment and can be reported via [GitHub-Issues](https://github.com/ait-detectmate/DetectMateService/issues)

## Reporting a Vulnerability

Please email reports about any security related issues you find to aecid@ait.ac.at. This mail is delivered to a small developer team. Your email will be acknowledged within one business day, and you'll receive a more detailed response to your email within 7 days indicating the next steps in handling your report.

Please use a descriptive subject line for your report email. After the initial reply to your report, our team will endeavor to keep you informed of the progress being made towards a fix and announcement.

In addition, please include the following information along with your report:

* Your name and affiliation (if any).
* A description of the technical details of the vulnerabilities. It is very important to let us know how we can reproduce your findings.
* An explanation of who can exploit this vulnerability, and what they gain by doing so -- write an attack scenario. This will help us evaluate your report quickly, especially if the issue is complex.
* Whether this vulnerability is public or known to third parties. If it is, please provide details.
* Whether we could mention your name in the changelogs.

Once an issue is reported we use the following disclosure process:

* When a report is received, we confirm the issue and determine its severity.
* If we know of specific third-party services or software based on DetectMateService that require mitigation before publication, those projects will be notified.
* Fixes are prepared for the last minor release of the latest major release.
* Patch releases are published for all fixed released versions.
2 changes: 0 additions & 2 deletions config/pipeline_config_default.yaml
@@ -68,8 +68,6 @@ detectors:
NewValueComboDetector:
method_type: new_value_combo_detector
auto_config: False
-params:
-comb_size: 3
events:
1:
test:
66 changes: 43 additions & 23 deletions docs/detectors.md
@@ -16,7 +16,7 @@ This document describes the minimal API, implementation guidance, a short exampl

```python
 class CoreDetectorConfig(CoreConfig):
-    comp_type: str = "detectors"
+    component_type: str = "detectors"
     method_type: str = "core_detector"
     parser: str = "<PLACEHOLDER>"

@@ -89,43 +89,63 @@ List of detectors:
* [Combo Detector](detectors/combo.md): Detect new combination of variables in the logs.
* [New Event](detectors/new_event.md): Detect new events in the variables in the logs.
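
The idea behind detecting a new combination of variables can be sketched in a few lines. `NewComboSketch` is a hypothetical illustration, not the library's `NewValueComboDetector`; it assumes training simply records every observed tuple and detection flags any tuple not seen during training.

```python
from typing import Iterable, Set, Tuple


class NewComboSketch:
    """Hypothetical detector: flags value combinations never seen in training."""

    def __init__(self) -> None:
        self.seen: Set[Tuple[str, ...]] = set()

    def train(self, values: Iterable[str]) -> None:
        # Remember the combination as known-good.
        self.seen.add(tuple(values))

    def detect(self, values: Iterable[str]) -> bool:
        # True means anomaly: this exact combination was not seen in training.
        return tuple(values) not in self.seen


detector = NewComboSketch()
detector.train(["root", "ssh"])
detector.train(["alice", "tty1"])
known = detector.detect(["root", "ssh"])
unknown = detector.detect(["root", "tty1"])
```

Note that `["root", "tty1"]` is flagged even though both values appeared individually during training; only the combination is new.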

## Configuration

## Auto-configuration (optional)

Detectors can optionally support **auto-configuration** — a process where the detector automatically discovers which variables are worth monitoring, instead of requiring the user to specify them manually.

### Enabling auto-configuration

Auto-configuration is controlled by the `auto_config` flag in the pipeline config (e.g. `config/pipeline_config_default.yaml`):
When `auto_config` is set to `False`, the detector expects an explicit `events` block that specifies exactly which variables to monitor:

```yaml
detectors:
NewValueDetector:
method_type: new_value_detector
auto_config: True # enable auto-configuration
params: {}
# no "events" block needed — it will be generated automatically
auto_config: False
params: {} # global parameters
events: # event-specific configuration
1: # event_id
instance1: # name of instance (arbitrary)
params: {} # additional params
variables:
- pos: 0 # location of an unnamed variable from the log message
name: var1 # name of variable (arbitrary)
header_variables:
- pos: level # location of a named variable (defined in log_format of parser)
global: # define global instance for new_value_detector similar to "events"
global_instance1: # define instance name
header_variables: # same logic as header_variables in "events"
- pos: Status
```

When `auto_config` is set to `False`, the detector expects an explicit `events` block that specifies exactly which variables to monitor:

### Configuration semantics (preliminary)

**`events` key** — The integer key is the `EventID` (or `event_id`) to monitor (see the MatcherParser docs for how EventID is assigned).

**`variables[].pos`** — The 0-indexed position of the `<*>` wildcard in the matched template, counting from left to right starting at 0. For example, given:

```text
pid=<*> uid=<*> auid=<*> ses=<*> msg='op=<*> acct=<*> exe=<*> hostname=<*> addr=<*> terminal=<*> res=<*>'
```

`pos: 0` captures the value following `pid=`, `pos: 6` captures the value following `exe=`, and so on.
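
To make the position counting concrete, the following sketch extracts wildcard values by turning a template into a regular expression. `extract_wildcards` is a hypothetical helper and the log message is invented for illustration; the library's own template matching may work differently.

```python
import re
from typing import List, Optional


def extract_wildcards(template: str, message: str) -> Optional[List[str]]:
    """Return the values matched by each <*> wildcard, left to right, or None."""
    # Escape the literal parts and join them with one capture group per wildcard.
    literal_parts = template.split("<*>")
    pattern = "(.*?)".join(re.escape(part) for part in literal_parts)
    match = re.fullmatch(pattern, message)
    return list(match.groups()) if match else None


template = (
    "pid=<*> uid=<*> auid=<*> ses=<*> msg='op=<*> acct=<*> exe=<*> "
    "hostname=<*> addr=<*> terminal=<*> res=<*>'"
)
# Invented example message that fits the template above.
message = (
    "pid=1234 uid=0 auid=1000 ses=3 msg='op=login acct=\"root\" "
    "exe=\"/usr/sbin/sshd\" hostname=? addr=10.0.0.5 terminal=ssh res=success'"
)

values = extract_wildcards(template, message)
pid_value = values[0]  # pos: 0
exe_value = values[6]  # pos: 6
```

Indexing the returned list with `pos` gives the variable a detector would monitor for that position.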

**`header_variables[].pos`** — A named field from the log format string (e.g., `Type`, `Time`, `Content`) rather than a wildcard position.


### Auto-configuration (optional)

Detectors can optionally support **auto-configuration** — a process where the detector automatically discovers which variables are worth monitoring, instead of requiring the user to specify them manually.

Auto-configuration is controlled by the `auto_config` flag in the pipeline config (e.g. `config/pipeline_config_default.yaml`):

```yaml
detectors:
NewValueDetector:
method_type: new_value_detector
auto_config: False
auto_config: True # enable auto-configuration
params: {}
events:
1:
instance1:
params: {}
variables:
- pos: 0
name: var1
header_variables:
- pos: level
# no "events" block needed — it will be generated automatically
```


### How it works

When auto-configuration is enabled, the detector goes through two extra phases before training:
@@ -173,7 +193,7 @@ The `set_configuration()` method queries the tracker results and generates the f
 def set_configuration(self):
     variables = {}
     for event_id, tracker in self.auto_conf_persistency.get_events_data().items():
-        stable_vars = tracker.get_variables_by_classification("STABLE")
+        stable_vars = tracker.get_features_by_classification("STABLE")
         variables[event_id] = stable_vars

     config_dict = generate_detector_config(
2 changes: 1 addition & 1 deletion docs/detectors/combo.md
@@ -18,7 +18,7 @@ detectors:
method_type: new_value_combo_detector
auto_config: False
params:
-comb_size: 3
+max_combo_size: 3
events:
1:
test:
6 changes: 6 additions & 0 deletions docs/parsers.md
@@ -102,4 +102,10 @@ def test_my_parser_parse():
assert out["variables"] == ["a", "b", "c"]
```

## Available parsers

- [JSON Parser](parsers/json_parser.md): extracts structured fields from JSON-formatted logs.
- [Template Matcher](parsers/template_matcher.md): matches logs against a predefined set of `<*>` templates.
- [LogBatcher Parser](parsers/logbatcher_parser.md): LLM-based parser that infers templates from raw logs with no training data.

Go back to [Index](index.md)