Proposition provenance metadata #7

## Observation

When a proposition is extracted, `Proposition` captures the content (`text`, `mentions`, `confidence`) and source chunks (`grounding`), but not the conversational context of extraction:

- **Which turn** was this extracted from? (Turn 1 vs. turn 50 matters for temporal reasoning)
- **Who said it?** (System instruction, operator, user, AI response — different trust implications)
- **How was it extracted?** (Initial context loading, mid-conversation incremental, revision, manual entry)

Without provenance, downstream systems can't make informed decisions about trust or authority. The `metadata` field could store this, but there is no standard schema, so every consumer invents its own.

## What DICE already has

- **`grounding: List<String>`** — chunk IDs, but not conversational metadata
- **`metadata: Map<String, Any>`** — could carry provenance, but no standard keys
- **`ExtractionPerspective`** — `USER`, `AGENT`, `ALL` — controls *whose* knowledge is extracted during the extraction step, but isn't stored on the resulting proposition
- **`SourceAnalysisContext`** — passed to `PropositionPipeline.processChunk()`, carries `schema`, `entityResolver`, `contextId`, `knownEntities`, `relations`, `promptVariables` — but no provenance fields
- **`ConversationSource`** — wraps a `Conversation` for incremental extraction, has message indices — but provenance isn't propagated to extracted propositions

## The question

Should DICE standardize provenance metadata on propositions?

Some possibilities:

1. **Standard metadata keys** — define constants like `dice.provenance.speakerRole`, `dice.provenance.extractionTurn`, `dice.provenance.extractionMode`. Uses existing `metadata` map, no schema changes. Extraction pipeline populates them when information is available.

2. **PropositionProvenance data class** — a structured provenance record stored in `metadata` or as a first-class field:
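   ```kotlin
   // Proposed enums; the values are the ones this issue lists for each field.
   enum class SpeakerRole { SYSTEM, OPERATOR, USER, ASSISTANT }
   enum class ExtractionMode { INITIAL, INCREMENTAL, REVISION, MANUAL }

   data class PropositionProvenance(
       val extractionTurn: Int?,          // message index; null when unknown
       val speakerRole: SpeakerRole?,     // null when the speaker is unknown
       val extractionMode: ExtractionMode
   )
   ```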

3. **Extend SourceAnalysisContext** — add provenance fields so they flow through the extraction pipeline automatically. `IncrementalPropositionExtraction` could derive provenance from `ConversationSource` (turn = message index, speaker = message role).
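
A minimal sketch of option 1, combined with the derivation mentioned in option 3 (turn = message index, speaker = message role). The key names come from the list above; the `ProvenanceKeys` object, the `Message` stand-in, and both helper functions are hypothetical and are not part of DICE today. The real `Conversation` message type may look different.

```kotlin
// Hypothetical constants for the keys proposed in option 1; not an existing DICE API.
object ProvenanceKeys {
    const val SPEAKER_ROLE = "dice.provenance.speakerRole"
    const val EXTRACTION_TURN = "dice.provenance.extractionTurn"
    const val EXTRACTION_MODE = "dice.provenance.extractionMode"
}

// Stand-in for a conversation message; assumed shape, for illustration only.
data class Message(val index: Int, val role: String, val text: String)

// Build the provenance entries for a proposition extracted from `message`.
// The result is meant to be merged into the proposition's `metadata` map.
fun provenanceMetadata(message: Message, mode: String): Map<String, Any> = mapOf(
    ProvenanceKeys.SPEAKER_ROLE to message.role,
    ProvenanceKeys.EXTRACTION_TURN to message.index,
    ProvenanceKeys.EXTRACTION_MODE to mode,
)

// Read the turn back; yields null when the key is absent or holds a non-Int value.
fun extractionTurn(metadata: Map<String, Any>): Int? =
    metadata[ProvenanceKeys.EXTRACTION_TURN] as? Int

fun main() {
    val message = Message(index = 12, role = "USER", text = "I moved to Berlin last year.")
    val metadata = provenanceMetadata(message, mode = "INCREMENTAL")
    println(metadata[ProvenanceKeys.SPEAKER_ROLE])
    println(extractionTurn(metadata))
}
```

Because the values live in an untyped map, every reader needs a safe cast like the one in `extractionTurn`. That is the main trade-off against the typed `PropositionProvenance` record in option 2.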

### Where provenance matters

| Provenance Signal | Impact |
|-------------------|--------|
| Speaker role | Trust: SYSTEM/OPERATOR > USER > ASSISTANT |
| Extraction turn | Staleness: earlier turns are more established but also more likely to be stale |
| Extraction mode | Confidence: MANUAL > INITIAL > INCREMENTAL |
| Source context | Cross-context reasoning: was this imported from another conversation? |
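
As a rough illustration of how a consumer might act on the first and third rows, the sketch below orders provenance records by speaker trust and then by extraction mode. Everything in it is an assumption made for the example: the `Provenance` type, the rank functions, the choice to compare role before mode, and the placement of unknown speakers and of `REVISION`, which the table does not rank.

```kotlin
enum class SpeakerRole { SYSTEM, OPERATOR, USER, ASSISTANT }
enum class ExtractionMode { INITIAL, INCREMENTAL, REVISION, MANUAL }

// Hypothetical record holding just the two signals used below.
data class Provenance(val speakerRole: SpeakerRole?, val extractionMode: ExtractionMode)

// Lower rank = more trusted. SYSTEM and OPERATOR share a rank, as in the table.
fun trustRank(role: SpeakerRole?): Int = when (role) {
    SpeakerRole.SYSTEM, SpeakerRole.OPERATOR -> 0
    SpeakerRole.USER -> 1
    SpeakerRole.ASSISTANT -> 2
    null -> 3 // assumption: an unknown speaker sorts last
}

// The table gives MANUAL > INITIAL > INCREMENTAL and does not place REVISION;
// ranking it last here is a placeholder, not a recommendation.
fun modeRank(mode: ExtractionMode): Int = when (mode) {
    ExtractionMode.MANUAL -> 0
    ExtractionMode.INITIAL -> 1
    ExtractionMode.INCREMENTAL -> 2
    ExtractionMode.REVISION -> 3
}

// Compare by speaker trust first, then by extraction mode as a tie-breaker.
val byAuthority: Comparator<Provenance> =
    compareBy<Provenance> { trustRank(it.speakerRole) }
        .thenBy { modeRank(it.extractionMode) }

fun main() {
    val sorted = listOf(
        Provenance(SpeakerRole.ASSISTANT, ExtractionMode.INCREMENTAL),
        Provenance(SpeakerRole.USER, ExtractionMode.MANUAL),
        Provenance(SpeakerRole.SYSTEM, ExtractionMode.INITIAL),
    ).sortedWith(byAuthority)
    sorted.forEach(::println)
}
```

A strict lexicographic ordering like this is only one possible policy. A consumer could just as well fold the signals into a numeric weight alongside `confidence`.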

