fix: duplicate edges on re-ingestion — replace CREATE with MERGE in upsert_relations() and add_graph_documents() #21

@polaz

Problem

Re-ingesting the same data creates duplicate edges. Every call to upsert_relations() (LlamaIndex) or add_graph_documents() (LangChain) creates new edges unconditionally:

# First ingest
store.upsert_relations([Relation(label="KNOWS", source_id="alice", target_id="bob")])
# Second ingest — creates a DUPLICATE edge
store.upsert_relations([Relation(label="KNOWS", source_id="alice", target_id="bob")])

This makes repeated indexing non-idempotent and bloats the graph.

Root cause

CoordiNode Cypher does not support MERGE for relationship patterns — only for node patterns (NodeScan). The adapters work around this with CREATE, accepting duplicate edges as a known limitation.

Tracked as G072 in the CoordiNode DB repository: MERGE (src)-[r:TYPE]->(dst) fails with "MERGE create from non-NodeScan pattern".
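
To illustrate why the CREATE fallback duplicates edges while MERGE would not, here is a minimal in-memory sketch of the two semantics. ToyGraph, create_edge and merge_edge are hypothetical names for illustration only; they are not part of CoordiNode or either adapter.

```python
from typing import List, Tuple

Edge = Tuple[str, str, str]  # (source_id, label, target_id)


class ToyGraph:
    """Hypothetical in-memory stand-in for an edge store; not the CoordiNode API."""

    def __init__(self) -> None:
        self.edges: List[Edge] = []

    def create_edge(self, src: str, label: str, dst: str) -> None:
        # CREATE semantics: always append, even if an identical edge exists.
        self.edges.append((src, label, dst))

    def merge_edge(self, src: str, label: str, dst: str) -> None:
        # MERGE semantics: match first, create only when nothing matches.
        edge = (src, label, dst)
        if edge not in self.edges:
            self.edges.append(edge)


created = ToyGraph()
merged = ToyGraph()
for _ in range(2):  # simulate ingesting the same relation twice
    created.create_edge("alice", "KNOWS", "bob")
    merged.merge_edge("alice", "KNOWS", "bob")

print(len(created.edges))  # 2
print(len(merged.edges))   # 1
```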

SDK changes (after G072 is fixed in DB)

Once G072 is resolved:

llama-index-coordinode/llama_index/graph_stores/coordinode/base.py

  • upsert_relations(): replace CREATE (src)-[r:{label}]->(dst) with MERGE (src)-[r:{label}]->(dst)
  • Remove the comment block explaining the CREATE fallback
  • Remove the SET r += $props workaround comment
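
A rough sketch of what the clause swap could look like in a query-building helper. The function name build_edge_query, the MATCH prefix, the $src_id/$dst_id parameter names, and the label validation are all assumptions made for illustration; only the CREATE/MERGE pattern and SET r += $props come from this issue, and the real adapter query may differ.

```python
import re

# Assumption: labels are interpolated into the query text, so restrict them
# to plain identifiers before formatting.
_LABEL_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_edge_query(label: str, use_merge: bool) -> str:
    """Return a Cypher string for writing one edge (hypothetical helper)."""
    if not _LABEL_RE.match(label):
        raise ValueError(f"invalid relationship label: {label!r}")
    # MERGE is only usable once G072 is fixed; CREATE is the current fallback.
    clause = "MERGE" if use_merge else "CREATE"
    return (
        "MATCH (src {id: $src_id}), (dst {id: $dst_id}) "  # assumed lookup
        f"{clause} (src)-[r:{label}]->(dst) "
        "SET r += $props"
    )


print(build_edge_query("KNOWS", use_merge=True))
```

Keeping the clause behind a flag is one possible way to gate the change on the database version; a straight replacement, as the list above describes, is simpler if only fixed releases are supported.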

langchain-coordinode/langchain_coordinode/graph.py

  • _create_edge(): replace CREATE (src)-[r:{rel_type}]->(dst) with MERGE (src)-[r:{rel_type}]->(dst)
  • Rename _create_edge() to _upsert_edge() (or keep the name and update the docstring)
  • _link_document_to_entities(): replace CREATE (d)-[:MENTIONS]->(n) with MERGE (d)-[:MENTIONS]->(n)
  • add_graph_documents() docstring: remove duplicate-edge warning

Tests

  • test_add_graph_documents_idempotent: change assertion from cnt >= 1 to cnt == 1
  • Add a similar idempotency test for LlamaIndex upsert_relations()
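
One possible shape for the idempotency check, written against plain callables so it does not depend on either adapter. check_edge_idempotency and the in-memory fake are hypothetical; in a real test, ingest would call store.upsert_relations(...) or add_graph_documents(...), and count_edges would run a counting query against the database.

```python
from typing import Callable, Set, Tuple


def check_edge_idempotency(
    ingest: Callable[[], None],
    count_edges: Callable[[], int],
) -> int:
    """Ingest twice and assert the edge count is exactly 1 both times."""
    ingest()
    after_first = count_edges()
    ingest()
    after_second = count_edges()
    assert after_first == 1, f"expected 1 edge after first ingest, got {after_first}"
    assert after_second == 1, f"expected 1 edge after re-ingest, got {after_second}"
    return after_second


# Hypothetical fake with MERGE-like behaviour, used only to exercise the helper.
fake_edges: Set[Tuple[str, str, str]] = set()

result = check_edge_idempotency(
    ingest=lambda: fake_edges.add(("alice", "KNOWS", "bob")),
    count_edges=lambda: len(fake_edges),
)
```

Asserting == 1 rather than >= 1 is what distinguishes this from the current test: a store that still uses CREATE would fail the second assertion.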

Acceptance criteria

  • Re-ingesting the same document creates exactly 1 edge (not N)
  • upsert_relations() is idempotent
  • add_graph_documents() is idempotent for both nodes and edges
  • All existing tests pass

Gate

Blocked by G072 — requires CoordiNode DB fix: MERGE clause for relationship patterns in Cypher executor.
Do not start SDK implementation until a CoordiNode release with G072 fixed is available.
