fix: hallucination detection tests #1006

@jakelorocco

Description

Our hallucination detection tests in test_rag.py are failing. I believe the checks in those tests have fallen out of sync with the ones in the formatter tests, so we should bring them back in line. This may also require loosening the expected output, as we've done for citations: in the failure below, the only differing item is the wording of the free-text explanation.

example failure:

    -   result = rag.flag_hallucinated_content(None, docs, context, backend)
                for r, e in zip(result, expected, strict=True):  # type: ignore
        >           assert r == e
        E           assert {'explanation...end': 31, ...} == {'explanation...end': 31, ...}
        E
        E             Omitting 4 identical items, use -vv to show
        E             Differing items:
        E             {'explanation': "This sentence makes a factual claim about the color of purple bumble fish. The provided context states: 'The only type of fish that is yellow is the purple bumble fish.' This directly supports the claim in the sentence."} != {'explanation': "This sentence makes a factual claim about the color of purple bumble fish. The document states 'The only type of fish that is yellow is the purple bumble fish.' This directly supports the claim in the sentence."}
        E             Use -v to get more diff

        test/stdlib/components/intrinsic/test_rag.py:266: AssertionError
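
One possible way to loosen the check, as a rough sketch rather than the fix this repo will necessarily adopt: compare every field exactly except the free-text ones, and only require those to be non-empty strings. The helper name `assert_records_match` and the `free_text_keys` parameter are hypothetical; the only key names taken from the failure above are `explanation` and `end`.

```python
def assert_records_match(result, expected, free_text_keys=("explanation",)):
    """Compare lists of dict records, ignoring the exact wording of free-text fields.

    Hypothetical helper: fields listed in free_text_keys only need to be
    non-empty strings; every other field must match exactly.
    """
    assert len(result) == len(expected), "different number of records"
    for index, (r, e) in enumerate(zip(result, expected)):
        assert r.keys() == e.keys(), f"record {index}: key sets differ"
        for key, expected_value in e.items():
            if key in free_text_keys:
                # Model-generated prose varies between runs, so only check its shape.
                assert isinstance(r[key], str) and r[key].strip(), (
                    f"record {index}: {key!r} should be a non-empty string"
                )
            else:
                assert r[key] == expected_value, (
                    f"record {index}: {key!r} is {r[key]!r}, expected {expected_value!r}"
                )
```

In the test this would stand in for the `assert r == e` loop, e.g. `assert_records_match(result, expected)`. The structured fields are still checked strictly, so a real regression in those would continue to fail.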

Labels: bug (Something isn't working)