Merged
2 changes: 1 addition & 1 deletion python-sdk/folders.mdx
@@ -1,11 +1,11 @@
---
title: "Folder Management"
description: "Organize and isolate data into logical folder groups in Morphik"
---

## Overview

Folders in Morphik provide a way to organize documents into logical groups. This is particularly useful for multi-project environments where you want to maintain separation between different contexts. Documents within a folder are isolated from those in other folders, allowing for clean organization and data separation.

> ℹ️ All folder APIs accept **either the folder's UUID or its name**. Use whichever identifier you already have; Morphik resolves it automatically.

@@ -90,13 +90,13 @@

## Folder Methods

All the core document operations available on the main Morphik client are also available on folder objects, but they are automatically scoped to the specific folder:

- `ingest_text` - Ingest text content into this folder
- `ingest_file` - Ingest a file into this folder
- `ingest_files` - Ingest multiple files into this folder
- `ingest_directory` - Ingest all files from a directory into this folder
- - `retrieve_chunks` - Retrieve chunks matching a query from this folder
+ - `retrieve_chunks` - Retrieve chunks matching a query from this folder (supports [reverse image search](/python-sdk/retrieve_chunks#reverse-image-search))
- `retrieve_docs` - Retrieve documents matching a query from this folder
- `query` - Generate a completion using context from this folder (supports `llm_config` parameter for custom LLM configuration)
- `list_documents` - List all documents in this folder
67 changes: 64 additions & 3 deletions python-sdk/retrieve_chunks.mdx
@@ -1,49 +1,52 @@
---
title: "retrieve_chunks"
description: "Retrieve relevant chunks from Morphik"
---

<Tabs>
<Tab title="Sync">
```python
def retrieve_chunks(
-   query: str,
+   query: Optional[str] = None,
    filters: Optional[Dict[str, Any]] = None,
    k: int = 4,
    min_score: float = 0.0,
    use_colpali: bool = True,
    folder_name: Optional[Union[str, List[str]]] = None,
    padding: int = 0,
    output_format: Optional[str] = None,
    query_image: Optional[str] = None,
) -> List[FinalChunkResult]
```
</Tab>
<Tab title="Async">
```python
async def retrieve_chunks(
-   query: str,
+   query: Optional[str] = None,
    filters: Optional[Dict[str, Any]] = None,
    k: int = 4,
    min_score: float = 0.0,
    use_colpali: bool = True,
    folder_name: Optional[Union[str, List[str]]] = None,
    padding: int = 0,
    output_format: Optional[str] = None,
    query_image: Optional[str] = None,
) -> List[FinalChunkResult]
```
</Tab>
</Tabs>

## Parameters

- - `query` (str): Search query text
+ - `query` (str, optional): Search query text. Mutually exclusive with `query_image`.
- `filters` (Dict[str, Any], optional): Optional metadata filters
- `k` (int, optional): Number of results. Defaults to 4.
- `min_score` (float, optional): Minimum similarity threshold. Defaults to 0.0.
- `use_colpali` (bool, optional): Whether to use a ColPali-style embedding model to retrieve the chunks (only works for documents ingested with `use_colpali=True`). Defaults to True.
- `folder_name` (str | List[str], optional): Optional folder scope. Accepts a single folder name or a list of folder names.
- `padding` (int, optional): Number of additional chunks/pages to retrieve before and after matched chunks (ColPali only). Defaults to 0.
- `output_format` (str, optional): Controls how image chunks are returned. Set to `"url"` to receive presigned URLs; omit or set to `"base64"` (default) to receive base64 content.
- `query_image` (str, optional): Base64-encoded image for reverse image search. Mutually exclusive with `query`. Requires `use_colpali=True`.
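
The constraints in the list above (exactly one of `query` or `query_image`, and `query_image` only with `use_colpali=True`) can be checked on the client before a request is sent. The following is a minimal sketch under that reading; `build_retrieve_kwargs` is a hypothetical helper and not part of the Morphik SDK, and the SDK or server may apply its own validation.

```python
from typing import Any, Dict, Optional


def build_retrieve_kwargs(
    query: Optional[str] = None,
    query_image: Optional[str] = None,
    use_colpali: bool = True,
    **extra: Any,
) -> Dict[str, Any]:
    """Hypothetical helper (not an SDK function): validate retrieve_chunks arguments."""
    # Exactly one of the two query inputs must be present.
    if (query is None) == (query_image is None):
        raise ValueError("Provide exactly one of `query` or `query_image`.")
    # Image queries are documented as requiring ColPali retrieval.
    if query_image is not None and not use_colpali:
        raise ValueError("`query_image` requires `use_colpali=True`.")

    kwargs: Dict[str, Any] = {"use_colpali": use_colpali, **extra}
    if query is not None:
        kwargs["query"] = query
    else:
        kwargs["query_image"] = query_image
    return kwargs


text_kwargs = build_retrieve_kwargs(query="quarterly revenue", k=5)
image_kwargs = build_retrieve_kwargs(query_image="aGVsbG8=", k=5)
```

The resulting dictionary could then be unpacked into a call such as `db.retrieve_chunks(**text_kwargs)`.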

## Metadata Filters

@@ -123,7 +126,7 @@

The `FinalChunkResult` objects returned by this method have the following properties:

- `content` (str | PILImage): Chunk content (text or image)
- `score` (float): Relevance score
- `document_id` (str): Parent document ID
- `chunk_number` (int): Chunk sequence number
@@ -134,9 +137,67 @@

## Image URL output

- When `output_format="url"` is provided, image chunks are returned as presigned HTTPS URLs in `content`. This is convenient for UIs and LLMs that accept remote image URLs (e.g., via `image_url`).
- When `output_format` is omitted or set to `"base64"` (default), image chunks are returned as base64 data (the SDK attempts to decode these into a `PIL.Image` for `FinalChunkResult.content`).
- Text chunks are unaffected by `output_format` and are always returned as strings.
- The `download_url` field may be populated for image chunks. When using `output_format="url"`, it will typically match `content` for those chunks.

Tip: To download the original raw file for a document, use [`get_document_download_url`](./get_document_download_url).
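
Because `content` can hold text, a presigned URL, or a decoded image depending on the chunk type and `output_format`, calling code usually needs to branch on it. Here is a minimal sketch of such a branch; `describe_content` is a hypothetical helper, and the `https://` prefix check is a heuristic assumption rather than documented SDK behavior. A text chunk that happens to start with a URL, or base64 data the SDK could not decode, would be mislabeled by this check.

```python
from typing import Any


def describe_content(content: Any) -> str:
    """Hypothetical helper (not an SDK function): label what a chunk's `content` holds."""
    if isinstance(content, str):
        # Assumption: presigned URLs can be told apart from plain text by their scheme.
        if content.startswith("https://"):
            return "image-url"
        return "text"
    # Anything that is not a string is treated as a decoded image object.
    return "image-object"


labels = [
    describe_content(c)
    for c in ["Revenue grew 4%.", "https://example.com/page-1.png", object()]
]
```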

## Reverse Image Search

You can search using an image instead of text by providing `query_image` with a base64-encoded image. This enables finding visually similar content in your documents.

<Tabs>
<Tab title="Sync">
```python
import base64
from morphik import Morphik

db = Morphik()

# Load and encode your query image
with open("query_image.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Search using the image
chunks = db.retrieve_chunks(
    query_image=image_b64,
    use_colpali=True,  # Required for image queries
    k=5,
)

for chunk in chunks:
    print(f"Score: {chunk.score}")
    print(f"Document ID: {chunk.document_id}")
    print("---")
```
</Tab>
<Tab title="Async">
```python
import asyncio
import base64
from morphik import AsyncMorphik


async def main():
    # `async with` and `await` must run inside a coroutine
    async with AsyncMorphik() as db:
        # Load and encode your query image
        with open("query_image.png", "rb") as f:
            image_b64 = base64.b64encode(f.read()).decode("utf-8")

        # Search using the image
        chunks = await db.retrieve_chunks(
            query_image=image_b64,
            use_colpali=True,  # Required for image queries
            k=5,
        )

        for chunk in chunks:
            print(f"Score: {chunk.score}")
            print(f"Document ID: {chunk.document_id}")
            print("---")


asyncio.run(main())
```
</Tab>
</Tabs>

<Note>
Reverse image search requires documents to be ingested with `use_colpali=True`. You must provide either `query` or `query_image`, but not both.
</Note>
69 changes: 66 additions & 3 deletions python-sdk/retrieve_chunks_grouped.mdx
@@ -1,5 +1,5 @@
---
title: "retrieve_chunks_grouped"
description: "Retrieve relevant chunks with grouping for UI display"
---

@@ -7,7 +7,7 @@
<Tab title="Sync">
```python
def retrieve_chunks_grouped(
-   query: str,
+   query: Optional[str] = None,
    filters: Optional[Dict[str, Any]] = None,
    k: int = 4,
    min_score: float = 0.0,
@@ -20,13 +20,14 @@
    graph_name: Optional[str] = None,
    hop_depth: int = 1,
    include_paths: bool = False,
    query_image: Optional[str] = None,
) -> GroupedChunkResponse
```
</Tab>
<Tab title="Async">
```python
async def retrieve_chunks_grouped(
-   query: str,
+   query: Optional[str] = None,
    filters: Optional[Dict[str, Any]] = None,
    k: int = 4,
    min_score: float = 0.0,
@@ -39,26 +40,28 @@
    graph_name: Optional[str] = None,
    hop_depth: int = 1,
    include_paths: bool = False,
    query_image: Optional[str] = None,
) -> GroupedChunkResponse
```
</Tab>
</Tabs>

## Parameters

- - `query` (str): Search query text
+ - `query` (str, optional): Search query text. Mutually exclusive with `query_image`.
- `filters` (Dict[str, Any], optional): Optional metadata filters
- `k` (int, optional): Number of results. Defaults to 4.
- `min_score` (float, optional): Minimum similarity threshold. Defaults to 0.0.
- `use_colpali` (bool, optional): Whether to use a ColPali-style embedding model. Defaults to True.
- `use_reranking` (bool, optional): Override workspace reranking configuration for this request.
- `folder_name` (str | List[str], optional): Optional folder scope (single name or list of names)
- `end_user_id` (str, optional): Optional end-user scope
- `padding` (int, optional): Number of additional chunks/pages to retrieve before and after matched chunks. Defaults to 0.
- `output_format` (str, optional): Controls how image chunks are returned. Set to `"url"` for presigned URLs or `"base64"` (default) for base64 content.
- `graph_name` (str, optional): Name of the graph to use for knowledge graph-enhanced retrieval
- `hop_depth` (int, optional): Number of relationship hops to traverse in the graph. Defaults to 1.
- `include_paths` (bool, optional): Whether to include relationship paths in the response. Defaults to False.
- `query_image` (str, optional): Base64-encoded image for reverse image search. Mutually exclusive with `query`. Requires `use_colpali=True`.
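
To make the `padding` parameter concrete, the sketch below computes which neighbouring chunk numbers fall within a given padding of a match. It is only an illustration: `padding_window` is a hypothetical function, and it assumes chunk numbers are consecutive integers starting at 0 within a document, which is not stated above. How the service actually selects neighbours may differ.

```python
from typing import List


def padding_window(chunk_number: int, padding: int, total_chunks: int) -> List[int]:
    """Chunk numbers within `padding` positions of a match, excluding the match itself."""
    if padding < 0:
        raise ValueError("padding must be non-negative")
    # Clamp the window to the valid range [0, total_chunks - 1].
    start = max(0, chunk_number - padding)
    end = min(total_chunks - 1, chunk_number + padding)
    return [n for n in range(start, end + 1) if n != chunk_number]


neighbours = padding_window(chunk_number=3, padding=1, total_chunks=10)
edge_case = padding_window(chunk_number=0, padding=2, total_chunks=10)
```

With `padding=1`, a match at chunk 3 brings in chunks 2 and 4; a match at the start of a document only has neighbours after it.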

## Returns

@@ -179,6 +182,66 @@

- This method is similar to [`retrieve_chunks`](./retrieve_chunks) but provides additional grouping for UI display.
- The `chunks` list provides backward compatibility with flat chunk lists.
- The `groups` list organizes results with their padding context, ideal for building search result UIs.
- When `padding` is specified, surrounding chunks are included in `padding_chunks` for each group.
- Knowledge graph parameters (`graph_name`, `hop_depth`, `include_paths`) enable graph-enhanced retrieval.
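
As an illustration of how `groups`, `main_chunk`, and `padding_chunks` might feed a search-results view, here is a minimal sketch. The `Chunk` and `Group` dataclasses are stand-ins defined only for this example, not the SDK's response classes, and `to_display_rows` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chunk:  # stand-in for a chunk result, not the SDK class
    document_id: str
    chunk_number: int
    score: float
    content: str


@dataclass
class Group:  # stand-in for one entry of `response.groups`
    main_chunk: Chunk
    padding_chunks: List[Chunk] = field(default_factory=list)


def to_display_rows(groups: List[Group]) -> List[str]:
    """Render each group as its main chunk followed by indented padding chunks."""
    rows: List[str] = []
    for group in groups:
        main = group.main_chunk
        rows.append(
            f"{main.document_id}#{main.chunk_number} ({main.score:.2f}): {main.content}"
        )
        # Show surrounding context in document order.
        for extra in sorted(group.padding_chunks, key=lambda c: c.chunk_number):
            rows.append(f"  context #{extra.chunk_number}: {extra.content}")
    return rows


groups = [
    Group(
        main_chunk=Chunk("doc-1", 3, 0.91, "Main match"),
        padding_chunks=[
            Chunk("doc-1", 4, 0.0, "After"),
            Chunk("doc-1", 2, 0.0, "Before"),
        ],
    )
]
rows = to_display_rows(groups)
```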

## Reverse Image Search

You can search using an image instead of text by providing `query_image` with a base64-encoded image:

<Tabs>
<Tab title="Sync">
```python
import base64
from morphik import Morphik

db = Morphik()

# Load and encode your query image
with open("query_image.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Search using the image with grouped results
response = db.retrieve_chunks_grouped(
    query_image=image_b64,
    use_colpali=True,  # Required for image queries
    k=5,
    padding=1,
)

for group in response.groups:
    print(f"Main chunk score: {group.main_chunk.score}")
    print(f"Document: {group.main_chunk.document_id}")
    print("---")
```
</Tab>
<Tab title="Async">
```python
import asyncio
import base64
from morphik import AsyncMorphik


async def main():
    # `async with` and `await` must run inside a coroutine
    async with AsyncMorphik() as db:
        # Load and encode your query image
        with open("query_image.png", "rb") as f:
            image_b64 = base64.b64encode(f.read()).decode("utf-8")

        # Search using the image with grouped results
        response = await db.retrieve_chunks_grouped(
            query_image=image_b64,
            use_colpali=True,  # Required for image queries
            k=5,
            padding=1,
        )

        for group in response.groups:
            print(f"Main chunk score: {group.main_chunk.score}")
            print(f"Document: {group.main_chunk.document_id}")
            print("---")


asyncio.run(main())
print("---")
```
</Tab>
</Tabs>

<Note>
Reverse image search requires documents to be ingested with `use_colpali=True`. You must provide either `query` or `query_image`, but not both.
</Note>
2 changes: 1 addition & 1 deletion python-sdk/users.mdx
@@ -1,11 +1,11 @@
---
title: "User Management"
description: "Organize and isolate data by end user in Morphik"
---

## Overview

User scoping in Morphik allows multi-tenant applications to isolate data on a per-user basis. This ensures that in applications serving multiple users, each user can only access their own documents and data. User scoping is particularly valuable for building customer-facing applications where data privacy and separation are essential.

## Creating User Scopes

@@ -84,13 +84,13 @@

## User Scope Methods

The `UserScope` class provides the same document operations as the main Morphik client, automatically scoped to that user:

- `ingest_text` - Ingest text content for this user
- `ingest_file` - Ingest a file for this user
- `ingest_files` - Ingest multiple files for this user
- `ingest_directory` - Ingest all files from a directory for this user
- - `retrieve_chunks` - Retrieve chunks matching a query from this user's documents
+ - `retrieve_chunks` - Retrieve chunks matching a query from this user's documents (supports [reverse image search](/python-sdk/retrieve_chunks#reverse-image-search))
- `retrieve_docs` - Retrieve documents matching a query from this user's documents
- `query` - Generate a completion using context from this user's documents (supports `llm_config` parameter for custom LLM configuration)
- `list_documents` - List all documents owned by this user