4 changes: 2 additions & 2 deletions en/ai/ai-providers-and-api-keys.md
@@ -80,8 +80,8 @@ If you have some money on your credit balance, you can chat with your library!

To increase your credit balance on OpenAI, follow these steps:

1. Add payment method [here](https://platform.openai.com/settings/organization/billing/payment-methods).
2. Add credit balance on [this](https://platform.openai.com/settings/organization/billing/overview) page.
1. Add payment method [on this page](https://platform.openai.com/settings/organization/billing/payment-methods).
2. Add credit balance [on this page](https://platform.openai.com/settings/organization/billing/overview).

### Mistral AI

38 changes: 38 additions & 0 deletions en/ai/answer-engines.md
@@ -0,0 +1,38 @@
# Answer Engines

The AI chat uses Large Language Models (LLMs) to generate responses, but the models need to know the contents of your papers. At the same time, a paper might be too long to include in a single AI message. For this reason, JabRef has a notion of "Answer Engines" -- the algorithms that decide how the AI responds to a question and how it searches for the answer.

There are two answer engines available:

- "Embeddings" answer engine,
- "Full document" answer engine.

## "Embeddings" Answer Engine

This answer engine implements the classical RAG (Retrieval-Augmented Generation) approach: the paper is split into chunks, an embedding vector is generated for each chunk, and everything is stored in a database.

When a question is asked, the question is also embedded, and an embedding search finds the relevant chunks and attaches them to the AI context. As a result, the AI sees only the information from the papers that is relevant to the question.
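
A minimal sketch of the retrieval step, assuming a hypothetical `embed` stand-in for the real embedding model (JabRef's actual implementation differs):

```java
import java.util.Comparator;
import java.util.List;

// Minimal sketch of the embedding search, NOT JabRef's actual implementation.
// `embed` is a hypothetical stand-in for a real embedding model.
record Chunk(String citationKey, String text, float[] vector) {}

class EmbeddingSearchSketch {

    // Cosine similarity between two vectors of equal length.
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Return the `k` chunks whose embeddings are closest to the question's embedding.
    static List<Chunk> findRelevant(String question, List<Chunk> store, int k) {
        float[] q = embed(question);
        return store.stream()
                .sorted(Comparator.comparingDouble((Chunk c) -> cosine(q, c.vector())).reversed())
                .limit(k)
                .toList();
    }

    static float[] embed(String text) {
        throw new UnsupportedOperationException("stand-in for a real embedding model");
    }
}
```

Only the top-ranked chunks are then inserted into the prompt, so the model answers from the retrieved excerpts instead of the full papers.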

Pros of the approach:

- Handles any amount of information. Chatting with thousands of papers is possible.
- Puts only relevant information into the AI context, reducing token usage.

Cons of the approach:

- Requires ingestion and embedding of the papers, which takes time.
- Might miss relevant information, because embedding models are not perfect.

## "Full document" Answer Engine

This answer engine puts the entire paper content into the AI prompt without any preprocessing. It can be useful if a question requires a deeper understanding of the paper, or if the paper is short.

Pros of the approach:

- The AI has full knowledge of the paper content.
- Does not require any additional preprocessing, such as embedding.

Cons of the approach:

- Unable to handle many papers or long papers, because they would overflow the context window of an LLM.
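
In practice, the choice between the two engines comes down to a size check: if all paper text fits comfortably into the model's context window, the "Full document" engine works; otherwise the "Embeddings" engine is needed. A minimal sketch of that decision rule, assuming a character-based token estimate (see the token estimators described in the preferences):

```java
import java.util.List;

// Illustrative decision rule only; JabRef lets you pick the engine in the preferences.
class AnswerEngineChoiceSketch {

    // "Characters" estimation: roughly 4 characters per token (an assumption).
    static int estimateTokens(String text) {
        return text.length() / 4;
    }

    static String chooseEngine(List<String> paperTexts, int contextWindowTokens) {
        int total = paperTexts.stream().mapToInt(AnswerEngineChoiceSketch::estimateTokens).sum();
        // Leave room for the system message, the question, and the model's answer.
        return total < contextWindowTokens / 2 ? "Full document" : "Embeddings";
    }
}
```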

27 changes: 24 additions & 3 deletions en/ai/preferences.md
@@ -5,8 +5,6 @@
## General settings

* "Enable AI functionality in JabRef": by default it is turned off, so you need to check this option if you want to use the new AI features
* "Automatically generate embeddings for new entries": when this check box is switched on, for every new entry in the library, JabRef will automatically start an embeddings generation task. (If you do not know what are the embeddings, take a look at ["How does the AI functionality work?"](https://docs.jabref.org/ai#how-does-the-ai-functionality-work)).
* "Automatically generate summaries for new entries": when this check box is switched on, for every new entry in the library, JabRef will automatically generate a summary.

If you import a lot of entries at a time, we recommend switching off the options "Automatically generate embeddings for new entries" and "Automatically generate summaries for new entries", because this may slow down your computer, and you may reach the usage limit of the AI provider.

@@ -117,9 +115,10 @@ To use the templates, we employ the [Apache Velocity](https://velocity.apache.or
There are five templates that JabRef uses:

* **System Message for Chatting**: This template constructs the system message (also known as the instruction) for every AI chat in JabRef (whether chatting with an entry or with a group).
* **User Message for Chatting**: This template is also used in chats and is responsible for forming a request to AI with document embeddings. The user message created by this template is sent to AI; however, only the plain user question will be saved in the chat history.
* **User Message for Chatting**: This template is also used in chats and is responsible for forming a request to AI with the relevant information found by an answer engine. The user message created by this template is sent to AI; however, only the plain user question will be saved in the chat history.
* **Summarization Chunk**: In cases where the chat model does not have enough context window to fit the entire document in one message, our algorithm will split the document into chunks. This template is used to summarize a single chunk of a document.
* **Summarization Combine**: This template is used only when the document size exceeds the context window of a chat model. It combines the summarized chunks into one piece of text.
* **System message for 'full document' summarization**: This template is used for the "Full document" summarizer. For more information about different types of summarization algorithms, check out ["Summarization Algorithms"](https://docs.jabref.org/ai/summarization-algorithms).

You can create any template you want, but we advise starting from the default template, as it has been carefully designed and includes special syntax from Apache Velocity.

@@ -131,6 +130,28 @@ For each template, there is a context that holds all necessary variables used in
* **User Message for Chatting**: There are two variables: `message` (the user question) and `excerpts` (pieces of information found in the documents by the answer engine). Each object in `excerpts` is of type `PaperExcerpt`, which has two fields: `citationKey` and `text`. (See the example after this list.)
* **Summarization Chunk**: There is only the `text` variable, which contains the chunk.
* **Summarization Combine**: There is only the `chunks` variable, which contains a list of summarized chunks.
* **System message for 'full document' summarization**: No additional context.
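
For example, a custom "User Message for Chatting" template could combine these variables as follows (an illustrative template, not the shipped default):

```velocity
## Illustrative template, not JabRef's shipped default.
Answer the question using only the excerpts below.

Question: $message

Excerpts:
#foreach($excerpt in $excerpts)
- [${excerpt.citationKey}]: ${excerpt.text}
#end
```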

## Miscellaneous settings

* "Automatically generate embeddings for new entries": when this check box is switched on, for every new entry in the library, JabRef will automatically start an embeddings generation task. (If you do not know what are the embeddings, take a look at ["How does the AI functionality work?"](https://docs.jabref.org/ai#how-does-the-ai-functionality-work)).
* "Automatically generate summaries for new entries": when this check box is switched on, for every new entry in the library, JabRef will automatically generate a summary.
* "Generate follow-up questions after AI response" and "Number of follow-up questions": controls the generation of follow-up questions in the AI chat.
* "Default summarization algorithm": the summarization algorithm used by default. For more information about summarization algorithms, check ["Summarization algorithms"](https://docs.jabref.org/ai/summarization-algorithms).
* "Default answer engine": the answer engine used by default in AI chat. For more information about answer engines, check ["Answer Engines"](https://docs.jabref.org/ai/answer-engines).
* "Default token estimation algorithm": the token estimation algorithm used by default for various AI tasks. For more information about token estimators, check the section ["Token Estimators"](#token-estimators).

### Token Estimators

Some AI features require estimating whether a message will exceed the context window of the underlying language model. However, the exact tokenizer used by the model is not always accessible. For this reason, JabRef provides a separate option called "Token Estimator", which approximates the number of tokens in a message.

The following estimation strategies are available (an illustrative code sketch follows the list):

- **Average**: Computes the average of the Words and Characters estimations.
- **Words**: Estimates tokens based on the number of words, assuming that 0.75 words ≈ 1 token.
- **Characters**: Estimates tokens based on the number of characters, assuming that 4 characters ≈ 1 token.
- **Max**: Takes the maximum value between the Words and Characters estimations.
- **Min**: Takes the minimum value between the Words and Characters estimations.
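
A minimal sketch of these strategies, using the ratios stated above (illustrative code, not JabRef's source):

```java
// Illustrative implementations of the strategies above, not JabRef's source.
class TokenEstimators {

    // "Words": assumes 0.75 words ≈ 1 token, i.e. tokens ≈ words / 0.75.
    static int byWords(String text) {
        int words = text.isBlank() ? 0 : text.trim().split("\\s+").length;
        return (int) Math.ceil(words / 0.75);
    }

    // "Characters": assumes 4 characters ≈ 1 token.
    static int byCharacters(String text) {
        return (int) Math.ceil(text.length() / 4.0);
    }

    static int average(String text) {
        return (byWords(text) + byCharacters(text)) / 2;
    }

    static int max(String text) {
        return Math.max(byWords(text), byCharacters(text));
    }

    static int min(String text) {
        return Math.min(byWords(text), byCharacters(text));
    }
}
```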

## Further literature

43 changes: 43 additions & 0 deletions en/ai/summarization-algorithms.md
@@ -0,0 +1,43 @@
# Summarization algorithms

In JabRef, you can change the algorithm used to summarize papers. You can also customize the templates for all algorithms.

## "Chunked" Summarization Algorithm

This algorithm is designed for long papers and even books: it splits a paper into smaller chunks that do not overflow the model's context window. The algorithm works as follows (see the sketch after this list):

1. Split the file into small chunks.
2. Summarize each chunk separately.
3. Combine all of the summarized chunks into one message.
4. If this message is still too big for the LLM, repeat the process from step 1 on the combined text.
5. Otherwise, summarize the combined message once more. The result is the final summary.
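
A minimal sketch of this loop, assuming hypothetical `split`, `summarizeChunk`, `combineSummaries`, and `fitsContextWindow` helpers (not JabRef's actual code):

```java
import java.util.List;

// Sketch of the "chunked" (map-reduce) loop; the helpers below are hypothetical.
class ChunkedSummarizerSketch {

    static String summarize(String document) {
        String text = document;
        // Steps 1-4: chunk, summarize each chunk, combine; repeat while too big.
        while (!fitsContextWindow(text)) {
            List<String> summaries = split(text).stream()
                    .map(ChunkedSummarizerSketch::summarizeChunk)
                    .toList();
            text = String.join("\n\n", summaries);
        }
        // Step 5: one final call over the combined summaries yields the summary.
        return combineSummaries(text);
    }

    static boolean fitsContextWindow(String text) {
        return text.length() / 4 < 4096; // rough token estimate vs. an assumed limit
    }

    static List<String> split(String text) {
        throw new UnsupportedOperationException("hypothetical chunker");
    }

    static String summarizeChunk(String text) {
        throw new UnsupportedOperationException("LLM call with the chunk template");
    }

    static String combineSummaries(String text) {
        throw new UnsupportedOperationException("LLM call with the combine template");
    }
}
```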

Pros of the algorithm:

- It is able to handle long texts.
- All parts of the paper are considered and processed.

Cons of the algorithm:

- Takes a long time.
- Consumes a lot of tokens.

Templates that this algorithm uses:

- "System message for summarization of a chunk": prompt that is used to summarize parts of a paper.
- "System message for summarization of several chunks": prompt that is used to make a final summary out of the summarized parts of a paper.


## "Full document" Summarization Algorithm

This algorithm pushes the whole paper text into the AI context window without any preprocessing. It can be useful if the paper is not that long (so it does not need to be chunked) or if you have an LLM that is fine-tuned for summarization.

Pros of the algorithm:

- Fast (because only one message is sent).
- Uses fewer tokens than the "chunked" version.

Cons of the algorithm:

- Cannot handle papers that are longer than the LLM's context window.
- For long papers, it might overlook information in the middle of the paper (because LLMs tend to capture information at the beginning and the end better than in the middle).