14 changes: 7 additions & 7 deletions data-extract-langchain4j/README.adoc
@@ -15,7 +15,7 @@ In this example, we'll convert those text conversations into Java Objects that c
image::schema.png[]

In order to achieve this extraction, we'll need a https://en.wikipedia.org/wiki/Large_language_model[Large Language Model (LLM)] and related serving framework that natively supports https://ollama.com/blog/structured-outputs[JSON structured output].
Here, we choose https://ollama.com/library/granite4:3b-h[granite4:3b-h] served through https://ollama.com/[ollama].
Here, we choose https://ollama.com/library/granite4.1:3b[granite4.1:3b] served through https://ollama.com/[ollama].
In order to request inference to the served model, we'll use the high-level LangChain4j APIs like https://docs.langchain4j.dev/tutorials/ai-services[AiServices].

=== Start the Large Language Model
@@ -24,7 +24,7 @@ Let's start a container to serve the LLM with Ollama, in a first shell type:

[source,shell]
----
docker run --rm -it -v cqex-data-extract-ollama:/root/.ollama -p 11434:11434 --name cqex-data-extract-ollama ollama/ollama:0.19.0
docker run --rm -it -v cqex-data-extract-ollama:/root/.ollama -p 11434:11434 --name cqex-data-extract-ollama ollama/ollama:0.24.0
----

After a moment, a log like below should be output:
@@ -34,11 +34,11 @@ After a moment, a log like below should be output:
time=2026-01-07T14:28:19.092Z level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="62.2 GiB" available="62.2 GiB"
----

Then, download the granite4:3b-h model, in a second shell type:
Then, download the granite4.1:3b model, in a second shell type:

[source,shell]
----
docker exec -it cqex-data-extract-ollama ollama pull granite4:3b-h
docker exec -it cqex-data-extract-ollama ollama pull granite4.1:3b
----

After a moment, log like below should be output:
@@ -99,12 +99,12 @@ At the end, we are provided with a Plain Old Java Object (POJO) handling the ext

[source,shell]
----
2026-01-06 15:24:56,889 INFO [org.acme.extraction.CustomPojoStore] (Camel (camel-1) thread #1 - file://target/transcripts) An extracted POJO has been added to the store:
2026-05-27 18:31:55,373 INFO [org.acme.extraction.CustomPojoStore] (Camel (camel-1) thread #1 - file://target/transcripts) An extracted POJO has been added to the store:
{
"customerSatisfied": "true",
"customerName": "Sarah London",
"customerBirthday": "10 July 1986",
"summary": "The customer, Sarah London, is calling to declare an accident and seek reimbursement for related expenses."
"customerBirthday": "10 JULY 1986",
"summary": "The customer, Sarah London, called to declare an accident on her main vehicle. The operator confirmed that all expenses related to the accident will be reimbursed."
}
----

@@ -20,4 +20,4 @@ quarkus.default-locale=en_US
# Adjust as per your where your Ollama instance is running
langchain4j.ollama.base-url=http://localhost:11434
# The chat model to use
langchain4j.ollama.chat-model.model-id=granite4:3b-h
langchain4j.ollama.chat-model.model-id=granite4.1:3b
@@ -36,7 +36,7 @@ public class OllamaTestResource implements QuarkusTestResourceLifecycleManager {

private static final Logger LOG = LoggerFactory.getLogger(OllamaTestResource.class);

private static final String OLLAMA_IMAGE = "ollama/ollama:0.17.7";
private static final String OLLAMA_IMAGE = "ollama/ollama:0.24.0";
private static final int OLLAMA_SERVER_PORT = 11434;

private static final String MODE_MOCK = "mock";
@@ -1,24 +1,24 @@
{
"id" : "4400676b-1fdd-4664-bfbd-80be34aae24b",
"id" : "862490ff-9aa9-45d3-b4dc-1dc489ac6dda",
"name" : "api_chat",
"request" : {
"url" : "/api/chat",
"method" : "POST",
"bodyPatterns" : [ {
"equalToJson" : "{\n \"model\" : \"granite4:3b-h\",\n \"messages\" : [ {\n \"role\" : \"user\",\n \"content\" : \"Extract information about a customer from the transcript delimited by triple backticks: ```Operator: Hello, how may I help you ?\\nCustomer: Hello, I'm John. I need to share a problem with you. Actually, the insurance has reimbursed only half the money I have spent due to the accident.\\nOperator: Hello John, could you please give me your last name so that I can find your contract.\\nCustomer: Sure, my surname is Doe.\\nOperator: And last thing, I need to know the date you were born.\\nCustomer: Yes, so I was born in 2001, actually during the first day of November.\\nOperator: Great, I see your contract now. Actually, the full reimbursement option has been cancelled automatically by our system. This explain the half reimbursement.\\nCustomer: Ah damn, this is not acceptable. I've not even been notified about this automatic change.\\nOperator: Oh, I'm sorry to hear that but the full reimbursement option was free for one year and at the time of subscription you were not interested in automatic renewal.\\nCustomer: I don't discuss that. The important fact is that I should have been notified.\\nOperator: Sure, I understand your resentment. The best I can do is to inform my manager.\\nCustomer: OK, let's do that. Good bye.\\nOperator: Good bye. 
And again let me apologize for the issue.```.The customerName field should be formatted as FIRSTNAME LASTNAME, for instance Isaac Newton.The summary field should concisely relate the customer main ask.Source any extracted field values from what is explicitly mentioned in the transcript.Extracted field values should be as accurate as possible.\"\n } ],\n \"options\" : {\n \"temperature\" : 0.0,\n \"top_k\" : 1,\n \"top_p\" : 0.1,\n \"stop\" : [ ]\n },\n \"format\" : {\n \"type\" : \"object\",\n \"properties\" : {\n \"customerSatisfied\" : {\n \"type\" : \"boolean\"\n },\n \"customerName\" : {\n \"type\" : \"string\"\n },\n \"customerBirthday\" : {\n \"type\" : \"object\",\n \"properties\" : {\n \"year\" : {\n \"type\" : \"integer\"\n },\n \"month\" : {\n \"type\" : \"integer\"\n },\n \"day\" : {\n \"type\" : \"integer\"\n }\n },\n \"required\" : [ \"year\", \"month\", \"day\" ]\n },\n \"summary\" : {\n \"type\" : \"string\"\n }\n },\n \"required\" : [ \"customerSatisfied\", \"customerName\", \"customerBirthday\", \"summary\" ]\n},\n \"stream\" : false,\n \"tools\" : [ ]\n}",
"equalToJson" : "{\n \"model\" : \"granite4.1:3b\",\n \"messages\" : [ {\n \"role\" : \"user\",\n \"content\" : \"Extract information about a customer from the transcript delimited by triple backticks: ```Operator: Hello, how may I help you ?\\nCustomer: Hello, I'm John. I need to share a problem with you. Actually, the insurance has reimbursed only half the money I have spent due to the accident.\\nOperator: Hello John, could you please give me your last name so that I can find your contract.\\nCustomer: Sure, my surname is Doe.\\nOperator: And last thing, I need to know the date you were born.\\nCustomer: Yes, so I was born in 2001, actually during the first day of November.\\nOperator: Great, I see your contract now. Actually, the full reimbursement option has been cancelled automatically by our system. This explain the half reimbursement.\\nCustomer: Ah damn, this is not acceptable. I've not even been notified about this automatic change.\\nOperator: Oh, I'm sorry to hear that but the full reimbursement option was free for one year and at the time of subscription you were not interested in automatic renewal.\\nCustomer: I don't discuss that. The important fact is that I should have been notified.\\nOperator: Sure, I understand your resentment. The best I can do is to inform my manager.\\nCustomer: OK, let's do that. Good bye.\\nOperator: Good bye. 
And again let me apologize for the issue.```.The customerName field should be formatted as FIRSTNAME LASTNAME, for instance Isaac Newton.The summary field should concisely relate the customer main ask.Source any extracted field values from what is explicitly mentioned in the transcript.Extracted field values should be as accurate as possible.\"\n } ],\n \"options\" : {\n \"temperature\" : 0.0,\n \"top_k\" : 1,\n \"top_p\" : 0.1,\n \"stop\" : [ ]\n },\n \"format\" : {\n \"type\" : \"object\",\n \"properties\" : {\n \"customerSatisfied\" : {\n \"type\" : \"boolean\"\n },\n \"customerName\" : {\n \"type\" : \"string\"\n },\n \"customerBirthday\" : {\n \"type\" : \"object\",\n \"properties\" : {\n \"year\" : {\n \"type\" : \"integer\"\n },\n \"month\" : {\n \"type\" : \"integer\"\n },\n \"day\" : {\n \"type\" : \"integer\"\n }\n },\n \"required\" : [ \"year\", \"month\", \"day\" ]\n },\n \"summary\" : {\n \"type\" : \"string\"\n }\n },\n \"required\" : [ \"customerSatisfied\", \"customerName\", \"customerBirthday\", \"summary\" ]\n},\n \"stream\" : false,\n \"tools\" : [ ]\n}",
"ignoreArrayOrder" : true,
"ignoreExtraElements" : true
} ]
},
"response" : {
"status" : 200,
"body" : "{\"model\":\"granite4:3b-h\",\"created_at\":\"2026-04-02T08:15:31.297936294Z\",\"message\":{\"role\":\"assistant\",\"content\":\"{\\n \\\"customerSatisfied\\\": false,\\n \\\"customerName\\\": \\\"John Doe\\\",\\n \\\"customerBirthday\\\": {\\\"year\\\": 2001, \\\"month\\\": 11, \\\"day\\\": 1},\\n \\\"summary\\\": \\\"John Doe, the customer, is dissatisfied with the insurance company as the full reimbursement option was cancelled automatically by their system and he was not notified. He requested to inform the manager about this issue.\\\"\\n}\"},\"done\":true,\"done_reason\":\"stop\",\"total_duration\":13730120429,\"load_duration\":46627949,\"prompt_eval_count\":363,\"prompt_eval_duration\":7507657361,\"eval_count\":93,\"eval_duration\":5022486287}",
"body" : "{\"model\":\"granite4.1:3b\",\"created_at\":\"2026-05-27T17:05:54.559691677Z\",\"message\":{\"role\":\"assistant\",\"content\":\"{\\n \\\"customerSatisfied\\\": false,\\n \\\"customerName\\\": \\\"John Doe\\\",\\n \\\"customerBirthday\\\": {\\\"year\\\": 2001, \\\"month\\\": 11, \\\"day\\\": 1},\\n \\\"summary\\\": \\\"Customer John Doe is upset because the insurance only reimbursed half of his expenses due to an accident. He claims he was never notified about the automatic cancellation of the full reimbursement option that was free for one year at the time of subscription.\\\"\\n}\"},\"done\":true,\"done_reason\":\"stop\",\"total_duration\":14490804900,\"load_duration\":44741877,\"prompt_eval_count\":341,\"prompt_eval_duration\":7048672674,\"eval_count\":102,\"eval_duration\":5948503371}",
"headers" : {
"Date" : "Thu, 02 Apr 2026 08:15:31 GMT",
"Date" : "Wed, 27 May 2026 17:05:54 GMT",
"Content-Type" : "application/json; charset=utf-8"
}
},
"uuid" : "4400676b-1fdd-4664-bfbd-80be34aae24b",
"uuid" : "862490ff-9aa9-45d3-b4dc-1dc489ac6dda",
"persistent" : true,
"insertionIndex" : 5
}
@@ -1,24 +1,24 @@
{
"id" : "173e3a1f-b194-46f7-a510-c1d497f98986",
"id" : "bc242ba3-eb22-4c60-9bab-22fabf382956",
"name" : "api_chat",
"request" : {
"url" : "/api/chat",
"method" : "POST",
"bodyPatterns" : [ {
"equalToJson" : "{\n \"model\" : \"granite4:3b-h\",\n \"messages\" : [ {\n \"role\" : \"user\",\n \"content\" : \"Extract information about a customer from the transcript delimited by triple backticks: ```Operator: Hello, how may I help you ?\\nCustomer: Hello, I'm calling because I need to declare an accident on my main vehicle.\\nOperator: Ok, can you please give me your name ?\\nCustomer: My name is Sarah London.\\nOperator: Could you please give me your birth date ?\\nCustomer: 1986, July the 10th.\\nOperator: Ok, I've got your contract and I'm happy to share with you that we'll be able to reimburse all expenses linked to this accident.\\nCustomer: Oh great, many thanks.```.The customerName field should be formatted as FIRSTNAME LASTNAME, for instance Isaac Newton.The summary field should concisely relate the customer main ask.Source any extracted field values from what is explicitly mentioned in the transcript.Extracted field values should be as accurate as possible.\"\n } ],\n \"options\" : {\n \"temperature\" : 0.0,\n \"top_k\" : 1,\n \"top_p\" : 0.1,\n \"stop\" : [ ]\n },\n \"format\" : {\n \"type\" : \"object\",\n \"properties\" : {\n \"customerSatisfied\" : {\n \"type\" : \"boolean\"\n },\n \"customerName\" : {\n \"type\" : \"string\"\n },\n \"customerBirthday\" : {\n \"type\" : \"object\",\n \"properties\" : {\n \"year\" : {\n \"type\" : \"integer\"\n },\n \"month\" : {\n \"type\" : \"integer\"\n },\n \"day\" : {\n \"type\" : \"integer\"\n }\n },\n \"required\" : [ \"year\", \"month\", \"day\" ]\n },\n \"summary\" : {\n \"type\" : \"string\"\n }\n },\n \"required\" : [ \"customerSatisfied\", \"customerName\", \"customerBirthday\", \"summary\" ]\n},\n \"stream\" : false,\n \"tools\" : [ ]\n}",
"equalToJson" : "{\n \"model\" : \"granite4.1:3b\",\n \"messages\" : [ {\n \"role\" : \"user\",\n \"content\" : \"Extract information about a customer from the transcript delimited by triple backticks: ```Operator: Hello, how may I help you ?\\nCustomer: Hello, I'm calling because I need to declare an accident on my main vehicle.\\nOperator: Ok, can you please give me your name ?\\nCustomer: My name is Sarah London.\\nOperator: Could you please give me your birth date ?\\nCustomer: 1986, July the 10th.\\nOperator: Ok, I've got your contract and I'm happy to share with you that we'll be able to reimburse all expenses linked to this accident.\\nCustomer: Oh great, many thanks.```.The customerName field should be formatted as FIRSTNAME LASTNAME, for instance Isaac Newton.The summary field should concisely relate the customer main ask.Source any extracted field values from what is explicitly mentioned in the transcript.Extracted field values should be as accurate as possible.\"\n } ],\n \"options\" : {\n \"temperature\" : 0.0,\n \"top_k\" : 1,\n \"top_p\" : 0.1,\n \"stop\" : [ ]\n },\n \"format\" : {\n \"type\" : \"object\",\n \"properties\" : {\n \"customerSatisfied\" : {\n \"type\" : \"boolean\"\n },\n \"customerName\" : {\n \"type\" : \"string\"\n },\n \"customerBirthday\" : {\n \"type\" : \"object\",\n \"properties\" : {\n \"year\" : {\n \"type\" : \"integer\"\n },\n \"month\" : {\n \"type\" : \"integer\"\n },\n \"day\" : {\n \"type\" : \"integer\"\n }\n },\n \"required\" : [ \"year\", \"month\", \"day\" ]\n },\n \"summary\" : {\n \"type\" : \"string\"\n }\n },\n \"required\" : [ \"customerSatisfied\", \"customerName\", \"customerBirthday\", \"summary\" ]\n},\n \"stream\" : false,\n \"tools\" : [ ]\n}",
"ignoreArrayOrder" : true,
"ignoreExtraElements" : true
} ]
},
"response" : {
"status" : 200,
"body" : "{\"model\":\"granite4:3b-h\",\"created_at\":\"2026-04-02T08:15:17.527674637Z\",\"message\":{\"role\":\"assistant\",\"content\":\"{\\n \\\"customerSatisfied\\\": true,\\n \\\"customerName\\\": \\\"Sarah London\\\",\\n \\\"customerBirthday\\\": {\\n \\\"year\\\": 1986,\\n \\\"month\\\": 7,\\n \\\"day\\\": 10\\n },\\n \\\"summary\\\": \\\"The customer, Sarah London, needed to declare an accident on her main vehicle and was informed that all expenses linked to the accident would be reimbursed.\\\"\\n}\"},\"done\":true,\"done_reason\":\"stop\",\"total_duration\":11199370161,\"load_duration\":1209262913,\"prompt_eval_count\":211,\"prompt_eval_duration\":4245491703,\"eval_count\":87,\"eval_duration\":4676863156}",
"body" : "{\"model\":\"granite4.1:3b\",\"created_at\":\"2026-05-27T17:05:40.026864606Z\",\"message\":{\"role\":\"assistant\",\"content\":\"{\\n \\\"customerSatisfied\\\": true,\\n \\\"customerName\\\": \\\"Sarah London\\\",\\n \\\"customerBirthday\\\": {\\n \\\"year\\\": 1986,\\n \\\"month\\\": 7,\\n \\\"day\\\": 10\\n },\\n \\\"summary\\\": \\\"The customer, Sarah London, called to declare an accident on her main vehicle. The operator confirmed that all expenses related to the accident will be reimbursed.\\\"\\n}\"},\"done\":true,\"done_reason\":\"stop\",\"total_duration\":11120222180,\"load_duration\":1206826475,\"prompt_eval_count\":189,\"prompt_eval_duration\":3962041623,\"eval_count\":87,\"eval_duration\":4859237798}",
"headers" : {
"Date" : "Thu, 02 Apr 2026 08:15:17 GMT",
"Date" : "Wed, 27 May 2026 17:05:40 GMT",
"Content-Type" : "application/json; charset=utf-8"
}
},
"uuid" : "173e3a1f-b194-46f7-a510-c1d497f98986",
"uuid" : "bc242ba3-eb22-4c60-9bab-22fabf382956",
"persistent" : true,
"insertionIndex" : 6
}
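
Reviewer note: the `format` schema in both mappings describes an object with a boolean `customerSatisfied`, string `customerName` and `summary`, and a nested `customerBirthday` made of integer `year`, `month` and `day`. Below is a minimal Java sketch of a type with that shape. The names `ExtractedCustomerSketch`, `ExtractedCustomer`, `Birthday` and `render` are hypothetical and are not taken from this change; the example's real POJO and its formatting code are not shown in the diff. The sketch only illustrates that `java.time.Month#toString()` returns the upper-case enum name, which would be consistent with the `10 JULY 1986` value in the updated README log, though that link is an assumption.

```java
import java.time.LocalDate;

public class ExtractedCustomerSketch {

    // Hypothetical type: mirrors the nested "customerBirthday" object of the schema.
    record Birthday(int year, int month, int day) {
        LocalDate toLocalDate() {
            return LocalDate.of(year, month, day);
        }
    }

    // Hypothetical type: mirrors the four required top-level fields of the schema.
    record ExtractedCustomer(boolean customerSatisfied, String customerName,
            Birthday customerBirthday, String summary) {
    }

    // Renders "day MONTH year"; Month#toString() yields the enum name, e.g. JULY.
    static String render(Birthday birthday) {
        LocalDate date = birthday.toLocalDate();
        return date.getDayOfMonth() + " " + date.getMonth() + " " + date.getYear();
    }

    public static void main(String[] args) {
        // Values copied from the mocked response for the Sarah London transcript.
        ExtractedCustomer customer = new ExtractedCustomer(
                true,
                "Sarah London",
                new Birthday(1986, 7, 10),
                "The customer, Sarah London, called to declare an accident on her main vehicle. "
                        + "The operator confirmed that all expenses related to the accident will be reimbursed.");

        System.out.println(render(customer.customerBirthday())); // prints "10 JULY 1986"
    }
}
```

If the sample does rely on a `LocalDate`-like type, the casing change in the README log would come from how the date is printed rather than from the model swap, which may be worth confirming before merging.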