The Java OCR SDK supports the [Invoice Splitter API](https://platform.mindee.com/mindee/invoice_splitter).
8
8
9
-
Using [this sample](https://github.com/mindee/client-lib-test-data/blob/main/products/invoice_splitter/default_sample.pdf), we are going to illustrate how to detect the pages of multiple invoices within the same document.
9
+
Using the [sample below](https://github.com/mindee/client-lib-test-data/blob/main/products/invoice_splitter/default_sample.pdf), we are going to illustrate how to extract the data that we want using the OCR SDK.
@@ -55,69 +50,82 @@ public class SimpleMindeeClient {
55
50
// page -> System.out.println(page.toString())
56
51
// );
57
52
}
53
+
58
54
}
55
+
59
56
```
60
57
61
58
**Output (RST):**
62
-
63
59
```rst
64
60
########
65
61
Document
66
62
########
67
-
:Mindee ID: 8c25cc63-212b-4537-9c9b-3fbd3bd0ee20
68
-
:Filename: default_sample.jpg
63
+
:Mindee ID: 15ad7a19-7b75-43d0-b0c6-9a641a12b49b
64
+
:Filename: default_sample.pdf
69
65
70
66
Inference
71
67
#########
72
-
:Product: mindee/carte_vitale v1.0
73
-
:Rotation applied: Yes
68
+
:Product: mindee/invoice_splitter v1.1
69
+
:Rotation applied: No
74
70
75
71
Prediction
76
72
==========
77
-
:Given Name(s): NATHALIE
78
-
:Surname: DURAND
79
-
:Social Security Number: 269054958815780
80
-
:Issuance Date: 2007-01-01
73
+
:Invoice Page Groups:
74
+
:Page indexes: 0
75
+
:Page indexes: 1
81
76
82
77
Page Predictions
83
78
================
84
79
85
80
Page 0
86
81
------
87
-
:Given Name(s): NATHALIE
88
-
:Surname: DURAND
89
-
:Social Security Number: 269054958815780
90
-
:Issuance Date: 2007-01-01
82
+
:Invoice Page Groups:
83
+
84
+
Page 1
85
+
------
86
+
:Invoice Page Groups:
91
87
```
92
88
93
89
# Field Types
90
+
## Standard Fields
91
+
These fields are generic and used in several products.
94
92
95
-
## Specific Fields
93
+
### BaseField
94
+
Each prediction object contains a set of fields that inherit from the generic `BaseField` class.
95
+
A typical `BaseField` object will have the following attributes:
96
96
97
-
### Page Indexes
97
+
* **confidence** (`Double`): the confidence score of the field prediction.
98
+
* **boundingBox** (`Polygon`): contains exactly 4 relative vertex (point) coordinates of a right rectangle containing the field in the document.
99
+
* **polygon** (`Polygon`): contains the relative vertex coordinates (`Polygon` extends `List<Point>`) of a polygon containing the field in the image.
100
+
* **pageId** (`Integer`): the ID of the page, always `null` when at document-level.
98
101
99
-
List of page group indexes.
102
+
> **Note:** A `Point` is simply a `List` of `Double` values.
100
103
101
-
A `PageIndexes` implements the following attributes:
102
104
103
-
- **pageIndexes** (`List<Integer>`): List of indexes of the pages of a single invoice.
104
-
- **confidence** (`Double`): The confidence of the prediction.
105
+
Aside from the previous attributes, all basic fields have access to a custom `toString` method that can be used to print their value as a string.
105
106
106
-
# Attributes
107
+
## Specific Fields
108
+
Fields which are specific to this product; they are not used in any other product.
109
+
110
+
### Invoice Page Groups Field
111
+
List of page groups. Each group represents a single invoice within a multi-invoice document.
112
+
113
+
An `InvoiceSplitterV1InvoicePageGroup` implements the following attributes:
107
114
115
+
* **pageIndexes** (`List<Integer>`): List of page indexes that belong to the same invoice (group).
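
To make the shape of this field concrete, here is a minimal sketch of how a list of page groups could be turned into one line per invoice. It does not call the Mindee SDK: `PageGroup` is a hypothetical stand-in that only mirrors the documented `pageIndexes` attribute, and `describe` is an illustrative helper, not part of the library.

```java
import java.util.ArrayList;
import java.util.List;

public class PageGroupSketch {

    // Hypothetical stand-in for InvoiceSplitterV1InvoicePageGroup.
    // It models only the documented pageIndexes attribute.
    static final class PageGroup {
        private final List<Integer> pageIndexes;

        PageGroup(List<Integer> pageIndexes) {
            this.pageIndexes = pageIndexes;
        }

        List<Integer> getPageIndexes() {
            return pageIndexes;
        }
    }

    // Illustrative helper: one human-readable line per group (one group = one invoice).
    static List<String> describe(List<PageGroup> groups) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < groups.size(); i++) {
            lines.add("Invoice " + (i + 1) + ": pages " + groups.get(i).getPageIndexes());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Same grouping as the sample output: page 0 and page 1 are separate invoices.
        List<PageGroup> groups = List.of(
            new PageGroup(List.of(0)),
            new PageGroup(List.of(1))
        );
        describe(groups).forEach(System.out::println);
    }
}
```

With the real SDK, the list passed to a helper like `describe` would presumably come from `getInvoicePageGroups()` on the prediction, and the page indexes of each group could then be used to cut the source PDF into one file per invoice.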
116
+
117
+
# Attributes
108
118
The following fields are extracted for Invoice Splitter V1:
109
119
110
120
## Invoice Page Groups
111
-
112
-
**invoicePageGroups** (`List<`[invoicePageGroups](#page-indexes)`>`): List of page indexes that belong to the same invoice in the PDF.
121
+
**invoicePageGroups** (`List<`[InvoiceSplitterV1InvoicePageGroup](#invoice-page-groups-field)`>`): List of page groups. Each group represents a single invoice within a multi-invoice document.
113
122
114
123
```java
115
124
for (InvoiceSplitterV1InvoicePageGroup invoicePageGroupsElem : result.getDocument().getInference().getPrediction().getInvoicePageGroups())