reference.md: 6 additions & 18 deletions
@@ -1615,7 +1615,7 @@ client.rerank(
 ],
 query="What is the capital of the United States?",
 top_n=3,
-model="rerank-v3.5",
+model="rerank-v4.0-pro",
 )

```
@@ -2492,10 +2492,7 @@ If tool_choice isn't specified, then the model is free to choose whether to use
 <dl>
 <dd>

-**priority:**`typing.Optional[int]`
-
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+**priority:**`typing.Optional[int]` — Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 </dd>
 </dl>
@@ -2793,10 +2790,7 @@ If tool_choice isn't specified, then the model is free to choose whether to use
 <dl>
 <dd>

-**priority:**`typing.Optional[int]`
-
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+**priority:**`typing.Optional[int]` — Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 </dd>
 </dl>
@@ -2972,10 +2966,7 @@ If `NONE` is selected, when the input exceeds the maximum input token length an
 <dl>
 <dd>

-**priority:**`typing.Optional[int]`
-
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+**priority:**`typing.Optional[int]` — Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 </dd>
 </dl>
@@ -3038,7 +3029,7 @@ client.v2.rerank(
 ],
 query="What is the capital of the United States?",
 top_n=3,
-model="rerank-v3.5",
+model="rerank-v4.0-pro",
 )

```
@@ -3102,10 +3093,7 @@ For optimal performance we recommend against sending more than 1,000 documents i
 <dl>
 <dd>

-**priority:**`typing.Optional[int]`
-
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+**priority:**`typing.Optional[int]` — Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.
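
Note on the reworded `priority` text: the semantics it describes (lower number means higher priority, default 0, higher-priority requests handled first and dropped last under load) can be illustrated with a small local sketch. Everything below is hypothetical and only models the wording of the docstring; `PriorityScheduler`, `capacity`, and the request names are assumptions made for illustration and say nothing about how the service actually schedules requests.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional


@dataclass(order=True)
class QueuedRequest:
    priority: int                     # lower number = higher priority
    seq: int                          # arrival order, breaks ties
    name: str = field(compare=False)  # label only, ignored when comparing


class PriorityScheduler:
    """Hypothetical model of the documented `priority` behaviour."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._heap: List[QueuedRequest] = []
        self._seq = count()

    def submit(self, name: str, priority: int = 0) -> Optional[str]:
        """Queue a request; return the name of a dropped request, if any."""
        heapq.heappush(self._heap, QueuedRequest(priority, next(self._seq), name))
        if len(self._heap) <= self.capacity:
            return None
        # Under load: drop the request with the largest priority number
        # (the lowest priority), preferring the most recent arrival on ties.
        victim = max(self._heap)
        self._heap.remove(victim)
        heapq.heapify(self._heap)
        return victim.name

    def drain(self) -> List[str]:
        """Return queued request names in the order they would be handled."""
        order = []
        while self._heap:
            order.append(heapq.heappop(self._heap).name)
        return order


scheduler = PriorityScheduler(capacity=3)
scheduler.submit("batch", priority=5)
scheduler.submit("interactive")            # default priority 0, the highest
scheduler.submit("report", priority=2)
dropped = scheduler.submit("bulk", priority=9)
handled = scheduler.drain()
```

In this sketch the over-capacity submission causes the request with the largest priority number to be dropped, and the remaining requests drain from priority 0 upward, which matches the reading "processed first and least likely to be dropped".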
src/cohere/v2/client.py: 10 additions & 18 deletions
@@ -160,8 +160,7 @@ def chat_stream(
 thinking : typing.Optional[Thinking]

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -331,8 +330,7 @@ def chat(
 thinking : typing.Optional[Thinking]

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -451,8 +449,7 @@ def embed(
 If `NONE` is selected, when the input exceeds the maximum input token length an error will be returned.

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -529,8 +526,7 @@ def rerank(
 Defaults to `4096`. Long documents will be automatically truncated to the specified number of tokens.

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -558,7 +554,7 @@ def rerank(
 ],
 query="What is the capital of the United States?",
 top_n=3,
-model="rerank-v3.5",
+model="rerank-v4.0-pro",
 )
 """
 _response=self._raw_client.rerank(
@@ -704,8 +700,7 @@ async def chat_stream(
 thinking : typing.Optional[Thinking]

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -884,8 +879,7 @@ async def chat(
 thinking : typing.Optional[Thinking]

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -1012,8 +1006,7 @@ async def embed(
 If `NONE` is selected, when the input exceeds the maximum input token length an error will be returned.

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -1098,8 +1091,7 @@ async def rerank(
 Defaults to `4096`. Long documents will be automatically truncated to the specified number of tokens.

 priority : typing.Optional[int]
-The priority of the request (lower means earlier handling; default 0 highest priority).
-Higher priority requests are handled first, and dropped last when the system is under load.
+Controls how early the request is handled. Lower numbers indicate higher priority (default: 0, the highest). When the system is under load, higher-priority requests are processed first and are the least likely to be dropped.

 request_options : typing.Optional[RequestOptions]
 Request-specific configuration.
@@ -1132,7 +1124,7 @@ async def main() -> None:
 ],
 query="What is the capital of the United States?",