Description
Motivation
Recently, I migrated several model serving services from vllm/sglang to lmdeploy. Along the way, I found that lmdeploy's turbomind engine performs significantly better on V100 (Volta) GPUs - thank you for your excellent work.
I encountered one issue: sglang is quite lenient with model names in requests, whereas lmdeploy is very strict and doesn’t support multiple model names (vllm allows configuring multiple aliases).
In my case, some lazy downstream clients interchangeably use names like Qwen3-32B or qwen3-32b to call the model, which results in a 404 error for the latter.
I would like lmdeploy to support multiple aliases for a model, or at least allow a default alias pointing to the primary model.
For example:
lmdeploy serve api_server ./Qwen3-32B-gptqmodel-4bit --model-name Qwen3-32B qwen3-32b default
As a quick and temporary fix, I have implemented a simple patch as shown below.
--- a/data/miniforge3/envs/lmdeploy_0.11.x/lib/python3.12/site-packages/lmdeploy/serve/openai/api_server.py.orig
+++ b/data/miniforge3/envs/lmdeploy_0.11.x/lib/python3.12/site-packages/lmdeploy/serve/openai/api_server.py
@@ -131,8 +131,24 @@ def create_error_response(status: HTTPStatus, message: str, error_type='invalid_
 def check_request(request) -> Optional[JSONResponse]:
     """Check if a request is valid."""
-    if hasattr(request, 'model') and request.model not in get_model_list():
-        return create_error_response(HTTPStatus.NOT_FOUND, f'The model {request.model!r} does not exist.')
+    # if hasattr(request, 'model') and request.model not in get_model_list():
+    #     return create_error_response(HTTPStatus.NOT_FOUND, f'The model {request.model!r} does not exist.')
+    if hasattr(request, 'model') and request.model and isinstance(request.model, str):
+        available = get_model_list()
+        req_model = request.model
+        # Support "default" -> resolve to the first served model
+        if req_model.lower() == 'default':
+            request.model = available[0]
+        else:
+            # Simple loop match: exact match first, then case-insensitive
+            for m in available:
+                if req_model == m:
+                    break
+                if req_model.lower() == m.lower():
+                    request.model = m
+                    break
+            else:
+                return create_error_response(HTTPStatus.NOT_FOUND, f'The model {request.model!r} does not exist.')
     # Import the appropriate check function based on request type
     if isinstance(request, ChatCompletionRequest):
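For reference, here is a minimal client-side sketch (not part of the patch) of how the patched behavior can be checked against a locally running api_server. It assumes the server is reachable on the default port 23333 and that the model is served under the name Qwen3-32B; adjust base_url, api_key and the names to match your deployment. With the patch applied, the exact name, a lower-cased alias, and "default" should all resolve instead of returning a 404.

# Sketch only: verify alias resolution against a local lmdeploy api_server.
# Assumes base_url http://localhost:23333/v1 and served model name "Qwen3-32B".
from openai import OpenAI

client = OpenAI(base_url='http://localhost:23333/v1', api_key='none')

for name in ('Qwen3-32B', 'qwen3-32b', 'default'):
    resp = client.chat.completions.create(
        model=name,
        messages=[{'role': 'user', 'content': 'ping'}],
        max_tokens=8,
    )
    # Before the patch, only the exact name succeeds; the other two return 404.
    print(name, '->', resp.model)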
Related resources
No response
Additional context
No response