Environment
- httpxthrottlecache version: 0.1.6+
- Python version: 3.12.11 (also affects earlier versions)
- Operating System: Windows 11 (affects all platforms with non-English locales)
- System Locale: Chinese (zh_CN), but affects any non-English locale
Description
The library uses time.strptime() to parse HTTP Last-Modified headers, which is locale-dependent. When the system locale is set to a non-English language (Chinese, German, French, etc.), date parsing fails with a ValueError because strptime() works with weekday and month names in the local language, while HTTP headers always use English names.
Root Cause
In httpxthrottlecache/filecache/transport.py, the code uses:
time.strptime(last_modified, "%a, %d %b %Y %H:%M:%S GMT")
The time.strptime() function is locale-dependent: the %a and %b directives match the weekday and month abbreviations of the current LC_TIME locale. With a Chinese locale, the date string in the resulting error is '周五, 10 10月 2025 11:57:10 GMT' (Friday and October in Chinese) instead of 'Fri, 10 Oct 2025 11:57:10 GMT'.
Impact
This is a critical bug that makes httpxthrottlecache (and any library depending on it) completely unusable for users with non-English system locales. The error message is also misleading, making it difficult for users to diagnose the real problem.
Error Message
ValueError: time data '周五, 10 10月 2025 11:57:10 GMT' does not match format '%a, %d %b %Y %H:%M:%S GMT'
Reproduction
Minimal Test Case
import locale
import time

# Simulate Chinese locale (or any non-English locale)
try:
    locale.setlocale(locale.LC_TIME, 'zh_CN.UTF-8')  # Linux/macOS
    # locale.setlocale(locale.LC_TIME, 'Chinese_China.936')  # Windows
except locale.Error:
    print("Chinese locale not available, but the same issue affects any non-English locale")

# This will fail with Chinese locale
date_string = "Fri, 10 Oct 2025 11:57:10 GMT"
try:
    parsed = time.strptime(date_string, "%a, %d %b %Y %H:%M:%S GMT")
    print(f"SUCCESS: {parsed}")
except ValueError as e:
    print(f"FAILED: {e}")

Real-World Impact
This issue was reported by a user of EdgarTools (which depends on httpxthrottlecache) who couldn't use the library at all on their Chinese Windows system. See: dgunning/edgartools#457
Proposed Solutions
Option 1: Use email.utils.parsedate_to_datetime() (Recommended)
This standard library function parses RFC 5322-style date headers, which covers the HTTP date format, and is locale-independent:
from email.utils import parsedate_to_datetime

# Instead of:
# parsed_time = time.strptime(last_modified, "%a, %d %b %Y %H:%M:%S GMT")
# Use:
parsed_datetime = parsedate_to_datetime(last_modified)

Advantages:
- Locale-independent
- Follows the RFC 5322 (formerly RFC 2822) date syntax, of which the preferred HTTP date format is a subset
- Returns a datetime object directly
- Handles multiple date formats automatically
- Part of Python standard library
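As a minimal sketch of Option 1, the helper below turns a Last-Modified value into a POSIX timestamp. The function name last_modified_to_timestamp and the decision to return None for unparseable input are assumptions made for illustration; they are not taken from the library's code.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def last_modified_to_timestamp(last_modified: str) -> Optional[float]:
    """Hypothetical helper: parse an HTTP date header without consulting the locale."""
    try:
        parsed = parsedate_to_datetime(last_modified)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError on bad input, newer ones ValueError.
        return None
    if parsed.tzinfo is None:
        # A "-0000" offset yields a naive datetime; HTTP dates are expressed in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.timestamp()


print(last_modified_to_timestamp("Fri, 10 Oct 2025 11:57:10 GMT"))
```

Because no locale lookup is involved, the result should be the same whatever LC_TIME is set to.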
Option 2: Force C Locale Temporarily
import locale
import time
from contextlib import contextmanager

@contextmanager
def c_locale():
    """Temporarily set LC_TIME to C for locale-independent parsing"""
    old_locale = locale.setlocale(locale.LC_TIME)
    try:
        locale.setlocale(locale.LC_TIME, 'C')
        yield
    finally:
        locale.setlocale(locale.LC_TIME, old_locale)

# Usage:
with c_locale():
    parsed_time = time.strptime(last_modified, "%a, %d %b %Y %H:%M:%S GMT")

Advantages:
- Minimal code change
- Keeps existing logic
Disadvantages:
- More complex than Option 1
- Thread-safety concerns (locale is process-wide)
Option 3: Use datetime.strptime() with Locale Context
Similar to Option 2 but using datetime.strptime() instead of time.strptime().
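A rough sketch of what Option 3 might look like follows. The names c_time_locale, parse_http_date, and _LOCALE_LOCK are hypothetical. The lock is an assumption added to soften the thread-safety concern: it only serializes callers of this helper and does not protect other threads that read or change the locale themselves.

```python
import locale
import threading
from contextlib import contextmanager
from datetime import datetime, timezone

# Hypothetical module-level lock: it only serializes callers of this helper,
# not other threads that read or change the locale on their own.
_LOCALE_LOCK = threading.Lock()


@contextmanager
def c_time_locale():
    """Temporarily switch LC_TIME to 'C', restoring the previous value afterwards."""
    with _LOCALE_LOCK:
        previous = locale.setlocale(locale.LC_TIME)
        locale.setlocale(locale.LC_TIME, "C")
        try:
            yield
        finally:
            locale.setlocale(locale.LC_TIME, previous)


def parse_http_date(value: str) -> datetime:
    """Parse an HTTP date with datetime.strptime() under the C locale."""
    with c_time_locale():
        naive = datetime.strptime(value, "%a, %d %b %Y %H:%M:%S GMT")
    # The literal "GMT" in the format carries no timezone info, so attach UTC explicitly.
    return naive.replace(tzinfo=timezone.utc)


print(parse_http_date("Fri, 10 Oct 2025 11:57:10 GMT"))
```

Even with the lock, the locale stays process-wide state, which is why this remains more fragile than Option 1.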
Recommended Fix
Option 1 (email.utils.parsedate_to_datetime()) is the best solution because:
- It's designed for parsing RFC 5322 date headers, the syntax the preferred HTTP date format conforms to
- It's locale-independent by design
- It's simpler and more maintainable
- It's part of the standard library
- It handles edge cases better
Additional Context
The preferred HTTP date format (IMF-fixdate, defined in RFC 9110) is a fixed-length subset of the RFC 5322 (formerly RFC 2822) date format, which always uses English month and day names regardless of locale. Using locale-dependent parsing for HTTP headers is therefore incorrect and causes failures on systems with non-English locales.
This affects any user with a non-English system locale, including:
- Chinese (zh_CN, zh_TW)
- Japanese (ja_JP)
- Korean (ko_KR)
- German (de_DE)
- French (fr_FR)
- Spanish (es_ES)
- And many others
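One way to guard against regressions is to check that parsing still succeeds when LC_TIME is switched to some of the locales listed above. The sketch below assumes glibc-style locale names with a ".UTF-8" suffix (Windows uses different names), and the helper parses_under_locale is hypothetical. Locales that are not installed are reported as None rather than treated as failures.

```python
import locale
from email.utils import parsedate_to_datetime
from typing import Optional

HTTP_DATE = "Fri, 10 Oct 2025 11:57:10 GMT"


def parses_under_locale(locale_name: str) -> Optional[bool]:
    """Return True/False for parse success, or None if the locale is not installed."""
    previous = locale.setlocale(locale.LC_TIME)
    try:
        locale.setlocale(locale.LC_TIME, locale_name)
    except locale.Error:
        return None
    try:
        parsed = parsedate_to_datetime(HTTP_DATE)
        return (parsed.year, parsed.month, parsed.day) == (2025, 10, 10)
    except (TypeError, ValueError):
        return False
    finally:
        # Always restore the caller's LC_TIME setting.
        locale.setlocale(locale.LC_TIME, previous)


for name in ("zh_CN.UTF-8", "de_DE.UTF-8", "fr_FR.UTF-8"):
    print(name, parses_under_locale(name))
```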