39 changes: 0 additions & 39 deletions docker-compose.yml
@@ -10,15 +10,9 @@ services:
# Inside a container, localhost resolves to the container's own address on the Docker network
- SPRING_DATA_MONGODB_URI=mongodb://mongo:27017/llv_api_local
- SPRING_DATA_REDIS_HOST=redis
# Redlock enabled by default
- WORD_SINGLE_FLIGHT_REDLOCK_ENABLED=true
- WORD_SINGLE_FLIGHT_REDLOCK_NODE_ADDRESSES=redis://redis-a:6379,redis://redis-b:6379,redis://redis-c:6379
depends_on:
- mongo
- redis
- redis-a
- redis-b
- redis-c
volumes:
- ./logs:/app/logs
mongo:
@@ -40,39 +34,6 @@ services:
- redis_data:/data
restart: unless-stopped

redis-a:
image: redis:7-alpine
container_name: llv-redis-a
ports:
- "6380:6379"
command: ["redis-server", "--appendonly", "yes"]
volumes:
- redis_a_data:/data
restart: unless-stopped

redis-b:
image: redis:7-alpine
container_name: llv-redis-b
ports:
- "6381:6379"
command: ["redis-server", "--appendonly", "yes"]
volumes:
- redis_b_data:/data
restart: unless-stopped

redis-c:
image: redis:7-alpine
container_name: llv-redis-c
ports:
- "6382:6379"
command: ["redis-server", "--appendonly", "yes"]
volumes:
- redis_c_data:/data
restart: unless-stopped

volumes:
mongo_data:
redis_data:
redis_a_data:
redis_b_data:
redis_c_data:
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Word Single-Flight Distributed Stabilization Report (Redlock Applied)
# Word Single-Flight Distributed Stabilization Report (RLock Standardization)

## Problem

@@ -7,22 +7,22 @@

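## Decision

The single-flight coordination path was switched to Redisson, and the Redlock path was adopted as the default.
In addition, fail-safe behavior was added so that the application starts on the single-lock fallback path when the node configuration is invalid.
The single-flight coordination path was switched to Redisson, and `RLock + watchdog` was settled on as the operational standard.
Follower error handling and lock-expiry semantics were also corrected to reduce the chance of amplifying short-lived failures.

## Rationale

The priorities were reducing operational risk in the short term and ensuring safe startup.
The AWS Bedrock synchronous inference APIs (Converse/InvokeModel) have no idempotency key, which made it difficult to delegate call-level deduplication to the platform.
A `fencing token` based model requires additional verification points in the store/downstream and a design that guarantees token monotonicity, so it was excluded from the immediate scope.
Moreover, the core goal here is to reduce the number of AI requests, and while a fencing token is effective at preventing stale writes, it does not block duplicate AI calls themselves.
Accordingly, the first-phase measures focus on mitigating duplicate calls and securing fail-safe behavior, and strict consistency requirements are split off as a follow-up task.
Accordingly, the first-phase measures focus on mitigating duplicate calls with the combination `RLock + watchdog + result cache/idempotency key`, and strict consistency requirements are split off as a follow-up task.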
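
Because Bedrock offers no idempotency key, the application has to derive its own key for "the same request". The following is a minimal sketch of one way to do that, assuming the key is a SHA-256 digest over the model, prompt version, schema version, and a normalized word. The class name, the field order, and the `word:sf:` key prefixes are hypothetical and are not taken from this change.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Locale;

// Illustrative only: not the coordinator's actual key derivation.
class SingleFlightKeySketch {

    // Builds a stable digest from the inputs assumed to define "the same request".
    static String digest(String model, String promptVersion, String schemaVersion, String word) {
        // Normalize so that " Apple " and "apple" map to the same key.
        String normalized = word.trim().toLowerCase(Locale.ROOT);
        String material = String.join("|", model, promptVersion, schemaVersion, normalized);
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] hash = sha256.digest(material.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(hash);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical key layout: one key for the leader lock, one for the cached result.
    static String lockKey(String digest) {
        return "word:sf:lock:" + digest;
    }

    static String resultKey(String digest) {
        return "word:sf:result:" + digest;
    }
}
```

Including the prompt and schema versions in the digest means a prompt change naturally invalidates cached results instead of serving output produced under the old prompt.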

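## Verification

- Confirmed, based on [WordSingleFlightRedisCoordinator.java](../../src/main/java/com/linglevel/api/word/service/singleflight/WordSingleFlightRedisCoordinator.java), that the lock path was switched to Redisson.
- Confirmed, based on [WordSingleFlightRedisCoordinator.java](../../src/main/java/com/linglevel/api/word/service/WordSingleFlightRedisCoordinator.java), that the lock path was standardized on Redisson `RLock`.
- Confirmed at the code level the corrections to follower timeout and leader failure propagation semantics.
- Verified by test the Redlock path and the single-lock fallback path in a local 3-node Redis environment.
- Verified by test that concurrent requests from two instances result in a single single-flight execution, leader failure propagation, and timeout fallback.
- The changes were verified in separate PRs: `#328` (distributed stabilization), `#330` (expiry/error semantics corrections), `#331` (Redlock + fallback verification).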
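
The behaviors checked above (one execution per key, leader failure propagation, follower timeout fallback) can be illustrated with an in-process analogy. This sketch uses only `java.util.concurrent` and is not the Redisson-based coordinator from this change. The class name `InProcessSingleFlight`, the `timeoutFallback` parameter, and the exception wrapping are assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// In-process analogy of single-flight; the real coordinator uses Redis for cross-instance coordination.
class InProcessSingleFlight<T> {

    private final ConcurrentHashMap<String, CompletableFuture<T>> inFlight = new ConcurrentHashMap<>();

    T execute(String key, Supplier<T> leaderAction, Supplier<T> timeoutFallback, long waitTimeoutMs) {
        CompletableFuture<T> mine = new CompletableFuture<>();
        CompletableFuture<T> existing = inFlight.putIfAbsent(key, mine);

        if (existing == null) {
            // Leader: run the expensive action once and publish the outcome.
            try {
                T result = leaderAction.get();
                mine.complete(result);
                return result;
            } catch (RuntimeException e) {
                // Propagate the leader's failure to every waiting follower.
                mine.completeExceptionally(e);
                throw e;
            } finally {
                // Remove only our own entry so a newer leader is never evicted.
                inFlight.remove(key, mine);
            }
        }

        // Follower: wait for the leader, bounded by the wait timeout.
        try {
            return existing.get(waitTimeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Assumed policy: fall back instead of waiting indefinitely.
            return timeoutFallback.get();
        } catch (ExecutionException e) {
            // Surface the leader's failure instead of silently retrying.
            throw new IllegalStateException("single-flight leader failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for leader", e);
        }
    }
}
```

A follower that arrives while the leader is still running never invokes its own `leaderAction`, which is the property that keeps duplicate AI calls down.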

## Results and remaining issues
2 changes: 1 addition & 1 deletion docs/decisions/README.md
@@ -35,4 +35,4 @@
- [008. Global image delivery performance optimization](008-image-delivery-optimization.md)
- [009. Introducing a DSL-based crawling rule management structure](009-dsl-driven-crawling.md)
- [010. Organizing operating rules for mission-based Codex agents](010-mission-oriented-agent-guidelines.md)
- [011. Word Single-Flight distributed stabilization and Redlock adoption](011-word-single-flight-distributed-stability-with-redlock.md)
- [011. Word Single-Flight distributed stabilization and RLock standardization](011-word-single-flight-distributed-stability-with-redlock.md)
Original file line number Diff line number Diff line change
@@ -5,9 +5,6 @@
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;

import java.util.ArrayList;
import java.util.List;

@Getter
@Setter
@Component
@@ -16,8 +13,6 @@ public class WordSingleFlightProperties {

private boolean enabled = true;

private long lockTtlMs = 20_000;

private long waitTimeoutMs = 5_000;

private long resultTtlMs = 60_000;
@@ -27,8 +22,4 @@ public class WordSingleFlightProperties {
private String model = "default";

private String schemaVersion = "v2";

private boolean redlockEnabled = false;

private List<String> redlockNodeAddresses = new ArrayList<>();
}
Original file line number Diff line number Diff line change
@@ -10,11 +10,8 @@
import jakarta.annotation.PreDestroy;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.redisson.Redisson;
import org.redisson.RedissonRedLock;
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;
import org.springframework.data.redis.connection.Message;
import org.springframework.data.redis.connection.MessageListener;
import org.springframework.data.redis.core.StringRedisTemplate;
@@ -26,7 +23,6 @@
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Duration;
import java.util.ArrayList;
import java.util.HexFormat;
import java.util.List;
import java.util.Locale;
@@ -54,26 +50,17 @@ public class WordSingleFlightRedisCoordinator {
private final ObjectMapper objectMapper;

private final ConcurrentHashMap<String, CopyOnWriteArrayList<CompletableFuture<Void>>> channelWaiters = new ConcurrentHashMap<>();
private final List<RedissonClient> redlockClients = new ArrayList<>();

private final MessageListener doneListener = this::onDoneMessage;

@PostConstruct
void initialize() {
redisMessageListenerContainer.addMessageListener(doneListener, new PatternTopic(DONE_PATTERN));
initializeRedlockClients();
}

@PreDestroy
void shutdown() {
for (RedissonClient client : redlockClients) {
try {
client.shutdown();
} catch (Exception e) {
log.warn("Failed to shutdown single-flight Redlock client", e);
}
}
redlockClients.clear();

}

public List<WordAnalysisResult> execute(
@@ -91,7 +78,7 @@ public List<WordAnalysisResult> execute(
return unwrap(cached, keys.digest());
}

RLock lock = createLeaderLock(keys.lockKey());
RLock lock = createLock(keys.lockKey());
boolean lockAcquired = tryAcquireLeaderLock(lock);
if (lockAcquired) {
return executeAsLeader(keys, lock, leaderAction);
@@ -178,68 +165,10 @@ private void releaseLock(RLock lock, String lockKey) {
}
}

private RLock createLeaderLock(String lockKey) {
if (properties.isRedlockEnabled() && redlockClients.size() >= 3) {
RLock[] locks = redlockClients.stream()
.map(client -> client.getLock(lockKey))
.toArray(RLock[]::new);
return new RedissonRedLock(locks);
}

if (properties.isRedlockEnabled()) {
log.warn("Redlock is enabled but usable node count is {} (<3). Fallback to single RLock.",
redlockClients.size());
}
private RLock createLock(String lockKey) {
return redissonClient.getLock(lockKey);
}

private void initializeRedlockClients() {
if (!properties.isRedlockEnabled()) {
return;
}

List<String> addresses = properties.getRedlockNodeAddresses().stream()
.map(String::trim)
.filter(value -> !value.isBlank())
.toList();

if (addresses.isEmpty()) {
log.warn("Redlock is enabled but no node addresses configured. Fallback to single RLock.");
return;
}

for (String rawAddress : addresses) {
String address = normalizeAddress(rawAddress);
try {
Config config = new Config();
config.useSingleServer().setAddress(address);
redlockClients.add(Redisson.create(config));
} catch (Exception e) {
log.warn("Skipping invalid/unavailable Redlock node address '{}'. Fallback candidates will continue.", rawAddress, e);
}
}

if (redlockClients.isEmpty()) {
log.warn("Redlock is enabled but no valid/usable nodes initialized. Fallback to single RLock.");
return;
}

if (redlockClients.size() < 3) {
log.warn("Redlock requires at least 3 independent nodes, but only {} configured. Fallback to single RLock.",
redlockClients.size());
return;
}

log.info("Single-flight Redlock mode initialized with {} nodes.", redlockClients.size());
}

private String normalizeAddress(String rawAddress) {
if (rawAddress.startsWith("redis://") || rawAddress.startsWith("rediss://")) {
return rawAddress;
}
return "redis://" + rawAddress;
}

private ResultEnvelope readResult(String resultKey) {
String raw = stringRedisTemplate.opsForValue().get(resultKey);
if (raw == null) {
4 changes: 0 additions & 4 deletions src/main/resources/application-local.properties
@@ -9,10 +9,6 @@ spring.data.mongodb.database=llv_api_local
spring.data.redis.host=localhost
spring.data.redis.port=6379

# Redlock (local default)
word.single-flight.redlock-enabled=true
word.single-flight.redlock-node-addresses=redis://localhost:6379,redis://localhost:6380,redis://localhost:6381

# Logging for Local Development
logging.level.com.linglevel.api=DEBUG
logging.level.org.springframework.data.mongodb=DEBUG
3 changes: 0 additions & 3 deletions src/main/resources/application.properties
@@ -53,14 +53,11 @@ spring.data.redis.ssl.enabled=false

# Word single-flight (Redis lock + Pub/Sub + result key fallback)
word.single-flight.enabled=true
word.single-flight.lock-ttl-ms=13000
word.single-flight.wait-timeout-ms=11000
word.single-flight.result-ttl-ms=25000
word.single-flight.prompt-version=v1
word.single-flight.model=${spring.ai.bedrock.converse.chat.options.model:default}
word.single-flight.schema-version=v2
word.single-flight.redlock-enabled=false
word.single-flight.redlock-node-addresses=

# AWS S3 (AI Input/Output buckets)
aws.s3.region=${S3_REGION}