KYC level-50 completion mail delayed ~20 minutes after merge — identify and remove the bottleneck #3801

@TaprootFreak

Problem

After an account merge that should push the master to KycLevel.LEVEL_50, the "Verifizierung abgeschlossen" ("verification completed") mail (MailContext.KYC_CHANGED, sent by kycChanged() in src/subdomains/generic/kyc/services/kyc-notification.service.ts:168–196) arrives ~20 minutes after the merge completes. Static code review narrows the candidates but does not pinpoint a single bottleneck; Application Insights traces are needed to confirm which one it is.

Code paths examined (and ruled in / out)

  • Notification mail send itself is synchronous. notificationService.sendMail() calls mailService.send() (nodemailer transport.sendMail) immediately (mail.service.ts:39+). The debounce: 60000 option in account-merge.service.ts:86 is not a delayed-send — it suppresses duplicate notifications with the same correlationId + context within 60 s (notification.entity.ts:53–58). It does not buffer outgoing mail.
  • MAIL_RETRY cron runs EVERY_10_MINUTES (src/subdomains/supporting/notification/services/notification-job.service.ts:36) and only resends mails whose first send attempt failed. This would explain a lag of up to 10 minutes if the initial send failed transiently (e.g. Postmark rate-limit).
  • KYC review cron runs EVERY_MINUTE (kyc.service.ts:160 → reviewKycSteps() → reviewNationalityStep / reviewIdentSteps / reviewFinancialData / reviewRecommendationStep). Processes steps in specific review statuses, advances them, can trigger updateProgress → kycChanged(user, nextLevel). Worst-case wait = ~1 minute, not 20.
  • autoKycStepReminder cron runs EVERY_HOUR (kyc-notification.service.ts:29–60) — sends only reminders, not the completion mail.
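
To make the debounce point above concrete, here is a minimal sketch of "debounce as duplicate suppression", as opposed to "debounce as delayed send". It is not the code in notification.entity.ts; the names SentNotification and isSuppressed, the context string, and the in-memory history are assumptions made for illustration only.

```typescript
// Hypothetical model of the duplicate-suppression check; not the repository's implementation.
interface SentNotification {
  correlationId: string;
  context: string;
  sentAtMs: number;
}

// Returns true when a notification with the same correlationId + context
// was already sent inside the debounce window. Nothing is queued or delayed.
function isSuppressed(
  history: SentNotification[],
  candidate: { correlationId: string; context: string },
  nowMs: number,
  debounceMs: number,
): boolean {
  return history.some(
    (n) =>
      n.correlationId === candidate.correlationId &&
      n.context === candidate.context &&
      nowMs - n.sentAtMs < debounceMs,
  );
}

const DEBOUNCE_MS = 60000;
const candidate = { correlationId: 'merge-1', context: 'EXAMPLE_CONTEXT' }; // illustrative values
const history: SentNotification[] = [{ ...candidate, sentAtMs: 0 }];

// First send: empty history, so the mail goes out immediately.
const firstSend = isSuppressed([], candidate, 0, DEBOUNCE_MS);
// Same notification 30 s later: suppressed as a duplicate.
const duplicateWithinWindow = isSuppressed(history, candidate, 30000, DEBOUNCE_MS);
// Same notification once the window has elapsed: sent again.
const afterWindow = isSuppressed(history, candidate, 60000, DEBOUNCE_MS);
```

Under this model the first mail is never held back, which is why the debounce option cannot by itself account for a 20-minute delay.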

The actual kycChanged(user, LEVEL_50) trigger sits inside updateProgress() (kyc.service.ts:1162–1167) which fires when the next-step calculation lifts the level. The question is which of the upstream awaits between "merge committed" and "updateProgress runs with LEVEL_50" consumes the time.

Suggested investigation steps

  1. Trace a real merge in Application Insights / GS logs: timestamps for mergeUserData commit, checkDfxApproval(master) (called from user-data.service.ts:1374), updateProgress, kycNotificationService.kycChanged, and the outgoing mail.
  2. Identify the largest gap and the responsible await.
  3. If the gap is a cron-cycle wait (a step sitting in INTERNAL_REVIEW until the next EVERY_MINUTE tick), evaluate whether the merge-finishing path can call the relevant reviewXyzStep() directly instead of waiting.
  4. If the gap is in mailService.send first-attempt failures plus MAIL_RETRY cron, harden the immediate-send error handling.
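
Steps 1 and 2 could be supported by a small helper like the sketch below, which sorts exported trace events and reports the largest gap between consecutive ones. The TraceEvent shape and the largestGap function are hypothetical, and the timestamps are placeholder values, not measured data; a real export from Application Insights would need to be mapped into this shape first.

```typescript
// Hypothetical shape for an exported trace event; timestamp is an ISO 8601 string.
interface TraceEvent {
  name: string;
  timestamp: string;
}

interface Gap {
  from: string;
  to: string;
  gapMs: number;
}

// Sorts events chronologically and returns the largest gap between neighbours,
// or undefined when fewer than two events are given.
function largestGap(events: TraceEvent[]): Gap | undefined {
  const sorted = [...events].sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
  let worst: Gap | undefined;
  let prev: TraceEvent | undefined;
  for (const cur of sorted) {
    if (prev) {
      const gapMs = Date.parse(cur.timestamp) - Date.parse(prev.timestamp);
      if (!worst || gapMs > worst.gapMs) {
        worst = { from: prev.name, to: cur.name, gapMs };
      }
    }
    prev = cur;
  }
  return worst;
}

// Placeholder timestamps for illustration only; they do not reflect any real trace.
const sampleEvents: TraceEvent[] = [
  { name: 'mergeUserData commit', timestamp: '2000-01-01T00:00:00Z' },
  { name: 'checkDfxApproval', timestamp: '2000-01-01T00:00:05Z' },
  { name: 'updateProgress', timestamp: '2000-01-01T00:01:05Z' },
  { name: 'kycChanged', timestamp: '2000-01-01T00:01:06Z' },
  { name: 'mail sent', timestamp: '2000-01-01T00:01:10Z' },
];

const worstGap = largestGap(sampleEvents);
console.log(worstGap);
```

The event pair reported as `from` / `to` brackets the await that deserves a closer look.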

Suggested fix shape

After identifying the bottleneck, the cleanest path follows the API-as-Decision-Authority principle: turn the post-merge level transition into an event-driven dispatch rather than a cron-polled state, so the completion mail fires within seconds of the underlying level change. This dovetails with #3801 (status endpoint must expose a MergeProcessing state) — both flows benefit from the same event-driven follow-up.
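
As a rough illustration of the event-driven shape, the sketch below publishes a level-change event at the point where the merge finishes and lets a subscriber send the completion mail right away, instead of waiting for a cron tick. Everything here is an assumption: KycEventBus, KycLevelChanged, onMergeCommitted and the in-memory sentMails list are hypothetical stand-ins, and LEVEL_50 is assumed to map to the numeric value 50. The real services and their wiring may look quite different.

```typescript
// Hypothetical event payload; LEVEL_50 is assumed to be the numeric value 50.
type KycLevelChanged = { userDataId: number; level: number };
type Listener = (event: KycLevelChanged) => void;

// Minimal in-process event bus, a stand-in for whatever mechanism the API would actually use.
class KycEventBus {
  private readonly listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  publish(event: KycLevelChanged): void {
    for (const listener of this.listeners) listener(event);
  }
}

const LEVEL_50 = 50;
const sentMails: string[] = []; // stand-in for the real mail transport
const bus = new KycEventBus();

// Subscriber: reacts to the level change immediately, no polling involved.
bus.subscribe((event) => {
  if (event.level === LEVEL_50) sentMails.push(`KYC_CHANGED:${event.userDataId}`);
});

// Hypothetical hook invoked once the merge transaction has committed.
function onMergeCommitted(masterId: number, newLevel: number): void {
  bus.publish({ userDataId: masterId, level: newLevel });
}

onMergeCommitted(42, LEVEL_50); // triggers the completion mail
onMergeCommitted(43, 30); // lower level: no completion mail
console.log(sentMails);
```

The same published event could also feed other consumers of the level change, which is the main attraction of dispatching once rather than polling per consumer.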

Pair-PR

No app-side change unless the investigation reveals a missing API field (e.g. a KycEventType enum the app should observe).

Source

Surfaced in DFXswiss/realunit-app#611 (item 4). The 21-minute observation comes from a single Android internal-test session on 2026-05-28; reproducing it on DEV with Application Insights enabled is the first step.
