dev(narugo1992): add targeted enhance command #24

Open
narugo1992 wants to merge 8 commits into main from dev/narugo1992-enhance-targeted
@narugo1992 narugo1992 commented May 18, 2026

Summary

Implements Scope A of #23 as a local animedex enhance QUERY helper that emits the query strings a user should actually try per backend, instead of dumping every possible rewrite.

  • Adds animedex.enhance.suggest() plus structured EnhanceResult / EnhanceTarget / EnhanceSuggestion models.
  • Adds top-level animedex enhance with --type, --source, --limit-per-source, --json, and --jq.
  • Keeps runtime enhancement deterministic and offline: no HTTP, no LLM, no fixture/cache reads, and no packaged anime alias table.
  • Uses lightweight deterministic tooling only where it is useful: OpenCC for Chinese script folds, jaconv for kana/width normalization, a built-in conservative romaji-to-katakana converter, and existing anyascii/unidecode only as late generic transliteration data.
  • Reworks the anime profiles from broad probes to a smaller measured set: English punctuation cleanup, directional Chinese script folds, conservative romaji-to-katakana for complete romaji input, and raw-only for Japanese/Korean anime inputs where live checks did not show strict rescue.
  • Keeps the rejected jaraco hidden-import workaround out of tools/generate_spec.py; the frozen spec only explicitly collects transliterator/OpenCC runtime data.
  • Adds a full Anime100 multilingual audit matrix under tools/search_eval/: 100 MAL/Jikan top-popularity samples with real source-backed Mainland Chinese, Hong Kong/Macau Chinese, Taiwan Chinese, Korean, Japanese, and English names, plus a runtime-only live search run of the actual enhancement output.
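The "conservative romaji-to-katakana for complete romaji input" rule above can be illustrated with a minimal sketch. This is not the converter shipped in the PR: the function name `romaji_to_katakana` and the tiny syllable table are hypothetical, and the real table is presumably far larger. The sketch shows only the conservative behaviour, which is to return nothing unless the whole input parses as romaji.

```python
from typing import Optional

# Tiny illustrative syllable table (assumption: the real converter covers
# many more syllables plus long vowels, geminates, and so on).
_SYLLABLES = {
    "a": "ア", "i": "イ", "u": "ウ", "e": "エ", "o": "オ",
    "ka": "カ", "ki": "キ", "ku": "ク", "ke": "ケ", "ko": "コ",
    "na": "ナ", "ni": "ニ", "nu": "ヌ", "ne": "ネ", "no": "ノ",
    "ra": "ラ", "ri": "リ", "ru": "ル", "re": "レ", "ro": "ロ",
    "to": "ト", "n": "ン",
}


def romaji_to_katakana(text: str) -> Optional[str]:
    """Convert only when every character is consumed; otherwise return None."""
    s = text.lower()
    out = []
    i = 0
    while i < len(s):
        # Try the longest syllable first so "na" wins over "n" + "a".
        for size in (2, 1):
            chunk = s[i:i + size]
            if chunk in _SYLLABLES:
                out.append(_SYLLABLES[chunk])
                i += len(chunk)
                break
        else:
            # Unparseable character: refuse to guess rather than emit noise.
            return None
    return "".join(out) if out else None
```

With this table, `romaji_to_katakana("Naruto")` yields `ナルト`, while a mixed input such as `"One-Punch Man"` yields `None`, so no katakana variant would be suggested for it.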

Demo

enhance.gif

The committed source tape is docs/source/_static/gifs/enhance.tape.

CLI Examples And Measured Rescue

These are intentionally short CLI examples. The expected output is the TTY renderer output; the rescue note is from the committed live matrix in tools/search_eval/runs/mal_cjk100_runtime/summary.md and cells.jsonl.

python -m animedex enhance '進擊的巨人' --type anime --source anilist

Expected output:

Enhance queries for '進擊的巨人' (type=anime, source=anilist)
  anime/anilist: '進擊的巨人' | '进击的巨人'

Measured result: AniList raw 進擊的巨人 returned 0 rows; enhanced 进击的巨人 hit Attack on Titan / Shingeki no Kyojin at position 0.

python -m animedex enhance '鏈鋸人' --type anime --source anilist

Expected output:

Enhance queries for '鏈鋸人' (type=anime, source=anilist)
  anime/anilist: '鏈鋸人' | '链锯人'

Measured result: AniList raw 鏈鋸人 returned 0 rows; enhanced 链锯人 hit Chainsaw Man at position 0.

python -m animedex enhance '东京喰种√A' --type anime --source kitsu

Expected output:

Enhance queries for '东京喰种√A' (type=anime, source=kitsu)
  anime/kitsu: '东京喰种√A' | '東京喰種√A'

Measured result: Kitsu raw 东京喰种√A returned 0 rows; enhanced 東京喰種√A hit Tokyo Ghoul √A at position 0.
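The three Chinese examples above all follow the same pattern: keep the raw query first, then add a script fold only if it changes the string. A minimal sketch of that pattern follows. The PR uses OpenCC for the actual conversion; the two toy character maps and the function name `script_fold_queries` are assumptions made so the example runs without extra dependencies.

```python
from typing import List

# Toy character maps standing in for OpenCC (assumption: OpenCC's
# phrase-aware conversion is far more complete than these tables).
_T2S = str.maketrans({"進": "进", "擊": "击", "鏈": "链", "鋸": "锯"})
_S2T = str.maketrans({"东": "東", "种": "種"})


def script_fold_queries(query: str) -> List[str]:
    """Return the raw query first, then any script fold that differs from it."""
    queries = [query]
    for table in (_T2S, _S2T):
        folded = query.translate(table)
        # Skip folds that are no-ops or duplicates of an earlier entry.
        if folded not in queries:
            queries.append(folded)
    return queries


if __name__ == "__main__":
    for q in ("進擊的巨人", "东京喰种√A", "Chainsaw Man"):
        print(" | ".join(repr(v) for v in script_fold_queries(q)))
```

Because a no-op fold is dropped, a query with no convertible characters stays a single raw entry, which matches the "raw | folded" shape of the expected output shown above.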

python -m animedex enhance 'One-Punch Man' --type anime --source ann

Expected output:

Enhance queries for 'One-Punch Man' (type=anime, source=ann)
  anime/ann: 'One-Punch Man' | 'one punch man'

Measured result: ANN raw One-Punch Man returned 0 rows; enhanced one punch man hit One Punch Man at position 0.
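The ANN rescue above comes from punctuation and case cleanup. A rough sketch of what such a normalization could look like is given here. The variant name `lowercase_strip_punct` appears in this PR, but the exact rules it applies are not shown, so the body below is an assumption: lowercase the query, turn Unicode punctuation into spaces, and collapse whitespace.

```python
import re
import unicodedata


def lowercase_strip_punct(query: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    chars = [
        # Unicode general categories starting with "P" are punctuation.
        " " if unicodedata.category(ch).startswith("P") else ch
        for ch in query.lower()
    ]
    return re.sub(r"\s+", " ", "".join(chars)).strip()


if __name__ == "__main__":
    print(lowercase_strip_punct("One-Punch Man"))
```

Replacing punctuation with a space rather than deleting it keeps `One-Punch` as two tokens, which is the form that matched on ANN in the measured result above.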

Japanese native input is deliberately not over-expanded when raw search is already strong.

python -m animedex enhance '葬送のフリーレン' --type anime --source jikan

Expected output:

Enhance queries for '葬送のフリーレン' (type=anime, source=jikan)
  anime/jikan: '葬送のフリーレン'

Measured result: Jikan raw 葬送のフリーレン already hit Sousou no Frieren at position 0, so the helper emits no noisy romanization or Chinese decode fallback.
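One way to decide that an input should stay raw-only is to look for kana. How the PR detects Japanese input is not shown, so the following is only a sketch under that assumption; the function names `contains_kana` and `japanese_raw_only` are hypothetical.

```python
from typing import List


def contains_kana(text: str) -> bool:
    """True if any character falls in the Hiragana or Katakana blocks."""
    # U+3040..U+309F is Hiragana, U+30A0..U+30FF is Katakana.
    return any("\u3040" <= ch <= "\u30ff" for ch in text)


def japanese_raw_only(query: str) -> List[str]:
    """Return just the raw query for kana input, else an empty list.

    An empty list here means "not handled by this rule", leaving other
    profiles (English cleanup, Chinese folds) free to apply.
    """
    return [query] if contains_kana(query) else []
```

A kana check cannot recognize Japanese titles written only in kanji, so a real profile would need an additional signal for those inputs.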

Measured Runtime Impact

The committed runtime-only matrix measures 100 popular MAL/Jikan anime samples across five anime search backends and six input-language buckets. The primary metric is strict rescue: the raw query missed, but raw plus the current runtime animedex enhance output found the target, either by backend ID or by an exact known-title/alias match. Substring-only matches are excluded.
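The strict rescue definition can be expressed as a small predicate. The sketch below is not the evaluation code in tools/search_eval/; the `Target` and `Row` types are hypothetical, and comparing titles case-insensitively is an assumption about what "exact" means here.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set


@dataclass
class Target:
    backend_id: Optional[str]          # ID of the intended entry on this backend
    titles: Set[str] = field(default_factory=set)  # known titles and aliases


@dataclass
class Row:
    backend_id: Optional[str]
    title: str


def _norm(title: str) -> str:
    # Assumption: exact match ignores case and surrounding whitespace only.
    return title.casefold().strip()


def is_strict_hit(rows: Iterable[Row], target: Target) -> bool:
    """Hit by backend ID or exact title/alias; substring matches do not count."""
    known = {_norm(t) for t in target.titles}
    for row in rows:
        if target.backend_id is not None and row.backend_id == target.backend_id:
            return True
        if _norm(row.title) in known:
            return True
    return False


def is_strict_rescue(raw_rows, enhanced_rows, target: Target) -> bool:
    """Raw missed, but the enhanced queries found the target."""
    return not is_strict_hit(raw_rows, target) and is_strict_hit(enhanced_rows, target)
```

Under this definition, a row titled "Attack on Titan: Junior High" does not count as a hit for a target whose known titles are "Attack on Titan" and "Shingeki no Kyojin", because the match is only a substring.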

Overall, across 2,615 input-language/sample/backend groups, raw search hit 1,185 targets and raw plus enhance hit 1,278 targets. That is +93 strict raw-miss rescues, moving overall strict hit rate from 45.3% to 48.9% (+3.6 percentage points). The raw-miss rescue rate is 93/1,430 = 6.5%.
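The overall figures can be re-derived from the per-language counts reported in the Language Summary. The short script below does only that arithmetic; the tuple layout is an illustrative choice, and the numbers are copied from this PR.

```python
# (raw hits, raw + enhance hits, groups) per input language, from the
# Language Summary in this PR.
LANGUAGE_ROWS = {
    "English": (436, 470, 500),
    "Japanese": (369, 369, 500),
    "Korean": (49, 49, 500),
    "Simplified Chinese": (124, 144, 445),
    "Hong Kong/Macau Traditional Chinese": (86, 104, 270),
    "Taiwan Traditional Chinese": (121, 142, 400),
}

raw_total = sum(raw for raw, _, _ in LANGUAGE_ROWS.values())
enhanced_total = sum(enh for _, enh, _ in LANGUAGE_ROWS.values())
groups_total = sum(n for _, _, n in LANGUAGE_ROWS.values())

rescues = enhanced_total - raw_total
raw_misses = groups_total - raw_total

raw_rate = round(100 * raw_total / groups_total, 1)
enhanced_rate = round(100 * enhanced_total / groups_total, 1)
rescue_rate = round(100 * rescues / raw_misses, 1)

if __name__ == "__main__":
    print(f"groups={groups_total} raw={raw_total} enhanced={enhanced_total}")
    print(f"strict hit rate: {raw_rate}% -> {enhanced_rate}%")
    print(f"raw-miss rescue rate: {rescues}/{raw_misses} = {rescue_rate}%")
```

The same per-row calculation, rescues divided by groups minus raw hits, reproduces the raw-miss rescue rate column for each language, for example 34/64 for English.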

The improvement is concentrated rather than uniform: English benefits from punctuation/case cleanup, while Chinese benefits from directional Simplified/Traditional folds. Japanese native input is already strong on AniList/Jikan/Kitsu/Shikimori and produced no strict runtime rescue, and Korean remains weak without real cross-language alias knowledge.

Language Summary

| Input language | Raw -> raw + enhance | Strict rescue | Raw-miss rescue rate | Conclusion |
| --- | --- | --- | --- | --- |
| English | 436/500 -> 470/500 | +34 | 53% | Most stable gain; mainly lowercase_strip_punct. |
| Japanese | 369/500 -> 369/500 | +0 | 0% | Raw is already strong except ANN; no fake alias expansion. |
| Korean | 49/500 -> 49/500 | +0 | 0% | Raw is weak, but deterministic Hangul romanization did not produce reliable strict rescues. |
| Simplified Chinese | 124/445 -> 144/445 | +20 | 6% | Simplified-to-Traditional helps Jikan/Shikimori/AniList. |
| Hong Kong/Macau Traditional Chinese | 86/270 -> 104/270 | +18 | 10% | Traditional-to-Simplified is especially valuable for AniList. |
| Taiwan Traditional Chinese | 121/400 -> 142/400 | +21 | 8% | Traditional-to-Simplified is the main gain, again strongest on AniList. |

Backend Summary

| Backend | Raw hit | Raw + enhance hit | Strict rescue | Raw-miss rescue rate | Conclusion |
| --- | --- | --- | --- | --- | --- |
| AniList | 265/523 | 305/523 | +40 | 16% | Biggest beneficiary, especially Traditional Chinese -> Simplified Chinese. |
| ANN | 66/523 | 79/523 | +13 | 3% | Only English cleanup helps; CJK recall remains poor. |
| Jikan | 314/523 | 327/523 | +13 | 6% | Raw is already strong; Chinese script folds and English cleanup add small gains. |
| Kitsu | 231/523 | 243/523 | +12 | 4% | English and Chinese both add small, targeted gains. |
| Shikimori | 309/523 | 324/523 | +15 | 7% | Japanese raw is strong; Chinese folds and English cleanup add measured value. |

Language By Backend Matrix

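| Input | AniList | ANN | Jikan | Kitsu | Shikimori |
| --- | --- | --- | --- | --- | --- |
| English | 91% -> 98% (+7) | 61% -> 74% (+13) | 96% -> 99% (+3) | 94% -> 99% (+5) | 94% -> 100% (+6) |
| Japanese | 88% -> 88% (+0) | 0% -> 0% (+0) | 96% -> 96% (+0) | 89% -> 89% (+0) | 96% -> 96% (+0) |
| Korean | 8% -> 8% (+0) | 0% -> 0% (+0) | 19% -> 19% (+0) | 8% -> 8% (+0) | 14% -> 14% (+0) |
| Simplified Chinese | 49% -> 55% (+5) | 1% -> 1% (+0) | 39% -> 45% (+5) | 11% -> 15% (+3) | 38% -> 46% (+7) |
| Hong Kong/Macau Traditional Chinese | 24% -> 50% (+14) | 4% -> 4% (+0) | 52% -> 56% (+2) | 22% -> 26% (+2) | 57% -> 57% (+0) |
| Taiwan Traditional Chinese | 26% -> 44% (+14) | 2% -> 2% (+0) | 50% -> 54% (+3) | 22% -> 25% (+2) | 50% -> 52% (+2) |

Percentages are strict hit rates. The parenthesized value is the number of strict rescues in that cell, not a percentage-point change; for example, Hong Kong/Macau Traditional Chinese on AniList gains 14 rescues out of 54 samples.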

Practical Conclusions

  • English enhancement is worth keeping for every measured backend because lowercase_strip_punct rescues real raw misses on all five anime backends.
  • Chinese enhancement should remain directional and backend-targeted: Simplified input mainly benefits from zh_traditional, while Traditional/HK/MO/TW input mainly benefits from zh_simplified.
  • AniList is the clearest Chinese winner: Hong Kong/Macau Traditional Chinese moves 24% -> 50%, and Taiwan Traditional Chinese moves 26% -> 44%.
  • ANN should not receive aggressive CJK guesses. Its CJK recall remains effectively unrecovered, while English cleanup gives the only measured benefit.
  • Japanese native input should not be padded with romanization or Han transliteration by default. The measured raw path is already 88%-96% on AniList/Jikan/Kitsu/Shikimori, and runtime variants produced +0 strict rescues.
  • Korean should stay conservative for now. Without packaged alias data, network lookup, or an LLM, deterministic Hangul transliteration did not rescue strict raw misses and would mostly add noise.
  • Hit-zero rescue is limited but real: 33/894 raw zero-return groups were rescued by current runtime variants, mostly through Chinese script folds and English cleanup.

Empirical Notes

  • tools/search_eval/samples/mal_multilingual_anime100_enriched.json contains 100 real MAL/Jikan anime samples. Input coverage: en=100, ja=100, ko=100, zh_cn=89, zh_hk_mo=54, zh_tw=80.
  • tools/search_eval/runs/mal_cjk100_runtime/summary.md is the live runtime-only matrix: 4,137 current-runtime cells over ja, zh_cn, zh_hk_mo, zh_tw, ko, and en, measured separately instead of merging CJK.
  • Strict rescue is the primary improvement metric because it measures cases where raw search failed but the current runtime enhance output found the intended target.
  • Cross-language alias recovery that requires real title knowledge remains out of runtime scope. The collected real Chinese/Korean names are audit/sample data only, not packaged runtime answers.

Verification

  • make format
  • pytest test/enhance/test_suggest.py test/entry/test_enhance.py -q -m unittest (34 passed)
  • make test (2934 passed, 84 skipped, 100% coverage)
  • make rst_auto
  • python -m animedex.policy.lint
  • python animedex_cli.py enhance --help
  • python animedex_cli.py --help
  • make build && make test_cli (4 binary smoke probes passed)
  • python -m tools.search_eval.eval_mal_cjk100 --out tools/search_eval/runs/mal_cjk100_runtime --runtime-only --report-only
  • python /home/zhangshaoang/terminal-capture-workflow/scripts/terminal_capture.py probe-media docs/source/_static/gifs/enhance.gif
  • python /home/zhangshaoang/terminal-capture-workflow/scripts/terminal_capture.py extract-frames docs/source/_static/gifs/enhance.gif --times 6.5,19.8,26.02,31.5
  • Full live matrix was collected through PP_AO3 with --runtime-only and committed in tools/search_eval/runs/mal_cjk100_runtime/.
  • git diff --check

Refs #23 (Scope A only; agg merge-anime remains separate).

codecov Bot commented May 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (138281e) to head (2bdac71).

Additional details and impacted files
@@            Coverage Diff             @@
##              main       #24    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files          118       122     +4     
  Lines         9235      9604   +369     
==========================================
+ Hits          9235      9604   +369     

Replace the romkan runtime dependency with a conservative built-in romaji-to-katakana converter because the romkan sdist fails metadata generation on Windows CI.
Remove unreachable defensive checks from the built-in romaji converter so patch coverage reflects the executable enhance path after replacing the external romkan dependency.
Comment thread test/enhance/test_suggest.py Outdated
def fail_opencc(conversion, value):
    raise RuntimeError(f"{conversion}:{value}")

monkeypatch.setattr(_heuristics, "_opencc_convert", fail_opencc)
Contributor Author

This patches an internal animedex helper, which bypasses project code inside the unit under test. The repo test policy keeps mocks at the wire or OS-boundary level; monkeypatching _opencc_convert can let the behavior under review drift while the test still passes. Please exercise this through real build_variants() behavior, or remove the defensive branch/coverage expectation if the OpenCC failure path is not worth carrying.

out.append((f"enhance_{suggestion.variant_id}", q))
seen.add(q)

if runtime_only:
Contributor Author

--runtime-only currently has no effect: variant_queries() always builds variants from suggest(), and both branches return the same out. If this flag is meant to distinguish current CLI output from exploratory variants, either put the non-runtime candidate set behind the false branch or remove the flag and update the PR verification command. As written, the option name promises a restriction that the code never applies.
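The first option in the comment above, putting the non-runtime candidate set behind the false branch, could look roughly like the following. This is a sketch only: the parameter names `runtime_variants` and `exploratory_variants` are hypothetical, and the real variant_queries() in tools/search_eval/ builds its candidates from suggest() rather than taking them as arguments.

```python
from typing import Iterable, List, Tuple

Variant = Tuple[str, str]  # (variant_id, query string)


def variant_queries(
    raw: str,
    runtime_variants: Iterable[Variant],
    exploratory_variants: Iterable[Variant],
    runtime_only: bool,
) -> List[Variant]:
    """Return raw plus enhance variants; the flag really restricts the set."""
    out: List[Variant] = [("raw", raw)]
    seen = {raw}

    candidates = list(runtime_variants)
    if not runtime_only:
        # Exploratory variants are only considered when the flag is off.
        candidates += list(exploratory_variants)

    for variant_id, q in candidates:
        if q in seen:
            continue
        out.append((f"enhance_{variant_id}", q))
        seen.add(q)
    return out
```

With this shape, a variant such as first_token would appear in the output only when `runtime_only` is false, so the flag name and the behaviour agree.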

all_cells = _read_cells(cells_path)
if all_cells and all(cell.variant in {"raw"} or cell.variant.startswith("enhance_") for cell in all_cells):
    current_values = _current_variant_values()
    cells = [
Contributor Author

This filters stale/non-current variants only for the generated summaries, but the committed cells.jsonl remains unfiltered. In the current PR artifact there are still 415 enhance_first_token cells even though current anime animedex enhance no longer emits first_token; summary.md then reports 4,137 cells while the JSONL has 4,560 lines. Please either rewrite/prune cells.jsonl when producing a runtime-only report, or commit a separate raw exploratory file and make the runtime artifact internally consistent.
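Pruning the committed JSONL so that it agrees with the summary could be done with a filter along these lines. This is a minimal sketch, assuming each line of cells.jsonl is a JSON object with a "variant" key; the function name `prune_cells` and its arguments are hypothetical.

```python
import json
from typing import Iterable, List, Set


def prune_cells(lines: Iterable[str], current_variants: Set[str]) -> List[str]:
    """Keep raw cells and cells whose variant is still emitted at runtime."""
    kept = []
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the JSONL file
        cell = json.loads(line)
        variant = cell["variant"]
        if variant == "raw" or variant in current_variants:
            # ensure_ascii=False keeps CJK queries readable in the artifact.
            kept.append(json.dumps(cell, ensure_ascii=False))
    return kept
```

Writing the returned lines back to the runtime artifact, and keeping the unfiltered file under a separate name if the exploratory cells are still wanted, would make the reported cell count and the JSONL line count match.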

narugo1992 force-pushed the dev/narugo1992-enhance-targeted branch from ba1c686 to 2bdac71 on May 20, 2026 06:01