Runtime error: Novelty check requires embeddings, but embedding is unavailable #27

@zaku1521

The error output is as follows:
✅ Final Story generated based on the Reflection suggestions

🔄 Iteration: 2/3

================================================================================
🔍 Phase 3: Multi-Agent Critic (multi-agent review - Anchored)

📝 Reviewer A (Methodology) reviewing...
Score: 7.1/10
Feedback: [actual output omitted].

📝 Reviewer B (Novelty) reviewing...
Score: 7.1/10
Feedback: [actual output omitted].

📝 Reviewer C (Storyteller) reviewing...
Score: 6.9/10
Feedback: [actual output omitted].

📊 Diagnostics:
Score distribution: [7.099999999999892, 7.109999999999892, 6.909999999999896]
Lowest-scoring reviewer: Reviewer C (Storyteller), score: 6.909999999999896


📊 Review result: average score 7.04/10 - ✅ PASS

🏆 Updated global best version: score 7.04 (iteration 2)

✅ Review passed, entering the duplicate-check verification stage

❌ Error: Novelty check requires embeddings, but embedding is unavailable
Traceback (most recent call last):
  File "F:\software\ChatGPT\Idea2Paper-main\Paper-KG-Pipeline\scripts\idea2story_pipeline.py", line 304, in main
    result = pipeline.run()
  File "F:\software\ChatGPT\Idea2Paper-main\Paper-KG-Pipeline\src\idea2paper\application\pipeline\manager.py", line 512, in run
    raise RuntimeError("Novelty check requires embeddings, but embedding is unavailable")
RuntimeError: Novelty check requires embeddings, but embedding is unavailable


Environment setup: I have already placed the two folders from paper-embedding on Hugging Face into Paper-KG-Pipeline/output.


The contents of the .env.example file are as follows:

LLM_API_URL=https://api.siliconflow.cn/v1/chat/completions
LLM_MODEL=Pro/zai-org/GLM-4.7

# -----------------------------
# Embedding (optional overrides)
# -----------------------------
# If not set, Embedding uses:
# - EMBEDDING_API_URL=https://api.siliconflow.cn/v1/embeddings
# - EMBEDDING_MODEL=Qwen/Qwen3-Embedding-8B
# - EMBEDDING_API_KEY falls back to SILICONFLOW_API_KEY
# Tip: For frequent switching, set I2P_INDEX_DIR_MODE=auto_profile to auto-select
# per-embedding index dirs (no manual profile scripts needed). You can still override
# I2P_NOVELTY_INDEX_DIR / I2P_RECALL_INDEX_DIR if you prefer.
EMBEDDING_API_URL=https://api.siliconflow.cn/v1/embeddings
EMBEDDING_MODEL=Qwen/Qwen3-Embedding-8B
EMBEDDING_API_KEY=your_embedding_key_here
# Optional: auto profile index directories
I2P_INDEX_DIR_MODE=auto_profile
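The defaults and key fallback described in the comments above can be checked with a small diagnostic sketch. This is not the project's actual loader: the function name `resolve_embedding_config`, the returned dictionary layout, and the placeholder test are all assumptions made for illustration. It only reflects what the comments state (default URL, default model, `EMBEDDING_API_KEY` falling back to `SILICONFLOW_API_KEY`).

```python
import os

# Defaults as documented in the comments of the pasted file.
DEFAULT_URL = "https://api.siliconflow.cn/v1/embeddings"
DEFAULT_MODEL = "Qwen/Qwen3-Embedding-8B"
# Placeholder value as it appears in the pasted file.
PLACEHOLDER_KEY = "your_embedding_key_here"


def resolve_embedding_config(env):
    """Apply the documented defaults and key fallback to an env mapping.

    Hypothetical helper: the real pipeline may resolve these differently.
    """
    key = env.get("EMBEDDING_API_KEY") or env.get("SILICONFLOW_API_KEY") or ""
    return {
        "url": env.get("EMBEDDING_API_URL") or DEFAULT_URL,
        "model": env.get("EMBEDDING_MODEL") or DEFAULT_MODEL,
        "key": key,
        # A missing or placeholder key cannot authenticate a request.
        "usable": bool(key) and key != PLACEHOLDER_KEY,
    }


if __name__ == "__main__":
    cfg = resolve_embedding_config(os.environ)
    print("url:", cfg["url"])
    print("model:", cfg["model"])
    print("key usable:", cfg["usable"])
```

Running this in the same shell that launches the pipeline shows whether a real key is visible to the process. If `key usable` is `False`, the key is either unset or still the placeholder shown above.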

# -----------------------------
# Run logging (repo root log/)
# -----------------------------
# 1 = enable structured run logs under log/run_.../
# 0 = disable run logs (pipeline still runs)
I2P_ENABLE_LOGGING=1
# Optional: override log output directory (absolute path recommended)
I2P_LOG_DIR=/abs/path/to/log
# Optional: max chars saved for prompt/response per call (avoid huge JSONL)
I2P_LOG_MAX_TEXT_CHARS=20000

# -----------------------------
# Results bundling (repo root results/)
# -----------------------------
# 1 = enable bundling final artifacts under results/run_.../
# 0 = disable bundling (pipeline still runs)
I2P_RESULTS_ENABLE=1
# Bundling mode: link (preferred) or copy
# - link: create symlink if possible, fallback to copy
# - copy: always duplicate files
I2P_RESULTS_MODE=link
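The "link, fallback to copy" behavior described above could look roughly like the sketch below. The function name `bundle_file` and its return values are hypothetical and not taken from the project. Only `os.symlink` and `shutil.copy2` are real standard-library calls.

```python
import os
import shutil


def bundle_file(src, dst, mode="link"):
    """Place src at dst and return "link" or "copy" for what was done.

    Hypothetical helper illustrating I2P_RESULTS_MODE=link|copy.
    """
    os.makedirs(os.path.dirname(os.path.abspath(dst)), exist_ok=True)
    if mode == "link":
        try:
            os.symlink(os.path.abspath(src), dst)
            return "link"
        except (OSError, NotImplementedError):
            # Symlink creation can fail, for example on Windows without
            # the required privilege, so fall through to a plain copy.
            pass
    shutil.copy2(src, dst)
    return "copy"
```

On a Windows machine such as the one in the traceback above, symlink creation may be refused, in which case this sketch silently copies instead.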

# -----------------------------
# Critic strictness (quality)
# -----------------------------
# 1 = strict JSON mode (quality-first): critic JSON invalid -> retry -> still invalid => fail the run
# 0 = allow non-strict behavior (useful for offline smoke tests when no API key)
I2P_CRITIC_STRICT_JSON=1
# How many retries after the first failure (default 2)
I2P_CRITIC_JSON_RETRIES=2
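The strict-JSON rule above (invalid, retry, still invalid, fail) can be sketched as a small retry loop. The names `parse_critic_json` and `call_critic` are hypothetical, and returning `None` in non-strict mode is an assumption, since the file only says non-strict behavior is "allowed".

```python
import json


def parse_critic_json(call_critic, strict=True, retries=2):
    """Call the critic up to 1 + retries times until it yields a JSON object.

    call_critic is a hypothetical zero-argument callable returning raw text.
    """
    for _ in range(1 + retries):
        raw = call_critic()
        try:
            parsed = json.loads(raw)
        except (TypeError, ValueError):
            # json.JSONDecodeError is a ValueError subclass.
            continue
        if isinstance(parsed, dict):
            return parsed
    if strict:
        raise RuntimeError("critic returned invalid JSON after all retries")
    return None  # assumption: non-strict mode lets the caller degrade


if __name__ == "__main__":
    replies = iter(["not json", '{"score": 7.1}'])
    print(parse_critic_json(lambda: next(replies)))
```

With `retries=2` the critic is called at most three times, which matches the "retries after the first failure" wording.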

# -----------------------------
# Pass rule (pattern-aware)
# -----------------------------
# Default is the objective "Scheme B":
# - at least 2 of 3 role scores >= pattern q75
# - and avg_score >= pattern q50
# If pattern has too few papers (see I2P_PASS_MIN_PATTERN_PAPERS), fallback is controlled by I2P_PASS_FALLBACK.
I2P_PASS_MODE=two_of_three_q75_and_avg_ge_q50
I2P_PASS_MIN_PATTERN_PAPERS=20
I2P_PASS_FALLBACK=global # global|fixed
I2P_PASS_SCORE=7.0 # only used when fallback=fixed or distribution unavailable
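The "Scheme B" rule above reduces to two comparisons. The sketch below is an illustration only: the function name `passes_scheme_b` is hypothetical, and the quantile values in the usage example are made up, since the issue does not show the pattern's actual q50 and q75.

```python
def passes_scheme_b(role_scores, q50, q75):
    """Return True if at least 2 role scores >= q75 and the average >= q50."""
    avg = sum(role_scores) / len(role_scores)
    high = sum(1 for s in role_scores if s >= q75)
    return high >= 2 and avg >= q50


if __name__ == "__main__":
    # Scores rounded from the log above; q50 and q75 are invented values.
    scores = [7.10, 7.11, 6.91]
    print(passes_scheme_b(scores, q50=6.8, q75=7.0))
```

The fallback path (`I2P_PASS_FALLBACK`) is left out on purpose, because the file does not say exactly how the fixed score is compared.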

# -----------------------------
# Advanced: anchors & scoring
# -----------------------------
# Quantiles for the 5 fixed anchors (comma-separated floats)
I2P_ANCHOR_QUANTILES=0.1,0.25,0.5,0.75,0.9
I2P_ANCHOR_MAX_INITIAL=7
I2P_ANCHOR_MAX_TOTAL=9
I2P_ANCHOR_MAX_EXEMPLARS=2
I2P_DENSIFY_OFFSETS=-0.5,0.5,-0.25,0.25
I2P_SIGMOID_K=1.2
I2P_GRID_STEP=0.01
I2P_DENSIFY_LOSS_THRESHOLD=0.03
I2P_DENSIFY_MIN_AVG_CONF=0.45
I2P_ANCHOR_DENSIFY_ENABLE=0 # disable adaptive densify to reduce latency

# -----------------------------
# Local novelty check (Scheme A)
# -----------------------------
# Enable local novelty check against nodes_paper.json
I2P_NOVELTY_ENABLE=1
# Do NOT auto-build novelty index during run (quality-first + predictable)
I2P_NOVELTY_AUTO_BUILD_INDEX=1
# Offline build batch size
I2P_NOVELTY_INDEX_BUILD_BATCH_SIZE=32
# Action on high similarity: report_only | pivot | fail
I2P_NOVELTY_ACTION=pivot
# Max pivot attempts when similarity is high
I2P_NOVELTY_MAX_PIVOTS=2
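The three novelty actions and the pivot limit above suggest a decision like the one sketched here. Everything in it is hypothetical: the function name `next_step`, the returned labels, the `high_threshold` parameter, and in particular the choice to fail once the pivots are used up, which the file does not specify.

```python
def next_step(similarity, high_threshold, action="pivot",
              pivots_used=0, max_pivots=2):
    """Decide what to do after a novelty check (illustrative only)."""
    if similarity < high_threshold:
        return "accept"
    if action == "report_only":
        return "accept_with_report"
    if action == "pivot" and pivots_used < max_pivots:
        return "pivot"
    # Assumption: action == "fail", or pivots exhausted, ends the run.
    return "fail"


if __name__ == "__main__":
    # 0.90 is an invented threshold for the example.
    print(next_step(0.93, high_threshold=0.90, pivots_used=0))
    print(next_step(0.93, high_threshold=0.90, pivots_used=2))
```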

# -----------------------------
# Index auto-prepare (one-command run)
# -----------------------------
# 1 = auto-preflight and build missing indexes; 0 = skip preflight
I2P_INDEX_AUTO_PREPARE=1
# 1 = allow auto-build when missing; 0 = fail and ask for manual build
I2P_INDEX_ALLOW_BUILD=1

# -----------------------------
# Final collision threshold (Phase 4)
# -----------------------------
# 1 = enable final verification (Phase 4), 0 = skip
I2P_VERIFICATION_ENABLE=1
# Recommendation: set between novelty.medium_th and novelty.high_th (e.g. 0.82~0.88)
I2P_COLLISION_THRESHOLD=0.88
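To make the collision threshold concrete, here is a minimal sketch that assumes the similarity is cosine similarity between embedding vectors. The file does not confirm which metric the pipeline uses, and the function names `cosine_similarity` and `is_collision` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_collision(story_vec, paper_vecs, threshold=0.88):
    """True if any paper vector is at least `threshold` similar to the story."""
    return any(cosine_similarity(story_vec, p) >= threshold for p in paper_vecs)


if __name__ == "__main__":
    story = [1.0, 0.0]
    papers = [[0.0, 1.0], [1.0, 0.0]]
    print(is_collision(story, papers))
```

This also shows why the check cannot run without embeddings: there is nothing to compare if the story vector cannot be computed.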

# -----------------------------
# Recall audit (persist recall candidates)
# -----------------------------
# 1 = enable recall audit, 0 = disable
I2P_RECALL_AUDIT_ENABLE=1
# Top-N pattern scores per path to persist
I2P_RECALL_AUDIT_TOPN=50
# Recall embedding batch params
I2P_RECALL_EMBED_BATCH_SIZE=32
I2P_RECALL_EMBED_MAX_RETRIES=3
I2P_RECALL_EMBED_SLEEP_SEC=0.5
# Recall offline index (optional)
I2P_RECALL_USE_OFFLINE_INDEX=1
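The three recall embedding parameters above (batch size, max retries, sleep) fit a common batching pattern, sketched below. The function name `embed_in_batches` and the `embed_batch` callable are hypothetical, and reading "max retries" as retries after the first attempt is an assumption.

```python
import time


def embed_in_batches(texts, embed_batch, batch_size=32,
                     max_retries=3, sleep_sec=0.5):
    """Embed texts batch by batch, retrying each failed batch.

    embed_batch is a hypothetical callable: list of str -> list of vectors.
    """
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        for attempt in range(max_retries + 1):
            try:
                vectors.extend(embed_batch(batch))
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up after the last allowed retry
                time.sleep(sleep_sec)
    return vectors
```

If every retry fails, the exception propagates to the caller, which is one way a run could end up reporting that embeddings are unavailable.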


The i2p_config.json file has not been modified in any way.
