🎯 Smart Exam Prediction System

"Profile is the Lens, Prediction is the Goal"

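This system is built on an "Examinable Unit" architecture and a "Negative Filter" mechanism. It mimics the way a professional exam writer thinks: set the scope -> find the actions -> verify the signals -> assemble the paper. The goal is to mine the knowledge points that genuinely carry exam value from large volumes of lecture transcripts, and to generate a mock-exam blueprint that follows the 80/20 rule.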

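🌟 Core Philosophy

Traditional keyword extraction (Keyword Matching) easily falls into the false-positive trap: a teacher saying "key point" does not mean the material will be tested, and mentioning a "concept" does not mean a term-definition question is coming. This system instead uses an Action-Oriented deep-mining logic:

Teacher Profiling (build the radar): First understand the teacher as a person (profiling), then generate a teacher-specific signal vocabulary (must-test terms / exempt terms).
Contextual Clustering (map the battlefield): Split the long transcript into independent teaching topics (Topic Blocks) based on shifts in meaning.
Targeted Mining (targeted scan): Look for Topic + Action combinations (e.g. "derive Bernoulli's equation") and use the signal terms to boost or discard them.
Exam Assembly (strategic assembly): Automatically assemble a graded exam paper based on Bloom's cognitive depth and signal strength.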

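🚀 Modules

Phase 0: Input & Correction

ASR correction: Uses an LLM with the surrounding context to repair mis-transcribed technical terms automatically (e.g. "雷糯薯" -> "雷诺数", the Reynolds number) and to strip spoken filler.
File input: Reads a local TXT file directly, or accepts pasted text.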

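Phase 1: Profiling & Calibration

Full-Text Profiling: Instead of analyzing samples, the system scans the entire text with chunked parallel processing (Chunking & Merging), capturing Axial Terms (must-test signals) and Exclusion Terms (exempt signals) wherever they occur.
Human-in-the-Loop calibration: An interactive interface lets the user add, delete, and edit the signal terms extracted by the AI, so the "radar" stays accurate.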
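The chunk-and-merge profiling step described above could look roughly like the sketch below. It is a minimal illustration, not the project's code: `SignalProfile`, `split_into_chunks`, `profile_chunk`, and `build_profile` are hypothetical names, the chunk sizes are arbitrary, and the per-chunk LLM call is replaced by a plain phrase lookup so the example runs offline.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SignalProfile:
    """Signal words collected from the transcript (hypothetical structure)."""
    axial_terms: set = field(default_factory=set)
    exclusion_terms: set = field(default_factory=set)


def split_into_chunks(text: str, size: int = 2000, overlap: int = 200) -> list:
    """Split text into overlapping windows so border phrases are not cut in half."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


async def profile_chunk(chunk: str) -> SignalProfile:
    """Stand-in for the per-chunk LLM call: a simple phrase lookup (assumption)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    must = {p for p in ("必考", "重点掌握") if p in chunk}   # must-test phrases
    skip = {p for p in ("了解即可",) if p in chunk}          # exempt phrases
    return SignalProfile(must, skip)


async def build_profile(text: str, max_concurrency: int = 4) -> SignalProfile:
    """Profile every chunk concurrently, then merge the partial results."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(chunk: str) -> SignalProfile:
        async with sem:  # cap the number of in-flight requests
            return await profile_chunk(chunk)

    partials = await asyncio.gather(*(guarded(c) for c in split_into_chunks(text)))
    merged = SignalProfile()
    for partial in partials:
        merged.axial_terms |= partial.axial_terms
        merged.exclusion_terms |= partial.exclusion_terms
    return merged
```

Merging with set union keeps the result independent of chunk order, and the overlap means a signal phrase that straddles a chunk boundary is still seen whole by at least one chunk.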

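Phase 2: Mining & Analysis

Clustering & Recovery:
- Splits the long text into core teaching topics.
- Automatic recovery: An "orphan fragment recovery" step forces every text fragment left out of clustering into the mining pipeline, so no content is skipped.
Fine-grained Mining:
- Deep extraction: Follows the principle "better to over-capture than to miss", digging into paragraph details to extract every concept, formula, and case.
- Action recognition: Distinguishes teaching actions such as definition, derivation, calculation, and application.
- Negative filter: Any hit on an exclusion signal (e.g. "了解即可", "just be aware of this") marks the unit as Excluded immediately.
- Cognitive grading: Predicts the question type (recall / application / analysis) and the marks, based on Bloom's taxonomy.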
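One way to picture the orphan-fragment recovery in Phase 2 is as a coverage check over paragraph indices. The sketch below assumes the clustering step returns, for each topic, the indices of the paragraphs it claimed; that representation and the name `recover_orphans` are illustrative assumptions, not the project's actual data model.

```python
from typing import Dict, List


def recover_orphans(num_paragraphs: int,
                    topic_blocks: Dict[str, List[int]]) -> List[List[int]]:
    """Return runs of consecutive paragraph indices that no topic block claimed."""
    covered = {i for indices in topic_blocks.values() for i in indices}
    runs: List[List[int]] = []
    for i in range(num_paragraphs):
        if i in covered:
            continue
        if runs and runs[-1][-1] == i - 1:
            runs[-1].append(i)   # extend the current run of orphans
        else:
            runs.append([i])     # start a new run
    return runs


blocks = {"Bernoulli equation": [0, 1, 2], "Reynolds number": [5, 6]}
print(recover_orphans(9, blocks))  # [[3, 4], [7, 8]]
```

Each returned run can then be sent to the mining step as its own block, so every paragraph is either inside a topic block or inside a recovered run.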

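Phase 3: Dashboard & Exam

Must Study (high-risk zone): Shows core exam points with high signal strength and high cognitive depth (predicted long-answer questions).
Normal (standard zone): Shows basic exam points (predicted short questions).
Excluded (filtered content): Shows the content the system judged as "not tested", together with the reason, so the user can catch wrongful exclusions.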
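The three dashboard zones can be read as a simple partition over mined units. The following sketch is only an illustration: the `Unit` fields mirror the attributes used in the mining pseudocode, but the thresholds (`importance >= 2`, `score >= 10`) are assumptions chosen for the example, not values taken from the project.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    topic: str
    status: str = "Active"   # "Active" or "Excluded"
    importance: int = 0      # signal strength
    score: int = 0           # predicted marks (proxy for cognitive depth)
    reason: str = ""         # why the unit was excluded, if it was


def assign_zone(unit: Unit, min_importance: int = 2, min_score: int = 10) -> str:
    """Map a mined unit to a dashboard zone (thresholds are assumptions)."""
    if unit.status == "Excluded":
        return "Excluded"
    if unit.importance >= min_importance and unit.score >= min_score:
        return "Must Study"
    return "Normal"


units = [
    Unit("Derive Bernoulli's equation", importance=2, score=15),
    Unit("Define Reynolds number", importance=0, score=5),
    Unit("History of fluid mechanics", status="Excluded", reason="hit '了解即可'"),
]
for u in units:
    print(assign_zone(u), "-", u.topic)
```

Keeping the exclusion reason on the unit is what allows the Excluded zone to show why something was filtered.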

🛠️ Architecture

Tech Stack

Frontend: Streamlit (interactive web UI)
LLM Orchestration: Instructor (structured-output guarantees)
Async Runtime: asyncio (concurrent processing)
Visualization: Plotly (interactive charts)

Project Structure

画像提取/
├── data/
│   └── data.txt          # Default input data
├── src/
│   ├── app.py            # Main entry point (Streamlit)
│   ├── agents.py         # Agent logic (Profiling, Clustering, Mining)
│   └── schemas.py        # Pydantic data model definitions
├── requirements.txt      # Project dependencies
└── README.md             # Documentation

🚦 Quick Start

1. Set up the environment

Make sure you have Python 3.8+ and install the dependencies:

pip install -r requirements.txt

2. Configure the API

Enter your DeepSeek API Key in the sidebar, or edit the default configuration in src/app.py:

DEFAULT_API_KEY = "your-api-key"
DEFAULT_BASE_URL = "https://api.deepseek.com"

3. Launch the system

streamlit run src/app.py

🧠 Core Algorithm

The Mining Algorithm

def mine_unit(text, profile):
    # Simplified pseudocode: `llm` stands for the project's LLM client.
    # 1. Extract Topic + Action
    unit = llm.extract(text)

    # 2. Negative Filter (Priority): an exclusion hit overrides everything else
    if any(term in text for term in profile.exclusion_terms):
        unit.status = "Excluded"
        return unit

    # 3. Positive Boost: a must-test signal raises importance
    if any(term in text for term in profile.axial_terms):
        unit.importance += 2

    # 4. Cognitive Scoring: deeper actions are worth more marks
    if unit.action_type == "Derive/Calculate":
        unit.score = 15
    elif unit.action_type == "Define/Recall":
        unit.score = 5

    return unit
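To try the mining logic without an API key, the pseudocode can be wrapped in a small offline harness. In this sketch the `Profile` and `Unit` dataclasses and the `fake_extract` function are hypothetical stand-ins (the real project uses Pydantic models and an LLM call); only the filter, boost, and scoring rules are taken from the algorithm above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Profile:
    axial_terms: List[str] = field(default_factory=list)
    exclusion_terms: List[str] = field(default_factory=list)


@dataclass
class Unit:
    topic: str
    action_type: str
    status: str = "Active"
    importance: int = 0
    score: int = 0


def fake_extract(text: str) -> Unit:
    """Hypothetical stand-in for the LLM extraction step."""
    action = "Derive/Calculate" if "推导" in text else "Define/Recall"
    return Unit(topic=text[:12], action_type=action)


def mine_unit(text: str, profile: Profile,
              extract: Callable[[str], Unit] = fake_extract) -> Unit:
    unit = extract(text)
    # Negative filter runs first and short-circuits boost and scoring.
    if any(term in text for term in profile.exclusion_terms):
        unit.status = "Excluded"
        return unit
    if any(term in text for term in profile.axial_terms):
        unit.importance += 2
    if unit.action_type == "Derive/Calculate":
        unit.score = 15
    elif unit.action_type == "Define/Recall":
        unit.score = 5
    return unit


profile = Profile(axial_terms=["必考"], exclusion_terms=["了解即可"])
a = mine_unit("伯努利方程的推导是必考内容", profile)
b = mine_unit("这段历史背景了解即可", profile)
c = mine_unit("这个公式必考,但证明过程了解即可", profile)
print(a.status, a.importance, a.score)  # Active 2 15
print(b.status, b.importance, b.score)  # Excluded 0 0
print(c.status, c.importance, c.score)  # Excluded 0 0
```

The third case shows the priority rule: when a passage contains both a must-test and an exempt signal, the exempt signal wins and the unit is never boosted or scored.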

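📝 Developer Guide

Tuning prompts: All prompts live at the top of src/agents.py and can be fine-tuned for the characteristics of a specific subject.
Extending question types: Add a new question-type enum value to the predicted_question_type field in src/schemas.py.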
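Extending the question types might look like the sketch below. The project defines its models with Pydantic in src/schemas.py; to stay self-contained this example uses only the standard library, and the class names and enum members are assumptions rather than the project's real definitions. Only the field name `predicted_question_type` comes from the guide above.

```python
from dataclasses import dataclass
from enum import Enum


class QuestionType(str, Enum):
    """Hypothetical question-type enum; member names are assumptions."""
    TERM_DEFINITION = "term_definition"
    SHORT_ANSWER = "short_answer"
    CALCULATION = "calculation"
    CASE_ANALYSIS = "case_analysis"  # newly added type


@dataclass
class ExaminableUnit:
    topic: str
    predicted_question_type: QuestionType

    def __post_init__(self) -> None:
        # Coerce raw strings (e.g. from LLM output) and reject unknown values.
        self.predicted_question_type = QuestionType(self.predicted_question_type)


unit = ExaminableUnit("Pipe-flow case study", "case_analysis")
print(unit.predicted_question_type.name)  # CASE_ANALYSIS
```

Because the enum derives from `str`, its values serialize as plain strings, and the same enum class can typically be used as a field type in a Pydantic model.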

Powered by DeepSeek & Instructor
