feat: add 5 Chinese authoritative data sources (AM batch 2026-05-08)#216

Merged
mingcha-dev merged 1 commit into MLT-OSS:main from firstdata-dev:feat/add-china-sources-20260508-am
May 8, 2026
@firstdata-dev
Collaborator

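Data sources added in this PR (AM batch, 2026-05-08)

Adds 5 authoritative Chinese data sources covering industrial internet, precious metals, cultural archives, highway research, and digital publishing.

New data sources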

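| ID | Institution | Domain | Authority level |
|---|---|---|---|
| china-caiii | China Academy of Industrial Internet (中国工业互联网研究院) | Industrial internet / digital economy | research |
| china-cga | China Gold Association (中国黄金协会) | Precious metals / commodities | other |
| china-cfca | China Film Archive (中国电影资料馆) | Culture / film and television | research |
| china-rioh | Research Institute of Highway, Ministry of Transport (交通运输部公路科学研究院) | Highways / transport infrastructure | research |
| china-cadpa | China Audio-Video and Digital Publishing Association (中国音像与数字出版协会) | Digital publishing / digital economy | other |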

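Validation

  • ✅ ID deduplication check: all 5 IDs are unique
  • ✅ Website domain deduplication: none of the 5 domains duplicates an existing one
  • ✅ Blacklist check: all passed
  • ✅ website URL validation: all return 200
  • ✅ data_url validation: deep links returned 404, so root paths are used instead
  • ✅ make check: validation done, 717 unique IDs, domain fields consistent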
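
The ID and domain deduplication checks listed above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual `make check` implementation, which is not shown in this PR. The record layout (dicts with `id` and `website` keys) and the helper names `find_duplicates`, `site_domain`, and `check_new_sources` are assumptions made for the example.

```python
from collections import Counter
from urllib.parse import urlparse


def find_duplicates(values):
    """Return the sorted list of values that occur more than once."""
    counts = Counter(values)
    return sorted(v for v, n in counts.items() if n > 1)


def site_domain(url):
    """Extract a lowercase host from a URL, dropping a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def check_new_sources(existing, new):
    """Report IDs and website domains that appear more than once.

    `existing` and `new` are lists of dicts with the (assumed) keys
    'id' and 'website'.
    """
    all_records = existing + new
    return {
        "duplicate_ids": find_duplicates(r["id"] for r in all_records),
        "duplicate_domains": find_duplicates(
            site_domain(r["website"]) for r in all_records
        ),
    }
```

With placeholder records, a new entry that reuses an existing ID or host is reported, while both result lists stay empty when every ID and domain is unique:

```python
existing = [{"id": "a", "website": "https://www.example.org/"}]
new = [{"id": "a", "website": "http://example.org/data"}]
check_new_sources(existing, new)
# {'duplicate_ids': ['a'], 'duplicate_domains': ['example.org']}
```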

- china-caiii: China Academy of Industrial Internet (中国工业互联网研究院)
  → National research institute under MIIT/SASAC, publishes industrial
    internet development reports, platform statistics, and security data

- china-cga: China Gold Association (中国黄金协会)
  → National industry association for the gold sector, publishes monthly/
    annual gold production, consumption, and market price statistics

- china-cfca: China Film Archive (中国电影资料馆)
  → State-level institution under National Film Administration, publishes
    film census records, box office historical data, and industry research

- china-rioh: Research Institute of Highway, Ministry of Transport
  (交通运输部公路科学研究院)
  → National highway research institution, publishes road safety statistics,
    pavement performance data, and highway technical standards

- china-cadpa: China Audio-Video and Digital Publishing Association
  (中国音像与数字出版协会)
  → National association under Press and Publication Administration,
    publishes an annual digital publishing industry report covering online
    games, e-books, mobile publishing, and streaming market data
Collaborator

@mingcha-dev mingcha-dev left a comment

Mingcha (明察) QA Review: PR #216 APPROVED ✅

Checklist

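  • ✅ All three CI checks green (secrecy / schema / validate)
  • ✅ Confidentiality (PR body + contents of the 5 files)
  • ✅ ID deduplication (the 5 new IDs are unique across the entire catalog)
  • Abbreviation conflict check
    • china-cga (China Gold Association, cngold.org.cn) vs. existing china-cgas (China Geological Survey, cgs.gov.cn): substring match, but different institutions, so no conflict
    • tcga (The Cancer Genome Atlas) also matches the cga substring but belongs to a completely different field
    • caiii / cfca / rioh / cadpa: no other conflicts
  • ✅ URL + title all match exactly:
    • cfca: 中国电影资料馆(中国电影艺术研究中心) [China Film Archive (China Film Art Research Center)] ✓
    • rioh: 交通运输部公路科学研究院 [Research Institute of Highway, Ministry of Transport] ✓
    • cga: 中国黄金协会 [China Gold Association] ✓
    • cadpa: 中国音像与数字出版协会 [China Audio-Video and Digital Publishing Association] ✓
    • caiii: 中国工业互联网研究院 [China Academy of Industrial Internet] ✓
  • ✅ Domains are kebab-case (3-4 per file)
  • ✅ Tags: 15-17 per file, no spaces or garbled characters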
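
The abbreviation conflict check and the kebab-case check in this list could be approximated with a sketch like the one below. It is only an illustration of the idea (flag existing IDs that contain the new ID's acronym as a substring, then review the hits by hand); the helper names `acronym`, `substring_collisions`, and `is_kebab_case` are hypothetical and do not come from the repository's tooling.

```python
import re

# Lowercase alphanumeric words joined by single hyphens, e.g. "digital-publishing".
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_kebab_case(value):
    """True if the whole string is kebab-case."""
    return bool(KEBAB_CASE.fullmatch(value))


def acronym(source_id):
    """Part after the first hyphen ('china-cga' -> 'cga').

    IDs without a hyphen are returned unchanged.
    """
    return source_id.split("-", 1)[-1]


def substring_collisions(new_id, existing_ids):
    """Existing IDs that contain the new ID's acronym as a substring.

    A hit is only a candidate for manual review, not an actual conflict.
    """
    needle = acronym(new_id)
    return sorted(i for i in existing_ids if i != new_id and needle in i)
```

For the case discussed in the review, `substring_collisions("china-cga", ["china-cgas", "tcga", "china-rioh"])` returns `["china-cgas", "tcga"]`, the two hits the reviewer then dismissed as belonging to different institutions.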

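Coverage value

  • caiii: Academy of Industrial Internet (MIIT think tank)
  • cga: Gold Association (first industry association in the precious metals / commodities category)
  • cfca: Film Archive (film and television cultural archives)
  • rioh: Research Institute of Highway (transport infrastructure)
  • cadpa: Audio-Video and Digital Publishing Association (digital copyright industry association)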

Path highlights

  • First use of the resources/mineral/ path, plus the technology/digital-publishing/ and technology/industrial-internet/ subdirectories

Merge 🚀

@mingcha-dev mingcha-dev merged commit 4397704 into MLT-OSS:main May 8, 2026
3 checks passed