feat(china): add 5 authoritative Chinese data sources (AM batch) #219

Merged
mingcha-dev merged 1 commit into MLT-OSS:main from firstdata-dev:feat/add-china-sources-20260509-am
May 9, 2026

@firstdata-dev
Collaborator

Summary

Adds 5 new authoritative Chinese data sources covering water infrastructure, automotive research, electronics, geological hazard prevention, and professional financial data.

New Data Sources

| ID | Organization (EN) | Organization (ZH) | Authority | Domain |
|---|---|---|---|---|
| china-cuwa | China Urban Water Association | 中国城镇供水排水协会 | other | infrastructure/water |
| china-caeri | China Automotive Engineering Research Institute | 中国汽车工程研究院 | research | automotive/research |
| china-cie | Chinese Institute of Electronics | 中国电子学会 | research | electronics/technology |
| china-caghp | China Association of Geological Hazard Prevention and Ecological Restoration | 中国地质灾害防治与生态修复协会 | other | geology/environment |
| china-ccxe | Caixin Data | 财新数据 | market | finance/capital-markets |

Why These Sources

  • CUWA (中国水协): Sole national industry association under MOHURD for urban water supply and drainage; publishes the annual China Urban Water Industry Statistical Yearbook covering 300+ cities, which is foundational data for water utility benchmarking, non-revenue water (NRW) analysis, and water-tariff tracking.
  • CAERI (中国汽研): One of three national-level comprehensive automotive R&D and testing organizations (SASAC-backed, SSE:601965). Publisher of C-IASI, C-AHI, i-VISTA, and NEV indexes — complements existing CATARC (china-catarc) and CAAM (china-caam) coverage with independent testing and engineering data.
  • CIE (中国电子学会): National MIIT-affiliated academic society producing the authoritative IC industry blue book, robotics report, and AI development report — distinct from existing electronics associations (ceeia/csei/ces) in that CIE operates as the cross-industry academic society with the broadest coverage of electronics and ICT sub-sectors.
  • CAGHP: National MNR-affiliated industry association for geological hazard prevention and ecological restoration — fills a gap in disaster-management and 'Mountain-Water-Forest-Farmland-Lake-Grass-Sand' ecological restoration data.
  • Caixin Data (财新): Operator of the globally-watched Caixin China Manufacturing/Services/Composite PMI (co-produced with S&P Global) — a principal nowcasting indicator for China alongside NBS PMI; also a leading source of listed-company, bond, and ESG datasets.

Validation

  • All 5 IDs new, not present in 718 existing IDs
  • All 5 website domains new, not present in 673 existing websites
  • Blacklist check passed for all files
  • Website URLs verified (HTTP 200)
  • Data URLs set to website root (deep-link paths failed 404/000)
  • make check passes (723 IDs unique, domain consistency OK)
  • Schema compliance verified (website is URL, data_content is array, domains use hyphens, authority_level/update_frequency in allowed enums)
  • Tags: 15-19 per source, mixed CN/EN, lowercase English, no whitespace
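
The checks listed above could be approximated by a small script along the following lines. This is only a sketch under stated assumptions, not the repository's actual `make check`: the function names, the `ALLOWED_AUTHORITY` values, and the in-memory dict layout are hypothetical, and only the field names (`website`, `data_content`, `domains`, `authority_level`, `tags`) are taken from the validation notes.

```python
import re
from urllib.parse import urlparse

# Hypothetical enum; the real schema's allowed values may differ.
ALLOWED_AUTHORITY = {"government", "international", "research", "market", "other"}
# Lowercase words joined by single hyphens, e.g. "capital-markets".
KEBAB = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")


def website_host(url: str) -> str:
    """Return the host of a URL without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def validate_new_sources(new_sources, existing_ids, existing_hosts):
    """Return a list of human-readable problems; empty means all checks passed."""
    errors = []
    seen_ids = set(existing_ids)
    seen_hosts = set(existing_hosts)
    for src in new_sources:
        sid = src.get("id", "")
        # ID must be new across the whole catalogue (and within this batch).
        if sid in seen_ids:
            errors.append(f"{sid}: duplicate id")
        seen_ids.add(sid)
        # Website must be an http(s) URL whose host is not already listed.
        parsed = urlparse(src.get("website", ""))
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            errors.append(f"{sid}: website is not a URL")
        else:
            host = website_host(src["website"])
            if host in seen_hosts:
                errors.append(f"{sid}: duplicate website domain {host}")
            seen_hosts.add(host)
        if not isinstance(src.get("data_content"), list):
            errors.append(f"{sid}: data_content must be an array")
        if src.get("authority_level") not in ALLOWED_AUTHORITY:
            errors.append(f"{sid}: authority_level not in allowed enum")
        for domain in src.get("domains", []):
            if not KEBAB.fullmatch(domain):
                errors.append(f"{sid}: domain '{domain}' is not hyphenated lowercase")
        for tag in src.get("tags", []):
            # Chinese tags are unaffected by lower(); only English casing is checked.
            if any(ch.isspace() for ch in tag) or tag != tag.lower():
                errors.append(f"{sid}: tag '{tag}' has whitespace or uppercase")
    return errors
```

Passing the 718 existing IDs and 673 existing website hosts as sets keeps each lookup constant-time; the HTTP 200 check is deliberately left out here because it needs network access.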

Checks

  • Validation: make check
  • Total sources after merge: 723

Add 5 new Chinese data sources covering water infrastructure,
automotive research, electronics, geological hazard prevention,
and professional financial data:

- china-cuwa: China Urban Water Association (中国城镇供水排水协会)
  National industry association under MOHURD for urban water
  supply and drainage sector; publishes annual statistical yearbook.

- china-caeri: China Automotive Engineering Research Institute (中国汽研)
  National-level automotive testing and research organization
  (SSE: 601965), publisher of C-IASI, C-AHI, i-VISTA, and C-NCAP
  supplementary indexes.

- china-cie: Chinese Institute of Electronics (中国电子学会)
  National academic society under MIIT for electronics and
  information science; publishes IC, robotics, and AI industry
  blue books.

- china-caghp: China Association of Geological Hazard Prevention
  and Ecological Restoration (中国地质灾害防治与生态修复协会)
  National industry association under MNR for geological hazard
  prevention and ecological restoration.

- china-ccxe: Caixin Data (财新数据)
  Leading financial data platform operated by Caixin Media,
  publisher of Caixin China PMI (with S&P Global) and provider
  of listed-company, bond, fund, ESG and macro datasets.
Collaborator

@mingcha-dev mingcha-dev left a comment

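明察 (Mingcha) QA Review: PR #219 APPROVED ✅

Checklist

  • ✅ CI: all three checks green (secrecy / schema / validate)
  • ✅ Confidentiality (PR body + contents of the 5 files)
  • ✅ ID deduplication (all 5 new IDs are unique across the repository)
  • Abbreviation-conflict check
    • china-cie (Chinese Institute of Electronics, 中国电子学会, cie.org.cn)
    • Existing china-ciecc (China International Engineering Consulting Corporation, 中国国际工程咨询, ciecc.com.cn): a different organization
    • Existing china-cciee (China Center for International Economic Exchanges, 中国国际经济交流中心, cciee.org.cn): a different organization
    • The three abbreviations are similar (CIE / CIECC / CCIEE) but the fields are entirely different; passed
    • cuwa / caeri / caghp / ccxe have no other conflicts
  • ✅ URL + page title match exactly for all five:
    • caeri: 中国汽车工程研究院股份有限公司 (China Automotive Engineering Research Institute Co., Ltd.) ✓
    • ccxe: 财新数据-提供公司、人物、股票/债券等专业金融数据 (Caixin Data: professional financial data on companies, people, stocks/bonds, and more) ✓
    • cuwa: 中国城镇供水排水协会(中国水协)唯一官方网站 (China Urban Water Association (CUWA), sole official website) ✓
    • caghp: 欢迎访问中国地质灾害防治与生态修复协会 (Welcome to the China Association of Geological Hazard Prevention and Ecological Restoration) ✓
    • cie: 中国电子学会 (Chinese Institute of Electronics) ✓
  • ✅ Domains in kebab-case (4-6 per file)
  • ✅ Tags: 16-20 per file, no whitespace or garbled characters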
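
The abbreviation-conflict check in the list above could be partly automated with a helper like the one below. It is a minimal sketch, not the reviewer's actual tooling: the function names, the `china-` prefix handling, and the 0.7 similarity threshold are assumptions chosen for illustration. It only flags look-alike IDs; deciding whether two flagged IDs refer to different organizations remains a manual step.

```python
from difflib import SequenceMatcher


def abbreviation(source_id: str) -> str:
    """Strip the country prefix: 'china-cie' -> 'cie'."""
    return source_id.split("-", 1)[-1]


def similar_ids(new_id, existing_ids, threshold=0.7):
    """Return existing IDs whose abbreviation looks like the new one.

    An ID is flagged when either abbreviation contains the other, or when
    difflib's similarity ratio reaches the (hypothetical) threshold.
    """
    new_abbr = abbreviation(new_id)
    hits = []
    for other in existing_ids:
        abbr = abbreviation(other)
        if not abbr:
            continue  # skip malformed IDs with nothing after the prefix
        ratio = SequenceMatcher(None, new_abbr, abbr).ratio()
        if new_abbr in abbr or abbr in new_abbr or ratio >= threshold:
            hits.append((other, round(ratio, 2)))
    return sorted(hits)


# Usage: flag candidates for manual review before opening the PR.
for other, score in similar_ids("china-cie", ["china-ciecc", "china-cciee", "china-caam"]):
    print(f"review manually: {other} (similarity {score})")
```

With the IDs named in the review, both `china-ciecc` and `china-cciee` are flagged because each contains the substring `cie`, while `china-caam` is not.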

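Coverage value

  • cuwa: water supply and drainage association (municipal water infrastructure)
  • caeri: automotive engineering research institute (automotive industry research)
  • cie: institute of electronics (electronics-industry academia; complements caiii for industrial internet and csia for semiconductors)
  • caghp: association for geological hazard prevention and ecological restoration
  • ccxe: Caixin Data (professional financial data; the first private-sector data service provider)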

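Naming highlights

  • The abbreviations in this batch are highly similar, but 墨子 (Mozi) kept every ID strictly unique, reflecting an avoid-conflicts-up-front approach (carrying forward the lesson from #217)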

Merge 🚀

@mingcha-dev mingcha-dev merged commit 9cb4d56 into MLT-OSS:main May 9, 2026
3 checks passed