speech-analysis

A project by Chris Cooper, Jack Lennon, and Mandy Tao that scrapes, analyzes, and visualizes speeches from the Russian and Chinese foreign ministries.

Datasets

Using the china-speech-scraper and russia-speech-scraper, we gathered china_speeches.csv, Lavrov_2026_2014_D4P.csv, and Lavrov_Speeches_D4P.json from the Russian and Chinese Ministries of Foreign Affairs.

We used keywords.py to make viz_cache.json, which supplies the data for the interactive chart on the Insights page.

We used R to join the two datasets into China_Russia_Speeches.csv and .json.
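The join itself was done in R. For readers working in Python, a rough equivalent might look like the sketch below. It assumes the join is a simple row-wise stack of the two tables; the column names (`date`, `text`, `title`), the added `country` column, and the helper names `combine_speeches` and `to_csv_text` are hypothetical and may not match the real files or the R script.

```python
import csv
import io
import json

def combine_speeches(china_rows, russia_rows):
    """Stack two lists of row dicts, tagging each row with its country."""
    combined = []
    for country, rows in (("China", china_rows), ("Russia", russia_rows)):
        for row in rows:
            tagged = dict(row)           # copy so the input rows are not mutated
            tagged["country"] = country  # hypothetical column, not confirmed by the source
            combined.append(tagged)
    return combined

def to_csv_text(rows):
    """Serialize row dicts to CSV text, using the union of all column names."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    # restval fills columns that exist in only one of the two datasets
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical sample rows; the real files may use different columns.
china = [{"date": "2020-01-02", "text": "Remarks by the spokesperson."}]
russia = [{"date": "2020-01-03", "text": "Statement by the minister.", "title": "Briefing"}]

combined = combine_speeches(china, russia)
csv_text = to_csv_text(combined)
json_text = json.dumps(combined, ensure_ascii=False, indent=2)
```

Building the CSV and JSON from the same list of row dicts keeps the two output files in sync.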

We used word_count.py to make CH_RU_processed_lemmatized.json.

We used count_noun_chunks_entities.py to make noun_chunks_entities_count.json, top_1000_words_combined.json, china_top_1000_words.json and russia_top_1000_words.json.
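As a rough illustration of how a top-1000 word list can be built, here is a minimal sketch using only the Python standard library. It is not the code in count_noun_chunks_entities.py: it counts plain regex tokens rather than noun chunks or named entities, and the function name `top_words` and the sample texts are hypothetical.

```python
import re
from collections import Counter

def top_words(texts, n=1000):
    """Return the n most frequent lowercase word tokens across all texts."""
    counts = Counter()
    for text in texts:
        # Simple tokenizer: runs of letters and apostrophes (an assumption,
        # not the tokenization used by the project).
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts.most_common(n)

# Hypothetical sample speeches standing in for the real datasets.
china_texts = ["Cooperation and development", "Development of trade"]
russia_texts = ["Security and cooperation"]

china_top = top_words(china_texts)
russia_top = top_words(russia_texts)
combined_top = top_words(china_texts + russia_texts)
```

Running the same function on each country's speeches and on the combined list mirrors the split into per-country and combined output files.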

We used tfidf_analysis.py to make tfidf_rsults.json, tfidf_results_china_edited.json and tfidif_results_russia_edited.json.
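For readers unfamiliar with TF-IDF, the sketch below shows the textbook form of the score: term frequency times log(N / document frequency). It is an assumption that tfidf_analysis.py follows this form; the script may use a library with different smoothing or normalization. The function name `tfidf` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tfidf(documents):
    """Return one {term: score} dict per tokenized document.

    score = (count of term in doc / doc length) * log(N / docs containing term)
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    scores = []
    for tokens in documents:
        term_counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return scores

# Hypothetical tokenized documents, e.g. one per country.
docs = [["peace", "trade", "trade"], ["peace", "security"]]
scores = tfidf(docs)
```

In this example "peace" appears in both documents, so its score is zero, while "trade" and "security" each score above zero because they are distinctive to one document.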
