Project written in Python 3 that can be used to extract, analyze and prepare data from sources to bring information and finally knowledge into a visual and reusable format like word cloud.

creativecodemind/py-word-cloud-modeling

Word cloud modeling

Word cloud modeling comprises the processes used to extract, analyze and prepare data from sources in order to turn information, and ultimately knowledge, into a visual and reusable format. As such, it can be understood as an approach to simulating intelligence.

This project, written in Python 3, can be used for word cloud modeling in the scope of text mining, where unstructured text is converted into a structured format such as a word cloud so that meaningful patterns and semantic contexts yield new insights at first glance. For now, word cloud generation is supported for English and Russian.
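To make the step from unstructured text to a structured format more concrete, here is a minimal sketch of counting the most frequent words in a text. It uses only the Python standard library and is not the project's implementation; the function name most_common_words and the tokenization rule are assumptions made for illustration, and no stop words are removed.

```python
import re
from collections import Counter


def most_common_words(text: str, word_volume: int) -> list[tuple[str, int]]:
    """Return the `word_volume` most frequent words with their counts.

    Hypothetical helper for illustration only; not part of word_cloud_modeling.
    """
    # Lowercase the text and keep runs of letters (Latin or Cyrillic),
    # allowing apostrophes inside a word such as "don't".
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text.lower())
    return Counter(words).most_common(word_volume)


sample = "Tags describe content. Tags group content, and tags help search."
print(most_common_words(sample, 2))  # [('tags', 3), ('content', 2)]
```

A word cloud then maps each count to a visual weight, typically the font size, which is why the most common words stand out.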

To use this project, install it with the following command, run from the directory that contains the py-word-cloud-modeling folder:

python -m pip install -e py-word-cloud-modeling

Suppose a text file (.txt) named about-tagging should be turned into a word cloud holding its 70 most common words, saved to a file named about-tagging_word-cloud-random-most-common-70, and rendered with the font stored in the file (.ttf) named trebuchet-ms. The following snippet shows how the word cloud could be created if a random architecture is preferred over a predictable one.

from word_cloud_modeling.source import LanguageModel, TextReader
from word_cloud_modeling.styling import RGBColourGradientTemplate
from word_cloud_modeling import (
    ArchitectureModel,
    ImageInterfacing,
    WordCloudModeling,
)


word_cloud = WordCloudModeling.model_word_cloud(
    TextReader.create_text_stream_from_file("about-tagging"),
    LanguageModel.ENGLISH,
    RGBColourGradientTemplate.BLUE,
    "trebuchet-ms",
    word_volume=70,
    architecture=ArchitectureModel.RANDOM,
)
ImageInterfacing.save_image(word_cloud, "about-tagging_word-cloud-random-most-common-70", "jpg")
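The output name in the example above combines the source name, the architecture and the word volume. When several variants are generated, such names could be built with a small helper. The sketch below is an assumption drawn from that single example name; build_output_name is a hypothetical function and not part of the project.

```python
def build_output_name(source_name: str, architecture: str, word_volume: int) -> str:
    """Compose an output name like the one used in the example above.

    Hypothetical helper; the naming pattern is inferred from one example.
    """
    return f"{source_name}_word-cloud-{architecture}-most-common-{word_volume}"


print(build_output_name("about-tagging", "random", 70))
# about-tagging_word-cloud-random-most-common-70
```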

If light mode is preferred over dark mode (the default), additionally import the ModeModel enumeration from the word_cloud_modeling namespace and pass it to the model_word_cloud method of the WordCloudModeling class via the mode parameter. The call shown above would then look as follows.

from word_cloud_modeling import ModeModel


word_cloud = WordCloudModeling.model_word_cloud(
    TextReader.create_text_stream_from_file("about-tagging"),
    LanguageModel.ENGLISH,
    RGBColourGradientTemplate.BLUE,
    "trebuchet-ms",
    word_volume=70,
    architecture=ArchitectureModel.RANDOM,
    mode=ModeModel.LIGHT,
)

If the predictable architecture (the default) is preferred over the random one, additionally import the ArrangementModel enumeration from the word_cloud_modeling namespace and pass it to the model_word_cloud method of the WordCloudModeling class via the arrangement parameter. The first call shown above could then look as follows.

from word_cloud_modeling import ArrangementModel


word_cloud = WordCloudModeling.model_word_cloud(
    TextReader.create_text_stream_from_file("about-tagging"),
    LanguageModel.ENGLISH,
    RGBColourGradientTemplate.BLUE,
    "trebuchet-ms",
    word_volume=70,
    arrangement=ArrangementModel.MIXED,
)

The arrangement can be either ArrangementModel.CENTRIC or ArrangementModel.MIXED and is only supported if the requested architecture is predictable.
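The constraint that an arrangement only applies to the predictable architecture can be checked on the caller's side before a word cloud is requested. The sketch below uses local stand-in enumerations (Architecture, Arrangement) and a hypothetical check_options function; none of them belong to the project, and how the project itself reacts to an unsupported combination is not described here.

```python
from enum import Enum
from typing import Optional


class Architecture(Enum):
    """Local stand-in for illustration; not the project's ArchitectureModel."""
    PREDICTABLE = "predictable"
    RANDOM = "random"


class Arrangement(Enum):
    """Local stand-in for illustration; not the project's ArrangementModel."""
    CENTRIC = "centric"
    MIXED = "mixed"


def check_options(architecture: Architecture,
                  arrangement: Optional[Arrangement]) -> None:
    """Raise ValueError if an arrangement is combined with a non-predictable architecture."""
    if arrangement is not None and architecture is not Architecture.PREDICTABLE:
        raise ValueError(
            "an arrangement is only supported with the predictable architecture"
        )


# Accepted: predictable architecture with an arrangement.
check_options(Architecture.PREDICTABLE, Arrangement.MIXED)

# Rejected: random architecture with an arrangement.
try:
    check_options(Architecture.RANDOM, Arrangement.CENTRIC)
except ValueError as error:
    print(f"rejected: {error}")
```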

The samples/word-cloud directory of this project contains the files involved (text source and font file) along with a few example word cloud outputs.
