Skip to content

isaacus-dev/text2markdown

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

25 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

text2markdown 📝

text2markdown is a Python library for intelligently converting plain text into Markdown.

text2markdown is powered by the Isaacus enrichment API, which converts unstructured documents into rich, highly structured knowledge graphs that can easily be transformed into Markdown.

In all, text2markdown is capable of:

  • Identifying and formatting headings.
  • Segmenting text into nested sections.
  • Hyperlinking cross-references within texts to other sections.
  • Italicizing cited documents.
  • Italicizing defined terms.
  • Detecting and formatting block quotations.
  • Striking through junk text.

Setup 📦

text2markdown can be installed with pip (or uv):

pip install text2markdown

An Isaacus API key is also required to use this library.

Usage 👩‍💻

The code snippet below demonstrates how you might use text2markdown() to intelligently convert a short document into Markdown.

from text2markdown import text2markdown

text = """\
The Smallest Document In The World
This is a generic document.

Section 1 - Background
One upon a time, there was a mayor who said:
We love Markdown so much that everyone should and must use it for everything.

Section 2 - Problem
The mayor's directive, as stated in Section 1, was sadly too difficult to enforce."""

output = text2markdown(text)
print(output)

The output should look something like this:

# The Smallest Document In The World 

This is a generic document. 

## <a id="seg-1"></a>Section 1 - Background 

One upon a time, there was a mayor who said: 

> We love Markdown so much that everyone should and must use it for everything. 

## Section 2 - Problem 

The mayor's directive, as stated in [Section 1](#seg-1), was sadly too difficult to enforce.

An asynchronous version of text2markdown() is also available, supporting all of the same features and arguments as its synchronous equivalent. It can be used like so:

from text2markdown import text2markdown_async

output = await text2markdown_async(text)
print(output)

All of the various capabilities of text2markdown can be toggled on or off using optional Boolean parameters, as shown below:

from text2markdown import text2markdown

from isaacus import Isaacus

output = text2markdown(
    text,
    link_xrefs=True,
    strike_junk=True,
    block_quotes=True,
    escape_lists=True,
    italicize_refs=True,
    italicize_terms=True,
    enrichment_model="kanon-2-enricher",
    isaacus_client=Isaacus(),
)
print(output)

License 📜

This library is licensed under the MIT License.

About

text2markdown is a Python library for intelligently converting plain text into Markdown.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages