Text Formatting
The extera_utils.text_formatting module parses text with HTML or Markdown formatting and converts it into a plain text string plus a list of TLRPC.MessageEntity objects suitable for the Telegram API.
You generally don't need to use this module directly if you are using client_utils.send_message (or send_text, etc.), as those functions accept a convenient parse_mode parameter. However, if you need to manually parse text for other purposes, this module provides the necessary tools.
parse_text
The main entry point is the parse_text function.
Parameters:
- text: The input string containing formatting tags or Markdown.
- parse_mode: The format to use. Either 'HTML' (default) or 'Markdown'.
- is_caption: If True, the returned dictionary key for text will be "caption"; otherwise "message".
Returns: A dictionary containing:
- "message" (or "caption"): The plain text content with formatting markers removed.
- "entities": A list of TLRPC.MessageEntity objects.
Usage Example
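A minimal sketch of a typical call, based only on the parameters and return keys described above. The import is guarded because extera_utils is assumed to be importable only inside the plugin runtime; text_key and parse_html are hypothetical helpers, not part of the module, and passing parse_mode and is_caption as keyword arguments assumes the parameter names match the ones documented here.

```python
# extera_utils is assumed to exist only inside the plugin runtime, so guard the import.
try:
    from extera_utils.text_formatting import parse_text
except ImportError:
    parse_text = None


def text_key(is_caption: bool = False) -> str:
    """Hypothetical helper: name of the dictionary key that holds the plain text."""
    return "caption" if is_caption else "message"


def parse_html(text: str, is_caption: bool = False):
    """Parse HTML-formatted text and return (plain_text, entities).

    Returns None when parse_text is unavailable (outside the plugin runtime).
    """
    if parse_text is None:
        return None
    parsed = parse_text(text, parse_mode="HTML", is_caption=is_caption)
    return parsed[text_key(is_caption)], parsed["entities"]


result = parse_html('<b>Hello</b>, <a href="https://example.com">world</a>!')
if result is not None:
    plain_text, entities = result
    print(plain_text)      # plain text with the tags removed
    print(len(entities))   # number of TLRPC.MessageEntity objects
```

For Markdown input, the same call would take parse_mode="Markdown" instead.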
Supported Formatting
HTML Tags
The HTML parser supports the following tags:
- <b>, <strong>: Bold
- <i>, <em>: Italic
- <u>: Underline
- <s>, <del>, <strike>: Strikethrough
- <a href="...">: Text Link
- <code>: Inline Code
- <pre language="...">: Preformatted Code Block (language is optional)
- <spoiler>, <tg-spoiler>: Spoiler
- <blockquote>: Blockquote
  - Add the attribute expandable or collapsed for expandable blockquotes (e.g. <blockquote expandable>).
- <emoji id="...">: Custom Emoji
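To illustrate how tags like these can be turned into plain text plus offset/length pairs, here is a minimal sketch built on Python's standard html.parser. It is not the module's implementation: TAG_TYPES, SketchParser, sketch_parse_html and the tuple layout are hypothetical, and only a few tags are handled. Offsets are counted in UTF-16 code units on the assumption that the entities are destined for the Telegram API, which measures them that way.

```python
from html.parser import HTMLParser

# Hypothetical tag-to-type table covering only a few of the supported tags.
TAG_TYPES = {
    "b": "BOLD", "strong": "BOLD",
    "i": "ITALIC", "em": "ITALIC",
    "u": "UNDERLINE",
    "code": "CODE",
    "a": "TEXT_LINK",
}


def utf16_len(s: str) -> int:
    """Length of s in UTF-16 code units."""
    return len(s.encode("utf-16-le")) // 2


class SketchParser(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text = ""
        self.entities = []   # finished (type, offset, length, url) tuples
        self._open = []      # stack of (tag, offset, url) for unclosed tags

    def handle_starttag(self, tag, attrs):
        if tag in TAG_TYPES:
            url = dict(attrs).get("href") if tag == "a" else None
            self._open.append((tag, utf16_len(self.text), url))

    def handle_data(self, data):
        self.text += data

    def handle_endtag(self, tag):
        # Close the most recently opened matching tag, if any.
        for i in range(len(self._open) - 1, -1, -1):
            open_tag, offset, url = self._open[i]
            if open_tag == tag:
                del self._open[i]
                length = utf16_len(self.text) - offset
                if length > 0:
                    self.entities.append((TAG_TYPES[tag], offset, length, url))
                break


def sketch_parse_html(source: str):
    """Return (plain_text, entities sorted by offset)."""
    parser = SketchParser()
    parser.feed(source)
    parser.close()  # flush any text still buffered after the last tag
    return parser.text, sorted(parser.entities, key=lambda e: e[1])
```

Recording the offset when a tag opens and computing the length when it closes keeps nested tags simple: each entity only needs the text length at two points in time.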
Markdown
The Markdown parser supports the following syntax:
- *bold*: Bold
- _italic_: Italic
- __underline__: Underline
- ~strikethrough~: Strikethrough
- ||spoiler||: Spoiler
- `code`: Inline Code
- ```code block```: Preformatted Code Block
- ```python ... ```: Code Block with language
- [text](url): Text Link
- : Custom Emoji
- > Quote: Blockquote
- **> Quote: Expandable Quote
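One subtlety in this syntax is that _italic_ and __underline__ share a prefix, so a parser has to try the longer delimiter first. The sketch below illustrates that idea only and is not the module's actual algorithm: DELIMITERS and sketch_parse_markdown are hypothetical, nesting, escaping, links, code blocks and quotes are ignored, and offsets are counted in Python characters rather than UTF-16 units for brevity.

```python
# Hypothetical delimiter table, ordered so longer markers are tried first.
DELIMITERS = [
    ("__", "UNDERLINE"),
    ("||", "SPOILER"),
    ("*", "BOLD"),
    ("_", "ITALIC"),
    ("~", "STRIKETHROUGH"),
    ("`", "CODE"),
]


def sketch_parse_markdown(source: str):
    """Return (plain_text, [(type, offset, length), ...]) for flat, non-nested markup."""
    out = []        # characters of the plain text
    entities = []
    i = 0
    while i < len(source):
        for marker, kind in DELIMITERS:
            if source.startswith(marker, i):
                end = source.find(marker, i + len(marker))
                if end > i + len(marker):   # closing marker found, non-empty span
                    inner = source[i + len(marker):end]
                    entities.append((kind, len(out), len(inner)))
                    out.extend(inner)
                    i = end + len(marker)
                    break
        else:
            # No delimiter pair starts here: copy the character through unchanged.
            out.append(source[i])
            i += 1
    return "".join(out), entities
```

An unmatched marker, such as the underscore in a_b, falls through to the else branch and stays in the output as ordinary text.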
Helper Classes
TLEntityType
Enum representing supported entity types:
CODE, PRE, STRIKETHROUGH, TEXT_LINK, BOLD, ITALIC, UNDERLINE, SPOILER, CUSTOM_EMOJI, BLOCKQUOTE.
RawEntity
Intermediate representation of an entity before it is converted to a TLRPC object. Contains offset, length, and extra attributes such as url, language, document_id (for custom emoji), and collapsed (for blockquotes).
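For orientation, the two helper types described above could be modelled roughly as follows. This is a stand-in written from the description only: the class names carry a Sketch prefix to make clear they are not the real classes, and the enum values, field types, defaults and the type field are assumptions.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class SketchEntityType(Enum):
    """Stand-in for TLEntityType; member names from the docs, values assumed."""
    CODE = auto()
    PRE = auto()
    STRIKETHROUGH = auto()
    TEXT_LINK = auto()
    BOLD = auto()
    ITALIC = auto()
    UNDERLINE = auto()
    SPOILER = auto()
    CUSTOM_EMOJI = auto()
    BLOCKQUOTE = auto()


@dataclass
class SketchRawEntity:
    """Stand-in for RawEntity; field types and defaults are assumptions."""
    type: SketchEntityType              # assumed field, not listed in the docs
    offset: int
    length: int
    url: Optional[str] = None           # used by TEXT_LINK
    language: Optional[str] = None      # used by PRE
    document_id: Optional[int] = None   # used by CUSTOM_EMOJI
    collapsed: bool = False             # used by BLOCKQUOTE


link = SketchRawEntity(SketchEntityType.TEXT_LINK, offset=7, length=5,
                       url="https://example.com")
quote = SketchRawEntity(SketchEntityType.BLOCKQUOTE, offset=0, length=12,
                        collapsed=True)
```

Keeping the type-specific attributes optional lets one record shape cover every entity type, with each attribute set only where it applies.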