Public mdpo APIs

md2po

Markdown to PO files extractor according to mdpo specification.

mdpo.md2po.markdown_to_pofile(files_or_content, ignore=frozenset({}), msgstr='', po_filepath=None, save=False, mo_filepath=None, plaintext=False, wrapwidth=78, mark_not_found_as_obsolete=True, preserve_not_found=True, location=True, extensions=['collapse_whitespace', 'tables', 'strikethrough', 'tasklists', 'latex_math_spans', 'wikilinks'], po_encoding=None, md_encoding='utf-8', xheader=False, include_codeblocks=False, ignore_msgids=frozenset({}), command_aliases=None, metadata=None, events=None, debug=False, **kwargs)

Extract all the msgids from Markdown content or files.

Parameters:
  • files_or_content (str, list) – Glob path to Markdown files, a list of files or a string with Markdown content.

  • ignore (list) – Paths of files to ignore. Useful when a glob alone cannot express exactly which files to extract content from. A filename or a directory name can also be given without indicating the full path.

  • msgstr (str) – Default message string for extracted msgids.

  • po_filepath (str) – Path of the PO file, loaded as a polib.POFile instance, into which the new msgids are dumped. It is also used as the reference for detecting strings that are not found, which are marked as obsolete when applicable (see the save and mark_not_found_as_obsolete optional parameters).

  • save (bool) – Save the new content to the PO file indicated in the po_filepath parameter. If enabled while po_filepath is None, a ValueError is raised.

  • mo_filepath (str) – The resulting PO file will be compiled to a MO file and saved at the path specified by this parameter.

  • plaintext (bool) – If you pass True to this parameter, the content is extracted as plain text, without markup characters. With plaintext set to False (the default), extracted msgids contain some markup characters that mark the location of `inline code`, **bold text**, *italic text* and `[links]`. Whether to enable plaintext mode depends on how you intend to use the extracted messages.

  • wrapwidth (int) – Wrap width for the PO file indicated by the po_filepath parameter. If negative, 0, ‘inf’ or ‘math.inf’, the content won’t be wrapped.

  • mark_not_found_as_obsolete (bool) – Strings extracted from Markdown that are not found inside the provided PO file are marked as obsolete.

  • preserve_not_found (bool) – Strings extracted from Markdown that are not found inside the provided PO file are not removed. Only takes effect if mark_not_found_as_obsolete is False.

  • location (bool) – Store references to the top-level blocks in which the messages are found as #: reference comments in the PO file.

  • extensions (list) – md4c extensions used to parse Markdown content, formatted as a list of ‘pymd4c’ keyword arguments. All available extensions are listed in the pymd4c documentation.

  • po_encoding (str) – Resulting PO file encoding.

  • md_encoding (str) – Markdown content encoding.

  • xheader (bool) – Indicates if the resulting PO file will have the mdpo x-header included.

  • include_codeblocks (bool) – Include all code blocks found in the resulting PO file. This is useful if you want to translate all your blocks of code. Equivalent to adding the <!-- mdpo-include-codeblock --> command before each code block.

  • ignore_msgids (list) – List of msgids to exclude from extraction.

  • command_aliases (dict) – Mapping of aliases to use custom mdpo command names in comments. The mdpo- prefix in command names resolution is optional. For example, if you want to use <!-- mdpo-on --> instead of <!-- mdpo-enable -->, you can pass the dictionaries {"mdpo-on": "mdpo-enable"} or {"mdpo-on": "enable"} to this parameter.

  • metadata (dict) – Metadata to include in the produced PO file. If the file already contains metadata fields, these are updated while preserving the values of the fields that are already defined.

  • events (dict) –

    Preprocessing events executed during the parsing process that can be used to customize the extraction process. Takes functions or lists of functions as values. If one of these functions returns False, that part of the parsing is skipped by md2po. The available events are:

    • enter_block(self, block, details): Executed when the parsing of a Markdown block starts.

    • leave_block(self, block, details): Executed when the parsing of a Markdown block ends.

    • enter_span(self, span, details): Executed when the parsing of a Markdown span starts.

    • leave_span(self, span, details): Executed when the parsing of a Markdown span ends.

    • text(self, block, text): Executed when the parsing of text starts.

    • command(self, mdpo_command, comment, original_command): Executed when an mdpo HTML command is found.

    • msgid(self, msgid, msgstr, msgctxt, tcomment, flags): Executed when a msgid is going to be stored.

    • link_reference(self, target, href, title): Executed when a link reference is going to be stored.

    These functions can also be referenced by strings with the syntax path/to/file.py::function_name.

    Every self argument is an instance of the Md2Po parser. You can take advanced control of the parsing process by manipulating the state of the parser. For example, to prevent a certain msgid from being included, you can do:

    def msgid_event(self, msgid, *args):
        if msgid == 'foo':
            self.disable_next_block = True
    

  • debug (bool) – Add events displaying all parsed elements in the extraction process.

  • **kwargs – Extra arguments passed to mdpo.md2po.Md2Po constructor.
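As a hedged illustration of the events parameter described above, the sketch below defines two handlers with the documented signatures and groups them in a dictionary. The handler names, the SKIPPED_MSGIDS set and the seen_hrefs list are assumptions made for this example and are not part of mdpo.

```python
# Hypothetical names used only for this sketch.
SKIPPED_MSGIDS = {'foo', 'Draft section'}
seen_hrefs = []


def skip_known_msgids(self, msgid, msgstr, msgctxt, tcomment, flags):
    """msgid event: return False to ask md2po to skip this msgid."""
    if msgid in SKIPPED_MSGIDS:
        return False
    # Falling through returns None, which leaves default processing untouched.


def record_link_reference(self, target, href, title):
    """link_reference event: remember every href without skipping anything."""
    seen_hrefs.append(href)


# Values may be a single function or a list of functions.
events = {
    'msgid': [skip_known_msgids],
    'link_reference': record_link_reference,
}
```

The dictionary would then be passed as markdown_to_pofile(content, events=events). The self argument received by each handler is the Md2Po parser instance, so the handlers can be exercised in isolation by passing any placeholder object in its place.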

Examples

>>> content = 'Some text with `inline code`'
>>> entries = markdown_to_pofile(content, plaintext=True)
>>> {e.msgid: e.msgstr for e in entries}
{'Some text with inline code': ''}
>>> entries = markdown_to_pofile(content)
>>> {e.msgid: e.msgstr for e in entries}
{'Some text with `inline code`': ''}
>>> entries = markdown_to_pofile(content, msgstr='Default message')
>>> {e.msgid: e.msgstr for e in entries}
{'Some text with `inline code`': 'Default message'}

Returns:

POFile instance with the new msgids included.

Return type:

polib.POFile
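The command_aliases parameter accepts the target command with or without the mdpo- prefix. The following is a minimal sketch of how such a mapping could be normalized so that both documented forms resolve to the same result. The normalize_aliases function is hypothetical and only mirrors the documented behaviour; it is not mdpo's own implementation.

```python
def normalize_aliases(command_aliases):
    """Return a copy of the mapping whose target commands carry the mdpo- prefix.

    Hypothetical helper: the alias names (keys) are kept exactly as given,
    only the target command names (values) are normalized.
    """
    prefix = 'mdpo-'
    normalized = {}
    for alias, command in command_aliases.items():
        if not command.startswith(prefix):
            command = prefix + command
        normalized[alias] = command
    return normalized


# Both forms from the parameter description resolve to the same mapping.
long_form = normalize_aliases({'mdpo-on': 'mdpo-enable'})
short_form = normalize_aliases({'mdpo-on': 'enable'})
print(long_form == short_form)  # True
```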

po2md

Markdown files translator using PO files as reference.

mdpo.po2md.pofile_to_markdown(filepath_or_content, pofiles, ignore=frozenset({}), save=None, md_encoding='utf-8', po_encoding=None, command_aliases=None, wrapwidth=80, events=None, debug=False, **kwargs)

Translate Markdown content or file using PO files as reference.

This implementation reproduces the same valid Markdown output, given the provided AST, with replaced translations, but doesn’t rebuild the same input format, as Markdown is just a subset of HTML.

Parameters:
  • filepath_or_content (str) – Markdown filepath or content to translate.

  • pofiles (str, list) – Glob or list of globs matching the set of PO files from which messages are extracted to replace strings with their translations.

  • ignore (list) – Paths of PO files to ignore. Useful when a glob alone cannot express exactly which files to extract content from. A filename or a directory name can also be given without indicating the full path.

  • save (str) – Save the output content in the file whose path is specified by this parameter.

  • md_encoding (str) – Markdown content encoding.

  • po_encoding (str) – PO files encoding. If you need different encodings for each file, you must define it in the “Content-Type” field of each PO file metadata, in the form "Content-Type: text/plain; charset=<ENCODING>\n".

  • command_aliases (dict) – Mapping of aliases to use custom mdpo command names in comments. The mdpo- prefix in command names resolution is optional. For example, if you want to use <!-- mdpo-on --> instead of <!-- mdpo-enable -->, you can pass the dictionaries {"mdpo-on": "mdpo-enable"} or {"mdpo-on": "enable"} to this parameter.

  • wrapwidth (int) – Maximum width used rendering the Markdown output.

  • events (dict) –

    Preprocessing events executed during the translation process that can be used to customize the output. Takes lists of functions as values. If one of these functions returns False, that part of the translation process is skipped by po2md. The available events are:

    • enter_block(self, block, details): Executed when the parsing of a Markdown block starts.

    • leave_block(self, block, details): Executed when the parsing of a Markdown block ends.

    • enter_span(self, span, details): Executed when the parsing of a Markdown span starts.

    • leave_span(self, span, details): Executed when the parsing of a Markdown span ends.

    • text(self, block, text): Executed when the parsing of text starts.

    • command(self, mdpo_command, comment, original_command): Executed when an mdpo HTML command is found.

    • msgid(self, msgid, msgstr, msgctxt, tcomment, flags): Executed when a msgid is going to be replaced.

    • link_reference(self, target, href, title): Executed when each reference link is being written in the output (at the end of the translation process).

    These functions can also be referenced by strings with the syntax path/to/file.py::function_name.

  • debug (bool) – Add events displaying all parsed elements in the translation process.

  • **kwargs – Extra arguments passed to mdpo.po2md.Po2Md constructor.
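The ignore parameter accepts full paths as well as bare filenames or directory names. The sketch below shows one way such a filter could behave. The filter_ignored function and the sample paths are assumptions made for illustration, and mdpo's real matching rules may differ.

```python
def filter_ignored(paths, ignore):
    """Drop paths listed in ignore, either in full or by one of their components.

    Hypothetical helper: a path is ignored when it equals an ignore entry or
    when any of its directory names or its filename equals an ignore entry.
    """
    ignored = set(ignore)
    kept = []
    for path in paths:
        components = set(path.replace('\\', '/').split('/'))
        if path in ignored or components & ignored:
            continue
        kept.append(path)
    return kept


po_paths = [
    'locale/es/README.po',
    'locale/es/draft.po',
    'locale/fr/README.po',
]
# 'draft.po' is a bare filename and 'fr' is a bare directory name.
print(filter_ignored(po_paths, ['draft.po', 'fr']))
# ['locale/es/README.po']
```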

Returns:

Markdown output file with translated content.

Return type:

str
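When PO files use different encodings, the po_encoding description above points to the “Content-Type” metadata field of each file. As a rough sketch of how the charset can be read from such a field value, the example below relies only on the standard library. The charset_from_content_type helper is hypothetical and is not how mdpo or polib necessarily read it.

```python
from email.message import Message


def charset_from_content_type(value, default='utf-8'):
    """Extract the charset from a value like 'text/plain; charset=<ENCODING>'.

    Hypothetical helper: returns the charset in lowercase, or the default
    when the value declares no charset.
    """
    message = Message()
    message['Content-Type'] = value.strip()
    return message.get_content_charset(default)


print(charset_from_content_type('text/plain; charset=ISO-8859-1'))
# iso-8859-1
print(charset_from_content_type('text/plain'))
# utf-8
```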

md2po2md

Markdown to PO file to Markdown translator.

mdpo.md2po2md.markdown_to_pofile_to_markdown(langs, input_paths_glob, output_paths_schema, extensions=['collapse_whitespace', 'tables', 'strikethrough', 'tasklists', 'latex_math_spans', 'wikilinks'], command_aliases=None, location=True, debug=False, po_wrapwidth=78, md_wrapwidth=80, po_encoding=None, md_encoding=None, include_codeblocks=False, md2po_kwargs=None, po2md_kwargs=None, _check_saved_files_changed=False, no_obsolete=False, no_fuzzy=False, no_empty_msgstr=False)

Translate a set of Markdown files using PO files.

Parameters:
  • langs (list) – List of languages used to build the output directories.

  • input_paths_glob (str) – Glob covering Markdown files to translate.

  • output_paths_schema (str) –

    Path schema for outputs, built using placeholders. There is a mandatory placeholder for languages, {lang}, and an optional one for the output basename, {basename}. For example, for the schema locale/{lang}, the languages ['es', 'fr'] and a README.md as input, the following files will be written:

    • locale/es/README.po

    • locale/es/README.md

    • locale/fr/README.po

    • locale/fr/README.md

    Note that you can omit {basename} and specify only a directory for each language, as locale/{lang} does in this example. Nonexistent directories and files will be created, so you don’t have to prepare the output directories before execution.

  • extensions (list) –

    md4c extensions used to parse Markdown content, formatted as a list of ‘pymd4c’ keyword arguments. All available extensions are listed in the pymd4c documentation.

  • command_aliases (dict) – Mapping of aliases to use custom mdpo command names in comments. The mdpo- prefix in command names resolution is optional. For example, if you want to use <!-- mdpo-on --> instead of <!-- mdpo-enable -->, you can pass the dictionaries {"mdpo-on": "mdpo-enable"} or {"mdpo-on": "enable"} to this parameter.

  • location (bool) – Store references to the top-level blocks in which the messages are found as #: reference comments in the PO file.

  • debug (bool) – Add events displaying all parsed elements in the extraction process.

  • po_wrapwidth (int) – Maximum width for PO files.

  • md_wrapwidth (int) – Maximum width for produced Markdown contents, when possible.

  • po_encoding (str) – PO files encoding.

  • md_encoding (str) – Markdown files encoding.

  • include_codeblocks (bool) – Include codeblocks in the extraction process.

  • md2po_kwargs (dict) – Additional optional arguments passed to markdown_to_pofile function.

  • po2md_kwargs (dict) – Additional optional arguments passed to pofile_to_markdown function.

  • no_obsolete (bool) – If True, check for obsolete entries in PO files.

  • no_fuzzy (bool) – If True, check for fuzzy entries in PO files.

  • no_empty_msgstr (bool) – If True, check for empty msgstr entries.
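To make the output_paths_schema example above concrete, here is a small sketch that expands a schema into the files listed in that example. The expected_outputs function is hypothetical and covers only schemas without the optional {basename} placeholder; it reproduces the documented example and is not mdpo's implementation.

```python
import posixpath


def expected_outputs(schema, langs, input_path):
    """List the PO and Markdown paths for a schema that only uses {lang}.

    Hypothetical helper mirroring the documented locale/{lang} example.
    """
    if '{lang}' not in schema:
        raise ValueError("the schema must contain the mandatory '{lang}' placeholder")
    if '{basename}' in schema:
        raise NotImplementedError('this sketch only covers schemas without {basename}')
    # README.md -> README
    stem = posixpath.splitext(posixpath.basename(input_path))[0]
    outputs = []
    for lang in langs:
        directory = schema.format(lang=lang)
        outputs.append(posixpath.join(directory, stem + '.po'))
        outputs.append(posixpath.join(directory, stem + '.md'))
    return outputs


for path in expected_outputs('locale/{lang}', ['es', 'fr'], 'README.md'):
    print(path)
# locale/es/README.po
# locale/es/README.md
# locale/fr/README.po
# locale/fr/README.md
```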

mdpo2html

HTML-produced-from-Markdown files translator using PO files as reference.

mdpo.mdpo2html.markdown_pofile_to_html(filepath_or_content, pofiles, ignore=frozenset({}), save=None, po_encoding=None, html_encoding='utf-8', command_aliases=None, **kwargs)

HTML-produced-from-Markdown file translator using PO files.

Produces a translated HTML file given a previous HTML file (created by a Markdown-to-HTML processor) and a set of PO files as reference for msgstrs.

Parameters:
  • filepath_or_content (str) – HTML file path or content to be translated.

  • pofiles (str) – Glob matching the set of PO files used as reference to translate the strings into another language.

  • ignore (list) – List of paths of PO files to ignore, useful if the glob pattern in the pofiles parameter does not fit your requirements.

  • save (str) – Path of an HTML file, which does not need to exist beforehand, where the output of the function will be saved.

  • html_encoding (str) – HTML content encoding.

  • po_encoding (str) – PO files encoding. If you need different encodings for each file, you must define it in the “Content-Type” field of each PO file metadata, in the form "Content-Type: text/plain; charset=<ENCODING>\n".

  • command_aliases (dict) – Mapping of aliases to use custom mdpo command names in comments. The mdpo- prefix in command names resolution is optional. For example, if you want to use <!-- mdpo-on --> instead of <!-- mdpo-enable -->, you can pass the dictionaries {"mdpo-on": "mdpo-enable"} or {"mdpo-on": "enable"} to this parameter.

  • **kwargs – Extra keyword arguments passed to mdpo.mdpo2html.MdPo2HTML constructor.

Known limitations:

Returns:

Translated HTML version of the given file.

Return type:

str