Along with plain-text translation, support for translating structured documents with an MT system is a common and useful feature.
- html
  - Supported in MinT, using two-stage BeautifulSoup-based parsing.
  - Translating a webpage by providing its URL is supported, but arbitrary webpages still hit performance issues and bugs. Needs more testing.
- markdown
  - Supported in MinT by converting markdown to HTML and translating that. The translated HTML is converted back to markdown.
  - Need to explore whether markdown parsers such as marko can help avoid the round trip through HTML. Needs more testing.
- json
  - Supported in MinT, using two-stage parsing. Only values that are strings are translated.
- svg
  - Supported in MinT, using two-stage parsing. Text in text and textPath nodes is translated; tspan is not supported yet.
- Wikipedia i18n format strings. See T341544: Wikitext syntax is translated when requesting translations via MinT
- Wikitext (T347018)
  - The easiest way to support this is to go through HTML, as is done for markdown.
- OpenDocument formats like odt, odp, ods etc
- MS Word
- ..
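
To illustrate the json case from the list above, here is a minimal sketch of translating only the string values of a JSON document. It is not MinT's actual code. It assumes that "two stage" means a first pass that collects translatable strings and a second pass that writes the translations back. The names `collect_strings`, `translate_json` and the `translate_batch` callback (a stand-in for the MT system) are hypothetical.

```python
import json
from typing import Any, Callable, List, Tuple

# A path is the sequence of dict keys / list indexes leading to one value.
Path = Tuple[Any, ...]


def collect_strings(node: Any, path: Path = ()) -> List[Tuple[Path, str]]:
    """Stage 1 (assumed): record the path and text of every string value."""
    if isinstance(node, str):
        return [(path, node)]
    found: List[Tuple[Path, str]] = []
    if isinstance(node, dict):
        for key, value in node.items():
            found.extend(collect_strings(value, path + (key,)))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.extend(collect_strings(value, path + (index,)))
    # Numbers, booleans and null are not strings, so they are skipped.
    return found


def translate_json(text: str, translate_batch: Callable[[List[str]], List[str]]) -> str:
    """Stage 2 (assumed): translate the collected strings and write them back.

    translate_batch is a hypothetical stand-in for the MT system: it takes a
    list of source strings and returns the translations in the same order.
    """
    data = json.loads(text)
    if isinstance(data, str):
        # The whole document is a single string value.
        return json.dumps(translate_batch([data])[0], ensure_ascii=False)

    entries = collect_strings(data)
    translations = translate_batch([value for _, value in entries])
    for (path, _), translated in zip(entries, translations):
        parent = data
        for step in path[:-1]:
            parent = parent[step]
        parent[path[-1]] = translated  # keys are left untranslated
    return json.dumps(data, ensure_ascii=False)


if __name__ == "__main__":
    # Fake "translator" that upper-cases text, used only to show the data flow.
    fake_mt = lambda texts: [t.upper() for t in texts]
    print(translate_json('{"title": "hello", "count": 3}', fake_mt))
```

Collecting all strings first means they can be sent to the MT system as one batch, and the structure (keys, numbers, booleans, nesting) is never exposed to the translator.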