
The left-to-right mark (LRM) is a control character, or non-printing character, used in the computerized typesetting of bidirectional text, that is, text that mixes left-to-right scripts (such as English and Russian) with right-to-left scripts (such as Arabic and Hebrew). It is used to change the way adjacent characters are grouped with respect to text direction.

In Unicode

In Unicode, LRM is encoded as U+200E LEFT-TO-RIGHT MARK (HTML: &#8206; or &lrm;). In UTF-8 it is the byte sequence E2 80 8E. Its usage is prescribed by the Unicode Bidirectional (Bidi) Algorithm.
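
The values above can be checked with a short Python sketch. It uses only the standard library; the variable names are illustrative and not part of any specification.

```python
import html
import unicodedata

# U+200E, written as an escape so the invisible character is easy to spot
LRM = "\u200e"

name = unicodedata.name(LRM)              # official Unicode character name
utf8_hex = LRM.encode("utf-8").hex(" ")   # UTF-8 bytes as spaced hex
numeric_ref = f"&#{ord(LRM)};"            # decimal HTML character reference
from_entity = html.unescape("&lrm;")      # named HTML entity decoded to text

print(name)                 # LEFT-TO-RIGHT MARK
print(utf8_hex)             # e2 80 8e
print(numeric_ref)          # &#8206;
print(from_entity == LRM)   # True
```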

Example of use in HTML

Suppose the writer wishes to insert a run of English (i.e. left-to-right) text into an Arabic or Hebrew paragraph, with non-alphabetic characters at the end of the English text (on the right). For example, the sentence "The language C++ is a programming language used..." written in Arabic, but with "C++" in English, renders as follows:

‫ لغة C++ هي لغة برمجة تستخدم...

With an LRM entered in the HTML immediately after the ++, it renders as follows:

‫ لغة C++‎ هي لغة برمجة تستخدم...

Standards-compliant browsers will render the ++ on the left of the "C" in the first example, and on the right in the second.

This happens because the browser recognizes that the paragraph is in an RTL script (Arabic) and positions the punctuation, which is neutral as to its direction, according to the direction of the more prominent (paragraph-level) adjacent text. With the LRM in place, the punctuation is adjacent only to LTR characters (the "C" and the LRM) and is therefore positioned as if it were in left-to-right text, i.e., to the right of the preceding text.
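
The effect can be made visible by listing the bidirectional class that Unicode assigns to each character around the "++". The following is a minimal sketch using Python's standard `unicodedata` module; the helper `bidi_classes` and the sample strings are hypothetical names chosen for illustration, and the code only inspects character properties rather than running the Bidi algorithm itself.

```python
import unicodedata

LRM = "\u200e"

def bidi_classes(text):
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

# "C++" followed by a space and two Arabic letters (U+0647, U+064A),
# written with escapes so the source itself is not reordered on display.
without_lrm = "C++" + " \u0647\u064a"
with_lrm = "C++" + LRM + " \u0647\u064a"

# L  = strong left-to-right, AL = strong right-to-left (Arabic letter),
# ES = European separator (a weak class that the algorithm handles like a
#      neutral when no digits are adjacent), WS = whitespace.
print(bidi_classes(without_lrm))  # ['L', 'ES', 'ES', 'WS', 'AL', 'AL']
print(bidi_classes(with_lrm))     # ['L', 'ES', 'ES', 'L', 'WS', 'AL', 'AL']
```

In the first list the plus signs sit between a strong LTR character and a strong RTL character, so their placement follows the paragraph direction. In the second list they are enclosed by two strong LTR characters, which is why they stay with the "C".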
