The left-to-right mark (LRM) is a non-printing control character used in the computerized typesetting of bidirectional text, that is, text containing both left-to-right scripts (such as English and Russian) and right-to-left scripts (such as Arabic and Hebrew). It is used to change the way adjacent characters are grouped with respect to text direction.

In Unicode

In Unicode, LRM is encoded as U+200E LEFT-TO-RIGHT MARK (HTML: &#8206; or &lrm;). Its UTF-8 encoding is E2 80 8E. Its usage is prescribed in the Unicode Bidirectional (Bidi) Algorithm.
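
The encoding details above can be checked with a short Python sketch that uses only the standard library. The helper name `describe_lrm` is illustrative and not part of any API.

```python
import html
import unicodedata

LRM = "\u200e"  # U+200E LEFT-TO-RIGHT MARK


def describe_lrm():
    """Collect the basic encoding facts about the left-to-right mark."""
    return {
        "code_point": f"U+{ord(LRM):04X}",
        "name": unicodedata.name(LRM),
        # UTF-8 bytes written as space-separated upper-case hex pairs
        "utf8_hex": " ".join(f"{b:02X}" for b in LRM.encode("utf-8")),
        # Both HTML spellings should decode to the same character
        "named_entity_matches": html.unescape("&lrm;") == LRM,
        "numeric_entity_matches": html.unescape("&#8206;") == LRM,
    }


if __name__ == "__main__":
    for key, value in describe_lrm().items():
        print(f"{key}: {value}")
```

Because the character is invisible, writing it as an escape (`"\u200e"`) or an entity (`&lrm;`) rather than pasting it literally tends to make source text easier to review.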

Example of use in HTML

Suppose the writer wishes to insert a run of English (i.e. left-to-right) text into an Arabic or Hebrew paragraph, with non-alphabetic characters at the end of the English text (on the right). For example, the sentence "The language C++ is a programming language used..." written in Arabic, but with "C++" in English, renders as follows:

‫ لغة C++ هي لغة برمجة تستخدم...

With an LRM entered in the HTML after the ++, it renders as follows:

‫ لغة C++‎ هي لغة برمجة تستخدم...

Standards-compliant browsers will render the ++ to the left of the "C" in the first example, and to the right of it in the second.

This happens because the browser recognizes that the paragraph is in an RTL script (Arabic) and positions the punctuation, which is neutral as to its direction, according to the direction of the surrounding paragraph-level text. The LRM causes the punctuation to be adjacent only to LTR text (the "C" and the LRM) and hence to be positioned as if it were in left-to-right text, i.e., to the right of the preceding text.
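
The reasoning above can be sketched in Python. This is a deliberately simplified model, not the full Bidi algorithm: it ignores embedding levels and the separate handling of weak character types, and only asks which strongly directional characters surround a run such as "++". The function names `strong_direction` and `resolve_run_direction` are hypothetical; `unicodedata.bidirectional` is the standard-library call that reports a character's bidirectional class.

```python
import unicodedata

LRM = "\u200e"  # U+200E LEFT-TO-RIGHT MARK


def strong_direction(ch):
    """Return 'L' or 'R' for strongly directional characters, else None."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class == "L":
        return "L"
    if bidi_class in ("R", "AL"):  # Hebrew-style and Arabic-style letters
        return "R"
    return None


def resolve_run_direction(text, start, end, paragraph_dir="R"):
    """Simplified direction for text[start:end], a run with no strong characters.

    If the nearest strong characters on both sides agree, the run follows
    them; otherwise it falls back to the paragraph direction (an assumption
    of this sketch: missing neighbours also count as the paragraph direction).
    """
    before = next(
        (d for d in map(strong_direction, reversed(text[:start])) if d),
        paragraph_dir,
    )
    after = next(
        (d for d in map(strong_direction, text[end:]) if d),
        paragraph_dir,
    )
    return before if before == after else paragraph_dir


def plus_plus_direction(text):
    """Direction assigned to the first '++' in an RTL paragraph."""
    start = text.index("++")
    return resolve_run_direction(text, start, start + 2, paragraph_dir="R")


without_lrm = "لغة C++ هي لغة برمجة"
with_lrm = "لغة C++" + LRM + " هي لغة برمجة"

if __name__ == "__main__":
    print("without LRM:", plus_plus_direction(without_lrm))
    print("with LRM:   ", plus_plus_direction(with_lrm))
```

In this model the "++" without an LRM sits between the LTR "C" and RTL Arabic letters, so it takes the paragraph direction; with the LRM it has LTR characters on both sides and stays with the "C".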
