In computing and typesetting, a soft hyphen (U+00AD SOFT HYPHEN, HTML: &#173; or &shy;) is a type of hyphen that marks a place in text where a hyphenated line break is allowed, without forcing a break at that place if the text is re-flowed.

The exact semantics of the soft hyphen vary between standards. According to the Unicode standard, a soft hyphen is not displayed if the line is not broken at that point.[1] HTML4 describes it as a "hyphenation hint," though its wording suggests that this interpretation is not universal:[2]

In HTML, there are two types of hyphens: the plain hyphen and the soft hyphen. The plain hyphen should be interpreted by a user agent as just another character. The soft hyphen tells the user agent where a line break can occur. Those browsers that interpret soft hyphens must observe the following semantics: If a line is broken at a soft hyphen, a hyphen character must be displayed at the end of the first line. If a line is not broken at a soft hyphen, the user agent must not display a hyphen character. For operations such as searching and sorting, the soft hyphen should always be ignored.
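The last requirement in the quotation, ignoring soft hyphens when searching and sorting, can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the helper names (strip_soft_hyphens, contains_ignoring_shy, sorted_ignoring_shy) are hypothetical, and real applications may also need case folding or Unicode normalization.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN


def strip_soft_hyphens(text: str) -> str:
    """Return text with every U+00AD character removed."""
    return text.replace(SOFT_HYPHEN, "")


def contains_ignoring_shy(haystack: str, needle: str) -> bool:
    """Substring search that treats soft hyphens as if they were absent."""
    return strip_soft_hyphens(needle) in strip_soft_hyphens(haystack)


def sorted_ignoring_shy(words):
    """Sort words by their text with soft hyphens removed."""
    return sorted(words, key=strip_soft_hyphens)


# A plain substring test fails because of the embedded soft hyphens.
word = "hy\u00adphen\u00adation"
print("hyphenation" in word)                       # False
print(contains_ignoring_shy(word, "hyphenation"))  # True
```

Sorting behaves similarly: U+00AD compares higher than any ASCII letter, so "a\u00adble" sorts after "acorn" in a naive sort but before it once soft hyphens are ignored.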

ISO 8859-1 specifies that the soft hyphen is always visible. EBCDIC has a SHY character, with "SHY" an abbreviation for "syllable hyphen,"[1][3] which is defined by IBM to mean a "hyphen used to divide a word at the end of a line [that] may be removed when a program adjusts lines."[4]

In most parts of ISO 8859 the soft hyphen is at position 0xAD (hexadecimal), and since the first 256 positions in Unicode are taken from ISO 8859-1, it has the Unicode code point U+00AD. HTML 3.2 introduced a character entity for the soft hyphen, "&shy;". In TeX and LaTeX the soft hyphen is represented by the command \-.[5]
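The relationship between these encodings can be checked with Python's standard library. The following is a small sketch rather than a reference; the variable names are chosen here for illustration only.

```python
import html
import unicodedata

SOFT_HYPHEN = "\u00ad"  # U+00AD

# Unicode character name and general category (Cf = format character).
name = unicodedata.name(SOFT_HYPHEN)
category = unicodedata.category(SOFT_HYPHEN)

# In ISO 8859-1 the character is the single byte 0xAD.
latin1_bytes = SOFT_HYPHEN.encode("iso-8859-1")

# Both the named and the numeric HTML references decode to U+00AD.
from_named = html.unescape("&shy;")
from_numeric = html.unescape("&#173;")

print(name, category, latin1_bytes)
print(from_named == SOFT_HYPHEN, from_numeric == SOFT_HYPHEN)
```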

To show the effect of a soft hyphen, the following “wocka”s have been separated with soft hyphens:

Pac-Man goes "wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka­wocka."

On browsers supporting soft hyphens, resizing the window will hyphenate the above text only as “wocka-”s (and never with, for example, “wock-”). On browsers not supporting soft hyphens, the above might appear as one very long line, or as several lines but without hyphens at the end of each broken line.
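The behaviour of a supporting browser can be approximated with a simple greedy line breaker. The sketch below is not how any particular browser is implemented; wrap_with_soft_hyphens is a hypothetical helper that assumes a fixed-width font, breaks only at spaces and soft hyphens, and lets over-long fragments overflow the line.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD


def wrap_with_soft_hyphens(text: str, width: int) -> list[str]:
    """Greedily wrap text to at most `width` columns.

    A visible "-" is shown only where a line is actually broken at a
    soft hyphen; soft hyphens that are not used produce no output.
    """
    lines: list[str] = []
    current = ""
    for word in text.split():
        fragments = word.split(SOFT_HYPHEN)
        for i, fragment in enumerate(fragments):
            is_last = i == len(fragments) - 1
            # Keep one column free for a "-" in case we must break next.
            reserve = 0 if is_last else 1
            # Words are joined by a space; fragments of one word are not.
            joiner = " " if (i == 0 and current) else ""
            if len(current) + len(joiner) + len(fragment) + reserve <= width:
                current += joiner + fragment
            else:
                if i > 0 and current:
                    current += "-"  # breaking inside a word: show hyphen
                if current:
                    lines.append(current)
                current = fragment
    if current:
        lines.append(current)
    return lines


sample = 'Pac-Man goes "wocka\u00adwocka\u00adwocka."'
for line in wrap_with_soft_hyphens(sample, 20):
    print(line)
```

With a width of 20 columns, the sample breaks after the first “wocka” and shows a hyphen there; with a width wide enough for the whole string, no hyphen appears at all.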

Compare the soft hyphen's semantics and HTML implementation with those of the zero-width space.

Accessibility issues

Soft hyphens are known to cause some text-to-speech systems to mispronounce words.[citation needed]

Security issues

Soft hyphens have been used to obscure malicious domains or URLs in email spam.[6][7]
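One plausible reason this works is that a string containing soft hyphens can look identical to the clean string when rendered, yet fail a naive text comparison. The sketch below illustrates that assumption only; the function names and the domain "malicious.example" are hypothetical, and a real filter would need to handle many other invisible characters as well.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD


def soft_hyphen_positions(text: str) -> list[int]:
    """Return the index of every soft hyphen found in text."""
    return [i for i, ch in enumerate(text) if ch == SOFT_HYPHEN]


def matches_blocklist(url_text: str, blocklist: set[str]) -> bool:
    """Check url_text against blocked domains after removing soft hyphens."""
    cleaned = url_text.replace(SOFT_HYPHEN, "").lower()
    return any(domain in cleaned for domain in blocklist)


blocklist = {"malicious.example"}  # hypothetical blocked domain
obfuscated = "http://mal\u00adicious.exam\u00adple/login"

# A naive substring check misses the obfuscated URL.
print(any(domain in obfuscated for domain in blocklist))  # False
print(matches_blocklist(obfuscated, blocklist))           # True
print(soft_hyphen_positions(obfuscated))                  # [10, 22]
```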
