Ceci est une ancienne révision du document !
Hyphenation exceptions
While TeX's hyphenation rules are good, they're not infallible: you will occasionally find words TeX just gets wrong. So for example, TeX's default hyphenation rules (for American English) don't know the word “manuscript”, and since it's a long word you may find you need to hyphenate it. You can “write the hyphenation out” each time you use the word:
... man\-u\-script ...
Here, each of the \-
commands is converted to a hyphenated break,
if (and only if) necessary.
That technique can rapidly become tedious: you'll probably only accept it if there are no more than one or two wrongly-hyphenated words in your document. The alternative is to set up hyphenations in the document preamble. To do that, for the hyphenation above, you would write:
\hyphenation{man-u-script}
and the hyphenation would be set for the whole document. Barbara Beeton publishes articles containing lists of these “hyphenation exceptions”, in TUGboat; the hyphenation “man-u-script” comes from one of those articles.
If you prefer reading the articles, here is the list:
Year | TUGboat issue |
---|---|
2018 | 39(2):152 |
2018 | 39(1):5–6 |
2015 | 36(1):5 |
2013 | 34(2):113–114 |
2012 | 33(1):5–6 |
2010 | 31(3):160 |
2008 | 29(2):239 |
2005 | 26(1):5–6 |
2002 | 23(3/4):247–248 |
2001 | 22(1/2):31–32 |
What if you have more than one language in your document ?
Simple: select the appropriate language, and do the same as above:
\usepackage[french]{babel} \selectlanguage{french} \hyphenation{re-cher-cher}
(nothing clever here: this is the “correct” hyphenation of the word,
in the current tables). However, there's a problem here: just as
words with accent macros in them won't break, so an \hyphenation
commands with accent macros in its argument will produce an error:
\usepackage[french]{babel} \selectlanguage{french} \hyphenation{r\'e-f\'e-rence}
tells us that the hyphenation is “improper”, and that it will be “flushed”.
But, just as hyphenation of words is enabled by selecting an 8-bit
font encoding, so \hyphenation
commands are rendered proper again
by selecting that same 8-bit font encoding. For the hyphenation
patterns provided for “legacy”, the encoding is
Cork, so the complete sequence is:
\usepackage[T1]{fontenc} \usepackage[french]{babel} \selectlanguage{french} \hyphenation{r\'e-f\'e-rence}
The same sort of performance goes for any language for which 8-bit fonts and corresponding hyphenation patterns are available. Since you have to select both the language and the font encoding to have your document typeset correctly, it should not be a great imposition to do the selections before setting up hyphenation exceptions.
Modern TeX variants (principally XeTeX
and LuaTeX
) use unicode,
internally, and distributions that offer them also offer
UTF-8-encoded patterns; since the hyphenation team do all the
work “behind the scenes”, the use of Unicode hyphenation is
deceptively similar to what we are used to.
Sources: