NAME

groff_char − groff glyph names

DESCRIPTION

This manual page lists the standard groff glyph names and the default input mapping, latin1. The glyphs in this document look different depending on which output device was chosen (with option -T for the man(1) program or the roff formatter). Glyphs not available for the device that is being used to print or view this manual page are marked with ’(N/A)’; the device currently used is ’html’.

In the current version, groff provides only 8-bit characters for direct input and named entities for further glyphs. On ASCII platforms, input character codes in the range 0 to 127 (decimal) represent the usual 7-bit ASCII characters, while codes from 128 to 255 are interpreted as the corresponding characters in the latin1 (ISO-8859-1) code set by default. This mapping is contained in the file latin1.tmac and can be changed by loading a different input encoding. Note that some of the input characters are reserved by groff, either for internal use or for special input purposes. On EBCDIC platforms, only code page cp1047 is supported (which contains the same characters as latin1; the input encoding file is called cp1047.tmac). Again, some input characters are reserved for internal and special purposes.

All roff systems provide the concept of named glyphs. In traditional roff systems, only names of length 2 were used, while groff also provides support for longer names. It is strongly suggested that only named glyphs be used for all character representations outside the printable 7-bit ASCII range.

Some of the predefined groff escape sequences (with names of length 1) also produce single glyphs; these exist for historical reasons or are printable versions of syntactical characters. They include ’\\’, ’\'’, ’\`’, ’\-’, ’\.’, and ’\e’; see groff(7).

In groff, all of these different types of characters and glyphs can be tested positively with the ’.if c’ conditional.
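
As a minimal sketch of such a test, the following fragment defines a string depending on whether the Euro glyph ’\[Eu]’ is available on the current device. The string name ’euro’ and the fallback text ’EUR’ are assumptions made for this example only.

```roff
.\" Use the Euro glyph if the current device provides it,
.\" otherwise fall back to plain ASCII text.
.\" The string name "euro" is hypothetical.
.ie c \[Eu] .ds euro \[Eu]
.el .ds euro EUR
The price is 10\ \*[euro].
```

The same pattern should work for any glyph listed in this page, since the ’c’ condition accepts both ordinary characters and named glyphs.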

REFERENCE

In this section, the glyphs in groff are specified in tabular form. The meaning of the columns is as follows.

Output

shows how the glyph is printed for the current device; although this can have quite a different shape on other devices, it always represents the same glyph.

Input

specifies how the glyph is input either directly by a key on the keyboard, or by a groff escape sequence.

Code

applies to glyphs which can be input with a single character, and gives the ISO latin1 decimal code of that input character. Note that these codes coincide with the first 256 Unicode code points, including 7-bit ASCII in the range 0 to 127.

PostScript

gives the usual PostScript name of the glyph.

Unicode

is the glyph name used in composite glyph names. The names in the Unicode column look like u0021 or u0041_0300. In groff, the corresponding Unicode characters can be constructed by adding a backslash and a pair of square brackets, for example \[u0021] or \[u0041_0300].
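
A short sketch of this notation, using the two names quoted above; whether the composite glyph is actually printed depends on the output device and its fonts.

```roff
.\" u0021 is the exclamation mark.
Stop\[u0021]
.br
.\" u0041_0300 is "A" followed by a combining grave accent.
\[u0041_0300] la carte
```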

7-bit Character Codes 32–126
These are the basic glyphs having 7-bit ASCII code values assigned. They are identical to the printable characters of the character standards ISO-8859-1 (latin1) and Unicode (range Basic Latin). The glyph names used in composite glyph names are ’u0020’ up to ’u007E’.

Note that input characters in the range 0−31 and character 127 are not printable characters. Most of them are invalid input characters for groff anyway, and the valid ones have special meaning. For EBCDIC, the printable characters are in the range 66−255.

48−57

Decimal digits 0 to 9 (print as themselves).

65−90

Upper case letters A−Z (print as themselves).

97−122

Lower case letters a–z (print as themselves).

Most of the remaining characters outside the ranges just described print as themselves; the only exceptions are the following characters:

`

the ISO latin1 ’Grave Accent’ (code 96) prints as ‘, a left single quotation mark (Unicode u2018). The same output glyph can be requested explicitly with ’\(oq’. The original character can be obtained with ’\`’ (Unicode u0060).

'

the ISO latin1 ’Apostrophe’ (code 39) prints as ’, a right single quotation mark (Unicode u2019). The same output glyph is commonly used in typography to represent a punctuation apostrophe, for example in contractions. It can be requested explicitly with ’\(cq’. The original character can be obtained with ’\(aq’ (Unicode u0027).

-

the ISO latin1 ’Hyphen, Minus Sign’ (code 45) prints as a hyphen (Unicode u2010). The same output glyph can be requested explicitly with ’\(hy’. A minus sign can be obtained with ’\-’ (Unicode u2212).

~

the ISO latin1 ’Tilde’ (code 126) is reduced in size to be usable as a diacritic (Unicode u02DC). A larger glyph can be obtained with ’\(ti’ (Unicode u007E).

^

the ISO latin1 ’Circumflex Accent’ (code 94) is reduced in size to be usable as a diacritic (Unicode u02C6); a larger glyph can be obtained with ’\(ha’ (Unicode u005E).
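
These exceptions matter most for text that readers copy verbatim, such as shell commands. The following sketch uses only the escapes named above; the commands themselves are arbitrary sample text.

```roff
.\" \- requests a minus sign, \(aq a neutral apostrophe,
.\" \(ti and \(ha the full-size tilde and circumflex,
.\" and \` the grave accent.
.nf
ls \-l \(aq*.txt\(aq
cd \(ti/src
grep \(hafoo file
echo \`date\`
.fi
```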

8-bit Character Codes 160 to 255
These codes are interpreted as printable characters according to the latin1 (ISO-8859-1) code set, which is identical to the Unicode range Latin-1 Supplement.

Input characters in range 128−159 (on non-EBCDIC hosts) are not printable characters.

160

the ISO latin1 no-break space is mapped to ’\~’, the stretchable space character.

173

the soft hyphen control character. groff never uses this character for output (thus it is omitted in the table below); the input character 173 is mapped onto ’\%’.

The remaining ranges (161−172, 174−255) are printable characters that print as themselves. Although they can be specified directly with the keyboard on systems with a latin1 code page, it is better to use their glyph names; see the next section.
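
As an illustration, the following sketch writes text in pure ASCII, using glyph names in place of latin1 input codes 233, 252, and 223, and the escapes ’\~’ and ’\%’ in place of input characters 160 and 173. The sample words are arbitrary.

```roff
.\" Glyph names instead of 8-bit latin1 input characters.
caf\['e], M\[:u]nchen, Stra\[ss]e
.br
.\" \~ is an unbreakable, stretchable space.
see page\~12
.br
.\" \% marks a possible hyphenation point inside a word.
hy\%phen\%ation
```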

Named Glyphs
Glyph names can be embedded into the document text by using escape sequences. groff(7) describes how these escape sequences look. Glyph names can consist of quite arbitrary characters from the ASCII or latin1 code set, not only alphanumeric characters. Here are some examples:

\(ch

A glyph having the 2-character name ch.

\[char_name]

A glyph having the name char_name (having length 1, 2, 3, ...). Note that ’c’ is not the same as ’\[c]’ (where c is a single character): the latter is internally mapped to glyph name ’\c’. By default, groff defines a single glyph name starting with a backslash, namely ’\-’, which can be accessed either as ’\-’ or as ’\[-]’.

\[base_glyph composite_1 composite_2 ...]

A composite glyph; see below for a more detailed description.
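
The first two forms can be sketched as follows, using the glyph names ’em’ (em dash) and ’sqrt’ (radical); the surrounding words are arbitrary.

```roff
.\" Both lines request the glyph with the 2-character name "em".
yes\(emno
.br
yes\[em]no
.br
.\" A name longer than two characters needs the bracket form.
\[sqrt]
```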

In groff, each 8-bit input character can also be referred to by the construct ’\[charn]’ where n is the decimal code of the character, a number between 0 and 255 without leading zeros (those entities are not glyph names). They are normally mapped onto glyphs using the .trin request.
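
A minimal sketch of such a mapping, assuming latin1 input where code 233 is the letter e with acute accent. An input encoding file such as latin1.tmac is expected to contain requests of this kind already, so the request is repeated here for illustration only.

```roff
.\" Map input character 233 onto the named glyph \['e].
.trin \[char233]\['e]
.\" The entity can also be written out in ASCII.
caf\[char233]
```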

Another special convention is the handling of glyphs with names directly derived from a Unicode code point; this is shown in the ’Unicode’ column of the table below. In general, all glyphs not having a name as listed in this manual page can be accessed with the ’\[uXXXX]’ construct. Refer to section “Using Symbols” in Groff: The GNU Implementation of troff, the groff Texinfo manual, which describes how groff glyph names are constructed.

Moreover, new glyph names can be created by the .char request; see groff(7).
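
For instance, the following sketch defines a glyph for degrees Celsius from the existing degree glyph ’\[de]’. The name ’degC’ is made up for this example and is not a standard groff glyph name.

```roff
.\" "degC" is a hypothetical glyph name used only in this sketch.
.char \[degC] \[de]C
Water boils at 100\[degC].
.\" A glyph defined with .char also satisfies the "c" condition.
.if c \[degC] .tm glyph degC is defined
```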

In the following, a plus sign ’+’ in the ’Notes’ column indicates that this particular glyph name appears in the PS version of the original troff documentation, CSTR 54.

Entries marked with ’***’ denote glyphs for mathematical purposes (mainly used for DVI output). Normally, such glyphs have metrics which make them unusable in normal text.

Ligatures and Other Latin Glyphs

Accented Characters

Accents

The composite request is used to map most of the accents to non-spacing glyph names; the values given in parentheses are the original (spacing) ones.

Quotes

Punctuation

Brackets

The extensible bracket pieces are font-invariant glyphs. In classical troff only one glyph was available to vertically extend brackets, braces, and parentheses: ’bv’. We map it rather arbitrarily to u23AA.

Note that not all devices contain extensible bracket pieces which can be piled up with ’\b’ due to the restrictions of the escape’s piling algorithm. A general solution to build brackets out of pieces is the following macro:

.\" Make a pile centered vertically 0.5em
.\" above the baseline.
.\" The first argument is placed at the top.
.\" The pile is returned in string 'pile'
.eo
.de pile-make
.  nr pile-wd 0
.  nr pile-ht 0
.  ds pile-args
.
.  nr pile-# \n[.$]
.  while \n[pile-#] \{\
.    nr pile-wd (\n[pile-wd] >? \w'\$[\n[pile-#]]')
.    nr pile-ht +(\n[rst] - \n[rsb])
.    as pile-args \v'\n[rsb]u'\"
.    as pile-args \Z'\$[\n[pile-#]]'\"
.    as pile-args \v'-\n[rst]u'\"
.    nr pile-# -1
.  \}
.
.  ds pile \v'(-0.5m + (\n[pile-ht]u / 2u))'\"
.  as pile \*[pile-args]\"
.  as pile \v'((\n[pile-ht]u / 2u) + 0.5m)'\"
.  as pile \h'\n[pile-wd]u'\"
..
.ec
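
A possible call of this macro is sketched below. It assumes that the long glyph names ’bracelefttp’, ’braceleftmid’, and ’braceleftbt’ exist for the current device; the trailing text is arbitrary.

```roff
.\" Build a three-piece left brace; the first argument
.\" becomes the top piece.
.pile-make \[bracelefttp] \[braceleftmid] \[braceleftbt]
.\" The result is returned in the string "pile".
\*[pile]text after the brace
```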

Another complication is the fact that some glyphs which represent bracket pieces in original troff can also be used for other mathematical symbols, for example ’lf’ and ’rf’, which provide the ’floor’ operator. Other devices (most notably for DVI output) don’t unify such glyphs. For this reason, the four glyphs ’lf’, ’rf’, ’lc’, and ’rc’ are not unified with similar-looking bracket pieces. In groff, only glyphs with long names are guaranteed to pile up correctly for all devices (provided those glyphs exist).

Arrows

Lines

The font-invariant glyphs ’br’, ’ul’, and ’rn’ form corners; they can be used to build boxes. Note that both the PostScript and the Unicode-derived names of these three glyphs are just rough approximations.

’rn’ also serves in classical troff as the horizontal extension of the square root sign.

’ru’ is a font-invariant glyph, namely a rule of length 0.5m.

Use ’\[radicalex]’, not ’\[overline]’, for continuation of square root.

Text markers

Legal Symbols

The Bell Labs logo is not supported in groff.

Currency symbols

Units

Logical Symbols

Mathematical Symbols

Greek glyphs

These glyphs are intended for technical use, not for real Greek; normally, the uppercase letters have upright shape, and the lowercase ones are slanted.

There is a problem with the mapping of the letter phi to Unicode. Prior to Unicode version 3.0, the difference between U+03C6, GREEK SMALL LETTER PHI, and U+03D5, GREEK PHI SYMBOL, was not clearly described; only the glyph shapes in the Unicode book could be used as a reference. Starting with Unicode 3.0, the reference glyphs have been exchanged and are also described verbally: in mathematical context, U+03D5 is the stroked variant and U+03C6 the curly glyph. Unfortunately, most font vendors didn’t update their fonts to this (incompatible) change in Unicode. At the time of this writing (January 2006), it is not yet clear whether the Adobe Glyph Names ’phi’ and ’phi1’ also change their meaning if used for mathematics, so compatibility problems are likely to happen. Being conservative, groff currently assumes that ’phi’ in a PostScript symbol font is the stroked version.

In groff, symbol ’\[*f]’ always denotes the stroked version of phi, and ’\[+f]’ the curly variant.

Card symbols

AUTHORS

This document was written by James Clark (jjc [AT] jclark.com), with additions by Werner Lemberg (wl [AT] gnu.org) and Bernd Warken (groff-bernd.warken-72 [AT] web.de), and revised to use real tables by Eric S. Raymond (esr [AT] thyrsus.com).

SEE ALSO

Groff: The GNU Implementation of troff, by Trent A. Fisher and Werner Lemberg, is the primary groff manual. Section “Using Symbols” may be of particular note. You can browse it interactively with “info '(groff)Using Symbols'”.
groff(1)

the GNU roff formatter

groff(7)

a short reference of the groff formatting language

An extension to the troff character set for Europe, E.G. Keizer, K.J. Simonsen, J. Akkerhuis; EUUG Newsletter, Volume 9, No. 2, Summer 1989
The Unicode Standard, http://www.unicode.org
