There are two types of mapfile: version 1.0 for 8-bit character sets and version 2.0 for multi-byte character sets.
The internal character set is defined by the right column of the input map, and the first column of the output map.
Any character value not given is assumed to map directly; only the differences are shown in the mapfile. See mapchan(1M). This is a many-to-one mapping: the left column must be unique, and more than one occurrence of any entry is considered invalid and produces an error, whereas the right column characters can appear more than once. Nulls can be produced with compose sequences or as part of an output string.
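The default-to-identity rule and the uniqueness rule for the left column can be sketched in a few lines of Python. This is only an illustration, not how mapchan itself is implemented; the function name build_map and the list-of-pairs input are hypothetical.

```python
def build_map(pairs):
    """Build a 256-entry single-byte map from (left, right) pairs.

    Values not listed map to themselves, so only the differences
    need to be supplied. A repeated left-column value is an error;
    right-column values may repeat (many-to-one).
    """
    table = list(range(256))  # identity mapping by default
    seen = set()
    for left, right in pairs:
        if left in seen:
            raise ValueError(f"duplicate left-column entry: {left:#04x}")
        seen.add(left)
        table[left] = right
    return table


# Two entries taken from the version 1.0 input example: '@' and '['.
table = build_map([(0x40, 0xE0), (0x5B, 0xE2)])
```

Here table[0x40] is 0xE0, while an unlisted value such as 0x41 still maps to 0x41.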
Note that characters are always put through the input map, even when they are part of the dead or compose sequences. The internal value is then looked up in the dead/compose section. This value may also be mapped to the output. This should be kept in mind when preparing mapfiles.
To illustrate this, consider a cursor-control sequence which should be passed directly to the terminal without being mapped. Such a sequence would typically begin with a fixed escape sequence instructing the terminal to interpret the following two characters as a cursor position; the values of the following two characters are variable, and depend on the cursor position requested. Such a control sequence would be specified as:
\E= 2 # Cursor control: escape = <x> <y>
There are two subsections under the control section: the input section, which is used to filter data sent from the terminal to UnixWare, and the output section, which is used to filter data sent from UnixWare to the terminal. The two fields in each control sequence are separated by white space, that is, space or tab characters.
The entries in the left column of the input section must be different from those already entered in the general input section and likewise for the output section.
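The pass-through behaviour of a control sequence can be illustrated with a small Python sketch. It assumes that the number in the second field is the count of characters following the fixed prefix that must not be mapped, as in the cursor-control example. The function name filter_output, the dictionary arguments, and the simplification to single-byte output values are assumptions made for the illustration only.

```python
def filter_output(data, control, char_map):
    """Apply char_map to data, leaving control sequences untouched.

    control:  dict of prefix bytes -> number of following bytes to
              pass through unmapped (e.g. {b"\x1b=": 2} for \\E= 2).
    char_map: dict of byte value -> byte value (general output map);
              unlisted values map to themselves.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        for prefix, count in control.items():
            if data.startswith(prefix, i):
                # Copy the prefix and its parameter bytes unchanged.
                end = i + len(prefix) + count
                out += data[i:end]
                i = end
                break
        else:
            # Ordinary character: apply the general mapping.
            out.append(char_map.get(data[i], data[i]))
            i += 1
    return bytes(out)


control = {b"\x1b=": 2}      # \E= 2  (cursor control)
char_map = {0x41: 0x61}      # hypothetical mapping of 'A' to 'a'
result = filter_output(b"A\x1b=AAA", control, char_map)
```

In this run the first and last ``A'' are mapped, while the two ``A'' bytes that follow the escape prefix are passed through as cursor coordinates, giving b"a\x1b=AAa".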
In the control section, if any of the following three characters (space, tab, or the hash character ``#'') is required in the specification itself, it should be entered using one of the alternative means of entering characters, as follows:
#
#sharp/pound/cross-hatched is the comment character
#however, a quoted # ('#') is 0x23, not a comment.
#
#beep, input, output, dead, compose and control
#are special keywords and should appear as shown.
#
# version 1.0 (optional - if not present ver 1.0 will be assumed.)
beep #sound bell when a dead/compose sequence not specified
#in a mapfile is entered at the terminal.
input
'@' 0xe0 # a grave
'[' 0xe2 # a circumflex
']' 0xea # e circumflex
dead 0x90 #circumflex dead key
'E' 0xca # E circumflex
'I' 0xce # I circumflex
'O' 0xd4 # O circumflex
dead 0x93
'A' 0xc0 # A grave
'E' 0xc8 # E grave
'I' 0xcc # I grave
compose 0x14 # Ctrl-T is the compose key
's' '|' '$' # dollar sign
'A' 'A' '@' # at sign
'(' '(' '[' # open square bracket
output
0xa8 '"' # diaeresis (approximation)
0xa9 'c' # copyright sign (approximation)
0xaa 'a' # feminine ordinal indicator (approximation)
control # The control must be last
input
\E[ 1 # Standard ANSI key codes
output
\E[ 1 # Standard ANSI escape sequences
# version 2.0 - must be present for version 2.0 mapfiles
The sequences (two or more bytes) that are not given are considered invalid. For example, consider the following section of a mapfile:
input
a : l
b : m n
c d : o
c e : p q
``a'' and ``b'' are single-byte input sequences that are translated to ``l'' and ``mn'' respectively. ``cd'' and ``ce'' are multibyte (2 byte) input sequences mapped to ``o'' and ``pq'' respectively.
If ``d'' is entered, it is passed to the kernel unchanged, because ``d'' is not mapped and no sequence starts with ``d''. If ``c'' is entered, the kernel waits for the next input byte: ``c'' is not mapped by itself, but it is the start of another sequence. If the next byte is ``d'' or ``e'', the sequence is valid; otherwise it is considered an error.
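The matching rules just described can be sketched in Python using the same four entries. This is an illustrative model only, not the kernel's implementation. The names ENTRIES, PREFIXES and translate are hypothetical, and discarding an invalid sequence while counting it as an error is an assumption (the text states only that such a sequence is an error).

```python
# Input map from the example: a -> l, b -> mn, cd -> o, ce -> pq.
ENTRIES = {b"a": b"l", b"b": b"mn", b"cd": b"o", b"ce": b"pq"}

# Every proper prefix of a mapped sequence; a byte string found here
# means "wait for the next byte".
PREFIXES = {seq[:i] for seq in ENTRIES for i in range(1, len(seq))}


def translate(data, entries=ENTRIES, prefixes=PREFIXES):
    """Return (translated bytes, number of invalid sequences)."""
    out = bytearray()
    errors = 0
    pending = b""
    for byte in data:
        pending += bytes([byte])
        if pending in entries:
            out += entries[pending]   # complete mapped sequence
            pending = b""
        elif pending in prefixes:
            continue                  # start of a sequence: wait
        elif len(pending) == 1:
            out += pending            # unmapped single byte: pass through
            pending = b""
        else:
            errors += 1               # assumption: drop the bad sequence
            pending = b""
    # A trailing partial sequence is left waiting and is not emitted.
    return bytes(out), errors
```

With these definitions, translate(b"abcdced") returns (b"lmnopqd", 0), and translate(b"cx") returns (b"", 1) because ``c'' is followed by a byte that does not complete a valid sequence.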
The input sequence is specified in the left column and the output is specified in the right column. The columns are separated by a delimiter colon (:). This is a many to one mapping. This means the left column must be unique. More than one occurrence of any entry is considered invalid and produces an error. The right column sequence can appear more than once. Nulls can be produced.
The entries in the left column of the input section must be different from those already entered in the general input section and likewise for the output section.
#
#sharp/pound/cross-hatched is the comment character
#however, a quoted # ('#') is 0x23, not a comment
#
#beep, input, output and control
#are special keywords and should appear as shown.
#
#version 2.0 - must be present for version 2.0 mapfiles
beep #sound bell when unmapped multi-byte sequence is entered on
#terminal
input
0xA8 : 0xC5 0xA1
0xA9 : 0xC2 0xA9
0x14 0x14 : 0x14 # Ctrl-T twice gives Ctrl-T
0x82 'U' : 0xC3 0x99
0x81 'e' : 0xC3 0xA9
0x14 '`' '`' : 0x60 # grave
0x14 'c' '|' : 0xC2 0x82 # cent sign
output
0xfc : 0x7d
0xdf : 0x7e
0xb0 : 0x80 # Cyrillic Capital Letter A
0xb1 : 0x81 # Cyrillic Capital Letter BE
0xa5 : 'y' # yen sign
0xa6 : '|' # broken bar
0xbc : '/' # fraction one-quarter
0xbd : '/' # fraction one-half
0x9B : 0x9B 0x9B # CSI handling for ANSI
control # The control must be last
input
^A : 1 # Function keys:control-A followed by @ through O
output
\E : 2 # Cursor control: escape <x> <y>
\E[ : 1 # Standard ANSI key codes
\E[ : 1 # Standard ANSI escape sequences
Version 1.0 and version 2.0 mapfile formats are translated into single-byte values.
All of the single letters above the control sections of version 1.0 and version 2.0 mapfiles must be in one of these formats:
56 # decimal
045 # octal
0xfa # hexadecimal
'b' # quoted char
'\076' # quoted octal
'\x4a' # quoted hex
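As an illustration of the six formats listed above, the following Python sketch converts one token to its single-byte value. It assumes the plain apostrophe is the quote character (as in the '#' example in the sample mapfiles) and that values must fit in one byte; the function name parse_value is hypothetical and is not part of mapchan.

```python
def parse_value(token):
    """Convert one mapfile token to an integer in the range 0..255."""
    if len(token) >= 3 and token[0] == "'" and token[-1] == "'":
        body = token[1:-1]
        if body.startswith("\\x"):
            value = int(body[2:], 16)      # quoted hex, e.g. '\x4a'
        elif body.startswith("\\") and len(body) > 1:
            value = int(body[1:], 8)       # quoted octal, e.g. '\076'
        elif len(body) == 1:
            value = ord(body)              # quoted char, e.g. 'b'
        else:
            raise ValueError(f"bad quoted value: {token}")
    elif token.lower().startswith("0x"):
        value = int(token, 16)             # hexadecimal, e.g. 0xfa
    elif token.startswith("0") and len(token) > 1:
        value = int(token, 8)              # octal, e.g. 045
    else:
        value = int(token, 10)             # decimal, e.g. 56
    if not 0 <= value <= 255:
        raise ValueError(f"value out of single-byte range: {token}")
    return value


values = [parse_value(t) for t in
          ["56", "045", "0xfa", "'b'", "'\\076'", "'\\x4a'"]]
```

For the six sample tokens this yields 56, 37, 250, 98, 62 and 74.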
Note that if a one-byte character is mapped to something, that character cannot be the start of an input sequence of two or more bytes. In general, if an n-byte sequence has a mapping specified for it, that sequence cannot be the start of a longer input sequence.
input
a : x # a is mapped to x
a b : y # this is not allowed
b r : t y # br is mapped to ty
b t : y u # bt is mapped to yu - both are valid
b r s : d # this is not allowed
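A mapfile author could check this prefix rule before installing a mapfile with something like the following Python sketch. It is a hypothetical helper, not a tool shipped with mapchan; the name find_conflicts and the representation of left-column entries as byte strings are assumptions.

```python
def find_conflicts(sequences):
    """Return (mapped_prefix, longer_sequence) pairs that break the rule
    that a mapped sequence may not be the start of a longer one."""
    mapped = set(sequences)
    conflicts = []
    for seq in sequences:
        # Check every proper prefix of this sequence.
        for i in range(1, len(seq)):
            if seq[:i] in mapped:
                conflicts.append((seq[:i], seq))
    return conflicts


# Left-column entries from the example above.
left_column = [b"a", b"ab", b"br", b"bt", b"brs"]
conflicts = find_conflicts(left_column)
```

For the example entries this reports two conflicts: ``a'' with ``ab'', and ``br'' with ``brs''. The entries ``br'' and ``bt'' do not conflict with each other, because neither is the start of the other.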
Note that the length of a sequence cannot be more than 256 bytes. The maximum buffer size is restricted to 64KB.
mapchan automatically invokes mapchan.conv.awk to convert version 1.0 mapfiles into version 2.0 mapfiles. This file is not for direct user use.
It is especially important to retain the 7-bit ASCII portion of the character set; see ascii(5). UnixWare utilities and applications assume these values.