summaryrefslogtreecommitdiffstats
path: root/doc/binary.n
diff options
context:
space:
mode:
Diffstat (limited to 'doc/binary.n')
-rw-r--r--doc/binary.n532
1 files changed, 532 insertions, 0 deletions
diff --git a/doc/binary.n b/doc/binary.n
new file mode 100644
index 0000000..067c52e
--- /dev/null
+++ b/doc/binary.n
@@ -0,0 +1,532 @@
+'\"
+'\" Copyright (c) 1997 by Sun Microsystems, Inc.
+'\"
+'\" See the file "license.terms" for information on usage and redistribution
+'\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
+'\"
+'\" SCCS: @(#) binary.n 1.7 97/11/11 19:08:47
+'\"
+.so man.macros
+.TH binary n 8.0 Tcl "Tcl Built-In Commands"
+.BS
+'\" Note: do not modify the .SH NAME line immediately below!
+.SH NAME
+binary \- Insert and extract fields from binary strings
+.SH SYNOPSIS
+\fBbinary format \fIformatString \fR?\fIarg arg ...\fR?
+.br
+\fBbinary scan \fIstring formatString \fR?\fIvarName varName ...\fR?
+.BE
+
+.SH DESCRIPTION
+.PP
+This command provides facilities for manipulating binary data. The
+first form, \fBbinary format\fR, creates a binary string from normal
+Tcl values. For example, given the values 16 and 22, it might produce
+an 8-byte binary string consisting of two 4-byte integers, one for
+each of the numbers. The second form of the command,
+\fBbinary scan\fR, does the opposite: it extracts data from a binary
+string and returns it as ordinary Tcl string values.
+
+.SH "BINARY FORMAT"
+.PP
+The \fBbinary format\fR command generates a binary string whose layout
+is specified by the \fIformatString\fR and whose contents come from
+the additional arguments. The resulting binary value is returned.
+.PP
+The \fIformatString\fR consists of a sequence of zero or more field
+specifiers separated by zero or more spaces. Each field specifier is
+a single type character followed by an optional numeric \fIcount\fR.
+Most field specifiers consume one argument to obtain the value to be
+formatted. The type character specifies how the value is to be
+formatted. The \fIcount\fR typically indicates how many items of the
+specified type are taken from the value. If present, the \fIcount\fR
+is a non-negative decimal integer or \fB*\fR, which normally indicates
+that all of the items in the value are to be used. If the number of
+arguments does not match the number of fields in the format string
+that consume arguments, then an error is generated.
+.PP
+Each type-count pair moves an imaginary cursor through the binary
+data, storing bytes at the current position and advancing the cursor
+to just after the last byte stored. The cursor is initially at
+position 0 at the beginning of the data. The type may be any one of
+the following characters:
+.IP \fBa\fR 5
+Stores a character string of length \fIcount\fR in the output string.
+If \fIarg\fR has fewer than \fIcount\fR bytes, then additional zero
+bytes are used to pad out the field. If \fIarg\fR is longer than the
+specified length, the extra characters will be ignored. If
+\fIcount\fR is \fB*\fR, then all of the bytes in \fIarg\fR will be
+formatted. If \fIcount\fR is omitted, then one character will be
+formatted. For example,
+.RS
+.CS
+\fBbinary format a7a*a alpha bravo charlie\fR
+.CE
+will return a string equivalent to \fBalpha\\000\\000bravoc\fR.
+.RE
+.IP \fBA\fR 5
+This form is the same as \fBa\fR except that spaces are used for
+padding instead of nulls. For example,
+.RS
+.CS
+\fBbinary format A6A*A alpha bravo charlie\fR
+.CE
+will return \fBalpha bravoc\fR.
+.RE
+.IP \fBb\fR 5
+Stores a string of \fIcount\fR binary digits in low-to-high order
+within each byte in the output string. \fIArg\fR must contain a
+sequence of \fB1\fR and \fB0\fR characters. The resulting bytes are
+emitted in first to last order with the bits being formatted in
+low-to-high order within each byte. If \fIarg\fR has fewer than
+\fIcount\fR digits, then zeros will be used for the remaining bits.
+If \fIarg\fR has more than the specified number of digits, the extra
+digits will be ignored. If \fIcount\fR is \fB*\fR, then all of the
+digits in \fIarg\fR will be formatted. If \fIcount\fR is omitted,
+then one digit will be formatted. If the number of bits formatted
+does not end at a byte boundary, the remaining bits of the last byte
+will be zeros. For example,
+.RS
+.CS
+\fBbinary format b5b* 11100 111000011010\fR
+.CE
+will return a string equivalent to \fB\\x07\\x87\\x05\fR.
+.RE
+.IP \fBB\fR 5
+This form is the same as \fBb\fR except that the bits are stored in
+high-to-low order within each byte. For example,
+.RS
+.CS
+\fBbinary format B5B* 11100 111000011010\fR
+.CE
+will return a string equivalent to \fB\\xe0\\xe1\\xa0\fR.
+.RE
+.IP \fBh\fR 5
+Stores a string of \fIcount\fR hexadecimal digits in low-to-high
+within each byte in the output string. \fIArg\fR must contain a
+sequence of characters in the set ``0123456789abcdefABCDEF''. The
+resulting bytes are emitted in first to last order with the hex digits
+being formatted in low-to-high order within each byte. If \fIarg\fR
+has fewer than \fIcount\fR digits, then zeros will be used for the
+remaining digits. If \fIarg\fR has more than the specified number of
+digits, the extra digits will be ignored. If \fIcount\fR is
+\fB*\fR, then all of the digits in \fIarg\fR will be formatted. If
+\fIcount\fR is omitted, then one digit will be formatted. If the
+number of digits formatted does not end at a byte boundary, the
+remaining bits of the last byte will be zeros. For example,
+.RS
+.CS
+\fBbinary format h3h* AB def\fR
+.CE
+will return a string equivalent to \fB\\xba\\xed\\x0f\fR.
+.RE
+.IP \fBH\fR 5
+This form is the same as \fBh\fR except that the digits are stored in
+high-to-low order within each byte. For example,
+.RS
+.CS
+\fBbinary format H3H* ab DEF\fR
+.CE
+will return a string equivalent to \fB\\xab\\xde\\xf0\fR.
+.RE
+.IP \fBc\fR 5
+Stores one or more 8-bit integer values in the output string. If no
+\fIcount\fR is specified, then \fIarg\fR must consist of an integer
+value; otherwise \fIarg\fR must consist of a list containing at least
+\fIcount\fR integer elements. The low-order 8 bits of each integer
+are stored as a one-byte value at the cursor position. If \fIcount\fR
+is \fB*\fR, then all of the integers in the list are formatted. If
+the number of elements in the list is fewer than \fIcount\fR, then an
+error is generated. If the number of elements in the list is greater
+than \fIcount\fR, then the extra elements are ignored. For example,
+.RS
+.CS
+\fBbinary format c3cc* {3 -3 128 1} 257 {2 5}\fR
+.CE
+will return a string equivalent to
+\fB\\x03\\xfd\\x80\\x01\\x02\\x05\fR, whereas
+.CS
+\fBbinary format c {2 5}\fR
+.CE
+will generate an error.
+.RE
+.IP \fBs\fR 5
+This form is the same as \fBc\fR except that it stores one or more
+16-bit integers in little-endian byte order in the output string. The
+low-order 16-bits of each integer are stored as a two-byte value at
+the cursor position with the least significant byte stored first. For
+example,
+.RS
+.CS
+\fBbinary format s3 {3 -3 258 1}\fR
+.CE
+will return a string equivalent to
+\fB\\x03\\x00\\xfd\\xff\\x02\\x01\fR.
+.RE
+.IP \fBS\fR 5
+This form is the same as \fBs\fR except that it stores one or more
+16-bit integers in big-endian byte order in the output string. For
+example,
+.RS
+.CS
+\fBbinary format S3 {3 -3 258 1}\fR
+.CE
+will return a string equivalent to
+\fB\\x00\\x03\\xff\\xfd\\x01\\x02\fR.
+.RE
+.IP \fBi\fR 5
+This form is the same as \fBc\fR except that it stores one or more
+32-bit integers in little-endian byte order in the output string. The
+low-order 32-bits of each integer are stored as a four-byte value at
+the cursor position with the least significant byte stored first. For
+example,
+.RS
+.CS
+\fBbinary format i3 {3 -3 65536 1}\fR
+.CE
+will return a string equivalent to
+\fB\\x03\\x00\\x00\\x00\\xfd\\xff\\xff\\xff\\x00\\x00\\x10\\x00\fR.
+.RE
+.IP \fBI\fR 5
+This form is the same as \fBi\fR except that it stores one or more one
+or more 32-bit integers in big-endian byte order in the output string.
+For example,
+.RS
+.CS
+\fBbinary format I3 {3 -3 65536 1}\fR
+.CE
+will return a string equivalent to
+\fB\\x00\\x00\\x00\\x03\\xff\\xff\\xff\\xfd\\x00\\x10\\x00\\x00\fR.
+.RE
+.IP \fBf\fR 5
+This form is the same as \fBc\fR except that it stores one or more one
+or more single-precision floating in the machine's native
+representation in the output string. This representation is not
+portable across architectures, so it should not be used to communicate
+floating point numbers across the network. The size of a floating
+point number may vary across architectures, so the number of bytes
+that are generated may vary. If the value overflows the
+machine's native representation, then the value of FLT_MAX
+as defined by the system will be used instead. Because Tcl uses
+double-precision floating-point numbers internally, there may be some
+loss of precision in the conversion to single-precision. For example,
+on a Windows system running on an Intel Pentium processor,
+.RS
+.CS
+\fBbinary format f2 {1.6 3.4}\fR
+.CE
+will return a string equivalent to
+\fB\\xcd\\xcc\\xcc\\x3f\\x9a\\x99\\x59\\x40\fR.
+.RE
+.IP \fBd\fR 5
+This form is the same as \fBf\fR except that it stores one or more one
+or more double-precision floating in the machine's native
+representation in the output string. For example, on a
+Windows system running on an Intel Pentium processor,
+.RS
+.CS
+\fBbinary format d1 {1.6}\fR
+.CE
+will return a string equivalent to
+\fB\\x9a\\x99\\x99\\x99\\x99\\x99\\xf9\\x3f\fR.
+.RE
+.IP \fBx\fR 5
+Stores \fIcount\fR null bytes in the output string. If \fIcount\fR is
+not specified, stores one null byte. If \fIcount\fR is \fB*\fR,
+generates an error. This type does not consume an argument. For
+example,
+.RS
+.CS
+\fBbinary format a3xa3x2a3 abc def ghi\fR
+.CE
+will return a string equivalent to \fBabc\\000def\\000\\000ghi\fR.
+.RE
+.IP \fBX\fR 5
+Moves the cursor back \fIcount\fR bytes in the output string. If
+\fIcount\fR is \fB*\fR or is larger than the current cursor position,
+then the cursor is positioned at location 0 so that the next byte
+stored will be the first byte in the result string. If \fIcount\fR is
+omitted then the cursor is moved back one byte. This type does not
+consume an argument. For example,
+.RS
+.CS
+\fBbinary format a3X*a3X2a3 abc def ghi\fR
+.CE
+will return \fBdghi\fR.
+.RE
+.IP \fB@\fR 5
+Moves the cursor to the absolute location in the output string
+specified by \fIcount\fR. Position 0 refers to the first byte in the
+output string. If \fIcount\fR refers to a position beyond the last
+byte stored so far, then null bytes will be placed in the unitialized
+locations and the cursor will be placed at the specified location. If
+\fIcount\fR is \fB*\fR, then the cursor is moved to the current end of
+the output string. If \fIcount\fR is omitted, then an error will be
+generated. This type does not consume an argument. For example,
+.RS
+.CS
+\fBbinary format a5@2a1@*a3@10a1 abcde f ghi j\fR
+.CE
+will return \fBabfdeghi\\000\\000j\fR.
+.RE
+
+.SH "BINARY SCAN"
+.PP
+The \fBbinary scan\fR command parses fields from a binary string,
+returning the number of conversions performed. \fIString\fR gives the
+input to be parsed and \fIformatString\fR indicates how to parse it.
+Each \fIvarName\fR gives the name of a variable; when a field is
+scanned from \fIstring\fR the result is assigned to the corresponding
+variable.
+.PP
+As with \fBbinary format\fR, the \fIformatString\fR consists of a
+sequence of zero or more field specifiers separated by zero or more
+spaces. Each field specifier is a single type character followed by
+an optional numeric \fIcount\fR. Most field specifiers consume one
+argument to obtain the variable into which the scanned values should
+be placed. The type character specifies how the binary data is to be
+interpreted. The \fIcount\fR typically indicates how many items of
+the specified type are taken from the data. If present, the
+\fIcount\fR is a non-negative decimal integer or \fB*\fR, which
+normally indicates that all of the remaining items in the data are to
+be used. If there are not enough bytes left after the current cursor
+position to satisfy the current field specifier, then the
+corresponding variable is left untouched and \fBbinary scan\fR returns
+immediately with the number of variables that were set. If there are
+not enough arguments for all of the fields in the format string that
+consume arguments, then an error is generated.
+.PP
+Each type-count pair moves an imaginary cursor through the binary data,
+reading bytes from the current position. The cursor is initially
+at position 0 at the beginning of the data. The type may be any one of
+the following characters:
+.IP \fBa\fR 5
+The data is a character string of length \fIcount\fR. If \fIcount\fR
+is \fB*\fR, then all of the remaining bytes in \fIstring\fR will be
+scanned into the variable. If \fIcount\fR is omitted, then one
+character will be scanned. For example,
+.RS
+.CS
+\fBbinary scan abcde\\000fghi a6a10 var1 var2\fR
+.CE
+will return \fB1\fR with the string equivalent to \fBabcde\\000\fR
+stored in \fBvar1\fR and \fBvar2\fR left unmodified.
+.RE
+.IP \fBA\fR 5
+This form is the same as \fBa\fR, except trailing blanks and nulls are stripped from
+the scanned value before it is stored in the variable. For example,
+.RS
+.CS
+\fBbinary scan "abc efghi \\000" a* var1\fR
+.CE
+will return \fB1\fR with \fBabc efghi\fR stored in \fBvar1\fR.
+.RE
+.IP \fBb\fR 5
+The data is turned into a string of \fIcount\fR binary digits in
+low-to-high order represented as a sequence of ``1'' and ``0''
+characters. The data bytes are scanned in first to last order with
+the bits being taken in low-to-high order within each byte. Any extra
+bits in the last byte are ignored. If \fIcount\fR is \fB*\fR, then
+all of the remaining bits in \fBstring\fR will be scanned. If
+\fIcount\fR is omitted, then one bit will be scanned. For example,
+.RS
+.CS
+\fBbinary scan \\x07\\x87\\x05 b5b* var1 var2\fR
+.CE
+will return \fB2\fR with \fB11100\fR stored in \fBvar1\fR and
+\fB1110000110100000\fR stored in \fBvar2\fR.
+.RE
+.IP \fBB\fR 5
+This form is the same as \fBB\fR, except the bits are taken in
+high-to-low order within each byte. For example,
+.RS
+.CS
+\fBbinary scan \\x70\\x87\\x05 b5b* var1 var2\fR
+.CE
+will return \fB2\fR with \fB01110\fR stored in \fBvar1\fR and
+\fB1000011100000101\fR stored in \fBvar2\fR.
+.RE
+.IP \fBh\fR 5
+The data is turned into a string of \fIcount\fR hexadecimal digits in
+low-to-high order represented as a sequence of characters in the set
+``0123456789abcdef''. The data bytes are scanned in first to last
+order with the hex digits being taken in low-to-high order within each
+byte. Any extra bits in the last byte are ignored. If \fIcount\fR
+is \fB*\fR, then all of the remaining hex digits in \fBstring\fR will be
+scanned. If \fIcount\fR is omitted, then one hex digit will be
+scanned. For example,
+.RS
+.CS
+\fBbinary scan \\x07\\x86\\x05 h3h* var1 var2\fR
+.CE
+will return \fB2\fR with \fB706\fR stored in \fBvar1\fR and
+\fB50\fR stored in \fBvar2\fR.
+.RE
+.IP \fBH\fR 5
+This form is the same as \fBh\fR, except the digits are taken in
+low-to-high order within each byte. For example,
+.RS
+.CS
+\fBbinary scan \\x07\\x86\\x05 H3H* var1 var2\fR
+.CE
+will return \fB2\fR with \fB078\fR stored in \fBvar1\fR and
+\fB05\fR stored in \fBvar2\fR.
+.RE
+.IP \fBc\fR 5
+The data is turned into \fIcount\fR 8-bit signed integers and stored
+in the corresponding variable as a list. If \fIcount\fR is \fB*\fR,
+then all of the remaining bytes in \fBstring\fR will be scanned. If
+\fIcount\fR is omitted, then one 8-bit integer will be scanned. For
+example,
+.RS
+.CS
+\fBbinary scan \\x07\\x86\\x05 c2c* var1 var2\fR
+.CE
+will return \fB2\fR with \fB7 -122\fR stored in \fBvar1\fR and \fB5\fR
+stored in \fBvar2\fR. Note that the integers returned are signed, but
+they can be converted to unsigned 8-bit quantities using an expression
+like:
+.CS
+\fBexpr ( $num + 0x100 ) % 0x100\fR
+.CE
+.RE
+.IP \fBs\fR 5
+The data is interpreted as \fIcount\fR 16-bit signed integers
+represented in little-endian byte order. The integers are stored in
+the corresponding variable as a list. If \fIcount\fR is \fB*\fR, then
+all of the remaining bytes in \fBstring\fR will be scanned. If
+\fIcount\fR is omitted, then one 16-bit integer will be scanned. For
+example,
+.RS
+.CS
+\fBbinary scan \\x05\\x00\\x07\\x00\\xf0\\xff s2s* var1 var2\fR
+.CE
+will return \fB2\fR with \fB5 7\fR stored in \fBvar1\fR and \fB-16\fR
+stored in \fBvar2\fR. Note that the integers returned are signed, but
+they can be converted to unsigned 16-bit quantities using an expression
+like:
+.CS
+\fBexpr ( $num + 0x10000 ) % 0x10000\fR
+.CE
+.RE
+.IP \fBS\fR 5
+This form is the same as \fBs\fR except that the data is interpreted
+as \fIcount\fR 16-bit signed integers represented in big-endian byte
+order. For example,
+.RS
+.CS
+\fBbinary scan \\x00\\x05\\x00\\x07\\xff\\xf0 S2S* var1 var2\fR
+.CE
+will return \fB2\fR with \fB5 7\fR stored in \fBvar1\fR and \fB-16\fR
+stored in \fBvar2\fR.
+.RE
+.IP \fBi\fR 5
+The data is interpreted as \fIcount\fR 32-bit signed integers
+represented in little-endian byte order. The integers are stored in
+the corresponding variable as a list. If \fIcount\fR is \fB*\fR, then
+all of the remaining bytes in \fBstring\fR will be scanned. If
+\fIcount\fR is omitted, then one 32-bit integer will be scanned. For
+example,
+.RS
+.CS
+\fBbinary scan \\x05\\x00\\x00\\x00\\x07\\x00\\x00\\x00\\xf0\\xff\\xff\\xff i2i* var1 var2\fR
+.CE
+will return \fB2\fR with \fB5 7\fR stored in \fBvar1\fR and \fB-16\fR
+stored in \fBvar2\fR. Note that the integers returned are signed and
+cannot be represented by Tcl as unsigned values.
+.RE
+.IP \fBI\fR 5
+This form is the same as \fBI\fR except that the data is interpreted
+as \fIcount\fR 32-bit signed integers represented in big-endian byte
+order. For example,
+.RS
+.CS
+\fBbinary \\x00\\x00\\x00\\x05\\x00\\x00\\x00\\x07\\xff\\xff\\xff\\xf0 I2I* var1 var2\fR
+.CE
+will return \fB2\fR with \fB5 7\fR stored in \fBvar1\fR and \fB-16\fR
+stored in \fBvar2\fR.
+.RE
+.IP \fBf\fR 5
+The data is interpreted as \fIcount\fR single-precision floating point
+numbers in the machine's native representation. The floating point
+numbers are stored in the corresponding variable as a list. If
+\fIcount\fR is \fB*\fR, then all of the remaining bytes in
+\fBstring\fR will be scanned. If \fIcount\fR is omitted, then one
+single-precision floating point number will be scanned. The size of a
+floating point number may vary across architectures, so the number of
+bytes that are scanned may vary. If the data does not represent a
+valid floating point number, the resulting value is undefined and
+compiler dependent. For example, on a Windows system running on an
+Intel Pentium processor,
+.RS
+.CS
+\fBbinary scan \\x3f\\xcc\\xcc\\xcd f var1\fR
+.CE
+will return \fB1\fR with \fB1.6000000238418579\fR stored in
+\fBvar1\fR.
+.RE
+.IP \fBd\fR 5
+This form is the same as \fBf\fR except that the data is interpreted
+as \fIcount\fR double-precision floating point numbers in the
+machine's native representation. For example, on a Windows system
+running on an Intel Pentium processor,
+.RS
+.CS
+\fBbinary scan \\x9a\\x99\\x99\\x99\\x99\\x99\\xf9\\x3f d var1\fR
+.CE
+will return \fB1\fR with \fB1.6000000000000001\fR
+stored in \fBvar1\fR.
+.RE
+.IP \fBx\fR 5
+Moves the cursor forward \fIcount\fR bytes in \fIstring\fR. If
+\fIcount\fR is \fB*\fR or is larger than the number of bytes after the
+current cursor cursor position, then the cursor is positioned after
+the last byte in \fIstring\fR. If \fIcount\fR is omitted, then the
+cursor is moved forward one byte. Note that this type does not
+consume an argument. For example,
+.RS
+.CS
+\fBbinary scan \\x01\\x02\\x03\\x04 x2H* var1\fR
+.CE
+will return \fB1\fR with \fB0304\fR stored in \fBvar1\fR.
+.RE
+.IP \fBX\fR 5
+Moves the cursor back \fIcount\fR bytes in \fIstring\fR. If
+\fIcount\fR is \fB*\fR or is larger than the current cursor position,
+then the cursor is positioned at location 0 so that the next byte
+scanned will be the first byte in \fIstring\fR. If \fIcount\fR
+is omitted then the cursor is moved back one byte. Note that this
+type does not consume an argument. For example,
+.RS
+.CS
+\fBbinary scan \\x01\\x02\\x03\\x04 c2XH* var1 var2\fR
+.CE
+will return \fB2\fR with \fB1 2\fR stored in \fBvar1\fR and \fB020304\fR
+stored in \fBvar2\fR.
+.RE
+.IP \fB@\fR 5
+Moves the cursor to the absolute location in the data string specified
+by \fIcount\fR. Note that position 0 refers to the first byte in
+\fIstring\fR. If \fIcount\fR refers to a position beyond the end of
+\fIstring\fR, then the cursor is positioned after the last byte. If
+\fIcount\fR is omitted, then an error will be generated. For example,
+.RS
+.CS
+\fBbinary scan \\x01\\x02\\x03\\x04 c2@1H* var1 var2\fR
+.CE
+will return \fB2\fR with \fB1 2\fR stored in \fBvar1\fR and \fB020304\fR
+stored in \fBvar2\fR.
+.RE
+
+.SH "PLATFORM ISSUES"
+Sometimes it is desirable to format or scan integer values in the
+native byte order for the machine. Refer to the \fBbyteOrder\fR
+element of the \fBtcl_platform\fR array to decide which type character
+to use when formatting or scanning integers.
+
+.SH "SEE ALSO"
+format, scan, tclvars
+
+.SH KEYWORDS
+binary, format, scan