Update documentation from driver sources Change-Id: I455f0813b970151089b3dc1b8d9407eea323cdd1

commit: 7ea4a03dfec8fd917fa08d5e21c0519a02a8e7a2
author: Zesstra <zesstra@zesstra.de> Tue Nov 26 20:11:40 2019 +0100
committer: MG Mud User <mud@mg.mud.de> Tue Nov 26 20:11:40 2019 +0100
tree: 0f2606527a1910c08f1d19a8f8f1c192943cd418
parent: c70bf58803d2d854a1bd7e255031a562db0d94e6
diff --git a/doc/concepts/unicode b/doc/concepts/unicode
new file mode 100644
index 0000000..5c22ecc
--- /dev/null
+++ b/doc/concepts/unicode

@@ -0,0 +1,56 @@
+CONCEPT
+        unicode
+
+DESCRIPTION
+        LPC strings come in two flavors: as byte sequences and as unicode
+        strings. For both types almost the full range of string operations
+        is available, but the types are not to be mixed. So for example
+        you cannot add a byte sequence to a unicode string or vice versa.
+
+        Byte sequences can store only bytes (values from 0 to 255),
+        but unicode strings can store the full unicode character set
+        (values from 0 to 1114111).
+
+        There are two conversion functions to convert between byte sequences
+        and unicode strings: to_text(), which returns a unicode string,
+        and to_bytes(), which returns a byte sequence. Both take either
+        a string or an array and, when converting between bytes and unicode,
+        the name of the encoding (to be) used for the byte sequence.
+
+        -- File handling --
+
+        When a file is accessed by compiling, read_file() or write_file()
+        (but not by read_bytes() or write_bytes(), nor when an explicit
+        encoding was given), the master is asked via the driver hook
+        H_FILE_ENCODING for the encoding of the file. If none is given, 7-bit
+        ASCII is assumed. Whenever codes are encountered that are not valid
+        in the given encoding, a compile or runtime error will be raised.
+
+        -- File names --
+
+        The filesystem encoding can be set with a call to
+        configure_driver(DC_FILESYSTEM_ENCODING, <encoding>). The default
+        encoding is derived from the LC_CTYPE environment setting.
+        If there is no environment setting (or it is set to the default
+        "C" locale), then UTF-8 is used.
+
+        -- Interactives --
+
+        Each interactive has its own encoding. It can be set with
+        configure_interactive(IC_ENCODING, <encoding>). The default is
+        "ISO-8859-1//TRANSLIT", which maps each incoming byte to one of the
+        first 256 unicode characters and uses transliteration to encode
+        characters that are not in this character set. If an input or
+        output character cannot be converted to/from the configured
+        encoding, it will be silently discarded.
+
+        -- ERQ / UDP --
+
+        Only byte sequences can be sent to the ERQ or via UDP,
+        and only byte sequences can be received from them.
+
+HISTORY
+        Introduced in LDMud 3.6.
+
+SEE ALSO
+        to_text(E), to_bytes(E), configure_driver(E)
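
The byte-sequence versus unicode-string model described in the new doc file can be illustrated with a small sketch. This is a Python analogy, not LPC and not part of the commit: Python's `bytes` stands in for an LPC byte sequence, `str` for an LPC unicode string, and `encode()`/`decode()` play a role roughly comparable to to_bytes()/to_text(). The helper names (`roundtrip`, `mixing_fails`, `ascii_rejects`) are hypothetical and exist only for this illustration.

```python
# Python analogy only (assumption: not the LDMud API).
# bytes ~ LPC byte sequence, str ~ LPC unicode string.

MAX_CODE_POINT = 0x10FFFF  # highest unicode code point, i.e. 1114111


def roundtrip(text: str, encoding: str) -> str:
    """Convert text to bytes and back, naming the encoding both times."""
    raw = text.encode(encoding)   # comparable in spirit to to_bytes()
    return raw.decode(encoding)   # comparable in spirit to to_text()


def mixing_fails() -> bool:
    """Return True if adding a byte sequence to a text string is rejected."""
    try:
        "abc" + b"def"  # the two types must not be mixed
    except TypeError:
        return True
    return False


def ascii_rejects(raw: bytes) -> bool:
    """Return True if raw holds codes that are invalid in 7-bit ASCII."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError:
        return True
    return False
```

For instance, `roundtrip("Grüße", "utf-8")` gives back the original text, while `ascii_rejects(b"\xfc")` reports an error condition, which loosely mirrors the runtime error the doc describes for codes that are not valid in the assumed file encoding.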