Update Doku aus Driversourcen
Change-Id: I455f0813b970151089b3dc1b8d9407eea323cdd1
diff --git a/doc/concepts/unicode b/doc/concepts/unicode
new file mode 100644
index 0000000..5c22ecc
--- /dev/null
+++ b/doc/concepts/unicode
@@ -0,0 +1,56 @@
+CONCEPT
+ unicode
+
+DESCRIPTION
+ LPC strings come in two flavors: As byte sequences and as unicode
+ strings. For both types almost the full range of string operations
+ is available, but the types are not to be mixed. So for example
+ you cannot add a byte sequence to an unicode string or vice versa.
+
+ Byte sequences can store only bytes (values from 0 to 255),
+ but unicode strings can store the full unicode character set
+ (values from 0 to 1114111).
+
+ There are two conversion functions to convert between byte sequences
+ and unicode strings: to_text() which will return a unicode string,
+ and to_bytes() which returns a byte sequence. Both take either
+ a string or an array, and when converting between bytes and unicode
+ also the name of the encoding (to be) used for the byte sequence.
+
+ -- File handling --
+
+ When a file is accessed either by compiling, read_file(), write_file()
+ (not read_bytes() or write_bytes(), or when an explicit encoding was
+ given), the master is asked via the driver hook H_FILE_ENCODING for
+ the encoding of the file. If none is given, 7 bit ASCII is assumed.
+ Whenever codes are encounted that are not valid in the given encoding
+ a compile or runtime error will be raised.
+
+ -- File names --
+
+ The filesystem encoding can be set with a call to
+ configure_driver(DC_FILESYSTEM_ENCODING, <encoding>). The default
+ encoding is derived from the LC_CTYPE environment setting.
+ If there is no environment setting (or it is set to the default
+ "C" locale), then UTF-8 is used.
+
+ -- Interactives --
+
+ Each interactive has its own encoding. It can be set with
+ configure_interactive(IC_ENCODING, <encoding>). The default is
+ "ISO-8859-1//TRANSLIT" which maps each incoming byte to the
+ first 256 unicode characters and uses transliteration to encode
+ characters that are not in this character set. If an input or
+ output character can not be converted to/from the configured
+ encoding it will be silently discarded.
+
+ -- ERQ / UDP --
+
+ Only byte sequences can be sent to the ERQ or via UDP,
+ and only byte sequences can be received from them.
+
+HISTORY
+ Introduced in LDMud 3.6.
+
+SEE ALSO
+ to_text(E), to_bytes(E), configure_driver(E)