Blame - doc/LPC/mappings - mudlib-public

blob: 381e3c8cc8a24adb41f23264b73844cb845805a4 [file] [log] [blame]

MG Mud User	88f1247	2016-06-24 23:31:02 +0200	[diff] [blame^]	1	CONCEPT
				2	mappings
				3
				4	LAST UPDATE
				5	Mon, 15 Mar 1999
				6
				7	DESCRIPTION
				8
				9	A step-by-step introduction to mappings:
				10	----------------------------------------
				11
				12	1. What is a mapping?
				13
				14	A mapping is a datatype which allows to store data associated to a key.
				15	In other languages they are also known as 'dictionaries' or 'alists'.
				16	There are also alists in LPC but they are not a separate datatype but are
				17	implemented on top of arrays. Alists are the predecessors of mappings.
				18	The keys and the values can be of any type. But most common datatypes
				19	for keys are strings, integers and objects. Others like arrays, mappings
				20	or closures aren't a good choice because comparision between i.e. arrays
				21	often returns false even if they equal in content. This is because the
				22	driver compares i.e. two arrays by their internal pointers and not by
				23	their content. The reason for this is simple: speed.
				24
				25	Mappings are allways treated as references when passing them to
				26	functions. This means when you pass a mapping to another object and this
				27	object modifies the mapping the modification will take place in a global
				28	scope - visible to all objects holding this mapping in a variable.
				29
				30
				31	2. What are mappings good for?
				32
				33	The term 'dictionary' probably describes the use of a mapping best.
				34	Opposed to arrays mappings don't have a specific order. They provide a
				35	mechanism to create a set of associations between values. Such an
				36	association consists of a unique key and data that is identified by the
				37	key. Think of a dictionary where you have a word and a definition of
				38	it. You use the word to lookup its definition.
				39
				40	Mappings can be used i.e. to hold aliases for commands. The key would
				41	then be the name of the alias and the data the command(s) behind an
				42	alias. Or they can be used for the exits of a room. The keys would be
				43	the directions where one can go to and the associated data would be the
				44	file names of the rooms. But mappings can also be used as a kind of a
				45	sparse array. A sparse array is an array where most of the elements
				46	aren't used (occupied by 0). I.e. if you want to store values at the
				47	position 0, 13 and 37642 of an array you would have to create an array
				48	with a size of at least 37643. This costs a lot of memory so a mapping
				49	would be more useful because you would then use the numbers 0, 13 and
				50	37642 as a key and not as an index to a position (actually the keys of a
				51	mapping are sometimes called indices but this is just because the way
				52	data is accessed in a mapping is similar to arrays: by the [] operator).
				53	This also allows to query all occupied positions of a sparse array by
				54	querying for all the keys of the mapping opposed to an array where you
				55	have to iterate over all elements.
				56
				57
				58	3. How do I create a mapping?
				59
				60	There are several ways to do so. The most convenient is the following:
				61
				62	mapping map;
				63	map = ([ key0: value00; ...; value0n,
				64	... : ... ; ...; ... ,
				65	keyn: valuen0; ...; valuenn ]);
				66
				67	As you can see, a key may have more than one value assigned. But the
				68	amount of values per key must always be equal. It is even possible to
				69	have mappings without any values!
				70	Another method is to use the efun mkmapping(). This efun gets two
				71	arguments with the first beeing an array of keys and the following beeing
				72	arrays of values:
				73
				74	mapping map;
				75	map = mkmapping (({ key0 , ..., keyn }),
				76	({ value00, ..., value0n }),
				77	({ ... , ..., ... }),
				78	({ valuen0, ..., valuenn }));
				79
				80	If the efun only gets one argument, then this argument will be taken as
				81	an array of keys and a mapping without values will be returned.
				82
				83	An empty mapping can be created by using the above described methods by
				84	simply ommitting the keys and values:
				85
				86	mapping map;
				87	map = ([]);
				88	or:
				89	map = mkmapping(({}), ({}));
				90
				91	Or by using the efun m_allocate(). This efun gets as first
				92	argument the amount of keys which will be added soon and an optional
				93	second argument specifying the width of the mapping:
				94
				95	map = m_allocate(n, width);
				96
				97	The value <n> may be a bit confusing since mappings shrink and grow
				98	dynamically. This value just tells the driver how 'long' this mapping is
				99	going to be so proper memory allocations will be performed to reduce
				100	the overhead of memory reallocation. I.e. if you want to read in a file
				101	and store the read data in a mapping you probably know the amount of
				102	keys. So you allocate a mapping with this efun and tell the driver how
				103	much memory should be allocated by specifing a proper <n> value.
				104	Thus causing a speedup when adding the read data to the mapping
				105	afterwards. The <width> just specifies how many values per key this
				106	mapping is going to have. If no width is given, 1 will be taken as
				107	default.
				108
				109	An empty mapping created with '([])' will always have a width of 1. To
				110	create empty mappings with other widths, write it as
				111
				112	map = ([:width ]);
				113
				114	<width> can be any expression returning an integer value (including
				115	function calls), and in fact this notation is just a fancy way of
				116	writing
				117
				118	map = m_allocate(0, width);
				119
				120
				121
				122	4. How can I modify the data of a mapping?
				123
				124	Adding a new key is similiar to modifying the associated data of an
				125	existing key:
				126
				127	map += ([ key: value0; ...; valuen ]);
				128
				129	Or in case only a single value should be modified:
				130
				131	map[key, n] = valuen;
				132
				133	If <n> is out of range or if <key> doesn't exists and <n> is greater
				134	than 0 an "Illegal index" error will be reported. If <n> is equal to 0 or
				135	the mapping only has a single value per key one can abbreviate it with:
				136
				137	map[key] = value;
				138
				139	If there is no <key> (and <n> is equal to 0 or not specified at all) a
				140	new one will be added automatically.
				141
				142	Deletion of a key is done with the -= operator or the efun
				143	m_delete(). A mapping can only be substracted by one without any values:
				144
				145	map -= ([ key ]);
				146	or:
				147	map -= ([ key0, ..., keyn ]);
				148
				149	The efun takes a mapping as first and a key as second argument:
				150
				151	m_delete(map, key);
				152
				153	The efun m_delete() returns the mapping but because mappings are
				154	handled as references there is no need of an assignment like:
				155
				156	map = m_delete(map, key);
				157
				158
				159	5. How can I access the data stored in a mapping?
				160
				161	This can be done by:
				162
				163	valuen = map[key, n];
				164
				165	Or in case of a mapping with just one value per key:
				166
				167	value0 = map[key];
				168
				169	If there is no <key> in the mapping and <n> is 0 or not specified at
				170	all (which is the same) a 0 will be returned or if <n> is greater than 0
				171	an "Illegal index" error will be reported.
				172
				173
				174	6. How can I test for the existance of a key?
				175
				176	A return value of 0 is sufficient for most applications but sometimes
				177	the ambiguity between an existing value of 0 and a nonexisting key can
				178	lead to a problem. Therefore one can use the efun member() or
				179	mapping_contains() to check if there actually is a key in the mapping:
				180
				181	if (member(map, key)) {
				182	...
				183	}
				184	or:
				185	if (mapping_contains(&value0, ..., &valuen, map, key)) {
				186	...
				187	}
				188
				189	This also shows how one can retrieve all values associated to a key
				190	from a mapping in a single step. The '&' is the reference operator which
				191	is neccesary to let the efun store the values in the variables.
				192
				193	In case of mappings with no values, the efun member() and
				194	mapping_contains() are equal in their behaviour and their way of calling
				195	because mapping_contains() won't get any reference variables to store the
				196	values in (obviously, because there aren't any).
				197
				198	Also normally member() is known to return the postion of an element in
				199	a list (i.e. a character in a string or data in an array) and if an
				200	element couldn't be found -1 is returned. But in the case of mappings
				201	there are no such things as order and postion. So member() only returns 0
				202	or 1.
				203
				204
				205	7. How can I copy a mapping?
				206
				207	A mapping can be copied with the + operator or by the efun
				208	copy_mapping():
				209
				210	newmap = ([]) + map;
				211	or:
				212	newmap = copy_mapping(map);
				213
				214	A mapping should only be copied when it is neccesary to get an own copy
				215	of it that must not be shared by other objects.
				216
				217
				218	8. How can I get all keys of a mapping?
				219
				220	The efun m_indices() gets a mapping as argument and returns an array
				221	holding all keys defined in this mapping:
				222
				223	keys = m_indices(map);
				224
				225
				226	9. How can I get all the values of a mapping?
				227
				228	The efun m_values() gets a mapping as argument and returns an array
				229	holding all the first (second, ...) values of it.
				230
				231	values0 = m_values(map); returns the first values
				232	values0 = m_values(map, 0); dito
				233	values1 = m_values(map, 1); returns the second values
				234	etc
				235
				236
				237	10. How can I determine the size of a mapping?
				238
				239	Because a mapping is a kind of rectangle it has two sizes: a length and
				240	a width. There are three different efuns to query these values. The first
				241	two are the efuns sizeof(), which returns the amount of key-value
				242	associations (the length of a mapping), and widthof(), which returns the
				243	number of values per key (the width). The third is the efun get_type_info().
				244	get_type_info() is meant to be a function to identify a datatype. Its
				245	return value is an array of two numerical values. The first specifies
				246	the datatype of the argument and the second is a datatype dependend
				247	value. In the case of a mapping the first value is T_MAPPING (which is a
				248	value defined in <lpctypes.h>) and the second the amount of values per
				249	key (a.k.a. columns or the width of the mapping - actually it would be
				250	correct to say that the width of a mapping is the amount of columns plus
				251	one for the keys but this is uncommon).
				252
				253
				254	11. What is the best method to iterate over a mapping?
				255
				256	First of all the main purpose of a mapping is not meant to be a set of
				257	data to iterate over. Afterall the keys in a mapping have no specific but
				258	a random order (at least on the LPC side). But still it is possible and
				259	sometimes even neccesary to do so.
				260
				261	If all key-value associations should be processed then one should use
				262	walk_mapping(). If all keys of a mapping should be processed to create a
				263	new mapping being a subset of the given one, then filter_mapping() should
				264	be used. If all keys are going to be processed and to create a new
				265	mapping with the same set of keys as the given mapping, then one would
				266	use map_mapping(). But in the case of an iteration that should/can stop
				267	even if not all data is processed it is probably wise to iterate over the
				268	mapping by first querying for the keys and then to iterate over them with
				269	a for() or a while() loop and querying the values by 'hand'.
				270
				271	The efun walk_mapping() gets a mapping as first argument and the name
				272	of a function as second one. All the following arguments are treated as
				273	extras which will be passed to the function specified with the 2nd
				274	argument. Instead of a string for the name of a function a closure can be
				275	used, too. Nothing will be returned:
				276
				277	...
				278	walk_mapping(map, "func", xarg0, ..., xargn);
				279	...
				280
				281	void func(mixed key, mixed value0, ..., mixed valuen,
				282	mixed xarg0, ..., mixed xargn) {
				283	...
				284	}
				285
				286	func() will be called for all key-value associations and gets as first
				287	argument the key. The next arguments are the values behind the key and
				288	are passed as references. The rest of the passed arguments are those
				289	specified as extras. Because the values are passed as references (opposed
				290	to copies) it is possible to modify them from inside func() by simply
				291	assigning new value to the variables <value0>, ..., <valuen>.
				292
				293	The efun filter_mapping() calls a function for each key in a mapping
				294	and creates a new mapping which only contains key-value associations for
				295	which the called function returned true (not equal 0 that is). The first
				296	argument is the mapping to iterate over and the second is a function name
				297	given as a string or a closure:
				298
				299	...
				300	submap = filter_mapping(map, "func", xarg0, ..., xargn);
				301	...
				302
				303	int func(mixed key, mixed xarg0, ..., mixed xargn) {
				304	...
				305	}
				306
				307	func() gets as first argument the key and the others are those passed
				308	as extras to filter_mapping().
				309
				310	The efun map_mapping() gets a mapping as first argument and a string as
				311	a function name (or again a closure) as second argument. Any additional
				312	arguments are again used as extras that will be passed to the iteration
				313	function. This efun returns a new mapping with the same keys as the given
				314	one. The values returned by the function that is invoked for each key
				315	will be used as the associated data behind each key of the new mapping:
				316
				317	...
				318	newmap = map_mapping(map, "func", xarg0, ..., xargn);
				319	...
				320
				321	mixed func(mixed key, mixed xarg0, ..., mixed xargn) {
				322	...
				323	}
				324
				325	func() gets as first argument the key and the others are those passed
				326	as extras to map_mapping().
				327
				328	Because a function can only return a single value (even when it is an
				329	array) it restricts the use of map_mapping() to only allow creation of
				330	mappings with a single value per key.
				331
				332
				333	12. Is it possible to join/intersect/cut mappings with another?
				334
				335	Joining mappings is only possible, if they have the same width (amount
				336	of values per key). One can use the + and += operator:
				337
				338	map = map1 + map2 + ... + mapn;
				339	map += map1 + map2 + ... + mapn;
				340
				341	Intersection of two mappings is only possible by using
				342	filter_mapping(). There is no efun or operator which features this. The
				343	'easiest' way may be the following function:
				344
				345	mapping intersect_mapping(mapping map1, mapping map2) {
				346	closure cl;
				347
				348	cl = lambda(({ 'key }), ({ #'member, map2, 'key }));
				349	return filter_mapping(map1, cl, map2);
				350	}
				351
				352	This function returns a new mapping which consists of all key-value
				353	associations of <map1> for which an equal key could be found in
				354	<map2>. This function uses a closure which returns 0 or 1 depending on
				355	wether a key from <map1> is contained in <map2> or not.
				356
				357	Cutting out all key-value associations of a mapping for which a key
				358	could be found in another mapping can be done by using the - and -=
				359	operator:
				360
				361	mapping cut_mapping(mapping map1, mapping map2) {
				362	return map1 - mkmapping(m_indices(map2));
				363	}
				364
				365	Because a maping can only be substracted by one without any values we
				366	first have to create such by using m_indices() and mkmapping().
				367
				368
				369	13. What are those mappings without any values (besides keys) good for?
				370
				371	Because the way how the driver searches for a key in a mapping is
				372	rather fast, those mappings can be used as a set of elements with a fast
				373	method for testing if an element is contained in the set. This technique
				374	is called hashing (further explanation would lead too far) which is
				375	faster than searching for values in array (which is done in a linear
				376	fashion).
				377
				378	Another (maybe more pratical) use of these mappings are to create a
				379	array of unique values out of an array with several equal values:
				380
				381	uniques = m_indices(mkmapping(array));
				382
				383	mkmapping() uses <array> to create a mapping without any values but
				384	just keys. And because a mapping can only have unique keys all multiple
				385	values in <array> are taken as one. The call of m_indices() then returns
				386	an array of these unique keys. Actually we only make use of those
				387	mappings temporarily.
				388
				389
				390	14. How can I convert an alist into a mapping and vice versa?
				391
				392	There are no special efuns which handle such conversions. But it can be
				393	done by the following functions:
				394
				395	mapping alist_to_mapping(mixed *alist) {
				396	return apply(#'mkmapping, alist);
				397	}
				398
				399	The efun apply() takes a closure and an array of values and passes each
				400	element of the array as an argument to the closure. Because an alist
				401	consists of an array of arrays with the first beeing the list of keys and
				402	the others the values associated to each key passing them as arguments to
				403	the efun closure #'mkmapping via apply() causes the creation of a mapping
				404	out of an alist.
				405
				406	mixed *mapping_to_alist(mapping map) {
				407	mixed *alist;
				408	symbol *vars;
				409	string var;
				410	closure cl;
				411	int width;
				412
				413	width = get_type_info(map)[1];
				414	alist = allocate(width + 1);
				415	vars = allocate(width + 2);
				416	for (var = "a"; width; var[0]++, width--) {
				417	alist[width] = ({});
				418	vars[width] = quote(var);
				419	}
				420	alist[0] = ({});
				421	vars[0] = 'key;
				422	vars[<1] = 'alist;
				423	cl = lambda(vars, ({ #'=, 'alist, ({ #'insert_alist }) + vars }));
				424	walk_mapping(map, cl, &alist);
				425	return alist;
				426	}
				427
				428	This function is a bit more complicated than the other and detailed
				429	description would lead too far of the topic. This function has one
				430	restriction: it can only turn a mappings with up to 26 values per key
				431	into an alist. But this should be sufficient for probably all
				432	applications which use mappings.
				433
				434	And Hyps further comment on this:
				435	The function mapping_to_alist() is also not that
				436	clever because insert_alist() allways creates a new
				437	alist. A second (optional) argument to m_values() to
				438	specify the value column would be better. Besides
				439	this, the conversion of a mapping into an alist could
				440	be done by to_array().
				441
				442	15. Dirty Mappings
				443
				444	'Dirty mappings' are nothing the LPC programmer directly is involved
				445	with, however, as it relates to the way mappings are implemented
				446	internally by the gamedriver. However, as this term shows up in
				447	various driver statistics, it is explained here.
				448
				449	There are two fundamental approaches to implement mappings:
				450
				451	1. Store all data entries in an array-like structure, in sorted order.
				452	2. Store all data in a hashtable, each entry allocaed separately.
				453
				454	Method 1 is very space efficient, as it doesn't need much overhead
				455	per entry; however, insertions and deletions of entries are
				456	relatively slow as all other entries need to be moved.
				457	Method 2 is very fast as nothing needs to be moved in memory,
				458	however it has a large overhead.
				459
				460	The gamedriver uses a hybrid method: at the basis is a mapping
				461	implementation based on arrays. However the driver uses a hash table
				462	in addition to handle all the ongoing insertions and deletions.
				463	Every once in a while, the contents of the hash table are sorted
				464	into the base array, reasoning that any entry surviving for longer
				465	time in the hash table is worth keeping in a more space-efficient
				466	manner. 'Dirty' mappings are such mappings with both an array and a
				467	hash part, 'clean' mappings are those with just an array part.
				468
				469	HISTORY
				470	The ([:width ]) notation was added in LDMud 3.2.9/3.3.208 .
				471
				472	SEE ALSO
				473	alists(LPC), closures(LPC), structs(LPC), mkmapping(E),
				474	walk_mapping(E)