Blame - doc/LPC/mappings - mudlib-public

blob: 2f64b88b98b89a035681d6f614c0cdfda91d8512 [file] [log] [blame]

MG Mud User	88f1247	2016-06-24 23:31:02 +0200	[diff] [blame]	1	CONCEPT
				2	mappings
				3
				4	LAST UPDATE
				5	Mon, 15 Mar 1999
				6
				7	DESCRIPTION
				8
				9	A step-by-step introduction to mappings:
				10	----------------------------------------
				11
				12	1. What is a mapping?
				13
				14	A mapping is a datatype which allows to store data associated to a key.
				15	In other languages they are also known as 'dictionaries' or 'alists'.
				16	There are also alists in LPC but they are not a separate datatype but are
				17	implemented on top of arrays. Alists are the predecessors of mappings.
				18	The keys and the values can be of any type. But most common datatypes
				19	for keys are strings, integers and objects. Others like arrays, mappings
				20	or closures aren't a good choice because comparision between i.e. arrays
				21	often returns false even if they equal in content. This is because the
				22	driver compares i.e. two arrays by their internal pointers and not by
				23	their content. The reason for this is simple: speed.
				24
				25	Mappings are allways treated as references when passing them to
				26	functions. This means when you pass a mapping to another object and this
				27	object modifies the mapping the modification will take place in a global
				28	scope - visible to all objects holding this mapping in a variable.
				29
				30
				31	2. What are mappings good for?
				32
				33	The term 'dictionary' probably describes the use of a mapping best.
				34	Opposed to arrays mappings don't have a specific order. They provide a
				35	mechanism to create a set of associations between values. Such an
				36	association consists of a unique key and data that is identified by the
				37	key. Think of a dictionary where you have a word and a definition of
				38	it. You use the word to lookup its definition.
				39
				40	Mappings can be used i.e. to hold aliases for commands. The key would
				41	then be the name of the alias and the data the command(s) behind an
				42	alias. Or they can be used for the exits of a room. The keys would be
				43	the directions where one can go to and the associated data would be the
				44	file names of the rooms. But mappings can also be used as a kind of a
				45	sparse array. A sparse array is an array where most of the elements
				46	aren't used (occupied by 0). I.e. if you want to store values at the
				47	position 0, 13 and 37642 of an array you would have to create an array
				48	with a size of at least 37643. This costs a lot of memory so a mapping
				49	would be more useful because you would then use the numbers 0, 13 and
				50	37642 as a key and not as an index to a position (actually the keys of a
				51	mapping are sometimes called indices but this is just because the way
				52	data is accessed in a mapping is similar to arrays: by the [] operator).
				53	This also allows to query all occupied positions of a sparse array by
				54	querying for all the keys of the mapping opposed to an array where you
				55	have to iterate over all elements.
				56
				57
				58	3. How do I create a mapping?
				59
				60	There are several ways to do so. The most convenient is the following:
				61
				62	mapping map;
				63	map = ([ key0: value00; ...; value0n,
				64	... : ... ; ...; ... ,
				65	keyn: valuen0; ...; valuenn ]);
				66
				67	As you can see, a key may have more than one value assigned. But the
				68	amount of values per key must always be equal. It is even possible to
				69	have mappings without any values!
				70	Another method is to use the efun mkmapping(). This efun gets two
				71	arguments with the first beeing an array of keys and the following beeing
				72	arrays of values:
				73
				74	mapping map;
				75	map = mkmapping (({ key0 , ..., keyn }),
				76	({ value00, ..., value0n }),
				77	({ ... , ..., ... }),
				78	({ valuen0, ..., valuenn }));
				79
				80	If the efun only gets one argument, then this argument will be taken as
				81	an array of keys and a mapping without values will be returned.
				82
				83	An empty mapping can be created by using the above described methods by
				84	simply ommitting the keys and values:
				85
				86	mapping map;
				87	map = ([]);
				88	or:
				89	map = mkmapping(({}), ({}));
				90
				91	Or by using the efun m_allocate(). This efun gets as first
				92	argument the amount of keys which will be added soon and an optional
				93	second argument specifying the width of the mapping:
				94
				95	map = m_allocate(n, width);
				96
				97	The value <n> may be a bit confusing since mappings shrink and grow
				98	dynamically. This value just tells the driver how 'long' this mapping is
				99	going to be so proper memory allocations will be performed to reduce
				100	the overhead of memory reallocation. I.e. if you want to read in a file
				101	and store the read data in a mapping you probably know the amount of
				102	keys. So you allocate a mapping with this efun and tell the driver how
				103	much memory should be allocated by specifing a proper <n> value.
				104	Thus causing a speedup when adding the read data to the mapping
				105	afterwards. The <width> just specifies how many values per key this
				106	mapping is going to have. If no width is given, 1 will be taken as
				107	default.
				108
				109	An empty mapping created with '([])' will always have a width of 1. To
				110	create empty mappings with other widths, write it as
				111
				112	map = ([:width ]);
				113
				114	<width> can be any expression returning an integer value (including
				115	function calls), and in fact this notation is just a fancy way of
				116	writing
				117
				118	map = m_allocate(0, width);
				119
				120
				121
				122	4. How can I modify the data of a mapping?
				123
				124	Adding a new key is similiar to modifying the associated data of an
				125	existing key:
				126
				127	map += ([ key: value0; ...; valuen ]);
				128
				129	Or in case only a single value should be modified:
				130
				131	map[key, n] = valuen;
				132
				133	If <n> is out of range or if <key> doesn't exists and <n> is greater
				134	than 0 an "Illegal index" error will be reported. If <n> is equal to 0 or
				135	the mapping only has a single value per key one can abbreviate it with:
				136
				137	map[key] = value;
				138
				139	If there is no <key> (and <n> is equal to 0 or not specified at all) a
				140	new one will be added automatically.
				141
Zesstra	3085c66	2025-08-02 18:31:10 +0200	[diff] [blame]	142	Multiple values for a single key can be modified with range access:
				143
				144	map[key, n1..n2] = ({ valuen1, ..., valuen2 });
				145	map[key, 0..<1] = ({ value0, ..., valuen });
				146
MG Mud User	88f1247	2016-06-24 23:31:02 +0200	[diff] [blame]	147	Deletion of a key is done with the -= operator or the efun
				148	m_delete(). A mapping can only be substracted by one without any values:
				149
				150	map -= ([ key ]);
				151	or:
				152	map -= ([ key0, ..., keyn ]);
				153
				154	The efun takes a mapping as first and a key as second argument:
				155
				156	m_delete(map, key);
				157
				158	The efun m_delete() returns the mapping but because mappings are
				159	handled as references there is no need of an assignment like:
				160
				161	map = m_delete(map, key);
				162
				163
				164	5. How can I access the data stored in a mapping?
				165
				166	This can be done by:
				167
				168	valuen = map[key, n];
				169
				170	Or in case of a mapping with just one value per key:
				171
				172	value0 = map[key];
				173
				174	If there is no <key> in the mapping and <n> is 0 or not specified at
				175	all (which is the same) a 0 will be returned or if <n> is greater than 0
				176	an "Illegal index" error will be reported.
				177
Zesstra	3085c66	2025-08-02 18:31:10 +0200	[diff] [blame]	178	Multiple values for a single key can be accessed by:
				179
				180	arr = map[key, n1..n2];
				181
				182	The values are returned in an array.
				183
MG Mud User	88f1247	2016-06-24 23:31:02 +0200	[diff] [blame]	184
				185	6. How can I test for the existance of a key?
				186
				187	A return value of 0 is sufficient for most applications but sometimes
				188	the ambiguity between an existing value of 0 and a nonexisting key can
				189	lead to a problem. Therefore one can use the efun member() or
				190	mapping_contains() to check if there actually is a key in the mapping:
				191
				192	if (member(map, key)) {
				193	...
				194	}
				195	or:
				196	if (mapping_contains(&value0, ..., &valuen, map, key)) {
				197	...
				198	}
				199
				200	This also shows how one can retrieve all values associated to a key
				201	from a mapping in a single step. The '&' is the reference operator which
				202	is neccesary to let the efun store the values in the variables.
				203
				204	In case of mappings with no values, the efun member() and
				205	mapping_contains() are equal in their behaviour and their way of calling
				206	because mapping_contains() won't get any reference variables to store the
				207	values in (obviously, because there aren't any).
				208
				209	Also normally member() is known to return the postion of an element in
				210	a list (i.e. a character in a string or data in an array) and if an
				211	element couldn't be found -1 is returned. But in the case of mappings
				212	there are no such things as order and postion. So member() only returns 0
				213	or 1.
				214
				215
				216	7. How can I copy a mapping?
				217
				218	A mapping can be copied with the + operator or by the efun
				219	copy_mapping():
				220
				221	newmap = ([]) + map;
				222	or:
				223	newmap = copy_mapping(map);
				224
				225	A mapping should only be copied when it is neccesary to get an own copy
				226	of it that must not be shared by other objects.
				227
				228
				229	8. How can I get all keys of a mapping?
				230
				231	The efun m_indices() gets a mapping as argument and returns an array
				232	holding all keys defined in this mapping:
				233
				234	keys = m_indices(map);
				235
				236
				237	9. How can I get all the values of a mapping?
				238
				239	The efun m_values() gets a mapping as argument and returns an array
				240	holding all the first (second, ...) values of it.
				241
				242	values0 = m_values(map); returns the first values
				243	values0 = m_values(map, 0); dito
				244	values1 = m_values(map, 1); returns the second values
				245	etc
				246
				247
				248	10. How can I determine the size of a mapping?
				249
				250	Because a mapping is a kind of rectangle it has two sizes: a length and
				251	a width. There are three different efuns to query these values. The first
				252	two are the efuns sizeof(), which returns the amount of key-value
				253	associations (the length of a mapping), and widthof(), which returns the
				254	number of values per key (the width). The third is the efun get_type_info().
				255	get_type_info() is meant to be a function to identify a datatype. Its
				256	return value is an array of two numerical values. The first specifies
				257	the datatype of the argument and the second is a datatype dependend
				258	value. In the case of a mapping the first value is T_MAPPING (which is a
				259	value defined in <lpctypes.h>) and the second the amount of values per
				260	key (a.k.a. columns or the width of the mapping - actually it would be
				261	correct to say that the width of a mapping is the amount of columns plus
				262	one for the keys but this is uncommon).
				263
				264
				265	11. What is the best method to iterate over a mapping?
				266
				267	First of all the main purpose of a mapping is not meant to be a set of
				268	data to iterate over. Afterall the keys in a mapping have no specific but
				269	a random order (at least on the LPC side). But still it is possible and
				270	sometimes even neccesary to do so.
				271
				272	If all key-value associations should be processed then one should use
				273	walk_mapping(). If all keys of a mapping should be processed to create a
				274	new mapping being a subset of the given one, then filter_mapping() should
				275	be used. If all keys are going to be processed and to create a new
				276	mapping with the same set of keys as the given mapping, then one would
				277	use map_mapping(). But in the case of an iteration that should/can stop
				278	even if not all data is processed it is probably wise to iterate over the
				279	mapping by first querying for the keys and then to iterate over them with
				280	a for() or a while() loop and querying the values by 'hand'.
				281
				282	The efun walk_mapping() gets a mapping as first argument and the name
				283	of a function as second one. All the following arguments are treated as
				284	extras which will be passed to the function specified with the 2nd
				285	argument. Instead of a string for the name of a function a closure can be
				286	used, too. Nothing will be returned:
				287
				288	...
				289	walk_mapping(map, "func", xarg0, ..., xargn);
				290	...
				291
				292	void func(mixed key, mixed value0, ..., mixed valuen,
				293	mixed xarg0, ..., mixed xargn) {
				294	...
				295	}
				296
				297	func() will be called for all key-value associations and gets as first
				298	argument the key. The next arguments are the values behind the key and
				299	are passed as references. The rest of the passed arguments are those
				300	specified as extras. Because the values are passed as references (opposed
				301	to copies) it is possible to modify them from inside func() by simply
				302	assigning new value to the variables <value0>, ..., <valuen>.
				303
				304	The efun filter_mapping() calls a function for each key in a mapping
				305	and creates a new mapping which only contains key-value associations for
				306	which the called function returned true (not equal 0 that is). The first
				307	argument is the mapping to iterate over and the second is a function name
				308	given as a string or a closure:
				309
				310	...
				311	submap = filter_mapping(map, "func", xarg0, ..., xargn);
				312	...
				313
				314	int func(mixed key, mixed xarg0, ..., mixed xargn) {
				315	...
				316	}
				317
				318	func() gets as first argument the key and the others are those passed
				319	as extras to filter_mapping().
				320
				321	The efun map_mapping() gets a mapping as first argument and a string as
				322	a function name (or again a closure) as second argument. Any additional
				323	arguments are again used as extras that will be passed to the iteration
				324	function. This efun returns a new mapping with the same keys as the given
				325	one. The values returned by the function that is invoked for each key
				326	will be used as the associated data behind each key of the new mapping:
				327
				328	...
				329	newmap = map_mapping(map, "func", xarg0, ..., xargn);
				330	...
				331
				332	mixed func(mixed key, mixed xarg0, ..., mixed xargn) {
				333	...
				334	}
				335
				336	func() gets as first argument the key and the others are those passed
				337	as extras to map_mapping().
				338
				339	Because a function can only return a single value (even when it is an
				340	array) it restricts the use of map_mapping() to only allow creation of
				341	mappings with a single value per key.
				342
				343
				344	12. Is it possible to join/intersect/cut mappings with another?
				345
				346	Joining mappings is only possible, if they have the same width (amount
				347	of values per key). One can use the + and += operator:
				348
				349	map = map1 + map2 + ... + mapn;
				350	map += map1 + map2 + ... + mapn;
				351
				352	Intersection of two mappings is only possible by using
				353	filter_mapping(). There is no efun or operator which features this. The
				354	'easiest' way may be the following function:
				355
				356	mapping intersect_mapping(mapping map1, mapping map2) {
				357	closure cl;
				358
				359	cl = lambda(({ 'key }), ({ #'member, map2, 'key }));
				360	return filter_mapping(map1, cl, map2);
				361	}
				362
				363	This function returns a new mapping which consists of all key-value
				364	associations of <map1> for which an equal key could be found in
				365	<map2>. This function uses a closure which returns 0 or 1 depending on
				366	wether a key from <map1> is contained in <map2> or not.
				367
				368	Cutting out all key-value associations of a mapping for which a key
				369	could be found in another mapping can be done by using the - and -=
				370	operator:
				371
				372	mapping cut_mapping(mapping map1, mapping map2) {
				373	return map1 - mkmapping(m_indices(map2));
				374	}
				375
				376	Because a maping can only be substracted by one without any values we
				377	first have to create such by using m_indices() and mkmapping().
				378
				379
				380	13. What are those mappings without any values (besides keys) good for?
				381
				382	Because the way how the driver searches for a key in a mapping is
				383	rather fast, those mappings can be used as a set of elements with a fast
				384	method for testing if an element is contained in the set. This technique
				385	is called hashing (further explanation would lead too far) which is
				386	faster than searching for values in array (which is done in a linear
				387	fashion).
				388
				389	Another (maybe more pratical) use of these mappings are to create a
				390	array of unique values out of an array with several equal values:
				391
				392	uniques = m_indices(mkmapping(array));
				393
				394	mkmapping() uses <array> to create a mapping without any values but
				395	just keys. And because a mapping can only have unique keys all multiple
				396	values in <array> are taken as one. The call of m_indices() then returns
				397	an array of these unique keys. Actually we only make use of those
				398	mappings temporarily.
				399
				400
				401	14. How can I convert an alist into a mapping and vice versa?
				402
				403	There are no special efuns which handle such conversions. But it can be
				404	done by the following functions:
				405
				406	mapping alist_to_mapping(mixed *alist) {
				407	return apply(#'mkmapping, alist);
				408	}
				409
				410	The efun apply() takes a closure and an array of values and passes each
				411	element of the array as an argument to the closure. Because an alist
				412	consists of an array of arrays with the first beeing the list of keys and
				413	the others the values associated to each key passing them as arguments to
				414	the efun closure #'mkmapping via apply() causes the creation of a mapping
				415	out of an alist.
				416
				417	mixed *mapping_to_alist(mapping map) {
				418	mixed *alist;
				419	symbol *vars;
				420	string var;
				421	closure cl;
				422	int width;
				423
				424	width = get_type_info(map)[1];
				425	alist = allocate(width + 1);
				426	vars = allocate(width + 2);
				427	for (var = "a"; width; var[0]++, width--) {
				428	alist[width] = ({});
				429	vars[width] = quote(var);
				430	}
				431	alist[0] = ({});
				432	vars[0] = 'key;
				433	vars[<1] = 'alist;
				434	cl = lambda(vars, ({ #'=, 'alist, ({ #'insert_alist }) + vars }));
				435	walk_mapping(map, cl, &alist);
				436	return alist;
				437	}
				438
				439	This function is a bit more complicated than the other and detailed
				440	description would lead too far of the topic. This function has one
				441	restriction: it can only turn a mappings with up to 26 values per key
				442	into an alist. But this should be sufficient for probably all
				443	applications which use mappings.
				444
				445	And Hyps further comment on this:
				446	The function mapping_to_alist() is also not that
				447	clever because insert_alist() allways creates a new
				448	alist. A second (optional) argument to m_values() to
				449	specify the value column would be better. Besides
				450	this, the conversion of a mapping into an alist could
				451	be done by to_array().
				452
				453	15. Dirty Mappings
				454
				455	'Dirty mappings' are nothing the LPC programmer directly is involved
				456	with, however, as it relates to the way mappings are implemented
				457	internally by the gamedriver. However, as this term shows up in
				458	various driver statistics, it is explained here.
				459
				460	There are two fundamental approaches to implement mappings:
				461
				462	1. Store all data entries in an array-like structure, in sorted order.
				463	2. Store all data in a hashtable, each entry allocaed separately.
				464
				465	Method 1 is very space efficient, as it doesn't need much overhead
				466	per entry; however, insertions and deletions of entries are
				467	relatively slow as all other entries need to be moved.
				468	Method 2 is very fast as nothing needs to be moved in memory,
				469	however it has a large overhead.
				470
				471	The gamedriver uses a hybrid method: at the basis is a mapping
				472	implementation based on arrays. However the driver uses a hash table
				473	in addition to handle all the ongoing insertions and deletions.
				474	Every once in a while, the contents of the hash table are sorted
				475	into the base array, reasoning that any entry surviving for longer
				476	time in the hash table is worth keeping in a more space-efficient
				477	manner. 'Dirty' mappings are such mappings with both an array and a
				478	hash part, 'clean' mappings are those with just an array part.
				479
				480	HISTORY
				481	The ([:width ]) notation was added in LDMud 3.2.9/3.3.208 .
				482
				483	SEE ALSO
				484	alists(LPC), closures(LPC), structs(LPC), mkmapping(E),
				485	walk_mapping(E)