blob: 381e3c8cc8a24adb41f23264b73844cb845805a4 [file] [log] [blame]
MG Mud User88f12472016-06-24 23:31:02 +02001CONCEPT
2 mappings
3
4LAST UPDATE
5 Mon, 15 Mar 1999
6
7DESCRIPTION
8
9 A step-by-step introduction to mappings:
10 ----------------------------------------
11
12 1. What is a mapping?
13
14 A mapping is a datatype which allows to store data associated to a key.
15 In other languages they are also known as 'dictionaries' or 'alists'.
16 There are also alists in LPC but they are not a separate datatype but are
17 implemented on top of arrays. Alists are the predecessors of mappings.
18 The keys and the values can be of any type. But most common datatypes
19 for keys are strings, integers and objects. Others like arrays, mappings
20 or closures aren't a good choice because comparision between i.e. arrays
21 often returns false even if they equal in content. This is because the
22 driver compares i.e. two arrays by their internal pointers and not by
23 their content. The reason for this is simple: speed.
24
25 Mappings are allways treated as references when passing them to
26 functions. This means when you pass a mapping to another object and this
27 object modifies the mapping the modification will take place in a global
28 scope - visible to all objects holding this mapping in a variable.
29
30
31 2. What are mappings good for?
32
33 The term 'dictionary' probably describes the use of a mapping best.
34 Opposed to arrays mappings don't have a specific order. They provide a
35 mechanism to create a set of associations between values. Such an
36 association consists of a unique key and data that is identified by the
37 key. Think of a dictionary where you have a word and a definition of
38 it. You use the word to lookup its definition.
39
40 Mappings can be used i.e. to hold aliases for commands. The key would
41 then be the name of the alias and the data the command(s) behind an
42 alias. Or they can be used for the exits of a room. The keys would be
43 the directions where one can go to and the associated data would be the
44 file names of the rooms. But mappings can also be used as a kind of a
45 sparse array. A sparse array is an array where most of the elements
46 aren't used (occupied by 0). I.e. if you want to store values at the
47 position 0, 13 and 37642 of an array you would have to create an array
48 with a size of at least 37643. This costs a lot of memory so a mapping
49 would be more useful because you would then use the numbers 0, 13 and
50 37642 as a key and not as an index to a position (actually the keys of a
51 mapping are sometimes called indices but this is just because the way
52 data is accessed in a mapping is similar to arrays: by the [] operator).
53 This also allows to query all occupied positions of a sparse array by
54 querying for all the keys of the mapping opposed to an array where you
55 have to iterate over all elements.
56
57
58 3. How do I create a mapping?
59
60 There are several ways to do so. The most convenient is the following:
61
62 mapping map;
63 map = ([ key0: value00; ...; value0n,
64 ... : ... ; ...; ... ,
65 keyn: valuen0; ...; valuenn ]);
66
67 As you can see, a key may have more than one value assigned. But the
68 amount of values per key must always be equal. It is even possible to
69 have mappings without any values!
70 Another method is to use the efun mkmapping(). This efun gets two
71 arguments with the first beeing an array of keys and the following beeing
72 arrays of values:
73
74 mapping map;
75 map = mkmapping (({ key0 , ..., keyn }),
76 ({ value00, ..., value0n }),
77 ({ ... , ..., ... }),
78 ({ valuen0, ..., valuenn }));
79
80 If the efun only gets one argument, then this argument will be taken as
81 an array of keys and a mapping without values will be returned.
82
83 An empty mapping can be created by using the above described methods by
84 simply ommitting the keys and values:
85
86 mapping map;
87 map = ([]);
88 or:
89 map = mkmapping(({}), ({}));
90
91 Or by using the efun m_allocate(). This efun gets as first
92 argument the amount of keys which will be added soon and an optional
93 second argument specifying the width of the mapping:
94
95 map = m_allocate(n, width);
96
97 The value <n> may be a bit confusing since mappings shrink and grow
98 dynamically. This value just tells the driver how 'long' this mapping is
99 going to be so proper memory allocations will be performed to reduce
100 the overhead of memory reallocation. I.e. if you want to read in a file
101 and store the read data in a mapping you probably know the amount of
102 keys. So you allocate a mapping with this efun and tell the driver how
103 much memory should be allocated by specifing a proper <n> value.
104 Thus causing a speedup when adding the read data to the mapping
105 afterwards. The <width> just specifies how many values per key this
106 mapping is going to have. If no width is given, 1 will be taken as
107 default.
108
109 An empty mapping created with '([])' will always have a width of 1. To
110 create empty mappings with other widths, write it as
111
112 map = ([:width ]);
113
114 <width> can be any expression returning an integer value (including
115 function calls), and in fact this notation is just a fancy way of
116 writing
117
118 map = m_allocate(0, width);
119
120
121
122 4. How can I modify the data of a mapping?
123
124 Adding a new key is similiar to modifying the associated data of an
125 existing key:
126
127 map += ([ key: value0; ...; valuen ]);
128
129 Or in case only a single value should be modified:
130
131 map[key, n] = valuen;
132
133 If <n> is out of range or if <key> doesn't exists and <n> is greater
134 than 0 an "Illegal index" error will be reported. If <n> is equal to 0 or
135 the mapping only has a single value per key one can abbreviate it with:
136
137 map[key] = value;
138
139 If there is no <key> (and <n> is equal to 0 or not specified at all) a
140 new one will be added automatically.
141
142 Deletion of a key is done with the -= operator or the efun
143 m_delete(). A mapping can only be substracted by one without any values:
144
145 map -= ([ key ]);
146 or:
147 map -= ([ key0, ..., keyn ]);
148
149 The efun takes a mapping as first and a key as second argument:
150
151 m_delete(map, key);
152
153 The efun m_delete() returns the mapping but because mappings are
154 handled as references there is no need of an assignment like:
155
156 map = m_delete(map, key);
157
158
159 5. How can I access the data stored in a mapping?
160
161 This can be done by:
162
163 valuen = map[key, n];
164
165 Or in case of a mapping with just one value per key:
166
167 value0 = map[key];
168
169 If there is no <key> in the mapping and <n> is 0 or not specified at
170 all (which is the same) a 0 will be returned or if <n> is greater than 0
171 an "Illegal index" error will be reported.
172
173
174 6. How can I test for the existance of a key?
175
176 A return value of 0 is sufficient for most applications but sometimes
177 the ambiguity between an existing value of 0 and a nonexisting key can
178 lead to a problem. Therefore one can use the efun member() or
179 mapping_contains() to check if there actually is a key in the mapping:
180
181 if (member(map, key)) {
182 ...
183 }
184 or:
185 if (mapping_contains(&value0, ..., &valuen, map, key)) {
186 ...
187 }
188
189 This also shows how one can retrieve all values associated to a key
190 from a mapping in a single step. The '&' is the reference operator which
191 is neccesary to let the efun store the values in the variables.
192
193 In case of mappings with no values, the efun member() and
194 mapping_contains() are equal in their behaviour and their way of calling
195 because mapping_contains() won't get any reference variables to store the
196 values in (obviously, because there aren't any).
197
198 Also normally member() is known to return the postion of an element in
199 a list (i.e. a character in a string or data in an array) and if an
200 element couldn't be found -1 is returned. But in the case of mappings
201 there are no such things as order and postion. So member() only returns 0
202 or 1.
203
204
205 7. How can I copy a mapping?
206
207 A mapping can be copied with the + operator or by the efun
208 copy_mapping():
209
210 newmap = ([]) + map;
211 or:
212 newmap = copy_mapping(map);
213
214 A mapping should only be copied when it is neccesary to get an own copy
215 of it that must not be shared by other objects.
216
217
218 8. How can I get all keys of a mapping?
219
220 The efun m_indices() gets a mapping as argument and returns an array
221 holding all keys defined in this mapping:
222
223 keys = m_indices(map);
224
225
226 9. How can I get all the values of a mapping?
227
228 The efun m_values() gets a mapping as argument and returns an array
229 holding all the first (second, ...) values of it.
230
231 values0 = m_values(map); returns the first values
232 values0 = m_values(map, 0); dito
233 values1 = m_values(map, 1); returns the second values
234 etc
235
236
237 10. How can I determine the size of a mapping?
238
239 Because a mapping is a kind of rectangle it has two sizes: a length and
240 a width. There are three different efuns to query these values. The first
241 two are the efuns sizeof(), which returns the amount of key-value
242 associations (the length of a mapping), and widthof(), which returns the
243 number of values per key (the width). The third is the efun get_type_info().
244 get_type_info() is meant to be a function to identify a datatype. Its
245 return value is an array of two numerical values. The first specifies
246 the datatype of the argument and the second is a datatype dependend
247 value. In the case of a mapping the first value is T_MAPPING (which is a
248 value defined in <lpctypes.h>) and the second the amount of values per
249 key (a.k.a. columns or the width of the mapping - actually it would be
250 correct to say that the width of a mapping is the amount of columns plus
251 one for the keys but this is uncommon).
252
253
254 11. What is the best method to iterate over a mapping?
255
256 First of all the main purpose of a mapping is not meant to be a set of
257 data to iterate over. Afterall the keys in a mapping have no specific but
258 a random order (at least on the LPC side). But still it is possible and
259 sometimes even neccesary to do so.
260
261 If all key-value associations should be processed then one should use
262 walk_mapping(). If all keys of a mapping should be processed to create a
263 new mapping being a subset of the given one, then filter_mapping() should
264 be used. If all keys are going to be processed and to create a new
265 mapping with the same set of keys as the given mapping, then one would
266 use map_mapping(). But in the case of an iteration that should/can stop
267 even if not all data is processed it is probably wise to iterate over the
268 mapping by first querying for the keys and then to iterate over them with
269 a for() or a while() loop and querying the values by 'hand'.
270
271 The efun walk_mapping() gets a mapping as first argument and the name
272 of a function as second one. All the following arguments are treated as
273 extras which will be passed to the function specified with the 2nd
274 argument. Instead of a string for the name of a function a closure can be
275 used, too. Nothing will be returned:
276
277 ...
278 walk_mapping(map, "func", xarg0, ..., xargn);
279 ...
280
281 void func(mixed key, mixed value0, ..., mixed valuen,
282 mixed xarg0, ..., mixed xargn) {
283 ...
284 }
285
286 func() will be called for all key-value associations and gets as first
287 argument the key. The next arguments are the values behind the key and
288 are passed as references. The rest of the passed arguments are those
289 specified as extras. Because the values are passed as references (opposed
290 to copies) it is possible to modify them from inside func() by simply
291 assigning new value to the variables <value0>, ..., <valuen>.
292
293 The efun filter_mapping() calls a function for each key in a mapping
294 and creates a new mapping which only contains key-value associations for
295 which the called function returned true (not equal 0 that is). The first
296 argument is the mapping to iterate over and the second is a function name
297 given as a string or a closure:
298
299 ...
300 submap = filter_mapping(map, "func", xarg0, ..., xargn);
301 ...
302
303 int func(mixed key, mixed xarg0, ..., mixed xargn) {
304 ...
305 }
306
307 func() gets as first argument the key and the others are those passed
308 as extras to filter_mapping().
309
310 The efun map_mapping() gets a mapping as first argument and a string as
311 a function name (or again a closure) as second argument. Any additional
312 arguments are again used as extras that will be passed to the iteration
313 function. This efun returns a new mapping with the same keys as the given
314 one. The values returned by the function that is invoked for each key
315 will be used as the associated data behind each key of the new mapping:
316
317 ...
318 newmap = map_mapping(map, "func", xarg0, ..., xargn);
319 ...
320
321 mixed func(mixed key, mixed xarg0, ..., mixed xargn) {
322 ...
323 }
324
325 func() gets as first argument the key and the others are those passed
326 as extras to map_mapping().
327
328 Because a function can only return a single value (even when it is an
329 array) it restricts the use of map_mapping() to only allow creation of
330 mappings with a single value per key.
331
332
333 12. Is it possible to join/intersect/cut mappings with another?
334
335 Joining mappings is only possible, if they have the same width (amount
336 of values per key). One can use the + and += operator:
337
338 map = map1 + map2 + ... + mapn;
339 map += map1 + map2 + ... + mapn;
340
341 Intersection of two mappings is only possible by using
342 filter_mapping(). There is no efun or operator which features this. The
343 'easiest' way may be the following function:
344
345 mapping intersect_mapping(mapping map1, mapping map2) {
346 closure cl;
347
348 cl = lambda(({ 'key }), ({ #'member, map2, 'key }));
349 return filter_mapping(map1, cl, map2);
350 }
351
352 This function returns a new mapping which consists of all key-value
353 associations of <map1> for which an equal key could be found in
354 <map2>. This function uses a closure which returns 0 or 1 depending on
355 wether a key from <map1> is contained in <map2> or not.
356
357 Cutting out all key-value associations of a mapping for which a key
358 could be found in another mapping can be done by using the - and -=
359 operator:
360
361 mapping cut_mapping(mapping map1, mapping map2) {
362 return map1 - mkmapping(m_indices(map2));
363 }
364
365 Because a maping can only be substracted by one without any values we
366 first have to create such by using m_indices() and mkmapping().
367
368
369 13. What are those mappings without any values (besides keys) good for?
370
371 Because the way how the driver searches for a key in a mapping is
372 rather fast, those mappings can be used as a set of elements with a fast
373 method for testing if an element is contained in the set. This technique
374 is called hashing (further explanation would lead too far) which is
375 faster than searching for values in array (which is done in a linear
376 fashion).
377
378 Another (maybe more pratical) use of these mappings are to create a
379 array of unique values out of an array with several equal values:
380
381 uniques = m_indices(mkmapping(array));
382
383 mkmapping() uses <array> to create a mapping without any values but
384 just keys. And because a mapping can only have unique keys all multiple
385 values in <array> are taken as one. The call of m_indices() then returns
386 an array of these unique keys. Actually we only make use of those
387 mappings temporarily.
388
389
390 14. How can I convert an alist into a mapping and vice versa?
391
392 There are no special efuns which handle such conversions. But it can be
393 done by the following functions:
394
395 mapping alist_to_mapping(mixed *alist) {
396 return apply(#'mkmapping, alist);
397 }
398
399 The efun apply() takes a closure and an array of values and passes each
400 element of the array as an argument to the closure. Because an alist
401 consists of an array of arrays with the first beeing the list of keys and
402 the others the values associated to each key passing them as arguments to
403 the efun closure #'mkmapping via apply() causes the creation of a mapping
404 out of an alist.
405
406 mixed *mapping_to_alist(mapping map) {
407 mixed *alist;
408 symbol *vars;
409 string var;
410 closure cl;
411 int width;
412
413 width = get_type_info(map)[1];
414 alist = allocate(width + 1);
415 vars = allocate(width + 2);
416 for (var = "a"; width; var[0]++, width--) {
417 alist[width] = ({});
418 vars[width] = quote(var);
419 }
420 alist[0] = ({});
421 vars[0] = 'key;
422 vars[<1] = 'alist;
423 cl = lambda(vars, ({ #'=, 'alist, ({ #'insert_alist }) + vars }));
424 walk_mapping(map, cl, &alist);
425 return alist;
426 }
427
428 This function is a bit more complicated than the other and detailed
429 description would lead too far of the topic. This function has one
430 restriction: it can only turn a mappings with up to 26 values per key
431 into an alist. But this should be sufficient for probably all
432 applications which use mappings.
433
434 And Hyps further comment on this:
435 The function mapping_to_alist() is also not that
436 clever because insert_alist() allways creates a new
437 alist. A second (optional) argument to m_values() to
438 specify the value column would be better. Besides
439 this, the conversion of a mapping into an alist could
440 be done by to_array().
441
442 15. Dirty Mappings
443
444 'Dirty mappings' are nothing the LPC programmer directly is involved
445 with, however, as it relates to the way mappings are implemented
446 internally by the gamedriver. However, as this term shows up in
447 various driver statistics, it is explained here.
448
449 There are two fundamental approaches to implement mappings:
450
451 1. Store all data entries in an array-like structure, in sorted order.
452 2. Store all data in a hashtable, each entry allocaed separately.
453
454 Method 1 is very space efficient, as it doesn't need much overhead
455 per entry; however, insertions and deletions of entries are
456 relatively slow as all other entries need to be moved.    
457 Method 2 is very fast as nothing needs to be moved in memory,
458 however it has a large overhead.
459
460 The gamedriver uses a hybrid method: at the basis is a mapping
461 implementation based on arrays. However the driver uses a hash table
462 in addition to handle all the ongoing insertions and deletions.
463 Every once in a while, the contents of the hash table are sorted
464 into the base array, reasoning that any entry surviving for longer
465 time in the hash table is worth keeping in a more space-efficient
466 manner. 'Dirty' mappings are such mappings with both an array and a
467 hash part, 'clean' mappings are those with just an array part.
468
469HISTORY
470 The ([:width ]) notation was added in LDMud 3.2.9/3.3.208 .
471
472SEE ALSO
473 alists(LPC), closures(LPC), structs(LPC), mkmapping(E),
474 walk_mapping(E)