| 1 | Intermediate LPC |
| 2 | Descartes of Borg |
| 3 | November 1993 |
| 4 | |
| 5 | Chapter 3: Complex Data Types |
| 6 | |
| 7 | 3.1 Simple Data Types |
| 8 | In the textbook LPC Basics, you learned about the common, basic LPC |
| 9 | data types: int, string, object, void. Most importantly, you learned that |
| 10 | many operations and functions behave differently based on the data type |
| 11 | of the variables upon which they are operating. Some operators and |
| 12 | functions will even give errors if you use them with the wrong data |
| 13 | types. For example, "a"+"b" is handled much differently than 1+1. |
| 14 | When you add "a"+"b", you are adding "b" onto the end of "a" to get |
| 15 | "ab". On the other hand, when you add 1+1, you do not get 11, you get |
| 16 | 2 as you would expect. |
| 17 | |
| 18 | I refer to these data types as simple data types, because they are atomic in |
| 19 | that they cannot be broken down into smaller component data types. |
| 20 | The object data type is a sort of exception, but you really cannot refer |
| 21 | individually to the components which make it up, so I refer to it as a |
| 22 | simple data type. |
| 23 | |
| 24 | This chapter introduces the concept of the complex data type, a data type |
| 25 | which is made up of units of simple data types. LPC has two common |
| 26 | complex data types, both kinds of arrays. First, there is the traditional |
| 27 | array which stores values in consecutive elements accessed by a number |
| 28 | representing which element they are stored in. Second is an associative |
| 29 | array called a mapping. A mapping associates two values together to |
| 30 | allow a more natural access to data. |
| 31 | |
| 32 | 3.2 The Values NULL and 0 |
| 33 | Before getting fully into arrays, there first should be a full understanding |
| 34 | of the concept of NULL versus the concept of 0. In LPC, a null value is |
| 35 | represented by the integer 0. Although the integer 0 and NULL are often |
| 36 | freely interchangeable, this interchangeability often leads to some great |
| 37 | confusion when you get into the realm of complex data types. You may |
| 38 | have even encountered such confusion while using strings. |
| 39 | |
| 40 | For integers, 0 is the value you can add to any other value and still |
| 41 | get that other value back. More generally, for any addition |
| 42 | operation on any data type, the ZERO value for that data type is the value |
| 43 | that you can add to any other value and get the original value. Thus: A |
| 44 | plus ZERO equals A where A is some value of a given data type and |
| 45 | ZERO is the ZERO value for that data type. This is not any sort of |
| 46 | official mathematical definition. There exists one, but I am not a |
| 47 | mathematician, so I have no idea what the term is. Thus for integers, 0 |
| 48 | is the ZERO value since 1 + 0 equals 1. |
| 49 | |
| 50 | NULL, on the other hand, is the absence of any value or meaning. The |
| 51 | LPC driver will interpret NULL as an integer 0 if it can make sense of it |
| 52 | in that context. In any context besides integer addition, A plus NULL |
| 53 | causes an error. NULL causes an error because adding valueless fields |
| 54 | in other data types to those data types makes no sense. |
| 55 | |
| 56 | Looking at this from another point of view, we can get the ZERO value |
| 57 | for strings by knowing what added to "a" will give us "a" as a result. |
| 58 | The answer is not 0, but instead "". With integers, interchanging NULL |
| 59 | and 0 was acceptable since 0 represents no value with respect to the |
| 60 | integer data type. This interchangeability is not true for other data types, |
| 61 | since their ZERO values do not represent no value. Namely, "" |
| 62 | represents a string of no length and is very different from 0. |
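
The ZERO value idea can be sketched in a few lines. This is only an illustration; show_zero_values() is a hypothetical function name, not part of any mudlib.

```lpc
/* Hypothetical demonstration function, not part of any mudlib. */
void show_zero_values() {
    int n;
    string s;

    n = 1 + 0;       /* 0 is the ZERO value for integers: n is 1 */
    s = "a" + "";    /* "" is the ZERO value for strings: s is "a" */
}
```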
| 63 | |
| 64 | When you first declare any variable of any type, it has no value. Any |
| 65 | data type except integers therefore must be initialized somehow before |
| 66 | you perform any operation on it. Generally, initialization is done in the |
| 67 | create() function for global variables, or at the top of the local function |
| 68 | for local variables by assigning them some value, often the ZERO value |
| 69 | for that data type. For example, in the following code I want to build a |
| 70 | string with random words: |
| 71 | |
| 72 | string build_nonsense() { |
| 73 |     string str; |
| 74 |     int i; |
| 75 | |
| 76 |     str = "";   /* Here str is initialized to the string |
| 77 |                    ZERO value */ |
| 78 |     for(i=0; i<6; i++) { |
| 79 |         switch(random(3)+1) { |
| 80 |             case 1: str += "bing"; break; |
| 81 |             case 2: str += "borg"; break; |
| 82 |             case 3: str += "foo"; break; |
| 83 |         } |
| 84 |         if(i==5) str += ".\n"; |
| 85 |         else str += " "; |
| 86 |     } |
| 87 |     return capitalize(str); |
| 88 | } |
| 89 | |
| 90 | If we had not initialized the variable str, an error would have resulted |
| 91 | from trying to add a string to a NULL value. Instead, this code first |
| 92 | initializes str to the ZERO value for strings, "". After that, it enters a |
| 93 | loop which makes 6 cycles, each time randomly adding one of three |
| 94 | possible words to the string. For all words except the last, an additional |
| 95 | blank character is added. For the last word, a period and a newline |
| 96 | character are added. The function then exits the loop, capitalizes the |
| 97 | nonsense string, and returns it. |
| 98 | |
| 99 | 3.3 Arrays in LPC |
| 100 | An array is a powerful complex data type of LPC which allows you to |
| 101 | access multiple values through a single variable. For instance, |
| 102 | Nightmare has an indefinite number of currencies in which players may |
| 103 | do business. Only five of those currencies, however, can be considered |
| 104 | hard currencies. A hard currency for the sake of this example is a |
| 105 | currency which is readily exchangeable for any other hard currency, |
| 106 | whereas a soft currency may only be bought, but not sold. In the bank, |
| 107 | there is a list of hard currencies to allow bank keepers to know which |
| 108 | currencies are in fact hard currencies. With simple data types, we would |
| 109 | have to perform the following nasty operation for every exchange |
| 110 | transaction: |
| 111 | |
| 112 | int exchange(string str) { |
| 113 |     string from, to; |
| 114 |     int amt; |
| 115 | |
| 116 |     if(!str) return 0; |
| 117 |     if(sscanf(str, "%d %s for %s", amt, from, to) != 3) |
| 118 |         return 0; |
| 119 |     if(from != "platinum" && from != "gold" && |
| 120 |        from != "silver" && from != "electrum" && |
| 121 |        from != "copper") { |
| 122 |         notify_fail("We do not buy soft currencies!\n"); |
| 123 |         return 0; |
| 124 |     } |
| 125 |     ... |
| 126 | } |
| 127 | |
| 128 | With five hard currencies, we have a rather simple example. After all, it |
| 129 | took only two lines of code to represent the if statement which filtered |
| 130 | out bad currencies. But what if you had to check against all the names |
| 131 | which cannot be used to make characters in the game? There might be |
| 132 | 100 of those; would you want to write a 100-part if statement? |
| 133 | What if you wanted to add a currency to the list of hard currencies? That |
| 134 | means you would have to change every check in the game for hard |
| 135 | currencies to add one more part to the if clauses. Arrays allow you |
| 136 | simple access to groups of related data so that you do not have to deal |
| 137 | with each individual value every time you want to perform a group |
| 138 | operation. |
| 139 | |
| 140 | As a constant, an array might look like this: |
| 141 | ({ "platinum", "gold", "silver", "electrum", "copper" }) |
| 142 | which is an array of type string. Individual data values in arrays are |
| 143 | called elements, or sometimes members. In code, just as constant |
| 144 | strings are represented by surrounding them with "", constant arrays are |
| 145 | represented by being surrounded by ({ }), with individual elements of |
| 146 | the array being separated by commas. |
| 147 | |
| 148 | You may have arrays of any LPC data type, simple or complex. Arrays |
| 149 | made up of mixes of values are called arrays of mixed type. In most |
| 150 | LPC drivers, you declare an array using a throw-back to C language |
| 151 | syntax for arrays. This syntax is often confusing for LPC coders |
| 152 | because the syntax has a meaning in C that simply does not translate into |
| 153 | LPC. Nevertheless, if we wanted an array of type string, we would |
| 154 | declare it in the following manner: |
| 155 | |
| 156 | string *arr; |
| 157 | |
| 158 | In other words, the data type of the elements it will contain followed by |
| 159 | a space and an asterisk. Remember, however, that this newly declared |
| 160 | string array has a NULL value in it at the time of declaration. |
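
To make the point about the NULL value concrete, here is a minimal sketch of declaring a global array and giving it a value in create(). The variable name arr is just the one used above. The empty array ({}) plays the same role for arrays that "" plays for strings.

```lpc
string *arr;            /* declared, but it holds no array yet */

void create() {
    arr = ({});         /* the empty array: sizeof(arr) is now 0 */
}
```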
| 161 | |
| 162 | 3.4 Using Arrays |
| 163 | You now should understand how to declare and recognize an array in |
| 164 | code. In order to understand how they work in code, let's review the |
| 165 | bank code, this time using arrays: |
| 166 | |
| 167 | string *hard_currencies; |
| 168 | |
| 169 | int exchange(string str) { |
| 170 |     string from, to; |
| 171 |     int amt; |
| 172 | |
| 173 |     if(!str) return 0; |
| 174 |     if(sscanf(str, "%d %s for %s", amt, from, to) != 3) |
| 175 |         return 0; |
| 176 |     if(member_array(from, hard_currencies) == -1) { |
| 177 |         notify_fail("We do not buy soft currencies!\n"); |
| 178 |         return 0; |
| 179 |     } |
| 180 |     ... |
| 181 | } |
| 182 | |
| 183 | This code assumes hard_currencies is a global variable and is initialized |
| 184 | in create() as: |
| 185 | hard_currencies = ({ "platinum", "gold", "electrum", "silver", |
| 186 | "copper" }); |
| 187 | Ideally, you would have hard currencies as a #define in a header file for |
| 188 | all objects to use, but #define is a topic for a later chapter. |
| 189 | |
| 190 | Once you know what the member_array() efun does, this method |
| 191 | certainly is much easier to read, and it is also more efficient and |
| 192 | easier to code. In fact, you can probably guess what the |
| 193 | member_array() efun does: It tells you if a given value is a member of |
| 194 | the array in question. Specifically here, we want to know if the currency |
| 195 | the player is trying to sell is an element in the hard_currencies array. |
| 196 | What might be confusing to you is, not only does member_array() tell us |
| 197 | if the value is an element in the array, but it in fact tells us which element |
| 198 | of the array the value is. |
| 199 | |
| 200 | How does it tell you which element? It is easier to understand arrays if |
| 201 | you think of the array variable as holding a number. In the example above, |
| 202 | for the sake of argument, we will say that hard_currencies holds the |
| 203 | value 179000. This value tells the driver where to look for the array |
| 204 | hard_currencies represents. Thus, hard_currencies points to a place |
| 205 | where the array values may be found. When someone is talking about |
| 206 | the first element of the array, they want the element located at 179000. |
| 207 | When the object needs the value of the second element of the array, it |
| 208 | looks at 179000 + one value, then 179000 plus two values for the third, |
| 209 | and so on. We can therefore access individual elements of an array by |
| 210 | their index, which is the number of values beyond the starting point of |
| 211 | the array we need to look to find the value. For the |
| 212 | hard_currencies array: |
| 213 | "platinum" has an index of 0. |
| 214 | "gold" has an index of 1. |
| 215 | "electrum" has an index of 2. |
| 216 | "silver" has an index of 3. |
| 217 | "copper" has an index of 4. |
| 218 | |
| 219 | The efun member_array() thus returns the index of the element being |
| 220 | tested if it is in the array, or -1 if it is not in the array. In order to |
| 221 | reference an individual element in an array, you use its index number in |
| 222 | the following manner: |
| 223 | array_name[index_no] |
| 224 | Example: |
| 225 | hard_currencies[3] |
| 226 | where hard_currencies[3] would refer to "silver". |
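
Putting the index and member_array() together, a lookup might be sketched like this. The functions is_hard_currency() and show_lookup() are hypothetical names for illustration. Note also that some drivers provide this efun under a different name and with a different argument order, so check your driver's documentation.

```lpc
string *hard_currencies;

void create() {
    hard_currencies = ({ "platinum", "gold", "electrum", "silver",
                         "copper" });
}

/* Hypothetical helper: returns 1 if type is a hard currency, 0 otherwise. */
int is_hard_currency(string type) {
    return member_array(type, hard_currencies) != -1;
}

/* Hypothetical demonstration of indices. */
void show_lookup() {
    int i;

    i = member_array("silver", hard_currencies);    /* i is 3 */
    if(i != -1)
        write(hard_currencies[i] + "\n");           /* writes "silver" */
    i = member_array("ruby", hard_currencies);      /* i is -1 */
}
```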
| 227 | |
| 228 | So, you should now know several ways in which arrays appear either as |
| 229 | a whole or as individual elements. As a whole, you refer to an array |
| 230 | variable by its name and an array constant by enclosing the array in ({ }) |
| 231 | and separating elements with commas. Individually, you refer to array variables |
| 232 | by the array name followed by the element's index number enclosed in |
| 233 | [], and to array constants in the same way you would refer to simple data |
| 234 | types of the same type as the constant. Examples: |
| 235 | |
| 236 | Whole arrays: |
| 237 | variable: arr |
| 238 | constant: ({ "platinum", "gold", "electrum", "silver", "copper" }) |
| 239 | |
| 240 | Individual members of arrays: |
| 241 | variable: arr[2] |
| 242 | constant: "electrum" |
| 243 | |
| 244 | You can use these means of reference to do all the things you are used to |
| 245 | doing with other data types. You can assign values, use the values in |
| 246 | operations, pass the values as parameters to functions, and use the |
| 247 | values as return values. It is important to remember that when you are |
| 248 | treating an element alone as an individual, the individual element is not |
| 249 | itself an array (unless you are dealing with an array of arrays). In the |
| 250 | example above, the individual elements are strings. So that: |
| 251 | str = arr[3] + " and " + arr[1]; |
| 252 | will set str equal to "silver and gold". Although this seems simple |
| 253 | enough, many people new to arrays start to run into trouble when trying |
| 254 | to add elements to an array. When you are treating an array as a whole |
| 255 | and you wish to add a new element to it, you must do it by adding |
| 256 | another array. |
| 257 | |
| 258 | Note the following example: |
| 259 | string str1, str2; |
| 260 | string *arr; |
| 261 | |
| 262 | str1 = "hi"; |
| 263 | str2 = "bye"; |
| 264 | /* str1 + str2 equals "hibye" */ |
| 265 | arr = ({ str1 }) + ({ str2 }); |
| 266 | /* arr is equal to ({ str1, str2 }) */ |
| 267 | Before going any further, I have to note that this example gives an |
| 268 | extremely horrible way of building an array. You should simply write: arr = |
| 269 | ({ str1, str2 }). The point of the example, however, is that you must add |
| 270 | like types together. If you try adding an element to an array as the data |
| 271 | type it is, you will get an error. Instead you have to treat it as an array of |
| 272 | a single element. |
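
A more realistic use of this rule is tacking one new element onto an existing array. The following is a sketch only; add_currency() is a hypothetical helper name.

```lpc
/* Hypothetical helper: returns list with type added as a new last element. */
string *add_currency(string *list, string type) {
    /* list + type would be an error: that is an array plus a string. */
    return list + ({ type });
}
```

Calling add_currency(({ "gold" }), "silver") gives back ({ "gold", "silver" }).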
| 273 | |
| 274 | 3.5 Mappings |
| 275 | One of the major advances made in LPMuds since they were created is |
| 276 | the mapping data type. People also refer to them as associative |
| 277 | arrays. Practically speaking, a mapping allows you freedom from the |
| 278 | association of a numerical index to a value which arrays require. |
| 279 | Instead, mappings allow you to associate values with indices which |
| 280 | actually have meaning to you, much like a relational database. |
| 281 | |
| 282 | In an array of 5 elements, you access those values solely by their integer |
| 283 | indices which cover the range 0 to 4. Imagine going back to the example |
| 284 | of money again. Players have money of different amounts and different |
| 285 | types. In the player object, you need a way to store the types of money |
| 286 | that exist as well as relate them to the amount of that currency type the |
| 287 | player has. The best way to do this with arrays would have been to |
| 288 | store an array of strings representing money types and an array of |
| 289 | integers representing values in the player object. This would result in |
| 290 | CPU-eating ugly code like this: |
| 291 | |
| 292 | int query_money(string type) { |
| 293 |     int i; |
| 294 | |
| 295 |     i = member_array(type, currencies); |
| 296 |     /* the sizeof() efun returns the number of elements */ |
| 297 |     if(i > -1 && i < sizeof(amounts)) |
| 298 |         return amounts[i]; |
| 299 |     else return 0; |
| 300 | } |
| 301 | |
| 302 | And that is a simple query function. Look at an add function: |
| 303 | |
| 304 | void add_money(string type, int amt) { |
| 305 |     string *tmp1; |
| 306 |     int *tmp2; |
| 307 |     int i, x, j, maxj; |
| 308 | |
| 309 |     i = member_array(type, currencies); |
| 310 |     if(i >= sizeof(amounts))   /* corrupt data, we are in |
| 311 |                                   a bad way */ |
| 312 |         return; |
| 313 |     else if(i == -1) { |
| 314 |         currencies += ({ type }); |
| 315 |         amounts += ({ amt }); |
| 316 |         return; |
| 317 |     } |
| 318 |     else { |
| 319 |         amounts[i] += amt; |
| 320 |         if(amounts[i] < 1) { |
| 321 |             tmp1 = allocate(sizeof(currencies)-1); |
| 322 |             tmp2 = allocate(sizeof(amounts)-1); |
| 323 |             for(j=0, x=0, maxj=sizeof(tmp1); j < maxj; |
| 324 |               j++) { |
| 325 |                 if(j==i) x = 1; |
| 326 |                 tmp1[j] = currencies[j+x]; |
| 327 |                 tmp2[j] = amounts[j+x]; |
| 328 |             } |
| 329 |             currencies = tmp1; |
| 330 |             amounts = tmp2; |
| 331 |         } |
| 332 |     } |
| 333 | } |
| 334 | |
| 335 | That is really some nasty code to perform the rather simple concept of |
| 336 | adding some money. First, we figure out if the player has any of that |
| 337 | kind of money, and if so, which element of the currencies array it is. |
| 338 | After that, we have to check to see that the integrity of the currency data |
| 339 | has been maintained. If the index of the type in the currencies array is |
| 340 | greater than the highest index of the amounts array, then we have a |
| 341 | problem since the indices are our only way of relating the two arrays. |
| 342 | Once we know our data is intact, if the currency type is not currently |
| 343 | held by the player, we simply tack on the type as a new element to the |
| 344 | currencies array and the amount as a new element to the amounts array. |
| 345 | Finally, if it is a currency the player currently has, we just add the |
| 346 | amount to the corresponding index in the amounts array. If the money |
| 347 | gets below 1, meaning having no money of that type, we want to clear |
| 348 | the currency out of memory. |
| 349 | |
| 350 | Subtracting an element from an array is no simple matter. Take, for |
| 351 | example, the result of the following: |
| 352 | |
| 353 | string *arr; |
| 354 | |
| 355 | arr = ({ "a", "b", "a" }); |
| 356 | arr -= ({ arr[2] }); |
| 357 | |
| 358 | What do you think the final value of arr is? Well, it is: |
| 359 | ({ "b", "a" }) |
| 360 | Subtracting arr[2] from the original array does not remove the third |
| 361 | element from the array. Instead, it subtracts the value of the third |
| 362 | element of the array from the array. And array subtraction removes the |
| 363 | first instance of the value from the array. Since we do not want to be |
| 364 | forced to count on the elements of the array being unique, we are |
| 365 | forced to go through some somersaults to remove the correct element |
| 366 | from both arrays in order to maintain the correspondence of the indices |
| 367 | in the two arrays. |
| 368 | <* NOTE Highlander@MorgenGrauen 11.2.94: |
| 369 | WRONG in MorgenGrauen (at least). The result is actually ({ "b" }). Array |
| 370 | subtraction removes ALL instances of the subtracted value from the array. |
| 371 | This holds true for all Amylaar-driver LPMuds. |
| 372 | *> |
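
If your driver supports range indexing of arrays (arr[a..b], with both ends included), removing an element by its position rather than by its value can be sketched more compactly than the loop in add_money(). This is an assumption about your driver, and remove_index() is a hypothetical helper name, not an efun.

```lpc
/* Hypothetical helper: returns a copy of arr without the element at
   index i. Assumes the driver supports arr[a..b] range indexing. */
mixed *remove_index(mixed *arr, int i) {
    int last;

    last = sizeof(arr) - 1;
    if(i < 0 || i > last) return arr;       /* index out of range */
    if(last == 0) return ({});              /* removing the only element */
    if(i == 0) return arr[1..last];         /* drop the first element */
    if(i == last) return arr[0..last-1];    /* drop the last element */
    return arr[0..i-1] + arr[i+1..last];    /* join the two halves */
}
```

With this, remove_index(({ "a", "b", "a" }), 2) gives ({ "a", "b" }) no matter how your driver handles array subtraction. To keep two parallel arrays in step, you would call it once on each array with the same index. Drivers with strict type checking may want a cast on the result.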
| 373 | |
| 374 | Mappings provide a better way. They allow you to directly associate the |
| 375 | money type with its value. Some people think of mappings as arrays |
| 376 | where you are not restricted to integers as indices. Truth is, mappings |
| 377 | are an entirely different concept in storing aggregate information. Arrays |
| 378 | force you to choose an index which is meaningful to the machine for |
| 379 | locating the appropriate data. The indices tell the machine how many |
| 380 | elements beyond the first value the value you desire can be found. With |
| 381 | mappings, you choose indices which are meaningful to you without |
| 382 | worrying about how the machine locates and stores it. |
| 383 | |
| 384 | You may recognize mappings in the following forms: |
| 385 | |
| 386 | constant values: |
| 387 | whole: ([ index:value, index:value ]) Ex: ([ "gold":10, "silver":20 ]) |
| 388 | element: 10 |
| 389 | |
| 390 | variable values: |
| 391 | whole: map (where map is the name of a mapping variable) |
| 392 | element: map["gold"] |
| 393 | |
| 394 | So now my monetary functions would look like: |
| 395 | |
| 396 | int query_money(string type) { return money[type]; } |
| 397 | |
| 398 | void add_money(string type, int amt) { |
| 399 |     if(!money[type]) money[type] = amt; |
| 400 |     else money[type] += amt; |
| 401 |     if(money[type] < 1) |
| 402 |         map_delete(money, type);   /* this is for |
| 403 |                                       MudOS */ |
| 404 |     /* ...OR... |
| 405 |        money = m_delete(money, type);   for some |
| 406 |                                         LPMud 3.* varieties |
| 407 |        ...OR... |
| 408 |        m_delete(money, type);           for other LPMud 3.* |
| 409 |                                         varieties */ |
| 410 | } |
| 411 | |
| 412 | Please notice first that the efuns for clearing a mapping element from the |
| 413 | mapping vary from driver to driver. Check with your driver's |
| 414 | documentation for the exact name and syntax of the relevant efun. |
| 415 | |
| 416 | As you can see immediately, you do not need to check the integrity of |
| 417 | your data since the values which interest you are inextricably bound to |
| 418 | one another in the mapping. Secondly, getting rid of useless values is a |
| 419 | simple efun call rather than a tricky, CPU-eating loop. Finally, the |
| 420 | query function is made up solely of a return instruction. |
| 421 | |
| 422 | You must declare and initialize any mapping before using it. |
| 423 | Declarations look like: |
| 424 | mapping map; |
| 425 | Whereas common initializations look like: |
| 426 | map = ([]); |
| 427 | map = allocate_mapping(10); ...OR... map = m_allocate(10); |
| 428 | map = ([ "gold": 20, "silver": 15 ]); |
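
Here is a minimal sketch of the whole life of a mapping variable: declaration, initialization in create(), adding keys, and a query. The names are only illustrative. On the drivers I know of, asking for a key which is not in the mapping gives you 0.

```lpc
mapping money;          /* declared, but not yet usable */

void create() {
    money = ([]);                 /* start with an empty mapping */
    money["gold"] = 20;           /* add a key and its value */
    money["silver"] = 15;
}

int query_money(string type) {
    return money[type];           /* 0 if the key is not in the mapping */
}
```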
| 429 | |
| 430 | As with other data types, there are rules defining how they work in |
| 431 | common operations like addition and subtraction: |
| 432 | ([ "gold":20, "silver":30 ]) + ([ "electrum":5 ]) |
| 433 | gives: |
| 434 | (["gold":20, "silver":30, "electrum":5]) |
| 435 | Although my demonstration shows a continuity of order, there is in fact |
| 436 | no guarantee of the order in which elements of mappings will be stored. |
| 437 | Equivalence tests among mappings are therefore not a good thing. |
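
Since order cannot be relied upon, the safer habit is to test the contents of a mapping key by key. A small sketch, where show_merge() is a hypothetical function name:

```lpc
/* Hypothetical demonstration of mapping addition. */
void show_merge() {
    mapping purse;

    purse = ([ "gold": 20, "silver": 30 ]) + ([ "electrum": 5 ]);
    /* Test the contents key by key instead of relying on order. */
    if(purse["gold"] == 20 && purse["silver"] == 30 &&
       purse["electrum"] == 5)
        write("All three currencies are present.\n");
}
```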
| 438 | |
| 439 | 3.6 Summary |
| 440 | Mappings and arrays can be built as complex as you need them to be. |
| 441 | You can have an array of mappings of arrays. Such a thing would be |
| 442 | declared like this: |
| 443 | |
| 444 | mapping *map_of_arrs; |
| 445 | which might look like: |
| 446 | ({ ([ ind1: ({ valA1, valA2 }), ind2: ({ valB1, valB2 }) ]), |
| 447 |    ([ indX: ({ valX1, valX2 }) ]) }) |
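
To reach a value buried in such a structure, you index one level at a time, from the outside in. The following sketch uses made-up data; shops and first_weapon() are hypothetical names.

```lpc
mapping *shops;

void create() {
    shops = ({
        ([ "weapons": ({ "sword", "axe" }),
           "armour": ({ "helm", "shield" }) ]),
        ([ "food": ({ "bread", "ale" }) ])
    });
}

string first_weapon() {
    /* shops[0] is a mapping, shops[0]["weapons"] is an array,
       and shops[0]["weapons"][0] is the string "sword". */
    return shops[0]["weapons"][0];
}
```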
| 448 | |
| 449 | Mappings may use any data type as an index, including objects. |
| 450 | Mapping indices are often referred to as keys as well, a term from |
| 451 | databases. Always keep in mind that with any non-integer data type, |
| 452 | you must first initialize a variable before making use of it in common |
| 453 | operations such as addition and subtraction. In spite of the ease and |
| 454 | dynamics added to LPC coding by mappings and arrays, errors caused |
| 455 | by failing to initialize their values can be the most maddening experience |
| 456 | for people new to these data types. I would venture that a very high |
| 457 | percentage of all errors people experimenting with mappings and arrays |
| 458 | for the first time encounter are one of three error messages: |
| 459 | Indexing on illegal type. |
| 460 | Illegal index. |
| 461 | Bad argument 1 to (+ += - -=) /* insert your favourite operator */ |
| 462 | Error messages 1 and 3 are darn near always caused by a failure |
| 463 | to initialize the array or mapping in question. Error message 2 is caused |
| 464 | generally when you are trying to use an index in an initialized array |
| 465 | which does not exist. Also, for arrays, often people new to arrays will |
| 466 | get error message 3 because they try to add a single element to an array |
| 467 | by adding the initial array to the single element value instead of adding |
| 468 | an array of the single element to the initial array. Remember, add only |
| 469 | arrays to arrays. |
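
The usual cures for all three errors can be sketched in a few lines. The names here (names, add_name(), query_name()) are hypothetical, and the exact wording of the error messages depends on your driver.

```lpc
string *names;          /* holds no array until first used */

void add_name(string str) {
    if(!names) names = ({});    /* cure for messages 1 and 3:
                                   initialize before use */
    names += ({ str });         /* array plus array; names += str
                                   would be an error */
}

string query_name(int i) {
    /* cure for message 2: never index outside the array */
    if(!names || i < 0 || i >= sizeof(names)) return 0;
    return names[i];
}
```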
| 470 | |
| 471 | At this point, you should feel comfortable enough with mappings and |
| 472 | arrays to play with them. Expect to encounter the above error messages |
| 473 | a lot when first playing with these. The key to success with mappings is |
| 474 | in debugging all of these errors and seeing exactly what causes holes |
| 475 | in your programming which allow you to try to work with uninitialized |
| 476 | mappings and arrays. Finally, go back through the basic room code and |
| 477 | look at things like the set_exits() (or the equivalent on your mudlib) |
| 478 | function. Chances are it makes use of mappings. In some instances, it |
| 479 | will use arrays as well for compatibility with mudlib.n. |
| 480 | |
| 481 | Copyright (c) George Reese 1993 |