Intermediate LPC
Descartes of Borg
November 1993

Chapter 5: Advanced String Handling

5.1 What a String Is
The LPC Basics textbook taught strings as simple data types. LPC
generally deals with strings in such a manner. The underlying driver
program, however, is written in C, which has no string data type. The
driver in fact sees strings as a complex data type made up of an array of
characters, a simple C data type. LPC, on the other hand, does not
recognize a character data type (there may actually be a driver or two out
there which do recognize the character as a data type, but in general not).
The net effect is that there are some array-like things you can do with
strings that you cannot do with other LPC data types.

The first efun regarding strings you should learn is the strlen() efun.
This efun returns the length in characters of an LPC string, and is thus
the string equivalent to sizeof() for arrays. Just from the behaviour of
this efun, you can see that the driver treats a string as if it were made up
of smaller elements. In this chapter, you will learn how to deal with
strings on a more basic level, as characters and substrings.
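
As a small illustration of the parallel between strlen() and sizeof(),
the sketch below reports the length of a string and of an array. The
function name show_lengths() and the sample values are made up for this
example, and it assumes your driver provides strlen() as described above.

    void show_lengths() {
        string str;
        int *arr;

        str = "sword";
        arr = ({ 10, 20, 30 });
        /* strlen() counts characters; sizeof() counts array elements */
        write("The string has " + strlen(str) + " characters.\n");
        write("The array has " + sizeof(arr) + " elements.\n");
    }

Here strlen(str) is 5 and sizeof(arr) is 3.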

5.2 Strings as Character Arrays
You can do nearly anything with strings that you can do with arrays,
except assign values on a character basis. At the most basic, you can
actually refer to character constants by enclosing them in '' (single
quotes). 'a' and "a" are therefore very different things in LPC. 'a'
represents a character which cannot be used in assignment statements or
any other operations except comparison evaluations. "a" on the other
hand is a string made up of a single character. You can add and subtract
other strings to it and assign it as a value to a variable.

With string variables, you can access the individual characters to run
comparisons against character constants using exactly the same syntax
that is used with arrays. In other words, the statement:
    if(str[2] == 'a')
is a valid LPC statement comparing the third character in the str string
(indexes start at 0) to the character 'a'. You have to be very careful
that you are not comparing elements of arrays to characters, nor are you
comparing characters of strings to strings.
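
To make the indexing concrete, here is a minimal sketch that walks a
string one character at a time and compares each character to a
character constant. The function name count_a() is hypothetical and
not part of any mudlib.

    int count_a(string str) {
        int i, count;

        count = 0;
        /* visit every valid index: 0 through strlen(str) - 1 */
        for(i = 0; i < strlen(str); i++) {
            /* character compared to a character constant, not to "a" */
            if(str[i] == 'a') count++;
        }
        return count;
    }

Called as count_a("banana"), this returns 3.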

LPC also allows you to access several characters together using LPC's
range operator ..:
    if(str[0..1] == "ab")
In other words, you can look for the string which is formed by the
characters 0 through 1 in the string str. As with arrays, you must be
careful when using indexing or range operators so that you do not try to
reference an index number larger than the last index. Doing so will
result in an error.
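
One way to respect that limit is to compare lengths before taking a
range. The helper below is only a sketch; has_prefix() is a name
invented for this example.

    int has_prefix(string str, string prefix) {
        int len;

        len = strlen(prefix);
        if(len == 0) return 1;           /* the empty string starts any string */
        if(strlen(str) < len) return 0;  /* str is too short to hold prefix */
        /* indexes 0 through len - 1 are now known to exist in str */
        return str[0..len - 1] == prefix;
    }

With this guard, has_prefix("abcdef", "ab") returns 1, while
has_prefix("a", "ab") returns 0 without ever touching index 1.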

Now you can see a couple of similarities between strings and arrays:
1) You may index on both to access the values of individual elements.
    a) The individual elements of strings are characters
    b) The individual elements of arrays match the data type of the
       array.
2) You may operate on a range of values
    a) Ex: "abcdef"[1..3] is the string "bcd"
    b) Ex: ({ 1, 2, 3, 4, 5 })[1..3] is the int array ({ 2, 3, 4 })
<* NOTE Highlander@MorgenGrauen
   Also possible in MorgenGrauen (and in Amylaar-driver LPMuds in general):
   "abcdef"[2..] -> "cdef" and
   "abcdef"[1..<2] -> "bcde" (< means counting from the end, starting with 1)
*>

And of course, you should always keep in mind the fundamental
difference: a string is not made up of a more fundamental LPC data type.
In other words, you may not act on the individual characters by
assigning them values.
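
On a driver that behaves as described here, one possible way to
"change" a character is to build a new string from the ranges on either
side of it. The sketch below assumes nothing beyond strlen(), the range
operator and string addition; replace_char() and its parameter names
are hypothetical.

    string replace_char(string str, int pos, string repl) {
        string head, tail;
        int last;

        last = strlen(str) - 1;
        if(pos < 0 || pos > last) return str;   /* nothing to replace */
        /* take ranges only where they exist, so no index is out of bounds */
        if(pos > 0) head = str[0..pos - 1];
        else head = "";
        if(pos < last) tail = str[pos + 1..last];
        else tail = "";
        return head + repl + tail;
    }

For example, replace_char("sward", 2, "o") returns "sword". The
original string is left untouched; the result is a new string.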

5.3 The Efun sscanf()
You cannot do any decent string handling in LPC without using
sscanf(). Without it, you are left trying to play with the full strings
passed by command statements to the command functions. In other
words, you could not handle a command like: "give sword to leo", since
you would have no way of separating "sword to leo" into its constituent
parts. Command functions therefore use this efun in order to handle
commands with multiple arguments or to make commands more
"English-like".

Most people find the manual entries for sscanf() to be rather difficult
reading. The function does not lend itself well to the format used by
manual entries. As I said above, the function is used to take a string and
break it into usable parts. Technically it is supposed to take a string and
scan it into one or more variables of varying types. Take the example
above:

    int give(string str) {
        string what, whom;

        if(!str) return notify_fail("Give what to whom?\n");
        if(sscanf(str, "%s to %s", what, whom) != 2)
            return notify_fail("Give what to whom?\n");
        ... rest of give code ...
    }

The efun sscanf() takes three or more arguments. The first argument is
the string you want scanned. The second argument is called a control
string. The control string is a model which demonstrates in what form
the original string is written, and how it should be divided up. The rest
of the arguments are variables to which you will assign values based
upon the control string.

The control string is made up of three different types of elements: 1)
constants, 2) variable arguments to be scanned, and 3) variable
arguments to be discarded. You must pass sscanf() exactly as many
variable arguments as you have elements of type 2 in your control
string. In the above example, the control string was "%s to %s", which
is a three-element control string made up of one constant part (" to "),
and two variable arguments to be scanned ("%s"). There were no
variables to be discarded.

The control string basically indicates that the function should find the
string " to " in the string str. Whatever comes before that constant will
be placed into the first variable argument as a string. The same thing
will happen to whatever comes after the constant.
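
The split just described can be seen with a fixed input string. This is
only a demonstration sketch; show_split() is a made-up function name.

    void show_split() {
        string what, whom;

        if(sscanf("sword to leo", "%s to %s", what, whom) == 2) {
            /* what is now "sword" and whom is now "leo" */
            write("what: " + what + "\n");
            write("whom: " + whom + "\n");
        }
    }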

Variable elements are noted by a "%" sign followed by a code for
decoding them. If the variable element is to be discarded, the "%" sign
is followed by the "*" as well as the code for decoding the variable.
Common codes for variable element decoding are "s" for strings and "d"
for integers. In addition, your mudlib may support other conversion
codes, such as "f" for float. So in the example above, each "%s" in
the control string indicates that whatever lies in the original string in the
corresponding place will be scanned into a new variable as a string.

A simple exercise. How would you turn the string "145" into an
integer?

Answer:
    int x;
    sscanf("145", "%d", x);

After the sscanf() function, x will equal the integer 145.
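
In real code the string usually comes from a player, so it may not hold
a number at all. A hedged sketch of a wrapper that checks for this is
shown below; to_number() and its fallback parameter are hypothetical
names, and the check relies on sscanf() reporting how many variables it
filled.

    int to_number(string str, int fallback) {
        int x;

        if(!str) return fallback;
        /* sscanf() returns 1 only if an integer was scanned into x */
        if(sscanf(str, "%d", x) != 1) return fallback;
        return x;
    }

With this, to_number("145", 0) returns 145, while
to_number("sword", -1) returns -1.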

Whenever you scan a string against a control string, the function
searches the original string for the first instance of the control
string's first constant. For example, if your string is "magic attack 100"
and you have the following:
    int improve(string str) {
        string skill;
        int x;

        if(sscanf(str, "%s %d", skill, x) != 2) return 0;
        ...
    }
you would find that you have come up with the wrong return value for
sscanf() (more on the return values later). The control string, "%s %d",
is made up of two variables to be scanned and one constant. The constant
is " ". So the function searches the original string for the first instance
of " ", placing whatever comes before the " " into skill, and trying to
place whatever comes after the " " into x. This separates "magic attack
100" into the components "magic" and "attack 100". The function,
however, cannot make heads or tails of "attack 100" as an integer, so it
returns 1, meaning that 1 variable value was successfully scanned
("magic" into skill).
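
One possible workaround, sketched here using only the sscanf() behaviour
described above, is to try a control string for two-word skill names
first and fall back to the one-word form. The variables first and second
are introduced for illustration, and skill names of three or more words
would still need another approach.

    int improve(string str) {
        string skill, first, second;
        int x;

        if(!str) return 0;
        /* "magic attack 100": first = "magic", second = "attack", x = 100 */
        if(sscanf(str, "%s %s %d", first, second, x) == 3)
            skill = first + " " + second;
        /* "stealth 100": skill = "stealth", x = 100 */
        else if(sscanf(str, "%s %d", skill, x) != 2)
            return 0;
        /* skill and x are now safe to use */
        return 1;
    }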

Perhaps you guessed from the above examples, but the efun sscanf()
returns an int, which is the number of variables into which values from
the original string were successfully scanned. Some examples with
return values for you to examine:

    sscanf("swo rd descartes", "%s to %s", str1, str2)              return: 0
    sscanf("swo rd descartes", "%s %s", str1, str2)                 return: 2
    sscanf("200 gold to descartes", "%d %s to %s", x, str1, str2)   return: 3
    sscanf("200 gold to descartes", "%d %*s to %s", x, str1)        return: 2
where x is an int and str1 and str2 are strings
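
Because the return value says how much of the control string matched,
it can be used to choose between several command forms. The following
is a sketch of the earlier give() function extended with an optional
amount; the amount variable and the message written at the end are
assumptions made for this example.

    int give(string str) {
        string what, whom;
        int amount;

        if(!str) return notify_fail("Give what to whom?\n");
        /* "200 gold to descartes" scans all three variables */
        if(sscanf(str, "%d %s to %s", amount, what, whom) != 3) {
            /* "sword to descartes" scans two; treat it as one item */
            if(sscanf(str, "%s to %s", what, whom) != 2)
                return notify_fail("Give what to whom?\n");
            amount = 1;
        }
        write("Giving " + amount + " " + what + " to " + whom + ".\n");
        return 1;
    }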

5.4 Summary
LPC strings can be thought of as arrays of characters, as long as you
keep in mind that LPC does not have the character data type (with
most, but not all drivers). Since the character is not a true LPC data
type, you cannot act upon individual characters in an LPC string in the
same manner you would act upon different data types. Noticing the
intimate relationship between strings and arrays nevertheless makes it
easier to understand such concepts as the range operator and indexing on
strings.

There are efuns other than sscanf() which involve advanced string
handling; however, they are not needed nearly as often. You should
check on your mud for man or help files on the efuns: explode(),
implode(), replace_string(), sprintf(). All of these are very valuable
tools, especially if you intend to do coding at the mudlib level.
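
As a brief taste of two of these, the sketch below splits a string into
words with explode() and joins them again with implode(). The function
name show_words() is invented for this example, and the handling of
leading or trailing delimiters may differ between drivers, so check
your mud's man pages before relying on edge cases.

    void show_words() {
        string *words;
        string joined;

        /* explode() splits a string into an array at each " " */
        words = explode("magic attack 100", " ");
        /* words is now ({ "magic", "attack", "100" }) */
        joined = implode(words, ", ");
        /* joined is now "magic, attack, 100" */
        write(joined + "\n");
    }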

Copyright (c) George Reese 1993