MG Mud User | 88f1247 | 2016-06-24 23:31:02 +0200 | [diff] [blame^] | 1 | Intermediate LPC |
| 2 | Descartes of Borg |
| 3 | November 1993 |
| 4 | |
| 5 | Chapter 7: Debugging |
| 6 | |
| 7 | 7.1 Types of Errors |
| 8 | By now, you have likely run into errors here, there, and everywhere. In |
| 9 | general, there are three sorts of errors you might see: compile time |
| 10 | errors, run time errors, and malfunctioning code. On most muds you |
| 11 | will find a personal file where your compile time errors are logged. For |
| 12 | the most part, this file can be found either in your home directory as the |
| 13 | file named "log" or ".log", or somewhere in the directory "/log" as a file |
| 14 | with your name.. In addition, muds tend to keep a log of run time errors |
| 15 | which occur while the mud is up. Again, this is generally found in |
| 16 | "/log". On MudOS muds it is called "debug.log". On other muds it may |
| 17 | be called something different like "lpmud.log". Ask your administrators |
| 18 | where compile time and run time errors are each logged if you do not |
| 19 | already know. |
| 20 | |
| 21 | Compile time errors are errors which occur when the driver tries to load |
| 22 | an object into memory. If, when the driver is trying to load an object |
| 23 | into memory, it encounters things which it simply does not understand |
| 24 | with respect to what you wrote, it will fail to load it into memory and log |
| 25 | why it could not load the object into your personal error log. The most |
| 26 | common compile time errors are typos, missing or extra (), {}. [], or "", |
| 27 | and failure to declare properly functions and variables used by the |
| 28 | object. |
| 29 | |
| 30 | Run time errors occur when something wrong happens to an object in |
| 31 | memory while it is executing a statement. For example, the driver |
| 32 | cannot tell whether the statement "x/y" will be valid in all circumstances. |
| 33 | In fact, it is a valid LPC expression. Yet, if the value of y is 0, then a |
| 34 | run time error will occur since you cannot divide by 0. When the driver |
| 35 | runs across an error during the execution of a function, it aborts |
| 36 | execution of the function and logs an error to the game's run time error |
| 37 | log. It will also show the error to this_player(), if defined, if the player |
| 38 | is a creator, or it will show "What?" to players. Most common causes |
| 39 | for run time errors are bad values and trying to perform operations with |
| 40 | data types for which those operations are not defined. |
| 41 | |
| 42 | The most insideous type of error, however, is plain malfunctioning |
| 43 | code. These errors do not log, since the driver never really realizes that |
| 44 | anything is wrong. In short, this error happens when you think the code |
| 45 | says one thing, but in fact it says another thing. People too often |
| 46 | encounter this bug and automatically insist that it must be a mudlib or |
| 47 | driver bug. Everyone makes all types of errors though, and more often |
| 48 | than not when code is not functioning the way you should, it will be |
| 49 | because you misread it. |
| 50 | |
| 51 | 7.2 Debugging Compile Time Errors |
| 52 | Compile time errors are certainly the most common and simplest bugs to |
| 53 | debug. New coders often get frustrated by them due to the obscure |
| 54 | nature of some error messages. Nevertheless, once a person becomes |
| 55 | used to the error messages generated by their driver, debugging compile |
| 56 | time errors becomes utterly routine. |
| 57 | |
| 58 | In your error log, the driver will tell you the type of error and on which |
| 59 | line it finally noticed there was an error. Note that this is not on which |
| 60 | line the actual error necessarily exists. The most common compile time |
| 61 | error, besides the typo, is the missing or superfluous parentheses, |
| 62 | brackets, braces, or quotes. Yet this error is the one that most baffles |
| 63 | new coders, since the driver will not notice the missing or extra piece |
| 64 | until well after the original. Take for example the following code: |
| 65 | |
| 66 | 1 int test(string str) { |
| 67 | 2 int x; |
| 68 | 3 for(x =0; x<10; x++) |
| 69 | 4 write(x+"\n"); |
| 70 | 5 } |
| 71 | 6 write("Done.\n"); |
| 72 | 7 } |
| 73 | |
| 74 | Depending on what you intended, the actual error here is either at line 3 |
| 75 | (meaning you are missing a {) or at line 5 (meaing you have an extra }). |
| 76 | Nevertheless, the driver will report that it found an error when it gets to |
| 77 | line 6. The actual driver message may vary from driver to driver, but no |
| 78 | matter which driver, you will see an error on line 6, since the } in line 5 |
| 79 | is interpreted as ending the function test(). At line 6, the driver sees that |
| 80 | you have a write() sitting outside any function definition, and thus |
| 81 | reports an error. Generally, the driver will also go on to report that it |
| 82 | found an error at line 7 in the form of an extra }. |
| 83 | |
| 84 | The secret to debugging these is coding style. Having closing } match |
| 85 | up vertically with the clauses they close out helps you see where you are |
| 86 | missing them when you are debugging code. Similarly, when using |
| 87 | multiple sets of parentheses, space out different groups like this: |
| 88 | if( (x=sizeof(who=users()) > ( (y+z)/(a-b) + (-(random(7))) ) ) |
| 89 | As you can see, the parentheses for the for() statement, are spaced out |
| 90 | from the rest of the statement. In addition, individual sub-groups are |
| 91 | spaced so they can easily be sorted out in the event of an error. |
| 92 | |
| 93 | Once you have a coding style which aids in picking these out, you learn |
| 94 | which error messages tend to indicate this sort of error. When |
| 95 | debugging this sort of error, you then view a section of code before and |
| 96 | after the line in question. In most all cases, you will catch the bug right |
| 97 | off. |
| 98 | |
| 99 | Another common compile time error is where the driver reports an |
| 100 | unknown identifier. Generally, typos and failure to declare variables |
| 101 | causes this sort of error. Fortunately, the error log will almost always |
| 102 | tell you exactly where the error is. So when debugging it, enter the |
| 103 | editor and find the line in question. If the problem is with a variable and |
| 104 | is not a typo, make sure you declared it properly. On the other hand, if |
| 105 | it is a typo, simply fix it! |
| 106 | |
| 107 | One thing to beware of, however, is that this error will sometimes be |
| 108 | reported in conjunction with a missing parentheses, brackets, or braces |
| 109 | type error. In these situations, your problem with an unknown identifier |
| 110 | is often bogus. The driver misreads the way the {} or whatever are |
| 111 | setup, and thus gets variable declarations confused. Therefore make |
| 112 | sure all other compile time errors are corrected before bothering with |
| 113 | these types of errors. |
| 114 | |
| 115 | In the same class with the above error, is the general syntax error. The |
| 116 | driver generates this error when it simply fails to understand what you |
| 117 | said. Again, this is often caused by typos, but can also be caused by not |
| 118 | properly understanding the syntax of a certain feature like writing a for() |
| 119 | statement: for(x=0, x<10, x++). If you get an error like this which is |
| 120 | not a syntax error, try reviewing the syntax of the statement in which the |
| 121 | error is occurring. |
| 122 | |
| 123 | 7.3 Debugging Run Time Errors |
| 124 | Run time errors are much more complex than their compile time |
| 125 | counterparts. Fortunately these errors do get logged, though many |
| 126 | creators do not realise or they do not know where to look. The error log |
| 127 | for run time errors are also generally much more detailed than compile |
| 128 | time errors, meaning that you can trace the history of the execution train |
| 129 | from where it started to where it went wrong. You therefore can setup |
| 130 | debugging traps using precompiler statements much easier using these |
| 131 | logs. Run time errors, however, tend to result from using more |
| 132 | complex codign techniques than beginners tend to use, which means you |
| 133 | are left with errors which are generally more complicated than simple |
| 134 | compile time errors. |
| 135 | |
| 136 | Run time errors almost always result from misusing LPC data types. |
| 137 | Most commonly, trying to do call others using object variables which are |
| 138 | NULL, indexing on mapping, array, or string variables which are |
| 139 | NULL, or passing bad arguments to functions. We will look at a real |
| 140 | run time error log from Nightmare: |
| 141 | |
| 142 | Bad argument 1 to explode() |
| 143 | program: bin/system/_grep.c, object: bin/system/_grep |
| 144 | line 32 |
| 145 | ' cmd_hook' in ' std/living.c' (' |
| 146 | std/user#4002')line 83 |
| 147 | ' cmd_grep' in ' bin/system/_grep.c' (' |
| 148 | bin/system/_grep')line 32 |
| 149 | Bad argument 2 to message() |
| 150 | program: adm/obj/simul_efun.c, object: adm/obj/simul_efun |
| 151 | line 34 |
| 152 | ' cmd_hook' in ' std/living.c' (' |
| 153 | std/user#4957')line 83 |
| 154 | ' cmd_look' in ' bin/mortal/_look.c' (' |
| 155 | bin/mortal/_look')line 23 |
| 156 | ' examine_object' in ' bin/mortal/_look.c' (' |
| 157 | bin/mortal/_look')line 78 |
| 158 | ' write' in 'adm/obj/simul_efun.c' (' |
| 159 | adm/obj/simul_efun')line 34 |
| 160 | Bad argument 1 to call_other() |
| 161 | program: bin/system/_clone.c, object: bin/system/_clone |
| 162 | line 25 |
| 163 | ' cmd_hook' in ' std/living.c' (' |
| 164 | std/user#3734')line 83 |
| 165 | ' cmd_clone' in ' bin/system/_clone.c' (' |
| 166 | bin/system/_clone')line 25 |
| 167 | Illegal index |
| 168 | program: std/monster.c, object: |
| 169 | wizards/zaknaifen/spy#7205 line 76 |
| 170 | ' heart_beat' in ' std/monster.c' |
| 171 | ('wizards/zaknaifen/spy#7205')line |
| 172 | 76 |
| 173 | |
| 174 | All of the errors, except the last one, involve passing a bad argument to a |
| 175 | function. The first bug, involves passing a bad first arument to the efun |
| 176 | explode(). This efun expects a string as its first argment. In debugging |
| 177 | these kinds of errors, we would therefore go to line 32 in |
| 178 | /bin/system/_grep.c and check to see what the data type of the first |
| 179 | argument being passed in fact is. In this particular case, the value being |
| 180 | passed should be a string. |
| 181 | |
| 182 | If for some reason I has actually passed something else, I would be done |
| 183 | debugging at that point and fix it simply by making sure that I was |
| 184 | passing a string. This situation is more complex. I now need to trace |
| 185 | the actual values contained by the variable being passed to explode, so |
| 186 | that I can see what it is the explode() efun sees that it is being passed. |
| 187 | |
| 188 | The line is question is this: |
| 189 | borg[files[i]] = regexp(explode(read_file(files[i]), "\n"), exp); |
| 190 | where files is an array for strings, i is an integer, and borg is a mapping. |
| 191 | So clearly we need to find out what the value of read_file(files[i]) is. |
| 192 | Well, this efun returns a string unless the file in question does not exist, |
| 193 | the object in question does not have read access to the file in question, or |
| 194 | the file in question is an empty file, in which cases the function will |
| 195 | return NULL. Clearly, our problem is that one of these events must |
| 196 | have happened. In order to see which, we need to look at files[i]. |
| 197 | |
| 198 | Examining the code, the files array gets its value through the get_dir() |
| 199 | efun. This returns all the files in a directory if the object has read access |
| 200 | to the directory. Therefore the problem is neither lack of access or non- |
| 201 | existent files. The file which caused this error then must have been an |
| 202 | empty file. And, in fact, that is exactly what caused this error. To |
| 203 | debug that, we would pass files through the filter() efun and make |
| 204 | sure that only files with a file size greater than 0 were allowed into the |
| 205 | array. |
| 206 | |
| 207 | The key to debugging a run time error is therefore knowing exactly what |
| 208 | the values of all variables in question are at the exact moment where the |
| 209 | bug created. When reading your run time log, be careful to separate the |
| 210 | object from the file in which the bug occurred. For example, the |
| 211 | indexing error above came about in the object /wizards/zaknaifen/spy, |
| 212 | but the error occured while running a function in /std/monster.c, which |
| 213 | the object inherited. |
| 214 | |
| 215 | 7.4 Malfunctioning Code |
| 216 | The nastiest problem to deal with is when your code does not behave the |
| 217 | way you intended it to behave. The object loads fine, and it produces no |
| 218 | run time errors, but things simply do not happen the way they should. |
| 219 | Since the driver does not see a problem with this type of code, no logs |
| 220 | are produced. You therefore need to go through the code line by line |
| 221 | and figure out what is happening. |
| 222 | |
| 223 | Step 1: Locate the last line of code you knew successfully executed |
| 224 | Step 2: Locate the first line of code where you know things are going |
| 225 | wrong |
| 226 | Step 3: Examine the flow of the code from the known successful point to |
| 227 | the first known unsuccessful point. |
| 228 | |
| 229 | More often than not, these problems occurr when you are using if() |
| 230 | statements and not accounting for all possibilities. For example: |
| 231 | |
| 232 | int cmd(string tmp) { |
| 233 | if(stringp(tmp)) return do_a() |
| 234 | else if(intp(tmp)) return do_b() |
| 235 | return 1; |
| 236 | } |
| 237 | |
| 238 | In this code, we find that it compiles and runs fine. Problem is nothing |
| 239 | happens when it is executed. We know for sure that the cmd() function |
| 240 | is getting executed, so we can start there. We also know that a value of |
| 241 | 1 is in fact being returned, since we do not see "What?" when we enter |
| 242 | the command. Immediately, we can see that for some reason the |
| 243 | variable tmp has a value other than string or int. As it turns out, we |
| 244 | issued the command without parameters, so tmp was NULL and failed |
| 245 | all tests. |
| 246 | |
| 247 | The above example is rather simplistic, bordering on silly. |
| 248 | Nevertheless, it gives you an idea of how to examine the flow of the |
| 249 | code when debugging malfunctioning code. Other tools are available as |
| 250 | well to help in debugging code. The most important tool is the use of |
| 251 | the precompiler to debug code. With the code above, we have a clause |
| 252 | checking for integers being passed to cmd(). When we type "cmd 10", |
| 253 | we are expecting do_b() to execute. We need to see what the value of |
| 254 | tmp is before we get into the loop: |
| 255 | |
| 256 | #define DEBUG |
| 257 | int cmd(string tmp) { |
| 258 | #ifdef DEBUG |
| 259 | write(tmp); |
| 260 | #endif |
| 261 | if(stringp(tmp)) return do_a(); |
| 262 | else if(intp(tmp)) return do_b(); |
| 263 | else return 1; |
| 264 | } |
| 265 | |
| 266 | We find out immediately upon issuing the command, that tmp has a |
| 267 | value of "10". Looking back at the code, we slap ourselves silly, |
| 268 | forgetting that we have to change command arguments to integers using |
| 269 | sscanf() before evaluating them as integers. |
| 270 | |
| 271 | 7.5 Summary |
| 272 | The key to debugging any LPC problem is always being aware of what |
| 273 | the values of your variables are at any given step in your code. LPC |
| 274 | execution reduces on the simplest level to changes in variable values, so |
| 275 | bad values are what causes bad things to happen once code has been |
| 276 | loaded into memory. If you get errors about bad arguments to |
| 277 | functions, more likely than not you are passing a NULL value to a |
| 278 | function for that argument. This happens most often with objects, since |
| 279 | people will do one of the following: |
| 280 | 1) use a value that was set to an object that has since destructed |
| 281 | 2) use the return value of this_player() when there is no this_player() |
| 282 | 3) use the return value of this_object() just after this_object() was |
| 283 | destructed |
| 284 | |
| 285 | In addition, people will often run into errors involving illegal indexing or |
| 286 | indexing on illegal types. Most often, this is because the mapping or |
| 287 | array in question was not initialized, and therefore cannot be indexed. |
| 288 | The key is to know exactly what the full value of the array or mapping |
| 289 | should be at the point in question. In addition, watch for using index |
| 290 | numbers larger than the size of given arrays |
| 291 | |
| 292 | Finally, make use of the precompiler to temporarly throw out code, or |
| 293 | introduce code which will show you the values of variables. The |
| 294 | precompiler makes it easy to get rid of debugging code quickly once you |
| 295 | are done. You can simply remove the DEBUG define when you are |
| 296 | done. |
| 297 | |
| 298 | Copyright (c) George Reese 1993 |