chapter 15 "Debugging"
Intermediate LPC
Descartes of Borg
November 1993

                            Chapter 7: Debugging

7.1 Types of Errors
By now, you have likely run into errors here, there, and everywhere.  In
general, there are three sorts of errors you might see: compile time
errors, run time errors, and malfunctioning code.  On most muds you
will find a personal file where your compile time errors are logged.  For
the most part, this file can be found either in your home directory as the
file named "log" or ".log", or somewhere in the directory "/log" as a file
with your name.  In addition, muds tend to keep a log of run time errors
which occur while the mud is up.  Again, this is generally found in
"/log".  On MudOS muds it is called "debug.log".  On other muds it may
be called something different like "lpmud.log".  Ask your administrators
where compile time and run time errors are each logged if you do not
already know.

Compile time errors are errors which occur when the driver tries to load
an object into memory.  If, when the driver is trying to load an object
into memory, it encounters things which it simply does not understand
with respect to what you wrote, it will fail to load it into memory and log
why it could not load the object into your personal error log.  The most
common compile time errors are typos, missing or extra (), {}, [], or "",
and failure to properly declare functions and variables used by the
object.

Run time errors occur when something wrong happens to an object in
memory while it is executing a statement.  For example, the driver
cannot tell whether the statement "x/y" will be valid in all circumstances. 
In fact, it is a valid LPC expression.  Yet, if the value of y is 0, then a
run time error will occur since you cannot divide by 0.  When the driver
runs across an error during the execution of a function, it aborts
execution of the function and logs an error to the game's run time error
log.  If this_player() is defined and is a creator, the driver will also
show the error to that creator; ordinary players are simply shown
"What?".  The most common causes of run time errors are bad values and
attempts to perform operations on data types for which those operations
are not defined.
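
As a minimal sketch of guarding against the division example above, the
function below checks the divisor before dividing.  The name average()
and the choice to return 0 for an empty count are assumptions made only
for illustration.

```lpc
// Hypothetical helper: integer average that cannot divide by zero.
int average(int total, int count) {
    // Without this test, count == 0 would abort execution with a
    // run time error at the division below.
    if(count == 0) return 0;
    return total / count;
}
```

The code compiles either way; the guard only changes what happens at
run time when a bad value arrives.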

The most insidious type of error, however, is plain malfunctioning
code.  These errors do not log, since the driver never really realizes that
anything is wrong.  In short, this error happens when you think the code
says one thing, but in fact it says another thing.  People too often
encounter this bug and automatically insist that it must be a mudlib or
driver bug.  Everyone makes all types of errors though, and more often
than not when code is not functioning the way it should, it will be
because you misread it.

7.2 Debugging Compile Time Errors
Compile time errors are certainly the most common and simplest bugs to
debug.  New coders often get frustrated by them due to the obscure
nature of some error messages.  Nevertheless, once a person becomes
used to the error messages generated by their driver, debugging compile
time errors becomes utterly routine.

In your error log, the driver will tell you the type of error and on which
line it finally noticed there was an error.  Note that this is not on which
line the actual error necessarily exists.  The most common compile time
error, besides the typo, is a missing or superfluous parenthesis,
bracket, brace, or quote.  Yet this error is the one that most baffles
new coders, since the driver will not notice the missing or extra piece
until well after the original.  Take for example the following code:

1 int test(string str) {
2    int x;
3    for(x =0; x<10; x++)
4        write(x+"\n");
5    }
6    write("Done.\n");
7 }

Depending on what you intended, the actual error here is either at line 3
(meaning you are missing a {) or at line 5 (meaning you have an extra }).
Nevertheless, the driver will report that it found an error when it gets to
line 6.  The actual driver message may vary from driver to driver, but no
matter which driver, you will see an error on line 6, since the } in line 5
is interpreted as ending the function test().  At line 6, the driver sees that
you have a write() sitting outside any function definition, and thus
reports an error.  Generally, the driver will also go on to report that it
found an error at line 7 in the form of an extra }.
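
The two possible repairs of the listing above can be sketched as follows,
written without line numbers.  The names test_block() and test_single()
are made up here so that both versions can sit in one file, and the
"return 1" is an addition so that each int function returns a value.

```lpc
// Repair 1: the loop was meant to have a block, so the { is added.
int test_block(string str) {
    int x;

    for(x = 0; x < 10; x++) {
        write(x + "\n");
    }
    write("Done.\n");
    return 1;
}

// Repair 2: the loop was meant to be a single statement, so the
// extra } is removed.
int test_single(string str) {
    int x;

    for(x = 0; x < 10; x++)
        write(x + "\n");
    write("Done.\n");
    return 1;
}
```

In both versions each closing } lines up vertically with the statement
that opened it, which is what makes a stray brace easy to spot.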

The secret to debugging these is coding style.  Having closing } match
up vertically with the clauses they close out helps you see where you are
missing them when you are debugging code.  Similarly, when using
multiple sets of parentheses, space out different groups like this:
    if( (x=sizeof(who=users())) > ( (y+z)/(a-b) + (-(random(7))) ) )
As you can see, the parentheses for the if() statement are spaced out
from the rest of the statement.  In addition, individual sub-groups are
spaced so they can easily be sorted out in the event of an error.

Once you have a coding style which aids in picking these out, you learn
which error messages tend to indicate this sort of error.  When
debugging this sort of error, you then view a section of code before and
after the line in question.  In almost all cases, you will catch the bug right
off.

Another common compile time error is where the driver reports an
unknown identifier.  Generally, typos and failure to declare variables
cause this sort of error.  Fortunately, the error log will almost always
tell you exactly where the error is.  So when debugging it, enter the
editor and find the line in question.  If the problem is with a variable and
is not a typo, make sure you declared it properly.  On the other hand, if
it is a typo, simply fix it!

One thing to beware of, however, is that this error will sometimes be
reported in conjunction with a missing parenthesis, bracket, or brace
error.  In these situations, your problem with an unknown identifier
is often bogus.  The driver misreads the way the {} or whatever are
set up, and thus gets variable declarations confused.  Therefore make
sure all other compile time errors are corrected before bothering with
these types of errors.

In the same class as the above error is the general syntax error.  The
driver generates this error when it simply fails to understand what you
said.  Again, this is often caused by typos, but can also be caused by not
properly understanding the syntax of a certain feature, like writing a
for() statement with commas where semicolons belong:
for(x=0, x<10, x++).  If you get an error like this which is not a typo,
try reviewing the syntax of the statement in which the error is
occurring.

7.3 Debugging Run Time Errors
Run time errors are much more complex than their compile time
counterparts.  Fortunately these errors do get logged, though many
creators do not realise it or do not know where to look.  The error log
for run time errors is also generally much more detailed than the one for
compile time errors, meaning that you can trace the history of the chain
of execution from where it started to where it went wrong.  You can
therefore set up debugging traps using precompiler statements much more
easily using these logs.  Run time errors, however, tend to result from
using more complex coding techniques than beginners tend to use, which
means you are left with errors which are generally more complicated than
simple compile time errors.

Run time errors almost always result from misusing LPC data types.
Most commonly, this means trying to do call others using object variables
which are NULL, indexing on mapping, array, or string variables which are
NULL, or passing bad arguments to functions.  We will look at a real
run time error log from Nightmare:

Bad argument 1 to explode()
program: bin/system/_grep.c, object: bin/system/_grep line 32
'       cmd_hook' in '        std/living.c' ('      std/user#4002')line 83
'       cmd_grep' in '  bin/system/_grep.c' ('   bin/system/_grep')line 32
Bad argument 2 to message()
program: adm/obj/simul_efun.c, object: adm/obj/simul_efun line 34
'       cmd_hook' in '        std/living.c' ('      std/user#4957')line 83
'       cmd_look' in '  bin/mortal/_look.c' ('   bin/mortal/_look')line 23
' examine_object' in '  bin/mortal/_look.c' ('   bin/mortal/_look')line 78
'          write' in 'adm/obj/simul_efun.c' (' adm/obj/simul_efun')line 34
Bad argument 1 to call_other()
program: bin/system/_clone.c, object: bin/system/_clone line 25
'       cmd_hook' in '        std/living.c' ('      std/user#3734')line 83
'      cmd_clone' in ' bin/system/_clone.c' ('  bin/system/_clone')line 25
Illegal index
program: std/monster.c, object: wizards/zaknaifen/spy#7205 line 76
'     heart_beat' in '       std/monster.c' ('wizards/zaknaifen/spy#7205')line 76

All of the errors, except the last one, involve passing a bad argument to a
function.  The first bug involves passing a bad first argument to the efun
explode().  This efun expects a string as its first argument.  In debugging
these kinds of errors, we would therefore go to line 32 in
/bin/system/_grep.c and check to see what the data type of the first
argument being passed in fact is.  In this particular case, the value being
passed should be a string.

If for some reason I had actually passed something else, I would be done
debugging at that point and fix it simply by making sure that I was
passing a string.  This situation is more complex.  I now need to trace
the actual values contained by the variable being passed to explode(), so
that I can see what it is that the explode() efun is being passed.

The line in question is this:
 borg[files[i]] = regexp(explode(read_file(files[i]), "\n"), exp);
where files is an array of strings, i is an integer, and borg is a mapping.
So clearly we need to find out what the value of read_file(files[i]) is. 
Well, this efun returns a string unless the file in question does not exist,
the object in question does not have read access to the file in question, or
the file in question is an empty file, in which cases the function will
return NULL.  Clearly, our problem is that one of these events must
have happened.  In order to see which, we need to look at files[i].

Examining the code, the files array gets its value through the get_dir()
efun.  This returns all the files in a directory if the object has read access
to the directory.  Therefore the problem is neither lack of access nor
non-existent files.  The file which caused this error then must have been
an empty file.  And, in fact, that is exactly what caused this error.  To
fix that, we would pass files through the filter() efun and make
sure that only files with a file size greater than 0 were allowed into the
array.
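
A rough sketch of that filtering step follows.  It is not the actual
Nightmare code.  It assumes a MudOS style driver in which filter()
accepts a function name, an object, and extra arguments to pass along,
in which get_dir() returns bare file names for a path ending in "/",
and in which file_size() returns a number greater than 0 only for a
non-empty regular file.  The names has_content() and usable_files() are
hypothetical.

```lpc
// Hypothetical test function: 1 if dir + file is a non-empty file.
// Assumes file_size() gives 0 for an empty file and a negative
// number for a missing file or a directory.
int has_content(string file, string dir) {
    return file_size(dir + file) > 0;
}

// Hypothetical helper: list only the files in dir that read_file()
// can be expected to return a string for.  dir must end in "/".
string *usable_files(string dir) {
    string *files;

    files = get_dir(dir);
    if(!pointerp(files)) return ({});    // no read access or no such dir
    return filter(files, "has_content", this_object(), dir);
}
```

On a driver whose filter() takes different arguments, the same test can
be applied inside an ordinary for() loop over the array.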

The key to debugging a run time error is therefore knowing exactly what
the values of all variables in question are at the exact moment when the
bug occurs.  When reading your run time log, be careful to separate the
object from the file in which the bug occurred.  For example, the
indexing error above came about in the object /wizards/zaknaifen/spy,
but the error occurred while running a function in /std/monster.c, which
the object inherited.

7.4 Malfunctioning Code
The nastiest problem to deal with is when your code does not behave the
way you intended it to behave.  The object loads fine, and it produces no
run time errors, but things simply do not happen the way they should. 
Since the driver does not see a problem with this type of code, no logs
are produced.  You therefore need to go through the code line by line
and figure out what is happening.

Step 1: Locate the last line of code you knew successfully executed
Step 2: Locate the first line of code where you know things are going
wrong
Step 3: Examine the flow of the code from the known successful point to
the first known unsuccessful point.

More often than not, these problems occur when you are using if()
statements and not accounting for all possibilities.  For example:

int cmd(string tmp) {
    if(stringp(tmp)) return do_a();
    else if(intp(tmp)) return do_b();
    return 1;
}

In this code, we find that it compiles and runs fine.  Problem is nothing
happens when it is executed.  We know for sure that the cmd() function
is getting executed, so we can start there.  We also know that a value of
1 is in fact being returned, since we do not see "What?" when we enter
the command.  Immediately, we can see that for some reason the
variable tmp has a value other than string or int.  As it turns out, we
issued the command without parameters, so tmp was NULL and failed
all tests.

The above example is rather simplistic, bordering on silly. 
Nevertheless, it gives you an idea of how to examine the flow of the
code when debugging malfunctioning code.  Other tools are available as
well to help in debugging code.  The most important tool is the use of
the precompiler to debug code.  With the code above, we have a clause
checking for integers being passed to cmd().  When we type "cmd 10",
we are expecting do_b() to execute.  We need to see what the value of
tmp is before we get into the if() tests:

#define DEBUG
int cmd(string tmp) {
#ifdef DEBUG
    write(tmp);
#endif
    if(stringp(tmp)) return do_a();
    else if(intp(tmp)) return do_b();
    else return 1;
}

We find out immediately upon issuing the command, that tmp has a
value of "10".  Looking back at the code, we slap ourselves silly,
forgetting that we have to change command arguments to integers using
sscanf() before evaluating them as integers.
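
One possible shape for the repaired command is sketched below.  It
assumes that do_a() and do_b() are defined elsewhere in the object, and
that returning 0 is how this mudlib signals that the command failed;
both are assumptions, not something the example above specifies.

```lpc
int do_a();    // hypothetical handlers, assumed to be defined elsewhere
int do_b();

int cmd(string tmp) {
    int n;

    // No argument at all: tmp is NULL, so there is nothing to parse.
    if(!stringp(tmp)) return 0;
    // sscanf() returns the number of values it matched, so a result
    // of 1 means tmp began with an integer, now stored in n.
    if(sscanf(tmp, "%d", n) == 1) return do_b();
    return do_a();
}
```

Because command arguments always arrive as strings, the integer test
has to be made on the result of sscanf(), never on tmp itself.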

7.5 Summary
The key to debugging any LPC problem is always being aware of what
the values of your variables are at any given step in your code.  LPC
execution reduces on the simplest level to changes in variable values, so
bad values are what cause bad things to happen once code has been
loaded into memory.  If you get errors about bad arguments to
functions, more likely than not you are passing a NULL value to a
function for that argument.  This happens most often with objects, since
people will do one of the following:
    1) use a value that was set to an object that has since been destructed
    2) use the return value of this_player() when there is no this_player()
    3) use the return value of this_object() just after this_object() was
    destructed
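
The first two cases above can be guarded against with a simple test
before the object is used.  This is only a sketch; the function names
tell_target() and tell_actor() are invented for the example.

```lpc
// Case 1: target may have been destructed since it was stored.
void tell_target(object target, string msg) {
    if(!objectp(target)) return;    // NULL or destructed, so do nothing
    tell_object(target, msg);
}

// Case 2: there may be no this_player(), for example when the code
// is reached from a heart_beat() or a call_out().
void tell_actor(string msg) {
    object who;

    who = this_player();
    if(!who) return;
    tell_object(who, msg);
}
```

For the third case the usual remedy is simply to return right after
this_object() is destructed rather than continuing to use it.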

In addition, people will often run into errors involving illegal indexing or
indexing on illegal types.  Most often, this is because the mapping or
array in question was not initialized, and therefore cannot be indexed. 
The key is to know exactly what the full value of the array or mapping
should be at the point in question.  In addition, watch for using index
numbers larger than the size of given arrays.
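
A minimal sketch of both precautions follows, assuming an object that
keeps a mapping and an array as global variables.  The variable and
function names are hypothetical, and returning 0 for a bad index is an
arbitrary choice made for the example.

```lpc
mapping scores;    // hypothetical globals
string *names;

void create() {
    // Until these assignments run, both variables are NULL and any
    // attempt to index them is an error.
    scores = ([]);
    names = ({});
}

string name_at(int i) {
    // Valid indices run from 0 to sizeof(names) - 1.
    if(i < 0 || i >= sizeof(names)) return 0;
    return names[i];
}
```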

Finally, make use of the precompiler to temporarily throw out code, or
introduce code which will show you the values of variables.  The
precompiler makes it easy to get rid of debugging code quickly once you
are done.  You can simply remove the DEBUG define when you are
done.

Copyright (c) George Reese 1993