+----------------------------------------------------------------------+
|                                                                      |
|                Some Notes on Using the Perl Debugger                 |
|                                                                      |
|                                  or                                  |
|                                                                      |
|                       Confessions of a ytpist.                       |
|                                                                      |
+----------------------------------------------------------------------+

(C) 2003 Steven Lembark, Workhorse Computing


"Hi, my name is Steve and I'm ..."

Well, I'll admit up front that the title of this might better
have been "confessions of a programmer": not all of my bugs are
typos. There are also times when I have to debug other people's
code -- some of whom are neither typists nor programmers.

Be it theirs or mine, searching for logic buried in the code can
be frustrating. This is especially true when the modules call
modules that call modules and you cannot put print statements
into all of them.

Even when logic doesn't fail me, searching for syntax buried in
Perl can be frustrating. This is especially true when looking for
"cleaner" alternatives to apparently messy constructs or checking
some of the more oddball syntax used by the idiots who wrote this
code last month, er, me? ah, yeah: great syntax...

If premature optimization is the root of all evil, finding places
where it matters is the root of many paychecks. Most of this
comes down to finding out where the code spends most of its time
and how much of it is wasted or even under the program's control
(e.g., I/O blocks vs. tight loops). Hunting for WHERE the code
spends its time can be, Oh such joy -- if you really, really like
warm Mtn. Dew or stale coffee...

Fortunately for me, what I really am is:

"... a Perl hacker."

Which means I can use Perl to debug my code for me. Or at least
help. If the ultimate goal of programming is laziness and the
ultimate lazy programmer lets the language do the work, then
Perl's Debugger is the ultimate tool. At the touch of a command
switch you can have command line and Tk-GUI debugging or a
detailed profiler doing the work for you.
What follows here is a short guide to laziness for the impatient
[which obviously requires some hubris to present].

-------------------------------------------------
Introductory Domination: Perl at the Command Line.
-------------------------------------------------

Among the alphabet soup available for perl's switches is "-d".
This puts perl into "debug" mode with the default debugger:
perl5db.pl, whose code lives in the DB package. If you grew up
with GNU's "gdb" program the rest of this will be deja vu all
over again; if not you can open your mind to a whole new source
of frustration.

Aside: One of the really nice things about Perl is its level of
documentation. Using 'perldoc perldebug' on your system of choice
will give you a nice, short summary of the debugger. Then again,
so will turning to Page 147 of Perl in a Nutshell (2nd ed) or 506
of Camel III.

In fact, as you start up the perl debugger it'll tell you where
to get the docs:

    $ perl -d -e 42

    Loading DB routines from perl5db.pl version 1.22
    Editor support available.

    Enter h or `h h' for help, or `man perldebug' for more help.

    main::(-e:1):   42
      DB<1>

Because I started up perl without a program (unless there was a
./42 file nearby) I'm at the command line, with the return value
of my program before me. This can be handy for checking out Perl
syntax or whether modules are installed/functioning on your
system.

In this case the prompt shows me that I'm in the "main" package
at program "-e" line 1. At this point I can check Perl syntax or
test code just by entering it on the command line:

      DB<1> x localtime
    0  8
    1  19
    2  13
    3  2
    4  11
    5  103
    6  2
    7  335
    8  0
      DB<2> x getpwuid 0
    0  'root'
    1  'x'
    2  0
    3  0
    4  ''
    5  ''
    6  'root'
    7  '/root'
    8  '/bin/bash'

The 'x' command pretty prints a structure (vs. 'p' which is
mainly for printing strings). In this case I took the returns
from "localtime" and "getpwuid 0" and displayed the resulting
lists.
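
The difference between 'p' and 'x' can be approximated outside
the debugger. What follows is only a rough sketch: 'p' is
essentially a print, and the core Dumpvalue module produces
output in the same style as 'x'. The @fields list is made-up
sample data standing in for a list-returning call, not anything
taken from a real session.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Dumpvalue;

# Hypothetical sample data: a list with empty strings in the middle.
my @fields = ( 'root', 'x', 0, 0, '', '', 'root', '/root', '/bin/bash' );

# 'p' is essentially print: the values run together on one line and
# the two empty strings leave no trace.
print @fields, "\n";

# 'x' output is closer to this: one numbered line per element, with
# strings quoted so that an empty string stands out.
my $dumper = Dumpvalue->new;
$dumper->dumpValues( @fields );
```

The numbered, quoted form is what makes it possible to tell an
empty string from an undef at a glance.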

The 'DB' prompt shows me that I'm in the debugger; the <1> or <2>
shows me that I'm in the outer level of calls on the first and
second debugger command, respectively. As I descend into the hell
of code calls the <> will nest to show me the levels.

This mode is good for testing Perl syntax, and can be useful for
Q&D data munging where you want to fondle the data as it is being
processed. The one thing to remember is that each command in the
debugger is treated as being in its own block, so that lexical
("my") variables will not survive between commands:

      DB<3> my @a = ( 0..10 )

      DB<4> x @a
      empty array
      DB<5> @a = (0..10)

      DB<6> x @a
    0  0
    1  1
    2  2
    3  3
    4  4
    5  5
    6  6
    7  7
    8  8
    9  9
    10  10
      DB<7> %b = @a

      DB<8> x %b
    0  8
    1  9
    2  6
    3  7
    4  4
    5  5
    6  0
    7  1
    8  10
    9  undef
    10  2
    11  3
      DB<9> x \%b
    0  HASH(0x8411c04)
       0 => 1
       10 => undef
       2 => 3
       4 => 5
       6 => 7
       8 => 9

Interesting thing to note: the display of references (\%b at <9>)
is a bit more readable than naked hashes (<8>). In most cases I
prefer to read the stuff put out for references. If you really
care about how the output is handled, or want to use it for
pretty printing your own output, the Dumpvalue module provides
the same style of output the debugger uses.

After a while, however, you might get sick of just typing in
commands and want to see something actually running. There are,
of course, multiple ways to do it. One way is to simply add a
"-d" switch to your #! line. This works if you want to run a
program in the debugger, run from, say, a data directory, and are
in no mood to type the full path to the perl program. Another way
is to simply use "perl -d" and give a program name.

    #!/opt/bin/perl -d

    $ perl -d /path/to/foobar [arguments] [more arguments];

The one catch to using '-d' on the #! line is that it has to be
removed by hand before releasing the code. One giveaway that your
code has a stray "-d" in it is that cron jobs leave huge logfiles
(caused by the debugger going into trace mode if there is no
controlling terminal).

All of the examples here will use "perl -d" from the command
line.

As an example, what if you wanted to debug:

    #!/opt/bin/perl

    $\ = "\n";
    $, = "\n\t";

    print 'You are:', getpwuid $<;


    0

    __END__

And it prints out:

    You are:
            lembark
            x
            1024
            1024
            
            
            
            /home/lembark
            /bin/bash

Hmm... what are the blanks doing there? It might be helpful to
know if they are empty strings, undefs, whatever...

    $ perl -d ./hak

    Loading DB routines from perl5db.pl version 1.22
    Editor support available.

    Enter h or `h h' for help, or `man perldebug' for more help.

    main::(./hak:3):        $\ = "\n";

OK so far: I'm in the "main" package, in source file 'hak' on
line 3. At this point I'd like to walk through the program. The
simplest way is to single step through the code. As always there
is more than one way to do this. In most cases you'll use the 'n'
command to step to the Next source line; for debugging into calls
you can use the 's' command to Step Into subroutine calls along
the way. For now it doesn't matter since there are no subroutines
in the code, so I'll use 'n':

    main::(./hak:3):        $\ = "\n";
      DB<1> n
    main::(./hak:4):        $, = "\n\t";
      DB<1> n
    main::(./hak:6):        print 'You are:', getpwuid $<;
      DB<1> n
    You are:
            lembark
            x
            1024
            1024
            
            
            
            /home/lembark
            /bin/bash
    main::(./hak:9):        0

Each time I typed in 'n' it got me one step further through the
code. When you get sick of typing 'n', pressing enter repeats the
last 'n' or 's' command:

    main::(./hak:3):        $\ = "\n";
      DB<1> n
    main::(./hak:4):        $, = "\n\t";
      DB<1>
    main::(./hak:6):        print 'You are:', getpwuid $<;
      DB<1>
    You are:
            lembark
            x
            1024
            1024
            
            
            
            /home/lembark
            /bin/bash
    main::(./hak:9):        0
      DB<1>
    Debugged program terminated.  Use q to quit or R to restart,
      use O inhibit_exit to avoid stopping after program termination,
      h q, h R or h O to get additional info.

Just single stepping through doesn't do much good, however.
The real fun is in being able to try things out on the way:

    main::(./hak:3):        $\ = "\n";
      DB<1> n
    main::(./hak:4):        $, = "\n\t";
      DB<1>
    main::(./hak:6):        print 'You are:', getpwuid $<;
      DB<1> x getpwuid $<;
    0  'lembark'
    1  'x'
    2  1024
    3  1024
    4  ''
    5  ''
    6  ''
    7  '/home/lembark'
    8  '/bin/bash'
      DB<2> n
    You are:
            lembark
            x
            1024
            1024
            
            
            
            /home/lembark
            /bin/bash
    main::(./hak:9):        0

OK, the blank lines were caused by empty strings in the result of
getpwuid.

Viewing variables by package can also be handled using the X and
V commands. Using either of these in the main package results in
LOTS of output.

Changing the code just a bit allows me to show the difference
between 'n' and 's':

    #!/opt/bin/perl

    $\ = "\n";
    $, = "\n\t";

    sub userinfo
    {
        my $uid = @_ ? shift : $<;

        print "Checking: $uid";

        getpwuid $uid;
    }

    print 'Clark Kent is:', userinfo 0;
    print 'You are:', userinfo;


    0

OK, at this point there is a subroutine call: "userinfo". The 'n'
command I've been using so far gives me a look at what happens
from the top level:

    main::(./hak:3):        $\ = "\n";
      DB<1> n
    main::(./hak:4):        $, = "\n\t";
      DB<1>
    main::(./hak:15):       print 'Clark Kent is:', userinfo 0;
      DB<1>
    Checking: 0
    Clark Kent is:
            root
            x
            0
            0
            
            
            root
            /root
            /bin/bash
    main::(./hak:16):       print 'You are:', userinfo;
      DB<1>
    Checking: 1024
    You are:
            lembark
            x
            1024
            1024
            
            
            
            /home/lembark
            /bin/bash
    main::(./hak:19):       0

Nothing all that surprising, but notice that you didn't get to
see what happened under the hood. In many cases (say calling
Date::Manip::UnixDate) you just want to see the results. If you
want to crawl around under the bed, however, use the 's' command
instead:

    main::(./hak:3):        $\ = "\n";
      DB<1> n
    main::(./hak:4):        $, = "\n\t";
      DB<1>
    main::(./hak:15):       print 'Clark Kent is:', userinfo 0;
      DB<1> s
    main::userinfo(./hak:8):        my $uid = @_ ? shift : $<;
      DB<1>
    main::userinfo(./hak:10):       print "Checking: $uid";
      DB<1>
    Checking: 0
    main::userinfo(./hak:12):       getpwuid $uid;
      DB<1> x $uid
    0  0

Notice the 's': this steps into the userinfo call rather than to
the next source line. Also notice that 'n' and 's' can be mixed
as necessary to step over or into calls. I can also look at the
contents of lexicals within the current scope and check that,
yes, the value 0 did get passed in and used with $uid.

You can check the return values from subroutines (or escape from
really long ones after an accidental 's') with the 'r' ("Return")
command:

    main::(./hak:15):       print 'Clark Kent is:', userinfo 0;
      DB<1> s
    main::userinfo(./hak:8):        my $uid = @_ ? shift : $<;
      DB<1> x @_
    0  0
      DB<2> r
    Checking: 0
    list context return from main::userinfo:
    0  'root'
    1  'x'
    2  0
    3  0
    4  ''
    5  ''
    6  'root'
    7  '/root'
    8  '/bin/bash'
    Clark Kent is:
            root
            x
            0
            0
            
            
            root
            /root
            /bin/bash
    main::(./hak:16):       print 'You are:', userinfo;

The 'r' shows what came back from the call. This is really handy
for tracking down bugs in multiple call levels: at some point you
will get an unexpected return from a seemingly-sane set of
arguments.

The 'n', 's', and 'r' commands will suffice for most step-by-step
work.

-----------------------------------
Domination at its best: Breakpoints
-----------------------------------

On the other hand, you may not want to debug 5_000 lines of code
in an Apache module that uses Date::Manip step-by-step. For these
cases the 'c' and 'b' commands come into play, as does
$DB::single. The 'l' command Lists source and is handy for
spotting the line numbers at which to break, and 'R' is nice if
you Go Too Far. 'T' is helpful for finding out where you are.

The 'c' command Continues execution, optionally with a one-time
breakpoint at a subroutine or line. Typing 'c' at any point in
the debugging session will simply run the program until it
completes or hits a breakpoint.
Using 'c' with a subroutine or line runs up to that point and
stops:

    main::(./hak:3):        $\ = "\n";
      DB<1> c userinfo
    main::userinfo(./hak:8):        my $uid = @_ ? shift : $<;

Using 'c' is handy when you only want to break at the next call
to a subroutine and don't want to set and delete a breakpoint. It
can also be useful for skipping over a bunch of lines:

      DB<1> l userinfo
    7       {
    8:          my $uid = @_ ? shift : $<;
    9
    10:         print "Checking: $uid";
    11
    12:         getpwuid $uid;
    13      }
      DB<2> c 10
    main::userinfo(./hak:10):       print "Checking: $uid";

Here I listed the userinfo sub to find where I wanted to break,
then continued to line 10. Notice that the command history count
actually went up for a change: the 'l' command can be recalled
later on with the '!' (repeat) command:

      DB<3> !1
    l userinfo
    7       {
    8:          my $uid = @_ ? shift : $<;
    9
    10==>       print "Checking: $uid";
    11
    12:         getpwuid $uid;
    13      }

A bit more DB nomenclature is useful here: the lines shown with
':' after their numbers are valid breakpoints; the '==>' is the
current line of code (i.e., what will happen on the next 'n' or
's' command):

      DB<3> x $uid
    0  0
      DB<4> r
    Checking: 0
    list context return from main::userinfo:
    0  'root'
    1  'x'
    2  0
    3  0
    4  ''
    5  ''
    6  'root'
    7  '/root'
    8  '/bin/bash'
    Clark Kent is:
            root
            x
            0
            0
            
            
            root
            /root
            /bin/bash

The 'c' command is nice enough for running things once or twice,
but after a while 'c 10' could get tiring. That's where
breakpoints come in: letting your fingers stay Lazy.

The breakpoint command has a few flavors:

    - b <line>
    - b <line> <condition>

The first sets an unconditional breakpoint. All this means is
that the code will stop execution every time it gets to that
line:

      DB<1> b 10
      DB<2> c
    main::userinfo(./hak:10):       print "Checking: $uid";
      DB<2> c
    Checking: 0
    Clark Kent is:
            root
            x
            0
            0
            
            
            root
            /root
            /bin/bash
    main::userinfo(./hak:10):       print "Checking: $uid";
      DB<2> c
    Checking: 1024
    You are:
            lembark
            x
            1024
            1024
            
            
            
            /home/lembark
            /bin/bash
    Debugged program terminated.  Use q to quit or R to restart,
      use O inhibit_exit to avoid stopping after program termination,
      h q, h R or h O to get additional info.

If I only wanted to see non-SU lookups:

      DB<1> b 10 $uid != 0
      DB<2> c
    Checking: 0
    Clark Kent is:
            root
            x
            0
            0
            
            
            root
            /root
            /bin/bash
    main::userinfo(./hak:10):       print "Checking: $uid";
      DB<2> x $uid
    0  1024

The "b 10 $uid != 0" says to evaluate the expression "$uid != 0"
in the local context at line 10 (i.e., $uid has to be in scope
wherever line 10 leaves you) and then break if the expression is
true. This prints SuperUser's info then breaks for mine.

Conditional breakpoints are particularly useful for tracking down
out-of-bounds errors (e.g., a program blows up on the 10_000'th
input row because $age is less than zero).

You can also set breakpoints at subroutine calls, which can be
helpful for checking calls for bogus input:

      DB<3> b userinfo @_ == 0
      DB<4> c
    Checking: 0
    Clark Kent is:
            root
            x
            0
            0
            
            
            root
            /root
            /bin/bash
    main::userinfo(./hak:8):        my $uid = @_ ? shift : $<;

The 'R' command Restarts the current program -- resetting @ARGV
and a number of other useful things on the way. It is useful if
you suddenly notice that you've stepped too far into the
execution cycle and want to set a breakpoint or examine some data
before it was changed in the current cycle.

One of the more common occurrences during debugging is losing
track of where you are. The 'T' command shows a stack Trace from
the current execution line:

      DB<1> b userinfo @_ == 0
      DB<2> c
    Checking: 0
    Clark Kent is:
            root
            x
            0
            0
            
            
            root
            /root
            /bin/bash
    main::userinfo(./hak:8):        my $uid = @_ ? shift : $<;
      DB<2> T
    @ = main::userinfo() called from file `./hak' line 16

This can be particularly handy if you have several breakpoints
scattered around.

There are also times that you'd like to have a breakpoint set
programmatically.
One might be adding a '-d' switch to your own program to set
breakpoints at obvious points for debugging:

    use Getopt::Long;

    my @optionz = qw( debug+ );
    my $cmdline = {};

    GetOptions $cmdline, @optionz or die 'trying...';

    # default to 0 so the numeric test below never sees undef.
    my $debug = $cmdline->{debug} || 0;

    ...

    # stop in the debugger just before the main loop.
    $DB::single = 1 if $debug;

    frobnicate( $_ ) while <>;

    ...

    sub frobnicate
    {
        my $line = shift;

        # with the switch given twice, stop on every call.
        $DB::single = 1 if $debug > 1;

        ...
    }

The Perl debugger is written in Perl and lives in the DB package.
Its global variables are in the symbol table just like any other
package's and can be set by name. $DB::single is what the
debugger uses to tell itself to stop at the next line. You can
tell it to stop also. In this case if $debug is true the code
stops before frobnicate is called; if it's greater than one then
execution stops again inside every call to frobnicate.

You can use tests based on the data values, command line, or
error status:

    eval { ... };

    # trace error handling only.

    $DB::trace = 1 if $@;

The trace variable, as you probably guessed, turns on execution
tracing, which is what the 't' command toggles.

The one other thing you probably want to do occasionally is
remove the breaks. This is done with the 'd' command (for a
particular line or subroutine) or 'D' to delete all breakpoints.
Using 'd' with no line on it deletes the current line's
breakpoint. Note that debuggers shipped with perl 5.8 and later
spell these 'B <line>' and 'B *' by default; 'h' at the prompt
shows which set you have.

----------------------------------------------
Perly Submission: Making Perl Do Your Commands
----------------------------------------------

Even if breakpoints let your fingers be lazy on the 's', they
don't save you from having to type 'x' to examine things. One
classic example of this is having to type 'x $object' every line
watching an initializer set up data structures.

But, this is Perl, so even that can be lazy. The '{' command
tells the debugger to run a debugger command before every prompt:

      DB<1> b userinfo ! @_
      DB<2> { x $uid
      DB<3> c
    Checking: 0
    Clark Kent is:
            root
            x
            0
            0
            
            
            root
            /root
            /bin/bash
    main::userinfo(./hak:8):        my $uid = @_ ? shift : $<;
    auto(-1)  DB<3> x $uid
    0  undef
      DB<4> n
    main::userinfo(./hak:10):       print "Checking: $uid";
    auto(-1)  DB<4> x $uid
    0  1024
      DB<5>
    Checking: 1024
    main::userinfo(./hak:12):       getpwuid $uid;
    auto(-1)  DB<5> x $uid
    0  1024

While obviously overkill for something this simple, it can be a
big help if you are watching a structure change over 100 lines of
code. See also '<', '<<', '>', and '>>' for managing Perl
commands run before and after the prompt.

The 'a' command sets an Action to be performed before a line is
executed. This is Perl code (such as a print statement) that gets
run every time the line is reached. Aside from printing you could
also increment a counter or push bogus values onto a stack for
later checking in the debugger. The 'A' command deletes all
installed actions ('A *' in the perl 5.8 and later command set).

-------------------------------------
Lazy Domination: Automated Submission
-------------------------------------

The Perl Debugger has a number of options that control how it
runs the code and edits commands. Some of these are controlled
via Term::ReadLine and ~/.inputrc, others via the 'O' command
that sets Options. You can also use .perldb or $ENV{PERLDB_OPTS}
to set things for you. Either way, DB will be obliging.

Since I use an obviously superior editor, it seems a waste to
learn a whole new set of movement keys to edit commands in the
debugger -- or fumble around for arrow keys. If my ~/.inputrc has
the proper commands I can avoid keystroke mismanagement:

    set editing-mode vi
    set show-all-if-ambiguous on

This allows natural movement commands to navigate the debugger
history, correct typos, or manage repetitive edits within the
debugger.

I could also use a local .perldb file to automate setting
breakpoints or control variable output. The .perldb file can also
set up aliases for more common commands -- or ones you repeat
often when dealing with particular source code. The Nutcase book
has a short section "Customizing the Debugger" on pp. 156-7 with
some usable examples.
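
As a rough illustration of the .perldb idea, here is a minimal
sketch modelled on the customization examples in perldebug. The
alias name 'len' and the 'b userinfo' breakpoint are only
illustrative choices; %DB::alias, @DB::typeahead and the
afterinit hook are the pieces perldebug documents.

```perl
# Hypothetical ~/.perldb; the debugger reads it as Perl code at startup.

# Alias: typing "len EXPR" at the prompt is rewritten to "p length(EXPR)".
$DB::alias{len} = 's/^len(.*)/p length($1)/';

# If defined, afterinit() is called once the debugger has initialized.
# Anything pushed onto @DB::typeahead is run as if typed at the prompt,
# so this queues a breakpoint on the userinfo sub used in the examples.
sub afterinit
{
    push @DB::typeahead, 'b userinfo';
}
```

With something like this in place, starting "perl -d ./hak"
should leave the breakpoint already set, so a bare 'c' runs
straight to the first userinfo call.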

------------------------------
Tying Up A Mouse: Devel::ptkdb
------------------------------

Installing this module and running your code through:

    perl -d:ptkdb hak;

will bring up an X11 window with a nice, usable GUI debugger.
This basically uses Tk to feed commands directly into DB,
co-opting the built in DB command line handler. ptkdb supports
breakpoints, watchpoints, and one-time expressions. The
breakpoints can be set by clicking on the line numbers, and a few
buttons on the page allow step and next execution by line.

Another alternative debugger is Devel::DProf, called via

    perl -d:DProf hak;

This profiles your code by subroutine call/function point into a
datafile that is post-processed by 'dprofpp' into all sorts of
wonderful statistics.