#!/usr/bin/perl -w
#-------------------------------------------------------
# HTTP Debugger
#
# Copyright (c) 1999 John Nolan. All rights reserved.
# This program is free software.  You may modify and/or
# distribute it under the same terms as Perl itself.
# This copyright notice must remain attached to the file.
#
# You can run this file through either pod2man or pod2html
# to produce pretty documentation in manual or html file format
# (these utilities are part of the Perl 5 distribution).
#-------------------------------------------------------

=head1 NAME

B<httpdebug> - A tool for debugging HTTP transactions

=head1 SYNOPSIS

httpdebug [-p port] [-t timeout]

=head1 README

This is a tool to help you debug HTTP transactions.
It uses both the HTTP server and HTTP client functionalities
of the LWP bundle.  Using this script, you can easily and
quickly mimic and tweak transactions between servers and clients.
You operate this program using a Web browser.

=head1 DESCRIPTION

When you launch this program from the command line,
it becomes a tiny HTTP daemon.  For example, if you launch
this program with the parameter "-p 8080", then it will become
a Web server on port 8080.  You can then access it using a browser
at the URL "http://host.domain.com:8080/c".  The page that
you will see is a control panel for the program.

With any other URL besides "/c" (and a few other paths),
this little server will only print out a brief test page
(i.e., test headers and a test document).  From the control panel,
you can specifically adjust the test headers and the test document
that the server (this program) sends to the client (something else),
and then watch how the client responds.

All transactions are logged, and you can view these transaction logs
right from the browser, by using the path "/l" or "/log".

You can use the debugger's HTTP client functionality to interact
with a remote web server.
From the control panel, you can specify a URL,
and the debugger (as HTTP client) will send that request
to a remote Web server and save the response headers and document.
If you want, you can manually adjust the header data and request lines
that the HTTP client uses during this transaction.

After fetching a document like this, the debugger's server
functionality can immediately use this information to mimic
that remote server.  In this way, you can very easily simulate
the interactions between a remote server and a remote client,
by just making your little server behave exactly like the remote server.

You can very carefully tweak the headers and document data
that you are sending and receiving.  This can be useful
for locating otherwise obscure errors.

The debugger has a built-in timeout, which by default is 180 seconds.
This helps prevent you from launching the HTTP daemon and then
forgetting that it's running, which could be a security issue.
When you launch the program from the command line,
use the -t option to specify a timeout (in seconds).
The program will exit after that number of seconds of idle time.

=head1 The Log page

The debugger has a log page, where it records the data
transferred (both headers and data) during HTTP transactions.
On the log page, this is the color scheme:

   Remote client:  blue italics
   Local server:   black roman
   Local client:   black italics
   Remote server:  green roman

Headers and data are all the same color.  They are separated
by two newlines, of course.

The debugger does not log transactions made when it serves up
the control panel, the log page, or this help page.

=head1 Special URLs

Below is a list of all the URLs that are "special" for this Web server:

   Control panel:  /c  /con /cons /console /control
   Log page:       /l  /log
   Help page:      /i  /info /h /help /q

Any other URLs will result in the sending of the test page as a response.

=head1 Do I really need this thing?

Maybe not.
You can do practically all of these things from
the command line using netcat.  But it's a lot trickier that way,
especially if you are not a die-hard command-line jockey.
This interface is certainly faster, and it keeps a nice handy log
of all transactions.  Plus it has pretty colors.

=head1 SCRIPT CATEGORIES

Web
CGI

=head1 PREREQUISITES

You basically need the LWP bundle and CGI.pm.
If your version of CGI.pm does not have the noDebug pragma,
then consider downloading a later version of CGI.pm from CPAN.

=head1 AUTHOR

John Nolan  jpnolan@sonic.net  February 1999.

=cut

#-------------------------------------------------------

use CGI qw(:standard :noDebug);
use HTTP::Daemon;
use HTTP::Status;
use HTTP::Request;
use LWP::UserAgent;
use Getopt::Std;
use Sys::Hostname;

use strict;
use vars qw( $opt_p $opt_t $Progname $nontext $menubar );

$|++;

# NOTE: Near the end of this program is a BEGIN block
# where some important variables are initialized.
# It's at the end of the program, rather than up here,
# only because it's really long.

getopts('p:t:');

#----------------------------------------------
# Setup Global variables
#
my $PORT    = (defined $opt_p ? $opt_p : 0   );
my $TIMEOUT = (defined $opt_t ? $opt_t : 180 );

# NOTE: hostname() does not return the FQDN, only the hostname.
# You may want to just hard-code your hostname here, instead.
#
my $HOST = hostname();
chomp($HOST);

my $req_counter = 1;
my $res_counter = 1;

my $d = new HTTP::Daemon (LocalAddr => $HOST);
$d    = new HTTP::Daemon (LocalAddr => $HOST, LocalPort => $PORT) if $PORT;

unless (defined $d) {
    warn "Could not bind to port. I'm going to have to exit. Sorry.\n";
    exit(-1);
}

my $url = $d->url;

my @helpinfo = <DATA>;              # Read the info at end of code
my $log      = "";                  # Where we store transaction logs
my $delim    = ('-') x 60 . "\n";   # Delimiter for displaying logs

my $agentname = "Mozilla (compatible: LWP $LWP::VERSION)";

my %res_headers;   # Hash will hold response headers that we serve
my %res_content;   # Hash will hold response content that we serve
my %request;       # Hash will hold request data that we send

$res_headers{'current'} = ($res_headers{'HTML'} or "");
$res_content{'current'} = ($res_content{'HTML'} or "

Oops -- test no data!

");

$request{'current'} = "GET http://www.perl.org HTTP/1.1\nUser-Agent: $agentname\n\n";

#----------------------------------------------
# Escape HTML so we can display raw HTML in a web browser
#
sub escapeHTML {

    my %ENTITIES = (
        '&' => '&amp;',
        '>' => '&gt;',
        '<' => '&lt;',
        '"' => '&quot;',
    );

    my $text = shift;
    $text =~ s/([&\"><])/$ENTITIES{$1}/ge;
    return $text;
}

#----------------------------------------------
# Create a Web browser and fetch a URL, headers and all.
# If necessary, construct custom request headers.
#
sub geturl {

    my ($url,$request_data) = @_;
    my ($ua,$req,$res);

    $ua = new LWP::UserAgent;
    $ua->agent($agentname);

    # If there is any custom request data, then parse out
    # header fields and values, and set them in the
    # request object.
    #
    if ($request_data) {

        my @request_line = split /\n/, $request_data;

        # Special handling for the actual GET/POST statement
        #
        my ($method,$url,$protocol) = split / +/, (shift @request_line), 3;

        $req = HTTP::Request->new (GET => $url);
        $req->method($method);
        $req->protocol($protocol);

        # Now parse out the other headers
        while (defined ($_ = shift @request_line)) {

            last unless /\S/;
            my ($key,$value) = split /:\s+/, $_, 2;

            # We need to handle the User-Agent header specifically,
            # because it's a property of the LWP::UserAgent object,
            # not the HTTP::Request object.
            #
            if (lc($key) eq 'user-agent') {
                $ua->agent($value);
                next;
            } else {
                # This is where we set all the other headers.
                $req->header($key => $value);
            }
        }

        # We have read the last line of headers.
        # Now slurp in lines of content, if any,
        # and insert them as the content of our request object.
        #
        my $content = join '', @request_line;
        $req->content($content) if $content;

    } else {

        # If $request_data is empty, then just create
        # a plain-Jane request object, with the default headers.
        #
        $req = HTTP::Request->new (GET => $url);
        $req->protocol('HTTP/1.1');
    }

    # Fetch the URL!
    $res = $ua->request($req);

    # Return the request and response as objects.
    return ($req,$res);
}

#----------------------------------------------
# Daemonize: fork, and then detach from the local shell.
#
defined(my $pid = fork) or die "Cannot fork: $!";

if ($pid) {            # The parent exits
    print redirect($url);
    exit 0;
}

close(STDOUT);         # The child lives on, but disconnects
                       # from the local terminal

# We opt not to close STDERR here, because we actually might
# want to see error messages at the terminal.

#----------------------------------------------
# MAIN LOGIC: Basically a never-ending listen loop
#
LISTEN: {

    alarm($TIMEOUT);                 # (re-)set the deadman timer

    my $c = $d->accept;              # $c is a connection
    redo LISTEN unless defined $c;

    my $r = $c->get_request;         # $r is a request
    redo LISTEN unless defined $r;

    $CGI::Q = new CGI $r->content;

    #--------------------
    # Log page
    #
    if ($r->url->epath =~ /(^\/+log$|^\/+l$)/) {

        $c->send_basic_header;
        print $c
            header,
            start_html("$Progname Transaction Log"),
            h1("$Progname Transaction Log"),
            $menubar,
            hr,
            pre($log)
        ;

        close $c;
        redo LISTEN;

    #--------------------
    # Help page
    #
    } elsif ($r->url->epath =~ /(help|info|^\/+i$|^\/+q$|^\/+h$)/) {

        $c->send_basic_header;
        print $c
            header,
            start_html("$Progname Help Page"),
            h1("$Progname Help Page"),
            $menubar,
            hr,
            @helpinfo,
            hr,
            $menubar,
            end_html
        ;

        close $c;
        redo LISTEN;

    #--------------------
    # Console page
    #
    } elsif ($r->url->epath =~ /(control|console|^\/+cons?$|^\/+c$)/) {

        if (param 'Shut down now') {

            # Print a nice farewell message and then exit.
            #
            $c->send_basic_header;
            print $c
                header,
                start_html("$Progname Shut Down"),
                h1("$Progname Shut Down"),
                "$Progname has been shut down.",
            ;
            close $c;
            exit(0);

        } elsif (param 'Use sample') {

            $res_headers{'current'} = $res_headers{param 'sample'};
            $res_content{'current'} = $res_content{param 'sample'};

        } elsif (param 'Use previous request') {

            $request{'current'} = $request{param 'previous_request'};

        } elsif (param 'apply') {

            my ($headers,$content) = split( /\n\s*\n/, param('response'), 2 );

            $res_headers{'current'} = $headers . "\n";
            $res_content{'current'} = $content unless $content eq $nontext;

            my $response = "# " . $res_counter++;
            $res_headers{$response} = $res_headers{'current'};
            $res_content{$response} = $res_content{'current'};

        } elsif (param('grab') or param('custom grab')) {

            my $request_data = param('custom grab') ? param('request') : '';
            my ($req,$res) = geturl( param('remoteurl'), $request_data );

            if (param 'custom grab') {
                my $req_url = defined $req->url ? $req->url : "";
                my $request_tag = $req_counter++ . " - " . $req_url ;
                $request{$request_tag} = $r->as_string;
            }

            $log .= i( escapeHTML($req->as_string) ) . $delim;
            $log .= font( {color=>"green"}, escapeHTML($res->as_string) ) . $delim;

            # Separate headers from content.  This part can probably be
            # cleaned up.
            #
            my ($headers,$content) = split( /\n\s*\n/, $res->as_string, 2 );

            $res_headers{'current'} = $headers . "\n";
            $res_content{'current'} = $content;

            my $req_url = defined($req->url) ? $req->url : "";
            my $response_tag = "# " . $res_counter++ . " - " . $req_url;
            $res_headers{$response_tag} = $res_headers{'current'};
            $res_content{$response_tag} = $res_content{'current'};

            $request{'current'} = $req->as_string;
        }

        # Use the Content-Type header to figure out if the document body
        # can be displayed in browser, or if we should insert a placeholder
        # instead.  This is kludgy.  A later version should clean up this part.
        #
        my $document = $res_headers{'current'} . "\n" if $res_headers{'current'};

        if ( $document =~ m/Content-Type:\s+/i and
             $document !~ m/Content-Type:\s+text/i ) {
            $document .= $nontext;
        } else {
            $document .= $res_content{'current'} ;
        }

        $c->send_basic_header;
        print $c
            header,
            start_html("$Progname Control Panel"),
            h1("$Progname Control Panel"),
            $menubar,
            hr,
            startform("POST", $url."control"),

            p(b("Response Headers and Document Data:")),
            p,
            textarea(
                -name =>'response',
                -value=>$document,
                -force=>1,
                -rows=>12,
                -cols=>75,
                -wrap=>'physical'
            ),
            p,
            "You can ",
            submit("apply"),
            " these edits OR use a sample response: ",

            # Here, we dynamically create a popup menu whose items
            # are the keys of the hash %res_headers, except for the item
            # 'current'.  The keys of %res_headers are set up as the actual
            # sample responses at the end of this script.
            #
            submit('Use sample'),
            popup_menu(
                -name    => 'sample',
                -value   => [
                    grep { $_ ne 'current'; }
                    sort { lc($a) cmp lc($b); }
                    keys %res_headers
                ],
                -default => 'HTML'
            ),

            p,"OR you can grab response data from a remote web server, and use that as is:",
            br,
            textfield(
                -name  => "remoteurl",
                -size  => 60,
                -value => 'http://'
            ),
            submit("grab"),

            p(b("Request Data:")),
            p,"Here you can customize the actual request you use to grab data: ",
            submit("custom grab"),
            p,
            textarea(
                -name =>'request',
                -value=>$request{'current'},
                -force=>1,
                -rows=>12,
                -cols=>75,
                -wrap=>'physical'
            ),
            submit('Use previous request'),
            popup_menu(
                -name    => 'previous_request',
                -value   => [
                    grep { $_ ne 'current'; }
                    sort { lc($a) cmp lc($b); }
                    keys %request
                ],
                -default => 'HTML'
            ),

            p("This daemon will die after $TIMEOUT seconds of idle time. ", br, submit('Shut down now') ),
            endform,
            hr,
            $menubar,

            # This is just for debugging
            h3("Your control panel request looked like this (you can debug the debugger!):"),
            pre(font({-color=>'blue'},escapeHTML($r->as_string))),
            end_html
        ;

        close $c;
        redo LISTEN;

    #--------------------
    # The actual Test Page
    #
    } else {

        # Save the request headers, so that we can use them
        # ourselves if we want to mimic the client
        #
        if (defined $r) {

            # The variable $agent will be the hash-key which identifies
            # clients which sent requests to this daemon.  It will appear
            # in the browser in a pull-down menu, from which the user
            # can select a previous set of request headers.
            #
            my $agent;
            if (defined($r->user_agent) and $r->user_agent ne "") {
                $agent = $r->user_agent ;
            } else {
                $agent = "Unknown";
            }
            $request{$agent} = $r->as_string;

            # Munge request, to make sure we don't inadvertently post
            # a request back to ourselves
            #
            $request{$agent} =~ s#(GET|POST)\s+http://[^/]+(.*)\s+HTTP#$1 http://INSERT_URL$2 HTTP#;
        }

        # Send the document to the browser
        #
        print $c $res_headers{current},"\n",$res_content{current};
        close $c;

        # Log this transaction
        #
        $log .= font( {color=>"blue"}, i( escapeHTML($r->as_string) ) ) . $delim;

        # Use the Content-Type header to figure out if the document body
        # can be displayed in browser, or if we should insert a placeholder
        # instead.  This is kludgy.  A later version should clean up this part.
        #
        my $document = $res_headers{current} . "\n";
        if ( $res_headers{current} =~ m/Content-Type:/i and
             $res_headers{current} !~ m/Content-Type:\s+text/i ) {
            $document .= $nontext . "\n"
        } else {
            $document .= $res_content{current}
        }
        $log .= escapeHTML($document) . $delim;

        redo LISTEN;
    }
}

#----------------------------------------------
# The sample test pages.
# Put these into a begin block so that they will be
# defined before the rest of the code executes.
#
BEGIN {

    $Progname = "HTTP Debugger";
    $nontext  = "[non-text data]";

    # The menubar, used on almost every page
    #
    $menubar =
        a({href=>'/info'},"Help") . ' - ' .
        a({href=>'/control'},"Control Panel") . ' - ' .
        a({href=>'/log'},"Log") . ' - ' .
        a({href=>'/'},"Test")
    ;

    #----------------------------------
    $res_headers{'HTML'} =< $Progname

Hello client! I'm an HTML file.

Control Panel EOM #---------------------------------- $res_headers{'HTML form'} =< $Progname

$Progname - test HTML form

$menubar

See the NOTE.

POST Form

GET Form

NOTE: Values here are not sticky. This is not a comboform, it's just a plain old-fashioned HTML form with no fancy tricks from CGI.pm. It's just here as a sample form so you get the idea of how this debugger lets you view HTTP transactions.

After submitting, go to the log page to check the results. Your request will appear in blue text.

$menubar EOM $res_content{'HTML form'} .= "

" . ("\n") x 15 . "

\n\n"; #---------------------------------- $res_headers{'text'} =< 302 Found

Found

The document has moved here.

EOM #---------------------------------- $res_headers{'401 Unauthorized'} =< 401 Authorization Required

Authorization Required

This server could not verify that you are authorized to access the document requested. Either you supplied the wrong credentials (e.g., bad password), or your browser doesn't understand how to supply the credentials required.

EOM #---------------------------------- $res_headers{'403 Forbidden'} =< 403 Forbidden

Forbidden

You don't have permission to access /offlimits on this server.

EOM #---------------------------------- $res_headers{'404 Not Found'} =<404 Not Found

404 Not Found

The requested URL was not found on this server. EOM #---------------------------------- $res_headers{'500 Server Error'} =<500 Server Error

500 Server Error

The server encountered an internal error or misconfiguration and was unable to complete your request.

Please contact the server administrator, webmaster\@op.net and inform them of the time the error occurred, and anything you might have done that may have caused the error.

Error: HTTPd: malformed header from script /cgi-bin/myscript.cgi EOM #---------------------------------- my $gifdata =<What is this thing?

This is a tool to help you debug HTTP transactions. It uses both the HTTP server and HTTP client functionalities of the LWP bundle. It helps you easily mimic and tweak transactions between servers and clients. You operate this program using a Web browser.

When you launch this program from the command line, it becomes a tiny HTTP daemon. For example, if you launch this program with the parameter "-p 8080", then it will become a Web server on port 8080. You can then access it using a browser at the URL "http://host.domain.com:8080/c". The page that you will see is a control panel for the program.

With any other URL besides "/c" (and a few other paths), this little server will only print out a brief test page (i.e., test headers and a test document). From the control panel, you can specifically adjust the test headers and the test document that the server (this program) sends to the client (something else), and then watch how the client responds.

All transactions are logged, and you can view these transaction logs right from the browser, by using the path "/l" or "/log".

You can use the debugger's HTTP client functionality to interact with a remote web server. From the control panel, you can specify a URL, and the debugger (as HTTP client) will send that request to a remote Web server and save the response headers and document. If you want, you can manually adjust the header data and request lines that the HTTP client uses during this transaction.

After fetching a document like this, the debugger's server functionality can immediately use this information to mimic that remote server. In this way, you can very easily simulate the interactions between a remote server and a remote client, by just making your little server behave exactly like the remote server.

You can very carefully tweak the headers and document data that you are sending and receiving. This can be useful for locating otherwise obscure errors.

The debugger has a built-in timeout, which by default is 180 seconds. This helps prevent you from launching the HTTP daemon and then forgetting that it's running, which could be a security issue. When you launch the program from the command line, use the -t option to specify a timeout (in seconds). The program will exit after that number of seconds of idle time.
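
The idle timeout relies on Perl's alarm(): the listen loop re-arms the timer each time it goes back to waiting for a connection, so the process is only terminated after a full quiet period. Below is a minimal, self-contained sketch of the same watchdog idea. The helper name run_with_deadline is hypothetical and not part of the debugger, and unlike the debugger (which simply lets the alarm signal end the process) this sketch catches the signal so that it can report what happened.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only (not part of the debugger).
# Runs $work under a watchdog of $seconds.
# Returns 1 if $work finished before the timer fired, 0 otherwise
# (including when $work itself died).
sub run_with_deadline {
    my ($seconds, $work) = @_;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($seconds);    # arm the timer
        $work->();
        alarm(0);           # disarm: the work finished in time
        1;
    };
    alarm(0);               # make sure no timer is left pending
    return $ok ? 1 : 0;
}

# A quick task finishes well inside a 5 second window.
print run_with_deadline(5, sub { 1 }), "\n";

# A 3 second wait under a 1 second watchdog is cut short.
print run_with_deadline(1, sub { select(undef, undef, undef, 3) }), "\n";
```

Calling alarm() again before the timer fires replaces the pending timer, which is why re-arming it at the top of every loop pass turns it into an idle timeout rather than a total-lifetime limit.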

The Log page

The debugger has a log page, where it records the data transferred (both headers and data) during HTTP transactions. On the log page, this is the color scheme:

Remote client (blue italics) <-> Local server (black roman)
Local client (black italics) <-> Remote server (green roman)

Headers and data are all the same color. They are separated by two newlines, of course.
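
To make the "two newlines" rule concrete, here is a small sketch of splitting a raw message into its header block and its body at the first blank line. It uses the same pattern the debugger applies to responses (/\n\s*\n/ with a split limit of 2). The function name split_message and the sample message are assumptions made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: split a raw HTTP message
# into (headers, body) at the first blank line.  The limit of 2 keeps
# any later blank lines inside the body.
sub split_message {
    my ($raw) = @_;
    my ($headers, $body) = split /\n\s*\n/, $raw, 2;
    $headers = '' unless defined $headers;   # empty input
    $body    = '' unless defined $body;      # message with no body
    return ($headers, $body);
}

my $raw = "HTTP/1.1 200 OK\nContent-Type: text/plain\n\nHello\n\nworld\n";
my ($headers, $body) = split_message($raw);

print "HEADERS:\n$headers\n";
print "BODY:\n$body";
```

Because only the first blank line acts as the separator, a document body may itself contain blank lines without being cut short.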

The debugger does not log transactions made when it serves up the control panel, the log page, or this help page.

Special URLs

Below is a list of all the URLs that are "special" for this Web server:

Control panel: /c  /con /cons /console /control
Log page:      /l  /log
Help page:     /i  /info /h /help /q

Any other URLs will result in the sending of the test page as a response.
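
The sketch below shows how a request path can be sorted into one of these pages. The three patterns are the ones the debugger's listen loop tests, in the same order; the function name classify_path and its return labels are assumptions for this illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: map a request path to the
# page that would be served.  The patterns are checked in this order:
# log, help, control panel, and finally the test page as the fallback.
sub classify_path {
    my ($path) = @_;
    return 'log'     if $path =~ /(^\/+log$|^\/+l$)/;
    return 'help'    if $path =~ /(help|info|^\/+i$|^\/+q$|^\/+h$)/;
    return 'control' if $path =~ /(control|console|^\/+cons?$|^\/+c$)/;
    return 'test';
}

for my $path ('/c', '/con', '/log', '/q', '/', '/index.html', '/logs') {
    printf "%-12s => %s\n", $path, classify_path($path);
}
```

Note that the short forms (/c, /l, /i and so on) are anchored and must match the whole path, while the words help, info, control and console match anywhere in the path, so a path such as /my/help/page would also reach the help page.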

Do I really need this thing?

Maybe not. You can do practically all of these things from the command line using netcat. But it's a lot trickier that way, especially if you are not a die-hard command-line jockey. This interface is certainly faster, and it keeps a nice handy log of all transactions. Plus it has pretty colors. :-)

Complaints, suggestions, improvements

Please send mail to the author at jpnolan@sonic.net.