AWStats on IIS 5 with Old Log Files

I wanted to use AWStats to analyse my IIS 5 W3C-format log files. I got the program up and running with the aid of these instructions, plus one trick: I told it that the port field (always 80) was actually the bytes-sent field, because for some reason it won’t run unless that information is in the log file. Even then, I could only get it to parse the most recent log file.

I looked at the instructions for parsing old log files, but they involve issuing a command at the command line for each file, and I have logs that go back to 2001. So in the end I wrote this Perl script to issue all the necessary commands:

#! c:/perl/perl.exe

sub main {

    my $dir = "C:/WINNT/system32/LogFiles/W3SVC1";

    # Two-digit years as used in IIS file names (exYYMMDD.log); widen the range as needed
    for (my $year = 1; $year <= 2; $year++) {
        for (my $month = 1; $month <= 12; $month++) {
            for (my $date = 1; $date <= 31; $date++) {
                my $file = $dir . "/" . get_filename($year, $month, $date);
                if (-e $file) {
                    # Quote the log file path in case it contains spaces
                    my $cmd = "f:/inetpub/wwwroot/awstats/cgi-bin"
                        . "/awstats.pl -config=bluebones.net -LogFile=\""
                        . $file . "\" -update";
                    print "$cmd\n";
                    system($cmd);
                } else {
                    print $file . " does not exist\n";
                }
            }
        }
    }
}

sub get_filename {

    my ($year, $month, $date) = @_;

    my $filename = "ex" . pad($year) . pad($month) . pad($date) . ".log";

    return $filename;
}

sub pad {

    my $num = pop;

    if ($num < 10) {
        $num = "0" . $num;
    }

    return $num;
}

main();

I had to stop the script partway through, at the date when I added the referer field to the log file format, alter the conf file manually, and then start it running again. But I got there in the end.
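One way to spot such a change before starting is to compare the #Fields: directive that IIS writes at the top of each W3C log file. This is a minimal sketch, assuming each file keeps one format throughout; the helper names first_fields_line and format_changes are my own and are not part of AWStats:

```perl
use strict;
use warnings;

# Return the first "#Fields:" directive found in a log file, or undef.
sub first_fields_line {
    my ($file) = @_;
    open(my $fh, '<', $file) or return;
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;
        if ($line =~ /^#Fields:\s*(.*)$/) {
            close $fh;
            return $1;
        }
    }
    close $fh;
    return;
}

# Given file names in date order, return those whose field list differs
# from the previous file that had one.
sub format_changes {
    my @files = @_;
    my ($previous, @changed);
    for my $file (@files) {
        my $fields = first_fields_line($file);
        next unless defined $fields;
        push @changed, $file if defined $previous && $fields ne $previous;
        $previous = $fields;
    }
    return @changed;
}

# exYYMMDD.log names sort into date order as plain strings
print "Format changes at $_\n" for format_changes(sort @ARGV);
```

Passing the log file names as arguments lists the files at which the conf file would need changing, so the batch run can be split at those dates.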

What the world needs is a nice, clean API for log files that comes with parsers that intrinsically understand all the various standard formats. That is, I want to be able to just point the program at any Apache, IIS or other standard log files and have it chomp them all up and let me programmatically get at the data in any way I like (perhaps stick it all in a SQL database?). Crucially, the program should be able to “discover” the format of the log files by looking at the headers, and there should be no configuration (unless you have really weird log files).
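For W3C extended logs the discovery step is already feasible, because the #Fields: directive names every column. Here is a rough sketch, assuming space-separated values with no embedded spaces; the function name parse_w3c_log is hypothetical, and Apache logs carry no such header, so they would need a different heuristic:

```perl
use strict;
use warnings;

# Parse W3C extended log lines into hash references keyed by field name.
# Column names come from the most recent "#Fields:" directive, so a format
# change partway through the input is picked up automatically.
sub parse_w3c_log {
    my ($fh) = @_;
    my (@fields, @records);
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;
        if ($line =~ /^#Fields:\s*(.*)/) {
            @fields = split ' ', $1;
            next;
        }
        # Skip other directives, blank lines, and data seen before any header
        next if $line =~ /^#/ || $line !~ /\S/ || !@fields;
        my @values = split ' ', $line;
        next unless @values == @fields;    # skip malformed lines
        my %record;
        @record{@fields} = @values;
        push @records, \%record;
    }
    return @records;
}

# Example: count requests per status code across the files named on the command line
my %hits;
for my $file (@ARGV) {
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    $hits{ $_->{'sc-status'} // 'unknown' }++ for parse_w3c_log($fh);
    close $fh;
}
print "$_: $hits{$_}\n" for sort keys %hits;
```

Once every record is a hash keyed by field name, loading the data into a SQL database or feeding a report generator no longer depends on the column order of the original file.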

Then people can write beautiful graphical reports for this API and everyone can use them regardless of the format that the original log files were in. Surely someone has thought of this before? I’ve put it on my todo list.

