Analyze Apache Logs With R

I am trying to turn over a new leaf and learn more about statistics. To put my new skills to use, I want to show you how to analyze your Apache web logs using R.

Download R

I am going to be using R because it is free, open source and has a large community backing it. Download the latest version of R from http://cran.r-project.org.

Get Your Apache Log File

Log into your web server and download your Apache access log file(s) to your local computer, or wherever you have R installed. I recommend merging your Apache log files together so the analysis covers all of your traffic rather than a single slice of it. If you serve CSS and images from separate virtual hosts, merge those virtual host log files as well.
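
If you would rather do the merge inside R, here is a minimal sketch of one way to go about it: read the lines of each file and write them back out as a single file. The file names and log lines below are made up so the example runs on its own; point log_dir at the folder holding your own downloaded logs instead. It assumes all of the files use the same log format.

```r
# Hypothetical example: two small log files created in a temp directory
# stand in for the access logs downloaded from the server.
log_dir <- tempfile("logs")
dir.create(log_dir)
writeLines('example.com:80 192.0.2.1 - - [10/Oct/2012:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
           file.path(log_dir, "access.log"))
writeLines('static.example.com:80 192.0.2.1 - - [10/Oct/2012:13:55:37 +0000] "GET /style.css HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
           file.path(log_dir, "access.log.1"))

# Collect every file whose name starts with "access.log"
log_files <- list.files(log_dir, pattern = "^access\\.log", full.names = TRUE)

# Read all lines from each file and combine them into one character vector
all_lines <- unlist(lapply(log_files, readLines))

# Write the merged log to a single file for later parsing
merged_file <- file.path(log_dir, "merged_access.log")
writeLines(all_lines, merged_file)
```

The merged file name does not start with "access.log", so running the snippet a second time will not pull the merged output back into itself.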

Parse Apache Log Files With R

I am using a standard Ubuntu Apache 2 configuration. Let's first examine the access log file.

access_log <- read.table(file="C:\\Users\\windoze\\Documents\\R\\data\\other_vhosts_access.log")
access_log[1,] # Display the first row to see the columns of the access_log data frame

This is the easiest way to parse the log file into a data frame for analysis.
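
The default column names (V1, V2, ...) are hard to remember, so here is a sketch of the same parse with readable names. It assumes the log is in Ubuntu's "vhost_combined" format, which is what other_vhosts_access.log normally uses; check the LogFormat line in your Apache configuration. The sample log lines and the column labels are my own inventions for the example.

```r
# Made-up sample lines in the "vhost_combined" format (an assumption about
# the log format; adjust the column names if your LogFormat differs).
sample_lines <- c(
  'example.com:80 192.0.2.1 - - [10/Oct/2012:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
  'example.com:80 192.0.2.2 - - [10/Oct/2012:13:56:01 +0000] "GET /missing HTTP/1.1" 404 503 "-" "Mozilla/5.0"'
)
log_file <- tempfile(fileext = ".log")
writeLines(sample_lines, log_file)

# quote = "\"" treats only double quotes as quoting characters, so an
# apostrophe in a user agent does not start a quoted field;
# comment.char = "" keeps a "#" in a URL from cutting the line short.
access_log <- read.table(log_file, quote = "\"", comment.char = "",
                         stringsAsFactors = FALSE)

# Descriptive column names (hypothetical labels chosen for this example).
# The timestamp splits into two columns because it contains a space.
colnames(access_log) <- c("vhost", "client_ip", "ident", "user", "time",
                          "tz", "request", "status", "bytes", "referer",
                          "user_agent")

access_log[1, ]  # first row, now with readable column names
```

With these names, access_log$status is the same column as access_log[,8].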

R Bar Chart Of Apache HTTP Codes

Here is a nice visual breakdown of the HTTP codes your application is serving.

table(access_log[,8]) # Text breakdown: number of requests per HTTP code served.
barplot(table(access_log[,8])) # Bar plot of the same counts.

[Figure: R Barplot of Apache HTTP Codes]
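
If you want the chart to be a little easier to read, here is a sketch that sorts the bars, labels the axes, and also reports each code's share of the traffic. The status vector is made-up data standing in for access_log[,8]; the title and axis labels are just my choices.

```r
# Made-up status codes standing in for access_log[,8]
status <- c(200, 200, 200, 304, 404, 200, 500, 404)

status_counts <- table(status)              # number of requests per HTTP code
status_share  <- prop.table(status_counts)  # fraction of all requests per code

# Bar plot with a title and axis labels; sort() orders the bars
# from the most frequent code to the least frequent
barplot(sort(status_counts, decreasing = TRUE),
        main = "HTTP status codes served",
        xlab = "HTTP status code", ylab = "Number of requests")

round(100 * status_share, 1)  # percentage of requests per code
```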

Later, I will demonstrate how to extract more information from your access log, such as the number of unique users visiting your site and the busiest time of the week for your website.