Apache HTTPD good logging

1. Apache proxy server logging
- 1.1. Apache configuration
  - 1.1.1. On proxy host
  - 1.1.2. On proxied host
- 1.2. Format legend

1 Apache proxy server logging

There are two main reasons for messing around with Apache's default log style:

- You're proxying requests and want to propagate the original client IP address down the stack.
- You want to parse logs programmatically, which is nigh impossible in the default setup.

Here's an opinionated setup which works.

1.1 Apache configuration

1.1.1 On proxy host

# Global configuration:

LogFormat "%t ^_%a ^_%l ^_%u ^_\"%r\" ^_%>s ^_%b ^_\"%{Referer}i\" ^_\"%{User-Agent}i\" ^_%{UNIQUE_ID}e" combined_single_vhost
LogFormat "%t ^_%{Host}i ^_%a ^_%{SSL_PROTOCOL}x ^_%l ^_%u ^_\"%r\" ^_%>s ^_%b ^_\"%{Referer}i\" ^_\"%{User-Agent}i\" ^_%{UNIQUE_ID}e" combined_multiple

<Macro Unique_Header>
	RequestHeader set X-bobs-request-id "%{UNIQUE_ID}e"
	Header add X-bobs-request-id "%{UNIQUE_ID}e"
</Macro>

<Macro Log_To $name>
	CustomLog "/var/log/httpd/$name-access.log" combined_single_vhost
	ErrorLog "/var/log/httpd/$name-error.log"
</Macro>

CustomLog "/var/log/httpd/default-access.log" combined_multiple
ErrorLog "/var/log/httpd/default-error.log"

#
# ... then, later, in vhost config:
#

Use Unique_Header
Use Log_To mysite

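Why two headers? RequestHeader sets the ID on the request forwarded to the proxied host, which can then log it as %{X-bobs-request-id}i. Header adds the same ID to the response returned to the client.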

1.1.2 On proxied host

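Note the directives RemoteIPInternalProxy and RemoteIPTrustedProxy. They differ in whether private (RFC 1918-style) addresses reported in X-Forwarded-For are accepted: a proxy listed as internal may present them, while one listed merely as trusted may not. Read the mod_remoteip documentation.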

LoadModule remoteip_module modules/mod_remoteip.so
RemoteIPHeader X-Forwarded-For
RemoteIPInternalProxy 10.1.2.3
RemoteIPTrustedProxy 172.16.17.18

LogFormat "%t ^_%a ^_%l ^_%u ^_\"%r\" ^_%>s ^_%b ^_\"%{Referer}i\" ^_\"%{User-Agent}i\" ^_%{X-bobs-request-id}i" combined_single_vhost_proxied
LogFormat "%t ^_%{Host}i ^_%a ^_%{SSL_PROTOCOL}i ^_%l ^_%u ^_\"%r\" ^_%>s ^_%b ^_\"%{Referer}i\" ^_\"%{User-Agent}i\" ^_%{X-bobs-request-id}i" combined_multiple_proxied

Note and beware the intersprinkled ^_'s, which are explained below.
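Since both hosts log the same request ID as their last field, lines from the proxy and the proxied host can be matched up afterwards. A minimal sketch of that idea, assuming the log files contain the real unit separator (0x1F) that ^_ stands for; the function names are made up for illustration:

```python
from typing import Iterable, List, Tuple

UNIT_SEPARATOR = "\x1f"  # the real character that ^_ stands for


def request_id(line: str) -> str:
    """Return the last field of a log line, where the formats above put the request ID."""
    return line.rstrip("\n").split(UNIT_SEPARATOR)[-1].strip()


def pair_by_request_id(
    proxy_lines: Iterable[str], proxied_lines: Iterable[str]
) -> List[Tuple[str, str]]:
    """Pair each proxy-host line with the proxied-host line sharing its request ID."""
    proxied_by_id = {request_id(line): line for line in proxied_lines}
    pairs = []
    for line in proxy_lines:
        rid = request_id(line)
        # Apache logs "-" when the variable or header is missing; never match on that.
        if rid != "-" and rid in proxied_by_id:
            pairs.append((line, proxied_by_id[rid]))
    return pairs
```

This keeps the whole proxied-host log in memory, which is fine for a quick investigation but probably not for large files.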

1.2 Format legend

Complete listing at httpd.apache.org, relevant part replicated below:

Format	Description
`%t`	Timestamp
`%a`	Client IP address
`%l`	Logname (useless, but ….)
`%u`	Username or "-"
`%r`	The request
`%>s`	Final status
`%b`	Size of response
`%{hdr}i`	Contents of header "`hdr`"
`%{var}e`	Content of env. variable "`var`"

Note that the fields are separated by a space (0x20) followed by an ASCII unit separator (0x1F). The ^_ in the configuration above is only a display code for that character; to turn it into the real thing, pipe the configuration through e.g. perl -pe 's/\^_/@{[chr 31]}/g;'.
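For those who would rather not use Perl, the same substitution could be sketched in Python. The function name and the shortened LogFormat line are made up for illustration:

```python
UNIT_SEPARATOR = chr(31)  # ASCII unit separator, 0x1F


def materialize_separators(display_text: str) -> str:
    """Replace every two-character ^_ display code with a real 0x1F character."""
    return display_text.replace("^_", UNIT_SEPARATOR)


# A shortened LogFormat line in the display notation used in this document.
# "example_format" is a hypothetical format name.
shown = 'LogFormat "%t ^_%a ^_%>s" example_format'
actual = materialize_separators(shown)
```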

The unit separator is non-printable, so it (mostly) plays nice with visual inspection of log files. It also makes it possible to parse a log file programmatically, which is hard to do reliably otherwise, since fields may contain spaces, newlines, and, well, virtually anything.
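As a rough illustration of what parsing looks like with this setup, here is a minimal Python sketch for the combined_single_vhost format. The field names and the sample line are invented for the example; only the field order comes from the LogFormat above.

```python
from typing import Dict, Sequence

UNIT_SEPARATOR = "\x1f"

# Labels for the combined_single_vhost fields, in LogFormat order.
# The labels are illustrative; Apache does not define them.
SINGLE_VHOST_FIELDS = (
    "time", "client_ip", "logname", "user", "request",
    "status", "size", "referer", "user_agent", "request_id",
)


def _unquote(field: str) -> str:
    """Drop the padding space and one pair of surrounding double quotes, if present."""
    field = field.strip(" ")
    if len(field) >= 2 and field[0] == '"' and field[-1] == '"':
        return field[1:-1]
    return field


def parse_line(line: str, field_names: Sequence[str] = SINGLE_VHOST_FIELDS) -> Dict[str, str]:
    """Split one access-log line on 0x1F and label the resulting fields."""
    raw_fields = line.rstrip("\n").split(UNIT_SEPARATOR)
    if len(raw_fields) != len(field_names):
        raise ValueError(f"expected {len(field_names)} fields, got {len(raw_fields)}")
    return dict(zip(field_names, (_unquote(f) for f in raw_fields)))


# A made-up log line, assembled here so the separators are visible in the source.
sample = UNIT_SEPARATOR.join([
    "[10/Oct/2000:13:55:36 -0700] ", "192.0.2.7 ", "- ", "- ",
    '"GET /index.html HTTP/1.1" ', "200 ", "2326 ", '"-" ', '"curl/8.0" ',
    "made-up-request-id",
]) + "\n"
record = parse_line(sample)
```

Because the fields are stripped after splitting, it does not matter whether the padding space ends up before or after the separator.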

A request containing a separator character will show up in the logs like so:

... "GET /wat\x1fwat HTTP/1.1" 400 ...
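In other words, the logged request contains the four ordinary characters backslash, x, 1, f rather than a real separator, so splitting on 0x1F still yields the expected number of fields. A small sketch of that check, using a shortened, made-up line with three fields:

```python
UNIT_SEPARATOR = "\x1f"

# Raw string: backslash, x, 1, f stay four ordinary characters, as in the excerpt above.
escaped_request = r'"GET /wat\x1fwat HTTP/1.1"'

# A shortened, made-up log line: timestamp placeholder, request, status.
line = UNIT_SEPARATOR.join(["[ts] ", escaped_request + " ", "400"])

fields = [field.strip() for field in line.split(UNIT_SEPARATOR)]
```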

Separators (indeed, any Unicode character) can be inserted in Emacs with C-x 8 <RET> 1f <RET>, or in Vim (from insert mode) with ^v u 001f.