Apache Server Get Actual Client IP Address (X-Forwarded-For) Behind Proxy or Load Balancer

Having a load balancer or a proxy server for reverse proxying is a fairly common setup these days. A slightly busy web application would be deployed on multiple servers with a load balancer distributing the incoming requests. There are also setups involving a server that’s reverse proxying requests to one or more internal servers/services. In either case when the request finally lands on httpd, it doesn’t really know who the original client is and hence if you try to resolve the IP address in the server or your application code, you’ll get the proxy’s or load balancer’s IP address.

Request with load balancer in between.
1. Useragent sends request to load balancer (Client IP: 89.249.xxx.yy)
2. Load balancer forwards to Apache servers (Client IP: 192.168.x.y)

Resolving the original client’s IP address can be really important for reasons like analytics, access control, logging, etc.

So what’s the solution then?

mod_remoteip

Thankfully the smart folks working on the Apache Web Server have already thought about this and built a module called mod_remoteip that can be used to resolve the IP of the original client instead of the proxy. How does it do that? Before I answer that, we must understand the Forwarded HTTP Extension.

Forwarded Header

RFC 7239 (Forwarded HTTP Extension) defines a standard Forwarded header that a proxy can use to relay client information like IP address, requested hostname, request protocol, etc. to the next hop (server or proxy). The RFC makes the header optional, and in the real world the non-standard X-Forwarded-For header is far more widely used. For instance, if Apache is used as a reverse proxy (mod_proxy), it passes on three different headers to the next hop by default:

  1. X-Forwarded-For – List or chain of IP addresses where the left most address belongs to the originating client and each address after it belongs to a proxy the request has passed through. Every proxy appends the address of the peer it received the request from, so the right most entry is the hop just before the most recent proxy. The most recent proxy’s own address is not in the list; the next hop sees it as the source address of the connection.
  2. X-Forwarded-Host – Contains the host requested by the original client (the user or useragent).
  3. X-Forwarded-Server – Gets overwritten by each proxy and contains the current proxy’s server name. If the proxy were Apache then it’d simply slap its ServerName value in it.

The most important piece here is X-Forwarded-For. If we trust the intermediate proxies/load balancers (more on this in a bit), then we can just extract the original IP from this header.

Usage

Let’s look at how mod_remoteip helps us retrieve the original client IP instead of the proxy’s. So with a proxy in place, our Apache server would be getting this in the header:

X-Forwarded-For: <client_ip>, <proxy1_ip>, <proxy2_ip>

Once mod_remoteip is enabled, we must specify which header Apache is supposed to read the original IP from. For this we use the RemoteIPHeader directive.

# Syntax: RemoteIPHeader header-field
RemoteIPHeader X-Forwarded-For

This particular configuration setting tells Apache that it must parse X-Forwarded-For header field to retrieve the original client’s IP Address. Apache HTTP server will parse this header from right to left (remember it can be a list of IPs). It’ll extract every (proxy) IP from the right, check if it can be trusted to present the preceding IP (one on the left) and strip it off the header. By default all intermediate proxies are trusted.

The right most IP in the list is appended by the final proxy server in the chain. The last proxy’s IP is the one that httpd receives the connection from, so the trust check actually begins from there. In the end, if every hop was trusted, httpd overrides its client IP with the left most one and removes the header from the request altogether.
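
To make the right-to-left walk concrete, here’s a minimal sketch of the setup with a worked trace in the comments. The module path and all the addresses are made up for illustration (203.0.113.7 stands in for the client, 192.168.1.x for the proxies), and the trace assumes every hop is trusted.

```apache
# Load mod_remoteip (path is an assumption; on Debian/Ubuntu you would
# typically run "a2enmod remoteip" instead)
LoadModule remoteip_module modules/mod_remoteip.so

RemoteIPHeader X-Forwarded-For

# Hypothetical request as it reaches httpd:
#   connection from:  192.168.1.30   (last proxy)
#   X-Forwarded-For:  203.0.113.7, 192.168.1.10, 192.168.1.20
#
# Walk, right to left:
#   192.168.1.30 is trusted -> accept 192.168.1.20
#   192.168.1.20 is trusted -> accept 192.168.1.10
#   192.168.1.10 is trusted -> accept 203.0.113.7
#
# Result: client IP = 203.0.113.7, X-Forwarded-For removed
```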

So now if you try to get the client IP address in your application ($_SERVER['REMOTE_ADDR'] in PHP) or log it in access logs with the %a format string, you’ll get the IP address of the actual client and not of the intermediate proxies. However if you want to log the last proxy’s IP as well, the %{c}a format string does the job.
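
As a rough sketch, a log format that records both addresses could look like this. The format nickname (proxied) and the log path are placeholders, not anything mod_remoteip requires.

```apache
# %a     -> client IP as resolved by mod_remoteip
# %{c}a  -> peer IP of the underlying connection (the last proxy)
LogFormat "%a %{c}a %l %u %t \"%r\" %>s %b" proxied
CustomLog "logs/access_log" proxied
```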

mod_remoteip also stores the list of trusted intermediate proxies in a request note called remoteip-proxy-ip-list that can be logged using the %{remoteip-proxy-ip-list}n format string. The same can be made available as a header using the RemoteIPProxiesHeader directive.

# X-Forwarded-By headers are widely used to store
# a list of IP addresses of the intermediate proxies'
# network interface that received the client connection
#
# Syntax: RemoteIPProxiesHeader header-field
RemoteIPProxiesHeader X-Forwarded-By

Now in your application you could just get the value of this header to access the list of trusted intermediate proxies. For instance in PHP:

// The header is absent when no proxies were recorded, hence the fallback
echo $_SERVER['HTTP_X_FORWARDED_BY'] ?? '';
// or
echo getallheaders()['X-Forwarded-By'] ?? '';
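
If you’d rather see the proxy list in the access log than in your application, something along these lines should work. The format nickname (proxychain) and the log path are placeholders.

```apache
RemoteIPHeader X-Forwarded-For

# %a                          -> resolved client IP
# %{remoteip-proxy-ip-list}n  -> note holding the recorded proxies
LogFormat "%a via [%{remoteip-proxy-ip-list}n] %t \"%r\" %>s" proxychain
CustomLog "logs/proxy_access_log" proxychain
```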

Trusted proxies

One of the important aspects when resolving IPs off forwarding headers is that we should only rely on “trusted” proxies. Whenever there’s a legit proxy or load balancer in the picture, we know its IP address or address block (subnet). Most setups are in a private network where the proxy subnet is fixed, and we should let Apache do IP resolution only if the incoming request (with the X-Forwarded-For header) is from that IP range, because it’s a part of our own setup and hence a reliable source of information.

If we don’t do this then by default httpd will trust all clients (as we saw earlier) and it’s super easy for a remote client/useragent to impersonate another client/useragent. I could just fire a request with a fake X-Forwarded-For: RANDOM_IP_ADDR and you don’t wanna trust that.

So how do we enforce trust? mod_remoteip gives us the required directives for this.

RemoteIPInternalProxy proxy-ip|proxy-ip/subnet|hostname ...
RemoteIPTrustedProxy proxy-ip|proxy-ip/subnet|hostname ...

We can use RemoteIPInternalProxy or RemoteIPTrustedProxy with a list of IP addresses, address blocks (subnets) or hostnames to name the proxies that Apache should trust while parsing X-Forwarded-For (or whatever header field is specified by RemoteIPHeader).

Now when httpd parses the header list from right to left, as long as it keeps finding IPs that can be trusted to provide the IP on the left (preceding one), it’ll keep removing them from the list. But if it encounters one that cannot be trusted, it’ll stop any further parsing, treat that address as the client IP and leave whatever remains in the header. If the very first check fails, i.e. the connection itself doesn’t come from a trusted proxy, the client IP won’t be overridden at all and you’ll get the address of the connecting host (not the IP claimed in the header).
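
For the load balancer setup from the beginning of this post, a minimal sketch might look like this. The 192.168.1.0/24 subnet is an assumption; substitute whatever range your balancers actually live in.

```apache
RemoteIPHeader X-Forwarded-For

# Only connections coming from the balancer subnet (hypothetical range)
# are allowed to tell us who the client is
RemoteIPInternalProxy 192.168.1.0/24

# A request sent straight to httpd from anywhere else with a forged
# X-Forwarded-For keeps the sender's real address as the client IP
```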

The differences between RemoteIPInternalProxy and RemoteIPTrustedProxy are:

  1. A proxy listed with the former is trusted to present any kind of IP – public or private. A proxy listed with the latter is trusted only for public IPs: if it presents a private or loopback address (10/8, 172.16/12, 192.168/16, 169.254/16, 127/8), that address is not accepted as the client IP and stays in the header.
  2. The RemoteIPProxiesHeader’s header field (X-Forwarded-By) discards Internal proxies and only records Trusted proxies. Hence if your proxies are public and you want them to be tracked in the proxies header (and the remoteip-proxy-ip-list note), always use RemoteIPTrustedProxy over the former.
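
Here’s a sketch of the second point, assuming the proxies sit at public addresses. 203.0.113.0/24 is a documentation-only range used as a stand-in.

```apache
RemoteIPHeader X-Forwarded-For

# External proxies (stand-in range): trusted, and recorded in the
# proxies header / remoteip-proxy-ip-list
RemoteIPTrustedProxy 203.0.113.0/24

# Expose the recorded proxies to the application
RemoteIPProxiesHeader X-Forwarded-By
```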

There’s a slight variation of these directives that accepts files, just in case you’ve got a huge list of proxies that you want trusted:

RemoteIPInternalProxyList filepath
RemoteIPTrustedProxyList filepath

The file must contain one entry (IP address, address block or hostname) per line. A sample from the docs:

# Our internally trusted proxies;
10.0.2.0/24         # Everyone in the testing group
gateway.localdomain # The front end balancer
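
Assuming the sample above is saved as conf/trusted-proxies.lst (relative to the server root; the file name is just an example), it can be wired in like so:

```apache
RemoteIPHeader X-Forwarded-For

# Read trusted internal proxies from the file, one entry per line
RemoteIPInternalProxyList conf/trusted-proxies.lst
```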

Conclusion

Interestingly most modern web application frameworks are now equipped to resolve the real client’s IP via X-Forwarded-For when proxies or load balancers come into picture. But doing it at the Apache level is useful when your infrastructure is making use of other modules like mod_log_config, mod_authz_host, mod_evasive, et al. that only work as expected if they’re able to retrieve the originating request IP from httpd’s core.
