A Guide to Apache Virtual Hosts

The default functionality of a brand new Apache web server (httpd) installation is driven by its main server configuration settings. What kind of functionality? For example: which address and port the daemon should bind to and listen on for connections when it starts, which files should be served for incoming requests, which system user and group the daemon should run as, and so on.

With a single instance of the httpd daemon on a machine, if you just want to serve or host a single website, the main server (config) will suffice. But if you have to serve or host more than one website from the same httpd instance, then virtual hosts (config) are needed.

Some of the key points we will cover in this tutorial:

  • What are virtual hosts ?
  • How to use the <VirtualHost> directive ?
  • How are Listen and <VirtualHost> different ?
  • What is the difference between IP-based and Name-based virtual hosts ?
  • Where can you find the virtual host configuration files and how can you debug the current virtual host settings ?

Virtual Host

Virtual hosts are a concept where the Apache server lets you create multiple logical or conceptual “servers” or “hosts” through which you can serve or run multiple websites on a single machine, running a single instance of httpd.

While setting up a website you’ll most likely purchase a domain (foo.com) and set its DNS record to point to your machine’s IP address. This machine will serve files or scripts via a web server (Apache in our case) to all the client connections being made to the machine’s IP address and port. To connect, clients will generally perform a DNS lookup on the domain to resolve the machine’s IP address or somehow have direct access to the IP.

This mapping of requests on an ip:port to resources (like files) on the system is done in the configuration files. You can put these settings in the main server config to start with, but what if you bought two domains (foo.com and bar.com) and wanted to serve both of them from this same machine? How would the main server know which resources to serve for requests to each of the websites? How would it differentiate between the requested websites/hostnames? The main server alone won’t suffice.

This is practically where the concept of virtual hosts kicks in. To serve multiple websites/hostnames, you define multiple virtual hosts inside the main server config. These virtual hosts – one for foo.com and another for bar.com – will define different configurations to handle requests for each hostname separately.

Effectively, virtual hosts or virtual servers are nothing but configuration settings that extend the main server configuration to be able to serve requests for multiple IPs or hostnames in httpd. Let’s look at some pseudo-configuration code which quite resembles the actual httpd configuration.

# MAIN SERVER CONFIG

# Default configuration for Apache's core functioning
#
# A lot of the settings here are also inherited by all the virtual
# hosts as default values.

# httpd will listen on port 80 of all interfaces
Listen 80

# If a request is served by the main server, then it'll
# be served from the documents/files inside the DocumentRoot path
DocumentRoot /var/www/html

# First virtual host
<VirtualHost>
  # Configuration to serve foo.com
  ...
</VirtualHost>

# Second virtual host
<VirtualHost>
  # Configuration to serve bar.com
  ...
</VirtualHost>

...

Apache’s configuration is pretty much laid out like what you see above. It should explain what the main server (config) and virtual hosts (config) look like in practical (or configuration code) terms.

Because virtual hosts let us serve multiple websites, we don’t have to run them on different machines or run multiple instances of httpd, each on a different IP/port. Another way to look at virtual hosts is that the Apache web server lets you break the configuration that drives its functionality into smaller individual units. These units are the virtual hosts, each of which can be configured independently to serve a specific site or domain. Two virtual hosts shouldn’t share the same site, domain or hostname (they should be, say, foo.com and bar.com). But if they do, then they should serve different port numbers (one for foo.com:80 and another for foo.com:8080).
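As a rough sketch of that last point, the same hostname can be served by two virtual hosts as long as they differ by port. The DocumentRoot paths below are hypothetical, and the address syntax inside <VirtualHost ...> is explained in the next section:

```apache
# Main server config: bind to both ports
Listen 80
Listen 8080

# foo.com on port 80
<VirtualHost *:80>
  ServerName foo.com
  # Hypothetical path for the port-80 site
  DocumentRoot /var/www/foo
</VirtualHost>

# foo.com again, but on port 8080
<VirtualHost *:8080>
  ServerName foo.com
  # Hypothetical path for the port-8080 site
  DocumentRoot /var/www/foo-8080
</VirtualHost>
```

A request to http://foo.com/ would be handled by the first block and a request to http://foo.com:8080/ by the second.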

Let’s now try to understand how the <VirtualHost> directive is used to define a virtual host.

<VirtualHost>

The <VirtualHost> block encloses a group of settings that are merged with those from the main server config and then applied only to that particular virtual host. At this point we may wonder whether every directive from the main server config can also be used inside a virtual host. To find out, you can always go to the documentation for a particular directive and check the usage Context stated against it. For instance:

  • For Listen, the stated usage Context is server config only. This means it can only be used inside the main server config and not inside virtual hosts.
  • For ServerName, the stated usage Context is server config, virtual host which means it can be used in both the contexts – main server and virtual host/server.

[Image: httpd ServerName directive context]
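To make the two contexts concrete, here is a small sketch. The main server's ServerName value (localhost) is only an assumption for illustration:

```apache
# Context "server config": Listen belongs here, at the top level
Listen 80

# Context "server config, virtual host": ServerName is allowed here ...
ServerName localhost

<VirtualHost *:80>
  # ... and also here, inside a virtual host
  ServerName foo.com

  # A Listen line at this point would be rejected when the
  # configuration is checked, because its context is server config only
</VirtualHost>
```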

Let’s now look at a simple but real-world <VirtualHost> configuration to see what other configuration directives or settings generally go inside it.

# If the incoming request is trying to connect on port 80
# of any IP address available on this machine, then this virtual host
# will be used to serve it.
<VirtualHost *:80>
  # If the incoming request's HTTP Host header is foo.com
  # then this virtual host will be used to serve the request.
  ServerName foo.com

  # If Apache sends any error message to the client, it will
  # add this email address to it so that the client can contact
  # the server administrator if required.
  # (webmaster@foo.com is a placeholder address.)
  ServerAdmin webmaster@foo.com

  # The directory or path on the machine's file system
  # that will be used to serve files or documents against
  # incoming requests.
  #
  # http://foo.com/index.htm will map to /var/www/html/index.htm
  DocumentRoot /var/www/html

  # Allow clients to access /var/www/html, and let .htaccess files
  # inside it override configuration (AllowOverride All)
  <Directory /var/www/html>
    AllowOverride All
    Require all granted
  </Directory>

  # Set the file to which the server will log any errors it encounters
  ErrorLog ${APACHE_LOG_DIR}/error.log
  # Set the file to which accepted request information will be logged
  CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>

I’ve annotated most of the directives or settings above to help you understand what each of them does. Basically we’re setting up a virtual host that’ll be used by Apache to serve incoming requests to any IP on port 80 with the Host header containing foo.com. The files or documents from /var/www/html will be served for the website. Any errors encountered and general information about each request handled will be logged to separate files.

Now that we understand how virtual hosts are defined with <VirtualHost> and the directives inside it, let’s dissect the <VirtualHost> directive itself. Here’s its syntax:

<VirtualHost addr[:port] [addr[:port]] ...> ... </VirtualHost>

We can pass multiple addr[:port] or address-port combinations to a single <VirtualHost>. All the addresses together are also known as the address set. When httpd receives a request on a bound address which is also specified in a virtual host, that particular virtual host is chosen to serve the request. For example, if httpd is listening on 143.110.176.71:80 and receives a request there, it’ll pick the following virtual host to serve the request:

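# The port is optional: <VirtualHost 143.110.176.71> would match too.
# (Comments must sit on their own line, not after a directive.)
<VirtualHost 143.110.176.71:80>
  # ... configuration for serving the client requests ...
</VirtualHost>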

A particular address itself can be:

  • An IPv4 or IPv6 address of one of the physical or virtual interfaces of the machine running the httpd daemon, like in the code example above.
  • A fully qualified domain name for an IP address. What this means is that we could specify <VirtualHost foo.com> and Apache would have to perform a DNS lookup to resolve the IP address for the domain. This is not recommended because the DNS lookup could fail, which will lead to the virtual host not getting configured at all (discarded). Hence no requests will be matched to it.
  • The character * that acts as a wildcard for all IP addresses of the machine.
  • The string _default_, which is an alias for *.

The port part of the address is optional. If not specified or specified as *, the virtual host will serve requests to any port on the specified IP.

Based on the different forms of address, let’s see some examples:

# IPv4 with port
<VirtualHost 143.110.176.71:80>
  ...
</VirtualHost>

# IPv6 must be specified in square brackets, with port
# (the brackets separate the address from the optional port)
<VirtualHost [2400:6180:100:d0::b03:5001]:80>
  ...
</VirtualHost>

# Only port 80 on all IPs
<VirtualHost *:80>
  ...
</VirtualHost>

# All IPs, all ports
<VirtualHost _default_>
  ...
</VirtualHost>

# Without the optional port
#
# This will match with requests to all ports,
# ip:80, ip:8080, ip:all
<VirtualHost 143.110.176.71>
  ...
</VirtualHost>

# Not recommended (hostname or FQDN based address)
<VirtualHost codingshower.com>
</VirtualHost>

Multiple <VirtualHost>

The entire point of having virtual hosts was to be able to serve multiple websites/hostnames from the same httpd instance, right ? Like foo.com and bar.com. Let’s see how we can “start” doing that by defining multiple virtual hosts inside our main server config:

# Main server config
Listen 80
...

<VirtualHost 143.110.176.71:80>
 ...
</VirtualHost>

<VirtualHost *:80>
  ...
</VirtualHost>

A request to http://143.110.176.71:80 should make httpd pick the first virtual host, but then the second one also matches it, right? After all, * matches all IPs. Well, in such cases, Apache gives direct or specific IPs higher precedence over wildcard definitions. Hence, in this specific case the *:80 virtual host will not be used for requests arriving on 143.110.176.71:80; it only serves requests that come in on port 80 of the machine’s other addresses.

What do we need to do in order to serve foo.com and bar.com though? We will have to use the ServerName directive to achieve that! After doing the ip-port matching, Apache does further checks by matching the Host header with the ServerName or ServerAlias directives inside a virtual host.

# Main server config
Listen 80
...

<VirtualHost 143.110.176.71:80>
  ServerName foo.com
  DocumentRoot /home/foo/public_html
  ...
</VirtualHost>

<VirtualHost 143.110.176.71:80>
  ServerName bar.com
  # This virtual host will also serve requests for baz.com!
  ServerAlias baz.com
  DocumentRoot /home/bar/public_html
  ...
</VirtualHost>

Great! Now when someone tries to access foo.com or bar.com, as long as their DNS lookup resolves to 143.110.176.71, which should also be the correct IP for the machine running httpd, the respective virtual host will be picked. A few things to note here:

  • If I remove the ServerName/ServerAlias directives from both the virtual hosts, then the first one will be picked for both foo.com and bar.com. Why so? Whenever multiple virtual hosts match on the ip-port combination and none of them matches further on ServerName/ServerAlias, the first virtual host (in order of appearance) from that list is picked to serve the request. This “first” virtual host is also called the default or primary server.
  • What if a request came on the IPv6 IP of the machine for foo.com or bar.com ? Since we are only matching on the IPv4 version, the virtual hosts will not match against the request. When no virtual host matches are found based on the ip-port combination itself, the request is passed on to the main server config to be handled.
  • It might be a better idea to use *:80 instead of an IP:80 because the IP could change tomorrow and you’ll have to then change the configuration as well.
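Following the last note, the two virtual hosts from the example above could be rewritten with a wildcard address. This is only a sketch of that suggestion; the names and paths are the ones already used above:

```apache
# Main server config
Listen 80

# Matches port 80 on every address httpd is listening on
<VirtualHost *:80>
  ServerName foo.com
  DocumentRoot /home/foo/public_html
</VirtualHost>

<VirtualHost *:80>
  ServerName bar.com
  ServerAlias baz.com
  DocumentRoot /home/bar/public_html
</VirtualHost>
```

With this form the configuration no longer needs to change when the machine’s IP does, and requests arriving on any other address httpd listens on (an IPv6 address, for instance) are matched by these virtual hosts as well instead of falling through to the main server.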

As you’d have noticed by now there are a bunch of rules involved in matching a virtual host, like:

  • First-level check of the vhost address (addr[:port]).
  • Second-level check of ServerName/ServerAlias.
  • Requests are passed on to the main server if the first-level check fails.
  • If the first-level or second-level check results in multiple virtual hosts, then the first one is picked, which is also known as the default or primary server.
  • Direct IPs get precedence over wildcards.
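These rules can be traced on the earlier foo.com/bar.com configuration. The sketch below reuses it and adds comments describing which requests each part would handle, assuming these are the only virtual hosts defined (so the wildcard rule does not come into play):

```apache
Listen 80

# Main server: handles requests whose ip:port matches no virtual host,
# e.g. a request arriving on the machine's IPv6 address
DocumentRoot /var/www/html

# First virtual host for 143.110.176.71:80, hence the default for it:
#   Host: foo.com        -> served here (ServerName match)
#   Host: anything else  -> served here (no name matched, first one wins)
<VirtualHost 143.110.176.71:80>
  ServerName foo.com
  DocumentRoot /home/foo/public_html
</VirtualHost>

#   Host: bar.com or baz.com -> served here (ServerName/ServerAlias match)
<VirtualHost 143.110.176.71:80>
  ServerName bar.com
  ServerAlias baz.com
  DocumentRoot /home/bar/public_html
</VirtualHost>
```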

Request matching is a big thing in itself. Hence I decided to write a separate guide on it.

The most important point from all the logic above is that, apart from the ip-port combination, the ServerName/ServerAlias directive also allows us to differentiate between multiple virtual hosts even when they share the same address/port. Hence, when having multiple virtual hosts, we should make sure they are different based on one of these parameters – IP, port, ServerName/ServerAlias.

Instead of foo.com if we were using a subdomain like sub.foo.com or an IP address like 143.110.176.71, the entire Host header matching logic and functionality would still remain the same.
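A short sketch of both variants follows. The DocumentRoot path of the subdomain is hypothetical:

```apache
<VirtualHost *:80>
  # Matches requests sent with "Host: sub.foo.com"
  ServerName sub.foo.com
  # Hypothetical path
  DocumentRoot /home/foo/sub_html
</VirtualHost>

<VirtualHost *:80>
  # Matches requests made directly to the IP, e.g. http://143.110.176.71/
  ServerName 143.110.176.71
  DocumentRoot /var/www/html
</VirtualHost>
```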

Listen vs VirtualHost

It is extremely important to understand that the address set provided to <VirtualHost> is only used for “matching” against the machine addresses (and optional ports) on which client connections are made and incoming requests received. It does not “instruct” Apache to listen on or bind to those addresses and/or ports.

Making or instructing httpd to listen on a particular IP address or a port is the job of the Listen directive.
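The split in responsibilities might look like this in practice. Port 9090 is an arbitrary, hypothetical choice used to show a virtual host that can never be reached:

```apache
# Listen opens the sockets: httpd accepts connections on ports 80 and 8080
Listen 80
Listen 8080

# Matches connections that arrive on port 8080
<VirtualHost *:8080>
  ServerName foo.com
  DocumentRoot /home/foo/public_html
</VirtualHost>

# There is no "Listen 9090", so no connection ever arrives on that port
# and this virtual host is never selected
<VirtualHost *:9090>
  ServerName bar.com
  DocumentRoot /home/bar/public_html
</VirtualHost>
```

Adding a Listen 9090 line to the main server config is what would make the second virtual host reachable.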

IP-Based vs Name-Based Virtual Hosts

Although this may not be super important, Apache further divides virtual hosts into two categories (concepts):

  • IP-Based virtual hosts
  • Name-based virtual hosts

When you create one virtual host for an ip-port combination, it is called an IP-based virtual host. Whereas when you create multiple virtual hosts sharing the same combination, they’re all called name-based virtual hosts.
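Here is a sketch of both kinds side by side. It assumes the machine has a second address, 203.0.113.10, which is a made-up IP used only for illustration, as is the path for baz.com:

```apache
Listen 80

# IP-based: the only virtual host defined for 203.0.113.10:80
<VirtualHost 203.0.113.10:80>
  ServerName foo.com
  DocumentRoot /home/foo/public_html
</VirtualHost>

# Name-based: two virtual hosts share 143.110.176.71:80 and are
# told apart by the Host header of the request
<VirtualHost 143.110.176.71:80>
  ServerName bar.com
  DocumentRoot /home/bar/public_html
</VirtualHost>

<VirtualHost 143.110.176.71:80>
  ServerName baz.com
  # Hypothetical path
  DocumentRoot /home/baz/public_html
</VirtualHost>
```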

In this article, you can explore this concept in further detail if you’d really like, with a bunch of examples.

Config Location

One last piece that remains is, where can we actually find all the existing virtual hosts configuration on a machine. For that I’ve created a reference here. The reference covers the main server config and virtual host config file locations for the default Apache installation on most popular operating systems like Ubuntu, Debian, Fedora, CentOS, FreeBSD, macOS, etc.

A quick way to get a dump of active virtual hosts and a bit of useful information about which one is the default server, which ones are named virtual hosts, what are the hostnames (ServerName, ServerAlias) in use, etc. is to use the following command:

# Dump the parsed vhost settings
$ apachectl -D DUMP_VHOSTS
VirtualHost configuration:
143.110.176.71:8080    foobar (/etc/httpd/conf.d/welcome.conf:38)
*:80                   is a NameVirtualHost
         default server 127.0.0.1 (/etc/httpd/conf.d/welcome.conf:24)
         port 80 namevhost 127.0.0.1 (/etc/httpd/conf.d/welcome.conf:24)
         port 80 namevhost 127.0.0.1 (/etc/httpd/conf.d/welcome.conf:29)

# With httpd directly
$ httpd -D DUMP_VHOSTS

I’ve written a tutorial here, on debugging virtual host configurations that you might find useful.

Separation of Virtual Hosts Config

In the config reference for Apache in different operating systems, you’ll notice that the virtual hosts configurations are not directly specified inside the main server configuration file. Instead they are put in a separate folder which is then included in the main server config.

IMHO, this is a much better way to organise all your configuration code. Depending upon whatever separate folder is allocated for vhost configuration in your default Apache installation, you should always create separate files there for different domains/hostnames. For instance foo.conf for foo.com and bar.conf for bar.com. This is just a way to better organise your configuration and doesn’t affect the functionality in any way.
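As a sketch, the layout could look like the following. The include pattern shown is the kind of line typically found in a CentOS/Fedora main config and is given here as an assumption, so check your own main config for the exact directory and pattern:

```apache
# In the main server config: pull in every .conf file from the vhost folder
IncludeOptional conf.d/*.conf

# In conf.d/foo.conf: everything needed to serve foo.com
<VirtualHost *:80>
  ServerName foo.com
  DocumentRoot /home/foo/public_html
</VirtualHost>
```

bar.conf would then hold the equivalent block for bar.com.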

Some of the default vhost config paths on various OS are (taken from the config reference linked above):

  • Ubuntu/Debian – /etc/apache2/sites-available, whose files are symlinked into /etc/apache2/sites-enabled via tools like a2ensite and a2dissite
  • CentOS and Fedora – /etc/httpd/conf.d
  • FreeBSD – /usr/local/etc/apache24/Includes
  • macOS – /private/etc/apache2/other

Good Practices

Here’s a list of good practices that you should follow while writing your virtual hosts to avoid any problems that might take a while to hunt down:

  • Always use IP addresses or wildcards to define your virtual hosts, not hostnames (domain names) to prevent any DNS lookup failure issue.
  • The IP addresses and ports you use in your virtual host address set should come from whatever is used in the Listen directive. For instance if httpd is listening on IP1:80 and we define a virtual host for IP2 or IP1:8080 then that will never be used.
  • Ensure all virtual hosts have an explicit ServerName to avoid confusing or counter-intuitive matches.
  • If you want to prevent any request from going to the main server, you can always have a catch-all vhost to handle them – <VirtualHost _default_:*>DocumentRoot "/var/www/default"</VirtualHost>. Instead of _default_:* you can use _default_:80 if the server is only listening on port 80 or to simplify it further just *:* or *:80 or *. Whatever you prefer.
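
Written out as it would appear in a configuration file, the catch-all virtual host from the last point looks like this (each directive goes on its own line):

```apache
# Catch-all: handles requests whose address and port match
# no other virtual host, so they never reach the main server
<VirtualHost _default_:*>
  DocumentRoot "/var/www/default"
</VirtualHost>
```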
