
Access controls on the web

This page details various access models and what their shortcomings are. It also explains how to implement them under Apache but please understand that the shortcomings are fundamental to the models and not to the Apache implementation of them.

Important point: If you are serving files from a standard public Unix system then any user of the system can log in and read the pages directly. The access controls listed below apply only to web access and not to local file access. On most systems you cannot change the permissions on the pages to stop them being world-readable, because the web server runs as an unprivileged user for security reasons.

Access by Raven

You are strongly encouraged to use Raven for all forms of access control wherever possible. It almost always maps onto the restrictions people would like to impose more closely than any of the other methods.

Access by userid and password

  • Restricting access to users who can quote a userid and password
  • These are not their Unix userids and passwords.

How to do it

Create a password file with the htpasswd utility distributed with Apache. This creates files consisting of userids and encrypted passwords, separated by a colon, one per line. For example, suppose the following were in /some/path/htpwfile


This file needs to be readable by the Apache daemon but should not be in the document tree. You don't want the web server serving up your password file.
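As a rough sanity check on the point above, the following sketch tests whether a password file path lies inside a document tree. The function name and the document root /var/www/html are assumptions made for illustration only. The check compares normalised paths and does not follow symbolic links.

```python
import os.path


def inside_tree(path, document_root):
    """Return True if path lies within document_root (symlinks not resolved)."""
    path = os.path.abspath(path)
    root = os.path.abspath(document_root)
    # commonpath gives the longest shared leading path of the two arguments.
    return os.path.commonpath([path, root]) == root


# Hypothetical document root; substitute the one your server actually uses.
DOCUMENT_ROOT = "/var/www/html"

for candidate in ("/some/path/htpwfile", "/var/www/html/private/htpwfile"):
    verdict = "INSIDE the document tree" if inside_tree(candidate, DOCUMENT_ROOT) else "outside the document tree"
    print(candidate, "is", verdict)
```

A password file reported as inside the tree could be fetched like any other page unless the server is separately configured to refuse it.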

In the .htaccess file in the directory at the top of the set of files you want to protect, insert the following lines.

AuthUserFile /some/path/htpwfile
AuthType Basic
AuthName realm
require valid-user

The realm groups a collection of files under one security policy. The first time a user requests a file covered by a particular realm, they are prompted for a userid and password, which are checked against those in the password file, /some/path/htpwfile. On any subsequent request for a page covered by the same realm, the browser sends the same userid and password as before, so the user is not prompted again.

Note that these userids and passwords are sent across HTTP with only Base64 encoding, which is effectively clear text. Serve the pages over HTTPS if you want to avoid this.
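To illustrate why Basic authentication offers no secrecy over plain HTTP, here is a minimal sketch that builds the Authorization header value a browser would send and then recovers the credentials from it. The userid and password are made up for the example.

```python
import base64
from typing import Tuple


def basic_auth_header(userid: str, password: str) -> str:
    """Build the Authorization header value used by HTTP Basic authentication."""
    token = base64.b64encode(f"{userid}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def decode_basic_auth(header_value: str) -> Tuple[str, str]:
    """Recover the userid and password from a Basic Authorization value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    # Base64 is an encoding, not encryption: no key is needed to reverse it.
    userid, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return userid, password


header = basic_auth_header("alice", "s3cret")  # hypothetical credentials
print(header)
print(decode_basic_auth(header))
```

Anyone who can observe the request on the network can perform the same decoding step, which is why the transport itself needs to be encrypted.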

If you use the %u token in your log directives it will be replaced by the userid quoted. (Note the lower case: %U is a different token, which logs the URL path requested.)
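Once the userid is being logged, it can be pulled back out of the access log. The sketch below assumes the server writes Apache's Common Log Format, where the authenticated userid is the third field and "-" means none was supplied. The sample log line is invented, and the pattern does not handle userids containing spaces.

```python
import re

# Common Log Format: host ident user [time] "request" status bytes
CLF_PATTERN = re.compile(r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')


def remote_user(log_line):
    """Return the authenticated userid from a CLF line, or None if absent."""
    match = CLF_PATTERN.match(log_line)
    if match is None:
        return None
    user = match.group(3)
    return None if user == "-" else user


# Invented example line using a documentation-range address.
sample = '192.0.2.44 - alice [10/Oct/2000:13:55:36 -0700] "GET /private/index.html HTTP/1.0" 200 2326'
print(remote_user(sample))
```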

Access by domain

  • Restricting access to just or
  • This does not work and cannot work.

Why it's a bad idea

Most commonly people restrict access to This is increasingly less restrictive and more and more people that traditionally wouldn't have had access to a machine in will do now. For example people who walk into the University Library can get access to a public cluster there.

Normally restricting it to people with Raven accounts will get you closer to the original semantics, although even that is increasingly insufficient.

The problem is the existence of HTTP proxies. Suppose you had a server and you wanted to restrict access to the pages so that only browsers within can read them.

Now suppose there is an HTTP proxy within (There are plenty and are frequently just turned on by accident.) Suppose the access to these has not been restricted. (This is the default state.)

A browser outside Cambridge at makes a connection to the proxy, passing an HTTP (web) request for The proxy forwards it to Because the proxy's request comes from inside it is honoured. The proxy then passes the page back to

Attempting to restrict access to is equivalent to assuming that there are no open proxies within; this is a bad assumption.
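The proxy scenario described above can be modelled in a few lines. This is a toy simulation, not real networking: the address ranges are reserved documentation blocks standing in for the "inside" network and an outside browser, and all names are hypothetical. The point it shows is that an address-based check sees only the machine that made the connection.

```python
import ipaddress

# Hypothetical "inside" network, taken from a reserved documentation range.
INSIDE_NETWORK = ipaddress.ip_network("192.0.2.0/24")


def server_allows(peer_address):
    """Address-based access check: the server sees only the connecting peer."""
    return ipaddress.ip_address(peer_address) in INSIDE_NETWORK


def fetch_direct(client_address):
    """The browser connects to the server itself."""
    return "page" if server_allows(client_address) else "403 Forbidden"


def fetch_via_open_proxy(client_address, proxy_address):
    """An unrestricted proxy forwards for anyone; client_address is never
    consulted, because the server sees the proxy's address instead."""
    return "page" if server_allows(proxy_address) else "403 Forbidden"


outside_browser = "198.51.100.7"  # hypothetical browser outside the network
inside_proxy = "192.0.2.10"       # hypothetical open proxy inside the network

print(fetch_direct(outside_browser))                        # 403 Forbidden
print(fetch_via_open_proxy(outside_browser, inside_proxy))  # page
```

The direct request is refused, but the same browser gets the page through the proxy, so the restriction holds only if no such proxy exists.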

You may be able to get away with departmental restriction if you are certain there are no proxies that forward outside your department.

How to do it even if it's a bad idea

Assuming that the httpd.conf file allows the Limit override you can limit access to a domain in the .htaccess file at the top of the directory you want to protect. It should contain the following lines.

order deny,allow
deny from all
allow from
allow from
allow from localhost

NB the leading dot in the first "allow from" line. Without it the line refers to a machine rather than a domain as with the second "allow from" line.

Other resources