This page details various access models and what their shortcomings are. It also explains how to implement them under Apache but please understand that the shortcomings are fundamental to the models and not to the Apache implementation of them.
Important point: If you are serving files from a standard public Unix system, then any user of that system can log in and read the pages directly. The access controls listed below apply only to web access, not to local file access. On most systems you cannot change the permissions on the pages to stop them being world-readable, because the web server runs as an unprivileged user for security reasons.
You are strongly encouraged to use Raven for all forms of access control wherever possible. It almost always maps onto the restrictions people would like to impose more closely than any of the other methods.
- Restricting access to users who can quote a userid and password
- These are not their Unix userids and passwords.
How to do it
Create a password file with the htpasswd utility distributed with Apache. This creates files consisting of userids and encrypted passwords, separated by a colon, one per line. For example, suppose the following were in
This file needs to be readable by the Apache daemon but should not be in the document tree. You don't want the web server serving up your password file.
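As a rough illustration of the file format described above (one userid:hash entry per line), the following sketch builds and parses such lines in Python. It is not a replacement for htpasswd: it assumes the legacy {SHA} scheme (unsalted SHA-1, base64-encoded), chosen here only because it needs nothing beyond the standard library, and the function names and the userid "alice" are hypothetical.

```python
import base64
import hashlib


def sha_htpasswd_line(userid: str, password: str) -> str:
    """Build one password-file line using the legacy {SHA} scheme.

    Assumption: unsalted SHA-1 is used purely for illustration; it is weak,
    and the htpasswd utility offers stronger schemes.
    """
    if ":" in userid:
        raise ValueError("userid must not contain a colon")
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    return f"{userid}:{{SHA}}{base64.b64encode(digest).decode('ascii')}"


def parse_htpasswd_line(line: str) -> tuple[str, str]:
    """Split a line into (userid, hashed password) at the first colon."""
    userid, sep, hashed = line.rstrip("\n").partition(":")
    if not sep:
        raise ValueError("missing colon separator")
    return userid, hashed


line = sha_htpasswd_line("alice", "example-password")  # hypothetical user
print(parse_htpasswd_line(line)[0])  # prints the userid field
```

Splitting at the first colon only matters because the hashed field in other schemes may itself contain punctuation; the userid never contains a colon.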
In the .htaccess file in the directory at the top of the set of files you want to protect, insert the following lines.
    AuthUserFile /some/path/htpwfile
    AuthType Basic
    AuthName realm
    require valid-user
realm is used to lump together a collection of files under one security policy. The first time a user requests a file covered by a particular realm, they will be prompted for a userid and password. These are then checked against those in the password file, /some/path/htpwfile. In any subsequent attempt to get at a page covered by realm, the browser sends the same userid and password as before, so the user is not prompted again.
Note that these userids and passwords are sent effectively in clear text across HTTP with every request: they are only base64-encoded, not encrypted. Use Secure HTTP (HTTPS) if you want to avoid this.
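To see why Basic authentication over plain HTTP offers no confidentiality, the sketch below builds the Authorization header a browser would send and then recovers the credentials from it without any key. The function names are hypothetical; the userid and password are the well-known example pair from the HTTP authentication RFCs.

```python
import base64


def basic_auth_header(userid: str, password: str) -> str:
    """Return the Authorization header value for Basic authentication."""
    token = base64.b64encode(f"{userid}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def decode_basic_auth(header: str) -> tuple[str, str]:
    """Recover (userid, password) from a Basic header; no secret is needed."""
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    userid, _, password = decoded.partition(":")
    return userid, password


header = basic_auth_header("Aladdin", "open sesame")
print(header)
print(decode_basic_auth(header))
```

Anyone who can observe the request on the network can run the equivalent of decode_basic_auth, which is why the header needs to travel inside an encrypted connection.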
If you use the %u token in your log directives, it will be replaced by the userid quoted. (Note the lower case: %U is the URL path requested.)
- Restricting access to just
- This does not work and cannot work.
Why it's a bad idea
Most commonly people restrict access to .cam.ac.uk. This is becoming steadily less restrictive: more and more people who traditionally would not have had access to a machine in .cam.ac.uk now do. For example, people who walk into the University Library can get access to a public cluster there.
Normally restricting it to people with Raven accounts will get you closer to the original semantics, although even that is increasingly insufficient.
The problem is the existence of HTTP proxies. Suppose you had a server www.dept.cam.ac.uk and you wanted to restrict access to the pages http://www.dept.cam.ac.uk/CamOnly/ so that only browsers within cam.ac.uk can read them.
Now suppose there is an HTTP proxy within cam.ac.uk. (There are plenty, and they are frequently just turned on by accident.) Suppose access to it has not been restricted. (This is the default state.)
A browser outside Cambridge at randompc.evilempire.com makes a connection to the proxy, passing an HTTP (web) request for http://www.dept.cam.ac.uk/CamOnly/. The proxy forwards it to www.dept.cam.ac.uk. Because the proxy's request comes from inside cam.ac.uk it is honoured. The proxy then passes the page back to the browser outside Cambridge.
Attempting to restrict access to cam.ac.uk is equivalent to assuming that there are no open proxies within cam.ac.uk; this is a bad assumption.
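The proxy scenario above can be sketched as a small simulation. This is a toy model, not Apache's implementation: the suffix check stands in for a domain restriction, and the proxy hostname proxy.dept.cam.ac.uk is a hypothetical name introduced for illustration. The point it shows is that the server only ever sees the host that connected to it, not the host that originated the request.

```python
def origin_server(connecting_host: str, path: str) -> str:
    """Toy model of www.dept.cam.ac.uk restricting /CamOnly/ by domain."""
    restricted = path.startswith("/CamOnly/")
    if restricted and not connecting_host.endswith(".cam.ac.uk"):
        return "403 Forbidden"
    return "200 OK"


def open_proxy(requester_host: str, path: str) -> str:
    """Toy model of an unrestricted proxy inside cam.ac.uk.

    It forwards requests for anyone; the origin server sees only the
    proxy's own hostname (hypothetical), never requester_host.
    """
    proxy_host = "proxy.dept.cam.ac.uk"
    return origin_server(proxy_host, path)


# Direct request from outside Cambridge: refused.
print(origin_server("randompc.evilempire.com", "/CamOnly/"))  # 403 Forbidden
# The same request relayed through the open proxy: honoured.
print(open_proxy("randompc.evilempire.com", "/CamOnly/"))  # 200 OK
```

Nothing in origin_server can be changed to close the hole, because requester_host never reaches it; that is the sense in which the shortcoming belongs to the model and not to any particular server.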
You may be able to get away with departmental restriction if you are certain there are no proxies that forward outside your department.
How to do it even if it's a bad idea
Assuming that the httpd.conf file allows the Limit override, you can limit access to a domain in the .htaccess file at the top of the directory you want to protect. It should contain the following lines.
    order deny,allow
    deny from all
    allow from .dept1.cam.ac.uk
    allow from machine.dept2.cam.ac.uk
    allow from localhost
Note the leading dot in the first "allow from" line. Without it, the line refers to a machine rather than a domain, as with the second "allow from" line.