Securing the CSE Web

grim

This document is intended to illuminate the issues surrounding web security at CSE and to offer some ideas for addressing them. See also Scott Rose's Insecure Page on Nouveau, which talks mostly about our nascent SSL service (SSL is the "Secure Sockets Layer"), but also touches on strategies for "securing the web."

[Last modified on 12/09/99 at 03:06PM PST]


The Motivation for Change

A variety of considerations make the time ripe for adopting more sophisticated approaches to securing the CSE web. A few such:

the advent of ADSL and the demise of the modem pool
  More insiders will come in using outside providers, making them look like outsiders, IP-wise. As the quality of the connection increases, more people will be tempted to work from home. Besides having outsider IP addresses, they are coming through more "hostile territory" to get here, such that their traffic is more likely to be monitored.
erosion of social mores
  The world has become more hostile, and we can expect routine attacks on services. We can no longer assume that our traffic won't be monitored.
an increase in online services we provide
  We want to move more services to the web, both for insiders and for external users. Examples: online applications to the department, visitor scheduling, account applications, class lists, resumes. That means that more kinds of data, some sensitive, are moving over the wire.
kerberos single sign-on passwords
  We want to move towards a single password to authenticate to all services. That password becomes a very precious resource.

I think it worth noting that, beyond the actual provision of better secured services, we are challenged to provide the appearance of better secured services, particularly as we move to provide new services to external individual users. An additional standard we must meet is that the expectations of users be met.


Our Current "Strategy"

Currently, we rely upon the standard HTTP authentication mechanisms provided by the Apache server: IP-based authentication, and "basic" password authentication. These are described in some detail in Controlling Access to Your Documents. I'll summarize here with an emphasis on how we use the mechanisms here.

IP Auth

We use IP-based authentication for lightly-protected resources, such as pages that contain email aliases and MVIS, the room reservation system.

It is quite common to hear complaints about the inability to use MVIS from remote hosts and complaints about links from web pages that generate permission denied errors. Obviously, such complaints can be expected to increase in frequency as more people come in from outside hosts.

We had ameliorated the dead-link situation somewhat at one stage by filtering HTML pages themselves (using server-side includes, AKA "SSI") to hide the links to IP-protected resources, but that fell by the wayside with the recent web redesign. At least, temporarily.

Basic Auth

"Basic" auth refers to the scheme that's been used pretty much whenever one sees the familiar authentication dialog on a web browser. Its greatest shortcoming is that it passes the user password over the wire in "base 64 encoding," which is effectively clear text. There are a wide variety of schemes for Apache for storing the password data, such as in various databases, but they all suffer the same "password in cleartext" shortcoming because widely-deployed browsers don't support anything but Basic auth.

One such scheme, somewhat alluring, is called mod_auth_kerb-- this Apache module either authenticates a user (can you guess?) against a kerberos (4 or 5) authentication service, or uses kerberos mutual ticket authentication. Only the former is a practical option because, again, popular browsers aren't kerberized.

Something worth mentioning about basic authentication is that the authentication is performed on each and every access to a resource in a protected realm. The user is not made aware of that because the browser manages it for them by remembering the username and password associated with each authentication realm and sending it without being asked by the server. That means that the password may cross the wire hundreds or thousands of times in a single session.
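
To make the "effectively clear text" point concrete, here is a minimal Python sketch of what a browser puts in the Authorization header for basic auth, and how little work it takes to reverse it. The function names and the "alice" credentials are made up for illustration; nothing here is specific to our Apache setup.

```python
import base64
from typing import Tuple


def encode_basic(username: str, password: str) -> str:
    """Build the Authorization header value a browser sends for basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def decode_basic(header_value: str) -> Tuple[str, str]:
    """Recover the username and password from a captured header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic auth header")
    decoded = base64.b64decode(token).decode("utf-8")
    # The username may not contain ':', so split on the first one only.
    username, _, password = decoded.partition(":")
    return username, password


# Hypothetical credentials, for illustration only.
header = encode_basic("alice", "s3cret")
recovered = decode_basic(header)
```

Anyone who can see one request can run the equivalent of decode_basic on it, and since the header rides along on every access to the realm, there are plenty of requests to choose from.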

Hybrid Auth

It is possible to specify that a resource be protected by basic authentication only if the IP is foreign. That means that local users need never be confronted with authentication dialogs for resources protected in this fashion. That can come in handy, though I don't know of anywhere it's being used at CSE.
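
The decision being made is simple enough to sketch in a few lines of Python. This is only an illustration of the logic, not of how Apache implements it (in Apache, as far as I know, this is what the Satisfy directive expresses). The network range is a documentation placeholder, not a real CSE range, and the function names are hypothetical.

```python
import ipaddress

# Placeholder "local" network; the real department ranges are not given here.
LOCAL_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_local(client_ip: str) -> bool:
    """True if the client address falls inside one of the local networks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in LOCAL_NETWORKS)


def access_decision(client_ip: str, has_valid_password: bool) -> str:
    """Hybrid rule: a local IP is enough; foreign IPs must also authenticate."""
    if is_local(client_ip):
        return "allow"
    if has_valid_password:
        return "allow"
    # Foreign address and no credentials yet: ask the browser to authenticate.
    return "challenge"
```

A local user never sees the dialog; an outside user gets one, and once the browser has a valid password it is let through.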

Cookies

A cookie is a small (up to 4096 bytes) chunk of data that is passed from the server to the browser, which stores it. It is then returned by the browser to the server on accesses to resources within a specified domain. The browser has no control over nor understanding of the content of the cookie. Cookies are used primarily to hack state onto the otherwise stateless HTTP.
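
As a rough illustration of that round trip, the following Python sketch uses the standard library's http.cookies module to build the header a server would send and to parse the header a browser would send back. The cookie name, value, and domain are invented for the example.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name, value, domain, max_age=None):
    """Build the value of a Set-Cookie header sent by the server."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["domain"] = domain  # browser returns it only within this domain
    cookie[name]["path"] = "/"
    if max_age is not None:
        # With no lifetime set, the cookie lasts only for the browser session.
        cookie[name]["max-age"] = max_age
    return cookie[name].OutputString()


def parse_cookie_header(header_value):
    """Parse the Cookie header the browser sends back into a plain dict."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {key: morsel.value for key, morsel in cookie.items()}


# Hypothetical values, for illustration only.
set_cookie = make_set_cookie("session", "abc123", ".example.edu", max_age=3600)
returned = parse_cookie_header("session=abc123; theme=dark")
```

Note that the value is opaque to the browser: it stores "abc123" and hands it back, nothing more.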

Cookies aren't really an authentication mechanism, though they can be sort of squeezed into that box. For example-- yes, mod_auth_cookie. This Apache module will accept a cookie from a user in lieu of a standard authentication dialog. The advantage over basic auth is that, because cookies can have a lifetime longer than a single session, the user may avoid the per-session authentication dialog on resources that use it. Cookies are passed in cleartext, so there is no security advantage over basic auth.


What Wants Securing

Any discussion of web security is rather too abstract without an inventory or taxonomy of resources that require or profit from being secured. How much security is enough? How much is it worth paying for it? Depends upon what we are trying to protect.

Information Not Exposed by Default

What: By default, some resources-- notably home pages-- are not shared beyond the department. Almost everybody with such resources overrides that default.

Currently: We use IP-based directives to protect such resources.

Ideally: The existing IP-based directives are not a problem for such resources. Nobody has asked, in our hearing, for anything different than what we offer now for securing home pages.

Information We'd Rather Not Expose

What: There are a variety of pages that are better not shared to the world. Examples are pages that list the mail aliases that have a broad reach. Since we don't attempt to control who can send mail to the aliases, our sole path to a spam-free lifestyle is security through obscurity.

Currently: We use IP-based directives to protect such resources.

Ideally: It's an inconvenience to insiders not to be able to access these pages, and it's particularly frustrating to encounter links to them that, when followed, result in an access-denied error. Better would be to allow access to them only after password authentication.

Information That Should Not Be Exposed

What: Certain information and tools on the web are sensitive enough that, if exposed, could

  1. cause internal damage
  2. give away something we would rather sell
  3. present an opportunity for practitioners of mayhem
An example of (1) is the Space page. Examples of (2) and (3) are the Affiliates Resume Databank and the CSE time clock.

The resume databank and timecard, BTW, are examples of applications with two levels of authentication: basic authentication, and authentication provided by the application itself. The latter is simply a user-provided password that is required to edit an existing resource, passed over the wire in cleartext without even the benefit of base 64 encoding.

Currently: We use basic auth with a single username/password to control access to those pages.

Ideally: It would be better to allow individuals to authenticate to such resources with their own passwords, passed securely, so that there would be neither the opportunity for sniffed authentication nor the giant granularity provided by a single username/password. Individuals could come and go from the group that is afforded access without a change of the per-resource password. Plus, people frequently forget a per-resource password.

Information That Must Not Be Exposed

What: Resources in this category are all in the planning stages, as far as I know. Examples are online account requests and online student applications. In some cases, the need for extreme security may be motivated mostly by political concerns: the industry standard suggests that such resources be accessed over a secured channel, and people are therefore likely to object to using the resources if that standard isn't met.

Currently: None.

Ideally: Such resources would be provided by an SSL-enabled service. In the event that internal users are involved, user authentication should be provided via the kerberos password. In the event of external users, there will be no authentication beyond that provided by the application. For example, if a user creates an application to the department, the application can record a user-supplied password required to modify or delete the record.


What Is in Our Toolkit

Besides the usual suspects -- basic auth and IP-based auth -- we have access to a few other tools.

SSL

I've set up an SSL-enabled service on Nouveau, AKA "www4." The poorly-named Scott Rose's Insecure Page on Nouveau gives some detail on that service.

What SSL gives us is the ability to pass any sort of information over a wire without fearing, or having the user fear, that it will be intercepted. The cost is that each transaction has much higher computational overhead.

This particular service is based almost entirely on open source software; the exception is an RSA library that, while licensed, is freely usable by educational institutions.

mod_auth_kerb

As mentioned above, this open source module for Apache allows user authentication against a kerberos authentication service. Combined with SSL, we expect it to be a secure if heavyweight means of authenticating internal users with their local passwords. If not combined with SSL, we expect it to be an effective means of broadcasting local passwords to The Evil Ones.

Authentication Cookies

One idea that we are working with is to use cookies to authenticate users:

  1. A user visits an SSL-enabled URL to authenticate. The CGI service at that URL accepts their username and kerberos password, checks it, and, if valid, returns a cookie. The cookie is signed by the server, valid for the current session only, and contains a little bit of information about the session: the user name, a timestamp, and perhaps other miscellaneous information.
  2. Having authenticated, the user proceeds to browse the HTTP service. When a secured resource is encountered, the cookie is checked for validity, and access is granted if it is determined to be valid. This would require either a custom Apache module or that all the resources controlled by it be CGI resources.
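
The two steps above can be sketched as follows. This is a minimal Python illustration under several assumptions of mine, not a description of what we have built: the "signature" is an HMAC over the user name and timestamp, the secret key is a placeholder, the "|" separator is arbitrary, and the eight-hour age limit stands in for "valid for the current session only."

```python
import hashlib
import hmac
import time
from typing import Optional

# Placeholder secret known only to the server (assumption for this sketch).
SECRET_KEY = b"replace-with-a-real-server-secret"


def _sign(payload: str) -> str:
    """HMAC-SHA256 of the payload, hex encoded."""
    return hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def make_cookie(username: str, now: Optional[float] = None) -> str:
    """Step 1: after the password check succeeds, issue a signed cookie value."""
    timestamp = str(int(time.time() if now is None else now))
    payload = f"{username}|{timestamp}"
    return f"{payload}|{_sign(payload)}"


def check_cookie(cookie: str, max_age_seconds: int = 8 * 3600,
                 now: Optional[float] = None) -> Optional[str]:
    """Step 2: return the user name if the cookie is genuine and fresh, else None."""
    parts = cookie.split("|")
    if len(parts) != 3:
        return None
    username, timestamp, signature = parts
    expected = _sign(f"{username}|{timestamp}")
    # Constant-time comparison, so the check does not leak how much matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    current = time.time() if now is None else now
    if not timestamp.isdigit() or current - int(timestamp) > max_age_seconds:
        return None
    return username
```

The signature keeps an outsider from forging or altering a cookie, but it does nothing against someone who sniffs a valid cookie off the HTTP side and replays it before it expires.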

Some considerations:

A simple example of an authentication cookie, CGI-based, is Web Login.

At first, we thought that this was our own brand new idea, but the day we settled on it, we saw an announcement of a remarkably similar service about to be offered by CAC. See pubcookie. Their scheme differs in that all accesses to secured resources are via SSL.

Extensions to this authentication cookie idea might include automatic redirects to a "web login" page when an outside user attempts access, with an automatic return to the resource after a successful authentication.
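
A rough Python sketch of that redirect-and-return idea follows. The login URL, the "return_to" parameter name, and the host names are all hypothetical; the one design point I'd want in any real version is that the login page only returns users to our own hosts.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical address of the SSL-protected login page.
LOGIN_URL = "https://www4.example.edu/weblogin"


def login_redirect(requested_url: str) -> str:
    """Where to send an unauthenticated outside user, remembering the target."""
    return LOGIN_URL + "?" + urlencode({"return_to": requested_url})


def return_target(login_request_url: str, allowed_host: str) -> Optional[str]:
    """After a successful login, recover the original resource to return to."""
    query = parse_qs(urlsplit(login_request_url).query)
    target = query.get("return_to", [None])[0]
    # Refuse to bounce users to hosts we do not control.
    if target is None or urlsplit(target).hostname != allowed_host:
        return None
    return target
```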


What May Someday Be in Our Toolkit

Client Certificates

Some day, users may have their own client certificates. SSL is ready for that day, but it's not here.

Client Extensions

An approach to client authentication that is gaining some ground, and has been in use at some sites for several years, involves a "callback" extension to the client side. In such a scheme, a simple daemon process runs on the client, communicating with the server to authenticate the user. Sidecar is an example. The advantage is that standard clients can be used. A disadvantage is that we expect caching, which can obscure the user's IP address, to be a roadblock. Another disadvantage is that users have to install custom software, which must be available for each platform.

Smarter Browsers

With the source code for Netscape Communicator becoming available, we expect to see kerberized web browsers follow soon after. Haven't heard of such yet, though.

A more secure alternative to basic auth is digest authentication, specified in RFC 2069: An Extension to HTTP : Digest Access Authentication, which states

The Digest Access Authentication scheme is not intended to be a complete answer to the need for security in the World Wide Web. This scheme provides no encryption of object content. The intent is simply to create a weak access authentication method which avoids the most serious flaws of Basic authentication.
and
Like Basic Access Authentication, the Digest scheme is based on a simple challenge-response paradigm. The Digest scheme challenges using a nonce value. A valid response contains a checksum (by default the MD5 checksum) of the username, the password, the given nonce value, the HTTP method, and the requested URI. In this way, the password is never sent in the clear. Just as with the Basic scheme, the username and password must be prearranged in some fashion which is not addressed by this document.
We are unaware of any browsers that support it.
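
For the curious, the checksum described in the passage quoted above can be sketched in a few lines of Python. As I read RFC 2069, the realm is hashed along with the username and password, and the response is an MD5 over two inner MD5 digests joined with the nonce. The function names and sample values here are my own.

```python
import hashlib


def md5_hex(text: str) -> str:
    """MD5 checksum of the text as 32 lowercase hex digits."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Compute the response a client sends for an RFC 2069 digest challenge."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who the user is
    ha2 = md5_hex(f"{method}:{uri}")                 # what is being requested
    return md5_hex(f"{ha1}:{nonce}:{ha2}")           # tied to the server's nonce


# Hypothetical values, for illustration only.
response = digest_response("alice", "cse", "s3cret", "GET", "/space/", "nonce-1")
```

The server, which knows the password (or the first digest), computes the same value and compares. Only the checksum crosses the wire, and a new nonce yields a different checksum, so a sniffed response is of limited use.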


Other Issues

Collaboration

Much of the content of the CSE web is the result of collaboration of individuals. Typically, that's implemented by the use of network file systems and Unix groups.

When the content is secured, collaboration in this manner is problematic. The web server runs as the unprivileged user nobody, and needs to be able to read files to be able to serve them. That user is a member of no group, so it needs to either own the content it serves or that content needs to be world-readable. If the content is world-readable, it is not secured from access by other internal users, who can use, for example, the file: protocol to read the content with their web browsers. If it's owned by nobody and is writable by members of a Unix group, all is well, but it turns out to be difficult to maintain that ownership across edit sessions. Emacs, for example, will change the ownership of a file to the uid of the last person to edit it.
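
The permission bits at issue are easy to inspect. Here is a small Python sketch, with a made-up function name, that reports the three things that matter in the paragraph above: who owns a file, whether its group can write it, and whether the world can read it.

```python
import os
import stat


def describe_access(path: str) -> dict:
    """Report ownership and the permission bits relevant to web collaboration."""
    info = os.stat(path)
    return {
        "owner_uid": info.st_uid,
        # Collaborators in the file's Unix group can edit it.
        "group_writable": bool(info.st_mode & stat.S_IWGRP),
        # Any local user (or the file: protocol) can read it.
        "world_readable": bool(info.st_mode & stat.S_IROTH),
    }
```

For secured content we want world_readable to be false, which in turn means owner_uid has to be the web server's user for the page to be served at all.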

Already we have encountered the need to resort to obscure hacks to work around the shortcomings of our default collaboration strategy. For the Space web, we run a cron job that adjusts the ownership of files to nobody. Did I hear you scream hack!?

Another approach we could use to allow collaboration would be CVS, but that suffers the disadvantage of a steep learning curve for users, and perhaps some platform issues as well.


Proposed Strategies

Information Not Exposed by Default

No change in policy is needed for such content as home pages.

Information We'd Rather Not Expose

IP-based restrictions on such material are an inconvenience to our users about which they complain frequently. A cookie-based web login provides sufficient security for such information with a minimum of inconvenience, but we haven't implemented it yet.

Information That Should Not Be Exposed

Information in this category, such as the Space page, should be evaluated on a case-by-case basis, with the choices being web login and hosting on an SSL server. The latter only becomes practical if we come closer to solving the collaboration problem.

Information That Must Not Be Exposed

Host such materials on an SSL server.

