University of Washington Computer Science & Engineering

Detecting In-Flight Page Changes with Web Tripwires

Charles Reis, Steven D. Gribble, Tadayoshi Kohno, University of Washington (UW).
Nicholas C. Weaver, International Computer Science Institute (ICSI).
February 29, 2008. (Updated July 2, 2009.)
Web tripwire result for this page visit:

Overview

Our research group was surprised to hear that some ISPs have been "injecting advertisements into web pages requested by their end users," according to a recent Slashdot article. As a result, we set out to measure how often web pages are changed after leaving the server and before arriving in the user's browser.

With a web-based measurement tool called a "web tripwire," we found that approximately 1% of 50,000 visitors received pages that had been changed "in-flight." Most of these changes were caused by software that users had installed on their own computers (such as personal firewalls or ad blockers), but many were caused by agents in the network, such as ISPs and enterprise firewalls. Worse, we found that many of the products that users installed introduced bugs or security vulnerabilities into the web pages they requested.

To address this problem, publishers could choose to serve their pages over HTTPS rather than HTTP, using encryption to preserve page integrity. However, this is an expensive solution in many respects, and it may not always be practical. Web tripwires offer publishers a less expensive (but not cryptographically secure) page integrity check. A web tripwire uses JavaScript code to detect textual changes to an HTTP web page, and it can report any changes to both the user and the publisher.

The rest of this page provides more information about the results of the measurement study we conducted with web tripwires (once hosted on vancouver.cs.washington.edu), and how you can use web tripwires on your own web pages. We have a web tripwire installed on this page, and its results should be shown at the top of the page. Further details about our study are available in our NSDI 2008 paper.

Detecting Page Changes

To measure the in-flight page changes that users see in practice, we developed JavaScript code for our own web page that can detect if the page is changed after leaving our server and before arriving in the user's browser. This JavaScript code runs in the user's browser and compares the page the user received to what we expected it to be. We refer to this check as a web tripwire, and we put it online at vancouver.cs.washington.edu during our measurement study.

More information about how the web tripwire works can be found on the page linked above. At a basic level, it detects most textual changes to the page's HTML. It is not triggered by browser extensions (which are part of the browser), but it will detect proxy software installed on the user's computer. It is also not secure: adversaries could tamper with or remove the web tripwire if they wish to avoid detection. In most current cases, however, it will detect textual changes to the page.
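The comparison step described above can be sketched as follows. This is a minimal illustration and not the code used in our study: the function names `firstDifference` and `checkTripwire` are hypothetical, and the sketch assumes the expected and received HTML are already available as strings (how the browser obtains the received copy is omitted).

```javascript
// Minimal sketch of a tripwire-style comparison. All names are hypothetical.

// Return the first index at which two strings differ, or -1 if identical.
function firstDifference(expected, received) {
  const limit = Math.min(expected.length, received.length);
  for (let i = 0; i < limit; i++) {
    if (expected[i] !== received[i]) return i;
  }
  // Equal up to the shorter length: they differ only if the lengths differ.
  return expected.length === received.length ? -1 : limit;
}

// Compare the HTML the publisher expected with the HTML the browser received.
function checkTripwire(expectedHtml, receivedHtml) {
  const index = firstDifference(expectedHtml, receivedHtml);
  if (index === -1) return { changed: false };
  return {
    changed: true,
    index,
    // Short excerpts that could be shown to the user or reported to the publisher.
    expectedSnippet: expectedHtml.slice(index, index + 40),
    receivedSnippet: receivedHtml.slice(index, index + 40),
  };
}

// Example: an in-flight agent inserts a script tag before </body>.
const expected = '<html><body><p>Hello</p></body></html>';
const received =
  '<html><body><p>Hello</p><script src="ad.js"></script></body></html>';
const result = checkTripwire(expected, received);
console.log(result.changed, result.receivedSnippet);
```

A plain string comparison like this only catches textual changes to the HTML, which matches the limitation noted above: an agent that removes or rewrites the check itself would go undetected.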

Measurement Study

We used the web page described above to conduct a measurement study of web page changes in practice. We needed to attract a large number of users on diverse networks to our page, so we posted stories on Slashdot, Digg, and similar web sites on July 24, 2007. Over the following 20 days, we received visits from 50,171 unique IP addresses.

Results

We were surprised to find that clients at 657 of those IP addresses (1.3%) reported some in-flight change to our page, far more than we expected. The vast majority of changes were caused by proxy software on the user's machine, such as popup blockers and ad blockers. However, we also observed ISPs that injected ads (e.g., through companies like NebuAd), enterprise firewalls that removed meta tags and inserted JavaScript security checks (e.g., with products like Blue Coat Web Filter), and malware that injected either exploit code or ads. Our findings are summarized in the table below.

Category              IPs  Examples
Popup Blocker         277  Zone Alarm (210), CA Personal Firewall (17), Sunbelt Popup Killer (12)
Ad Blocker            188  Ad Muncher (99), Privoxy (58), Proxomitron (25)
Problem in Transit    118  Blank Page (107), Incomplete Page (7)
Compression            30  bmi.js (23), Newlines removed (6), Image Distillation (1)
Security or Privacy    17  Blue Coat (15), The Cloak (1), AnchorFree (1)
Ad Injector            16  MetroFi (6), FairEagle/NebuAd (5), LokBox (1), Front Porch (1), PerfTech (1), Edge Technologies (1)
Meta Tag Changes       12  Removed meta tags (8), Reformatted meta tags (4)
Malware                 3  W32.Arpiframe (2), Adware.LinkMaker (1)
Miscellaneous           3  New background color (1), IE Mark of the Web (1)
This table shows the categories of observed page changes, the number of client IP addresses affected by each, and examples. Each example is followed by the number of IP addresses that reported it; examples listed in bold introduced bugs or vulnerabilities into our page.
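As a quick check of the arithmetic, the figures above can be recomputed directly. The sketch below uses only the numbers reported on this page; the variable and function names are ours. The per-category counts sum to more than 657, which suggests that some IP addresses fall into more than one category.

```javascript
// Recompute the summary figures from the numbers reported on this page.
const totalIps = 50171;  // unique IP addresses that visited
const changedIps = 657;  // IP addresses that reported a change

const ipsByCategory = {
  'Popup Blocker': 277,
  'Ad Blocker': 188,
  'Problem in Transit': 118,
  'Compression': 30,
  'Security or Privacy': 17,
  'Ad Injector': 16,
  'Meta Tag Changes': 12,
  'Malware': 3,
  'Miscellaneous': 3,
};

// Format a ratio as a percentage with one decimal place.
function percent(part, whole) {
  return ((100 * part) / whole).toFixed(1);
}

const categorySum = Object.values(ipsByCategory).reduce((a, b) => a + b, 0);

console.log(`Changed: ${percent(changedIps, totalIps)}% of all IPs`);
console.log(`Sum over categories: ${categorySum} (vs. ${changedIps} changed IPs)`);
for (const [name, ips] of Object.entries(ipsByCategory)) {
  console.log(`${name}: ${percent(ips, changedIps)}% of changed IPs`);
}
```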

These changes were made by agents that had incentives to change the page, but their goals were not always in line with the goals of the user or the publisher. For example:

Bugs and Vulnerabilities

We found that many in-flight changes inadvertently broke web pages. For example, both CA Personal Firewall and some ISP-based changes caused a JavaScript stack overflow error when the scripts they inserted interfered with the code on our page. CA Personal Firewall also interfered with many web forums, including MySpace. MySpace users would post blog entries and comments, only to find popup blocking code (i.e., "_popupControl()") inadvertently injected into their posts.

Worse than this, we found several types of page changes that caused our page to become vulnerable to a cross-site scripting (XSS) attack. Products such as Ad Muncher and the Sidki and Grypen filter sets for Proxomitron introduced code that was vulnerable to attack. These vulnerabilities were significant because they affected most or all of the web pages that a user visited. In the case of Proxomitron (but not Ad Muncher), the vulnerabilities could affect HTTPS traffic as well. This type of problem is analogous to a root exploit for an operating system, because it can potentially affect all pages that a user visits.

In these cases, an attacker could convince a user to follow a link that injected script code into almost any web page. This script code could steal a user's session cookie (e.g., on Facebook), modify login forms to steal passwords (e.g., on many banks), or manipulate the contents of any page (e.g., search results on Google).
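To illustrate this class of bug, consider a page-rewriting proxy that copies the requested URL into a script it injects. The sketch below is hypothetical: it is not the code of any product named above, and the function names, the `pageUrl` variable, and the `bank.example` URL are invented for the example. It contrasts an unescaped injection with one that serializes the URL as a JavaScript string literal.

```javascript
// Hypothetical page-rewriting proxy; NOT the code of any product named above.

// Unsafe: copies the requested URL into an inline script without escaping.
function injectUnsafe(html, url) {
  const snippet = '<script>var pageUrl = "' + url + '";</script>';
  // A function replacement avoids "$" patterns in the URL being interpreted.
  return html.replace('</body>', () => snippet + '</body>');
}

// Safer: serialize the URL as a string literal, and escape "<" so that a
// "</script>" sequence inside the URL cannot close the script element.
function injectEscaped(html, url) {
  const literal = JSON.stringify(url).replace(/</g, '\\u003c');
  const snippet = '<script>var pageUrl = ' + literal + ';</script>';
  return html.replace('</body>', () => snippet + '</body>');
}

const page = '<html><body><p>Login form</p></body></html>';

// A link crafted by an attacker: the quote ends the string literal early,
// so the rest of the URL would run as script in the rewritten page.
const attackUrl = 'http://bank.example/?q=";alert(document.cookie);//';

console.log(injectUnsafe(page, attackUrl));
console.log(injectEscaped(page, attackUrl));
```

Because the proxy rewrites every page the user visits, a single mistake of this kind exposes every site, which is why the consequences are so broad.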

We have reported the vulnerabilities we found, and the developers have released versions that fix the vulnerabilities as of Fall 2007. If you are using older versions of the products above, be sure to update as soon as possible.

Overall, these problems indicate that web page rewriting software can have dangerous consequences if it is not carefully analyzed. Users should understand these consequences when using web proxies.

Web Tripwire Toolkit

Because many of the changes we found can have negative consequences for web publishers or their users, we recommend that publishers take steps to understand what changes are made to their pages. Encrypting pages with HTTPS prevents changes, but it can be overly expensive in terms of CPU overhead, latency, and certificate costs. Thus, in many cases we suggest that publishers deploy web tripwires similar to those used in our measurement study. Although web tripwires are not secure and may miss some types of changes (e.g., full page substitutions), they can effectively detect most page modifications in practice.

To make web tripwires easy to deploy, we have developed a configurable toolkit that can be hosted by a web publisher. The toolkit is available under a BSD-style open source license, and it is effective for most web pages with static content. It consists of two Perl CGI scripts that integrate web tripwires into a given web page.
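The toolkit's Perl CGI scripts are not reproduced on this page. As a rough sketch of the general idea only, the JavaScript below shows one way a publisher-side script might embed an encoded copy of a static page for a tripwire to compare against. The names `encodeExpected`, `buildTripwireScript`, and `expectedPage` are hypothetical, and the choice of `encodeURIComponent` as the encoding is our assumption.

```javascript
// Hypothetical publisher-side helper; not the toolkit's actual implementation.

// Encode the expected HTML so that the embedded copy contains no raw tags
// or quotes (encodeURIComponent percent-encodes "<", ">", '"', and "\").
function encodeExpected(html) {
  return encodeURIComponent(html);
}

// Build a script element that carries the encoded expected page. In the
// browser, the script decodes it back into the original HTML string.
function buildTripwireScript(html) {
  return (
    '<script>var expectedPage = decodeURIComponent("' +
    encodeExpected(html) +
    '");</script>'
  );
}

const staticPage = '<html><body><p>Static content</p></body></html>';
const scriptTag = buildTripwireScript(staticPage);
console.log(scriptTag);
```

Encoding the expected copy means an agent that rewrites specific tags in the page is less likely to make the same change to the reference copy, although a determined adversary could still alter both.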

Download Web Tripwire Toolkit

Web Tripwire Toolkit License

Copyright (c) 2007 Charles Reis
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
3. Neither the name of the University of Washington nor the names of
   its contributors may be used to endorse or promote products derived
   from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER
OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Toolkit Examples

Two examples of pages that use the toolkit are linked below.

