Re: Known Problems working draft

From: John Dilley (jad@pimlico.hpl.hp.com)
Date: Thu Aug 05 1999 - 14:58:42 MDT


> Well.... since our modus operandi for this group is email, I would
> prefer a text-only version for the email list.

        Not a problem. Enclosed below is a text version of the current
draft 1.9 of the proxy/caching known problems draft. In the future I
will send a text copy in addition to the URL. As usual please submit
any comments or feedback to me (directly or via the wrec mail list if
you prefer) and I will incorporate them. Best regards,

                             -- jad --
                          John Dilley <jad@hpl.hp.com>
                  http://www.hpl.hp.com/personal/John_Dilley/

============================================================================

                      Known HTTP Proxy/Caching Problems

                     First Draft - Comments Solicited !

                             John Dilley, editor

Status of this Memo

This memo provides information for the Internet community. It does not
conform yet to the standards for an IETF Internet Draft although it is
expected to do so when complete. This document is offered in accordance
with Section 10 (Intellectual Property Rights) of RFC2026. Please provide
feedback and stay tuned!

Introduction

This memo catalogs a number of known problems with World Wide Web proxy and
cache servers. The goal of the document is to provide a discussion of the
problems and proposed workarounds, and ultimately to improve conditions by
illustrating problems. The construction of this document is a joint effort
of the web caching community. It is being done under the auspices of the
IETF Web Replication and Caching working group. We gratefully acknowledge
RFC 2525, which helped define the initial format for this known problems
list.

This memo discusses problems with both Proxy servers, which act as
application-level gateways for web requests, and Cache servers, which hold
copies of previously requested documents in the hope of saving future
network bandwidth and latency for users. Proxies often perform a caching
function, but the two are not necessarily linked. Refer to the work in
progress report Internet Web Replication and Caching Taxonomy for
definitions of proxy and cache terminology used in this memo.

No individual or organization has complete knowledge of the known problems
in web caching. If you know of a problem that is not documented on this list
you are encouraged to send it to the WREC mailing list, wrec@cs.utk.edu, for
discussion or to the memo's editor, jad@hpl.hp.com, for review and inclusion
in the list.

Problem Template

Each problem is defined in a common format, summarized in the following
table and described below.
----------------------------------------------------------------------------

Name: short, descriptive name of the problem (3-5 words)
Classification: classifies the problem: performance, security, etc
Description: describes the problem succinctly
Significance: magnitude of problem, environments where it exists
Implications: the impact of the problem on systems and networks
See Also: a reference to a related known problem
Indications: states how to detect the presence of this problem
Solution(s): describe the solution(s) to this problem, if any
Workaround: practical workaround for the problem
References: information about the problem or solution
Contact: contact name and email address for this section

----------------------------------------------------------------------------

Name
     A short, descriptive name (3-5 words) associated with the problem.
Classification
     Problems are grouped into categories of similar problems for ease of
     reading of this memo. Choose the category that best describes the
     problem. The suggested categories include three general categories and
     several more specific categories.
        o Architecture: the fundamental design is incomplete, or incorrect.
        o Specification: the spec is ambiguous, incomplete, or incorrect.
        o Implementation: the implementation of the spec is incorrect
          ------------------------------------------------------------------
        o Performance: perceived page response at the client is excessive;
          network bandwidth consumption is excessive; demand on origin or
          proxy servers exceeds reasonable bounds.
        o Administration: care and feeding of caches is or causes a problem.
        o Security: privacy, integrity, or authentication concerns.
     This is the first draft of this memo. The classification structure is
     in revision. In the published drafts of the memo the classification
     structure should be fixed but may be revised from time to time.
Description
     A definition of the problem, succinct but including necessary
     background information.
Significance (High, Medium, Low)
     May include a brief summary of the environments for which the problem
     is significant.
Implications
     Why the problem is viewed as a problem. What inappropriate behavior
     results from it? This section should substantiate the magnitude of any
     problem indicated with High significance.
See Also
     Optional. List of other known problems that are related to this one.
Indications
     How to detect the presence of the problem. This may include references
     to one or more substantiating documents that demonstrate the problem.
     This should include the network configuration that led to the problem
     such that it can be reproduced. Problems that are not reproducible
     will not appear in this memo.
Solution(s)
     Solutions that permanently fix the problem, if such are known. For
     example, what version of the software does not exhibit the problem?
     Indicate if the solution is accepted by the community, one of several
     solutions pending agreement, or open possibly with experimental
     solutions.
Workaround
     Practical workaround if no solution is available or usable. The
     workaround should have sufficient detail for someone experiencing the
     problem to get around it.
References
     References to related information in technical publications or on the
     web. Where can someone interested in learning more go to find out more
     about this problem, its solution, or workarounds?
Contact
     Contact name and email address of the person who supplied the
     information for this section. If you would prefer to remain anonymous
     the editor's name will appear here instead, but we believe in credit
     where credit is due.

Document Template

Templates for submission of known problems can be found on the web at
http://www.hpl.hp.com/personal/John_Dilley/caching/known-prob-template-00.html

----------------------------------------------------------------------------
Known Problems

The remaining sections present the currently documented known problems. The
problems are ordered by classification and significance. Issues with web
cache protocol specification or architecture are first, followed by
implementation issues. Issues of high significance are first, followed by
lower significance. The list below names each of the known problems
described in the sections that follow.

     Known Problems List - Wed Aug 4 11:17:15 1999
   * Network transparent proxies break client cache directives
   * Network transparent proxies prevent introduction of new HTTP methods
   * Cannot specify multiple URIs for replicated resources
   * Replica distance is unknown
   * Proxy resource location
   * Cache peer selection in heterogeneous networks
   * ICP performance
   * Cache meshes can break HTTP serialization of content
   * Use of Cache-Control headers
   * Lack of HTTP/1.1 testing for proxy caches
   * ETag support
   * Client proxy failover
   * Servers and content should be optimized for caching
   * Some servers send bad Content-Length header
   * Lack of fine-grained, standardized hierarchy controls

Please send any updated or new problems to the document editor,
jad@hpl.hp.com. I will update this document and re-post it as needed. Thank
you!

----------------------------------------------------------------------------

Architecture

----------------------------------------------------------------------------
Name
     Network transparent proxies break client cache directives
Classification
     Architecture
Description
     HTTP is designed for the client to be aware of whether it is connected
     to an origin server or to a proxy. Clients that believe they are
     transacting with an origin server but are really connected to a network
     transparent proxy may fail to send critical cache-control information
     they would have otherwise included in their request.
Significance
     High
Implications
     Clients may receive data that is not synchronized with the origin even
     when they request an end-to-end refresh, because the request lacks
     either a cache-control: no-cache or must-revalidate header. These
     headers have no impact on origin server behavior, so may
     not be included by the browser if it believes it is connected to that
     resource. Other related data implications are possible as well. For
     instance data security may be compromised by the lack of inclusion of
     private or no-store clauses of the cache-control header under similar
     conditions.
Indications
     Easily detected by placing fresh (un-expired) content on a proxy while
     changing the authoritative copy and requesting an end to end reload of
     the data through a proxy in both transparent and explicit modes.
Solution(s)
     Eliminate the need for network transparent proxies and IP spoofing,
     which will return correct context awareness to the client.
Workaround
     Include relevant cache-control: directives in every request at the cost
     of increased bandwidth and CPU requirements.

     The HTTP/1.1 specification allows a proxy to switch over to tunnel mode
     when it receives a request with a method or HTTP version it does not
     understand how to handle.
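
     As a rough illustration of the first workaround, the sketch below
     builds (but does not send) a reload request that always carries its
     cache directives, whether or not the client believes a proxy is
     present. The function name reload_request and the example URL are
     hypothetical; only Python's standard urllib is assumed.

```python
from urllib.request import Request

def reload_request(url):
    """Build (but do not send) a GET that asks for an end-to-end reload."""
    req = Request(url, method="GET")
    # Sent unconditionally: the client cannot tell whether an intercepting
    # proxy sits between it and the origin server.
    req.add_header("Cache-Control", "no-cache")
    # Pragma carries the same request to HTTP/1.0 caches.
    req.add_header("Pragma", "no-cache")
    return req

req = reload_request("http://www.example.com/index.html")
print(req.header_items())
```

     The cost named above is visible here: every request grows by these
     header bytes even when no cache is on the path.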
Contact
     Patrick McManus <mcmanus@AppliedTheory.com>
     Henrik Nordstrom <hno@hem.passagen.se> (HTTP/1.1 clarification)

----------------------------------------------------------------------------
Name
     Network transparent proxies prevent introduction of new HTTP methods
Classification
     Architecture
Description
     A proxy that receives a request with a method unknown to it is required
     to generate an HTTP 501 Error as a response. HTTP methods are designed
     to be extensible so there may be applications deployed with initial
     support just for the user agent and origin server. A transparent proxy
     that hijacks requests with new methods destined for servers that have
     implemented that method creates a de-facto firewall where none may be
     intended.
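
     The behavior described above can be sketched as a simple dispatch on
     the request method. This is an illustration only: the method set and
     the name proxy_response are assumptions, and NEWMETHOD stands in for
     any extension method a proxy was not built to understand.

```python
# Methods this hypothetical proxy was built to understand.
KNOWN_METHODS = {"OPTIONS", "GET", "HEAD", "POST", "PUT", "DELETE", "TRACE"}

def proxy_response(method, known_methods=KNOWN_METHODS):
    """Return (action, status) for a request method seen by the proxy."""
    if method in known_methods:  # HTTP method names are case-sensitive
        return ("forward", None)
    # Unknown method: the proxy answers by itself, so the origin server
    # that does implement the method never sees the request.
    return ("reject", 501)

print(proxy_response("GET"))
print(proxy_response("NEWMETHOD"))
```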
Significance
     Medium within network transparent proxy environments.
Implications
     Renders new compliant applications useless unless modifications are
     made to proxy software. Because new methods are not required to be
     globally standardized it is impossible to keep up to date in the
     general case.
Solution(s)
     Eliminate the need for network transparent proxies. A client receiving
     a 501 in a traditional HTTP environment may either choose to repeat the
     request to the origin server directly, or perhaps be configured to use
     a different cache.
Workaround
     Level 5 switches (sometimes called Level 7 or application layer
     switches) can be used to keep HTTP traffic with unknown methods out of
     the proxy. However, these devices have heavy buffering
     responsibilities, still require TCP sequence number spoofing, and do
     not interact well with persistent connections.
Contact
     Patrick McManus <mcmanus@AppliedTheory.com>

----------------------------------------------------------------------------
Name
     Cannot specify multiple URIs for replicated resources
Classification
     Architecture
Description
     There is no way to specify that multiple URIs may be used for a single
     resource, one for each replica of the resource. Similarly, there is no
     way to say that some set of proxies (each identified by a URI) may be
     used to resolve a URI.
Significance
     Medium
Implications
     Forces users to understand the replication model and mechanism. Makes
     it difficult to create a replication framework without protocol support
     for replication and naming.
Indications
     Inherent in HTTP 1.0, HTTP 1.1.
Solution(s)
     Architectural - protocol design is necessary.
Workaround
     Replication mechanisms force users to locate a replica or mirror site
     for replicated content.
Contact
     Daniel LaLiberte <liberte@w3.org>

----------------------------------------------------------------------------
Name
     Replica distance is unknown
Classification
     Architecture
Description
     There is no recommended way to find out which of several servers or
     proxies is closer either to the requesting client or to another
     machine, either geographically or in the network topology.
Significance
     Medium
Implications
     Clients must guess which replica is closer to them when requesting a
     copy of a document that may be served from multiple locations. Users
     must know the set of servers that can serve a particular object.
     This in general is hard to determine and maintain. Users must
     understand network topology in order to choose the closest copy. Note
     that the closest copy is not always the one that will result in
     quickest service. A nearby but heavily loaded server may be slower than
     a more distant but lightly loaded server.
Indications
     Inherent in HTTP 1.0, HTTP 1.1.
Solution(s)
     Architectural - protocol work is necessary. This is a specific instance
     of a general problem in widely distributed systems. A general solution
     is unlikely; however, a specific solution in the web context is
     possible.
Workaround
     Servers can (many do) provide location hints in a replica selection web
     page. Users choose one based upon their location. Users can learn which
     replica server gives them best performance. Note that the closest
     replica geographically is not necessarily the closest in terms of
     network topology. Expecting users to understand network topology is
     unreasonable.
Contact
     Daniel LaLiberte <liberte@w3.org>

----------------------------------------------------------------------------
Name
     Proxy resource location
Classification
     Architecture
Description
     There is no way to tell a proxy that it may request a resource from
     another location. If there were, the receiver would also need to check
     the authenticity of the resource it obtained.
Significance
     Medium
Implications
     Proxies have no systematic way to locate resources within other proxies
     or origin servers. This makes it more difficult to share information
     among proxies. Information sharing would improve global efficiency.
Indications
     Inherent in HTTP 1.0, HTTP 1.1.
Solution(s)
     Architectural - protocol design is necessary.
Workaround
     Certain proxies share location hints in the form of summary digests of
     their contents (e.g., Squid). Certain proxy protocols enable a proxy to
     query another for its contents (e.g., ICP). (See, however, the "ICP
     performance" problem.)
Contact
     Daniel LaLiberte <liberte@w3.org>

----------------------------------------------------------------------------
Name
     Cache peer selection in heterogeneous networks
Classification
     Architecture
Description
     Cache peer selection in networks with large variance in latency and
     bandwidth between peers can lead to non-optimal peer selection. For
     example take cache C with two siblings, Sib1 and Sib2, and the
     following network topology (summarized).
        o Cache C's link to Sib1 carries 2 Mbit/sec with 300 msec latency.
        o Cache C's link to Sib2 carries 64 Kbit/sec with 10 msec latency.

     ICP won't work well in this context. If a user submits a request to
     Cache C for page P that results in a miss, C will send an ICP request
     to Sib1 and Sib2. Assume both siblings have the requested object P. The
     ICP-HIT reply will always come from Sib2 before Sib1. However, for
     large objects it is clear that the retrieval will be faster from Sib1
     than from Sib2.

     In fact, the problem is more complex because Sib1 and Sib2 can't have a
     100% hit ratio. With a hit rate of 10%, it is more efficient to use
     Sib1 for objects larger than 48K. The best choice depends on at least
     the hit rate and link characteristics; maybe other parameters as well.
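
     A minimal sketch of the trade-off, assuming the simplest possible
     model: retrieval time is one link latency plus object size divided by
     bandwidth, and both siblings always hit. Under that assumption the
     break-even object size for the two links above is only about 2.4
     Kbytes. This model does not reproduce the 48K figure quoted above,
     which depends on a hit-rate model this draft does not spell out. The
     function names are illustrative.

```python
def fetch_time(size_bytes, bandwidth_bps, latency_s):
    """Seconds to retrieve an object: one link latency plus transfer time."""
    return latency_s + (size_bytes * 8) / bandwidth_bps

def break_even_size(fast_bw, fast_lat, slow_bw, slow_lat):
    """Object size in bytes at which both links take the same time."""
    return (fast_lat - slow_lat) / (8 * (1 / slow_bw - 1 / fast_bw))

SIB1 = (2_000_000, 0.300)  # 2 Mbit/sec, 300 msec latency
SIB2 = (64_000, 0.010)     # 64 Kbit/sec, 10 msec latency

print(f"break-even size: {break_even_size(*SIB1, *SIB2):.0f} bytes")
for size in (1_000, 10_000, 100_000):
    print(size, round(fetch_time(size, *SIB1), 3),
          round(fetch_time(size, *SIB2), 3))
```

     A first-response rule always picks Sib2, which is the slower choice
     for every object above the break-even size.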
Significance
     Medium
Implications
     By selecting the first peer to respond, peer selection algorithms are
     not optimizing retrieval latency to end users. Furthermore they are
     causing more work for the high-latency peer since it must respond to
     such requests but will never be chosen to serve content if the lower
     latency peer has a copy.
Indications
     Inherent in the design of ICP v1, ICP v2, and any cache mesh protocol
     that selects a peer based upon first response.

     This problem is not exhibited by cache digest or other protocols which
     (attempt to) maintain knowledge of peer contents and only hit peers
     that are believed to have a copy of the requested page.
Solution(s)
     This problem is architectural with the peer selection protocol.
Workaround
     Cache mesh design when using such a protocol should be done in such a
     way that there is not a high latency variance among peers. In the
     example presented in the Description the high-latency, high-bandwidth
     peer could be used as a parent, but should not be used as a sibling.
Contact
     Ivan LOVRIC <ivan.lovric@cnet.francetelecom.fr>
     John Dilley <jad@hpl.hp.com>

----------------------------------------------------------------------------
Name
     ICP performance
Classification
     Architecture (ICP), Performance
Description
     The ICP protocol exhibits O(n^2) scaling properties, where n is the
     number of peer proxies participating in the protocol. This can lead ICP
     traffic to dominate HTTP traffic within a network.
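
     The quadratic growth can be seen with a back-of-the-envelope count,
     assuming a fully meshed group in which every cache queries every other
     cache on each local miss and each query draws one reply. The function
     name and the miss rate used below are illustrative assumptions.

```python
def icp_messages_per_sec(n_peers, misses_per_sec_per_cache):
    """Total ICP messages per second in a fully meshed group of caches."""
    # Each of the n caches sends one query to each of its n - 1 peers
    # for every local miss.
    queries = n_peers * (n_peers - 1) * misses_per_sec_per_cache
    # Each query draws one reply (hit or miss).
    return 2 * queries

for n in (2, 4, 8, 16):
    print(n, icp_messages_per_sec(n, misses_per_sec_per_cache=10))
```

     Doubling the number of peers roughly quadruples the ICP message
     count, while the HTTP request rate per cache stays the same.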
Significance
     Medium
Implications
     If a proxy has many ICP peers the bandwidth demand of ICP can be
     excessive. Cache managers must carefully regulate ICP peering. ICP also
     leads proxies to become homogeneous in what they serve. This means if
     your proxy does not have a document it is unlikely your peers will have
     it either. Therefore, ICP requests are largely unable to locate a local
     copy of an object [credit to Ingrid Melve's 3WCW talk for this].
Indications
     Inherent in design of ICP v1, ICP v2.
Solution(s)
     This problem is architectural - protocol redesign or replacement is
     required to solve it if ICP is to continue to be used.
Workaround
     Implementation workarounds exist, for example to turn off use of ICP,
     to carefully regulate peering, or to use another mechanism if
     available, such as cache digests. A cache digest protocol shares a
     summary of cache contents using a Bloom Filter technique. This allows a
     cache to estimate whether a peer has a document. Filters are updated
     regularly but are not always up-to-date so cannot help when a spike in
     popularity occurs. They also increase traffic but not as much as ICP.
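
     The digest idea can be sketched with a small Bloom filter. This is
     not the digest format of any particular cache; the class name, filter
     size, hash count, and use of SHA-256 are all assumptions chosen for
     illustration.

```python
import hashlib

class BloomDigest:
    """Compact, lossy summary of the URLs a cache holds."""

    def __init__(self, size_bits=8192, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, url):
        # Derive num_hashes bit positions by salting the URL with an index.
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{url}".encode("utf-8")).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, url):
        for p in self._positions(url):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, url):
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(url))

peer = BloomDigest()
peer.add("http://www.example.com/popular.html")
print(peer.might_contain("http://www.example.com/popular.html"))
```

     A peer consults the digest locally instead of sending a query, which
     is where the traffic saving comes from. A digest fetched before an
     object was cached will not list it, which is the staleness limitation
     noted above.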

     Cache clustering protocols that organize caches into a mesh provide
     another alternative. There is ongoing research on this topic.
Contact
     John Dilley <jad@hpl.hp.com>

----------------------------------------------------------------------------
Name
     Cache meshes can break HTTP serialization of content
Classification
     Architecture (HTTP protocol)
Description
     A cache mesh where a request may travel different paths depending on
     the state of the mesh and associated caches can break HTTP content
     serialization, possibly causing the end user to receive older content
     than seen on an earlier request where the request traveled another path
     in the mesh.
Significance
     Medium
Implications
     Can cause end user confusion. May in some situations (sibling cache
     hit, object has changed state from cacheable to uncacheable) be close
     to impossible to get the caches properly updated with the new content.
Indications
     Older content is unexpectedly returned from a cache mesh after some
     time.
Solution(s)
     Work with cache vendors and researchers to find a suitable protocol for
     maintaining cache relations and object state in a cache mesh.
Workaround
     When designing a cache hierarchy/mesh, make sure that for each
     (end user, URL) combination there is only one single path in the mesh
     during normal operation.
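
     One way to get a single path per (end user, URL) pair is to pick the
     next hop deterministically from a hash of the pair. The sketch below
     is an assumption-laden illustration, not a description of any
     deployed mesh protocol; choose_parent and the host names are
     hypothetical.

```python
import hashlib

def choose_parent(client_id, url, parents):
    """Pick the same parent every time for a given (client, URL) pair."""
    key = f"{client_id} {url}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(parents)
    return parents[index]

parents = ["cache-a.example.net", "cache-b.example.net"]
print(choose_parent("10.0.0.7", "http://www.example.com/", parents))
```

     The mapping is stable only while the parent list is unchanged, which
     matches the "during normal operation" qualifier above.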
Contact
     Henrik Nordstrom <hno@hem.passagen.se>

----------------------------------------------------------------------------

Implementation

----------------------------------------------------------------------------
Name
     Use of Cache-Control headers
Classification
     Implementation
Description
     Many (if not most) implementations incorrectly interpret Cache-Control
     response headers.
Significance
     High
Implications
     Cache-Control (CC) headers will be spurned by end users if there are
     conflicting or non-standard implementations.
Indications
     Check: Squid, NetCache, Cache Engine, HTTP State Management draft for
     use of CC: no-cache and must-revalidate against HTTP/1.1rev6.
Solution(s)
     Work with vendors and others to assure proper application.
Workaround
     None
Contact
     Mark Nottingham <mnot@pobox.com>

----------------------------------------------------------------------------
Name
     Lack of HTTP/1.1 testing for proxy caches
Classification
     Implementation/Testing
Description
     Although performance benchmarking of caches is starting to be explored,
     protocol compliance is just as important.
Significance
     High
Implications
     Cache vendors implement their interpretation of the spec; because it is
     a very large, and sometimes vague, specification, this can lead to
     inconsistent behaviors.
Indications
     -
Solution(s)
     There was some talk at WCW4 about starting a test suite, but it appears
     to have stalled.
Workaround
     Just do it.
Contact
     Mark Nottingham <mnot@pobox.com>

----------------------------------------------------------------------------
Name
     ETag support
Classification
     Implementation
Description
     No currently released cache implements ETag (strong) validation.
Significance
     Medium
Implications
     Last-Modified/If-Modified-Since (LM/IMS) validation is inappropriate
     for many requirements, both because of its weakness and its use of
     dates. Lack of a usable, strong coherency protocol leads developers and
     end users not to trust caches.
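
     The weakness of date-based validation can be shown with two versions
     of an object written within the same second. In this sketch the entity
     tag is derived from a hash of the body, which is one possible way to
     build a strong validator and is an assumption here; the function name
     and timestamps are illustrative.

```python
import hashlib
from email.utils import formatdate

def validators(body, mtime):
    """Return (ETag, Last-Modified) for a response body and file mtime."""
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    # HTTP dates have one-second resolution.
    last_modified = formatdate(int(mtime), usegmt=True)
    return etag, last_modified

v1 = validators(b"price: 10\n", 933886722.2)
v2 = validators(b"price: 12\n", 933886722.9)
print("same Last-Modified:", v1[1] == v2[1])
print("same ETag:", v1[0] == v2[0])
```

     A cache revalidating with If-Modified-Since cannot tell these two
     versions apart, while one comparing entity tags can.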
Indications
     -
Solution(s)
     Work with vendors to implement ETags; work for better validation
     protocols.
Workaround
     Use LM/IMS validation.
Contact
     Mark Nottingham <mnot@pobox.com>

----------------------------------------------------------------------------
Name
     Client proxy failover
Classification
     Implementation
Description
     Failover between proxies at the client level (using a proxy.pac file)
     is erratic and no standard behavior is defined. Additionally, behavior
     is hard-coded into the browser, so that proxy administrators cannot use
     failover at the client level effectively.
Significance
     Medium
Implications
     Cache system architects are forced to implement failover at the cache
     itself, when it may be more appropriate and economical to do it at the
     client.
Indications
     If a browser detects that its primary proxy is down, it will wait n
     minutes before trying the next one it is configured to use. It will
     then wait y minutes before asking the user if they'd like to try the
     original proxy again. This is very confusing for end users.
Solution(s)
     Work with browser vendors to establish standard extensions to
     JavaScript proxy.pac libraries that will allow configuration of these
     timeouts.
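
     The kind of control being asked for can be sketched as follows. A
     proxy.pac function returns an ordered list such as "PROXY a:3128;
     PROXY b:3128; DIRECT"; the sketch parses that list and skips any proxy
     that failed less than retry_after_s seconds ago. The retry_after_s
     setting is hypothetical: it is exactly the knob that, per this
     problem, browsers do not expose today. All names below are
     illustrative.

```python
def parse_pac_result(result):
    """Split a PAC return string into ordered (kind, target) choices."""
    choices = []
    for entry in result.split(";"):
        parts = entry.split()
        if not parts:
            continue
        target = parts[1] if len(parts) > 1 else None
        choices.append((parts[0].upper(), target))
    return choices

def pick_proxy(choices, failed_at, now, retry_after_s):
    """Return the first choice not known to have failed recently."""
    for kind, target in choices:
        if kind == "DIRECT":
            return (kind, target)
        last_failure = failed_at.get(target)
        if last_failure is None or now - last_failure >= retry_after_s:
            return (kind, target)
    return None

choices = parse_pac_result("PROXY a:3128; PROXY b:3128; DIRECT")
print(pick_proxy(choices, {"a:3128": 100.0}, now=120.0, retry_after_s=60))
```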
Workaround
     User education; redundancy at the proxy level.
Contact
     Mark Nottingham <mnot@pobox.com>

----------------------------------------------------------------------------
Name
     Servers and content should be optimized for caching
Classification
     Implementation (Performance)
Description
     Many web servers and much web content could be implemented to be more
     conducive to caching, reducing bandwidth demand and page load delay.
Significance
     Medium
Implications
     By making poor use of caches origin servers encourage longer load
     times, greater load on cache servers, and increased network demand.
Indications
     The problem is most apparent for pages that have low or zero expires
     time, yet do not change.
Solution(s)
     ...
Workaround
     For example, servers could start using unique object identifiers for
     write-once content: if an object changes it gets a new name, otherwise
     it is considered to be immutable and therefore to have an infinite
     expire age. Certain hosting providers do this already.
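
     A minimal sketch of the naming scheme, assuming the unique identifier
     is taken from a hash of the object's bytes (the draft does not say how
     names are chosen). The function names are hypothetical. Because
     HTTP/1.1 advises against expiry times more than one year in the
     future, the sketch uses a one-year max-age as a stand-in for
     "infinite".

```python
import hashlib
import posixpath

def versioned_name(path, body):
    """Embed a content hash in the name so a changed object gets a new URL."""
    digest = hashlib.sha256(body).hexdigest()[:12]
    stem, ext = posixpath.splitext(path)
    return f"{stem}.{digest}{ext}"

def immutable_headers():
    """Headers marking the object as cacheable for one year (365 days)."""
    return {"Cache-Control": "max-age=31536000"}

print(versioned_name("/images/logo.gif", b"GIF89a...version 1"))
print(versioned_name("/images/logo.gif", b"GIF89a...version 2"))
```

     Pages that reference the object must be regenerated with the new name
     whenever the object changes; those pages keep a short expiry.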
Contact
     Peter Danzig <danzig@netapp.com>

----------------------------------------------------------------------------
Name
     Some servers send bad Content-Length headers for files that contain CR
Classification
     Implementation
Description
     Certain web servers send a Content-length value that is larger than the
     number of bytes in the HTTP message body. This happens when the server
     strips off CR characters from text files with lines terminated with
     CRLF as the file is written to the client. The server probably uses the
     stat() system call to get the file size for the Content-Length header.
     Servers that exhibit this behavior include the GN Web server (version
     2.14 at least) (http://gopher.unicom.com/gn-info/).
Significance
     Low. Surveys indicate only a small number of sites run faulty servers.
Implications
     In this case, an HTTP agent (client or proxy) may believe it received a
     partial response. HTTP/1.1 (RFC 2616) advises that caches MAY store
     partial responses.
Indications
     Count the number of bytes in the message body and compare it to the
     Content-length value. If they differ the server exhibits this problem.
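
     The check can be sketched without a network by simulating the faulty
     behavior described above: the header is taken from the on-disk size
     while the CR characters are stripped from the body. Both function
     names are illustrative, and the simulation is an assumption about how
     such a server behaves, based only on the description in this entry.

```python
def buggy_server_response(file_bytes):
    """Simulate a server that sizes the file on disk, then strips CRs."""
    headers = {"Content-Length": str(len(file_bytes))}  # size as on disk
    body = file_bytes.replace(b"\r\n", b"\n")           # CRs stripped
    return headers, body

def length_mismatch(headers, body):
    """Bytes promised by Content-Length but never delivered."""
    return int(headers["Content-Length"]) - len(body)

headers, body = buggy_server_response(b"line one\r\nline two\r\n")
print("missing bytes:", length_mismatch(headers, body))
```

     The shortfall equals the number of CRLF-terminated lines in the file,
     so a nonzero result on text files and zero on binary files points to
     this problem.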
Solution(s)
     Upgrade or replace the buggy server.
Workaround
     Some browsers and proxies use one TCP connection per object and ignore
     the Content-Length. The document end of file is identified by the close
     of the TCP socket.
Contact
     Duane Wessels <wessels@ircache.net>

----------------------------------------------------------------------------

Administration

----------------------------------------------------------------------------
Name
     Lack of fine-grained, standardized hierarchy controls
Classification
     Administration
Description
     There is no standard for instructing a cache as to how it should
     resolve what parent to fetch a given object from. Because of this,
     implementations vary greatly, and it can be difficult to make them
     interoperate correctly in a complex environment.
Significance
     Medium
Implications
     Complications in deployment of caches in a complex network (esp.
     corporate networks)
Indications
     Inability of some caches to be configured to direct traffic based on
     domain name, reverse lookup IP address, raw IP address, in normal
     operation and in failover mode. Inability in some caches to set a
     preferred parent / backup parent configuration.
Solution(s)
     ?
Workaround
     Work with vendors to establish an acceptable configuration within the
     limits of their product; standardize on one product
Contact
     Mark Nottingham <mnot@pobox.com>

----------------------------------------------------------------------------
$Header: draft-wrec-known-prob-00.html,v 1.9 99/08/04 11:23:34 jad Exp $