> Well.... since our modus operandi for this group is email, I would
> prefer a text-only version for the email list.
Not a problem. Enclosed below is a text version of the current
draft 1.9 of the proxy/caching known problems draft. In the future I
will send a text copy in addition to the URL. As usual please submit
any comments or feedback to me (directly or via the wrec mail list if
you prefer) and I will incorporate them. Best regards,
-- jad --
John Dilley <jad@hpl.hp.com>
http://www.hpl.hp.com/personal/John_Dilley/
============================================================================
Known HTTP Proxy/Caching Problems
First Draft - Comments Solicited !
John Dilley, editor
Status of this Memo
This memo provides information for the Internet community. It does not
conform yet to the standards for an IETF Internet Draft although it is
expected to do so when complete. This document is offered in accordance
with Section 10 (Intellectual Property Rights) of RFC2026. Please provide
feedback and stay tuned!
Introduction
This memo catalogs a number of known problems with World Wide Web proxy and
cache servers. The goal of the document is to provide a discussion of the
problems and proposed workarounds, and ultimately to improve conditions by
illustrating problems. The construction of this document is a joint effort
of the web caching community. It is being done under the auspices of the
IETF Web Replication and Caching working group. We gratefully acknowledge
RFC 2525, which helped define the initial format for this known problems
list.
This memo discusses problems with both proxy servers, which act as
application-level gateways for web requests, and cache servers, which
hold copies of previously requested documents in the hope of saving future
network bandwidth and latency for users. Proxies often perform a caching
function, but the two are not necessarily linked. Refer to the work in
progress report Internet Web Replication and Caching Taxonomy for
definitions of proxy and cache terminology used in this memo.
No individual or organization has complete knowledge of the known problems in
web caching. If you know of a problem that is not documented on this list
you are encouraged to send it to the WREC mailing list, wrec@cs.utk.edu for
discussion or to the memo's editor, jad@hpl.hp.com for review and inclusion
in the list.
Problem Template
Each problem is defined in a common format, summarized in the following
table and described below.
----------------------------------------------------------------------------
Name: short, descriptive name of the problem (3-5 words)
Classification: classifies the problem: performance, security, etc
Description: describes the problem succinctly
Significance: magnitude of problem, environments where it exists
Implications: the impact of the problem on systems and networks
See Also: a reference to a related known problem
Indications: states how to detect the presence of this problem
Solution(s): describe the solution(s) to this problem, if any
Workaround: practical workaround for the problem
References: information about the problem or solution
Contact: contact name and email address for this section
----------------------------------------------------------------------------
Name
A short, descriptive name (3-5 words) associated with the problem.
Classification
Problems are grouped into categories of similar problems for ease of
reading of this memo. Choose the category that best describes the
problem. The suggested categories include three general categories and
several more specific categories.
o Architecture: the fundamental design is incomplete, or incorrect.
o Specification: the spec is ambiguous, incomplete, or incorrect.
o Implementation: the implementation of the spec is incorrect
------------------------------------------------------------------
o Performance: perceived page response at the client is excessive;
network bandwidth consumption is excessive; demand on origin or
proxy servers exceeds reasonable bounds.
o Administration: care and feeding of caches is or causes a problem.
o Security: privacy, integrity, or authentication concerns.
This is the first draft of this memo. The classification structure is
in revision. In the published drafts of the memo the classification
structure should be fixed but may be revised from time to time.
Description
A definition of the problem, succinct but including necessary
background information.
Significance (High, Medium, Low)
May include a brief summary of the environments for which the problem
is significant.
Implications
Why the problem is viewed as a problem. What inappropriate behavior
results from it? This section should substantiate the magnitude of any
problem indicated with High significance.
See Also
Optional. List of other known problems that are related to this one.
Indications
How to detect the presence of the problem. This may include references
to one or more substantiating documents that demonstrate the problem.
This should include the network configuration that led to the problem
such that it can be reproduced. Problems that are not reproducible
will not appear in this memo.
Solution(s)
Solutions that permanently fix the problem, if such are known. For
example, what version of the software does not exhibit the problem?
Indicate if the solution is accepted by the community, one of several
solutions pending agreement, or open, possibly with experimental
solutions.
Workaround
Practical workaround if no solution is available or usable. The
workaround should have sufficient detail for someone experiencing the
problem to get around it.
References
References to related information in technical publications or on the
web. Where can someone interested in learning more go to find out more
about this problem, its solution, or workarounds?
Contact
Contact name and email address of the person who supplied the
information for this section. If you would prefer to remain anonymous
the editor's name will appear here instead, but we believe in credit
where credit is due.
Document Template
Templates for submission of known problems can be found on the web at
http://www.hpl.hp.com/personal/John_Dilley/caching/known-prob-template-00.html
----------------------------------------------------------------------------
Known Problems
The remaining sections present the currently documented known problems. The
problems are ordered by classification and significance. Issues with web
cache protocol specification or architecture are first, followed by
implementation issues. Issues of high significance are first, followed by
lower significance. The list below names each of the known problems
described in the rest of this memo.
Known Problems List - Wed Aug 4 11:17:15 1999
* Network transparent proxies break client cache directives
* Network transparent proxies prevent introduction of new HTTP methods
* Cannot specify multiple URIs for replicated resources
* Replica distance is unknown
* Proxy resource location
* Cache peer selection in heterogeneous networks
* ICP performance
* Cache meshes can break HTTP serialization of content
* Use of Cache-Control headers
* Lack of HTTP/1.1 testing for proxy caches
* ETag support
* Client proxy failover
* Servers and content should be optimized for caching
* Some servers send bad Content-Length header
* Lack of fine-grained, standardized hierarchy controls
Please send any updated or new problems to the document editor,
jad@hpl.hp.com. I will update this document and re-post it as needed. Thank
you!
----------------------------------------------------------------------------
Architecture
----------------------------------------------------------------------------
Name
Network transparent proxies break client cache directives
Classification
Architecture
Description
HTTP is designed for the client to be aware of whether it is connected to
an origin server or to a proxy. Clients that believe they are transacting
with an origin server but are really in a connection with a network
transparent proxy may fail to send critical cache-control information
they would have otherwise included in their request.
Significance
High
Implications
Clients may receive data that is not synchronized with the origin even
when they request an end to end refresh because of the lack of
inclusion of either a cache-control: no-cache or must-revalidate
header. These headers have no impact on origin server behavior so may
not be included by the browser if it believes it is connected to that
resource. Other related data implications are possible as well. For
instance data security may be compromised by the lack of inclusion of
private or no-store clauses of the cache-control header under similar
conditions.
Indications
Easily detected by placing fresh (un-expired) content on a proxy while
changing the authoritative copy and requesting an end to end reload of
the data through a proxy in both transparent and explicit modes.
Solution(s)
Eliminate the need for network transparent proxies and IP spoofing,
which will return correct context awareness to the client.
Workaround
Include relevant cache-control: directives in every request at the cost
of increased bandwidth and CPU requirements.
The HTTP/1.1 specification allows a proxy to switch over to tunnel mode
when it receives a request with a method or HTTP version it does not
understand how to handle.
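The workaround of sending cache-control directives on every request can be
sketched as follows. This is a minimal illustration, not a recommendation
for any particular client: the function name and the example URL are
hypothetical, and the request is only built, never sent.

```python
import urllib.request


def end_to_end_reload_request(url):
    """Build (but do not send) a GET that asks caches on the path not to
    answer from stored content without going back to the origin."""
    headers = {
        # HTTP/1.1 request directive for an end-to-end reload.
        "Cache-Control": "no-cache",
        # HTTP/1.0 equivalent, for older proxies on the path.
        "Pragma": "no-cache",
    }
    return urllib.request.Request(url, headers=headers)


req = end_to_end_reload_request("http://www.example.com/index.html")
for name, value in req.header_items():
    print(f"{name}: {value}")
```

Because the directives are sent unconditionally, a client behind a
transparent proxy gets the same treatment as one that knows it is talking
to a proxy, at the cost of a few extra bytes per request.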
Contact
Patrick McManus <mcmanus@AppliedTheory.com>
Henrik Nordstrom <hno@hem.passagen.se> (HTTP/1.1 clarification)
----------------------------------------------------------------------------
Name
Network transparent proxies prevent introduction of new HTTP methods
Classification
Architecture
Description
A proxy that receives a request with a method unknown to it is required
to generate an HTTP 501 Error as a response. HTTP methods are designed
to be extensible so there may be applications deployed with initial
support just for the user agent and origin server. A transparent proxy
that hijacks requests with new methods destined for servers that have
implemented that method creates a de-facto firewall where none may be
intended.
Significance
Medium within network transparent proxy environments.
Implications
Renders new compliant applications useless unless modifications are
made to proxy software. Because new methods are not required to be
globally standardized it is impossible to keep up to date in the
general case.
Solution(s)
Eliminate the need for network transparent proxies. A client receiving
a 501 in a traditional HTTP environment may either choose to repeat the
request to the origin server directly, or perhaps be configured to use
a different cache.
Workaround
Level 5 switches (sometimes called Level 7 or application layer
switches) can be used to keep HTTP traffic with unknown methods out of
the proxy. However, these devices have heavy buffering
responsibilities, still require TCP sequence number spoofing, and do
not interact well with persistent connections.
Contact
Patrick McManus <mcmanus@AppliedTheory.com>
----------------------------------------------------------------------------
Name
Cannot specify multiple URIs for replicated resources
Classification
Architecture
Description
There is no way to specify that multiple URIs may be used for a single
resource, one for each replica of the resource. Similarly, there is no
way to say that some set of proxies (each identified by a URI) may be
used to resolve a URI.
Significance
Medium
Implications
Forces users to understand the replication model and mechanism. Makes
it difficult to create a replication framework without protocol support
for replication and naming.
Indications
Inherent in HTTP 1.0, HTTP 1.1.
Solution(s)
Architectural - protocol design is necessary.
Workaround
Replication mechanisms force users to locate a replica or mirror site
for replicated content.
Contact
Daniel LaLiberte <liberte@w3.org>
----------------------------------------------------------------------------
Name
Replica distance is unknown
Classification
Architecture
Description
There is no recommended way to find out which of several servers or
proxies is closer either to the requesting client or to another
machine, either geographically or in the network topology.
Significance
Medium
Implications
Clients must guess which replica is closer to them when requesting a
copy of a document that may be served from multiple locations. Users
must know the set of servers that can serve a particular object.
This in general is hard to determine and maintain. Users must
understand network topology in order to choose the closest copy. Note
that the closest copy is not always the one that will result in
quickest service. A nearby but heavily loaded server may be slower than
a more distant but lightly loaded server.
Indications
Inherent in HTTP 1.0, HTTP 1.1.
Solution(s)
Architectural - protocol work is necessary. This is a specific instance
of a general problem in widely distributed systems. A general solution
is unlikely, however a specific solution in the web context is
possible.
Workaround
Servers can (many do) provide location hints in a replica selection web
page. Users choose one based upon their location. Users can learn which
replica server gives them best performance. Note that the closest
replica geographically is not necessarily the closest in terms of
network topology. Expecting users to understand network topology is
unreasonable.
Contact
Daniel LaLiberte <liberte@w3.org>
----------------------------------------------------------------------------
Name
Proxy resource location
Classification
Architecture
Description
There is no way to tell a proxy that it may request a resource from
another location, in which case the receiver should check the
authenticity of the given resource.
Significance
Medium
Implications
Proxies have no systematic way to locate resources within other proxies
or origin servers. This makes it more difficult to share information
among proxies. Information sharing would improve global efficiency.
Indications
Inherent in HTTP 1.0, HTTP 1.1.
Solution(s)
Architectural - protocol design is necessary.
Workaround
Certain proxies share location hints in the form of summary digests of
their contents (e.g., Squid). Certain proxy protocols enable a proxy to
query another for its contents (e.g., ICP). (See, however, the "ICP
performance" issue.)
Contact
Daniel LaLiberte <liberte@w3.org>
----------------------------------------------------------------------------
Name
Cache peer selection in heterogeneous networks
Classification
Architecture
Description
Cache peer selection in networks with large variance in latency and
bandwidth between peers can lead to non-optimal peer selection. For
example take cache C with two siblings, Sib1 and Sib2, and the
following network topology (summarized).
o Cache C's link to Sib1 carries 2 Mbit/sec with 300 msec latency.
o Cache C's link to Sib2 carries 64 Kbit/sec with 10 msec latency.
ICP won't work well in this context. If a user submits a request to
Cache C for page P that results in a miss, C will send an ICP request
to Sib1 and Sib2. Assume both siblings have the requested object P. The
ICP-HIT reply will always come from Sib2 before Sib1. However, for
large objects it is clear that the retrieval will be faster from Sib1
rather than Sib2.
In fact, the problem is more complex because Sib1 and Sib2 can't have a
100% hit ratio. With a hit rate of 10%, it is more efficient to use
Sib1 for objects larger than 48K. The best choice depends on at least the
hit rate and link characteristics; maybe other parameters as well.
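The trade-off between the two siblings can be illustrated with a rough
calculation. The sketch below assumes a deliberately simple model that is
not part of the original analysis: fetch time is one latency delay plus
the time to push the bytes over the link, Kbit and Mbit are powers of ten,
and both siblings always hold the object. It therefore does not reproduce
the 48K figure quoted above, which also depends on the hit rate.

```python
def fetch_time(size_bytes, bandwidth_bps, latency_s):
    """Simplified transfer time: one latency delay plus serialization time."""
    return latency_s + (size_bytes * 8) / bandwidth_bps


def break_even_size(bw_a, lat_a, bw_b, lat_b):
    """Object size in bytes at which peers A and B take equally long."""
    return (lat_a - lat_b) / (8 * (1 / bw_b - 1 / bw_a))


SIB1 = (2_000_000, 0.300)  # 2 Mbit/sec, 300 msec
SIB2 = (64_000, 0.010)     # 64 Kbit/sec, 10 msec

size = break_even_size(*SIB1, *SIB2)
print(f"break-even size: {size:.0f} bytes")
for n in (1_000, 10_000, 100_000):
    t1, t2 = fetch_time(n, *SIB1), fetch_time(n, *SIB2)
    print(f"{n:>7} bytes: Sib1 {t1:.3f}s, Sib2 {t2:.3f}s")
```

Under these assumptions Sib2 only wins for objects smaller than about
2.4 KB, even though its ICP reply always arrives first.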
Significance
Medium
Implications
By selecting the first peer to respond, peer selection algorithms are
not optimizing retrieval latency to end users. Furthermore they are
causing more work for the high-latency peer since it must respond to
such requests but will never be chosen to serve content if the lower
latency peer has a copy.
Indications
Inherent in design of ICP v1, ICP v2, and any cache mesh protocol that
selects peer based upon first response.
This problem is not exhibited by cache digest or other protocols which
(attempt to) maintain knowledge of peer contents and only hit peers
that are believed to have a copy of the requested page.
Solution(s)
This problem is architectural with the peer selection protocol.
Workaround
Cache mesh design when using such a protocol should be done in such a
way that there is not a high latency variance among peers. In the
example presented in the Description the high latency high bandwidth
peer could be used as a parent, but should not be used as a sibling.
Contact
Ivan LOVRIC <ivan.lovric@cnet.francetelecom.fr>
John Dilley <jad@hpl.hp.com>
----------------------------------------------------------------------------
Name
ICP performance
Classification
Architecture(ICP), Performance
Description
The ICP protocol exhibits O(n^2) scaling properties, where n is the
number of peer proxies participating in the protocol. This can lead ICP
traffic to dominate HTTP traffic within a network.
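The quadratic growth can be seen with a small count. The sketch assumes a
fully connected mesh in which every proxy takes one miss and queries every
other peer; replies, which would double the message count, are ignored.
The function name is illustrative only.

```python
def icp_queries_per_round(n_proxies):
    """Queries sent if each of n proxies has one miss and asks every peer."""
    return n_proxies * (n_proxies - 1)


for n in (2, 4, 8, 16, 32):
    print(f"{n:>2} proxies -> {icp_queries_per_round(n):>4} ICP queries")
```

Doubling the number of peers roughly quadruples the query traffic, which
is why peering has to be regulated as a mesh grows.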
Significance
Medium
Implications
If a proxy has many ICP peers the bandwidth demand of ICP can be
excessive. Cache managers must carefully regulate ICP peering. ICP also
leads proxies to become homogeneous in what they serve. This means if
your proxy does not have a document it is unlikely your peers will have
it either. Therefore, ICP requests are largely unable to locate
a local copy of an object [credit to Ingrid Melve's 3WCW talk for
this].
Indications
Inherent in design of ICP v1, ICP v2.
Solution(s)
This problem is architectural - protocol redesign or replacement are
required to solve it if ICP is to continue to be used.
Workaround
Implementation workarounds exist, for example to turn off use of ICP,
to carefully regulate peering, or to use another mechanism if
available, such as cache digests. A cache digest protocol shares a
summary of cache contents using a Bloom Filter technique. This allows a
cache to estimate whether a peer has a document. Filters are updated
regularly but are not always up-to-date so cannot help when a spike in
popularity occurs. They also increase traffic but not as much as ICP.
Cache clustering protocols that organize caches into a mesh provide
another alternative. There is ongoing research on this topic.
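The cache digest idea mentioned above can be sketched with a toy Bloom
filter. This is an illustration of the technique only, not the digest
format of any real proxy: the class name, the filter size, the number of
hash functions, and the use of SHA-256 are all assumptions.

```python
import hashlib


class CacheDigest:
    """Toy Bloom filter over URLs; sizes and hashing are illustrative only."""

    def __init__(self, n_bits=1024, n_hashes=4):
        # n_bits should be a multiple of 8; n_hashes at most 8, because each
        # position uses 4 bytes of a 32-byte SHA-256 digest.
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8)

    def _positions(self, url):
        digest = hashlib.sha256(url.encode("utf-8")).digest()
        for i in range(self.n_hashes):
            chunk = digest[4 * i: 4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.n_bits

    def add(self, url):
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_contain(self, url):
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(url))


peer = CacheDigest()
peer.add("http://www.example.com/a.html")
print(peer.may_contain("http://www.example.com/a.html"))
```

A cache would fetch a peer's bit array periodically and test URLs against
it locally instead of sending a query per miss. A stale array is the reason
digests cannot help during a sudden spike in popularity.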
Contact
John Dilley <jad@hpl.hp.com>
----------------------------------------------------------------------------
Name
Cache meshes can break HTTP serialization of content
Classification
Architecture (HTTP protocol)
Description
A cache mesh where a request may travel different paths depending on
the state of the mesh and associated caches can break HTTP content
serialization, possibly causing the end user to receive older content
than seen on an earlier request where the request traveled another path
in the mesh.
Significance
Medium
Implications
Can cause end user confusion. May in some situations (sibling cache
hit, object has changed state from cacheable to uncacheable) be close
to impossible to get the caches properly updated with the new content.
Indications
Older content is unexpectedly returned from a cache mesh after some
time.
Solution(s)
Work with cache vendors and researchers to find a suitable protocol for
maintaining cache relations and object state in a cache mesh.
Workaround
When designing a cache hierarchy/mesh, make sure that for each
(end user, URL) combination there is only a single path in the mesh
during normal operation.
Contact
Henrik Nordstrom <hno@hem.passagen.se>
----------------------------------------------------------------------------
Implementation
----------------------------------------------------------------------------
Name
Use of Cache-Control headers
Classification
Implementation
Description
Many (if not most) implementations incorrectly interpret Cache-Control
response headers.
Significance
High
Implications
CC headers will be spurned by end users if there are conflicting or
non-standard implementations.
Indications
Check: Squid, NetCache, Cache Engine, HTTP State Management draft for
use of CC: no-cache and must-revalidate against HTTP/1.1rev6.
Solution(s)
Work with vendors and others to assure proper application.
Workaround
None
Contact
Mark Nottingham <mnot@pobox.com>
----------------------------------------------------------------------------
Name
Lack of HTTP/1.1 testing for proxy caches
Classification
Implementation/Testing
Description
Although performance benchmarking of caches is starting to be explored,
protocol compliance is just as important.
Significance
High
Implications
Cache vendors implement their interpretation of the spec; because it is
a very large, and sometimes vague, specification, this can lead to
inconsistent behaviors.
Indications
-
Solution(s)
There was some talk at WCW4 about starting a test suite, but it appears
to have stalled.
Workaround
Just do it.
Contact
Mark Nottingham <mnot@pobox.com>
----------------------------------------------------------------------------
Name
ETag support
Classification
Implementation
Description
No currently released cache implements ETag (strong) validation.
Significance
Medium
Implications
LM/IMS validation is inappropriate for many requirements, both because
of its weakness and its use of dates. Lack of a usable, strong
coherency protocol leads developers and end users not to trust caches.
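The weakness of date-based validation can be shown with two versions of a
resource written within the same second. In this sketch the entity tag is
derived from a hash of the body, which is an assumption made for
illustration; servers are free to generate opaque tags in other ways. The
function name and timestamps are hypothetical.

```python
import hashlib
from email.utils import formatdate


def validators(body, mtime):
    """Return (Last-Modified, ETag) validators for a response body."""
    last_modified = formatdate(mtime, usegmt=True)        # one-second resolution
    etag = '"' + hashlib.sha256(body).hexdigest() + '"'   # changes with any byte
    return last_modified, etag


# Two versions of the same resource, modified within the same second.
lm_a, etag_a = validators(b"price: 10\n", 933765435.2)
lm_b, etag_b = validators(b"price: 12\n", 933765435.8)

print(lm_a == lm_b)      # the date validator cannot tell the versions apart
print(etag_a == etag_b)  # the entity tag can
```

A cache that validates with If-Modified-Since alone would treat the second
version as unchanged, whereas a comparison of strong entity tags would
detect the difference.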
Indications
-
Solution(s)
Work with vendors to implement ETags; work for better validation
protocols
Workaround
Use LM/IMS validation.
Contact
Mark Nottingham <mnot@pobox.com>
----------------------------------------------------------------------------
Name
Client proxy failover
Classification
Implementation
Description
Failover between proxies at the client level (using a proxy.pac file)
is erratic and no standard behavior is defined. Additionally, behavior
is hard-coded into the browser, so that proxy administrators cannot use
failover at the client level effectively.
Significance
Medium
Implications
Cache system architects are forced to implement failover at the cache
itself, when it may be more appropriate and economical to do it at the
client.
Indications
If a browser detects that its primary proxy is down, it will wait n
minutes before trying the next one it is configured to use. It will
then wait y minutes before asking the user if they'd like to try the
original proxy again. This is very confusing for end users.
Solution(s)
Work with browser vendors to establish standard extensions to
JavaScript proxy.pac libraries that will allow configuration of these
timeouts.
Workaround
User education; redundancy at the proxy level.
Contact
Mark Nottingham <mnot@pobox.com>
----------------------------------------------------------------------------
Name
Servers and content should be optimized for caching
Classification
Implementation (Performance)
Description
Many web servers and much web content could be implemented to be more
conducive to caching, reducing bandwidth demand and page load delay.
Significance
Medium
Implications
By making poor use of caches origin servers encourage longer load
times, greater load on cache servers, and increased network demand.
Indications
The problem is most apparent for pages that have low or zero expires
time, yet do not change.
Solution(s)
...
Workaround
For example, servers could start using unique object identifiers for
write-once content: if an object changes it gets a new name, otherwise
it is considered to be immutable and therefore to have an infinite expire
age. Certain hosting providers do this already.
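One way to give changed objects a new name is to embed a digest of the
content in the name itself. The sketch below is an assumption about how
this might be done, not a description of what any hosting provider does;
the function name, the 12-character tag length, and the one-year max-age
are illustrative choices.

```python
import hashlib
import posixpath

# Response headers a server might attach to such immutable objects.
FAR_FUTURE = {"Cache-Control": "public, max-age=31536000"}


def versioned_name(path, body):
    """Embed a short digest of the content in the object name."""
    root, ext = posixpath.splitext(path)
    tag = hashlib.sha256(body).hexdigest()[:12]
    return f"{root}.{tag}{ext}"


old = versioned_name("/images/logo.gif", b"first version")
new = versioned_name("/images/logo.gif", b"second version")
print(old)
print(new)
```

Pages that reference the object must be updated to the new name whenever
the content changes, so only the referring page needs a short expiry.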
Contact
Peter Danzig <danzig@netapp.com>
----------------------------------------------------------------------------
Name
Some servers send bad Content-Length headers for files that contain CR
Classification
Implementation
Description
Certain web servers send a Content-Length value that is larger than the
number of bytes in the HTTP message body. This happens when the server
strips off CR characters from text files with lines terminated with
CRLF as the file is written to the client. The server probably uses the
stat() system call to get the file size for the Content-Length header.
Servers that exhibit this behavior include the GN Web server (version
2.14 at least) (http://gopher.unicom.com/gn-info/).
Significance
Low. Surveys indicate only a small number of sites run faulty servers.
Implications
In this case, an HTTP agent (client or proxy) may believe it received a
partial response. HTTP/1.1 (RFC 2616) advises that caches MAY store
partial responses.
Indications
Count the number of bytes in the message body and compare it to the
Content-Length value. If they differ the server exhibits this problem.
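The check can be sketched as below. The first function only mimics the
faulty behavior described above (length taken from the file on disk, CR
characters stripped on output) so that the check has something to detect;
both function names are hypothetical.

```python
def buggy_response(file_bytes):
    """Mimic the faulty server: length from the file, body with CR stripped."""
    content_length = len(file_bytes)            # what stat() would report
    body = file_bytes.replace(b"\r\n", b"\n")   # what is actually sent
    return content_length, body


def length_mismatch(content_length, body):
    """Declared minus actual byte count; 0 means the header is correct."""
    return content_length - len(body)


declared, body = buggy_response(b"line one\r\nline two\r\nline three\r\n")
print(length_mismatch(declared, body))
```

For a text file with CRLF line endings the mismatch equals the number of
lines, since one CR is dropped per line.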
Solution(s)
Upgrade or replace the buggy server.
Workaround
Some browsers and proxies use one TCP connection per object and ignore
the Content-Length. The document end of file is identified by the close
of the TCP socket.
Contact
Duane Wessels <wessels@ircache.net>
----------------------------------------------------------------------------
Administration
----------------------------------------------------------------------------
Name
Lack of fine-grained, standardized hierarchy controls
Classification
Administration
Description
There is no standard for instructing a cache as to how it should
resolve what parent to fetch a given object from. Because of this,
implementations vary greatly, and it can be difficult to make them
interoperate correctly in a complex environment.
Significance
Medium
Implications
Complications in deployment of caches in a complex network (esp.
corporate networks)
Indications
Inability of some caches to be configured to direct traffic based on
domain name, reverse lookup IP address, raw IP address, in normal
operation and in failover mode. Inability in some caches to set a
preferred parent / backup parent configuration.
Solution(s)
?
Workaround
Work with vendors to establish an acceptable configuration within the
limits of their product; standardize on one product
Contact
Mark Nottingham <mnot@pobox.com>
----------------------------------------------------------------------------
$Header: draft-wrec-known-prob-00.html,v 1.9 99/08/04 11:23:34 jad Exp $