Re: oh the irony

From: Howard Davis (HOWARD_DAVIS@novell.com)
Date: Tue Aug 03 1999 - 13:06:57 MDT


Since many origin servers do not support persistent connections, bad HTTP implementations get away with sending incorrect Content-Length headers. The browser can use the FIN to detect the true end of the content and ignore the bad Content-Length header. However, when a proxy fetches the content from the origin over a non-persistent connection and then transmits it to the browser over a persistent connection, the browser no longer sees a FIN marking the end of the content, and the bad Content-Length header hangs the persistent connection with the browser.

To get around the problem in our Novell BorderManager proxy and Novell Internet Cache System (ICS) proxy, we added code to track which web servers send us bad Content-Length headers. When we detect that the length of a reply does not match the Content-Length header that came with it, we remember this. On all future requests to that particular host, our proxy ignores the Content-Length header sent by the host and generates a new one based on the actual size of the content received. We then use this corrected Content-Length header to transmit the reply to the browser over a persistent connection. Some extra logic is also needed for the case where the reply is extremely large and the browser has been waiting too long without a response: there we decide it is better to begin transmitting the reply to the browser without a Content-Length header and to change the HTTP connection to a non-persistent one.
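
The host-tracking idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the BorderManager or ICS code: the names BadLengthTracker, parse_length, headers_for_persistent_reply and headers_for_early_streaming are hypothetical, headers are modelled as a plain dict keyed by the exact string "Content-Length", and the reply is assumed to have been read in full from the origin until it closed the connection.

```python
class BadLengthTracker:
    """Remembers origin hosts whose Content-Length did not match the body."""

    def __init__(self):
        self._bad_hosts = set()

    def is_bad(self, host):
        return host.lower() in self._bad_hosts

    def mark_bad(self, host):
        self._bad_hosts.add(host.lower())


def parse_length(value):
    """Return the declared length as an int, or None if absent or malformed."""
    try:
        length = int(value)
    except (TypeError, ValueError):
        return None
    return length if length >= 0 else None


def headers_for_persistent_reply(tracker, host, origin_headers, body):
    """Headers for a reply that was fully read from the origin (until FIN)."""
    declared = origin_headers.get("Content-Length")
    if declared is not None and parse_length(declared) != len(body):
        tracker.mark_bad(host)  # remember the offender for future requests
    headers = dict(origin_headers)
    headers["Content-Length"] = str(len(body))  # always the measured size
    return headers


def headers_for_early_streaming(origin_headers):
    """Fallback: start sending before the size is known, so give up persistence."""
    headers = {k: v for k, v in origin_headers.items() if k != "Content-Length"}
    headers["Connection"] = "close"
    return headers
```

In a proxy that normally streams replies as they arrive, is_bad(host) would presumably be the check that decides whether to trust the origin's header or to measure the body first; that wiring is an assumption and is left out here.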

Another case is origin servers that negotiate a persistent connection with our proxy and then send a bad Content-Length header over that connection. In this case we again detect the problem and remember that this host sends bad Content-Length headers on persistent connections. From then on, we always negotiate non-persistent HTTP connections with the offending host.
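
The per-host fallback for this second case might look roughly like the following Python sketch. The class OriginConnectionPolicy and its methods are hypothetical names, and how the mismatch is detected on a persistent connection is not shown because it is not described here. The sketch assumes the proxy asks for persistence with "Connection: Keep-Alive" and declines it with "Connection: close".

```python
class OriginConnectionPolicy:
    """Chooses the Connection header to send to each origin host."""

    def __init__(self):
        self._no_persistent = set()

    def note_bad_length_on_persistent(self, host):
        # Called after a Content-Length mismatch was seen on a persistent connection.
        self._no_persistent.add(host.lower())

    def request_headers(self, host):
        headers = {"Host": host}
        if host.lower() in self._no_persistent:
            headers["Connection"] = "close"       # never persist with this host again
        else:
            headers["Connection"] = "Keep-Alive"  # default: try to persist
        return headers
```

Keeping this list separate from the list of hosts that send bad lengths on non-persistent connections matches the two cases above, which call for different remedies.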

With these workarounds we have been able to keep persistent connections as the default behavior. Maintaining lots of persistent connections can use up a lot of system resources. However, the code can be designed to minimize resource utilization. Our proxy is designed as an asynchronous state machine. It does not use a thread to listen on each connection, and does not even assign a thread to process each HTTP request that is received. Instead, requests are processed by a state machine, with communication, disk, and time events driving the state transitions. Because of this, the overhead to maintain a persistent connection is less than 400 bytes of RAM per connection, and there is no process or thread overhead. This allows us to maintain 100,000 simultaneous persistent connections using less than 40 MB of RAM, or 200,000 simultaneous persistent connections using less than 80 MB of RAM, etc.
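
To make the event-driven design and the memory arithmetic concrete, here is a toy Python sketch. It only illustrates the structure: the states, events, and transition table are invented for the example and are not the proxy's real ones, and a Python object will not fit in 400 bytes the way a compact per-connection record in the actual proxy does.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()       # persistent connection open, no request in progress
    FETCHING = auto()   # waiting for content from cache or origin
    SENDING = auto()    # transmitting the reply to the browser
    CLOSED = auto()


class Event(Enum):
    REQUEST_RECEIVED = auto()  # communication event
    CONTENT_READY = auto()     # disk or communication event
    REPLY_SENT = auto()        # communication event
    IDLE_TIMEOUT = auto()      # time event


TRANSITIONS = {
    (State.IDLE, Event.REQUEST_RECEIVED): State.FETCHING,
    (State.FETCHING, Event.CONTENT_READY): State.SENDING,
    (State.SENDING, Event.REPLY_SENT): State.IDLE,
    (State.IDLE, Event.IDLE_TIMEOUT): State.CLOSED,
}


class Connection:
    """Per-connection record: only data, no thread or process attached."""
    __slots__ = ("conn_id", "state")

    def __init__(self, conn_id):
        self.conn_id = conn_id
        self.state = State.IDLE

    def handle(self, event):
        # Events with no entry in the table leave the state unchanged (an assumption).
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state


def ram_for_connections(count, bytes_per_connection=400):
    """Upper bound in bytes, using the 400-byte-per-connection figure above."""
    return count * bytes_per_connection
```

With the 400-byte bound, ram_for_connections(100_000) gives 40,000,000 bytes and ram_for_connections(200_000) gives 80,000,000 bytes, which is where the 40 MB and 80 MB figures come from.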

I hope this information is useful.

Howard

>>> John Dilley <jad@pimlico.hpl.hp.com> 07/31/99 10:40AM >>>

...I've heard anecdotal evidence of content length being incorrect
in responses and browsers having to accommodate that. Is anyone willing
to formally summarize those anecdotes, their implications, and current
workarounds/solutions? Thanks,


