Dan Li wrote:
> Hi, here is the slides on WCIP presented in the recent caching workshop.
>
> It's mostly a bag of ideas, a better write-up is on the way. Please
> comment, if you are interested in working on it.
>
> Thanks!
> Dan
>
> ------------------------------------------------------------------------
> Name: invalidation.ppt
> invalidation.ppt Type: Microsoft PowerPoint Show (application/vnd.ms-powerpoint)
> Encoding: base64
Overall, I agree with the motivation and high-level approach.
Providing invalidation-based consistency as an option would make it
easier to construct services that behave correctly. It could also
improve performance/scalability in a number of of situations.
There is one significant point where I think a small change in what is
described would significantly improve the protocol independent of most
other design decisions. In addition, I have a number of more
detailed comments, questions, and other issues to consider.
Below, I first discuss what I see as the main issue in detail. Then I
discuss the lower level comments using the outline of Li and Cao's
powerpoint presentation (sent to this mailing list a few weeks ago;
see ftp://cs.utk.edu/pub/wrec/mail-archive.2000-06) as a framework.
Main comment:
I mark this as the main comment not because it is particularly hard to
fix, but because not fixing it could, I believe, hamstring the
protocol by making it more complex and less expressive than it could
be with a slightly different approach.
The definition of "fresh bound of an object" described in the
powerpoint is awkward in that it gives the "heartbeat interval"
semantic meaning. In particular, the design bounds the worst case
staleness of an object to (fresh_bound +
heartbeat_interval). According to the slides, objects become stale as
follows: "if the cache receives no heartbeat from the invalidation
server in the last heartbeat interval, invalidate all objects whose
'fresh bound' timer has expired." The slides go on to give the
following example: if an object fresh bound is 5 minutes and the
heartbeat interval of the channel is 1 minute then the worst case
staleness of the object is 6 minutes.
I believe that the lease (or "fresh bound") on an object should define
the semantic guarantee and that the heartbeat interval should have no
semantic meaning. For example, one could use the following definition:
"if an object's freshness bound is F and the time since the cache's
last communication with the server (server-initiated heartbeat or
reply to a cache's request) is CS, the cached object is valid iff
(CS <= F); otherwise the cached object must be revalidated before use."
There are several advantages to this approach:
1) Performance -- separating these concerns allows the heartbeat
interval to be tuned to match link characteristics and performance
requirements. For example, as loss rate falls, lower heartbeat
rates (down to 1 hb per lease interval rather than multiple hb per
lease interval) may be appropriate. Or, busy servers may want to
reduce their heartbeat rate (trading increased risk of "stranding"
clients for reduced overhead). Or, some servers may not want to
send periodic heartbeats at all (piggy-backing "I'm alive" messages
on other messages to clients actively using the service and
dropping idle clients.)
2) Clarity -- Telling web masters that they should set their
"freshness_bound = 5 minutes" to get, say, a 6 minute maximum
stalenes is an API that will confuse people.
3) Flexibility -- Separating the heartbeat interval from the freshness
guarantee makes it easier to support a broad range of consistency
guarantees. Conversely, if the heartbeat interval must be smaller
than the staleness guarantee, small staleness guarantees in the
seconds to tens of seconds range become very expensive because they
force the system to use even shorter heartbeats. Furthermore,
semantics that separate heartbeats from maximum staleness can be
extended in a straightforward way to allow services to provide
strong consistency guarantees under which clients never see stale
data; that is not an option if heartbeats have this semantic
baggage.
4) Flexibility -- Separating the heartbeat interval from the freshness
guarantee makes it easier to support different implementations of
heartbeats from unreliable multicast to unreliable unicast to
reliable unicast to client polling.
// Aside:
// Two pieces of background that I am assuming in the above discussion.
// 1) Protocols will piggyback "I'm alive"/volume lease
// renewal/heartbeat on all client->server messages so that the system
// can function correctly without heartbeats
// 2) Protocols that use heartbeats and that run on unreliable transport
// may want to send multiple heartbeats per lease interval to avoid the
// need to resynchronize state when one or two messages are lost.
Detailed comments on slides
---------------------------
Slides 1-3
----------
Not much to argue with here...
Slide 4
-------
1.
The slide implies that active server-push heartbeats are required by
the protocol. As indicated above, I think we may want to consider
making server-push heartbeats optional (e.g., the system renews the
leases of active clients by piggy-backing renewals on other traffic
and/or allow clients to send explicit "lease renewal" messages rather
than requiring server push).
If the definitions are cleaned up as discussed above, this option
comes more or less for free.
2.
"Delivery: unicast, application-layer multicast, or IP multicast."
I agree with the goal of making the consistency protocol independent
of the transport protocol to the extent possible.
One other option to put on the wish list: hierarchy of consistency
servers (perhaps this was already assumed.)
Another one to put on the wish list: unreliable unicast (the
later slides imply that unicast = persistent connection/TCP).
(I'm not saying that we want to support all of these protocols, but
it would be nice to have a systematic way to decide which ones to
support.)
One other point to keep in mind as we hash out the details: there may be
some subtle effects of different transports on the semantics you end
up with.
For example: suppose you use unreliable IP multicast for both
invalidations and heartbeats and an invalidation message is lost but a
client receives a subsequent heartbeat. Unless the protocol goes to
some trouble, there is effectively no upper bound on the maximum
staleness of a cached object (which may be much weaker semantics than
some users might expect or desire from the protocol.)
When I think about these protocols, this dilemma (desire for protocol
independence v. the need for certain guarantees from the transport
layer in order to provide certain desirable semantics guarantees) is
one of the things I struggle with.
I don't have a great answer. Here are some options to get the
discussion started:
o Varying semantics
Define a simple protocol that is independent of transport semantics,
allow different servers to specify different transports to be used,
and allow the semantics delivered to the application vary depending
on the transport chosen by the server. To make this acceptable, we
should precisely charactactize the semantics provided by different
transport choices.
The advantage of the approach is that it may offer the chance to
provide a simple protocol that has good performance and semantics in a
wide range of implementations and it lets servers pay for just the
semantics they need.
A disadvantage is that it is not satisfying to have the tranport layer
choice affect semantics.
o Limit network choices or limit semantics
Define a simple protocol that makes specific assumptions about the
network transport and that provides specific semantics. If a transport
doesn't meet the assumptions, it cannot be used (or requires a layer
above it to fix the semantic limitations). If a transport layer
provides stronger guarantees than assumed by the protocol, the system
may pay unneeded overhead as these guarantees are re-implemented at
the consistency protocol level.
The disadvantage of this approach is that we face a dilemma: if we
specify stronger semantics (e.g., specify a protocol that guarantees
some maximum staleness) we eliminate certain transports (or increase
the complexity of using certain transports); conversely, if we specify
weaker semantics, we leave some applications unsatisfied.
The advantage is that we may be able to hash out a specific solution
to the compromise that keeps everyone happy.
The obvious example of a successful application of this strategy is
HTTP -- HTTP uses a somewhat heavyweight network protocol that keeps
the HTTP protocol simple but that may give up the opportunities for
optimizing performance. We should keep this option in the back of our
minds before we commit to supporting a wide range of transports.
o Provide protocol "hooks" for varying semantics
Define a simple protocol, but add some optional features to provide
stronger semantics on weaker networks.
The disadvantage of this approach is protocol complexity.
3.
"POST to update the object (Delta encoding)"
It is worth noting that this really raises the ante w.r.t security. I
have not tried to think through these details.
Slide 5
-------
1.
Question: I assume the client-->server message "Invalidation: ..." is
piggy-backed on the GET request?
2.
One option we may want to consider: clients may want to specify a
maximum time that the server is entitled to send invalidations (or
updates) to the client. I'm not sure this is vital, but it might be
important in some circumstances. The point is that pushing things to a
client does chew up client resources (especially if the server is
pushing updates, not just invalidates) so one might want to give
clients the option of limiting this usage.
(My initial reaction would be to omit this feature, but it may be
worth discussing.)
3. Given the (somewhat peculiar IMHO) definition of heartbeats
proposed in the powerpoint, something may be missing (or it may be a
deliberate omission for the powerpoint) -- if object expirations are a
function of the heartbeat interval, the server needs to tell the
client what heartbeat interval it is using in the server-->client
"Invalidation:" message. If the definition is fixed, this is not an
issue.
Slide 6
-------
1.
"Join the invalidation channel if it caches quite a few objects of the
channel"
Hm. I don't know off hand how to set this magic number.
Is a "channel" equivalent to a server (an IP address?); is it intended
that more general 'volumes' be supported? (It looks like this protocol
supports those by allowing a server to give out different
"Invalidation:" IP addresses for different volumes, which seems like a
nice feature.)
2.
"In unicast, establish a persistent connection with the
invalidation server" Note that depending on the decisions made about
the network/protocol semantics above, the persistent connection may
end up having semantic meaning (which could be awkward.) E.g., one
might expect that if you are using a persistent connection that you
would get things like in-order, reliable delivery. But if the
connection gets torn down and another one started, you might lose some
of those properties. If you are relying on those properties, you would
need to have the "persistent connection re-established" event passed
up to the consistency protocol, which could be problematic.
3.
The structure of the message that the client sends to the server to
connect (for tcp-unicast) probably needs to be defined
4.
It might also be worth considering supporting unreliable unicast
(e.g., UDP) (so unicast = persistent connection might not be quite
right).
Slide 7
-------
1.
"Send heartbeats if no message is sent in the last heartbeat interval"
seems a bit awkward.
2.
Should the system have the option of using unicast for invalidations
and multicast for heartbeats? (What new semantic issues arise?)
Slide 8
------
(main comments: see discussion above)
1.
Incidently, I believe the freshness bound stated (fresh_bound +
heartbeat_interval) is not quite right (even if the original
definition for fresh bound/heartbeat is used). I believe that the
proposed protocol actually provides a freshness bound of: (fresh_bound
+ heartbeat_interval + max_network_delay). Note that the alternate
definition also has this property (its guarantee is fresh_bound +
max_network_delay).
In theory this could significantly weaken semantic guarantees but in
practice this is probably something that could be lived with.
At the same time, a small refinement of the definition can get rid of
this glitch (at least for lease renewals "piggy-backed" on replies to
client requests): when a client sends a request at time T1 and
receives a reply at time T2, it knows that the server was alive at
some instant between time T1 and T2. Rather than expire objects at
time T2 + F, the client can expire objects at time T1 + F.
I don't think this is a huge issue, but I think the protocol is a bit
cleaner this way with essentially no additional cost. It also improves
flexibility by supporting correct behavior with very short leases.
Unfortunately, this trick doesn't help in the case of servers
"pushing" heartbeats. Ideas anyone?
2.
It seems like there is an implicit assumption that heartbeats are sent
via an unreliable channel so that the heartbeat interval is much
shorter than the fresh bound). (Then if one or two heartbeat are lost,
the client-server state stays synchronized.) If heartbeats are sent on
a reliable channel, you just want to send them once, just before the
lease expires.
3.
If leases expire (e.g., no heartbeats for a while), the simple way to
re-synchronize the state is to discard objects, and that is what is
indicated here. That is probably the prudent way to go. Still, there
may be some benefits to considering other reconnection protocols
that might have some nice performance benefits.
Slide 9
-------
see comments above
Slide 10
--------
1.
Should a transparent caches be able to reply to get-if-modified/
lease-renewal requests? The advantage of allowing them to do so is
improved performance. Given better consistency guarantees to control
the semantics of caches, this (conceptually) should be possible. The
challenge in doing so is probably proving sensible semantics across
failures.
2. hierarchy
In addition to working through the details of transparent hierarchical
caches, we should work through the details of explicit hierarchical
caches (in particular the 2-level client/proxy case). Again, I suspect
the main issue is failure semantics.
Slide 11
--------
1.
In addition to supporting scalability via application-level multicast,
the system should support scalability via hierarchy of
caches/invalidation servers. In addition to providing a similar type
of scalability as application-level multicast, supporting hierarchies (a)
provides a simple way to build scalable invalidation servers on a
cluster and (b) can improve lease renewal/reconnection latency by
letting consistency servers near clients locally handle client requests.
Slide 12
--------
1.
One other option to add is to specify an SSL or IPSec channel in the
"Invalidation:" header (where there would normally be just an IP
address.) This approach might let you factor the security out of this
protocol and leave it to the transport layer (this is in the spirit of
the "vary semantics by transport layer" approach above -- depending on
the transport layer the server uses, the service will get different
semantic guarantees.
2.
Let me see if I understand how the ETAG is working to prevent
replays. If a client is caching object O with Etag E, the server sends
an invalidate message of roughly the form:
Invalidate O E {MD5{Invalidate 0 E}}K_server_private
So the client knows this is a "fresh" request. If the signature
doesn't match or E does not match the client's entity tage for O, the
client ignores the message; similar procedure for update, right.
This seems reasonable, although it is a bit awkward if lossy channels
are to be supported -- if a client misses an invalidation message,
the old version of that object becomes "stuck" in the cache
forever. It would be nice if these were self-healing in some way
(e.g., one could specify a monotonically increasing version number
rather than an opaque etag, but I guess that has its own problems.)
Other issues
------------
(some of which I have candidate answers for and some of
which I do not..)
o recovery (after server/ crash)
e.g., suppose server crashes and recovers before leases expire
One possible solution is to use server "epoch numbers" to allow
clients to detect when lease renewals/heartbeats do not apply to
object leases they are holding. An alternate implementation of this
concept would be for servers not to re-use the same heartbeat channel
when they recover (at least until max_lease has passed).
o How to set "legacy" cache control headers in different scenarios
(esp. in a hierarchy where some levels speak "invalidation" and some
don't) For example, what should happen when a server and proxy speak
"invalidation" but want to support clients talking to the proxy that
don't.
o Should the system support atomic invalidation of multiple objects?
(I can think of a number of advantages of doing so, and it seems
pretty straightforward to do.)
Well, hopefully this is enough to get discussion rolling...
I look forward to continuing this discussion and getting a protocol
out there.
-mike
This archive was generated by hypermail 2b29 : Thu Nov 18 2004 - 11:21:28 MST