2/5 Re: I-D ACTION:draft-danli-wrec-wcip-00.txt

From: Mike Dahlin (dahlin@cs.utexas.edu)
Date: Tue Nov 21 2000 - 20:08:09 MST


2. Section 3.2.3 IP Multicast

Two main comments. First, I have tried to re-organize the section to
make the fundamental issues more clear and to separate the "short-term
hack" for application-level reliability to support unreliable IP
multicast from the rest of the discussion (since that is the section
most likely to change or get deleted over time). Second, I chew on the
application-level reliability discussion a bit more and try to
generalize it to accommodate the Yu et al. approach.

<<<<< BEGIN proposed update to 3.2.3 >>>>>>>

3.2.3 IP multicast

   In this mode, an IP multicast group is allocated for the
   invalidation channel and its address is advertised as part of the
   channel information. The invalidation client subscribes to this
   multicast group to receive cache invalidations and/or object
   updates. Ideally, it's a single-source multicast group, meaning that
   the invalidation client subscribes to the sender and group address
   pair <S, G>, where S is the invalidation server address and G is the
   multicast group address.

   IP multicast removes the scalability concern at the invalidation
   server in that the invalidation server now only needs to send one
   copy of any message. Plus, it doesn't maintain per-client state. A
   multicast invalidation channel is much more efficient than unicast-
   based cache consistency schemes.

   Worth noting, however, is that anything the invalidation server
   sends to the invalidation channel goes to every
   subscriber. Therefore, objects covered by a multicast invalidation
   channel need to be correlated so that if an invalidation client is
   interested in or has cached some objects of the channel, it is
   highly likely that it will cache the other ones. For example, CNNfn
   top stories should belong to one channel while ESPN top stories
   belong to another.

   There are two issues in implementing WCIP over IP multicast: (1)
   establishing synchronization and (2) retaining synchronization
   despite packet loss.

   (1) Establishing synchronization

   To establish synchronization, the client must ensure that all
   cached objects' versions (Last modified times or Etags) match the
   most recent versions known to the consistency servers. The standard
   unicast approach for establishing synchronization -- the client
   sends a Registration message to the server containing the version
   numbers of cached objects and the server replies with current
   version numbers of those objects (see Section 5.2.1) -- has two
   problems. First, it is not scalable in that it can overload the
   consistency server. Second, because the server's reply would have to
   be sent on the multicast channel (in order to maintain ordering and
   reliability constraints), all clients would see all registration
   replies, which would be inefficient.

   Therefore, multicast channels should use a server-driven approach to
   establish synchronization: the invalidation server periodically
   transmits resynchronization data (e.g., a list of objects' current
   Etags) to allow clients to re-synchronize their object freshness
   state.
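
A minimal sketch of the server-driven approach, for discussion only. It
assumes the server cycles through the channel's objects in fixed-size
batches and that a client compares each advertised Etag with its cached
one. The names resync_batches and apply_resync and the batch size are
hypothetical; none of them come from the draft.

```python
from typing import Dict, Iterator, List, Tuple

Batch = List[Tuple[str, str]]  # (object name, current Etag) pairs


def resync_batches(current_etags: Dict[str, str], batch_size: int) -> Iterator[Batch]:
    """Server side: yield the channel's (object, Etag) pairs in batches.

    A real server would send one batch per timer tick and start over
    after the last one; here we only produce a single pass.
    """
    items = sorted(current_etags.items())
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def apply_resync(cache: Dict[str, str], batch: Batch) -> Tuple[List[str], List[str]]:
    """Client side: classify cached objects named in one batch.

    Returns (synchronized, stale). Objects the client has not cached
    are ignored, since there is nothing to resynchronize for them.
    """
    synchronized: List[str] = []
    stale: List[str] = []
    for name, etag in batch:
        if name not in cache:
            continue
        if cache[name] == etag:
            synchronized.append(name)
        else:
            stale.append(name)
    return synchronized, stale


server_etags = {"foo": "A", "bar": "z", "baz": "q"}
client_cache = {"foo": "A", "bar": "old"}
for batch in resync_batches(server_etags, batch_size=2):
    print(apply_resync(client_cache, batch))
```

The server does no per-client work in this sketch: every client sees the
same batches and picks out only the objects it holds.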

   The client-driven and server-driven resynchronization protocols are
   discussed in more detail in section TBD. <see MDD main comment #3>

   (2) Retaining synchronization despite packet loss

   To provide consistency guarantees, invalidation/heartbeat message
   channels must maintain the following invariant: the invalidation
   client must never receive a heartbeat without first receiving all
   preceding invalidations sent to it.

   This can be done in one of two ways.

   First, the system may use a reliable multicast transport protocol
   (e.g., PGM, SRM, etc.) by specifying the protocol to be used in the
   channel information header (see section 4.1). The invalidation
   protocol then proceeds as it does for the reliable unicast case:
   clients invalidate objects when they receive invalidation messages,
   and they re-register (but using server-driven re-registration as
   described above and in section TBD) if the transport layer detects
   a lost packet. Note: the WCIP channel information header does not
   currently define any headers for reliable multicast protocols.

   Second, because no reliable multicast transport protocols are
   widely deployed, WCIP provides an application-level reliability
   protocol to allow it to run on top of unreliable transports such as
   raw IP multicast.

MDD:
OK. I think we may agree on everything up to here. Then the remaining
task is to define how application-level reliability should be done.

After chewing on the specified protocol a bit, I finally understand it
and basically like it. I particularly like the new idea in the draft of
incrementally resynchronizing. In fact, this notion might be worth
generalizing to the unicast reconnection case (see main comment 3 in
notes.11.20.2000b.txt).

The potential disadvantage is that if packet loss --> sync loss,
synchronization losses may be common and this protocol requires work
in proportion to the size of a volume to re-establish
synchronization. I can easily imagine situations where clients spend a
majority of their time unsynchronized. Fundamentally, the other option
is to pay a cost proportional to the number of objects being
invalidated to reduce the probability that synchronization is lost by
retransmitting.

Note that since this only reduces the probability, there still needs
to be a way to establish synchronization, and I like your approach for
that.

Now that I finally understand your protocol, however, it appears to me
that it is simple to add this second mode of operation as an optional
optimization. This is attractive for two reasons (1) it lets an
implementation balance the "per-object" overhead against the
"per-invalidation" overhead; I can imagine there are some systems
where one is the dominant factor and others where the other is; (2)
it is general enough to accommodate, say, Yu, Breslau, and Shenker's
protocol (SIGCOMM99).

As always the question is whether the added complexity is worth the
performance benefit. Especially given that this is a "stop gap" until
real reliable multicast comes along... I could be swayed either way on
that. (My instinct is that if we support unreliable IP multicast at
all, then this is worth adding.)

/MDD

MDD
Protocol from current draft (wording slightly modified; one bug fixed;
optimizations added at the end):
/MDD

   Absent an off-the-shelf real-time reliable multicast protocol, WCIP
   allows for an unreliable transport protocol with application-level
   recovery from failures.

   (1) The invalidation server marks invalidation channel messages
        with incrementing sequence numbers;

MDD
NOTE: (bug fix I think) heartbeat packets need sequence numbers
too. Otherwise, we can not detect the case when the last invalidation
message before a heartbeat is lost.
/MDD

   (2) Whenever the invalidation client sees a sequence number gap, it
        considers itself to have lost synchronization with the
        channel. The client marks all objects as "no-object-lease"
        and reverts to following the normal HTTP Cache-control
        directives.
MDD
I don't care what this state is called, but we refer to this state in
a bunch of situations. Rather than saying "revert to following normal
HTTP Cache Control directives" each time, it would be more clear to
describe the state machine for each object and then just say "set all
objects to state X". See main comment 3 in notes.11.20.2000b.txt
/MDD

   (3) The invalidation client resynchronizes according to the
        resynchronization protocol specified for the channel.
        Typically, this is server-driven resynchronization in which the
        server periodically transmits the current version numbers
        of objects. Note that resynchronization messages must
        also include sequence numbers.
   (4) Once the invalidation client resynchronizes the freshness state
        of certain objects, it switches those objects from HTTP cache-
        control back to WCIP freshness guarantees.
   (5) As more re-synchronization messages arrive, the invalidation
        client gradually reinstates all its objects back to WCIP
        freshness guarantees. In fact, a cache proxy may join the
        multicast channel and become gradually synchronized this way
        without ever directly contacting the invalidation server via
        unicast.
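
Steps (1) through (5) can be sketched as a small per-object state
machine. This is an illustration under stated assumptions, not the
draft's definition: the class name, the three message kinds, and the
choice to drop an object whose advertised Etag differs from the cached
one are all hypothetical. Heartbeats carry sequence numbers here, per
the bug-fix note above.

```python
NO_LEASE = "no-object-lease"  # object falls back to HTTP Cache-Control
WCIP = "wcip"                 # object is covered by WCIP freshness guarantees


class InvalidationClient:
    def __init__(self, cache, first_seqno):
        self.etags = dict(cache)  # object name -> cached Etag
        # A client that has just joined is not yet synchronized (step 5).
        self.state = {name: NO_LEASE for name in self.etags}
        self.expected = first_seqno

    def _drop(self, name):
        self.etags.pop(name, None)
        self.state.pop(name, None)

    def receive(self, seqno, kind, entries=()):
        """Handle one channel message of kind 'invalidate', 'resync' or 'heartbeat'."""
        if seqno < self.expected:
            return  # duplicate or stale packet: ignore it
        if seqno > self.expected:
            # Step 2: sequence gap, so every object loses its WCIP guarantee.
            self.state = {name: NO_LEASE for name in self.state}
        self.expected = seqno + 1

        if kind == "invalidate":
            for name, _new_etag in entries:
                self._drop(name)
        elif kind == "resync":
            # Steps 3-5: reinstate objects one resync message at a time.
            for name, etag in entries:
                if name not in self.etags:
                    continue
                if self.etags[name] == etag:
                    self.state[name] = WCIP
                else:
                    self._drop(name)
        # A heartbeat carries no entries; only its sequence number matters.


client = InvalidationClient({"foo": "A", "bar": "z"}, first_seqno=1)
client.receive(1, "resync", [("foo", "A"), ("bar", "z")])
client.receive(2, "heartbeat")
client.receive(4, "invalidate", [("bar", "y")])  # seqno 3 was lost
print(client.state)
```

Note that the client in this sketch never sends anything, which matches
the observation in step (5) that a proxy can synchronize without
contacting the server via unicast.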

MDD
Here are the new optimizations for discussion
/MDD

   A client MAY process or buffer invalidation packets received with
   out-of-order sequence numbers. A client MUST NOT process a
   heartbeat until (a) it has processed all preceding sequence
   numbers or (b) it has declared a loss of synchronization and set
   the state of all objects in the channel to state "no-object-lease"
   (see Section TBD).
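
One way to meet the heartbeat rule is sketched below. It takes the
conservative "buffer" option and holds every packet, not only
heartbeats, until the packets before it have arrived. The class and
method names are hypothetical, and when to give up waiting (a timer,
a buffer limit) is left to the caller.

```python
class ReorderBuffer:
    def __init__(self, first_seqno):
        self.next_seqno = first_seqno  # lowest seqno not yet delivered
        self.pending = {}              # seqno -> buffered message

    def receive(self, seqno, message):
        """Buffer one packet; return the messages now deliverable in order."""
        if seqno < self.next_seqno or seqno in self.pending:
            return []  # duplicate: already delivered or already buffered
        self.pending[seqno] = message
        ready = []
        while self.next_seqno in self.pending:
            ready.append(self.pending.pop(self.next_seqno))
            self.next_seqno += 1
        return ready

    def declare_loss(self):
        """Option (b): stop waiting for the missing packets.

        The caller is assumed to have set all objects in the channel to
        "no-object-lease" before processing what this returns.
        """
        ready = [self.pending[s] for s in sorted(self.pending)]
        if self.pending:
            self.next_seqno = max(self.pending) + 1
        self.pending.clear()
        return ready


buf = ReorderBuffer(first_seqno=100)
print(buf.receive(101, "invalidate bar"))  # held: 100 is still missing
print(buf.receive(102, "heartbeat"))       # held: heartbeat must wait
print(buf.receive(100, "invalidate foo"))  # releases 100, 101, 102 in order
```

A less conservative client could apply invalidations as soon as they
arrive and buffer only heartbeats, which the MAY in the text permits.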

   An invalidation server MAY send a packet multiple times
   to reduce the probability that synchronization is lost.

   To reduce the cost of retransmission, an invalidation server MAY
   group multiple packets into a single retransmission. In this case,
   the sequence number field indicates a range of sequence numbers,
   specifying which numbers are included in the packet. Such a packet
   MAY include "new" information, in which case the sequence number
   range is extended to also include a new sequence number. For
   example, an invalidation server could include a range of preceding
   invalidations in each heartbeat message to make sure that
   invalidation clients "catch up" with missing messages by the end
   of a heartbeat interval (Yu, Breslau, and Shenker, SIGCOMM99):

      seqno=100 invalidate "foo" Etag=A
      seqno=101 invalidate "bar" Etag=z
      seqno=102 invalidate "baz" Etag=q
      seqno=100-103 invalidate "foo" Etag=A, invalidate "bar"
      Etag=z, invalidate "baz" Etag=q, heartbeat
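
A sketch of how a client might consume such a range packet. It assumes
one sequence number per item in the range, as in the example above, and
the function name new_items is hypothetical. The client skips the items
it has already processed and reports a gap only when the range starts
after the next sequence number it needs.

```python
def new_items(next_seqno, first, last, items):
    """Pick out the not-yet-processed items of a range packet.

    next_seqno: lowest sequence number the client has not processed
    first, last: inclusive sequence number range carried by the packet
    items: one entry per sequence number in the range, in order

    Returns (items_to_process, updated_next_seqno), or None when the
    packet does not cover next_seqno, i.e. synchronization is lost.
    """
    if len(items) != last - first + 1:
        raise ValueError("range does not match number of items")
    if first > next_seqno:
        return None              # packets before `first` are missing
    if last < next_seqno:
        return [], next_seqno    # nothing new: pure retransmission
    return items[next_seqno - first:], last + 1


packet = ["invalidate foo", "invalidate bar", "invalidate baz", "heartbeat"]
# The client saw seqno 100 but lost 101 and 102; the 100-103 packet repairs that.
print(new_items(101, 100, 103, packet))
```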

<<<< END proposed update to 3.2.3 >>>>


