4. Delayed invalidations
The following is implicit (I think) in the current draft. It may be
worth making it explicit by adding a subsection in 3 or 4 (or if I'm
wrong and these rules are not true, then a discussion should
definitely be added...):
The protocol does provide some flexibility about when invalidations
are sent. In particular, an invalidation server SHOULD send object
invalidations "reasonably" soon after it learns of an object change,
but it MAY delay them until some time before the subsequent
heartbeat. (It MUST NOT delay them further.)
Examples:
o An invalidation server might place an upper bound on the rate at
which invalidations are sent (to smooth bursts of load)
o An invalidation server might delay an invalidation for a few seconds
in hope of grouping invalidations to several objects into the same
message.
o An invalidation server might prioritize "active" clients (those
that have recently registered new object callbacks) over "idle"
clients (those that are connected but don't seem to be actively
using the channel's volume) when sending invalidations.
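To make the delay rule concrete, here is a rough sketch of a per-channel
queue that groups invalidations for a short time but never holds one past
the next heartbeat. It is only an illustration of the rule as I read it,
not something taken from the draft: the class name, the fixed
heartbeat_interval, and the max_batch_delay knob are all assumptions made
up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class InvalidationQueue:
    """Hypothetical per-channel queue; every name here is illustrative."""
    heartbeat_interval: float   # assumed fixed spacing between heartbeats (s)
    max_batch_delay: float      # assumed grouping window (s)
    next_heartbeat: float = 0.0
    pending: list = field(default_factory=list)  # (learned_at, object_id)

    def object_changed(self, object_id, now):
        # The server learns of a change; it may send later, within limits.
        self.pending.append((now, object_id))

    def send_deadline(self):
        # Latest allowed send time for the oldest pending invalidation:
        # the grouping window, capped by the next heartbeat (MUST NOT pass it).
        if not self.pending:
            return None
        oldest = self.pending[0][0]
        return min(oldest + self.max_batch_delay, self.next_heartbeat)

    def poll(self, now):
        # Return the batch to send at `now`, or [] if we may keep waiting.
        deadline = self.send_deadline()
        if deadline is None or now < deadline:
            return []
        batch = [oid for _, oid in self.pending]
        self.pending.clear()
        return batch

    def heartbeat(self, now):
        # Flush whatever is still queued just before the heartbeat goes out,
        # then schedule the next heartbeat.
        batch = [oid for _, oid in self.pending]
        self.pending.clear()
        self.next_heartbeat = now + self.heartbeat_interval
        return batch
```

Rate limiting or active-before-idle ordering would sit on top of poll();
the only hard constraint in the sketch is the cap at next_heartbeat.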
The above examples are minor (and, I think, already implicitly allowed
by the current draft). The following is a more significant point. It
is ambiguous whether the current draft allows it, and it is debatable
whether it should (my initial inclination is that it should).
Consider a unicast-tcp invalidation channel. At least the following
two states are defined:
o connected, synchronized -- a TCP connection is established
and the server is periodically sending heartbeats
o disconnected -- no TCP connection is established
Although not explicit in the current description, it is possible to
have a middle point:
o connected, idle -- a TCP connection is established, but the
server is not sending heartbeats on it; the server is queuing
invalidations on this channel and sending them when idle capacity
allows; if the client becomes active again, the client will
send a request to the server, forcing heartbeats to begin again
(and the first heartbeat will include any queued invalidations).
(This could also be done with sequence numbers across connections,
allowing connection state to be reclaimed for idle clients; I describe
it this way to minimize changes from the current draft.)
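The three states can be sketched as a small server-side state machine.
This is a sketch under assumptions, not text from the draft: the class
and method names are hypothetical, and how the server decides that a
client has gone idle is left to the caller (go_idle() below), since that
policy is not specified here.

```python
from enum import Enum


class ChannelState(Enum):
    DISCONNECTED = "disconnected"
    SYNCHRONIZED = "connected, synchronized"
    IDLE = "connected, idle"


class Channel:
    """Hypothetical server-side view of one unicast-tcp channel."""

    def __init__(self):
        self.state = ChannelState.DISCONNECTED
        self.queued = []  # invalidations held while the channel is idle

    def connect(self):
        # Assumes the full resynchronization has completed.
        self.state = ChannelState.SYNCHRONIZED
        self.queued = []

    def go_idle(self):
        # Server stops heartbeats but keeps the TCP connection open.
        if self.state is ChannelState.SYNCHRONIZED:
            self.state = ChannelState.IDLE

    def invalidate(self, object_id):
        # Return the object ids to send right now.
        if self.state is ChannelState.SYNCHRONIZED:
            return [object_id]
        if self.state is ChannelState.IDLE:
            self.queued.append(object_id)
        return []  # idle: queued; disconnected: client must resync later

    def drain(self, limit):
        # Send up to `limit` queued invalidations when spare capacity allows.
        if self.state is not ChannelState.IDLE:
            return []
        batch, self.queued = self.queued[:limit], self.queued[limit:]
        return batch

    def client_request(self):
        # Client is active again: heartbeats resume, and the first one
        # carries everything still queued.
        if self.state is not ChannelState.IDLE:
            return []
        self.state = ChannelState.SYNCHRONIZED
        first_heartbeat, self.queued = self.queued, []
        return first_heartbeat

    def disconnect(self):
        # All synchronization is dropped, as in the current draft.
        self.state = ChannelState.DISCONNECTED
        self.queued = []
```

The point of the middle state shows up in client_request(): recovery
costs work proportional to the queued invalidations rather than to the
number of objects in the volume.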
The argument for including this:
o The optimization of stopping sending invalidations to idle
clients is important for both average load and peak load in the
studies we've done
(http://www.cs.utexas.edu/users/dahlin/papers/olympics-consistency.ps,
http://www.cs.utexas.edu/users/dahlin/papers/tkde99.ps).
o The current draft makes "stopping sending invalidations" a
heavy-weight operation -- all synchronization is dropped, so
recovery takes work proportional to the number of objects in a
volume.
o This feature makes this apparently important optimization
much cheaper.
The argument against including this feature:
o The current protocol is semantically correct, so we should be
hesitant to add complexity for a performance gain unless there is
a proven big win.
o The performance win may be modest here: the current system can
handle short periods of idleness (stay connected and synchronized)
and long periods of idleness (pay the resync cost), so the
addition only matters for "medium" periods of idleness. Do these
occur often enough to justify the additional mechanism?
My sense is that there is a significant payoff for this feature, but
there is room for debate.