
Philosophy
----------

as little work as possible is done on request thread.

session destruction/migration is only done when we know that no
relevant application threads are running.

no session can be timed-out whilst a relevant application thread is in
the container.

invalidation is done by simply marking a session to be dealt with by
the housekeeper.

contention on the id:session map will be severe so we must keep it's
use to a minimum.



migrate - move persistant rep to 'shared' area
replicate - store persistant rep to shared 'disaster-recovery' area
whole session - replicate whole session to replication area
delta -  replicate delta to shared area

per-attribute - delta might be add/remove/change attribute

per-session - serialise session into byte[] and diff (optional
compression) - may get better performance doing this on per-attribute
basis...

a background thread should go around periodically merging diffs into
whole session sized storage...

would analogue of isModified flag be of any use - an isModified
attribute ? - On a per-attrbute basis, this is simply an agreement to
call set/remove when a change occurs...


Migration vs Replication

Migration

JVM -> Store -> JVM

Replication

JVM -> JVM & Store -> JVM & Store

we need 2 stores : 1 for migration and 1 for replication

do they need to cooperate ?

I replicate a session, then evict it. Do I simply throw away the
reference because I know it is already replicated ? or do I remove all
replicants and migrate it onto e.g. disc.

Replication:

Object Identity:
       All sessions
       single session
       single attribute
Safety:
	Immediate (attribute-based)
	Request
	RequestGroup
Packaging
	Delta
	Whole

Migration:
Object Identity:
	single session
	all sessions
Safety:
	some factorisable of RequestGroup
Packaging
	whole (could we use deltas?)


Combinations:

	ObjectIdentity | Safety | Packaging | ???


Ideas for Geronimo:

every MBean has an election policy default value is EveryNode -
i.e. Homogeneous other values are e.g. - OneNodeOnly, RoundRobin
etc,NumInstances(2)=any 2 nodes, etc... OneNodeOnly=NumInstances(1)...




See FileMigrationPolicy

request enters filter with id
id table is checked

if session present try for Rlock and req proceeds

if session not present take Wlock and try to activate - when
activated, release Wlock, take Rlock and proceed.

so when you look up a session, if it isn't in the main table check a
secondary one, for sessions being activated/located etc.

if we locate and proxy, do we cache the location - not at first.

when search in local store fails take a W lock and try to activate.

If another request comes in before activation is complete, it will
fall through to secondary table and block before continuing.

If a location enquiry comes in, we can search first the local map and
then the activation map before responding.

If we don't respond within a given time frame the other container will
assume the session no longer exists.


It would be useful for the session housekeeper to have a garbage
collection policy. The default policy would simply remove invalid
sessions. Alternative policies might e.g. move them to an analysis
area where another program might look at them and figure out why the
user lost interest in the web site etc...

It looks like, in order to support routing info a la modjk, we will
need another interceptor to add/remove it from the session id, as and
when required...




Where to put lock

only facade is looked up by id via impl - so no impl, no facade
facade API should ONLY be that of HttpSession
lock needs to exist before and after impl

so:

we need impl placeholders, with no data, just a lock, which can be
placed into map as and when needed, and turn into a session, cease to
be one as and when also. - nasty, but much more efficient. - how do we
do it ?

transit table shared by node2node and node2store migrations

work on
WADI, Jetty, Tomcat, Resin integrations
Geronimo/Jetty/Tomcat
Clustering/JCluster-replacement

High level clustering/election abstractions...


distributed gc election protocol broken


We can use the Wrapper class to spot when references to stateful
sessions are migrated to store or elsewhere in the cluster and somehow
coordinate the same movement for ejb tier state. Ultimately, this may
not be such a good idea as it will result in cache misses but...


If node clocks aren't synched FileMigrationPolicy gcs files at wrong times - ouch !


Replication.

Every node holds open a connection to every one of it's buddies,
each connection has a logical queue
updates are put onto queues
queue may be sync/async
queue may know how to ellyde deltas/complete-sessions
session should know how to write itself into an internal byte[]
connection will use NIO to copy this byte[] across connection
same byte[] should be shared by all queues

if a session is migrated (from storage of another node) to a node it
will arrive as such a byte[], which can be immediately replicated to
necessary buddies by being put on their queues, etc.



A common pattern (IOC)

interface LifeCycle {

public void start();
public void stop();
public boolean isRunning();
}

class Bean implements LifeCycle {
...

public void setFoo(Type foo){...}
public void setBar(Type foo){...}
public void setBaz(Type foo){...}

...
}

public Type method() {

Bean b=new Bean();
b.setFoo(x);
b.setBar(y);
b.setBaz(z);
b.start();
...
b.stop();
}

I would like:

an xdoclet tag to identify attributes that can only be set when !isRunning();
an aspect which will check an assertion on any setter identified in this way.
the same thing with attributes in 1.5

----------------------------------------

replication.

in case of the loss (planned or catastrophic) of a node, all sessions
which it hosts must be rehosted immediately or possibly initially
reconstituted from backup and passivated into shared store...

this revisits the migrate to disc or vm issue?

----------------------------------------

location

the result of a location query should return whether the host node is
bleeding sessions, if so the session should be migrated off it.


----------------------------------------

PROBLEM.

What happens if a request thread arrives whilst we are
migrating/(passivating) a session?

It waits to acquire an app/read-lock
We finish and it acquires lock
It continues into the container and asks for the session.
It is now too late to proxy it elsewhere, so we will have to migrate the session in underneath it.
NO GOOD !

before we release the container/write-lock, we need to somehow
interrupt all the waiting threads and force them to try to look up the
session again. They will find it is not there and have to consider
alternatives such as redirection, proxying and migration/activation.

----------------------------------------

shutting down

We could evict all sessions to shared store, whence to be loaded by other nodes.

We could divide sessions equally across cluster and migrate
concurrently to all nodes.

Bear in mind how this situation will be affected if every
node/basket/session has a pair of buddies...

----------------------------------------

Refactoring

The location query should really return a client side proxy for the
session-manager/bucket containing the session in question. That it is
still contained should be ascertained during the resulting two-way
conversation.

So maybe the return packet should just contain a serialised proxy ?,
or simply a string from which we can construct said proxy on the other
side ?

advantages to string - no serialversionuid, smaller...

----------------------------------------

Servlets & Filters may be marked as 'stateless' - i.e. no session
interaction.

Non-container code running in a stateless component may not read/write
the session. Attempts to do so will be met with an Exception. How do
we handle container code ? setLastAccessedTime will/won't be called ?
If called - does nothing.

With this optimisation in place, we can avoid locking the session
table (hopefully) and certainly any unecessary replication or
contention when serving static resources from e.g. the DefaultServlet
etc.

----------------------------------------

inbound/outbound lock tables....

refactoring local/activating lock tables....

should be replaced with local/inbound/outbound.

whilst sessions are being migrated into a container (from store or
another node) they take an inbound lock, whilst bein migrated out of a
container (to store or another container) they take an outbound lock.

I think we probably need the same arrangement in stores.

Try (but maybe too slow...) :

requests should :

take read lock on outbound table for session id - now no-one can move their session off node
take read lock on local table for session id - now no-one can garbage collect their session
check if session is local
if so is it outbound (wait for outbound lock, then check local again - maybe migration failed)
if not take inbound readlock
is session inbound wait
else release inbound readlock, take inbound writelock and insert locked inbound lock
try to load session
release all locks

maybe we don't need the outbound table - the w lock in the session
would be locked - we wait for it and then try local map again - phew !

----------------------------------------

do we really need to load a session back off disc and activate all
it's attributes before destroying it?

the attributes were passivated as we saved it - do they need to be
activated when they are timed out ?


----------------------------------------

redirect/cookie/Jetty problem.

If I get hold of the session cookie, rewrite its routing info and
redirect back to the lb, then browser receives:

Cookie: JSESSIONID=C82EA2883D12C57D1B1F47BE85E8FF71.web0; JSESSIONID=C82EA2883D12C57D1B1F47BE85E8FF71.web1

see headers.txt, instead of just one cookie - it looks like Jetty adds
the web1 cookie after I have added the web0 one - why ?

Then the browser, firebird, figures out the path for each of them differently !

the first gets /wadi/jsp, the second /wadi... ?

when I logged their paths in Jetty, both were 'null'

----------------------------------------=

eviction lock need not take any locks on first pass
migration could remove session from table then reinsert on fail

Commands could be nested into two-sided conversations.
A whole conversation could be sent from one node then progressively unpacked and as passed backwards and forwards
each command has a commit and rollback method.

----------------------------------------=


two types of affinity ?

(1) memory-based

lb actually remembers where last request for a given session was
processed and sends subsequent requests for same session to same
place.

with this type of lb we cannot redirect a request to another node via
its routing info - see (2), but we could proxy it. However this should
only be done in emergencies as the lb will still think that the
session was available on the node that it sent the request to and will
continue to route relevent requests since we it is awkward to relocate
requests with this lb we shouldhere. Therefore we should choose
policies that will relocate the session instead, knowing that the lb
will remember where it last saw it.

NB - memory is state and a tier of these lbs will need to replicate
such state to each other.

(2) routing-based

lb holds no memory/state, but this is encoded as routing info on the
end if the session cookie/param. This is advantageous since no state
needs to be replicated across lb tier, but adds extra complexity in
web tier, since if a session relocates its session cookie must be
adjusted.

If we understand how the routing info is encoded then we can use
request relocation (cheaper than session relocation). This means that
we can actually relocate sessions to where we want them, rather than
where the lb expects them, then relocate initial requests to
them. Subsequent requests will thenceforth be delivered to correct new
location.


node shutdown...

we should avoid writing to long term store if we can, since it is a
single point of contention and failure.

under a routing-based lb, a node should share out its active sessions
between remaining nodes up to their capacity, then write remaining
sessions into long term storage. The lb will have no idea where to
deliver requests for these sessions, since the node encoded in the
routing info is inactive, so it will drop them anywhere on the
cluster. The node receiving the request will relocate it to the
sessions new owner. worst case scenario cost for a sbsequent incoming
request will be: either a single request relocation, or a single
session disc-node migration.

under a memory-based lb a node has a choice :-(

a) do as for a routing-based lb.

worst case scenario cost will be: either one node/store-node session relocation

or migrate all sessions to store where WCSC becomes one store-node session relocation

so if we want to avoid use of store it looks as if both lb strategies
should share their sessions out with remaining nodes. At least some
requests will fall on correct node and if they fall on wrong node cost
will be same or less than load from store.

There is another way an lb can implement stickiness - by intrusive
cookie that records where last successful request went, this can be
considered to be the same as routing-based affinity except that the
routing info is encoded in its own cookie - WADI should consider how
to do this.

in summary:

- node is shutting down
- takes exclusive lock on session map
- multicasts to all other nodes enquiring about location and capacity
- share out active sessions between answering nodes up to their
capacity (all nodes should answer as this will be quicker than waiting
for a timeout - we have to wait anyway if we don't know how many nodes
there are)
- release lock or shutdown whilst still holding it - browsers will
reissue unanswered requests.

issues.

node A shuts down passing node B sessions that fill it to capacity
node A puts remaining sessions in store
node B receives a request that requires that it load one of As sessions from store
but it has no capacity...

it can temporarily override max-size
it can forcibly evict another session to make space

it can look for another that has space and proxy the request to it -
but will end up being a long-term go-between in this conversation -
not a good idea.

----------------------------------------

does identity need to live in a session ?

does validity need to live in a session - is an invalid session simply
not one that doesn't exist?

----------------------------------------

now that facade is set in HttpSessionImpl.readObject ensure that it is
not being done unecessarily elsewhere.

----------------------------------------

we need a good joinpoint to hang session creation and destruction
related aspects from - how about a session commission/decommission
method pair.

----------------------------------------

wad-web.xml is written (currently) in Jetty-ese, because I anticipate
the internal APIs changing frequently for a while and Jetty-ese is
more flexible in this respectthan Digester. I understand that this
ties the Tomcat integration to Jetty code. Too bad :-) its probably
only a temporary situation.

----------------------------------------

The PassivationStrategy should probably live inside the EvictionPolicy
? Their naming convention should be rationalised.

-----------------------------------------

wadi-web.xml needs FAQ-ing

-----------------------------------------

What happens when Jetty or Tomcat are run via JMX :-)

-----------------------------------------

for Tomcat (maybe Jetty as well)


if get() comes in from above filter, we know it is request thread and maybe first time
filter will wrap request and put aspect around get so that it arrives with ref to req object in threadlocal
req obj will have field for current session

req objects session methods will be aspected to keep this up to date

so any session fetch/create/invalidate will update relevant request
obj.

on leaving through filter current session will be picked out of request wrapper and unlocked...



In Jetty, we may not need to wrap request as every get/create will have req passed as param...

what about invalidate ?, how do we ensure that req is kept up to date...
invalidation goes through manager which must notify relevant request wrappers ?

-----------------------------------------
