Stateful Session (On Clustering Part IV of VII)

Of course, high-volume computing didn’t stop at real-time updates. Let’s continue!

Stateful Session
----------------
Applications like Y! Mail can use a stateless session approach to achieve scaling. However, in some applications it is highly desirable, from a programming perspective, to have stateful sessions. Information about the currently logged-in user is a good candidate to store in a session. Some sites might enable GUI hints that allow editing of some attributes. In other cases, stateful sessions also make it easy to design web applications that involve multiple steps. A questionnaire might span multiple pages; with a stateful session, the previous pages' answers can be retained in the session. Some newer, highly interactive sites that provide a full desktop-like experience use stateful sessions heavily to store the current active view, open documents, table sorting order, etc. A session does not tend to survive forever. It is often discarded after a timeout or when the user logs off. Scaling such applications requires different strategies.
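To make the questionnaire case concrete, here is a minimal sketch of an in-memory session store with an idle timeout. The names `SessionStore` and `answer_page` are made up for illustration, and a real application server would add locking, background cleanup, and replication on top of this.

```python
import time


class SessionStore:
    """Hypothetical in-memory session store with an idle timeout."""

    def __init__(self, timeout_seconds=1800, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock          # injectable so the timeout can be tested
        self._sessions = {}          # session_id -> (last_access, data dict)

    def get(self, session_id):
        """Return the session's data, starting fresh if missing or expired."""
        now = self._clock()
        entry = self._sessions.get(session_id)
        if entry is None or now - entry[0] > self._timeout:
            data = {}                # expired or unknown: discard old state
        else:
            data = entry[1]
        self._sessions[session_id] = (now, data)  # refresh last-access time
        return data

    def invalidate(self, session_id):
        """Drop the session explicitly, e.g. when the user logs off."""
        self._sessions.pop(session_id, None)


def answer_page(store, session_id, page, answers):
    """Record one questionnaire page's answers and return all answers so far."""
    session = store.get(session_id)
    session.setdefault("answers", {})[page] = answers
    return session["answers"]
```

Each page submission calls `answer_page` with the same session ID, so the final page sees every earlier answer without touching the data layer until the questionnaire is complete.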

(Side note: many developer sites use a shopping cart as an example of a stateful session. However, it is probably not the right approach. Users might log out for various reasons, but it is often desirable for the cart to still be there when the user logs back in.)

An application server or web server framework is often used for session management. Scalability is often built into the application server or framework itself. For web applications, an HTTP load balancer (dedicated hardware) is often used to aid the task. An HTTP load balancer understands the HTTP session ID, or a rewritten URL, and always routes the same client to the same server (unless that server has failed), so that the session can live on one physical server. Session failover has been a major selling point of application servers.
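The routing rule described above can be sketched in a few lines. This is only an illustration of session affinity with rerouting on failure; `StickyBalancer` and its methods are hypothetical names, and real load balancers typically work from cookies or URL rewriting rather than an explicit table.

```python
import hashlib


class StickyBalancer:
    """Hypothetical sketch: pin each session ID to one server, reroute on failure."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._assignments = {}       # session_id -> server currently holding it

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def route(self, session_id):
        """Return the server for this session, keeping it sticky while healthy."""
        server = self._assignments.get(session_id)
        if server in self._healthy:
            return server            # same client, same server
        candidates = [s for s in self._servers if s in self._healthy]
        if not candidates:
            raise RuntimeError("no healthy servers available")
        # Deterministic choice among the surviving servers.
        digest = hashlib.sha256(session_id.encode("utf-8")).digest()
        server = candidates[int.from_bytes(digest[:8], "big") % len(candidates)]
        self._assignments[session_id] = server
        return server
```

Note that rerouting only moves the client. The session data itself is gone with the failed server unless the application server replicates it, which is exactly the failover feature mentioned above.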

It is important to note that the functionality of a stateful session can be simulated by moving all the transient state to the browser (probably as hidden fields in an HTML form), by always storing it back to the data layer, or both. In a way, stateful sessions can be viewed as a "division" strategy. Factoring out data that is transient and short-lived, and keeping it in memory, reduces the hits to the data layer and increases performance. The division also enables the use of an HTTP load balancer.
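As a rough sketch of the hidden-field alternative, the transient state can be serialized into a single string that rides along in the form. The HMAC signature here is my own addition, not something the approach above requires; it is there because anything sent to the browser can be edited by the user. `SECRET_KEY`, `encode_state`, and `decode_state` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; keep it server-side


def encode_state(state):
    """Serialize state into a signed string suitable for a hidden form field."""
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw)
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_state(token):
    """Verify the signature and return the state; raise ValueError if altered."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("ascii")):
        raise ValueError("state was tampered with")
    return json.loads(base64.urlsafe_b64decode(payload))
```

The server would write the result of `encode_state` into something like `<input type="hidden" name="state" value="...">` and call `decode_state` on the next request. No server holds anything between requests, at the cost of shipping the state back and forth every time.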

1 response
Stateful session state management in large clustered environments can be quite challenging.
Increasingly, large portals rely on session state to manage all sorts of information: your profile, your access rights, intelligence from past purchases and navigation, besides the usual stuff like the shopping cart.
What I have seen is that the session state slowly but surely gets obese over time. To make things worse, application developers build their domain model (state that becomes part of the session) without much consideration for clustering. For instance, HTTP session management in J2EE is designed so that apps manage session state as many entries in a map, so only the entries that get modified are serialized across the cluster. But, frequently, the whole graph is stored as one big object.
Then, you want to make your sessions non-expiring, or else the recovery and reconstruction from a DB can make the presentation layer quite frustrating. So, now, how do you design a clustered app with thousands of concurrent user sessions (and millions of registered users) where each session object's size can be measured in MBs?
Load up the VM, consume too much heap, and you know how the notorious full GC cycle rewards you.