Wednesday, March 18, 2009
SEDA Architecture
A Complext software is divided into a number of independent components - 'Stage' with request queues connecting these stages. Each Stage is free to configure and tune its own threadpool. Each stage has an event handler that takes the requests from the request queue and carries out the appropriate action and finally delivers the request to the next request queue in the software structure.
This model is seen to perform better than 'Thread per request' model as well as 'Bounded thread pool' model. This also provides a very good opportunity to implement self tuning resource management mechanism.
Reference:
http://www.eecs.harvard.edu/~mdw/papers/seda-sosp01.pdf
Thursday, March 12, 2009
Column Oriented Data Storage
In case of analytics and data warehousing applications, various analytical operations are carried out for a specific column and hence the other columns belonging to the same record are of little to no significance in these operations. These kind of applications can benefit from the column oriented data storage technique. The data belonging to the same column are stored one after another. One entry from each of the column store is read for creating a single record. This may sound expensive in most general cases but for various column oriented operations, this type of data storage may prove to be extremely performant.
But a simple storage layer optimization cannot yield better performance. This needs to be matched by the algorithms that are specifically designed to take advantage of the fact that data belonging to the same column are stored close to each other. If that is done, column store can provide high performance for analytical operations like that used in data warehouses.
Acknowledgement:
Thanks to article from http://www.databasecolumn.com
Tuesday, February 24, 2009
DBFS Storage Engine
The Uniform Access Provider (UAP) component is responsible for exposing the underlying data through a relational data access interface. The UAP abstracts the complexities of data layout and data orientation and provides a uniform interface for any type of supported data. The UAP provides Row Store and Column Store as the two default data orientation components. The UAP also provides various data layout implementations like CSV, Fixed Width and In Memory. The UAP supports a pluggable data layout factory that can be used to plug-in various other types of data layout implementations.
Index Manager
The Index Manager provides the index support for the data. This lets the DBFS clients to query the data in a much faster way. Index Manager implements various index data structures like B+ tree, Bitmap Index and Hashes. Index Manager also manages the various index related activities like Index creation and update.
Statistics Manager
The Statistics Manager is a passive component in the Storage Engine module. The UAP and the Index Manager components publish various data access and modification statistics to the Statistics Manager component. The Statistics Manager uses this statistical information to carry out various housekeeping activities as well as implement different data access optimization strategies.
Cache Manager
All the DBFS read/write operations are routed through the Cache Manager component. When the Cache Manager finds any match, the values are returned from the cache. This provides faster access compared to the disk access. Cache Manager provides various cache replacement algorithms like Least Recently Used (LRU) and Least Frequently used (LFU). The data is cached in a memory sensitive manner to conserve the amount of memory and other system resources used by the data cache.
I/O Manager
The I/O Manager provides access to the underlying disk storage. The I/O manager is responsible for carrying out the data read, write and update operations in an optimized manner. The I/O Manager provides asynchronous non-blocking I/O operation as well as synchronous blocking I/O operations.
DBFS Coordinator
The architecture of the DBFS Co-ord is show below:
Bootstrap Manager
Bootstrap Manager is responsible for initializing all the DBFS sub-systems. The inter process communication between various DBFS sub-systems requires each DBFS sub-system to run a listener on a TCP port that is registered with the DBFS Coord. The ports required by each DBFS sub-system is not fixed and therefore, each sub-system is free to choose their port during the bootstrap process and the sub-system is required to communicate the port address to the DBFS sub-system before the Bootstrap Manager finishes the sub-system initialization phase.
DBFS Health Monitor
DBFS Health Monitor is responsible for communicating with each of the DBFS sub-systems to monitor the health of the component and also manages the health of the DBFS system as a whole. DBFS Health Monitor has a heart beat monitor which listens to the DBFS sub-systems during their active lifetime and if there is any anomaly, Health Monitor initiates the failover operation and hands in the control to the FailOver Manager.
FailOver Manager
When the DBFS Health Monitor activates the FailOver Manager, the FailOver Manager calls the Bootstrap Manager to re-initialize the failed sub-system. Also, the FailOver Manager recreates the DBFS state before the component failed. FailOver Manager switches the control from the failed component to the newly activated component in a transparent manner which the DBFS clients are unaware of.
Unified Connection Manager
The Unified Connection Manager is the gateway or the entry point to the DBFS system. The Unified Connection Manager runs at a TCP port that is known to the DBFS clients and all the DBFS clients communicates with the DBFS System through this port. The UCon Manager accepts the incoming requests and creates a new session for each of the DBFS client connection and the DBFS clients can perform various file system related operations through the connection handle. The UCon Manager is responsible for providing a connection handle to the authenticated DBFS clients. It is essential that the UCon Manager runs at a very well known TCP port that the DBFS client is aware of. When the DBFS is shutdown, the UCon Manager terminates all the client connections and waits for the clients to exit if required and carries out the termination in a graceful manner.
DBFS Architectural Overview
DBFS is architected for high throughput, parallel, asynchronous data access. DBFS is a multi process, multi thread program. Each module is executed as a separate process and each request is handled in a separate thread which is obtained from the module specific thread pool. The Inter process communication is carried out through TCP sockets communication for reliable message delivery.
Architectural Overview
The design goals of DBFS are scalability, reliability, high throughput and high concurrency. In order to maximize the CPU utilization, the different DBFS sub-systems are run as separate process in an independent JVM instance.
DBFS provides a very high degree of scalability and concurrency by supporting asynchronous, non-blocking I/O operations. The number of I/O requests served is very high as there is no synchronous blocking I/O calls to the underlying storage system.
DBFS provides reliability by providing a foolproof mechanism for storage integrity and consistency. The data delivered by the DBFS is consistent across multiple request threads and are reliably persisted in the underlying storage. The health of the overall DBFS system is monitored and a mechanism for failover is provided for transparent switching of the DBFS subsystems.
Components of DBFS
DBFS consists of three main sub-systems – DBFS Storage Engine, Transaction Management System and Recovery Management System. Each of these sub-systems has its own address space which is shared by all the components of a sub-system. Therefore, each of these sub-systems is run as a separate process and the components in the sub-system runs as a Java™ Thread inside the corresponding VM. The Inter process communication between these sub-systems is carried out through TCP/IP socket communication.