Tuesday, February 24, 2009

DBFS Storage Engine

The DBFS Storage Engine module provides access to the underlying storage. DBFS clients are unaware of the storage layout and the data orientation; the Storage Engine gives them consistent relational access to the underlying data.

Architecture of DBFS Storage Engine
Uniform Access Provider
The Uniform Access Provider (UAP) component exposes the underlying data through a relational data access interface. It abstracts the complexities of data layout and data orientation and provides a uniform interface for any supported type of data. The UAP provides Row Store and Column Store as its two default data orientation components, along with data layout implementations such as CSV, Fixed Width and In Memory. It also supports a pluggable data layout factory that can be used to plug in other data layout implementations.
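
The pluggable data layout factory could look roughly like the sketch below. This is only an illustration, not the DBFS API: the names DataLayout, CsvLayout, FixedWidthLayout and LayoutFactory are assumptions made for this example. Each layout turns raw text into the same tuple-based rows, so a caller sees the same relational shape regardless of how the data is stored.

```python
import csv
import io


class DataLayout:
    """Hypothetical interface: a layout turns raw text into row tuples."""

    def rows(self, raw):
        raise NotImplementedError


class CsvLayout(DataLayout):
    def rows(self, raw):
        for record in csv.reader(io.StringIO(raw)):
            yield tuple(record)


class FixedWidthLayout(DataLayout):
    def __init__(self, widths):
        self.widths = widths  # column widths in characters

    def rows(self, raw):
        for line in raw.splitlines():
            fields, pos = [], 0
            for width in self.widths:
                fields.append(line[pos:pos + width].strip())
                pos += width
            yield tuple(fields)


class LayoutFactory:
    """Hypothetical registry that lets new layouts be plugged in by name."""

    def __init__(self):
        self._makers = {}

    def register(self, name, maker):
        self._makers[name] = maker

    def create(self, name, **options):
        return self._makers[name](**options)


factory = LayoutFactory()
factory.register("csv", CsvLayout)
factory.register("fixed", FixedWidthLayout)

# The same logical rows, read from two different physical layouts.
csv_rows = list(factory.create("csv").rows("1,alice\n2,bob\n"))
fixed_rows = list(factory.create("fixed", widths=[3, 5]).rows("1  alice\n2  bob  \n"))
```

Registering a further layout would only require another call to register; the code that consumes rows would not change.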

Index Manager
The Index Manager provides index support for the data, which lets DBFS clients query the data much faster. The Index Manager implements index data structures such as B+ trees, bitmap indexes and hash indexes. It also manages index-related activities such as index creation and update.
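
To make two of these index types concrete, here is a minimal sketch of a hash index and a bitmap index over an in-memory list of rows. The sample rows and the function names are assumptions for illustration only, and the bitmap is simply a Python integer used as a bit set, which is not necessarily how DBFS stores it.

```python
from collections import defaultdict

# Sample rows (id, name, country); made up for this example.
rows = [
    ("1", "alice", "NL"),
    ("2", "bob", "US"),
    ("3", "carol", "NL"),
]


def build_hash_index(rows, column):
    """Map each column value to the list of row positions that hold it."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return index


def build_bitmap_index(rows, column):
    """Map each column value to an integer whose set bits mark matching rows."""
    bitmaps = defaultdict(int)
    for position, row in enumerate(rows):
        bitmaps[row[column]] |= 1 << position
    return bitmaps


def positions(bitmap):
    """Expand a bitmap back into the row positions it marks."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


country_hash = build_hash_index(rows, column=2)
country_bitmap = build_bitmap_index(rows, column=2)

# Equality lookup through the hash index, without scanning every row.
nl_rows = [rows[p] for p in country_hash["NL"]]

# Bitmaps combine cheaply: OR answers "country is NL or US".
nl_or_us = positions(country_bitmap["NL"] | country_bitmap["US"])
```

The hash index suits exact-match lookups, while the bitmap index makes it cheap to combine several conditions with bitwise operations.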

Statistics Manager

The Statistics Manager is a passive component in the Storage Engine module. The UAP and the Index Manager components publish data access and modification statistics to the Statistics Manager. The Statistics Manager uses this information to carry out housekeeping activities and to implement data access optimization strategies.
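
The publish relationship described above can be sketched as follows. The StatisticsManager class, its method names and the event names are assumptions for this example; the post does not specify the actual interface. The point is that the component only records what others report and answers questions about it.

```python
from collections import Counter


class StatisticsManager:
    """Hypothetical passive collector: other components push events to it."""

    def __init__(self):
        self._counts = Counter()

    def publish(self, source, event, amount=1):
        # Called by components such as the UAP or the Index Manager.
        self._counts[(source, event)] += amount

    def count(self, source, event):
        return self._counts[(source, event)]

    def hottest(self, event, n=1):
        """Return the n sources with the most occurrences of an event."""
        matching = Counter(
            {src: c for (src, ev), c in self._counts.items() if ev == event}
        )
        return [src for src, _ in matching.most_common(n)]


stats = StatisticsManager()
for _ in range(3):
    stats.publish("orders", "read")
stats.publish("customers", "read")
stats.publish("orders", "write")

most_read = stats.hottest("read")
```

An optimization strategy could use a query like hottest to decide, for example, which data is worth indexing or keeping in the cache.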

Cache Manager

All DBFS read and write operations are routed through the Cache Manager component. When the Cache Manager finds the requested data in the cache, it returns the values from the cache, which is faster than going to disk. The Cache Manager provides cache replacement algorithms such as Least Recently Used (LRU) and Least Frequently Used (LFU). Data is cached in a memory-sensitive manner to limit the memory and other system resources used by the data cache.
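
As an illustration of the LRU policy, here is a minimal read-through cache sketch. The LruCache class and the dictionary standing in for the disk are assumptions for this example and are not taken from the DBFS code. On a miss the value is loaded from the backing store, and once the cache is over capacity the least recently used entry is evicted.

```python
from collections import OrderedDict


class LruCache:
    """Hypothetical read-through cache with LRU replacement."""

    def __init__(self, capacity, load):
        self.capacity = capacity
        self._load = load              # called on a miss to fetch the value
        self._entries = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._entries:
            self.hits += 1
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        self.misses += 1
        value = self._load(key)
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return value


disk = {"a": 1, "b": 2, "c": 3}  # stands in for the slower disk storage
cache = LruCache(capacity=2, load=disk.__getitem__)
reads = [cache.get(key) for key in ["a", "b", "a", "c", "b"]]
```

In this run only the second read of "a" is a hit: loading "c" evicts "b", so the final read of "b" has to go back to the backing store.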

I/O Manager

The I/O Manager provides access to the underlying disk storage. It is responsible for carrying out data read, write and update operations in an optimized manner. The I/O Manager provides both asynchronous non-blocking I/O operations and synchronous blocking I/O operations.
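
The difference between the two modes can be sketched as below. The function names are assumptions for this example. Note also that the asynchronous variant here offloads an ordinary blocking read to a thread pool so that the caller is not blocked; a real storage engine might use operating system level asynchronous I/O instead.

```python
import asyncio
import os
import tempfile


def read_block(path, offset, size):
    """Synchronous, blocking read: the caller waits for the data."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


async def read_block_async(path, offset, size):
    """Non-blocking for the caller: the read runs in a worker thread."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, read_block, path, offset, size)


async def read_many(path, requests):
    """Issue several reads at once and wait for all of them."""
    return await asyncio.gather(
        *(read_block_async(path, offset, size) for offset, size in requests)
    )


# Create a small temporary file to read from.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    path = tmp.name

sync_result = read_block(path, 2, 3)
async_results = asyncio.run(read_many(path, [(0, 2), (5, 5)]))
os.remove(path)
```

The synchronous call is simpler when the caller needs the data immediately, while the asynchronous form lets several reads be in flight at the same time.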
