Read concentration can be solved with a data caching system based on Redis and HBase, while write concentration can be addressed through a Big Table system based solely on HBase. The caching system has a hybrid structure in which NoSQL and an RDBMS are combined. As the Big Table system has not been implemented yet, this article will not discuss it further.
① Data Characteristics
The system data can be categorized into common data, individualized data, and personalized history data. Common data is essentially master data; individualized data is a per-user version of the master data, such as a user's own favorites; and personalized history data, such as purchases and views, is time-series data.
Read concentration usually occurs on common and individualized data, whereas write concentration mostly involves personal history data. Common data is read far more often than it is written and remains relatively small, since administrators generate it and users read it. Individualized and personal history data, on the other hand, are generated by users, so both reads and writes occur frequently and the data grows quite large (Tab. 1).
Redis was chosen as the NoSQL store for common data, where reads dominate writes and the data size is small. Redis is a memory-based key-value store, and its rapid replication was particularly attractive.
In a test, 10,000 writes were issued to the Master while a Redis Slave served 300,000 reads per second; replication to all Slaves was checked every 1 ms, and a copy took 1.2 ms on average.
9,988 of the 10,000 writes were synced on the first check, and the remaining 12 were also found copied within the next 1 ms. Some irregularity was observed, but it stayed within 5 ms; the single outlier of 10 ms on the far left was caused by an access request (Fig. 2).
Although it is more common to build multiple Redis instances into a ring based on consistent hashing, this system took a different form: one Master connected to n Slaves (Fig. 3). The goal here is to distribute read concentration horizontally by replicating the common data n times, rather than to build a large-scale cache.
HBase was chosen as the NoSQL store for individualized data, which is generated and searched by users. Unlike Redis, HBase offers permanent storage and a table-like, column-oriented data structure. Its Block Cache and sharding were especially attractive.
Block Cache is a function that keeps any block that has been read once in a memory area of a configured size, reducing disk I/O. With enough memory secured, it can perform like an in-memory DB.
A test measured how long it took to run a light SQL (result set 0.8 KB) and a heavy SQL (result set 300 KB), cached on Redis and HBase, 100 times each, and compared the results with ORACLE. It shows that as the SQL gets heavier, reading from Redis or HBase becomes faster than reading from ORACLE.
In addition, compared with Redis, the fastest among memory DBs, HBase is about six times slower for light SQL but almost identical for heavy SQL.
HBase shards data based on the row key. Sharding means distributing data horizontally across a distributed environment. As mentioned earlier, Redis can also shard and process large-scale data as a distributed cache using consistent hashing. However, both approaches have their limits.
First, a Redis cluster does not determine the location of data itself; clients must decide where each piece of data lives in the distributed environment. HBase handles this on the cluster side: once a table is created with a distribution policy, all clients need to do is connect, read, and write.
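Since clients must locate data themselves in this kind of Redis setup, a minimal sketch of client-side consistent hashing follows. The node names and virtual-node count are illustrative assumptions, not part of the original system:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal client-side consistent hash ring, as a client of a
    pre-cluster Redis setup might implement it to locate data."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):
                h = self._hash(f"{node}#{i}")
                bisect.insort(self._ring, (h, node))

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        """Walk clockwise from the key's hash to the first node."""
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, ""))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]

ring = ConsistentHashRing(["redis-1", "redis-2", "redis-3"])
node = ring.node_for("user:1234:favorites")
print(node)  # one of the three nodes; stable for the same key
```

Virtual nodes smooth the distribution, and removing one server remaps only the keys that hashed to it, which is why the ring form suits large-scale caching.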
Second, HBase is a CP-type NoSQL and therefore guarantees data consistency, which helps users see critical data such as purchase history consistently. A Redis cluster, on the other hand, has a structure similar to Cassandra, an AP-type NoSQL, and AP-type NoSQL does not guarantee this consistency.
④ Inquiry Result Caching
Once the inquiry result is cached, SQL-based application programs no longer require modification. As Tab. 2 and 3 show, the service can also be provided at a consistent speed regardless of how heavy the SQL is.
There is a downside, however: because a cache entry is created per SQL search condition, large-scale storage is required. Here NoSQL shows its advantage, being cheaper and easier to scale out. Also, ORACLE-to-cache sync is not always done at the row level.
This is because, when tables are joined, it is hard to know which rows were involved without the bind variable input. For this reason, tables A and D in Fig. 4 can be synced at the row level, but tables B and C only at the table level.
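Since entries are keyed per SQL search condition, one way to form such a key is to hash the statement text together with its bind values. This is a hypothetical sketch; the key prefix and normalization rules are assumptions:

```python
import hashlib

def cache_key(sql: str, binds: tuple) -> str:
    """Build a cache key from the SQL text plus its bind-variable
    values, so each distinct search condition gets its own entry."""
    normalized = " ".join(sql.split()).upper()  # collapse whitespace
    payload = normalized + "|" + repr(binds)
    return "sqlcache:" + hashlib.sha1(payload.encode()).hexdigest()

k1 = cache_key("SELECT * FROM favorites WHERE user_id = :1", ("U100",))
k2 = cache_key("SELECT * FROM favorites WHERE user_id = :1", ("U200",))
assert k1 != k2  # different binds -> different cache entries
```

Because the bind values are part of the key, the same statement run for different users yields separate entries, which is exactly why the cache can grow large.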
When synced at the table level, the cache lifetime gets shorter, since the cache is renewed even by rows irrelevant to the SQL result. Still, table-level sync is considered effective: in a system where reads vastly outnumber writes, a shorter cache lifetime does not substantially lower cache efficiency (Fig. 5).
Fig. 5 compares a virtual cache renewed once an hour with one renewed five times an hour, based on observations of a node where Redis handled 6,000 reads per second. Because the scale on the Y axis is much larger than on the X axis, the cache efficiency shows little difference.
Row-level sync control makes maintenance trickier, and generalization becomes difficult as a result: many people change data on a running system, including manual changes, and a number of staff members and organizations manage those people (Fig. 14). Even so, the main data (here, user IDs and broadcast items) was set to be controlled at the row level.
⑤ Cache Renewed by Users
A decision was made to have user transactions, rather than the administrator, renew the cache. Administrator-driven renewal may work when caching small datasets, but it does not do much when the dataset is large.
The existing system operated a file-based cache for certain common data, with the dataset built on a separate server over a long period of time. This structure makes it difficult to sync the database and the cache in real time.
To sync data immediately, the structure was designed so that only the specific changes, rather than the entire dataset, are moved. The changes are easier to manage since they are smaller, and the time required to transfer them to the cache system is also shorter.
When a user transaction accesses the cache system, internal logic checks for changes and renews the related cache. Until the next change is reported, this cache is shared among users.
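The renewal flow above can be sketched as follows. Each cache entry remembers the change version it was built from and is rebuilt lazily when a newer change has been recorded; the in-memory dictionaries stand in for the real stores, and all names are illustrative:

```python
# Hypothetical sketch of user-transaction-driven cache renewal.
change_version = {"favorites:U100": 0}  # bumped by write transactions
cache = {}                              # key -> (version, value)

def record_change(key):
    """Called inside a user transaction that changes the data."""
    change_version[key] = change_version.get(key, 0) + 1

def read_through(key, build):
    """Return cached value, rebuilding it if a newer change exists."""
    latest = change_version.get(key, 0)
    entry = cache.get(key)
    if entry is None or entry[0] < latest:  # missing or stale
        value = build()                     # e.g. query the RDB
        cache[key] = (latest, value)
    return cache[key][1]

v1 = read_through("favorites:U100", lambda: ["itemA"])
record_change("favorites:U100")  # a user transaction writes a change
v2 = read_through("favorites:U100", lambda: ["itemA", "itemB"])
print(v2)  # rebuilt after the change: ['itemA', 'itemB']
```

The key point is that no one pushes the new value into the cache; the next reader rebuilds it, and everyone after that shares the result until the next change.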
In the Redis cluster, changed information is written to the Master and automatically synced to the Slaves. Users access a particular Slave through load-balancing logic and read or generate cache while referring to the change data (Fig. 6).
In the HBase cluster, changes arise from user update, insert, and delete operations and are written to HBase within the transaction; this is the change data for individualized data. Change data for common data is read from Redis. Both kinds of change data are consulted when cache is generated or read (Fig. 7).
⑥ Sync Queue Batch
An asynchronous method was chosen to transfer the changes made in the database to the cache system. The job is done through a queue table and a batch program that reads the queue table and writes to the cache system.
This is because the changes are made by many people using different programming languages, and providing a well-suited environment for each of these circumstances so that all of them could write to NoSQL directly would be difficult. If the changes can simply be inserted into a queue table, applying them becomes much easier.
Although this is an asynchronous method, up to 20,000 changes per hour can be handled and a change reaches the cache within a few seconds, since there are not that many objects to be changed (Tab. 4).
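The queue-and-batch idea can be sketched like this. Writers only append rows to a queue table, and the batch drains unprocessed rows into the cache store; the row layout and names are assumptions for illustration:

```python
# Illustrative sketch of the sync queue batch.
queue_table = [
    {"id": 1, "key": "item:42", "value": "updated", "done": False},
    {"id": 2, "key": "item:43", "value": "new",     "done": False},
]
cache_store = {}  # stands in for the NoSQL cache system

def run_sync_batch():
    """Read unprocessed queue rows, write them to the cache
    system, then mark them processed. Returns rows applied."""
    applied = 0
    for row in queue_table:
        if not row["done"]:
            cache_store[row["key"]] = row["value"]  # write to NoSQL
            row["done"] = True
            applied += 1
    return applied

print(run_sync_batch())        # 2 rows applied on the first run
print(cache_store["item:42"])  # updated
```

Because every writer, whatever its language, only needs to INSERT into one table, the NoSQL-specific client code lives in a single batch program.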
⑦ Common Module
The common module has three parts: HBase access for NoSQL, Redis access for read and write, and the interface part. Since the existing application programs were built with Pro*C on a TP monitor (the middleware TMAX), the common module had to be provided as a C/C++ based API.
The basic function of the common module is, when a user runs SQL, to check whether the cached content is up to date and return it if so, or to read data from ORACLE, generate new content, and return the result if not.
The load-balancing function distributes load across the Redis Slaves and Thrift servers. Because it communicates with the cluster internally for error detection, it does not use an L4 hardware switch.
When the runtime blocking function detects an error on one of the n nodes while distributing load across the cluster, it blocks that node and bypasses it through the other nodes. In the case of HBase, errors from the HBase Region Server and from Thrift must be distinguished and handled differently, because the latter is a connection error while the former is a data error.
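The combination of load balancing and runtime blocking can be sketched as a round-robin pool that skips blocked nodes. The real module is a C/C++ API; this Python sketch and its node names are assumptions:

```python
import itertools

class SlavePool:
    """Round-robin load balancer over cache nodes that blocks a
    failed node and bypasses it until it is unblocked."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.blocked = set()
        self._rr = itertools.cycle(self.nodes)

    def block(self, node):
        self.blocked.add(node)      # error detected on this node

    def unblock(self, node):
        self.blocked.discard(node)  # node recovered

    def next_node(self):
        """Return the next usable node, skipping blocked ones."""
        for _ in range(len(self.nodes)):
            node = next(self._rr)
            if node not in self.blocked:
                return node
        raise RuntimeError("all nodes blocked; fall back to ORACLE")

pool = SlavePool(["slave-1", "slave-2", "slave-3"])
pool.block("slave-2")  # runtime blocking kicks in
picked = [pool.next_node() for _ in range(4)]
print(picked)  # slave-2 never appears
```

Raising when every node is blocked corresponds to the full-blocking case, where the caller must fall back to ORACLE directly.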
The full blocking function stops Redis, HBase, or both from being used. This is necessary for maintenance and for dealing with errors affecting an entire NoSQL cluster.
There is also a function that checks consistency regularly. Since there is always a chance that the cache and the database fall out of sync, once a cache entry's read count reaches a set value, the consistency between the RDB and the cache data is examined automatically, so that broken sync is not left unnoticed for a long time.
If they are found inconsistent, the cache entry is blocked from further use and the issue is reported to the administrator. The check cycle and count can be set at two levels: set them short in the beginning and longer later on, and inconsistencies caused by development mistakes are caught early.
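The read-count-triggered check can be sketched as follows. The threshold, store names, and reporting mechanism are illustrative assumptions; the real check compares RDB rows against cached SQL results:

```python
# Sketch of the periodic consistency check: every Nth read of a
# cache entry, compare it with the RDB; on mismatch, block the
# entry and report it to the administrator.
CHECK_EVERY = 3  # examine after every N reads (assumed threshold)

cache = {"item:42": "v1"}
rdb = {"item:42": "v2"}  # database has moved on: sync is broken
read_count = {}
blocked = set()
reports = []

def read_cache(key):
    if key in blocked:
        return rdb[key]                     # bypass the broken cache
    read_count[key] = read_count.get(key, 0) + 1
    if read_count[key] % CHECK_EVERY == 0:  # time to verify
        if cache[key] != rdb[key]:
            blocked.add(key)                # stop using this entry
            reports.append(f"inconsistent: {key}")
            return rdb[key]
    return cache[key]

values = [read_cache("item:42") for _ in range(4)]
print(values)  # third read detects the mismatch and blocks the key
```

Lowering `CHECK_EVERY` early in operation and raising it later matches the two-level cycle setting described above.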
The HBase write failure function handles situations where RDB-cache consistency breaks during a partial or complete HBase failure. It records the failed writes in a queue-style table in the RDB and lets a batch resume the sync after the failure is fixed.
The last function to introduce is for long-running SQL. Since long-running SQL takes longer to generate a cache entry, this function makes the dirty (stale) cache searchable for response speed while the new entry is being generated.
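Serving the dirty cache while regenerating can be sketched with a background rebuild. The timings, names, and single-key store here are assumptions for illustration:

```python
import threading
import time

# Sketch: readers get the stale value immediately while one
# background thread regenerates the entry (the long-running SQL).
cache = {"heavy_sql": "stale-result"}
rebuilding = set()
lock = threading.Lock()

def rebuild(key):
    time.sleep(0.1)  # stands in for the long-running SQL
    with lock:
        cache[key] = "fresh-result"
        rebuilding.discard(key)

def read_allow_dirty(key):
    """Kick off a rebuild if none is running, then return the
    current (possibly dirty) value without waiting."""
    with lock:
        if key not in rebuilding:
            rebuilding.add(key)
            threading.Thread(target=rebuild, args=(key,)).start()
        return cache[key]

first = read_allow_dirty("heavy_sql")  # served at once, stale
time.sleep(0.3)                        # let the rebuild finish
second = cache["heavy_sql"]
print(first, second)  # stale-result fresh-result
```

A real version would trigger the rebuild only when the entry is known stale; the point is that no reader ever waits behind the heavy SQL.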
① Test Environment and Scenario
Among the seven services sharing one ORACLE database, 28 programs of the D system were converted to use the cache. The D service was provided from one RAC node.
The operational database server is a 2-node RAC with the following specifications:
- HP Superdome 32 core (3,910,767 tpmc)
- Memory 256GB
- Storage VNX5300
- Oracle 10.2.0.4
The Redis cluster uses the LG CNS SBP Appliance, consisting of eight servers with the following specifications. Two are Masters and the rest are Slaves.
- HP DL360p Gen8
- CPU 1P/8 Core 2.6 GHz
- Memory 64GB
- Redis 2.8.0
The HBase cluster uses the SBP Appliance, consisting of eight servers with the following specifications. Two of them are Hadoop namenodes and the remaining six are datanodes.
- HP DL380p Gen8
- CPU 2P/16 Core 2.6 GHz
- Memory 64GB
- Apache HDFS 1.1.2
- Apache HBase 0.94.5
- Thrift 0.8.0
The cache was blocked and only ORACLE was used starting at 23:29, right after the busiest hour, and the blocking was maintained for 26 minutes. Assuming comparable usage in the three minutes before and after the blocking, the server index (CPU usage rate) and database indexes (LRB/sec and executions/sec) were compared.
② Test Result
With the cache, the CPU usage rate dropped by 32%, LRB/sec by 56%, and executions/sec by 36% compared to running without it. This architecture becomes more effective with more users, so even better results are expected during busy hours.
Despite many initial concerns, the LG CNS SBP Appliance, consisting of HBase, Redis, and commodity Linux servers, was successfully applied to a first-grade B2C system and has run stably. For more than half a year since it opened on June 25th, 2014, up to this very moment of writing, it has operated without a single error.
This shows that, from the big cache system's point of view, an architecture built on OSS can cache big data. Since the structure caches SQL results, it requires few modifications to the existing application system, so this type of architecture can be applied to various systems for various purposes. Big cache systems, together with big table systems, can finally end the vicious circle of RDB expansion.
The fact that the client recognized the performance and stability of HBase as part of the big cache system was another gain from this project. HBase matters because it offers permanent storage, and it will soon be used as storage for various kinds of data.
Future tasks include strengthening statistics-based cache monitoring in the big cache system, and extending cache lifetime by analyzing the collected data and finding development errors.
Today, we had a look at new ways to utilize big data. I hope this article helped you get one step closer to big data.
Written by Euijin Lim, LG CNS
 Lars George, HBase: The Definitive Guide, O'Reilly, 2011, p. 216
 Lars George, HBase: The Definitive Guide, O'Reilly, 2011, p. 7
 The row key is the only index HBase makes available for search
 Daemyeong Kang, Memcached and Redis for Large-Scale Data Construction, Hanbit Media, 2012, pp. 23-26
 CP (Consistency-Partition tolerance) in the CAP theorem
 An AP-type NoSQL developed by Facebook and released through the Apache Software Foundation