Infinidat Neural Cache: Harnessing Deep Learning to Drive High Performance at Low Cost
When All-Flash Arrays (AFAs) first appeared, the conventional wisdom about them consisted of two data points: they were very fast, and they had a high cost for raw storage capacity. Drawing upon decades of enterprise storage experience, Infinidat's architects came up with a design that was faster than all-flash but actually had a much lower cost for raw storage capacity. And they did this with a system which is mostly composed of very cost-effective hard disk drives (HDDs). How did they do that? Well, there are a number of design innovations that contribute to this, but one of the key ones is how Infinidat manages read cache space to maximize read cache hit rates.
Most enterprise storage systems use a tiered design with a very small amount of expensive DRAM cache and a large amount of slower but much lower cost persistent storage. As data is requested and gets pulled into cache, systems typically pre-fetch additional data that they believe is likely to be referenced next (thereby lowering the latency to access that data when it is needed). Accurate pre-fetching increases a system's read cache hit rate, resulting in better overall performance, whereas inaccurate pre-fetching slows a system down: cache space is spent on data that is never used, and processing has to wait while the data that was actually requested is brought into cache from the much slower persistent storage. Conventional algorithms like "least recently used" or "first in first out" are used to determine which data gets moved out of cache to make room for the data that is being pre-fetched. All in all, how a system manages read cache space in particular has a pretty big impact on its overall perceived performance, particularly when that system operates at scale.
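To make the effect of pre-fetch accuracy concrete, here is a minimal Python sketch of a conventional "least recently used" read cache with simple sequential read-ahead. It is a toy model, not any vendor's implementation: the class name, the `prefetch_depth` parameter, and the two synthetic access traces are all assumptions made for illustration.

```python
from collections import OrderedDict


class LRUReadCache:
    """Toy LRU read cache with optional sequential read-ahead (illustrative only)."""

    def __init__(self, capacity, prefetch_depth=0):
        self.capacity = capacity
        self.prefetch_depth = prefetch_depth
        self.blocks = OrderedDict()  # block id -> None, least recently used first
        self.hits = 0
        self.misses = 0

    def _insert(self, block):
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return
        self.blocks[block] = None
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block

    def read(self, block):
        if block in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block)
        else:
            self.misses += 1
            self._insert(block)
        # Read-ahead: guess that the next few block ids will be requested soon.
        for nxt in range(block + 1, block + 1 + self.prefetch_depth):
            self._insert(nxt)

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def run(trace, capacity, prefetch_depth):
    cache = LRUReadCache(capacity, prefetch_depth)
    for block in trace:
        cache.read(block)
    return cache.hit_rate()


# A sequential scan: read-ahead guesses correctly.
sequential = list(range(100))
seq_plain = run(sequential, capacity=8, prefetch_depth=0)
seq_ahead = run(sequential, capacity=8, prefetch_depth=4)

# A small strided working set read ten times: read-ahead guesses wrongly.
strided = [b for _ in range(10) for b in range(0, 80, 10)]
strided_plain = run(strided, capacity=8, prefetch_depth=0)
strided_ahead = run(strided, capacity=8, prefetch_depth=4)
```

In this toy model the sequential scan goes from a 0% hit rate to 99% once read-ahead is enabled, while the strided workload drops from 90% to 0% because the wrongly pre-fetched blocks evict the data that is actually reused. That is the trade-off the paragraph above describes.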
Infinidat developed a unique approach called "neural cache" that is all about using intelligent data placement in a tiered storage design to cost-effectively optimize read cache hit rates. Neural cache uses metadata tagging to track not just access frequencies but many other metrics that conventional systems do not track – block sizes, read vs write frequencies, what data gets used together, associated application I/O profiles, etc. – and uses that data to develop block-specific "activity vectors." Activity vectors change over time as block metrics change so they always provide the best information to determine which data is most likely to be referenced together (and hence should be pre-fetched) and where data should be written (when it's moved out of cache). Which data gets pre-fetched together is determined by real-time analysis of activity vectors which indicate which data has a high probability of being used together. Neural cache maintains a knowledge of both spatial and temporal locality on a block-by-block basis, and this approach has proven to drive extremely high read cache hit rates for mixed enterprise workloads that exhibit a mix of random and sequential access. And because this method is based on a deep learning approach, it quickly optimizes as conditions change, using a homeostatic approach (just like your home thermostat) to always keep the system operating at peak performance without any manual intervention.
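Infinidat does not publish the internals of its activity vectors in this post, so the following Python sketch only illustrates the general idea of tracking per-block metrics and co-access patterns and using them to choose pre-fetch candidates. Everything in it is hypothetical: the `CoAccessTracker` class, the sliding `window`, the `decay` factor, and the scoring rule are assumptions, and it uses plain counters rather than deep learning.

```python
from collections import defaultdict, deque


class CoAccessTracker:
    """Hypothetical per-block activity tracking; not Infinidat's algorithm."""

    def __init__(self, window=2, decay=0.9):
        self.decay = decay
        self.recent = deque(maxlen=window)  # most recently accessed blocks
        self.reads = defaultdict(float)     # decayed read count per block
        self.writes = defaultdict(float)    # decayed write count per block
        # together[a][b]: decayed count of a and b being accessed close in time
        self.together = defaultdict(lambda: defaultdict(float))

    def record(self, block, is_write=False):
        counter = self.writes if is_write else self.reads
        # Older activity fades each time the counter is updated.
        counter[block] = counter[block] * self.decay + 1.0
        for other in self.recent:
            if other == block:
                continue
            for a, b in ((block, other), (other, block)):
                self.together[a][b] = self.together[a][b] * self.decay + 1.0
        self.recent.append(block)

    def activity_vector(self, block):
        """A small feature tuple for one block: (reads, writes, neighbor count)."""
        return (self.reads[block], self.writes[block], len(self.together[block]))

    def prefetch_candidates(self, block, k=2):
        """Blocks most often seen near `block`, strongest association first."""
        scores = self.together.get(block, {})
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [b for b, _ in ranked[:k]]


tracker = CoAccessTracker(window=2)
for i in range(5):
    # Blocks 100, 7 and 350 are always read together; 900+i is unrelated noise.
    for block in (100, 7, 350, 900 + i):
        tracker.record(block)

candidates = tracker.prefetch_candidates(100, k=2)
```

Here blocks 7 and 350 end up as the pre-fetch candidates for block 100 even though their addresses are nowhere near each other, which is the kind of temporal locality that address-based read-ahead cannot capture.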
There are several other aspects to the InfiniBox design that contribute to the extremely high read cache hit rates. First, InfiniBox uses multiple storage tiers – a DRAM cache that is nonpersistent, a solid-state tier that is persistent, and all the rest of the system capacity on low-cost spinning disk. Infinidat uses a larger DRAM cache than most enterprise storage vendors – up to 3TB in its larger systems. This higher cache capacity accomplishes two things: it allows more data to be stored closer to the CPUs, which helps boost cache hit rates, and it allows data to stay in cache a bit longer, which feeds multiple subsequent accesses and provides better data inputs for the activity vectors, leading to more effective data placement. That DRAM cache is backed by a persistent, solid-state tier that can be as large as 368TB on its larger systems, and that capacity is largely managed as a secondary cache. The remainder of the system's capacity is on HDDs, and for larger systems (4PB+) that can be over 90% of the overall capacity of the system. Remember from my prior blog that over 70% of installed systems in the field have 2PB+.
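The read path through the three tiers can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `TieredStore` class is hypothetical, both cache tiers use plain LRU eviction (not neural cache), capacities are counted in blocks rather than terabytes, and the promotion policy is a guess rather than InfiniBox behavior.

```python
from collections import OrderedDict


class TieredStore:
    """Toy DRAM -> SSD -> HDD read path (illustrative, not InfiniBox's design)."""

    def __init__(self, dram_slots, ssd_slots, hdd):
        self.dram = OrderedDict()  # small, fastest tier
        self.ssd = OrderedDict()   # larger secondary cache
        self.hdd = hdd             # dict holding every block
        self.dram_slots = dram_slots
        self.ssd_slots = ssd_slots

    @staticmethod
    def _put(tier, slots, block, data):
        tier[block] = data
        tier.move_to_end(block)
        if len(tier) > slots:
            tier.popitem(last=False)  # evict the least recently used block

    def read(self, block):
        """Return (data, name of the tier that served the read)."""
        if block in self.dram:
            self.dram.move_to_end(block)
            return self.dram[block], "dram"
        if block in self.ssd:
            data, served_by = self.ssd[block], "ssd"
            self.ssd.move_to_end(block)
        else:
            data, served_by = self.hdd[block], "hdd"
            self._put(self.ssd, self.ssd_slots, block, data)
        # Whatever tier served the read, promote the block into DRAM.
        self._put(self.dram, self.dram_slots, block, data)
        return data, served_by


store = TieredStore(dram_slots=2, ssd_slots=4,
                    hdd={n: f"block-{n}" for n in range(10)})
served = [store.read(n)[1] for n in (0, 1, 2, 0, 0)]
```

In this run the first three reads come from HDD, the fourth finds block 0 in the SSD tier because it has already been pushed out of the two-slot DRAM cache, and the fifth is a DRAM hit. A larger DRAM tier would have kept block 0 resident, which is the argument for the bigger cache made above.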
Second, data in the write cache will be coalesced based on activity vectors and written out serially to persistent storage for higher write efficiencies. In the case of solid-state storage, this minimizes any garbage collection activities over the long term and for spinning disk, it takes advantage of the much higher sequential write performance (vs random write performance) of HDDs. As side advantages, minimizing random writes on the flash improves media endurance while doing so on the HDDs contributes to better device-level reliability.
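As a rough illustration of write coalescing, the Python sketch below groups dirty blocks that are expected to be used together and assigns them consecutive physical locations, so they can be flushed in one sequential pass. The `affinity` labels are a hypothetical stand-in for whatever grouping the activity vectors produce, and the `coalesce` function and its `next_free` parameter are likewise assumptions.

```python
def coalesce(dirty_blocks, affinity, next_free=0):
    """Order dirty blocks by affinity group and lay them out contiguously.

    dirty_blocks: iterable of logical block ids waiting in the write cache
    affinity:     dict mapping block id -> group label (hypothetical input)
    next_free:    first free physical location on the target device
    Returns (write_order, placement), where placement maps each logical
    block id to its new physical location.
    """
    write_order = sorted(dirty_blocks, key=lambda b: (affinity[b], b))
    placement = {block: next_free + i for i, block in enumerate(write_order)}
    return write_order, placement


# Blocks arrive in the write cache in arbitrary order.
dirty = [907, 12, 455, 13, 906, 3]
affinity = {
    3: "db-index", 12: "db-index", 13: "db-index",
    455: "logs",
    906: "vm-images", 907: "vm-images",
}

write_order, placement = coalesce(dirty, affinity, next_free=1000)
# write_order is [3, 12, 13, 455, 906, 907], placed at locations 1000..1005
```

Six scattered random writes become one sequential write, and blocks that tend to be read together now sit next to each other on the media.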
And third, all of the system metadata is kept in DRAM in trie data structures for fast, efficient access. Drawing on a decades-old data structure that Google has leveraged to speed internet searches, Infinidat developed and patented a design optimized for storage access. Trie data structures deliver a very scalable way to reference literally billions of objects (such as data blocks) very quickly without having to traverse a deep "tree" or risk hash collisions at hyper-scale. Fewer instruction cycles are required to "find" any needed data in those rare cases where it has not already been pre-fetched. This makes it very efficient for data to be pre-fetched together, even when that data does not reside together (as it might not if activity vectors have evolved since the data's last use).
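For readers unfamiliar with tries, here is a minimal Python sketch of a fixed-depth radix trie keyed on block addresses. It is a generic textbook structure, not Infinidat's patented design; the `BlockTrie` name, the 64-bit key width, and the 4 bits consumed per level are assumptions chosen for the example.

```python
class BlockTrie:
    """Fixed-depth radix trie mapping block addresses to metadata (illustrative)."""

    def __init__(self, key_bits=64, bits_per_level=4):
        assert key_bits % bits_per_level == 0
        self.levels = key_bits // bits_per_level
        self.bits = bits_per_level
        self.fanout = 1 << bits_per_level
        self.mask = self.fanout - 1
        self.root = [None] * self.fanout

    def _digits(self, address):
        # Split the address into fixed-size digits, most significant first.
        # Bits above key_bits are ignored.
        for level in range(self.levels - 1, -1, -1):
            yield (address >> (level * self.bits)) & self.mask

    def insert(self, address, metadata):
        node = self.root
        digits = list(self._digits(address))
        for d in digits[:-1]:
            if node[d] is None:
                node[d] = [None] * self.fanout
            node = node[d]
        node[digits[-1]] = metadata

    def lookup(self, address):
        node = self.root
        for d in self._digits(address):
            node = node[d]
            if node is None:
                return None  # no entry stored under this address
        return node


trie = BlockTrie()
trie.insert(0xDEADBEEF, {"tier": "ssd", "offset": 4096})
found = trie.lookup(0xDEADBEEF)
missing = trie.lookup(0xDEADBEEE)
```

Every lookup follows exactly `key_bits / bits_per_level` array indexes (16 here) regardless of how many entries are stored, and because each address maps to exactly one path there are no collisions to resolve.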
Infinidat's Neural Cache employs the subset of artificial intelligence and machine learning (AI/ML) known as deep learning. Deep learning evaluates multiple decision options and continuously learns which ones did and which ones did not prove helpful in meeting its objective, effectively leveraging what looks like a "neural network" to pursue its goal (which in this case is to optimize data placement over time so as to minimize read cache misses). The longer it runs, the closer it comes to its target, and when things change (as they often do with modern workloads), it adapts and begins once again closing in on its objective. It is fully self-managing, and an excellent example of how (and why) most storage vendors will ultimately move to similar in-system approaches and away from the more static cache management and more manually intensive approaches to system optimization that are so prevalent today.