Infinidat Blog

How Performance Analytics Helps You Solve Problems When Storage Looks Like the Usual Suspect

I recently learned a new term that probably requires no explanation for most storage professionals: TTI, or “Time To Innocence.” This is the time it takes the storage admin to prove the problem is not in the storage layer. Storage admins often spend a lot of time proving their systems’ innocence, and when the storage really is to blame, they spend just as much time looking for the root cause. That cause usually comes down to a change: a change in workload that alters performance, a change due to connectivity failures, or something else entirely.

INFINIDAT recently announced our InfiniBox 3.0 release. It includes a multi-dimensional instrumentation layer called Performance Analytics, which allows storage administrators to cut TTI to just a few seconds. InfiniBox 3.0 also lets them ask any type of question about their performance (more below).

Let’s dive into each of these scenarios and see how an InfiniBox admin can quickly prove his innocence:

It wasn’t me, your honor. I’m innocent!
Proving your innocence from a storage admin’s perspective is about two main questions:

  • What’s the current performance (IOPS, latency)?
  • How does it compare to the past?

If the application is suddenly consuming twice the number of IOPS it usually does, the storage admin is usually “acquitted” and can go back to doing his job – maintaining and managing the storage to support the business.
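
The comparison described above can be sketched in a few lines. This is only an illustration, not the InfiniMetrics interface: the function name `compare_to_baseline`, the sample numbers, and the 2x threshold are all assumptions made for the example.

```python
from statistics import mean


def compare_to_baseline(history, current, threshold=2.0):
    """Compare a current IOPS reading with the average of past readings.

    Hypothetical helper: returns (ratio, exceeded), where `exceeded` is True
    when the current value is at least `threshold` times the baseline.
    """
    baseline = mean(history)
    if baseline == 0:
        raise ValueError("baseline is zero, no meaningful ratio")
    ratio = current / baseline
    return ratio, ratio >= threshold


# Made-up IOPS history for one application, plus a made-up current reading.
history_iops = [9800, 10100, 10050, 9950, 10100]
ratio, workload_changed = compare_to_baseline(history_iops, 20500)
print(f"{ratio:.2f}x baseline, workload changed: {workload_changed}")
```

When the ratio crosses the threshold, the workload itself has changed, which is the "acquittal" case described above.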

With InfiniBox, the task of diagnosing performance issues is handled using InfiniMetrics, our performance archive (the evidence locker, if you will), which allows administrators to quickly perform a “post mortem” on any performance issue, even one that has already disappeared. While we love the simplicity of the InfiniMetrics user interface and its scalability, looking at any storage performance history requires the administrator to know where to look. The real magic of performance analytics shows when you’re not sure what you’re looking for.

I’m guilty, your honor, I’m just not sure who made me do it.
When an application misbehaves, looking for the root cause of an ongoing performance issue is the proverbial needle in a haystack (or should I say storage stack?). Usually, the first question administrators ask themselves is “Which applications are the top consumers of the system’s performance resources?” followed by “What are these applications doing right now?” This is where the multi-dimensional performance analytics in InfiniBox 3.0 come into play.

So what do we mean by “multi-dimensional performance analytics?” InfiniBox now has a “BI-like” (BI=Business Intelligence) engine that can look at any performance metric from multiple views. Let’s say you run a top IO consumers report and find that volume XYZ is your top IO consumer. You may want to investigate further by looking at the following:

  • What is this volume doing in terms of reads, writes, or non-IO SCSI commands?
  • Is the volume’s IO generated by the host, or does it consist of reads generated by replication to the DR site?
  • Which hosts are the top IO consumers on this volume?
  • Is the high latency on this volume a result of large sequential IOs (normal) or small random IOs (problem)?
  • Are the hosts accessing this volume balanced across the targets (or maybe there’s a bottleneck on the target port that is causing the high latency)?

Think of each of these possible avenues of investigation as looking at the same data (performance) from another dimension (a term borrowed from BI). What the storage admin needs is the ability to switch between views quickly and reach the root cause of the performance issue as fast as possible.
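
To make the idea of "dimensions" concrete, here is a minimal sketch of slicing the same samples along different dimensions. It does not use the InfiniBox API: the sample records, the field names (`volume`, `host`, `op`, `port`, `iops`), and the helper `top_by` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical one-second samples; field names and values are illustrative only.
SAMPLES = [
    {"volume": "XYZ", "host": "db01", "op": "read", "port": "port-1", "iops": 4000},
    {"volume": "XYZ", "host": "db01", "op": "write", "port": "port-1", "iops": 1500},
    {"volume": "XYZ", "host": "db02", "op": "read", "port": "port-1", "iops": 2500},
    {"volume": "XYZ", "host": "db02", "op": "write", "port": "port-2", "iops": 500},
    {"volume": "ABC", "host": "web01", "op": "read", "port": "port-2", "iops": 1200},
]


def top_by(samples, dimension, **filters):
    """Sum IOPS per value of `dimension`, keeping only samples that match `filters`.

    Returns (value, total_iops) pairs sorted from highest to lowest total.
    """
    totals = defaultdict(int)
    for sample in samples:
        if all(sample[key] == value for key, value in filters.items()):
            totals[sample[dimension]] += sample["iops"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Start with the top IO consumers, then pivot on the busiest volume.
print(top_by(SAMPLES, "volume"))
print(top_by(SAMPLES, "host", volume="XYZ"))
print(top_by(SAMPLES, "op", volume="XYZ"))
print(top_by(SAMPLES, "port", volume="XYZ"))
```

Each call answers a different question from the list above (which volume, which host, reads or writes, which target port) without preparing a separate view for each one in advance.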

In InfiniBox 3.0, we’ve designed a very scalable API that reports on all of these dimensions, second by second across thousands of data points, and lets administrators switch between them without having to build all the views in advance – all the data is there waiting to be mined.

While the new instrumentation layer enabled us to completely overhaul the performance GUI (and for the CLI lovers – the CLI too), this is just the beginning. Now that we have the engine, we can expose more and more of its capabilities to the administrator. We are already working on adding more metrics, adding more workflows to our GUI based on customer and partner input, and expanding the level of data collection and analytics.

With all that, we have even more good news for our customers: as always, the only cost of this functionality is a software upgrade, thanks to InfiniBox’s all-inclusive “batteries included” licensing policy. There’s no fine print and no additional PO for the new features.

I have evidence, your honor!

About Eran Brown
Eran Brown is the EMEA CTO at INFINIDAT.
Over the last 14 years, Eran has architected data center solutions for all layers: application, virtualization, networking and, most of all, storage. His prior roles span senior product management, systems engineering and consulting, working with companies in multiple verticals (financials, oil & gas, telecom, software, and web) and helping them plan, design and deploy scalable infrastructure to support their business applications.