HPE InfoSight Nimble Labs Overview and Deep Dive


OK, on to Capacity Consumers Forecast. This App breaks down historical capacity consumption and can even forecast it by application type. It is exactly the same as the other capacity-focused App, Capacity Consumers Timeline, except this one shows you predicted usage for roughly one month into the future based on the forecast range you select. I’ll just choose the date range I want to see in the chart, then select the timeframe I want to use to model the forecast. I can select physical, logical or allocated space, which changes my options for the other two drop-downs, as you can see here when I’m clicking through. I’ll choose Logical usage so we can see the breakout of volume vs. snapshot data usage. Now we can see our past data usage as well as some predictive analytics about future data consumption. What I think is really great is that you can keep an eye on things from an overall application viewpoint, rather than only looking at volumes or the overall combined storage pool. This chart is interactive, so I can mouse over and show or hide certain things to better digest and understand this information. Also notice that the timeframe used for the forecasting is clearly separated from the overall timeline, so you can spot any unusual growth periods and maybe avoid those times for the forecasting data range. I can go in and choose to see volume and snapshot data usage combined instead of split out like we started with. Now I’ll go back and change to view physical space used vs. logical to see what that looks like.
Now let’s change context and move to Performance-related Applications. Volume Performance breaks down I/O into sequential and random, which most other analytics don’t do. I’ll select a date range and the storage pool I want to analyze, then choose what I want to filter on. I’ll choose applications and then select one or more of the configured application types in this pool. Now I select one or more metrics to show. I’ll just choose a few for this first example.
Here we get a nice set of interactive color-coded graphs showing us details on the metrics we selected for the timeframe in question. In this case, IOPS, throughput and latency are broken down by read and write operations. Now I’ll go in and select everything so you can see how that looks. Now we can also see cache hit rates, as well as random vs. sequential details broken down into read and write actions. Scrolling all the way down, we see a few histograms showing average read, write and operation block size. This is very helpful when trying to determine the specific blend of block sizes and overall workload characteristics for a certain set of volumes over a specific timeframe. Overall, this particular App provides nice, low-level diagnostic information which can be used by even the most advanced storage or application engineers.
Pool performance telemetry provides details regarding various low-level aspects of the system, including splitting up workloads between sequential and random. In this case, the selection process is pretty straightforward. I just select the pool I want and the date range and hit the big green button. What this gives me is a very detailed breakdown of the various aspects of workload characteristics.
Of course, if you’re using this to troubleshoot a specific event on a specific day, you can jump right to that time with a click, drag and release action. This zooms in the whole chart, and we can get down to one-minute granularity at this level. Notice the timestamp in the upper right-hand side of this graph. Now I’m going to reset the zoom and show you something else that jumps out at me.
The red lines in the sequential read chart indicate high potential impact periods. If you’re not familiar, Potential Impact is a combined score showing problem timeframes with regard to overall latency and time of day, combined with workload characteristics such as read/write patterns and block size as they relate to application type. It deduces which high latency events are of concern based on customer workload. As opposed to some of the other low-level diagnostics apps, potential impact is a more user-friendly way for us to indicate possible performance issues. So what we can see is that there is a repeating pattern of high potential impact periods happening every night between midnight and 3 AM. We can also see a brief spike in storage CPU utilization at the beginning of this timeframe. The combination of these factors is signaling that something is going on during this time which is making the array work harder and, most importantly, as indicated by elevated potential impact scores, application workloads are in danger of being negatively affected.
One of my favorites is reoccurring performance patterns, which helps you visually identify daily events and view historical performance trending over time. As with many of the other Apps, I’ll need to first select a date range and the storage pool I want to analyze, then choose what I want to filter on. You can see a familiar list of objects here, and I’ll just choose applications to be consistent with what we see in the other demos. Choose one or more application types, then select the performance metric to focus on. In this case, we can only choose one metric instead of multiples like with some of the other apps. We get a nice color-coded chart laid out according to the time granularity selected. In this case we chose ‘daily’, so we see from left to right, midnight through 11:55 PM, total latency stats for the applications we selected, each block representing a 5-minute slice.
The vertical axis represents days, with the most recent at the top and the oldest at the bottom. As shown on the key to the right of the graph, the darker the red, the higher the total latency. This makes it very easy to spot periods of higher total latency starting every night at midnight and continuing through about 2 or 3 AM. As I adjust the time granularity selection, you can start to see that something changed starting the week of 9/30-10/6 which caused an overall spike in total latency. More than likely, a change was made around that time which resulted in this performance shift.
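For anyone who wants to picture how a chart like this is laid out, here is a minimal Python sketch of the same idea: latency samples are grouped into one row per day and 288 five-minute columns per row, with the most recent day first. It only illustrates the layout described above and is not how InfoSight builds the chart; the function names and the sample data are made up for the example.

```python
from collections import defaultdict
from datetime import datetime

SLOT_MINUTES = 5
SLOTS_PER_DAY = 24 * 60 // SLOT_MINUTES  # 288 five-minute slices per day


def slot_index(ts):
    """Column for a timestamp: 0 is 12:00 AM, 287 is 11:55 PM."""
    return (ts.hour * 60 + ts.minute) // SLOT_MINUTES


def build_daily_grid(samples):
    """Group (timestamp, latency_ms) samples into rows of 288 averages.

    Returns a list of (date, row) pairs, most recent day first.
    Slices with no samples are None. Hypothetical helper for illustration.
    """
    sums = defaultdict(lambda: [0.0] * SLOTS_PER_DAY)
    counts = defaultdict(lambda: [0] * SLOTS_PER_DAY)
    for ts, latency_ms in samples:
        i = slot_index(ts)
        sums[ts.date()][i] += latency_ms
        counts[ts.date()][i] += 1
    grid = []
    for day in sorted(sums, reverse=True):  # newest row on top
        row = [s / c if c else None for s, c in zip(sums[day], counts[day])]
        grid.append((day, row))
    return grid


# Made-up samples: two readings just after midnight, one late the next day.
samples = [
    (datetime(2019, 10, 1, 0, 2), 4.0),
    (datetime(2019, 10, 1, 0, 4), 6.0),
    (datetime(2019, 10, 2, 23, 57), 10.0),
]
grid = build_daily_grid(samples)
```

Coloring each cell by its value, darker for higher latency, would give the same kind of picture, where a problem that repeats every night shows up as a dark vertical band near the left edge.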
This concludes our overview of several particularly useful Apps, which can be found in the Nimble Storage Labs section of the HPE InfoSight web portal. I encourage you to check out the other demos I’ve linked in the description section of this video if you’re curious about other Apps we have up on the portal.
