Show IO on Netapp

I think I might be hitting the IO limits of what my Netapp can deliver, as I have been adding more servers to my cluster and iowait has gone up on each server.

However, how do I quantify this? How can I use Netapp CLI tools to view current IO stats? I am aware of "stats show" but not seeing an "io" object or similar. How do I know what the Netapp is supposed to be able to deliver?

If anyone has more experience with Netapp than I, I would greatly appreciate the help.

Thanks!

There are several options for monitoring the performance of a NetApp filer, and which one applies depends on the version of Data ONTAP. Run sysconfig to see the version. For clustered ONTAP, you can use OnCommand Performance Manager as a GUI tool; another option there is to use QoS as a performance monitor. For 7-mode, you can use the sysstat or statit console commands.

  • 1

This answer only applies to 7-mode - I have no experience with cluster mode.

With performance problems, there are simply no easy answers.

You have counters for iops, that you can show with sysstat -x.

stats show system will give you something similar - a list of NFS/FCP/CIFS ops etc.

On their own, though, these numbers are fairly arbitrary: how do you know how many IOPS is 'too many'?

The most useful indicator I find is the type of consistency point (CP). Again, back to sysstat -x. Filers handle write IO by filling an NVRAM cache; this cache is flushed periodically, and the data is written to disk in bursts.

What type of consistency point occurred is a good indicator of whether your system is 'happy'. https://kb.netapp.com/support/index?page=content&id=3014024

T means your system is idle. (Triggered by timer: not much happened for 10s, so it decided to destage anyway.)
S or Z is a 'forced' CP because of a snapshot/snapmirror op. (This usually isn't a problem.)
F or H or L means your system is getting busy. (F is NVRAM filling with write data; H/L represent high and low watermarks for memory.)
B or b means your system is struggling. (Back-to-back CPs, which means you're hitting the limits of your ability to write to disk.)
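
As a rough aid, the letters above can be turned into a quick lookup. This is only a sketch in Python: the name classify_cp and the status labels are made up for illustration, it covers just the letters listed in this answer, and it assumes the type letter is the first character of the CP field you feed it.

```python
# Hypothetical helper: maps the CP-type letters described above to a rough status.
CP_STATUS = {
    "T": "idle",        # timer-triggered CP
    "S": "forced",      # snapshot/snapmirror related
    "Z": "forced",      # snapshot/snapmirror related
    "F": "busy",        # NVRAM filling with write data
    "H": "busy",        # high watermark
    "L": "busy",        # low watermark
    "B": "struggling",  # back-to-back CP
    "b": "struggling",  # back-to-back CP
}


def classify_cp(cp_field: str) -> str:
    """Return a rough status for a CP-type field, or 'unknown'."""
    # Only the first non-blank character is inspected (assumption, see above).
    letter = cp_field.strip()[:1]
    return CP_STATUS.get(letter, "unknown")


if __name__ == "__main__":
    for sample in ["T", "F", "B", "?"]:
        print(sample, "->", classify_cp(sample))
```

If most of your samples come back as 'struggling', that matches the back-to-back CP case described above.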

This is almost entirely about write IO though. Another reason your system can be struggling is read IO. Writes can easily be cached; reads must be fetched immediately - and only in some cases can they be cached.

Your stats show counter will give you disk_data_read and disk_data_written. sysstat -x will give you the same, and a notion of disk utilisation. (But be warned - that utilisation is 'cross system' so won't show you if you have one really hot aggregate averaged with a 'cold' one).
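
To see why an averaged figure can mislead, here is a tiny worked example in Python. The aggregate names and percentages are invented, and it simply takes the 'averaged across the system' description above at face value; it does not model how the filer actually computes the number.

```python
def mean_utilisation(per_aggregate: dict) -> float:
    """Plain arithmetic mean of per-aggregate utilisation percentages."""
    return sum(per_aggregate.values()) / len(per_aggregate)


# Hypothetical figures: one hot aggregate, one cold one.
util = {"aggr_hot": 95.0, "aggr_cold": 5.0}

overall = mean_utilisation(util)
hottest = max(util, key=util.get)

print(f"average {overall:.0f}%, hottest {hottest} at {util[hottest]:.0f}%")
# prints: average 50%, hottest aggr_hot at 95%
```

A 50% average looks comfortable even though one aggregate is nearly saturated, so it is worth looking at per-aggregate or per-volume numbers as well.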

You can also run stats show volume to get per-volume IO stats. This will give you an idea of the total reads/writes, and which volume they're going to. It also distinguishes between 'read', 'write' and 'other'. 'other' can be quite significant, and problematic.
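
Once you have read/write/other counts for a volume, the split is simple arithmetic. A minimal sketch in Python follows; the function name op_shares and the sample numbers are hypothetical, and you would copy the real counts from the per-volume stats by hand.

```python
def op_shares(read_ops: int, write_ops: int, other_ops: int) -> dict:
    """Return each op class as a fraction of the volume's total ops."""
    total = read_ops + write_ops + other_ops
    if total == 0:
        # An idle volume has no meaningful split.
        return {"read": 0.0, "write": 0.0, "other": 0.0}
    return {
        "read": read_ops / total,
        "write": write_ops / total,
        "other": other_ops / total,
    }


# Hypothetical counts for one volume.
shares = op_shares(read_ops=100, write_ops=100, other_ops=200)
print(shares)  # {'read': 0.25, 'write': 0.25, 'other': 0.5}
```

A large 'other' share, as in this made-up example, is the kind of thing worth investigating.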

  • 1

Well, I guess you ran iostat, saw "iowait" on the server side, and concluded "the Netapp may be too slow". If you now look at the Netapp, you will find everything and nothing to prove your theory, I promise you.
That is not because the Netapp storage gives you too little information. But if you do not know what you are looking for, you will not get to the root of the problem (if there is a problem or performance issue related to the storage at all).
Therefore I would suggest another approach: look from server to storage and follow the I/O flow.
First of all, how are the servers connected? Fibre Channel SAN? NFS/iSCSI (IP based)?
Check at what times you see "iowait". Do you see "iowait" with little or no I/O activity, and with low LUN utilization? Could this be related to a running backup?
What servers are connected? Mostly VMware?
What are the I/O characteristics (read/write ratio)?
Could there be a problem with unaligned I/O?
How is the I/O queue configured on the server side?
You should analyse from server to storage, not vice versa. Start with a clear picture of your configuration and storage topology. This would also help us give you more ideas for checking whether there is a storage issue and where it is located.
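
To put a number on the server-side "iowait" at a given time, one option is to compare two samples of the aggregate cpu line. The sketch below is in Python and assumes the servers are Linux hosts exposing /proc/stat (the question does not say which OS is in use); the function name iowait_percent is made up for illustration.

```python
def iowait_percent(before: str, after: str) -> float:
    """Percentage of CPU time spent in iowait between two 'cpu' lines
    taken from the first line of /proc/stat (assumes Linux)."""

    def fields(line: str) -> list:
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        return [int(x) for x in parts[1:]]

    deltas = [a - b for b, a in zip(fields(before), fields(after))]
    # Field order: user nice system idle iowait irq softirq steal [guest ...].
    # Guest time is already counted in user/nice, so sum only the first eight.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


# Made-up sample lines, standing in for two readings a few seconds apart.
t0 = "cpu 100 0 50 800 50 0 0 0 0 0"
t1 = "cpu 200 0 100 1500 200 0 0 0 0 0"
print(f"iowait: {iowait_percent(t0, t1):.1f}%")  # iowait: 15.0%
```

On a real host you would read the first line of /proc/stat twice, a few seconds apart, and log the result with a timestamp so it can be lined up against backup windows and the storage-side counters.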

  • 0

The Performance Advisor tool that comes with OnCommand Unified Manager is what you'd want. This software is free to all NetApp customers. It will monitor IOPS information at the controller, aggregate, volume and LUN level.

  • 0
