Notes and Links for NetApp with VMware iSCSI Datastores (and ifgrps)

Mainly focusing on gathering information for implementing NetApp ONTAP with VMware, iSCSI datastores, and iSCSI LIFs on an IFGRP. NetApp HCI is also mentioned (since it is an iSCSI SAN).

Provisioning for SAN protocols > iSCSI express configurations for ESXi using VSC > iSCSI configuration workflow:

Configuring your network for best performance:
1) Connect the host and storage ports to the same network.
It is best to connect to the same switches. Routing should never be used.
2) Select the highest speed ports available and dedicate them to iSCSI (if possible).
3) Disable Ethernet flow control for all ports.
4) Enable jumbo frames (typically MTU of 9000).
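
A minimal command sketch of items 3 and 4, assuming hypothetical names (node cluster1-01, port e0e, an "iSCSI" broadcast domain, vSwitch1 and vmk1 on the host) - verify the end-to-end MTU on the switches as well:

On ONTAP:
::> network port modify -node cluster1-01 -port e0e -flowcontrol-admin none
::> network port broadcast-domain modify -ipspace Default -broadcast-domain iSCSI -mtu 9000

On the ESXi host:
# esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
# esxcli network ip interface set --interface-name=vmk1 --mtu=9000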

Configuring host iSCSI ports and vSwitches
It is recommended that you use IP Hash as the NIC teaming policy, which requires a single VMkernel port on a single vSwitch.

Binding iSCSI ports to the iSCSI software adapter
The ports you created for iSCSI must be associated with the iSCSI software adapter to support multipathing.
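
A hedged esxcli sketch of enabling the software iSCSI adapter and binding a VMkernel port to it (the adapter name vmhba64, port group iSCSI-A, vSwitch1, vmk1, and the addressing are all hypothetical placeholders):

# esxcli network vswitch standard portgroup add --vswitch-name=vSwitch1 --portgroup-name=iSCSI-A
# esxcli network ip interface add --interface-name=vmk1 --portgroup-name=iSCSI-A
# esxcli network ip interface ipv4 set --interface-name=vmk1 --type=static --ipv4=192.168.10.11 --netmask=255.255.255.0
# esxcli iscsi software set --enabled=true
# esxcli iscsi networkportal add --adapter=vmhba64 --nic=vmk1

Repeat the VMkernel creation and binding (e.g. vmk2) for each additional iSCSI path.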

Configuring the ESXi host best practice settings
VMware vSphere Web Client > Home > vCenter > Hosts > Right-click the host > Actions > NetApp VSC > Set Recommended Values.

Deciding where to provision the volume > Creating a new SVM
If CIFS is one of the protocols you selected, then the security style is set to NTFS. Otherwise, the security style is set to UNIX.
Keep the default language setting C.UTF-8.
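
If creating the SVM manually rather than through VSC, a minimal sketch for an iSCSI-only SVM might look like the following (svm1, svm1_root, and aggr1 are hypothetical names; UNIX security style and the default C.UTF-8 language as above):

::> vserver create -vserver svm1 -rootvolume svm1_root -aggregate aggr1 -rootvolume-security-style unix -language C.UTF-8
::> vserver iscsi create -vserver svm1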

ONTAP 9: iSCSI Configuration for ESXi using VSC Express Guide (PDF):

How to configure VMware vSphere 6.x on Data ONTAP 8.x

The best-practice configuration for vSphere 6.x with Data ONTAP 8.x includes the following:
- For IP-based storage, NetApp recommends separating IP-based storage traffic from public IP network traffic

Setting up a multipathing configuration:
For clustered Data ONTAP, only ALUA configuration is supported for iSCSI, FC, and FCoE protocols.
Base your multipathing configuration on …:
The VMware default ALUA Storage Array Type Plug-in (SATP), with either the Round Robin (RR) or Most Recently Used (MRU) PSP
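
On ESXi, a hedged example of making Round Robin the default PSP for the ALUA SATP (this applies to devices claimed after the change; existing devices can be changed individually with esxcli storage nmp device set):

# esxcli storage nmp satp set --satp=VMW_SATP_ALUA --default-psp=VMW_PSP_RR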

Advanced configuration parameters for vSphere 6.x clients:
Note: These recommendations require that each HBA configuration in the ESXi host for the protocol being used (FC, FCoE, or iSCSI) uses the default factory configuration. For FC, FCoE, or iSCSI:
Disk.QFullSampleSize = 32
Disk.QFullThreshold = 8
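
A hedged way to apply these from the command line instead of through VSC (assuming the global Disk.QFullSampleSize/Disk.QFullThreshold advanced options are present on the host; on ESXi 5.1 and later the same throttling can also be set per device with esxcli storage core device set):

# esxcli system settings advanced set --option=/Disk/QFullSampleSize --int-value=32
# esxcli system settings advanced set --option=/Disk/QFullThreshold --int-value=8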

VMware vSphere with ONTAP (Best Practices)

2.5 Virtual Storage Console
The most important best practice when using vSphere with systems running ONTAP software is to install and use the Virtual Storage Console (VSC). The VSC is a vCenter plug-in that simplifies storage management and efficiency features, enhances availability, and reduces storage costs and operational overhead, whether using SAN or NAS. It uses best practices for provisioning datastores and optimizes ESXi host settings for multipath and HBA timeouts (these are described in Appendix B).

2.6 General Network
- Separate storage network traffic…
- Jumbo frames
- NetApp recommends disabling network flow control only on the cluster network ports within an ONTAP cluster…
- Configure the Ethernet ports as Rapid Spanning Tree Protocol (RSTP) edge ports or use the Cisco PortFast feature (if using 802.1Q VLAN trunking, enable the Spanning-Tree PortFast trunk feature).
- Use switches that support link aggregation of ports on two separate switch chassis, using a multichassis link aggregation group approach (e.g., Cisco vPC)
-- Disable LACP for switch ports connected to ESXi unless using dvSwitches 5.1 or later with LACP configured.
-- Use LACP to create link aggregates for ONTAP storage systems, using dynamic multimode interface groups with IP hash.
-- Use IP hash teaming policy on ESXi.
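
A sketch of the ONTAP side of the last two bullets, creating a dynamic multimode (LACP) interface group with IP hash distribution (node and port names are hypothetical; the corresponding switch ports must be configured as an LACP port channel):

::> network port ifgrp create -node cluster1-01 -ifgrp a0a -distr-func ip -mode multimode_lacp
::> network port ifgrp add-port -node cluster1-01 -ifgrp a0a -port e0e
::> network port ifgrp add-port -node cluster1-01 -ifgrp a0a -port e0f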

2.7 SAN (FC, FCoE, iSCSI), RDM
The NMP recognizes the ONTAP cluster as ALUA, and it uses the ALUA storage array type plug-in (VMW_SATP_ALUA) and selects the round robin path selection plug-in (VMW_PSP_RR).
Other best practices to consider:
- Make sure that a logical interface (LIF) is created for each SVM on each node in the ONTAP cluster for maximum availability and mobility (see the LIF creation sketch after this list).
- For iSCSI networks, use multiple VMkernel network interfaces on different network subnets that use NIC teaming when multiple virtual switches are used. Or use multiple physical NICs connected to multiple physical switches to provide HA and increased throughput. In ONTAP, configure either a single-mode interface group for failover with two or more links that are connected to two or more switches or use LACP or other link-aggregation technology with multimode interface groups to provide HA and the benefits of link aggregation.
- Use VSC to create and manage LUNs and igroups. VSC automatically determines WWPNs of servers and creates appropriate igroups. VSC also configures LUNs according to best practices and maps them to the correct igroups.
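
A hedged sketch of the LIF-per-node bullet for a two-node cluster (SVM, node, port, and IP values are hypothetical):

::> network interface create -vserver svm1 -lif iscsi_lif01 -role data -data-protocol iscsi -home-node cluster1-01 -home-port a0a -address 192.168.10.21 -netmask 255.255.255.0
::> network interface create -vserver svm1 -lif iscsi_lif02 -role data -data-protocol iscsi -home-node cluster1-02 -home-port a0a -address 192.168.10.22 -netmask 255.255.255.0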

Need to Change the Ifgrp Distribution Function from IP to Port for Cluster in Production (HTML):

1) Remove a port from the ifgrp
2) Create a new ifgrp with the "port" distribution function
3) Migrate the LIF from the old to the new ifgrp - network interface migrate (15-second disruption)
4) Delete the old ifgrp and add the second port to the new ifgrp
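
A hedged command sketch of the four steps (node, ifgrp, port, LIF, and broadcast-domain names are hypothetical; the new ifgrp must also be placed in the correct broadcast domain/VLAN before the LIF moves, and the old ifgrp may need to be removed from its broadcast domain before it can be deleted). Note that SAN LIFs cannot be migrated, so for an iSCSI LIF you would instead bring the LIF administratively down, modify its home port, and bring it back up, relying on host MPIO to ride through the change:

::> network port ifgrp remove-port -node cluster1-01 -ifgrp a0a -port e0f
::> network port ifgrp create -node cluster1-01 -ifgrp a0b -distr-func port -mode multimode_lacp
::> network port ifgrp add-port -node cluster1-01 -ifgrp a0b -port e0f
::> network port broadcast-domain add-ports -ipspace Default -broadcast-domain Data -ports cluster1-01:a0b
::> network interface migrate -vserver svm1 -lif lif1 -destination-node cluster1-01 -destination-port a0b
::> network interface modify -vserver svm1 -lif lif1 -home-node cluster1-01 -home-port a0b
::> network port ifgrp delete -node cluster1-01 -ifgrp a0a
::> network port ifgrp add-port -node cluster1-01 -ifgrp a0b -port e0e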

Ethernet Storage Design Considerations and Best Practices for Clustered Data ONTAP Configurations:

Best Practice: IFGRP Resiliency
As a best practice, NetApp recommends creating an IFGRP by using ports from different NICs. You should, however, verify that the NICs are the same model or chipset and that they have the same port speed, functionality, and so on. This consistency is critical in the event of a port failure. By keeping the port configuration consistent and by spreading the IFGRP over NICs in different PCI slots, you decrease the chances of a slot failure bringing down all the ports in an IFGRP.

Note: If you use iSCSI (RFC 3720) in clustered Data ONTAP, there are two methods to achieve path redundancy. One method is to use IFGRPs to aggregate more than one physical port in partnership with a Link Aggregation Control Protocol (LACP)–enabled switch. Another method is to configure hosts to use Microsoft Multipath I/O (MPIO) over multiple distinct physical links.
Both of these methods allow a storage controller and a host to use aggregated bandwidth, and both can survive the failure of one of the paths from host to storage controller. However, MPIO is already a requirement for using block storage with clustered Data ONTAP, and the use of MPIO has the further advantage of no additional required switch configuration or port trunking configuration. Also, the use of an IFGRP for path management when you use iSCSI is not supported; again, you should use MPIO with iSCSI. However, the use of an IFGRP as a port for an iSCSI LIF is supported.

Best Practice: IFGRP Port Settings
The network interfaces and the switch ports that are members of a dynamic multimode (LACP) IFGRP must be set to use the same speed, duplex, and flow control settings. NetApp also recommends following the same practices if you create any of the other IFGRP types (single-mode or static multimode).

Load Balancing for Multimode IFGRPs
Four distinct load-balancing modes are available:
- Port. Use this distribution method for optimal load-balancing results. However, it doesn’t lend itself as well to troubleshooting, because the TCP/UDP port of a packet is also used to determine the physical port that is used to send a particular packet. It has also been reported that switches operating in particular modes (mapping MAC/IP/port) might exhibit lower than expected performance in this mode.
- MAC. This method is useful only when the IFGRP shares the same VLAN with the clients that have access to the storage. If any storage traffic traverses a router or a firewall, do not use this type of load balancing. Because the MAC address for every outgoing IP frame is the MAC address of the router, using this type of load balancing results in only one interface in the IFGRP being used.
- IP. This method is the second best for load distribution, because the IP addresses of both the sender (LIF) and the client are used to deterministically select the particular physical link that a packet traverses. Although it is deterministic in the selection of a port, the balancing is performed by using an advanced hash function. This approach has been found to work under a wide variety of circumstances, but particular selections of IP addresses might still lead to unequal load distribution.
- Sequential. … NetApp does not recommend that you use this type.

Note: Remember that load balancing in an IFGRP applies to outbound traffic, not inbound traffic. So, when a response is sent back to a requester, the load-balancing algorithm determines which “path” is used to send it. Also note that the preceding load-balancing options need to line up with other settings in the environment, so thoroughly check the other devices connected to the ports on the nodes in the cluster. If you use “port” in the IFGRP configuration on the cluster, make sure that the switch port on the Cisco, Juniper, Brocade, or other device is configured in the same way.
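
Before changing anything, a quick way to verify what distribution function and mode an existing ifgrp is using (the field names below are assumed and may vary slightly by ONTAP release):

::> network port ifgrp show
::> network port ifgrp show -node cluster1-01 -ifgrp a0a -fields distr-func,mode,ports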

The following NVAs are very interesting for low-level commands and detail (I plan to cover these in note form in future posts.)

FlexPod Express with VMware vSphere 6.5 and FAS2600 (DEPLOY)

FlexPod Datacenter with Microsoft Exchange 2016, SharePoint 2016, and NetApp AFF A300 (DEPLOY)

FlexPod Express with VMware vSphere 6.7 and NetApp AFF A220 (DEPLOY)

VMware Private Cloud on NetApp HCI (DESIGN)

VMware Validated Design for NetApp HCI (DESIGN): VVD 4.2 Architecture Design:
