Storage Area Network (SAN): devices implementing the SAN infrastructure

SAN switches

SAN switches serve as the central switching devices for SAN network nodes. You plug one end of an optical cable into a connector on your server's adapter or disk array controller, and the other into a port on the switch. A switch can be compared to a set of wires crossed in such a way that every device on the network can "talk" over its own wire to every other device at the same time; in other words, all participants can talk simultaneously.
One or more interconnected switches form a fabric. A single fabric can consist of up to 239 switches (the current limit), so a fabric can be defined as a network of interconnected switches. A SAN may contain several fabrics; most SANs consist of at least two, one of which serves as a backup.
You can connect servers and storage to a SAN using a single switch, but it is good practice to use two switches to avoid data loss and downtime if one of them fails. Figure 1 shows a typical fabric that uses two switches to connect servers to a disk array.

Fig. 1. The simplest fabric, using two switches.

As the number of servers and storage in your SAN increases, you simply add switches.

Figure 2. SAN Fabric expansion

Modular (regular) switches

SAN switches come in a variety of sizes, from 8 ports up to hundreds. Most modular switches ship with 8 or 16 ports. The latest trend is the ability to grow the port count of a purchased switch in increments of 4. A typical example of such a switch is the QLogic SANbox 5200 (Fig. 3). You can buy this product with 8 ports in the base configuration, then expand it to 16 ports within one module and up to 64 ports (!) in four modules interconnected by 10-Gigabit FC.

Fig 3. Qlogic SANbox 5200 - four-module stack with 64 ports

Director switches

Directors are much more expensive than modular switches and typically contain hundreds of ports (Fig. 4). Directors sit at the center of very large switched fabrics as the core of the network. They have exceptional fault tolerance, keep the entire infrastructure running 24 hours a day, 7 days a week, and allow routine maintenance and module replacement on the fly.

Fig. 4. SilkWorm 12000 (128 ports) and McDATA Intrepid 6140

The director consists of a chassis, hot-swappable port modules (usually 12 or 16 ports each) and hot-swappable processor modules (usually dual-processor). A director can be purchased with 32 ports and expanded to 128-140 ports.
In enterprise SANs, directors are typically used as the core of the network. Modular switches connect to them as terminal (edge) switches, and servers and storage in turn connect to those. This topology is called core-to-edge and lets you scale the network to thousands of ports (Fig. 5).

Fig. 5. Core-edge topology using directors.


SAN routers or multiprotocol switches

SAN routers are used to join remote SAN islands into a single network: for disaster protection, for consolidating storage resources, for backing up data from remote branches to the tape and disk resources of the main data center, and so on (Fig. 6). Consolidating remote SANs into a single resource is the next step in the evolution of storage networks, after SANs have been introduced in the head office and the branches of an enterprise (Fig. 7).

Fig. 6. McDATA Eclipse 1620, 3300 and 4300

Fig. 7. Consolidating remote SANs into a single resource

SAN islands can be connected over the FC protocol using ordinary modular switches or directors, via single-mode optical cable (dark fiber) or multiplexing equipment (DWDM). However, this method will not take you beyond a metropolitan radius of about 70 km. For greater distances you need the Fibre Channel over IP protocol (FCIP, http://www.iscsistorage.com/ipstorage.htm), implemented in McDATA's Eclipse routers (Fig. 6). FCIP wraps each FC frame in an IP packet for transport over the IP network; the receiving side unwraps the IP packet and extracts the original FC frame for further transmission over the local FC network. Here distances are unlimited: it all comes down to the speed of your IP link.
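To make the wrap/unwrap idea concrete, here is a toy sketch in Python. This is not the real FCIP wire format (RFC 3821 defines its own encapsulation header); the code only illustrates that an FC frame travels intact inside an IP byte stream and is recovered unchanged on the far side.

```python
import struct

def fcip_wrap(fc_frame: bytes) -> bytes:
    """Wrap an FC frame for transport over an IP/TCP byte stream.
    Toy 4-byte length prefix stands in for the real FCIP header."""
    return struct.pack("!I", len(fc_frame)) + fc_frame

def fcip_unwrap(packet: bytes) -> bytes:
    """Recover the original FC frame on the receiving side."""
    (length,) = struct.unpack("!I", packet[:4])
    return packet[4 : 4 + length]

frame = b"\x22\x00\x01\x0a" + b"SCSI payload"    # made-up FC frame bytes
assert fcip_unwrap(fcip_wrap(frame)) == frame     # round trip is lossless
```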

FC Cable Types

Fiber-optic or copper cable is used as the physical transmission medium in FC networks. Copper cable is a jacketed twisted-pair cable and was used mainly for local connections in 1 Gbit/s FC networks. Modern 2 Gbit/s FC networks mainly use fiber-optic cable.
There are two types of fiber optic cable: single-mode and multi-mode.

Single mode cable (long wave)

In single-mode (SM) cable there is only one propagation path for the light wave. The core diameter is usually 8.3 microns. Single-mode cables are used where low signal loss and high data rates are required, such as over long distances between two systems or network devices: for example, between a server and a storage system separated by tens of kilometers.

The maximum distance between two FC 2Gbit network nodes connected by a single-mode cable is 80 km without repeaters.

Multimode cable (short wave)

Multimode (MM) cable can carry multiple light waves over a single fiber, since its relatively large core allows light to enter at different angles. Typical MM core diameters are 50 µm and 62.5 µm. Multimode connections are best suited to devices operating over short distances: within an office or a building.

The maximum distance over which multimode cable supports 2 Gbit/s is 300 m (50 µm core) or 150 m (62.5 µm core).
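The distance figures above can be collected into a small lookup, which is handy when sanity-checking a cabling plan. The numbers are the ones quoted in this article for 2 Gbit/s FC; actual limits depend on the optics and vendor specs.

```python
# Max link distances for 2 Gbit/s FC, as quoted in the text (sketch only;
# treat these as the article's figures, not authoritative limits).
MAX_DISTANCE_2G_M = {
    ("single-mode", 8.3): 80_000,   # up to 80 km without repeaters
    ("multimode", 50.0): 300,       # 50 um core
    ("multimode", 62.5): 150,       # 62.5 um core
}

def max_link_m(cable: str, core_um: float) -> int:
    """Return the maximum 2 Gbit/s link length in metres."""
    return MAX_DISTANCE_2G_M[(cable, core_um)]

assert max_link_m("multimode", 50.0) == 300
assert max_link_m("single-mode", 8.3) == 80_000
```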

Cable connector types

FC cable connectors come in several types: SC and LC for optical cable, and HSSDC2 for copper (these appear again in the transceiver list below).

Transceiver types (GBIC types)

Devices that convert light into an electrical signal and back are called transceivers. They are also known as GBICs (Gigabit Interface Converters). On an FC adapter board (FC HBA) the transceiver is usually soldered in; in a switch it is a removable module (see figure); on a storage device it appears in one form or another.

Transceivers are:


SFP-LC and HSSDC2 transceivers (pictured)

Removable transceiver modules (SFP)

HSSDC2: for 1/2 Gbit FC, copper cable
SFP-LC (Small Form-factor Pluggable, LC): 1/2 Gbit FC, short/long wave, for fiber-optic cable with an LC connector
SFP-SC (Small Form-factor Pluggable, SC): 1/2 Gbit FC, short/long wave, for fiber-optic cable with an SC connector

When it came to learning about SAN, I ran into a certain obstacle: basic information is hard to come by. Studying the other infrastructure products you encounter is easier: there are trial versions of the software, the ability to install them in a virtual machine, and a heap of textbooks, reference guides and blogs on the topic. Cisco and Microsoft produce very high-quality textbooks; MS has at least tidied up its hellish attic called TechNet; there is even a book on VMware, albeit only one (and in Russian, at that!), and close to 100% useful. For the storage devices themselves you can get information from seminars, marketing events, documents and forums. But on storage networking there is dead silence. I found two textbooks, but didn't dare buy them. One is "Storage Area Networks For Dummies" (there is such a thing, it turns out; very inquisitive English-speaking "dummies" in the target audience, apparently) for one and a half thousand rubles; the other, "Distributed Storage Networks: Architecture, Protocols and Management", looks more solid, but costs 8200 rubles even with a 40% discount. Along with that book, Ozon also recommends "The Art of Bricklaying."

I don't know what to advise someone who decides to learn even the theory of storage networking from scratch. As practice has shown, even expensive courses can yield zero results. In relation to SAN, people fall into three categories: those who don't know what it is; those who know only that such a thing exists; and those who, when asked "why build two or more fabrics in a storage network," look at you with the same bewilderment as if you had asked "why does a square need four corners?"

I'll try to fill the gap I myself was missing: describe the basics, and describe them simply. I will consider a SAN based on its classic protocol, Fibre Channel.

So, a SAN (Storage Area Network) is designed to consolidate server disk space on dedicated disk storage. The point is that disk resources are thereby used more economically, are easier to manage and perform better. And for virtualization and clustering, where several servers need access to the same disk space, such storage systems are simply indispensable.

Incidentally, translation into Russian causes some confusion in SAN terminology. SAN literally translates as "storage network." In Russia, however, "storage" classically refers to the disk array (Storage Array) itself, which in turn consists of a control unit (Storage Processor, Storage Controller) and disk shelves (Disk Enclosure). In the original terminology, though, the Storage Array is only part of the SAN, albeit sometimes the most significant part. So in Russian usage the storage system turns out to be part of the storage network. Hence storage devices are usually called storage systems, while the network itself is called the SAN (and gets confused with "Sun," but those are trifles).

Components and Terms

Technologically, SAN consists of the following components:
1. Nodes
  • Disk arrays (storage systems): the storage, i.e. targets
  • Servers: the consumers of disk resources, i.e. initiators
2. Network infrastructure
  • Switches (and routers in complex and distributed systems)
  • Cables

Features

Without going into too much detail, the FC protocol is similar to Ethernet, with WWN addresses in place of MAC addresses. Except that instead of Ethernet's two layers, FC has five (of which the fourth is not yet defined, and the fifth maps the FC transport onto the upper-layer protocols carried over FC: SCSI-3, IP). In addition, FC switches run specialized services whose analogues in IP networks usually live on servers. For example: the Domain Address Manager (responsible for assigning Domain IDs to switches), the Name Server (which stores information about connected devices, a kind of WINS analogue inside the switch), and so on.
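Since WWNs play the role MAC addresses do in Ethernet, tooling often needs to compare them across notations (with and without separators). A small sketch of such normalization, assuming the usual 8-byte (16 hex digit) WWPN form; real tools also validate NAA prefixes, which is skipped here.

```python
import re

def normalize_wwpn(wwpn: str) -> str:
    """Normalize a World Wide Port Name to aa:bb:cc:... form.

    A WWN is 8 bytes, i.e. 16 hex digits; input may use colons,
    dashes or no separators at all.
    """
    digits = re.sub(r"[^0-9a-fA-F]", "", wwpn).lower()
    if len(digits) != 16:
        raise ValueError(f"WWPN must contain 16 hex digits, got {wwpn!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 16, 2))

assert normalize_wwpn("50060160-90201E7A") == "50:06:01:60:90:20:1e:7a"
```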

For a SAN, reliability is as key a parameter as performance. After all, if a database server loses its network for a couple of seconds (or even minutes), it is unpleasant but survivable. If at the same moment the disk holding the database or the OS drops off, the effect is far more serious. That is why all SAN components are usually duplicated: ports in the storage devices and servers, the switches, the links between switches and, a key feature of SAN compared to LAN, duplication at the level of the entire network-device infrastructure, the fabric.

A fabric (the word evokes woven cloth, symbolizing the interwoven connection scheme of network and end devices; the term is well established) is a set of switches connected to each other by inter-switch links (ISL, InterSwitch Link).

Highly reliable SANs necessarily include two (and sometimes more) fabrics, since a single fabric is itself a single point of failure. Anyone who has ever watched the consequences of a loop in a network, or of a deft keystroke that puts a core or distribution switch into a coma via a botched firmware update or command, knows what I mean.

Fabrics may have an identical (mirrored) topology or differ. For example, one fabric may consist of four switches and another of just one, with only highly critical nodes connected to it.

Topology

The following fabric topologies are distinguished:

Cascade: switches are connected in series. With more than two of them, the scheme is unreliable and slow.

Ring: a closed cascade. More reliable than a simple cascade, although with many participants (more than four) performance suffers; and a single failure of an ISL or of one of the switches turns the ring back into a cascade, with all the attendant consequences.

Mesh: Full Mesh is when every switch connects to every other switch; it is characterized by high reliability, high performance, and high price. The number of ports required for inter-switch links grows quadratically as switches are added, and at a certain size no ports are left for nodes at all: every one is occupied by an ISL. Partial Mesh is any irregular interconnection of switches.
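The port arithmetic behind that remark is easy to check: a full mesh of n switches needs n(n-1)/2 ISLs in total, and each switch burns n-1 of its own ports on them. A quick sketch:

```python
def full_mesh_isls(switches: int) -> int:
    """Total inter-switch links in a full mesh: one ISL per pair
    of switches, i.e. n*(n-1)/2 (quadratic growth)."""
    return switches * (switches - 1) // 2

def ports_left_for_nodes(switches: int, ports_per_switch: int) -> int:
    """Ports remaining for end devices after every switch dedicates
    (n-1) ports to ISLs."""
    return switches * (ports_per_switch - (switches - 1))

assert full_mesh_isls(4) == 6
# With 16-port switches, a 16-switch full mesh leaves each switch
# only a single port for nodes: 16 node ports in the whole fabric.
assert ports_left_for_nodes(16, 16) == 16
```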

Core/Edge (center/periphery): close to the classic LAN topology, but without a distribution layer. Typically storage is connected to the core switches and servers to the edge switches, although an additional tier of edge switches can be set aside for storage. Storage and servers can also be connected to the same switch to improve performance and reduce response time (this is called localization). The topology is notable for good scalability and manageability.

Zoning

Another technology characteristic of SAN. Zoning defines initiator-target pairs: which servers may access which disk resources, so that it does not turn out that every server sees every available disk. This is achieved as follows:
  • the selected pairs are added to the zones previously created on the switch;
  • zones are placed in zone sets (zone set, zone config) created there;
  • zone sets are activated in the fabric.
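As a rough illustration of those three steps, here is a toy model in Python. Vendor CLIs (Brocade, Cisco, QLogic) each have their own commands for this; the sketch only mirrors the zone / zone set / activation structure, with made-up names and WWPNs.

```python
# Toy model of FC zoning; not any vendor's actual API.
fabric = {"zones": {}, "zone_sets": {}, "active_set": None}

def create_zone(name, initiator_wwpn, target_wwpn):
    # One initiator per zone, paired with its target (per the
    # recommendation later in this article).
    fabric["zones"][name] = {initiator_wwpn, target_wwpn}

def create_zone_set(set_name, zone_names):
    # Step 2: collect zones into a zone set (zone config).
    fabric["zone_sets"][set_name] = list(zone_names)

def activate(set_name):
    # Step 3: activate the zone set in the fabric.
    fabric["active_set"] = set_name

create_zone("z_mail1_array1",
            "10:00:00:00:c9:2e:11:22",   # hypothetical server WWPN
            "50:06:01:60:90:20:1e:7a")   # hypothetical array WWPN
create_zone_set("prod_config", ["z_mail1_array1"])
activate("prod_config")
assert fabric["active_set"] == "prod_config"
```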

For an introductory post on SAN, I think that is enough. I apologize for the motley pictures: I have no way to draw my own at work yet, and no time at home. I did consider drawing them on paper and photographing them, but decided this way is better.

Finally, as a postscript, here are the basic guidelines for SAN fabric design.

  • Design the structure so that there are no more than three switches between two end devices.
  • It is desirable that a fabric consist of no more than 31 switches.
  • Set the Domain ID manually before introducing a new switch into the fabric: this improves manageability and helps avoid duplicate Domain ID problems when, for example, a switch is moved from one fabric to another.
  • Have multiple equivalent routes between each storage device and the initiator.
  • When performance requirements are uncertain, assume a ratio of Nx ports (for end devices) to ISL ports of 6:1 (EMC's recommendation) or 7:1 (Brocade's recommendation). This ratio is called oversubscription.
  • Zoning recommendations:
    - use informative names of zones and zone-sets;
    - use WWPN zoning rather than port-based zoning (i.e., zoning by device addresses, not by physical ports of a specific switch);
    - each zone - one initiator;
    - clear "dead" zones out of the fabric.
  • Have a reserve of free ports and cables.
  • Keep spare equipment (switches): mandatory at the site level, and possibly at the fabric level.
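The oversubscription guideline above is simple arithmetic, and can be sketched like this (the 6:1 and 7:1 limits are the EMC and Brocade figures cited in the list):

```python
def oversubscription(nx_ports: int, isl_ports: int) -> float:
    """Ratio of end-device (Nx) ports to ISL ports on a switch."""
    return nx_ports / isl_ports

def within_guideline(nx_ports: int, isl_ports: int, limit: float = 6.0) -> bool:
    """True if the switch meets the vendor guideline
    (6:1 per EMC, 7:1 per Brocade)."""
    return oversubscription(nx_ports, isl_ports) <= limit

# A 16-port switch with 2 ISLs and 14 node ports gives 7:1:
# acceptable under Brocade's guideline, over EMC's.
assert within_guideline(14, 2, limit=7.0)
assert not within_guideline(14, 2, limit=6.0)
```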

If you manage your own infrastructure in your own data center, you will have to choose among various storage offerings. The choice of storage solution depends largely on your requirements, and before settling on a specific option for your use case, it helps to understand the technology a little.

I was actually going to write an article about object storage (the hottest storage option in the cloud), but before we discuss that part of the storage arena, I thought it better to cover the two main storage methods that have coexisted for a very long time and that companies use in-house for their needs.

Deciding your storage type will depend on many factors, such as the following.

  • The type of data you want to store
  • The usage pattern
  • Scaling requirements
  • And, finally, your budget

When you start your career as a system administrator, you often hear colleagues talk about various storage methods: SAN, NAS, DAS and so on. Without a little digging, you are bound to be confused by the different storage terms, and the confusion often arises from the similarities between the approaches. The only hard-and-fast rule for staying on top of the terminology is to keep reading (especially about the concepts behind each particular technology).

Today we will discuss two different methods that define the storage structure in your environment. Your choice between the two for your architecture should depend only on your use case and the type of data you store.

At the end of this tutorial, I hope you will have a clear understanding of the two main storage methods and which to choose for your needs.

SAN (Storage Area Network) and NAS (Network Attached Storage)

Below are the main differences between each of these technologies.

  • How the storage is connected to the system; in short, how the connection is made between the accessing system and the storage component (direct-attached or network-attached)
  • The type of cable used for the connection, such as Ethernet or Fiber Channel
  • How input and output requests are performed; in short, the protocol used to perform I/O requests (SCSI, NFS, CIFS, etc.)

Let's discuss SAN first and then NAS, and at the end let's compare each of these technologies to clear up the differences between them.

SAN (Storage Area Network)

Today's applications are very resource-intensive because of the number of requests they must process simultaneously every second. Take an e-commerce website where thousands of people place orders every second, all of which must be stored reliably in a database for later retrieval. The storage technology behind such a high-traffic database must be fast at serving and responding to queries (in short, fast at I/O).

In such cases (when you need high performance and fast I/O), we can use SAN.

SAN is nothing but a high-speed network that makes connections between storage devices and servers.

Traditionally, application servers used their own storage devices attached directly to them and talked to these devices using a protocol known as SCSI (Small Computer System Interface). SCSI is simply a standard for communication between servers and storage devices; all regular hard drives, tape drives and so on use it. In the beginning, a server's storage requirements were met by devices installed inside the server, which the server accessed over SCSI, much the way an ordinary desktop talks to its internal hard drive.

Devices such as CD-ROM drives are likewise connected to the server (as part of the server) using SCSI. SCSI's main advantage for attaching devices to a server was its high throughput. Although this architecture is sufficient for modest requirements, it has several limitations:

  • The server can only access data on devices that are directly attached to it.
    If something happens to the server, data access fails (since the storage device is part of the server and attached to it via SCSI).
  • There is a limit on the number of storage devices the server can access. If the server needs more capacity, no more devices can be attached, since a SCSI bus accommodates only a finite number of them.
  • The server using SCSI storage must also be close to the storage device, since parallel SCSI, the common implementation in most computers and servers, has distance limits: it works up to about 25 meters.

Some of these limitations can be overcome with DAS (Direct Attached Storage). The medium used to connect the storage directly to the server can be any of SCSI, Ethernet, Fiber Channel, etc. Low complexity, low investment and ease of deployment led many to adopt DAS for ordinary requirements, and the solution performed well too when used with faster media such as Fiber Channel.

Even an external USB drive attached to a server is conceptually a DAS, since it is directly connected to the server's USB bus, though USB drives are not usually used this way because of the bus's speed limits. Typically, heavy-duty DAS systems use SAS (Serial Attached SCSI) as the medium. Internally, the storage device may use RAID (which is usually the case) or something similar to present capacity to the servers. Current SAS storage options provide speeds of 6 Gbit/s.

An example of a DAS storage device is the MD1220 from Dell.

On the server, the DAS storage will look very similar to your own drive or an external drive that you have connected.

Although DAS is good for ordinary needs and performs well, it has limitations, such as the number of servers that can access it. The DAS storage device must also stay close to the server (in the same rack, or within the distance permitted by the medium in use).

It can be argued that direct attached storage (DAS) is faster than any other storage method, because it avoids some of the overhead of transferring data over a network (all transfers happen on a dedicated connection between the server and the storage device, essentially serial SCSI or SAS). However, thanks to recent improvements in Fiber Channel and various caching mechanisms, SAN now provides speeds comparable to DAS and in some cases exceeds them.

Before getting into SAN, let's look at the types of media and methods used to connect storage devices. (When I say storage device, please don't picture a single hard drive; think of an array of disks, perhaps at some RAID level, something like a Dell MD1200.)

What are SAS (Serial Attached SCSI), FC (Fibre Channel) and iSCSI (Internet Small Computer System Interface)?

Traditionally, SCSI devices such as internal hard drives are attached to a shared parallel SCSI bus, meaning that all attached devices use the same bus to send and receive data. Shared parallel connections are not good for high speeds and make fast transfers problematic. A serial connection between the device and the server, by contrast, increases overall throughput: SAS gives each disk a dedicated 300 MB/s link, whereas a SCSI bus shares one speed among all attached devices.

SAS uses the same SCSI command set to send and receive data from a device. And please don't assume that SCSI is used only for internal storage; it is also used to attach external storage devices to a server.

If data transfer performance and reliability are what drive the choice, SAS is the best solution. In reliability and error rate, SAS drives are much better than older SATA drives. SAS was designed with performance in mind, making it full duplex: data can be sent and received simultaneously over a SAS link. Also, a single SAS host port can connect to several SAS drives through expanders. SAS uses point-to-point serial links between devices (storage devices such as disk drives and disk arrays) and hosts.

The first generation of SAS provided 3 Gbit/s speeds; the second improved this to 6 Gbit/s; and the third generation (currently used by many organizations for extremely high throughput) raised it to 12 Gbit/s.
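A quick way to relate these line rates to usable throughput: assuming 8b/10b encoding (10 line bits per data byte), which to my knowledge all three SAS generations named here use, the arithmetic looks like this:

```python
def sas_throughput_mb_s(line_rate_gbit: float) -> float:
    """Usable throughput of a SAS link, assuming 8b/10b encoding:
    10 line bits carry each data byte."""
    return line_rate_gbit * 1e9 / 10 / 1e6  # MB/s

assert sas_throughput_mb_s(3) == 300.0    # SAS-1: the 300 MB/s per disk above
assert sas_throughput_mb_s(6) == 600.0    # SAS-2
assert sas_throughput_mb_s(12) == 1200.0  # SAS-3
```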

Fiber Channel Protocol

Fiber Channel is a relatively new interconnect technology used for fast data transfer. The main purpose of its design is to enable data transfer at higher speeds with very low/negligible latency. It can be used to connect workstations, peripherals, storage arrays, etc.

The main factor that distinguishes Fiber Channel from other connection methods is that it can carry both network and I/O communications over the same channel, using the same adapters.

ANSI (the American National Standards Institute) standardized Fiber Channel in 1988. When we say "fiber" (in Fiber Channel), do not think it supports only optical media: here "fiber" denotes any medium used for Fiber Channel protocol connections. You can even use copper wire at a lower cost.

Note that the ANSI Fiber Channel standard supports networking, storage and data transfer. Fiber Channel does not know what type of data you are transmitting: it can carry SCSI commands encapsulated in Fiber Channel frames (it has no I/O commands of its own for reading and writing storage). Its main advantage is that it can carry widely used protocols such as SCSI and IP inside it.

The components of a Fiber Channel connection are listed below; this is the minimum required for a point-to-point link, typically a direct connection between a storage array and a host.

  • HBA (Host Bus Adapter) with a Fiber Channel port
  • A driver for the HBA card
  • Cables to connect the devices to the Fiber Channel HBA

As mentioned earlier, the SCSI protocol is encapsulated within Fiber Channel. The SCSI data typically has to be repackaged into a format that Fiber Channel can deliver to its destination; when the receiver gets the data, it converts it back to SCSI.

You might wonder why we need this mapping and unmapping, and why we can't simply use SCSI to deliver the data directly. The reason is that SCSI cannot deliver data over long distances to many devices (or many hosts).

A fiber link can connect systems up to 10 km apart (with optical fiber; you can extend the distance further by placing repeaters in between). You can also transmit over copper wire for up to 30 m to reduce the cost of a Fiber Channel installation.

With the advent of Fiber Channel switches from many major vendors, connecting large numbers of storage devices and servers has become an easy task (as long as you have the budget). Fiber Channel's networking capability led to the widespread implementation of SANs (Storage Area Networks) for fast, long-distance, reliable data access. Most computing environments that need to move large amounts of data quickly use a Fiber Channel SAN with fiber-optic cables.

The current fiber channel standard (called 16GFC) can transfer data at 1600 MB/s (remember, this standard was released in 2011). Upcoming standards are expected to provide speeds of 3200 MB/s and 6400 MB/s in the coming years.

iSCSI (Internet Small Computer System Interface)

iSCSI is nothing but an IP-based standard for interconnecting storage arrays and hosts. It is used to carry SCSI traffic over IP networks, and it is the simplest and cheapest (though not the best) solution for connecting to a storage device.

It is a great technology for location-independent storage, because it can reach the storage device over local and wide area networks; it is a storage network interconnection standard. It requires no special cables or equipment, unlike a Fiber Channel network.

For a system using a storage array with iSCSI, the storage appears as a locally attached disk. This technology emerged after fiber channel and was widely adopted due to its low cost.

It is a network protocol that runs on top of TCP/IP, so you can guess that its performance is not great compared to Fiber Channel (simply because everything runs over TCP, without special hardware or changes to your architecture).

iSCSI also introduces a little CPU overhead on the server, because the server must do extra processing for all storage requests carried over ordinary TCP.

iSCSI has the following disadvantages compared to Fiber Channel:

  • iSCSI introduces slightly more latency compared to fiber channel due to IP header overhead
  • Database applications involve small reads and writes; when iSCSI runs on the same LAN that carries other ordinary (non-iSCSI) infrastructure traffic, this results in read/write latency and poor performance
  • Maximum speed/throughput is limited by the speed of your Ethernet and network. Even if you combine multiple links, it doesn't scale to the fiber channel level.

NAS (Network Attached Storage)

The simplest definition of NAS is “Any server that shares its own storage with others on a network and acts as a file server is the simplest form of NAS.”

Please note that network attached storage shares files over the network, not storage devices.

A NAS uses an Ethernet connection to share files over the network. The NAS device has an IP address and is reached over the network through that address. When you access files on a file server from your Windows system, that is basically NAS.

The main difference lies in how your computer or server treats a particular piece of storage. If the computer treats the storage as part of itself (much as when you attach a DAS to your server), in other words, if the server's processor is responsible for managing the attached storage, it is a form of DAS. If the computer or server treats the attached storage as another computer sharing its data over the network, it is a NAS.

Direct attached storage (DAS) is treated like any other peripheral, such as a mouse or keyboard, since it is a device directly attached to the server. A NAS, however, is another server: hardware with its own computing capability that shares its own storage with others.

Even SAN storage can be considered hardware with its own processing power, so the main difference between NAS, SAN and DAS is how the server sees the storage. A DAS device appears to the server as part of itself; even though the DAS enclosure may sit outside the server (usually a separate device with its own disk array), the server still sees it as its own internal storage.

When we talk about NAS, we should speak of shares rather than storage devices, because a NAS appears on the network as a shared folder, not a shared device. Don't forget that NAS devices are computers in their own right that share their storage with others. When you share a folder with access control using Samba, that is NAS.

Although NAS is a cheaper option for your storage needs, it is really not suitable for high-performance, enterprise-level applications. Never consider putting database storage (which must be fast) on a NAS. The main disadvantages of NAS are performance and dependence on the network (in most cases, the LAN carrying normal traffic is also used to reach the NAS, making it more congested).

When you share NFS export over a network, it is also a form of NAS.

NAS is nothing more than a device/appliance/server attached to a TCP/IP network that shares its own storage with others. Digging a little deeper: when a file read/write request is sent to a NAS share mounted on a server, the request travels over the network as CIFS (Common Internet File System) or NFS (Network File System). The NAS device, on receiving the NFS or CIFS request, converts it into a set of local storage I/O commands. This is why a NAS device has its own processing power.

So NAS is file-level storage (it is basically a file-sharing technology). It hides the actual file system under the hood and gives users an interface to its shared storage via NFS or CIFS.

A common use for a NAS that you may find is to provide each user with a home directory. These home directories are stored on the NAS device and mounted on the computer where the user logs in. Because the home directory is available on the network, the user can log in from any computer on the network.

Benefits of NAS

  • NAS has a less complex architecture compared to SAN
  • It is cheaper to deploy on existing architecture.
  • No changes are required to your architecture as a normal TCP/IP network is the only requirement

Disadvantages of NAS

  • NAS is slow
  • Low throughput and high latency, making it unsuitable for high-performance applications

Return to SAN

Now let's get back to the SAN (Storage Area Network) discussion we started at the beginning.

The first and most important thing to understand about SAN (besides what we already discussed at the beginning) is that it is a block-level storage solution, optimized for high-volume block-level data transfer. A SAN performs best in a Fiber Channel environment (optical fiber and Fiber Channel switches).

The name “storage area network” implies that the storage sits on its own dedicated network. Hosts connect to the storage devices using either Fiber Channel or TCP/IP networking (a SAN running over TCP/IP uses iSCSI).
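"Block-level" means the commands on the wire address disk blocks, not files. A minimal sketch of what that looks like: the SCSI READ(10) command descriptor block, the kind of block-addressed command a SAN carries inside Fiber Channel frames or iSCSI PDUs. The layout (opcode 0x28, 4-byte big-endian logical block address, 2-byte transfer length in blocks) follows the SCSI block command standard; the helper name is our own.

```python
import struct

def read10_cdb(lba, num_blocks):
    """Build a SCSI READ(10) CDB: opcode 0x28, flags, 4-byte
    big-endian LBA, group number, 2-byte transfer length (in
    blocks), control byte -- 10 bytes in total."""
    return struct.pack(">BBIBHB", 0x28, 0, lba, 0, num_blocks, 0)

# Ask for 8 blocks starting at logical block 2048 -- no file
# names or paths anywhere, only block addresses.
cdb = read10_cdb(lba=2048, num_blocks=8)
```

Contrast this with the NAS example earlier, where the request carried a file path and byte offset: here the initiator must already know which blocks it wants, because the file system lives on the host, not on the storage.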

A SAN can be thought of as a technology that combines the best features of both DAS and NAS. If you remember, DAS appears on a computer as its own storage device, and DAS is also a block-level storage solution (we never mentioned CIFS or NFS when discussing DAS). NAS, in turn, is known for its flexibility, network access, access control, and so on. A SAN combines the best of both worlds because...

  • SAN storage also appears on the server as its own storage device
  • It is a block-level storage solution
  • It offers good performance/speed
  • It works over a network, using iSCSI

SAN and NAS are not competing technologies; they are designed for different needs and tasks. Because SAN is a block-level storage solution, it is best suited for high-performance storage such as databases, email stores, and the like. Most modern SAN solutions also provide disk mirroring, archiving, backup and replication functions.

A SAN is a dedicated network of storage devices (it can include tape drives, RAID arrays, etc.) that work together to provide superior block-level storage, while a NAS is a single device/server/computing appliance that shares its own storage over the network.

Key differences between SAN and NAS

SAN | NAS
Block-level data access | File-level data access
Fiber Channel is the primary medium | Ethernet is the primary medium
SCSI is the primary I/O protocol | NFS/CIFS is the primary I/O protocol
Storage appears to the computer as its own local storage | Storage mounts on the computer as a shared folder
Excellent speed and performance when used with optical fiber | Performance can degrade when the network also carries other traffic (which is usually the case)
Used primarily for high-performance data storage | Used for small reads and writes over long distances

7 SAN building blocks

The previous sections provide an overview of Fiber Channel topologies and protocol. Now let's look at the various devices and components that are used to create Fiber Channel storage networks. The main structural elements of a SAN include:

■ bus adapters;

■ Fiber Channel cables;

■ connectors;

■ Connectivity devices, which include hubs, switches, and fabric switches.

Note that all addressable components within a Fiber Channel SAN have unique World Wide Names (WWNs), analogous to unique MAC addresses. A WWN in the Fiber Channel specification is a 64-bit number written as XX:XX:XX:XX:XX:XX:XX:XX. The IEEE assigns each manufacturer a specific address range, and the manufacturer is responsible for uniquely allocating the assigned addresses.
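The 64-bit-number-to-colon-notation mapping is easy to show concretely. A small sketch, with helper names of our own and an example value that is purely illustrative (not a real assigned WWN):

```python
def format_wwn(value):
    """Render a 64-bit WWN in the conventional
    XX:XX:XX:XX:XX:XX:XX:XX colon-separated form."""
    if not 0 <= value < 2**64:
        raise ValueError("WWN must fit in 64 bits")
    # Emit the 8 octets from most to least significant
    return ":".join(f"{(value >> shift) & 0xFF:02x}"
                    for shift in range(56, -8, -8))

def parse_wwn(text):
    """Inverse of format_wwn: colon notation back to an integer."""
    parts = text.split(":")
    if len(parts) != 8:
        raise ValueError("expected 8 colon-separated octets")
    value = 0
    for p in parts:
        value = (value << 8) | int(p, 16)
    return value

wwn = 0x21000024FF3DE17C  # illustrative value only
text = format_wwn(wwn)
```

The leading octets of a real WWN encode the naming format and the IEEE-assigned vendor identifier, which is how manufacturers keep their allocations disjoint.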

7.1 Bus adapters

A host bus adapter (HBA) is installed in a computer and provides connectivity to storage devices. In the world of personal computers running Windows, bus adapters usually plug into the PCI bus and can provide connectivity for IDE, SCSI and Fiber Channel devices. Bus adapters operate under the control of a device driver, e.g. a SCSIPort or Storport miniport driver.
When initialized, the HBA port registers with the fabric switch (if present) along with the attributes stored on it. The attributes are available to applications through an API from the switch or bus adapter manufacturer. The SNIA (Storage Networking Industry Association) is working on a standardized API on top of the various vendor APIs.
For storage area networks that have high fault tolerance requirements, some HBA manufacturers provide additional capabilities, such as automatic switching to another HBA if the primary one fails.
In a shared ring, only two devices can exchange data at any given time. Suppose one of them is an HBA connected to a host and receiving data from a storage device. If that same adapter is instead connected to a switched-fabric SAN, it can have read requests outstanding to multiple storage devices at the same time.

Responses to these requests can arrive in any order. A fabric switch typically services its ports round-robin, which further complicates the HBA's task: packets arrive in such an order that each subsequent packet may come from a different source.
Bus adapters solve this problem in one of two ways. The first strategy, called store and sort, stores the data in host memory and then sorts the buffers using the central processor. This is obviously inefficient from a CPU point of view, with the overhead of a context switch every few tens of microseconds. The other strategy, on the fly, uses additional logic and chips on the bus adapter itself, allowing context switches without spending CPU cycles; with this strategy, context switches typically occur only every few seconds.
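The store-and-sort strategy can be illustrated with a toy model. This is a hypothetical sketch, not vendor code: frames from several storage devices arrive interleaved (as round-robin service at the switch would deliver them), are stored first, and are then sorted per source before being handed to the host — the sorting pass is the CPU cost the text describes.

```python
from collections import defaultdict

def store_and_sort(frames):
    """frames: (source_id, seq_no, payload) tuples in arrival
    order. Store everything, then sort each source's frames by
    sequence number and reassemble the per-source data stream."""
    buffers = defaultdict(list)
    for src, seq, payload in frames:      # store phase
        buffers[src].append((seq, payload))
    result = {}
    for src, items in buffers.items():    # sort phase (CPU work)
        result[src] = b"".join(p for _, p in sorted(items))
    return result

# Frames from two disks arriving interleaved and out of order
arrivals = [("disk_a", 1, b"AA"), ("disk_b", 1, b"BB"),
            ("disk_a", 0, b"aa"), ("disk_b", 0, b"bb")]
streams = store_and_sort(arrivals)
```

An on-the-fly HBA does the equivalent demultiplexing in hardware as each frame arrives, which is why it avoids burning CPU cycles on the sort phase.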
One buffer reservation allows one Fiber Channel frame to be sent; before sending the next frame, the sender must receive a Receiver Ready signal. To use a Fiber Channel link effectively, multiple frames must be in flight simultaneously, which requires multiple reservations and therefore more memory for receiving frames. Some HBAs have four 1 KB buffers and two 2 KB buffers, while some high-end adapters have 128 KB to 256 KB for buffer reservation. Note that this memory is typically dual-ported, i.e. while one memory region is receiving data from the Fiber Channel SAN, other regions can be sending data to the host PCI bus.
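This is credit-based flow control, and the mechanics are easy to model. A toy sketch under our own names (`CreditedLink` is not a real API): the sender may have at most `credits` unacknowledged frames in flight, and each Receiver Ready returns one credit — so the number of reserved buffers directly caps link utilization.

```python
class CreditedLink:
    """Toy model of Fiber Channel buffer-to-buffer credit
    flow control."""

    def __init__(self, credits):
        self.credits = credits    # one credit per reserved buffer
        self.delivered = []

    def send(self, frame):
        if self.credits == 0:
            return False          # stalled: must wait for R_RDY
        self.credits -= 1
        self.delivered.append(frame)
        return True

    def receiver_ready(self):
        self.credits += 1         # receiver freed one buffer

# Two reserved buffers -> at most two frames in flight
link = CreditedLink(credits=2)
sent = [link.send(f) for f in ("f0", "f1", "f2")]  # third stalls
link.receiver_ready()             # R_RDY arrives, credit restored
resumed = link.send("f2")
```

With only a handful of credits, a long or slow link spends much of its time stalled waiting for Receiver Ready, which is why high-end adapters dedicate far more memory to buffer reservations.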
Additionally, HBAs play a role in failover and disaster recovery architectures that provide multiple I/O paths to a single storage device.

7.1.1 Windows operating system and bus adapters

In Windows NT and Windows 2000, Fiber Channel adapters are treated as SCSI devices, and their drivers are written as SCSI miniport drivers. The problem is that the SCSIPort driver is outdated and does not support the capabilities of newer SCSI devices, let alone Fiber Channel devices. Windows Server 2003 therefore introduced a new driver model, Storport, to replace SCSIPort, especially for SCSI-3 and Fiber Channel devices. Note that Windows uses Fiber Channel drives as DAS devices, which is made possible by the abstraction layer provided by the SCSIPort and Storport drivers.

7.1.2 Dual routes

Sometimes increased performance and reliability are required, even at the cost of a more expensive solution. In such cases, the server is connected to dual-port disks through multiple HBAs and multiple independent Fiber Channel SANs. The main idea is to eliminate any single point of failure in the network. In addition, while the system is operating normally, the multiple routes can be used to balance load and improve performance.
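The dual-route idea — load balancing in normal operation, failover when a path dies — can be sketched as a simple path selector. This is a hypothetical illustration (names are ours, real multipath drivers are far more involved): round-robin across healthy paths, skipping any that have failed.

```python
class MultipathDevice:
    """Toy multipath I/O path selector: round-robin for load
    balancing, skip failed paths for fault tolerance."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.failed = set()
        self._next = 0

    def pick_path(self):
        for _ in range(len(self.paths)):
            path = self.paths[self._next % len(self.paths)]
            self._next += 1
            if path not in self.failed:
                return path
        raise IOError("all paths to the storage device are down")

# Two HBAs, each on its own independent fabric
dev = MultipathDevice(["hba0:fabric_a", "hba1:fabric_b"])
first = dev.pick_path()                 # alternates between paths
second = dev.pick_path()
dev.failed.add("hba0:fabric_a")         # simulate HBA/fabric failure
after_failure = dev.pick_path()         # I/O continues on fabric_b
```

Because the two fabrics share no components, a failure of one HBA, cable, or switch leaves the other route intact.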

7.2 Fiber Channel cable types

Two types of cable are mainly used: optical and copper. Their main advantages and disadvantages are listed below.

■ Copper cables are cheaper than optical cables.

■ Optical cables support higher data transfer rates than copper cables.

■ Copper cable can only be used over shorter distances, up to 30 meters, whereas optical cable can be used at distances of up to 2 kilometers (multimode cable) or up to 10 kilometers (single-mode cable).

■ Copper cable is more susceptible to electromagnetic interference and interference from other cables.

■ Optical data typically must be converted into electrical signals for transmission through a switch and back into optical form for further transmission.
There is only one type of copper cable, unlike optical cable, which comes in two types: multimode and single-mode.
For short distances, multimode cable is used, with a core diameter of 50 or 62.5 microns (a micron is a micrometer, one millionth of a meter). Multimode cable carries light with a wavelength of 780 nanometers, which is not used in single-mode cables. For long distances, single-mode cable with a core diameter of 9 microns is designed; it carries a light beam with a wavelength of 1300 nanometers. Although this chapter is about the Fiber Channel interface, it is worth mentioning that these cables can also be used to build networks based on other interfaces, such as Gigabit Ethernet.

7.3 Connectors

Because Fiber Channel supports multiple cable types (and media technologies), devices such as bus adapters, interconnect devices, and storage devices ship with interchangeable connectors for the different media, which reduces overall cost. There are several types of connectors designed for different transmission media and interfaces.

■ Gigabit interface converters (GBICs) convert between the serial data on the link and a 20-bit parallel interface to the device. GBICs are hot-pluggable, i.e. inserting or removing a GBIC does not affect the operation of other ports.

■ Gigabit link modules (GLMs) provide similar functionality to GBICs but require the device to be powered down to install. On the other hand, they are somewhat cheaper than GBICs.

■ Media Interface Adapters are used to convert signals between copper and optical media and vice versa. Media interface adapters are typically used in HBAs, but can also be used on switches and hubs.

■ Small Form Factor Adapters (SFF) allow you to place more connectors of different interfaces on a board of a certain size.

7.4 Interface devices

Interconnection devices connect the components of storage networks. These devices range from low-cost Fiber Channel hubs to expensive, high-performance, managed fabric switches.

7.4.1 Fiber Channel arbitrated loop hubs

FC-AL hubs are a cost-effective option for connecting multiple Fiber Channel nodes (storage devices, servers, computer systems, other hubs and switches) into a ring configuration. Hubs typically provide between 8 and 16 ports. The hub can support different transmission media, such as copper or optical.
Fiber Channel hubs are passive devices, i.e. the other devices in the ring cannot detect their presence. Hubs provide the following features:

■ internal connections, which allow any port to connect to any other port;

■ the ability to bypass the port to which a malfunctioning device is connected.
The biggest problem with hub ports is that at any given time the ring can support only one Fiber Channel connection. The figure shows that if port 1 is granted control to establish a session with port 8, no other port can transmit data until the established session ends.
Hubs can be connected to Fiber Channel switches without modification. You can also create a cascade of hubs by connecting two hubs with a cable.
FC-AL hubs dominate the Fiber Channel market, but Fiber Channel fabric switches are becoming increasingly popular as costs drop.
FC-AL hubs are created by companies such as Gadzoox Networks, Emulex and Brocade.

7.4.2 Fiber Channel arbitrated loop switches

The most significant advantage of FC-AL switches over hubs is that they support multiple connections simultaneously, whereas hubs only support one connection at a time.

Fig. Fiber Channel hub

The ability to support multiple simultaneous connections brings its own challenges. Devices connected to a ring switch are not even "aware" of its role. Ring switches participate in both data transmission and ring addressing. More on this below, along with a look at the role of these switches in SANs and how vendors are adding new features to their products.

Fig. Fiber Channel switch

Ring switches and data transmission

A server that wants to access a storage device must send an arbitration request to take control of the ring. In a normal hub-based FC-AL ring, each device receives the arbitration packet before it returns to the server HBA, which then gives the server control of the ring. A ring switch instead responds with success immediately, without forwarding the request to the other nodes. At that point the HBA sends an Open packet addressed to the storage device's port, which the ring switch forwards. If that port is not transmitting data at the time, no particular problems arise; otherwise, conflicts are possible. To solve this problem, the ring switch must provide buffers to temporarily store frames destined for the busy port. Some switch vendors provide 32 buffers per port for this purpose.
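The per-port buffering described above amounts to a bounded queue in front of each port. A toy sketch under our own names (`PortBuffer` is illustrative, not a vendor API), using the 32-buffers-per-port figure from the text as the default depth:

```python
from collections import deque

class PortBuffer:
    """Toy model of per-port frame buffering in an FC-AL switch:
    frames for a busy port are queued until the port frees up."""

    def __init__(self, depth=32):
        self.queue = deque()
        self.depth = depth
        self.busy = False

    def offer(self, frame):
        if not self.busy:
            return frame              # port free: deliver at once
        if len(self.queue) < self.depth:
            self.queue.append(frame)  # hold until the port frees up
            return None
        raise OverflowError("port buffers full; hold frame upstream")

    def drain(self):
        """Port finished its session: release queued frames."""
        self.busy = False
        frames = list(self.queue)
        self.queue.clear()
        return frames

port = PortBuffer(depth=32)
port.busy = True                  # port is mid-session elsewhere
queued = port.offer("open_frame") # buffered, not delivered
drained = port.drain()            # session ends, frame released
```

When the buffers fill, the switch must fall back on flow control toward the sender, which is why buffer depth is a headline specification for these devices.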

Ring switches and FC-AL addressing

FC-AL hubs play no role in assigning addresses to devices; they simply pass address frames around the ring, and the same is true of most switches. However, some devices may insist on receiving a specific address. Some hubs can control the order in which ports are initialized, allowing a specific port to initialize first so that the device connected to it obtains the required address.

Switches and Ring Initialization

The FC-AL protocol requires ring reinitialization whenever a device is connected, disconnected, or reinitialized. Reinitializing the ring this way can disrupt existing communication between two other devices. Some switch manufacturers therefore provide selective screening and forwarding of LIP (Loop Initialization Primitive) packets. This minimizes disruption, reduces ring reinitialization time, and preserves existing data sessions where possible. At the same time, the uniqueness of device addresses must be ensured.
If all devices participate in ring reinitialization, address duplication does not occur, because each device "protects" its own address. If some devices do not participate, however, the switch must prevent already-allocated addresses from being assigned to the devices that do participate. This address uniqueness is ensured by additional logic in the ring switch. When a storage device is added, a LIP packet must be sent to the server, but LIP need not be sent to storage devices that never communicate with other storage devices.
Some storage devices can communicate directly with other storage devices, which is used to back up data.

Ring switches and fabric architecture

If all devices in the ring "know" about the fabric architecture, the ring switch simply forwards the necessary frames, such as Fabric Login frames, in the normal way. If the devices in the ring do not support the fabric architecture, the ring switch must take on a significant amount of work itself.
Some vendors' ring switches do not support cascading. Additionally, some ring switches require a firmware update before connecting to fabric switches. Some switches must be upgraded to fully support the fabric architecture before connecting them to the SAN.
FC-AL switches are manufactured by companies such as Brocade, McDATA, Gadzoox Networks, Vixel and QLogic.

7.4.3 Fiber Channel switches

Fiber Channel fabric switches (FC-SW) support multiple simultaneous high-speed sessions among all attached devices. At the moment, mainstream switches support speeds of about 1 Gbps, and 2 Gbps is no longer a rarity. In general, fabric switches cost more per port than hubs and FC-AL switches, but they provide much more functionality.
Fabric switches are also more efficient than hubs and FC-AL switches. For example, they provide the special services described above, implement flow control through basic control packets, and, more importantly, some can emulate FC-AL functionality to provide backward compatibility with older devices.
Some fabric switches support cut-through (unbuffered) routing. The idea is that as soon as the frame header is received, the switch looks up the destination while the rest of the frame is still arriving. The advantage of this approach is lower frame-delivery latency and no need to store the frame contents in buffer memory; the disadvantage is that all frames, including damaged ones, are forwarded immediately.
Fabric switches play an important role in the security of Fiber Channel storage networks.

7.4.4 Comparison of three connection devices

The table summarizes the functionality and differences between the three types of Fiber Channel devices.

7.4.5 Bridges and routers

In this chapter and throughout the article, the terms bridge and router do not refer to traditional Ethernet bridges and IP routers; here they refer to Fiber Channel devices, not Layer 2 and Layer 3 network devices.
Bridges are devices that provide interoperability between Fiber Channel and legacy protocols such as SCSI. Fiber Channel-to-SCSI bridges let you preserve your existing investment in SCSI storage: such a bridge has both SCSI and Fiber Channel interfaces and translates between the two protocols, so a new server with a Fiber Channel HBA can access existing SCSI storage devices. A bridge connects one parallel SCSI bus to one Fiber Channel interface; routers provide the same capability for multiple SCSI buses and Fiber Channel interfaces. Storage routers, or smart bridges, add capabilities such as LUN masking and mapping and support for the SCSI Extended Copy command. Using Extended Copy, a router can copy data directly between a specified target device and an attached storage library, a feature also called serverless backup.
Examples of router and bridge manufacturers include Crossroads Systems, Chaparral Network Storage, Advanced Digital Information Corporation (ADIC, after acquiring Pathlight), and MTI.