• There are a couple of points I think need to be clarified.


    In the article, two of the three types of storage, SAN and NAS, are treated as though they were the same.


    SAN stands for Storage Area Network.

    NAS stands for Network Attached Storage.


    They are not the same.


    A NAS uses the same network as regular Ethernet traffic, so disk I/O contends with Google searches, business applications, and downloads from sites we never, ever visit. A SAN uses an entirely separate, dedicated network (typically Fibre Channel) for all storage I/O, so there is no contention with non-storage traffic.
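
    If you want to see that contention for yourself, a rough sketch like the one below (Python, standard library only) can compare read latency on a NAS mount against a SAN-backed volume while the office network is busy. The /mnt/nas_share and /mnt/san_lun paths are placeholders for your own environment, and a rigorous test would bypass the page cache and use a purpose-built tool such as fio; this is only meant to illustrate the idea.

        # Rough read-latency probe. The two paths below are hypothetical
        # placeholders for a file on a NAS mount and on a SAN-backed volume.
        # Note: the OS page cache will flatter repeated runs; for a fair
        # comparison use a cold cache or a purpose-built tool such as fio.
        import os
        import time

        def probe_read_latency(path, block_size=8192, blocks=1000):
            """Time `blocks` sequential reads of `block_size` bytes from `path`."""
            latencies = []
            with open(path, "rb") as f:
                for _ in range(blocks):
                    start = time.perf_counter()
                    data = f.read(block_size)
                    latencies.append(time.perf_counter() - start)
                    if not data:          # hit end of file early
                        break
            avg_ms = 1000 * sum(latencies) / len(latencies)
            print(f"{path}: {len(latencies)} reads, "
                  f"average {avg_ms:.3f} ms per {block_size}-byte read")

        if __name__ == "__main__":
            for test_file in ("/mnt/nas_share/testfile.bin", "/mnt/san_lun/testfile.bin"):
                if os.path.exists(test_file):
                    probe_read_latency(test_file)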


    The Storage Networking Industry Association (SNIA) is a very good source for SAN and NAS information (http://www.snia.org). Particularly helpful is their Technical Tutorials page at http://www.snia.org/education/tutorials/.


    Do you need to “build the physical arrays” (i.e., map a set of drives so that they act together, sometimes known as ‘binding’), or not?


    That depends entirely on the SAN vendor. Some vendors offer systems that do not require building physical arrays at all, either by hand or behind the scenes, and that do so without adding complexity or making troubleshooting harder.


    In fact, treating all of the drives in a SAN as one huge pool of storage, with a different RAID level available for each logical drive, means that any logical drive seen by any host can maximize the number of drives used for any given I/O.
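
    A hypothetical back-of-the-envelope model of why that matters is sketched below. The workload and per-spindle IOPS figures are made up for illustration, and the model ignores RAID write penalties and cache hits; the point is simply that spreading the same workload over more spindles drops the load per spindle in direct proportion.

        # Toy model: per-drive load when one workload is striped across
        # 4 drives, 12 drives, or a whole (hypothetical) 48-drive pool.
        # All numbers are illustrative, not measurements.
        WORKLOAD_IOPS = 6_000      # made-up host workload
        PER_DRIVE_IOPS = 180       # rough figure for one fast spindle

        def per_drive_load(stripe_width, workload_iops=WORKLOAD_IOPS):
            """Return (IOPS per drive, fraction of one spindle's capacity)."""
            load = workload_iops / stripe_width
            return load, load / PER_DRIVE_IOPS

        for width in (4, 12, 48):
            load, util = per_drive_load(width)
            print(f"{width:3d} spindles: {load:7.1f} IOPS per drive "
                  f"({util:.0%} of one spindle's capacity)")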


    This also lowers the need for huge, static caches (big bucks that could be spent on applications, not infrastructure), because every request is inherently spread across the maximum possible number of drives, which minimizes I/O latency. More spindles participating is very, very, very good. The public SPC-1 benchmarks prove this point: there is no correlation between cache size and SPC-1 IOPS; on the other hand, there is a strong correlation between the number of spindles and SPC-1 IOPS.
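
    The correlation claim is easy to check yourself. The sketch below assumes you have transcribed a set of published SPC-1 full-disclosure results into a file named spc1_results.csv with columns cache_gb, spindles, and spc1_iops; both the file name and the column names are placeholders for this sketch, not part of any published format.

        # Compute Pearson correlations from hand-collected SPC-1 results.
        # spc1_results.csv and its column names are assumptions for this sketch.
        import csv
        import math

        def pearson(xs, ys):
            """Pearson correlation coefficient of two equal-length sequences."""
            n = len(xs)
            mx, my = sum(xs) / n, sum(ys) / n
            cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
            sy = math.sqrt(sum((y - my) ** 2 for y in ys))
            return cov / (sx * sy)

        cache, spindles, iops = [], [], []
        with open("spc1_results.csv", newline="") as f:
            for row in csv.DictReader(f):
                cache.append(float(row["cache_gb"]))
                spindles.append(float(row["spindles"]))
                iops.append(float(row["spc1_iops"]))

        print(f"cache size vs. IOPS:    r = {pearson(cache, iops):+.2f}")
        print(f"spindle count vs. IOPS: r = {pearson(spindles, iops):+.2f}")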


    In the end, the whole point of SAN/NAS storage is to be able to put data on drives flexibly and securely, at the fastest reasonable speed, for the lowest total cost of owning and managing the system from birth to death. If the total cost (time and money) of a database over its lifetime were not a concern, why would we DBAs have to think about using bigint vs. tinyint? Every penny and every minute you do not spend on storage could be better spent on pizza and beer.
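
    To put a rough number on the bigint vs. tinyint point: the 8-byte and 1-byte column sizes below are the usual ones (SQL Server and MySQL, for example), but the one-billion-row table is made up purely for illustration.

        # Back-of-the-envelope cost of an over-sized key column.
        # The row count is hypothetical; sizes are 8 bytes (bigint) vs. 1 byte (tinyint).
        ROWS = 1_000_000_000
        BIGINT_BYTES, TINYINT_BYTES = 8, 1

        wasted = ROWS * (BIGINT_BYTES - TINYINT_BYTES)
        print(f"Using bigint where tinyint would do wastes {wasted / 2**30:.1f} GiB "
              f"across {ROWS:,} rows -- before indexes, backups, and replicas multiply it.")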