HUB PAGE: Software-Defined Storage: How Did We Get There?
by Jaap van Duijvenbode on January 10, 2017
In today's corporate data centers, managers have begun using software-defined storage to address a very pressing problem.
The amount of data corporations generate and consume every day is exploding. Many enterprises have data-intensive workloads that are now core parts of their business. But data center infrastructure and budgets are having a tough time keeping up. IT managers are discovering that just throwing more servers and disk arrays at the problem is a losing proposition. That's why traditional storage practices have begun to change.
Why the traditional storage infrastructure is inadequate for today's workloads
In the traditional data center, the storage infrastructure is hard to manage and expensive to provision and operate. It depends on proprietary storage arrays, typically built with custom-designed ASICs (application-specific integrated circuits), and configured as SAN or NAS. Storage system software is directly tied to the particular hardware configurations it manages.
Not only is this arrangement very expensive (ASIC development costs alone can run to $10+ million), but it's also very inflexible. When workloads spike, or hardware reaches its performance limits or end of life, managers usually buy more of the same type of equipment, usually from the same manufacturer (leading to vendor lock-in). And to ensure operations won't be impacted by unexpected surges in storage demand, they must also purchase surplus capacity that may never be used.
With the amount of data to be handled expanding rapidly, and IT budgets not keeping pace, a new data storage paradigm was desperately needed. It started with the concept of virtualization, sometimes called software-defined compute. Virtualization involves emulating physical resources, such as a server, in software. The widespread use of multiple virtual machines running on a single server laid the groundwork for implementing the same approach for storage. That, in turn, led to the concept of software defined storage.
What Is Software Defined Storage?
The term Software Defined Storage (SDS) refers to an arrangement in which the intelligence and management functions of the storage system are decoupled from the underlying hardware, and concentrated in a layer of advanced software. The intent is to manage storage from the SDS interface, without regard to the types, configurations, or even geographical locations of the storage units employed.
Chris Preimesberger, Features and Analysis Editor at eWEEK, puts it this way: "The ultimate goal of SDS is to provide a single, unified set of storage services across all storage devices for maximum availability, performance and efficiency, as well as to ensure the overall health and protection of vital storage assets."
Advantages of SDS
According to a December 2015 survey conducted by Gartner, 48 percent of storage professionals were already actively investigating or piloting SDS implementations. And interest continues to grow. Both large and small enterprises are considering the adoption of SDS based on a number of clear benefits the technology provides:
1. Allows Use Of Inexpensive Commodity Hardware
One of the great advantages of the SDS approach is that it shifts intelligence away from the storage array and into the controlling software. Functions such as deduplication, replication and snapshotting no longer need to be provided in hardware. The result is that storage devices can be implemented using inexpensive commodity drives rather than more complex and costly specialized devices.
2. Simplifies Storage Management
An important feature of SDS is that the interface presented to users (or to applications via APIs) is the same no matter what mixture of hardware implementations is employed. This greatly simplifies management and provisioning of the storage infrastructure. Tasks such as data backup and restore, disaster recovery, and provisioning of additional capacity can be handled in a unified manner through policy-based management functions enacted by the SDS software, without dealing with the idiosyncrasies of different hardware/firmware configurations.
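To make the idea of policy-based management concrete, here is a minimal Python sketch. The policy fields, class names, and the `provision_volume` function are invented for illustration; they are not any particular product's API. The point is that one policy definition drives provisioning regardless of which hardware backs the pool:

```python
from dataclasses import dataclass

@dataclass
class StoragePolicy:
    """A single policy applied uniformly, regardless of backend hardware."""
    name: str
    replicas: int          # number of copies kept across the pool
    snapshot_hours: int    # how often to snapshot
    backup_target: str     # a logical destination, not a device path

def provision_volume(volume_name: str, policy: StoragePolicy) -> dict:
    """Return a provisioning request that the SDS layer would translate
    into device-specific operations for whatever hardware backs the pool."""
    return {
        "volume": volume_name,
        "replicas": policy.replicas,
        "snapshot_every_h": policy.snapshot_hours,
        "backup_to": policy.backup_target,
    }

# One "gold" policy can then be applied to any volume, on any hardware.
gold = StoragePolicy("gold", replicas=3, snapshot_hours=1, backup_target="dr-site")
request = provision_volume("finance-db", gold)
print(request["replicas"])  # 3
```

An administrator changes the policy in one place; the SDS layer is responsible for carrying that change out across every device it manages.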
3. Provides Greater Flexibility
Because SDS is hardware-agnostic, storage administrators can freely mix and match hardware and media types. Flash memory-based solid state drives (SSDs) can be used alongside or in conjunction with hard disk (HDD) arrays. Older legacy drives can be given new life through the SDS software's ability to provide intelligence and features at the system level that may be lacking at the device level. Plus, the software can use tiering and caching techniques to match application workloads to the performance capabilities of the various storage units it manages.
Because SDS software treats devices from different manufacturers and of different types (SSDs, HDDs, and even entire SAN or NAS subsystems) as parts of a single storage resource pool, adding units to meet capacity or performance demands is an easy process. Such provisioning can be automated, potentially reducing the time required to bring new resources on board from weeks to minutes. Plus, in a well-designed SDS system, replacement and upgrading of hardware can be accomplished transparently, without disrupting operations.
An added bonus of this hardware heterogeneity is that it helps avoid vendor lock-in, since units from various providers can be included in the same storage pool.
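The tiering idea mentioned above can be reduced to a simple rule: keep frequently accessed ("hot") data on flash and cold data on disk. This is a hypothetical sketch; the access counts and the threshold are illustrative, and a real SDS layer would derive them from live I/O statistics:

```python
# Hypothetical access counts per file; a real SDS layer would gather
# these from I/O telemetry, not a static dictionary.
access_counts = {"report.xlsx": 120, "archive_2014.zip": 2, "model.bin": 45}

HOT_THRESHOLD = 50  # accesses per day; an arbitrary illustrative cutoff

def choose_tier(accesses_per_day: int) -> str:
    """Place frequently read data on flash, cold data on spinning disk."""
    return "ssd" if accesses_per_day >= HOT_THRESHOLD else "hdd"

placement = {name: choose_tier(n) for name, n in access_counts.items()}
print(placement["report.xlsx"])       # ssd
print(placement["archive_2014.zip"])  # hdd
```

Production systems refine this with recency weighting, promotion/demotion hysteresis, and caching layers, but the principle is the same: the software, not the hardware, decides where data lives.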
4. Facilitates Distributed Computing and Collaboration
Having remote or branch sites (ROBOs) maintain their own servers and data locally may seem convenient, but it makes collaboration with other parts of the organization very difficult and error-prone. If different locations each use and modify company data within their own sites, it's hard to know which is the authoritative copy. And how will the information from the remote locations be transmitted to the home office and integrated into the company's central data repository?
An added factor is that when ROBOs manage their own data, each location must be responsible for its own IT infrastructure, including data backup facilities and disaster recovery arrangements. The costs, not only for equipment, space, power, cooling, and maintenance, but also in terms of the time and attention of employees who often are not IT specialists, can be significant.
With an SDS approach, the organization can maintain a centralized, authoritative, master copy of each dataset that can be concurrently accessed by authorized users wherever they may be. For example, Talon's FAST™ SDS solution uses intelligent local caching to allow geographically dispersed users to access the central copy of a file just as if it were housed at their own site. At the same time, by employing a sophisticated file locking mechanism, the software ensures that users in different locations cannot simultaneously make conflicting changes to the master copy.
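Talon doesn't publish FAST's locking internals, but the general principle of central file locking can be sketched generically: before a site may write to a file, it must acquire an exclusive lock held at the central authority. This toy Python sketch (all names invented) shows the idea:

```python
import threading

class FileLockManager:
    """Grants exclusive write access to one site at a time per file.
    Every site reads the central copy; a writer must hold the lock first.
    A generic illustration, not any vendor's actual mechanism."""

    def __init__(self):
        self._locks = {}               # path -> site currently holding the lock
        self._guard = threading.Lock() # protects the lock table itself

    def acquire(self, path: str, site: str) -> bool:
        with self._guard:
            holder = self._locks.get(path)
            if holder is None or holder == site:
                self._locks[path] = site
                return True
            return False               # another site is editing this file

    def release(self, path: str, site: str) -> None:
        with self._guard:
            if self._locks.get(path) == site:
                del self._locks[path]

mgr = FileLockManager()
print(mgr.acquire("/projects/plan.docx", "london"))  # True: lock granted
print(mgr.acquire("/projects/plan.docx", "tokyo"))   # False: london holds it
mgr.release("/projects/plan.docx", "london")
print(mgr.acquire("/projects/plan.docx", "tokyo"))   # True: lock now free
```

Because the lock table lives in one place, two branch offices can never commit conflicting edits to the master copy at the same time.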
5. Increases Scalability
SDS allows multi-dimensional scaling. Each node can scale up by adding storage devices, up to the limits imposed by the hardware configuration of each particular unit. But the real advantage of SDS is in its elastic scale-out capability: capacity can be added simply by adding nodes, each of which may consist of a different type of storage. Thousands of such nodes can be accommodated, allowing storage to easily expand to the petabyte range and beyond with little added management complexity.
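One common way scale-out systems spread data across an elastic set of nodes is consistent hashing, which ensures that adding a node relocates only a small fraction of existing data. This is a generic sketch of the technique, not any particular vendor's design:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Maps objects to storage nodes via a hash ring with virtual nodes,
    so adding a node moves only a small share of existing data."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []  # sorted list of (hash, node) points on the ring
        for node in nodes:
            self.add_node(node, vnodes)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str, vnodes: int = 100) -> None:
        for i in range(vnodes):
            self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()

    def node_for(self, key: str) -> str:
        """The first ring point at or after the key's hash owns the key."""
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h,)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("backup-2017-01.img")
print(owner in {"node-a", "node-b", "node-c"})  # True
```

The same lookup works whether the pool has three nodes or three thousand, which is what lets capacity grow without growing the management burden.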
6. Provides Better Data Security
Because SDS software manages all devices in the storage pool through a single interface, it can implement sophisticated data security protections for the entire storage system. Functions such as backup, replication, and disaster recovery, as well as monitoring and analytics, can be accomplished through the SDS console rather than having to be applied for each device or array.
This unified approach that covers an organization's entire storage infrastructure makes managing, testing, and validating data security functions much easier. The company's IT professionals can devise an optimal solution and implement it for all the organization's data without having to deal individually with each site and functional area.
Moreover, since storage is only accessible through the SDS software's virtual interface, the storage system is inherently better protected from would-be intruders. Encryption, both for data in-transit and data at rest, is managed by the software. Access permissions for the entire storage infrastructure can also be managed through the SDS portal, allowing for greater control and consistency. Use of SDS allows data security best practices to be applied system-wide.
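The value of one centrally enforced rule set can be sketched in a few lines of Python. The ACL structure, paths, and user names here are purely illustrative; real SDS products expose this through their own consoles and APIs:

```python
# One access-control table enforced at the SDS layer for every
# device in the pool (illustrative data only).
acl = {
    "finance/": {"alice", "bob"},
    "engineering/": {"carol"},
}

def may_read(user: str, path: str) -> bool:
    """Check the single, system-wide rule set instead of per-array settings."""
    return any(path.startswith(prefix) and user in users
               for prefix, users in acl.items())

print(may_read("alice", "finance/q4-report.xlsx"))  # True
print(may_read("carol", "finance/q4-report.xlsx"))  # False
```

Because every request passes through the same check, a change to the table takes effect across the entire storage infrastructure at once, rather than being re-applied device by device.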
7. Lowers TCO
SDS systems are designed to thrive on commodity x86-based hardware that is much cheaper than the custom-designed products used in dedicated storage systems. And because the x86 infrastructure is so well established and widely understood after decades of use, maintenance and support costs should also be lower.
In addition, with the SDS software able to implement functions such as data deduplication and compression at the system rather than device level, the amount of storage devoted to data replication can be minimized, further reducing costs.
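System-level deduplication is typically content-addressed: data is split into blocks, each block is identified by a cryptographic hash of its contents, and identical blocks are stored once and shared by reference. Here is a toy Python sketch of the idea (not any vendor's implementation):

```python
import hashlib

class DedupStore:
    """Content-addressed block store: identical blocks are kept once,
    across every file in the pool rather than per device or array."""

    def __init__(self):
        self._blocks = {}  # digest -> block data (each unique block stored once)
        self._files = {}   # filename -> ordered list of block digests

    def write(self, name: str, data: bytes, block_size: int = 4096) -> None:
        digests = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            d = hashlib.sha256(block).hexdigest()
            self._blocks.setdefault(d, block)  # no-op if block already stored
            digests.append(d)
        self._files[name] = digests

    def read(self, name: str) -> bytes:
        return b"".join(self._blocks[d] for d in self._files[name])

    def physical_blocks(self) -> int:
        return len(self._blocks)

store = DedupStore()
payload = b"x" * 8192               # two identical 4 KiB blocks
store.write("a.bin", payload)
store.write("b.bin", payload)       # duplicate content adds no new blocks
print(store.physical_blocks())      # 1: both files share one unique block
print(store.read("b.bin") == payload)  # True
```

Applying this at the system level, across all devices, is what lets an SDS layer find duplicates that per-array deduplication would miss.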
All this contributes to SDS systems being less costly to acquire and operate than traditional storage solutions. In fact, a Gartner research report states that use of SDS can reduce TCO by at least 50% without degrading performance or compromising service level objectives.
Potential Pitfalls of SDS
SDS is still a new technology, and companies considering it should be aware that there are some potential pitfalls that may arise.
Perhaps the foremost of these is the temptation to minimize costs by purchasing a software-only SDS solution, then attempting to integrate it with separately purchased or legacy hardware. Unless yours is a large enterprise with storage experts available to handle the integration task, this is a temptation to be avoided.
"The correct processors, memory, disks, operating system settings, drivers, firmware, and bug fixes take a lot of time to select, assemble, and test," says Laz Vekiarides, chief technology officer at ClearSky Data. "If this level of training, maintenance, and operational planning doesn't appeal to your business, seek a third-party partner that can help manage the process."
A related issue is the fact that when software and hardware are purchased from different vendors, getting adequate support can quickly become a headache. The software vendor may provide outstanding support for their product, and the hardware vendor the same for theirs. But who do you call to fix issues that concern the interaction of the two?
One answer to such issues revolves around hyperconvergence, which bundles storage and compute resources into a single storage appliance. Another may be the cloud services model, in which an expert services vendor provides an SDS interface to storage managed, either on-premises or in the public cloud, by the provider.
It's Best To Move Slowly To Incorporate SDS Into Your Data Center
Issues such as the ones highlighted above indicate that although SDS holds tremendous promise for the data center of the future, companies should be wary of a sudden, wholesale switch from the traditional data center to SDS. Industry experts recommend against overhauling your entire storage infrastructure in one step. It's better to start with small projects, perhaps by implementing SDS first with new applications rather than risking disruptions to production workloads.
SDS and the Future of the Data Center
SDS is a foundational element of the data center of the not-so-distant future. The corporate data center is evolving toward a fully virtualized and highly automated environment, called the Software Defined Data Center (SDDC), in which all resources are managed and delivered as software-defined services. These services include not only Software Defined Storage, but Software Defined Networking (SDN), Software Defined Compute (another name for server virtualization), and even Software Defined Security.
Gartner Research characterizes the SDDC this way: "The primary goal of the SDDC is agility and speed by enabling IT-enabled services to be quickly, and transparently, provisioned, moved and scaled across network segments, across data centers, and potentially, into the cloud independent of the physical infrastructure underneath."
For many organizations, getting started with SDS can not only provide important benefits in the present, but can also serve as a first step toward the SDDC. If you are ready to consider how SDS can improve your data center right now while preparing it for the future, Talon can help. For more information, please visit our Next Generation Software-defined Storage page.