
HPE Master ASE Server Solutions eBook (Exam HPE0-S22)

First Edition
Miriam Allred

HPE Master ASE - Advanced Server Solutions Architect V3 Official Certification Study Guide (Exam HPE0-S22)

Miriam Allred
© 2016 Hewlett Packard Enterprise Development LP.
Published by: Hewlett Packard Enterprise Press
660 4th Street, #802
San Francisco, CA 94107

All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the publisher, except for the inclusion of brief quotations in a review.

ISBN: 978-1-942741-33-6

WARNING AND DISCLAIMER
This book provides information about the topics covered in the HPE Master ASE - Advanced Server Solutions Architect V3 certification exam (HPE0-S22). Every effort has been made to make this book as complete and as accurate as possible, but no warranty or fitness is implied. The information is provided on an "as is" basis. The author and Hewlett Packard Enterprise Press shall have neither liability nor responsibility to any person or entity with respect to any loss or damages arising from the information contained in this book or from the use of the discs or programs that may accompany it. The opinions expressed in this book belong to the author and are not necessarily those of Hewlett Packard Enterprise Press.

TRADEMARK ACKNOWLEDGEMENTS
All third-party trademarks contained herein are the property of their respective owner(s).

GOVERNMENT AND EDUCATION SALES
This publisher offers discounts on this book when ordered in quantity for bulk purchases, which may include electronic versions. For more information, please contact U.S. Government and Education Sales at 1-855-447-2665 or email [email protected].

FEEDBACK INFORMATION
At HPE Press, our goal is to create in-depth reference books of the best quality and value. Each book is crafted with care and precision, undergoing rigorous development that involves the expertise of members from the professional technical community.

Readers' feedback is a continuation of the process. If you have any comments regarding how we could improve the quality of this book, or otherwise alter it to better suit your needs, you can contact us through email at [email protected]. Please make sure to include the book title and ISBN in your message. We appreciate your feedback.

Publisher: Hewlett Packard Enterprise Press
HPE Contributors: Jim Robinson, Chris Powell, Chris Bradley, Jeff Holderfield, Andrew Leber, Brian Beneda
HPE Press Program Manager: Michael Bishop

About the Author

Miriam Allred has spent the last ten years configuring, testing, and troubleshooting HPE wired and wireless networks. She also has extensive knowledge of servers, storage, and cloud technologies. Miriam combines this wide range of technical expertise with pedagogy and instructional design training, allowing her to create technical training courses for both advanced and entry-level networking professionals. Miriam Allred has a master's degree from Cleveland State University and a bachelor's degree from Brigham Young University.

Introduction

Based on the Architecting Advanced HPE Server Solutions course, this self-study guide helps you prepare for the HPE Master ASE - Advanced Server Solutions Architect V3 certification exam (HPE0-S22). This certification validates that you can design, differentiate, and deploy advanced enterprise server solutions, including HPE Integrity, Apollo, and Moonshot servers. Additionally, this certification validates your ability to design and demonstrate the best solution based on customers' technical, financial, and business needs.

Certification and Learning

Hewlett Packard Enterprise Partner Ready Certification and Learning provides end-to-end continuous learning programs and professional certifications that can help you open doors and succeed in the New Style of Business. We provide continuous learning activities and job-role-based learning plans to help you keep pace with the demands of the dynamic, fast-paced IT industry; professional sales and technical training and certifications to give you the critical skills needed to design, manage, and implement the most sought-after IT disciplines; and training to help you navigate and seize opportunities within the top IT transformation areas that enable business advantage today.

As a Partner Ready Certification and Learning certified member, your skills, knowledge, and real-world experience are recognized and valued in the marketplace. To continue your professional and career growth, you have access to our large HPE community of world-class IT professionals, trend-makers, and decision-makers. Share ideas, best practices, business insights, and challenges as you gain professional connections globally.

To learn more about HPE Partner Ready Certification and Learning certifications and continuous learning programs, visit http://certification-learning.hpe.com

Audience

This book is designed for consultants, sales engineers, and presales technical architects who recommend, design, and demonstrate HPE server solutions for large-scale, more complex, or mission-critical scenarios.

Assumed Knowledge

To achieve the HPE Master ASE - Advanced Server Solutions Architect V3 certification, it is assumed that you have a minimum of six years of experience architecting HPE server solutions. Candidates are expected to have advanced-level, industry-standard technology knowledge and business acumen from training and hands-on experience.

Relevant Certifications

After you pass the exam, your achievement may be applicable toward more than one certification. To determine which certifications can be credited with this achievement, log in to The Learning Center and view the certifications listed on the exam's More Details tab. You might be on your way to achieving additional certifications.

Preparing for Exam HPE0-S22

This self-study guide does not guarantee that you will have all the knowledge you need to pass the exam. It is expected that you will also draw on real-world experience and would benefit from completing the hands-on lab activities provided in the instructor-led training.

Recommended HPE Training

Recommended training to prepare for each exam is accessible from the exam's page in The Learning Center. See the exam attachment, "Supporting courses," to view and register for the courses.

Obtain Hands-on Experience

To pass the exam, Hewlett Packard Enterprise strongly recommends a combination of training, thorough review of additional study references, and sufficient on-the-job experience.

Exam Registration

To register for an exam, go to http://certification-learning.hpe.com/tr/certification/learn_more_about_exams.html

Chapter 1 Recognize Industry Trends

EXAM OBJECTIVES
• Describe the trends affecting enterprises and explain how these trends lead to the four Transformation Areas.
• Describe the key business challenges enterprises are facing.
• Review the role of a server architect, emphasizing how the architect helps companies.
• Provide an overview of the HPE enterprise server solutions covered in this ebook:
✓ Apollo solutions
✓ Moonshot
✓ Integrity Superdome X

Assumed knowledge

Before reading this chapter, you should meet the following criteria:
• Knowledge of processors, DDR3 and DDR4 memory, hard disk drives (HDDs), solid state drives (SSDs), and RAID levels for storage volumes
• Experience with HPE ProLiant rack and blade servers and options for them, such as HPE Smart Array Controllers
• Knowledge of HPE BladeSystems, including interconnect modules and Virtual Connect (VC) modules
• Experience managing and maintaining servers, including iLO, Intelligent Provisioning, UEFI, HPE Insight Remote Support, HPE Insight Online, HPE Smart Update Manager (SUM), and HPE Insight Control server provisioning (ICsp)
• Thorough familiarity with HPE OneView capabilities

Chapter topics

In this chapter, you will first briefly review the HPE Server Certification paths. Next, you will look at the major trends facing the IT industry and how the HPE Transformation Areas address these changes. You will then learn about the HPE software-defined data center (SDDC) and about the HPE server solutions covered in this ebook.

HPE Server Certification overview

This section outlines the HPE Server Certification, focusing on how this ebook fits within that certification and what you will gain as an HPE architect.

HPE Server Certification Paths Overview

Figure 1-1 HPE Server Certification Paths Overview

The information in this ebook is designed for architects and integrators following the Server Solutions Architect path, shown in Figure 1-1. The ideal candidate has enterprise-level server architecture and design expertise and is interested in gaining a solid understanding of HPE Superdome X, Apollo 6000, Moonshot, DL7XX, and underlying technologies through the ebook activities, and in validating those skills through examination. After passing the exam associated with this ebook (HPE0-S22), you will be certified as a Master Accredited Solutions Expert. Keep in mind that although the certification exam is associated with this ebook, the exam also tests your mastery of prerequisite training and HPE OneView—as much as 10%–20% of the items might be on these subjects.

What you will gain from this ebook as an HPE architect

In this ebook, you will learn how to become a trusted adviser for your customers. This chapter will introduce you to new trends in IT that have become a vital part of almost every company's day-to-day operations, as well as a revenue generator. By understanding the key ways that customers need to transform to prosper in the new idea economy, you will be able to design HPE server solutions that better meet customers' needs.

The rest of this ebook guides you through architecting HPE Apollo, Moonshot, and Integrity Superdome X solutions, teaching you how to design solutions based on customer business requirements. It also helps you understand how to present the benefits of these solutions to customers—maximizing the opportunity for the customer to accept your proposal.

HPE Transformation Areas for the new idea economy

In this section, you will learn about the pressures placed on today's businesses and how the HPE Transformation Areas address these concerns.

The idea economy is here

Figure 1-2 The idea economy is here

Ideas have always fueled business success. Ideas have built companies, markets, and industries. However, there is a difference today. As you see in Figure 1-2, businesses operate in the idea economy, which is also called the digital, application, or mobile economy. Doing business in the idea economy means turning an idea into a new product, capability, business, or industry. This has never been easier or more accessible—for you and for your competitors.

Today, an entrepreneur with a good idea has access to the infrastructure and resources that a traditional Fortune 1000 company would have. That entrepreneur can rent compute capacity on demand, implement a software-as-a-service enterprise resource planning system, use PayPal or Square for transactions, market products and services using Facebook or Google, and have FedEx or UPS run the supply chain. Companies such as Vimeo, One Kings Lane, Dock to Dish, Uber, Pandora, Salesforce, and Airbnb used their ideas to change the world with very little start-up capital. Uber had a dramatic impact after launching its application connecting riders and drivers in 2009. Three years after its founding, the company expanded internationally. Without owning a single car, Uber now serves more than 300 cities in 58 countries (as of May 28, 2015). The company has disrupted the taxi industry; San Francisco Municipal Transportation Agency reported that cab use in San Francisco has dropped 65% in two years. In a technology-driven world, it takes more than just ideas to be successful, however. Success is defined by how quickly ideas can be turned into value.

Creating disruptive waves of new demands and opportunities

Figure 1-3 Creating disruptive waves of new demands and opportunities

Figure 1-3 illustrates how the idea economy presents an opportunity and a challenge for most enterprises. On the one hand, cloud, mobile, big data, and analytics give businesses the tools to accelerate time to value. This increased speed allows organizations to combine applications and data to create dramatically new experiences, even new markets.

On the other hand, most organizations were built with rigid IT infrastructures that are costly to maintain. This rigidity makes it difficult, if not impossible, to implement new ideas quickly. Creating and delivering new business models, solutions, and experiences requires harnessing new types of applications and data, and managing new risks. It also requires implementing new ways to build, operate, and consume technology. This new way of doing business no longer just supports the company—it becomes the core of the company.

IT must become a value creator that bridges the old and the new

Figure 1-4 IT must become a value creator that bridges the old and the new

To respond to the disruptions created by the idea economy, IT must transform from a cost center to a value creator, as shown in Figure 1-4. In order to evolve, IT must shift focus:
• From efficiently hosting workloads and services to continuously creating and delivering new services
• From simply providing hardened systems and networks to proactively managing and mitigating risks
• From just storing and managing data to providing real-time insight and understanding
• From using software to automate business systems to differentiating products and services

Customers need to make IT environments more efficient, productive, and secure as they transition to the idea economy. They need to enable their organizations to act rapidly on ideas by creating, consuming, and reconfiguring new solutions, experiences, and business models. One of the first steps in achieving this kind of agility is to break down the old infrastructure silos that make enterprises resistant to new ideas internally and vulnerable to new ideas externally.

Designing compelling new experiences and services does not work if the infrastructure cannot support them. The right compute platform can make a significant impact on business outcomes and performance. Examples include storage that "thinks" as much as it stores, networking that moves information faster and more securely than ever before, and orchestration and management software that provides predictive capabilities. Each company is on a unique journey to the cloud, custom-made for the way it consumes and allocates resources, transforms to the changing landscape, implements financial models, and achieves desired outcomes.

This unique journey starts with four transformation areas

Figure 1-5 This unique journey starts with four transformation areas

This unique journey starts with four transformation areas, shown in Figure 1-5. The HPE Transformation Areas are designed to
• Generate revenue and profitable growth
• Increase agility and flexibility
• Deliver remarkable customer experience
• Amplify employee productivity
• Reduce cost and risk

These transformation areas reflect what customers consider most important:
• Transforming to a hybrid infrastructure—A hybrid infrastructure enables customers to get better value from the existing infrastructure and delivers new value quickly and continuously from all applications. This infrastructure should be agile, workload optimized, simple, and intuitive.
• Protecting the digital enterprise—Customers consider it a matter of when, not if, their digital walls will be attacked. The threat landscape is wider and more diverse than ever before. A complete risk management strategy involves security threats, backup and recovery, high availability, and disaster recovery.
• Empowering the data-driven organization—Customers are overwhelmed with data; the solution is to obtain value from information that exists. Data-driven organizations generate real-time, actionable insights.
• Enabling workplace productivity—Many customers are increasingly focused on enabling workplace productivity. Delivering a great digital workplace experience to employees and customers is a critical step.

Transform to a hybrid infrastructure

Figure 1-6 Transform to a hybrid infrastructure

An organization might see cloud services as a key way to access the IT services it needs, at the right time and the right cost. A hybrid infrastructure is based on open standards, is built on a common architecture with unified management and security, and enables service portability across deployment models. Getting the most out of hybrid infrastructure opportunities requires planning performance, security, control, and availability strategies (Figure 1-6). For this reason, organizations must understand where and how a hybrid infrastructure strategy can most effectively be applied to their portfolio of services.

The Hewlett Packard Enterprise perspective on hybrid infrastructure

Customers struggle with rigid infrastructures and need to transform to an agile, hybrid infrastructure that generates business value. Based on extensive research, Hewlett Packard Enterprise has defined a strategy for helping customers in this transformation. The following paragraphs describe the HPE point of view.

Open matters
Companies are transforming to a hybrid infrastructure because they need flexibility and agility. HPE will help them to avoid vendor lock-in so that they can maximize flexibility in the future.

Expertise matters
Hewlett Packard Enterprise has decades of experience helping customers design their data centers and get the most out of their IT infrastructure. In addition, a majority of companies are seeking help in moving to the cloud. Hewlett Packard Enterprise design services help companies to obtain the private or hybrid cloud solution that adapts to their needs.

Control matters
IT needs to become a service provider for line of business (LOB). HPE provides converged, software-based tools that help in this endeavor. They bring the entire infrastructure under control, automating provisioning and management as much as possible. In this way, provisioning times decrease from weeks to minutes.

Infrastructure matters
Every workload is unique. The HPE converged, software-defined hybrid architecture lets companies optimize for the needs of each workload. Balancing the needs of the particular use case, the company can tune efficiency, availability, and performance to the right levels.

Business continuity matters
While eager to obtain the promised agility and efficiency of cloud, CIOs are concerned about the integrity of their data. HPE designs its solutions to protect companies' business information, whether hosted on- or off-premises, from external threats. HPE solutions also protect companies from the inherent risks of lost or improperly managed data that come with rapid data growth.

Protect the digital enterprise

Figure 1-7 Protect the digital enterprise

Protecting a digital enterprise requires alignment with key IT and business decision makers for a business-aligned, integrated, and proactive strategy to protect the hybrid IT infrastructure and data-driven operations, as well as enable workplace productivity (Figure 1-7). By focusing on security as a business enabler, HPE brings new perspectives on how an organization can transform from traditional, static security practices to intelligent, adaptive security models to keep pace with business dynamics.

HPE solutions help customers protect their data in a variety of ways. HPE StoreOnce delivers simple and secure data backup and recovery for the entire enterprise. HPE ProLiant Gen9 servers support options such as UEFI Secure Boot to prevent untrusted code and potential malware from booting. This ebook, though, focuses on the capabilities of HPE Integrity Superdome X systems in preventing unplanned downtime or data loss for mission-critical workloads—you will learn more about these systems later in this chapter and throughout this ebook.

Empower the data-driven organization

Figure 1-8 Empower the data-driven organization

A data-driven organization leverages valuable feedback that is available consistently from both internal and external sources (Figure 1-8). By harnessing insights from data, organizations can determine the best strategies to pave the way for seamless integration of agile capabilities into an existing environment. Because both technical and organizational needs must be considered, HPE helps organizations define the right ways to help ensure that processes, security, tools, and overall collaboration are addressed properly for successful outcomes. Later in this ebook, you will learn how HPE server solutions provide the ideal infrastructure for a variety of data workloads.

Enable workplace productivity

Figure 1-9 Enable workplace productivity

Organizations seeking to improve their efficiency and speed place a premium on creating a desirable work environment for their employees, including offering the technology employees want and need. They believe they must enable employees to work how, where, and whenever they want. HPE solutions for the workplace provide secure, easy, mobile collaboration and anywhere, anytime access to data and applications for better productivity and responsiveness (Figure 1-9). According to 2014 IDC survey results, spending growth for mobile resources is expected to be twice the level of overall IT spending growth as organizations pursue this potential. Later in this ebook, you will learn about the HPE server solutions that are purpose-built for supporting the application and desktop delivery solutions that enhance employee productivity.

The HPE software-defined data center

Next, you will look at the HPE software-defined data center, which is abbreviated as SDDC.

The HPE software-defined data center

Figure 1-10 The HPE software-defined data center

SDDC is a concept in which the infrastructure of an organization's data center extends the use of virtualization technology by abstracting, pooling, and automating all of the physical data center resources. A basic business definition of the term is "systems and procedures used in a manner that enable infrastructure resources to be controlled at the software level in response to changing business conditions."

Currently, the most typical response to changing business conditions is to burst out to additional virtual machines (VMs) using the hybrid cloud model. This is a useful step, but it is only a one-dimensional response to business conditions. What if the network conditions change, the storage requirements change, or both? That is why a business needs to progress toward an SDDC, where computing resources can be more fully adapted and can conform to the changing characteristics of business activities.

Implementing an SDDC in effect amounts to delivering an IT as a Service (ITaaS) solution, illustrated in Figure 1-10. In an SDDC, the various elements of the infrastructure (which include network, storage, compute, and security resources) are virtualized and delivered as a service. Although ITaaS might represent an outcome of an SDDC, the focus of the SDDC solution is more for the benefit of the data center architects and IT staff than for the users or consumers of the resources. Software abstraction in the data center infrastructure is not visible to the consumers.

An SDDC can take the form of various potential implementation scenarios being offered by vendors. Consequently, some critics see the SDDC as an evolving marketing tool, whereas proponents expect that software will define data centers of the future, and so they accept that the SDDC is a work in progress. An SDDC encompasses many concepts and data center infrastructure components where each component can be provisioned, operated, and managed through a programmatic user interface. The core architectural components that comprise a given vendor's SDDC solution might include the following:
• Compute virtualization, which is a software implementation of a computer's processor, memory, and I/O resources. This is, of course, commonly referred to as hypervisor software.
• Software-defined networking (SDN) or network virtualization. This might involve provisioning VLANs on a switch, Ethernet ports operating as a single or aggregated link, ports supporting access or VLAN trunking, security settings, and so forth.
• Software-defined storage or storage virtualization. This might involve provisioning storage LUNs on a storage array and HBA zoning on a SAN switch.
• Management and automation software that enables an administrator to provision, control, and manage all SDDC components.

An SDDC is not the same thing as a private cloud because a private cloud only has to offer a virtual machine self-service solution. Within the private cloud, the IT administrators could use traditional provisioning and management interfaces. The SDDC instead envisions a data center that could potentially support private, public, and hybrid cloud offerings.

Some of the commonly cited benefits of an SDDC include improved efficiencies by extending virtualization across all resources, increased agility to provision resources for business applications more quickly, improved control over application availability and security through policy-based definitions, and the flexibility to run new and existing applications in multiple platforms and clouds. In addition, an SDDC implementation could further reduce a company's energy usage by enabling servers and other data center hardware to run at decreased power levels or be turned on and off as needed. The SDDC is also likely to further reduce the costs for data center hardware and challenge traditional hardware vendors to develop new ways to differentiate their products through software and services. In summary, further acceleration of access to data center resources will require new control options, which suggests that software-defined solutions will be needed to accomplish such objectives.
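To make "controlled at the software level" more concrete, the following sketch shows what a declarative, programmatic provisioning flow across compute, network, and storage might look like. It is purely illustrative: the desired-state structure and the provisioning functions are hypothetical and do not correspond to any specific HPE or third-party API.

# Illustrative sketch only: in an SDDC, one declarative desired-state
# definition drives compute, network, and storage together. The resource
# structure and functions below are hypothetical, not a real vendor API.

desired_state = {
    "compute": {"vm_count": 4, "cpus_per_vm": 8, "memory_gb": 32},
    "network": {"vlan_id": 120, "trunk_ports": ["1/0/1", "1/0/2"]},
    "storage": {"lun_size_gb": 500, "raid_level": "RAID 5"},
}

def provision(resource_type, spec):
    # A real SDDC controller would call the hypervisor manager, SDN
    # controller, or storage array API here; this stub just logs intent.
    print(f"provisioning {resource_type}: {spec}")

def apply_state(state):
    # Apply every resource type from a single definition, rather than
    # running three separate manual workflows.
    for resource_type, spec in state.items():
        provision(resource_type, spec)

apply_state(desired_state)

The point of the pattern is that one desired-state definition drives all resource types through software, which is what distinguishes an SDDC from automating each infrastructure silo separately.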

Role of IT in an SDDC

Figure 1-11 Role of IT in an SDDC

The HPE journey toward a comprehensive SDDC solution must address the role of IT, where IT transitions from operating primarily as a cost center to a business value center. The table in Figure 1-11 lists some of the typical objectives of a traditional IT organization operating as a cost center, as opposed to the objectives of an IT organization evolving into a business value center. For example, this IT transition needs to address:
• Supporting hybrid cloud operations instead of the strictly traditional on-premises scenario
• Meeting expectations where the data center can be used for developing integrated applications and workflows supporting software as a service (SaaS), as opposed to being used primarily to deploy commercial off-the-shelf (COTS) applications
• Putting infrastructure in place in a matter of hours instead of weeks
• Enabling business projects to be completed in 3–6 months instead of 9–12 months
• Determining success based on key performance indicators (KPIs) instead of more basic IT operational metrics

HPE envisions policy-based automation using open architectures as a key underpinning of an SDDC solution.

HPE SDDC architecture

Figure 1-12 HPE SDDC architecture

The HPE architecture for the SDDC can be viewed as consisting of three major layers, which are shown in Figure 1-12:
• Application—This layer is a next-generation applications platform supporting business applications and their related infrastructure applications.
• Control—This layer provides control functions at the IT administrator, LOB, and application levels. The control layer implements the software-defined abstractions or constructs that map to the infrastructure resources needed to support application and service requests.
• Infrastructure—The infrastructure layer presents a unified physical and virtual view. This layer supports open, standards-based, programmatic access to the underlying disparate physical and virtual infrastructure resources (compute, storage, and networking) and hardware platforms.

Collectively, these three layers in the HPE SDDC architecture unify the key functions of the IT organization: operations, security, governance, and business processes.

HPE SDDC infrastructure layer

Figure 1-13 HPE SDDC infrastructure layer

In the HPE SDDC architecture, the infrastructure layer is responsible for provisioning, managing, and supporting the relationship between the physical and virtual resources of the IT infrastructure, as shown in Figure 1-13. The virtual infrastructure serves as an overlay upon the underlying physical hardware components; the underlying physical infrastructure is referred to as an underlay. For each of the major components of the physical infrastructure (compute, storage, networking, security, and facilities), there is a corresponding abstracted element: vCompute, vStorage, vNetworking, vSecurity, and vFunctions.

Programmatic control and infrastructure management are tightly linked, and in some cases, the same tools can be used manually as well as through an application programming interface (API). HPE OneView is one example—each action that can be performed by an IT administrator through the graphical user interface (GUI) can also be done through the Representational State Transfer (REST) API (a short example appears at the end of this section). This allows HPE OneView to be part of a toolset for the IT administrator to use, or part of the programmed actions initiated by control panel applications.

HPE OneView also provides management and analysis connections to power, cooling, and facilities management utilities. This can help ensure that changing requirements of the infrastructure resources do not get ahead of the associated facilities, power, and cooling support needs. For example, this helps to avoid situations where the IT group moves all the web traffic to one section of the data center but forgets to adjust the power and cooling for that area of the facility.

This ebook focuses mainly on designing the infrastructure layer, as well as server solutions that support transformation to an SDDC. You will also learn how those solutions integrate with higher layers.
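As an illustration of the GUI/API parity described above, the following sketch authenticates to an HPE OneView appliance and lists managed server hardware through the REST API. The endpoint paths and headers follow OneView's published REST conventions, but the appliance address and credentials are placeholders, and you should verify the calls against the API reference for your appliance version.

import requests

# Hypothetical appliance address and credentials; replace with your own.
APPLIANCE = "https://oneview.example.com"
HEADERS = {"X-API-Version": "200", "Content-Type": "application/json"}

# Authenticate and obtain a session token.
login = requests.post(
    f"{APPLIANCE}/rest/login-sessions",
    json={"userName": "administrator", "password": "secret"},
    headers=HEADERS,
    verify=False,  # lab shortcut only; use trusted certificates in production
)
HEADERS["Auth"] = login.json()["sessionID"]

# Fetch the same server inventory an administrator would see in the GUI.
servers = requests.get(
    f"{APPLIANCE}/rest/server-hardware", headers=HEADERS, verify=False
)
for member in servers.json().get("members", []):
    print(member["name"], member["powerState"])

Because every GUI action has a REST equivalent like this, the same operations can be scripted, scheduled, or triggered by higher-level orchestration tools.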

Designing server solutions to help customers transform the business

This section introduces you to the HPE server solutions that you will learn about in this ebook.

HPE Apollo 2000

Figure 1-14 HPE Apollo 2000

The HPE Apollo 2000 System, shown in Figure 1-14, is the enterprise bridge to scale-out architecture. As IT strives to be a value creator, even the most conservative enterprise customers are looking for ways to save space and become more efficient. The Apollo 2000 delivers twice the density of traditional rack-mount systems and the efficiency of a shared infrastructure; however, it maintains a familiar form factor with the same racks, cabling, serviceability access, operations, and system management. There is no retraining of personnel or cost of change for introducing an efficient, space-saving, scale-out architecture.

The Apollo 2000 System brings HPE ProLiant Gen9 server technology, including iLO 4, into this 2U, multi-server chassis. The HPE ProLiant XL170r Gen9 Server and the HPE ProLiant XL190r Gen9 Server offer configuration choices that cover a much wider range of scale-out workloads. Storage and I/O flexibility enable customers to optimize for performance or economy—the right compute for the right workload. The r2800 chassis with 24 SFF drives allows customers to choose how they allocate the hard drives across the server nodes. Up to four expansion slots in the XL190r support accelerators or other full-size cards. And you can mix and match trays to build a unique solution or partially populate the chassis, leaving room for future growth.

HPE Apollo 4000 Family

Figure 1-15 HPE Apollo 4000 Family

Figure 1-15 introduces the Apollo 4000 Gen9 family of products, targeting big data solutions. Starting from the Apollo 4200 Gen9 Server and moving up to the Apollo 4510 Gen9 Server, this server family handles data-intensive workloads that range from Hadoop analytics to object storage.

Enterprise and SME customers who want to start or grow big data solutions with purpose-built, density-optimized infrastructure that is ready to scale will find exactly what they need in the HPE Apollo 4200 Gen9 Server. This system is ideal for customers wanting to deploy smaller object storage systems; Hadoop and NoSQL-based big data analytics solutions; and smaller, data-intensive, high-performance computing clusters.

The HPE Apollo 4510 System is purpose-built for object storage solutions. Customers can deploy cost-effective HPE Apollo 4510 Systems optimized to meet their object storage requirements at any scale. HPE Apollo 4510 Systems can be configured to form the foundation platform for the full variety of big data object storage solutions—from cost-effective, high-capacity content repositories that address petabyte-scale data volumes, to the tuned responsiveness required for content distribution systems.

The HPE Apollo 4530 System is purpose-built for big data analytics. It can be configured to optimally match technology requirements for economical large-scale, Hadoop-based data analytics, or it can be configured for more complex, compute-intensive analytics with high-performance processors.

HPE Apollo 6000

Figure 1-16 HPE Apollo 6000

The HPE Apollo 6000 System, shown in Figure 1-16, delivers 4x better performance per dollar per watt than a competing blade, using 60% less floor space. From the beginning, HPE designed this platform for scalability and efficiency at rack scale, delivering a total cost of ownership (TCO) savings of U.S. $3M per 1000 servers over 3 years.

The Apollo 6000 provides the flexibility to tailor the system to precisely meet the needs of your customers' workloads. They can scale by chassis or rack with a single modular infrastructure, an external power shelf dynamically allocating power to help maximize rack-level energy efficiency, and easy management. The system supports up to 160 1P servers or 80 2P servers per 48U rack with 8 chassis. You will look at the various compute options, optimized for various HPC workloads, later in this ebook.

Efficiency at rack scale is fueled by HPE's unique external power shelf, which dynamically allocates power to help maximize rack-level energy efficiency while providing the right amount of redundancy for your customers.

HPE Apollo 8000

Figure 1-17 HPE Apollo 8000

For large compute problems, such as predicting agricultural parameters for optimal crop growth or finding a medical cure, researchers are excited about the HPE Apollo 8000 System (shown in Figure 1-17), which fuels ground-breaking research in science and engineering with HPE's leading-edge technology.

The HPE Apollo 8000 System reaches new heights of performance density, with 144 teraflops per rack. That is up to 4x the teraflops per square foot and up to 40% more FLOPS per watt than comparable air-cooled servers. In fact, the environmental advantages of the HPE Apollo 8000 System can be taken one step further by leveraging the water used to cool the solution to heat your customers' facilities—which the National Renewable Energy Laboratory (NREL) estimates will save $1,000,000 in OPEX, including the money that would otherwise be used to heat the building. At the same time, the HPE Apollo 8000 System helps reduce your customers' carbon footprint, saving up to 3800 tons of CO2 per year—about the same amount of CO2 produced per year by 790 cars.

HPE Moonshot

Figure 1-18 HPE Moonshot

Figure 1-18 shows the HPE Moonshot System, a revolutionary server design that addresses the speed, scale, and specialization required by today's IT, which is emerging around the converging trends of mobility, cloud, social media, and big data. From its position as the leading provider of x86 servers for Internet environments, HPE has created the HPE Moonshot System, the second offering from HPE Project Moonshot.

HPE Moonshot System is the world's first software-defined server platform to deliver breakthrough efficiency and scale by aligning just the right amount of compute, memory, and storage to get the work done. The HPE Moonshot System adopts a federated approach to server design that saves energy and cost and enables extreme scale-out without a corresponding increase in complexity and management overhead.

The HPE Moonshot 1500 Chassis incorporates common components that include management, fabric, storage, cooling, and power elements, and it accommodates up to 45 individually serviceable hot-plug cartridges. The innovative software-defined cartridges can include one or more servers and are designed for specific Internet of Things (IoT) solutions, providing optimal results for a given workload. The workload range extends from dedicated hosting, data analytics, and web front end to more advanced functions made possible by graphics processing units (GPUs), digital signal processors (DSPs), and field-programmable gate arrays (FPGAs). HPE Moonshot enables enterprises to maximize their ability to innovate and speed their time to market with new services while reducing costs and energy use.

HPE Integrity Superdome X

Figure 1-19 HPE Integrity Superdome X

The HPE Integrity Superdome X servers are purpose-built and optimized for mission-critical workloads that require the highest availability, scalability, performance, and efficiency (as shown in Figure 1-19). They provide a way for enterprises with the most critical and demanding business processing, decision support, and database workloads to gain the benefits of an x86 platform.

As you will learn later in this ebook, Superdome X servers offer built-in Reliability, Availability, and Serviceability (RAS) features that customers have previously found only in UNIX-based or mainframe systems. Superdome X servers can, therefore, detect and recover from errors and failed or failing components, keeping your customers' mission-critical applications running. Superdome X servers also offer unique hard partitioning features, which provide greater reliability than virtual partitioning. These servers are highly scalable: they support up to 16 sockets and 24 TB of memory, delivering nine times the performance of 8-socket servers.

Chapter 1 Activity

Take a few minutes to review the high-level benefits of HPE server solutions for customers seeking to transform and embrace the new idea economy. Specifically, consider how the HPE server solutions covered in this ebook (HPE Moonshot, Apollo, and Integrity Superdome X) support the four key ways that customers want to transform.
• List the HPE products that support each Transformation Area:
– Transform to a hybrid infrastructure
– Protect the digital enterprise
– Empower the data-driven organization
– Enable employee productivity
• List at least two benefits of these products in helping customers to transform.

You can use what you just learned for this activity, as well as the "Supplemental content" section at the end of this chapter. Do not worry if these benefits are high level at this point. You will learn much more about these products and how they support customer business requirements throughout this ebook.

Summary

In this chapter, you learned that in today's idea economy, enhanced access, data, and connections are driving exponential innovation, which creates disruptive new challenges and opportunities for IT. In this idea economy, organizations must protect their digital enterprise, empower the data-driven organization, enable workplace productivity, and transform to a hybrid infrastructure. You also reviewed the SDDC infrastructure and learned how HPE has solutions to support the architecture. Finally, you were introduced to the HPE server solutions that will be covered in the rest of this ebook.

Learning check

Review what you have learned by answering these questions. Then check your answers in Appendix A: Answers to Learning Checks.
1. What is one way that an SDDC differs from a traditional data center?
a. It focuses on functionality.
b. It helps IT act as a cost center.
c. It focuses on usability and experience.
d. It enables project delivery to occur in 9–12 months.

2. Which HPE solution is part of the scale-up compute portfolio?
a. HPE Moonshot
b. HPE Integrity Superdome X
c. HPE Apollo 2000
d. HPE Apollo 6000

For answers, see Chapter 1 in Appendix A.

Supplemental content

HPE perspective on hybrid infrastructure server solutions: Exceptional technology innovation

Figure 1-20 HPE perspective on hybrid infrastructure server solutions: Exceptional technology innovation

Hewlett Packard Enterprise offers exceptional technology innovations that help businesses achieve rapid service delivery and exceptional growth, as you see in Figure 1-20. With the right compute solutions, your customers can take the business to the next step in automation because HPE servers are software-defined and cloud-ready. HPE OneView, which uses easy-to-program RESTful APIs to communicate with management capabilities embedded within HPE servers, helps to automate the server lifecycle.

HPE ProLiant servers (especially blade servers in an HPE BladeSystem), together with HPE OneView, deliver a whole new experience for IT with the Power of One—one infrastructure and one management platform, from one company, to speed the delivery of services. Only the Power of One delivers leading infrastructure convergence, availability with federation, and agility through data center automation.

For customers who need a private or a hybrid cloud, the solutions integrate seamlessly with HPE Helion CloudSystem, which lets IT define various resource pools for individual use cases. The company can then easily deploy the right workload to the right location on the fly. Of the solutions that you will focus on in this ebook, HPE Moonshot solutions are supported by HPE Helion CloudSystem.

HPE server solutions are composable, which means that their components can be combined to meet particular use cases. They are also scalable, so companies can easily expand them to increase capacity. Finally, the server solutions are converged with networking and storage solutions such as StoreVirtual VSA, making it simple for the company to orchestrate services rather than just servers.

Empower a data-driven organization with HPE

Figure 1-21 Empower a data-driven organization with HPE

When customers are struggling to extract value from their data, the problem might ultimately derive not only from their data analytics tools but also from an infrastructure that is not optimized to support and manage large volumes of data. Hewlett Packard Enterprise helps customers to lay the foundation for data-driven computing, as you see in Figure 1-21. As required by their particular workloads, customers can scale the infrastructure up (by adding power to single systems) or out (by adding systems).

Customers can scale up with HPE ProLiant servers, designed for virtualization density. Some mid-market customers with advanced needs can scale up even more with HPE Integrity, which provides the leading performance and availability that mission-critical applications need. For customers who need solutions tailored for the precise demands of big data and big data analytics, you can deploy density-optimized, scale-out solutions. In the next sections, you will learn how to choose HPE Moonshot and HPE Apollo systems for the appropriate roles within a big data solution.

HPE ProLiant and Moonshot are the foundation for a data-driven organization

HPE ProLiant servers can scale up to meet the high demands of a data-driven organization. These servers offer impressive performance and scalability. HPE gives customers the flexibility to choose from a variety of options based on their compute and application requirements. In this way, they obtain the proper expandable solution for their data center without overprovisioning, and they can achieve breakthrough efficiency at a compellingly low TCO. At the same time, HPE ProLiant servers deliver top-of-the-line resiliency features and a support experience that enables customers to achieve high uptime levels.

The higher-end ProLiant servers easily support customers' ballooning data and the applications that draw on that data. Larger mid-market and enterprise customers who need to scale out find exactly what they need in the HPE Moonshot solutions. These solutions offer unrivaled scalability: to obtain greater compute density and flexibility as they grow, customers simply add more server cartridges. HPE Moonshot continues to deliver excellent throughput, supporting a growing user base with up to 1.7 times more operations per second than traditional 2U 2P rack servers (based on HPE internal testing). At the same time, Moonshot systems offer a 66% lower TCO than those traditional servers (based on HPE internal testing with an 80% read-heavy workload). This ebook focuses on the HPE Moonshot servers. (You learned how to architect solutions with HPE ProLiant servers in prerequisite training.)

Distinguish HPE Apollo as the ideal foundation for big data

Purpose-built for mid-market or enterprise big data, HPE Apollo 4000 servers are ideal for customers who need to deploy smaller object storage systems, Hadoop, and NoSQL-based big data analytics solutions. These systems provide storage density, easy scalability, flexible configurations, performance and efficiency, and simple management converged with other HPE solutions. In a later chapter, you will learn how to design HPE Apollo 4000 solutions for hosting the Hadoop Distributed File System (HDFS) and acting as storage nodes in the HPE Big Data Reference Architecture. You will also learn how to use HPE Apollo 4000 for big data analytics and object storage.

Dense storage capacity
The Apollo 4200 servers provide more storage density than any other 2U server: up to 28 or 54 hot-plug drives, depending on the model. The Apollo 4500 family can scale even further with support for up to 68 drives, depending on the model.

Easy scalability
These servers' ultra-dense storage makes it easy for customers to scale their big data solution.

Flexible configurations
These servers can be configured for industry-leading storage density. They can also be configured for performance and throughput. Whatever your customers need, from object storage to data analytics to high-performance computing data-intensive applications, the Apollo systems can deliver.

Performance and efficiency
The Apollo servers can be configured for high performance and throughput. They support up to 16 memory DIMM slots with up to 1024 GB, delivering the performance required for in-memory data processing for near real-time analytics. Powerful SAS and SSD drives provide up to 12 Gb/s throughput to speed data transfer for analytics workloads. Customers will notice the difference in performance, unlocking the power of their analytics applications and giving them immediate competitive advantages from their data.

Common management
These Apollo servers integrate seamlessly into traditional enterprise data centers with the same rack dimensions, cabling, service options, administration procedures, and tools. The Apollo family is the ideal bridge system for enterprises that want to implement a purpose-built big data server infrastructure today and scale in affordable increments.

Enable workplace productivity with HPE

Figure 1-22 Enable workplace productivity with HPE

Your customers cannot afford to ignore technologies that allow their employees to use the network, communicate, and collaborate in new and more productive ways. Under constant pressure to work faster and smarter, your customers' employees need real-time access to information, whether they are on the road or in the office.

With HPE Moonshot for Citrix XenApp, shown in Figure 1-22, your customers can quickly scale app delivery solutions to hundreds or thousands of users. Whether your customers choose solutions that streamline application delivery, offer hosted desktop infrastructure, or deliver a mobile workspace, HPE and Citrix offer what your customers need to boost mobile productivity while maintaining IT operational control. Innovative HPE Moonshot with Citrix solutions enable your customers to
• Address specific mobile workspace challenges and requirements
• Improve compliance and security, with all data residing on centralized servers, enabling IT to have greater control over apps and data
• Boost cost-efficiency by using the right compute for each specific workload, so there are no wasted resources
• Improve space and environmental efficiency (HPE Moonshot's high-density design reduces space, cooling costs, and the energy footprint)
• Support up to 2000 users in a single HPE Moonshot chassis

With HPE Moonshot and Citrix, organizations receive the technology they need to boost mobile productivity and speed innovation while maintaining IT operational control and improving operational efficiency.

HPE optimized compute portfolio

Figure 1-23 HPE optimized compute portfolio

HPE has a portfolio of purpose-built solutions for a variety of workloads, as you see in Figure 1-23. These platforms support both scale-out and scale-up architectures to meet workload requirements. The sections below provide a brief overview of the solutions. You will dive into greater detail on the HPE Apollo, Moonshot, and Integrity Superdome X solutions throughout the rest of this ebook.

Scale-out compute
The scale-out compute part of the portfolio includes the Apollo family: the Apollo 2000 for general-purpose scale-out compute, the Apollo 4000 family for big data and object storage, the Apollo 6000 and 8000 for HPC, and Moonshot for next-generation apps. Apollo systems provide leading storage density along with compute performance flexibility and the same iLO, APM, and Insight CMU management to meet the needs of a full range of big data workloads. This combination delivers leading space and power efficiency while lowering overall TCO.

Apollo 6000 delivers rack-scale efficiency for HPC with
• Up to 4x better performance per dollar per watt when compared to the competition
• Leading performance per dollar per watt

HPE Moonshot System is unlike any other server that exists today. It is a huge leap forward in infrastructure design that delivers breakthrough efficiency and scale by aligning just the right amount of compute, memory, and storage to get the work done. The idea is very simple—replace general-purpose processors with more energy-efficient systems-on-chip (SoCs) containing integrated accelerators tailored for specific workloads.

Scale-up compute
The scale-up compute part of the portfolio consists of the Integrity Superdome X platform, the ProLiant DL580 and DL560 servers, and the BladeSystem BL660c series. HPE is the only vendor unifying UNIX and x86 with a single architecture, giving customers choice and investment protection from a suite of products for mission-critical workloads (Integrity Superdome, NonStop, MCx86). HPE has enabled x86 workloads (Linux, Microsoft Windows) on the Integrity Superdome X server platform. The HPE ProLiant DL servers are general-purpose rack servers for enterprise applications. They deliver top performance, reliability, and efficiency for on-premises and cloud-hosted database, data warehouse, consolidated/virtualized IT apps, and high-performance computing workloads.

Chapter 2 Gather Customer Requirements

EXAM OBJECTIVES
• Identify key decision makers and explain how to engage them in a discussion about the company's business requirements and challenges
• Obtain data and documentation required to understand the company's business requirements
• Explain best practices for creating requirements statements and documents

Assumed knowledge

Before reading this chapter, you should meet the following criteria:
• Knowledge of processors, DDR3 and DDR4 memory, hard disk drives (HDDs), solid state drives (SSDs), and RAID levels for storage volumes
• Experience with HPE ProLiant rack and blade servers and options for them, such as HPE Smart Array Controllers
• Knowledge of HPE BladeSystems, including interconnect modules and Virtual Connect (VC) modules
• Experience managing and maintaining servers, including iLO, Intelligent Provisioning, UEFI, HPE Insight Remote Support, HPE Insight Online, HPE Smart Update Manager (SUM), and HPE Insight Control server provisioning (ICsp)
• Thorough familiarity with HPE OneView capabilities

Chapter topics

In this chapter, you will learn strategies for gathering information about customer requirements, including business continuity and availability requirements, IT management requirements, and facility requirements. You will also review key decision makers and their top-of-mind issues so that you are better prepared to engage them in discussions about their company's business and technical requirements.

Customer requirements

You will begin by examining general strategies for discussing requirements with customers, documenting information about the customer's existing solutions and needs, asking questions at the right level for each decision maker's business role, and defining meaningful and effective requirements statements.

Understand the scope and constraints of the design

Before you begin collecting the basic requirements from the customer, you should understand the scope of the project. Specifically, you should obtain from the customer a clear definition of the following:
• General scope and purpose—Obtain a basic understanding of the scope of the design. Are you designing a server solution for a new application? Or does the customer already use the application and need a hardware refresh to improve performance? Or are you simply scaling out an existing solution? You should have a clear understanding of what the customer needs from the solution in general terms, and you should make sure that you and the customer agree completely upon the scope and purpose before you begin your design.
• Implementation timelines and timeframes—Understand when the customer expects or plans to have the solution completely installed and operational. Defining the timeframe also includes defining whether the customer intends to implement the solution all at once or in phases. If the customer has not offered up the reason for the specific timeframe, you might want to ask. If a deadline is particularly tight, you should understand what is at stake and what is pushing an accelerated timeframe.
• Budget—While some would argue that a proper design should drive the architecture, the fact is that budget will likely play a large role in many projects. You should understand the budget for the project and also try to assess what kind of leeway you have to exceed the budget. Budget needs to take into account not only hardware but also installation services and any training that the staff might need to become comfortable with the implementation. Customers often ask for these components to be broken out into separate sections for clarity.

In some cases, you might wish to present a customer with multiple designs. One design might meet the budgetary constraints. Another design might exceed the budget but will show decision makers the features and functionality that they would get if they were to increase the budget for the project. A contrast and comparison between two designs, one optimal and one that fits within the budget, might open the door to helping you create a better solution for the customer. When making these comparisons, ensure that you are focusing on the business needs of the customer. Decision makers need a clear understanding of how a cost-reduced solution will impact business operations.

As you gain an understanding of the scope and constraints of the design, you should do your best to reflect them accurately to the customer. You should both be in agreement on these parameters before you begin your design, so it is important that there is no misunderstanding between you and the customer.

Focus on business requirements

Figure 2-1 Focus on business requirements You should always begin your process of defining solution requirements at a high level. Server solutions address business issues, and customers seek new solutions because they have a need or a problem that they hope these solutions can ameliorate. The high-level solutions are useful because they remind you that decision makers are pushed on one side by the problems that they have encountered in the past and pulled on the other side by the benefits that they hope to achieve. You can win these customers with a design that addresses the problems and provides new benefits. For example, the company might be attempting to improve operations, address existing deficiencies, or reduce the company’s risk. Figure 2-1 provides some examples of these objectives. Note that these high-level objectives are not intended as rigid divisions. For example, one company might look at obtaining more compute power per rack unit as a way of improving efficiency. Another customer might have an application that does not perform well, but be prevented in scaling out due to physical constraints. For this customer, obtaining more power per rack unit is a way to address current deficiencies. You will also need to use your understanding of the current trends in security threats, your knowledge of existing security measures, and your understanding of the customer ’s security requirements to create a solution that helps the company reduce their risk. You need to be familiar with the regulations that govern particular industries. For example, in the

United States, companies that provide health care must comply with the Health Insurance Portability and Accountability Act (HIPAA). Retail organizations must comply with the Payment Card Industry Data Security Standard (PCI DSS 2.0). Because most companies’ business activities extend beyond country borders, you must understand the company’s overall requirements for complying with regulations. Most regulations extend well beyond personal firewalls and local data encryption into extensive security requirements. Even when regulations do not specifically state security requirements, companies will want to add extra layers of protection for applicable servers and data in order to protect themselves from security breaches that could result in fines or other types of penalties. These fines can add up quickly, since each piece of data that is compromised can incur a fine. One security breach could compromise thousands of records and incur fines that can place an organization’s financial stability at risk.

Begin to identify the applications and workloads that the solution must support

In your initial discussions, you should ask which applications and workloads the solution must support. For example, ask about applications such as OLTP, big data analytics, and cloud-native applications. (These are just a few examples; it is by no means a comprehensive list.) You will focus on designing solutions for particular applications and workloads in Chapter 3.

Assess business continuity and risk management requirements

You need to ensure that the customer can provide its services with minimal interruptions. Service interruptions can lead directly or indirectly to lost revenue, depending on the importance of the application. For example, if a transactional database goes down, the company’s operations grind to a halt. You can calculate the cost to the company with this formula:

(Annual revenue / Annual hours) × Business reliance on service = Revenue lost per hour of downtime

The other main risk that you need to help the customer mitigate is that of lost data, which might occur through hardware failure. Again, the risks are greater for mission-critical data, such as that stored in transactional databases. Help the customer to consider all the costs of unplanned downtime and data losses (a short worked example follows this list), including
• Lost revenue—As mentioned above, an outage or data loss can cause the company to lose revenue. Also consider the impact that losing data related to the revenue stream could have on the company’s balance sheet. If income can only be reported after the data is manually entered following an outage, this could equate to revenue not being recognized on a balance sheet until weeks after the outage. This can be more significant if the outage occurs during a prime processing time, or if the data is not entered until the next fiscal year of the organization.
• Damage to reputation—How does downtime affect the brand or the reputation of the company?
• Impact to human resources—How are the personnel in the company affected by an outage? Does downtime equate to late nights or weekends in order to make up for lost productivity? Does it mean filling out paperwork by hand, only to have to re-enter the data into the system at a later time?

• Impact to regulatory compliance or contractual obligations—Will a service outage jeopardize compliance or create a breach of contract? If so, what are the ramifications of noncompliance or a breach of contract?
• Cost to recover—What are the actual costs to recover from a failure? Aside from the actual cost in dollars, what kinds of reactions from senior-level managers and executives will a service outage provoke?
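To make the downtime-cost formula above concrete, here is a minimal Python sketch. All figures are hypothetical and for illustration only; in practice you would plug in the customer’s actual revenue and an agreed-upon reliance factor.

```python
# Estimate revenue lost per hour of downtime.
# Hypothetical inputs -- replace with the customer's real figures.
annual_revenue = 78_000_000_000   # annual revenue in US dollars
annual_hours = 8760               # hours in a year (24 x 365)
business_reliance = 0.9           # fraction of revenue that depends on the service

revenue_lost_per_hour = (annual_revenue / annual_hours) * business_reliance
print(f"Revenue lost per hour of downtime: ${revenue_lost_per_hour:,.0f}")
# -> Revenue lost per hour of downtime: $8,013,699
```

Even rough numbers like these help decision makers weigh the cost of additional redundancy against the cost of an outage.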

Quantify availability

Your customer might request specific availability levels. When you calculate availability, you should look at these metrics:
• Mean time between failures (MTBF)—A measure (in hours) of the time between failures or outages. This is sometimes also referred to as mean time between service outages (MTBSO).
• Mean time to repair (MTTR)—A measure (in hours) of the time it takes to recover from a failure.

Availability is expressed as a percentage that is derived from MTBF and MTTR:

Availability = MTBF / (MTBF + MTTR)

For example, if the MTBF is 4000 hours and the MTTR is 1 hour, availability is 4000/(4000 + 1) = 0.99975, or about 99.98%. This is roughly four 9s availability; the number “four” refers to the number of nines in the percentage of uptime. Desired availability generally ranges from three 9s to five 9s. Table 2-1 provides the allowable downtime based on the required availability.

Table 2-1 Availability calculations
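The table summarizes straightforward arithmetic; the following Python sketch derives both the availability example above and the allowable-downtime figures for the common availability targets discussed in this section.

```python
# Availability derived from MTBF and MTTR, both in hours.
def availability(mtbf_hours, mttr_hours):
    return mtbf_hours / (mtbf_hours + mttr_hours)

print(f"{availability(4000, 1):.5%}")    # -> 99.97501% (the example above)

# Allowable annual downtime for common availability targets.
hours_per_year = 8760
for target in (0.99, 0.999, 0.9999, 0.99999):
    downtime_hours = (1 - target) * hours_per_year
    print(f"{target:.3%} uptime allows {downtime_hours * 60:,.1f} minutes "
          f"({downtime_hours:.2f} hours) of downtime per year")
```

For instance, five 9s (99.999%) allows about 5.3 minutes of downtime per year, while 99% allows roughly 87.6 hours.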

As you discuss the availability requirements with the customer, it is important to specify a timeframe. A company might be able to accept a cumulative downtime of 87 hours in a year, but could not tolerate an outage of more than an hour or two on any given day. Discuss with the customer the absolute requirements for applications in the smallest timeframe that is acceptable. The higher the availability, the greater the cost, so you should encourage the customer to invest in greater availability for the truly mission-critical applications. For example, a transactional database handles mission-critical operations and data; the customer should invest in a solution that provides 99.999% availability here. On the other hand, a server that supports just one node in an HPC cluster

does not need to provide the highest levels of availability. Study the information in Table 2-2 to learn more about the severity levels for applications.

Table 2-2 Application criticality

Mission-critical
• Requires 99.99% or 99.999% availability.
• Downtime will disrupt the core business operations on which the customer bases its mission; downtime will cause large-scale loss of revenue, loss of business, loss of productivity, loss of reputation, or otherwise significantly harm the company financially.
• Downtime impacts multiple segments of the company.

Business-critical
• Requires 99.9% availability.
• Downtime will disrupt employees’ ability to do their jobs and might indirectly lead to loss of revenue, loss of business, loss of productivity, or loss of reputation if sustained.
• Downtime might impact one segment or several segments of the company.

Noncritical
• Can tolerate 99% availability.
• Downtime does not pose a significant risk or will not cause significant loss of revenue.
• Downtime affects one or only a handful of individuals within a group or segment of the company.

Create effective requirements statements

After you identify the high-level business needs, you must then transform them into specific, clearly defined requirements statements. A precise requirements document not only reassures the customer that your solution will align with their needs and vision, but it also protects you from unwarranted blame in the future. If a customer later indicates that the solution does not meet specific criteria, you can turn to the design requirements to show that those criteria were either not listed or not given priority. Thus, you have protected yourself from an unfavorable situation.

Some customers will come with an RFP that already has specific requirements. Others need help elaborating their high-level needs into more precise ones. For each business need, you should create several design requirements statements of increasing specificity. Each requirements statement should accomplish the following:
• Accurately reflect what the customer desires
• Define the requirement at a precise enough level that you can design a solution that unambiguously meets the requirement
• Assign a value that places that requirement into a hierarchy based on its importance to the customer

The IETF recommends the use of key words in the construction of requirements statements (see RFC 2119). Table 2-3 provides some examples.

Table 2-3 Creating effective requirements statements

Critical (absolute must)
Keywords: Must/Shall/Required or Must not/Shall not
Examples:
• The server that hosts the application virtualization controller MUST remain available with the failure of up to one link.
• Data MUST remain available with the failure of up to one disk.
• The NoSQL database solution MUST be able to perform 1,000,000 read or write operations per second.

High (preferable, but not an absolute requirement)
Keywords: Should/Recommended or Should not/Not recommended
Examples:
• The NoSQL database solution SHOULD be able to perform 1,500,000 read or write operations per second.
• The solution SHOULD support automated OS provisioning.

Low (desirable, but not at all required)
Keywords: May/Optionally
Examples:
• The server MAY use load balancing on its adapters.

Create a requirements traceability matrix (RTM)

Figure 2-2 Create a requirements traceability matrix (RTM)

As you define the business and technical requirements more precisely, you can begin to plan the technical tasks that support those requirements. A Requirements Traceability Matrix (RTM) such as the one in Figure 2-2 helps you to track the requirements throughout a project and ensures that each is fulfilled. Use the RTM to define each task required to fulfill the requirement. Fully define the task, including the deliverables that will indicate that the task is complete. A simple sketch of an RTM structure follows.
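There is no single required format for an RTM; spreadsheets are typical. As a minimal, hypothetical sketch, the entry below shows the kinds of fields you might track for one requirement, reusing a requirements statement from Table 2-3. The field names are illustrative, not a standard.

```python
# One hypothetical RTM entry; field names are illustrative only.
rtm = [
    {
        "req_id": "REQ-001",
        "requirement": "Data MUST remain available with the failure of up to one disk",
        "priority": "Critical",
        "task": "Configure RAID 1 volumes on each database server",
        "deliverable": "Array configuration report showing mirrored volumes",
        "status": "Open",
    },
]

# Before sign-off, confirm every requirement is traced to a task and a deliverable.
for entry in rtm:
    assert entry["task"] and entry["deliverable"], f"{entry['req_id']} is incomplete"
```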

Discussions with decision makers

You will now consider the various decision makers you typically work with to gather the information you need to propose a solution. You will consider the concerns that drive each decision maker so that you are prepared to engage each one in a meaningful discussion about their company’s requirements. You will later be able to pitch a solution to each one.

Identify key decision makers

Figure 2-3 Identify key decision makers

You will meet and talk with different types of decision makers, such as those you see in Figure 2-3. You should be aware of each decision maker’s top-of-mind issues so that you can tailor your questions to the role. Business leaders such as Chief Executive Officers (CEOs) will be able to answer questions about the company’s strategic goals or give you an idea about the future expansion of the company. However, you should not ask CEOs detailed technical questions, such as specifics about application architecture.

Customers might start with general statements. They might not know exactly what they need. You must act as a sounding board, soliciting increasingly detailed information that you can then consolidate into specific and concise requirements statements. For example, a customer might begin by explaining that they need a server refresh to support big data analytics. You must draw out more information. What type of analytics does the customer plan to use? Is the customer dealing with structured data in a SQL database or dealing with unstructured data? Will users run queries on older data that might take several hours to complete? Or is the customer looking for real-time results?

As you follow this process, remember the difference between business, technical, and financial questions:
• Business questions—Ask how IT impacts the business, avoiding technical details and focusing on the underlying business needs.
• Technical questions—Ask about the specific technology or solutions in place or required to meet business needs.
• Financial questions—Ask questions to determine the available budget for the solution.

You should also begin to determine how the company measures success or failure. For example, how does the company determine if the solution delivers the business outcomes it seeks? You will explore total cost of ownership (TCO) and return on investment (ROI) in Chapter 10, but at this point you should get an idea of what the financial decision makers’ expectations are and how they measure the value of their investment.

The questions, of course, overlap. The business questions direct you toward particular high-level business needs, while the technical questions help you to figure out the best ways to approach

fulfilling those needs. You need to remember that technical questions should always flow from business requirements, particularly when you hear different information from different individuals. Sometimes these individuals are simply phrasing things differently, but are essentially saying the same thing about what is required. But other times you are truly hearing differing messages that you need to resolve with the key decision maker. You will now take a closer look at key decision makers and review each one’s concerns and roles so that you can tailor your message accordingly.

Understand CEO’s requirements

CEOs focus on the overall corporate strategy. (See Table 2-4 for their top concerns.) When you engage with one of these decision makers, stay focused on business requirements and benefits. For example, a CEO’s top priority might be bringing products to market more quickly. In this case, you must be able to explain how a given solution can help the company achieve this goal.

Table 2-4 CEO’s concerns

Who: CEO (Business decision maker)
Concerns:
• Increasing market share and profitability
• Gaining competitive advantage
• Reducing costs
• Improving the customer experience
• Enhancing productivity
• Mitigating risk
• Enhancing shareholder returns
Roles:
• Approving early decisions about major business initiatives

Understand Line of Business manager’s requirements

In the past several years, as IT solutions have become embedded in all aspects of the business, line of business (LOB) decision makers (such as the VP of Product Development or VP of Sales) have taken an increasingly prominent role in IT purchasing decisions. According to a survey by Harvard Business Review, an average of 5.4 people have “formal sign off on each purchase.” Furthermore, these people have a variety of jobs and functions and are even located in different geographies. (See “Making the Consensus Sale,” March 2015.)

These decision makers will quickly focus on their business outcomes and priorities. When an LOB manager has goals and responsibilities that rely on IT applications, the manager will have firm opinions and detailed requirements about how the application performs. These requirements will not be detailed in the technical sense—an LOB manager probably would not tell you that a server must provide a specific amount of memory. But they will be detailed in terms of what the managers need to gain from the solution. For example, the VP of Product Development might require the electronic design automation (EDA) application used by designers to complete jobs within a certain amount of time. Table 2-5 outlines the LOB manager’s roles and top concerns.

Table 2-5 LOB manager’s concerns

Who: LOB manager (Business decision maker)
Concerns:
• Identify tools that will make employees more productive
• Identify applications that will attract new customers and retain existing customers
• Work with IT to obtain the tools and applications the business units need and get them implemented in a timely manner
Roles:
• Obtain the solutions that will drive business
• Attract new customers
• Retain existing customers
• Improve customer experience

Understand CIO’s and IT VP’s requirements

Like CEOs, Chief Information Officers (CIOs) and IT VPs focus on the overall corporate objectives. However, CIOs and IT VPs have the specific responsibility of driving IT strategy to deliver these overall objectives. For example, they must ensure that the company complies with regulations, thereby reducing the company’s risk. (See Table 2-6 for a summary of their concerns.)

The IT VP might also be responsible for defining policies and best practices, and any change or new solution must fit within these practices. For example, the customer might have policies about how data is stored in order to prevent data compromise or data loss. These policies are often part of a larger set of security policies and best practices, defined by the IT directors or perhaps a Chief Information Security Officer (CISO). Other policies might define minimum requirements for the infrastructure used for particular applications.

Furthermore, CIOs and IT VPs are motivated by budget concerns and might be frustrated by the high costs of operating a data center. You could interest such a customer in solutions that will reduce operating expenses. For example, you might suggest a density-optimized solution that delivers more compute power in a smaller space and with reduced power requirements.

Table 2-6 CIO’s and IT VP’s concerns

Who: CIO and IT VP (Business and high-level technical decision maker)
Concerns:
• Upholding SLAs
• Reducing costs
• Ensuring compliance with regulations
• Ensuring network security
Roles:
• Driving IT strategy
• For SMBs (VP/Director IT), controlling all infrastructure
• Controlling budget

Understand IT director’s and manager’s requirements

IT directors and managers focus on the technical level—although they must still understand how technical solutions and decisions affect the business. IT directors and managers are responsible for day-to-day operations. Table 2-7 lists some of the many day-to-day operations that IT must handle. First and foremost, they are concerned with keeping the data center running efficiently and mitigating operational risks—whether those risks come from outside threats, improperly scoped hardware, or faulty change management practices. They want to do more than keep the data center running. They want to improve uptime and minimize the time and effort required to manage and maintain the solution.

Some of your customers might be in the process of implementing Information Technology Infrastructure Library, or ITIL, which defines the organizational structure and skill requirements of an IT organization. ITIL also imposes a formal process for managing incidents, problems, configurations, changes, releases, and even the service desk itself. (For more information, visit

www.itlibrary.org.) Larger companies might also have a solution architect. These decision makers will advocate for the right hardware for their workloads.

Table 2-7 IT operations decision makers

Who: Server Operations Director or Manager (Technical decision maker)
Concerns:
• Avoid operational risks
• Improve uptime
• Minimize time and effort required to manage and maintain the solution
• Enforce data standards and security policies
• Manage patch/software releases
• Implement ITIL (www.itlibrary.org)
Roles:
• Running core and edge networks
• Managing changes, upgrades, maintenance, and troubleshooting

Who: Solution Architect or Planning Engineer (Technical influencer and possible decision maker)
Concerns:
• Ensuring infrastructure meets application needs
• Minimizing the time and effort required to manage and maintain the solution
• Ensuring the long-term viability of the solution
Roles:
• Ensuring the long-term viability of the solution
• Directing changes to the solution

Understand CFO’s requirements

In most companies today, IT budgets remain flat, but expectations for IT solutions continue to increase. LOB managers hold IT departments accountable for services provided, and managers and employees alike demand less downtime and increased productivity. Meeting these challenges requires not only higher productivity and better utilization of IT assets but also an alignment between business goals and IT objectives.

CFOs are interested in solutions that reduce costs and increase the efficiency of IT operations through productivity tools and improved utilization of resources. CFOs are also looking for risk mitigation technologies that allow them to make more informed decisions and maximize business profits. The CFO is focused on ways to control expenditures by tracking and consolidating all IT expenses by asset, project, contract, and owner. The CFO wants to maximize the value of existing assets and support intelligent financial decisions while being able to capture, monitor, measure, and manage costs associated with assets, contracts, or projects. The CFO needs to reduce costs by retiring, offshoring, or outsourcing IT services while still delivering IT services on time, within budget, and with established quality standards. Table 2-8 summarizes the CFO’s requirements.

Table 2-8 CFO’s requirements

Who: Chief Financial Officer
Concerns:
• Controlling IT spending
• Tracking operational and capital expenditures holistically
• Increasing availability and performance of revenue-producing and revenue-tracking applications to prevent revenue loss
Roles:
• Managing financial risks of the business
• Aligning IT needs and shrinking corporate budgets
• Financial planning, record keeping, and reporting

Understand procurement manager’s requirements

With the increasing attention on fiscal responsibility and improved management, the procurement manager is challenged to control costs, mitigate compliance and security risks, and provide information to drive business decisions, as you see in Table 2-9. The procurement manager is focused on improving measurement tools to provide visibility into how the IT organization is doing, while finding the right resources at the right price. The procurement manager is often in charge of gathering the other individuals involved in the buying process.

Table 2-9 Procurement’s requirements

Who: Procurement manager
Concerns:
• Enforcing corporate standards
• Obtaining services and products the business needs to operate
• Finding the right resources to enable IT, at the right price
Roles:
• Provide information to drive business decisions
• Control all procurement processes
• Purchasing decisions

Maintain awareness of political climates

In addition to being able to discuss business issues and solutions with different decision makers, you must always assess the politics and culture of the company. By understanding and navigating the customer’s politics, you can create a proposal that the key decision makers are more likely to adopt. You should be aware of political factors such as these:
• Organization of IT—Does the company have converged teams that include both server and networking specialists? Or is there a server team and a networking team?
• What are the group dynamics? Do you notice any hostility between certain groups? Does one group seem to have more to say than another group? Do some groups have similar objectives but cannot see eye-to-eye on a solution?
• Remember to ask about the ramifications of the server solution design. Who will take the most responsibility for the success or failure? What are the rewards, and who will be rewarded for success? What are the ramifications for failure?
• Impact to employees, customers, or partners—Who will the server design affect? Will certain groups within the company be taking on more or less responsibility after the design implementation? If certain groups will be taking on less responsibility, does this imply downsizing? Is it possible that some of the people helping you with the design could be eliminated after the design is completed and implemented?
• History of organization—Find out as much about the history of the previous IT implementations as possible; this will help you avoid past pitfalls. If previous successful implementations immediately led to downsizing, this might hinder employees’ willingness to implement new technologies. If the organization has a history of poor implementations, then it is vital to find out the source of the failures to avoid repeating these mistakes.

While your job as a server architect is not to moderate the political climate, you might find that understanding a company’s dynamics can make it easier for you to get the job done. If you can see the

commonalities between different groups within the company, you can help design solutions that meet the needs of a broader segment of the company. If you understand who stands to gain from the new server solution, you might be able to make a friend or an ally who can provide you with the data you require. And if you understand how the server solution design will impact groups and personnel within the business, you may gain better insight into why certain individuals are not as forthcoming as they might otherwise be.

Gather information about new requirements

You now have a sense of the many different stakeholders with whom you will interact. Next, you will turn to exploring strategies for collecting the information you need from these decision makers in order to architect a solution that meets their needs. You can take several approaches to gathering information about the executive, IT operation, and LOB requirements, and you have probably used at least some of these many times. In addition to meeting with decision makers for personal interviews, you can ask these decision makers if they would work with you in conducting user or IT staff surveys and questionnaires. Such surveys can uncover issues and pain points of which high-level decision makers are less aware. They can also give you valuable information about how employees actually use applications and what they expect from the solution.

You might be designing a solution to host a new application that the customer is rolling out. But often you will be providing an upgrade intended to deliver better performance, greater efficiency, or greater scale for an existing application. You need to understand as much as possible about the application and the current solution. Request information such as current server specifications, logical topologies, and application architectures. Also, request information about current performance and resource utilization. This information will prove invaluable as you design your solution. For example, if you know that the existing servers are constantly reaching their memory limits, you would know to provision more memory for the corresponding servers in your solution.

It is important to note that you should treat any document that the customer provides you with respect and confidentiality. If you have not already done so, be prepared to sign a Non-Disclosure Agreement (NDA) or some other form of confidentiality agreement before gaining access to this information. When a customer has sensitive government information, you may even be required to have a security clearance.

You should also treat the documentation the customer provides as subject to some uncertainty regarding its accuracy. IT jobs are demanding, and keeping documentation up to date is not always a priority. The information the customer provides is there to help inform the decisions that you make when you design the customer’s solution. A rule of thumb is to verify the revision date of any document that the customer provides you. The further the current day is from the revision date, the less credence you should give to the document, even if the customer asserts that things have not changed.

It can be surprisingly easy for important requirements to remain undefined when you rely solely on discussions. You can job shadow an SME to learn exactly how applications are architected and used, to uncover the precise infrastructure requirements, and to gain insight into IT processes. You might also be able to uncover pain points or inefficiencies that you can solve in your solution, making the solution more attractive to the customer. For example, you might observe that server administrators

spend a significant amount of time provisioning servers with their OS or that they struggle to give you the information that you have requested about resource utilization. You would then know that the customer might be a candidate for a provisioning and monitoring solution such as HPE Cluster Management Utility (CMU).

Ask questions

You will now explore some of the different types of questions that you might ask during the personal interview: Verification, New information, Golden nuggets, Opinion, and Commitment. You do not need to memorize these categories or worry about determining whether a specific question fits one category or another. What is important is that you consider all the types of questions that you can ask and know how to ask appropriate ones.

Verification

Figure 2-4 Verification

You will be gathering a great deal of information from many different sources. You must verify that you have understood what stakeholders have told you, seeking to avoid assumptions that could lead to design errors. Your sales partners pass on some information to you, but this information is often high-level, and you must verify and deepen it. In the example in Figure 2-4, the architect confirms that he or she understands all of the EDA tools a customer is using in order to ensure that the high performance computing (HPC) solution meets the customer’s needs.

New information

You must ask many questions to uncover new information. Even if you have worked with a customer before, do not assume that you understand the environment. Ask for updates. Use these types of questions to work with SMEs to fill in any knowledge gaps. For example, you are often planning a server refresh in order to deliver better performance for an application that a customer is already using. You need to learn as much as possible about how the application is performing now so that you can understand what needs to change. In the ongoing EDA example, you might ask, “Have administrators monitored resource usage during analysis jobs and discovered any overutilized resources?” And although your sales partners have probably already discovered many business needs in preliminary discussions, keep your ears open for other business requirements that you might be able to meet.

Golden nuggets

You will find “golden nuggets” of information as you ask fact-finding, problem identification, and implication questions that lead decision makers toward understanding the importance of the solution that you will propose in meeting their business requirements. You begin by finding facts about the current state of affairs. For example, you might say, “I’m told that EDA jobs can take hours to run. What do designers do while they wait?” The customer’s answer will probably point toward a problem. Your next question should make that problem explicit: “Did I understand correctly that designers cannot continue working while they wait for their jobs to complete? Is the backlog affecting deadlines?” After the customer has acknowledged the problem, you can draw out the implications, pointing the customer toward the ways in which your solution can solve the problem and meet the customer’s business requirements. For this example, you might ask, “Could you bring products to market more quickly if you had hardware that could better support your EDA jobs?”

Opinion

Sometimes asking decision makers to share their opinions is the best way to discover unidentified issues. For example, you could ask the managers of the department using an EDA application, “Do you believe that your designers have the help they need to work efficiently?” Questions like this can reveal requirements that stakeholders might not otherwise have mentioned, but that can transform your proposal from a merely adequate one to the one the customer chooses to implement. Such questions also demonstrate to stakeholders that you care about their issues and opinions.

Finally, you can gain valuable information about stakeholder attitudes. As you know, when you plan a solution, you are not only wrestling with technical requirements but also with political issues. Do key stakeholders have a bias toward particular types of solutions or designs, such as InfiniBand versus Ethernet for an HPC interconnect? Do they seem likely to want the best solution money can buy, or do they want you to balance their requirements with their budgetary constraints?

Commitment

This last question category is intended to help win decision makers to your side and to make them more likely to commit to your proposal. Acknowledge stakeholders’ expertise and ask for their honest thoughts about the project and their objectives. For example, “You have been in this role for a number of years, and I am still learning about the organization. What are your thoughts about this project and the intended goals of the project?” When you know what is important to the stakeholder—and when the stakeholder knows that you value what they value—you can create a proposal that the stakeholder is more likely to accept.

IT management requirements

You should also become familiar with the customer’s IT processes and governance requirements. You can then recommend the appropriate management solutions and lifecycle services to meet the

customer’s needs.

Management domains

Figure 2-5 Management domains

Customers often divide IT management into various domains, such as the ones shown in Figure 2-5. The domains of most relevance to you are hardware management, software management, and facilities management. However, modern data centers require more convergence and cooperation between teams. The customer might have a siloed IT governance culture, but it is important that you avoid falling into the silo trap. A storage or network manager might have a crucial piece of information about the current data center infrastructure that will affect your design. For example, in order to propose uplink modules for HPE Moonshot chassis, you must understand how the chassis will fit in the data center network. Take care to involve all stakeholders in discussions to avoid changes to plans at a late date.

Understanding a company’s IT governance processes and a customer’s particular goals for a project can also help you to offer the correct solution to the customer. For example, sometimes customers have more or less standalone projects. They are deploying a new application and want to get the complete infrastructure required for that application without extensive efforts across siloed management teams. Offer these customers HPE ConvergedSystems, which bring together the servers, networking, and storage required for various applications—delivering a proven solution that is up and running in a fraction of the time of a typical project.

Management and monitoring tools

Figure 2-6 Management and monitoring tools

Customers require tools that operate at several different levels, as shown in Figure 2-6. Element-level tools manage and monitor a single component, such as one server. Resource pool tools manage multiple resources, such as all of the customer’s storage arrays or servers. Finally, a solution stack tool manages and monitors converged resources, including all of the compute, storage, and network resources required for a solution. All of these management tools can have a role to play within a customer’s overall processes. You will learn about HPE solutions that you can recommend at each level.

IT processes and HPE solutions to transform them

As you meet with the customer, you should assess the level of standardization and automation that the customer has already achieved with its IT processes, as well as the level that the customer wants to achieve. You can help customers understand that they can avoid costly human errors, reduce IT operational costs, and roll out applications more quickly by standardizing, automating, and aligning the infrastructure with LOB requirements. You can then propose the HPE management solutions that support the level of transformation that the customer wants. (You will learn more about these solutions throughout this ebook.)

Chapter 2—Activity 1

To review the information you have learned, spend a few minutes completing an activity. You will read about a fictitious company called Make Things Better (MTB) and try to uncover the company’s pain points and identify key initiatives.

The situation

MTB is a large manufacturer of health products, pharmaceuticals, and consumer packaged goods. The company’s tagline is “Products for a healthier and happier world.” It is perhaps best known for a groundbreaking medication that slows the ravages of Alzheimer’s disease. Headquartered in the New York metropolitan area, MTB comprises about 250 subsidiary companies with operations in 63 countries and products sold in 172 countries. The company had worldwide sales of $78 billion during 2013. (Note: All financial figures in this scenario are US dollars.) MTB employs about 110,000 people worldwide.

MTB has two enterprise data center pairs, one pair in Pennsylvania, USA, some 35 miles (~56 km) apart, and another pair in the Netherlands about 10 miles (~16 km) apart. It also has six regional data centers, located in Brazil, Australia, Singapore, South Africa, India, and China. In addition to these data centers, MTB has 160 remote locations and roughly 105 manufacturing sites all around the globe, each with varying IT requirements.

You are a solutions architect with HiP Solutions, an HPE partner based in Brooklyn, New York, and you have an ongoing relationship with MTB.

Pain points

Read about MTB’s pain points below and then answer the multiple choice question that follows.

Over the years, MTB has allowed its business units to shape their own IT solutions, even as it has tried to wrap some global governance policies around IT in an effort to streamline operations and improve the procurement process. Practically speaking, however, this has not worked, and MTB is experiencing issues with its aging data center, such as outdated environments, nonstandard products, different vendors, and a mismatch of tools. In addition, MTB’s subsidiary companies that manufacture pharmaceutical products must comply with local laws and regulations.
1. Which statement best describes the insight you gained about the customer’s pain points?
a. Although MTB wants to improve its global governance of IT, it still has a massive distributed behavior.
b. Replacing MTB’s aging data center infrastructure with HPE solutions will ease the customer’s difficulties while allowing it to continue functioning with separate business units.
c. MTB has reached the cloud-readiness stage, but needs help moving from a CAPEX to an OPEX model.
d. MTB’s number one priority is to document the different IT governance and procurement policies in various business units.

You can check your answer by referring to Appendix B: Answers to Activities.

HPC, R&D, and Big Data

Read about MTB’s HPC, R&D, and Big Data initiatives and then answer the following multiple choice questions.

To address MTB’s top-line goal of speeding up manufacturing, the MTB research group is designing a new manufacturing process that requires unusually high levels of speed and efficiency. This group is driven by the innovations they have been able to realize from new compute capabilities. As the CIO, Amrita Deva, said about the group’s activities, “There is no end to the insatiable demands for every increment of HPC from this research group.” The biggest challenge for MTB in this space is maximizing compute power within budget for capital expenditures, personnel, and facilities. The group has looked at the Open Compute Project (OCP), which is championed by Facebook and other large IT companies, but members are worried about the commitment of low-cost manufacturers to a given product platform because MTB’s projects can last for years.

You have lunch with a friend who is a scientist in the pharmaceutical field. She tells you casually that the IT department of one of MTB’s local business units is looking to refresh its HPC environment. Even if you do not know the details of a customer’s HPC environment, it is still possible to demonstrate the value of an HPE solution. The HPE Hyperscale Business Value Calculator enables you to compare, by workload, a density-optimized solution to a traditional rack system. Using a simple comparison of an Apollo 6000 solution with a SuperMicro SD-5038ML-H8TRF with no customizations, including list pricing, you can show 11% TCO savings over three years. This number should be enough to pique the customer’s interest.
2. What does this information tell you about HPC and MTB? (Select two.)
a. MTB’s new manufacturing process and the lead from your friend tell you that HPC is a hot topic within MTB.
b. Only one or two MTB business units and operating companies are looking at HPC.
c. If you can bring HPC into the SDDC environment, a flexible pool of HPC resources might benefit MTB in general and also allow for better control through centralization.
d. MTB may be interested in the HPC solution, but only because you have demonstrated that it may result in cost savings.

3. MTB’s big data environment for clinical trials currently resides on the Teradata platform. This environment is starting to become a bottleneck, so Teradata has recently submitted a proposal for expansion. One of your coaches told you that the proposal on the table is for $26M. What can you do? a. Explain that MTB would save money by switching to an all HPE platform. b. Propose a solution that can offload data from the Teradata environment, allowing MTB to extend the life of the current environment without performing a complete migration or paying Teradata a large amount of money. c. Do nothing, as MTB is clearly invested in the Teradata platform and it would cost the company more money to integrate another vendor’s solutions.

You can check your answers by referring to Appendix B: Answers to Activities.

Decision makers

Read about MTB’s key decision makers. Then answer the question.

Knowing that HPC is a hot topic with MTB, you are now entering discussions with key decision makers. Before you meet, you review what you have learned about them. You know that MTB has been in turmoil since the chief executive officer (CEO) resigned two years ago for a career outside MTB. A new CEO joined the company 14 months ago and, as is customary, brought along his friends, including the new CFO. The four top executives include:
– The CEO, Rick Jaggers, previously worked in the financial services sector. He is eager to prove his value to the company, which has not had a breakthrough new drug for several years.
– Amrita Deva, the chief information officer (CIO), most recently worked for another large pharmaceutical company. She is familiar with Teradata, EMC storage, HPE servers, and Cisco networking. She has been working hard to bring global governance to disparate business units and to standardize IT services.
– The chief financial officer (CFO) and economic buyer is Denzel Walker. His most recent role was CFO for a financial services company, where he reported to Jaggers. He has mentioned several times that he has stepped into a big job, imposing restrictions on ballooning R&D budgets and trying to sort out why some projects are consuming budget but producing fewer results.
– Janet Choi, the chief technology officer (CTO), previously worked at one of the Big Four consulting companies. She is a personal friend of Walker’s and strongly favors HPE servers, storage, and networking solutions. She is a key driving force behind initiatives for speeding up development and manufacturing with HPC and is frustrated that she cannot get working environments up and running as quickly as she wants.
4. Which question is most appropriate for each decision maker? (Match the question to the decision maker.)
a. Jaggers (CEO)
b. Deva (CIO)
c. Walker (CFO)
d. Choi (CTO)
__ Could you tell me more about how developers are using HPC? What do they do when they cannot get the compute resources they need to run a job?
__ What is the biggest stumbling block stopping IT from deploying HPC environments that meet manufacturing’s insatiable demands at the pace they require?
__ A year from now, what do successful R&D and manufacturing departments look like to you? How will they be using HPC to get products on shelves more quickly?
__ I am hearing R&D and manufacturing say that they need more HPC compute power to finish their projects. Would you be interested in giving them that without expanding the data center physical footprint and power costs?

You can check your answers by referring to Appendix B: Answers to Activities.

Facilities requirements

You will now consider what you need to know about the facilities to ensure that the products that comprise the solution can be delivered, moved to the data center, and installed successfully.

Discuss requirements with facilities manager

You need to call or meet with the facilities manager and go over the requirements for delivering the equipment. Ask questions such as:
• What information do you need to provide?
• What is the process for delivering large shipments?
• What is the delivery address?
• Are there packaging restrictions, such as size and weight limitations?
• Is the dock large enough for a semitrailer?
• Who will meet you onsite for the delivery?
• How will the equipment be moved from receiving to the data center? What route will be taken?
Carefully document the answers to these questions and send this document to the facilities manager to validate the information.

Site survey

You should conduct a site survey. Ideally, ask the facilities manager to meet you onsite and walk you through the path that the movers will take to move the products from receiving to the data center. If you can use an elevator, how big is the elevator? Is there a freight elevator? Will the products fit inside the elevator? If you must move the equipment up or down stairs, how wide are the stairs? How sharp are the turns? Measure doorways: How high are they? How wide? What time can the equipment be moved? Does it need to be moved after work hours? How many people are required, and what equipment is needed to move the products?

Site survey: data center requirements

You should visit the data center and ensure that it meets the requirements for the solution you are proposing. You need to consider power requirements and environmental requirements. You also need to determine where the new products will be housed and how they will be arranged. Is there enough room? Will you need to remove legacy products before you can install the new ones? How will that migration happen? What safety regulations must be followed when the equipment is being moved and installed? What are the security regulations? Do you need a temporary access card to get into the data center?

Data center facility availability tiers

Figure 2-7 Data center facility availability tiers

You might also need to understand the availability level for which the data center facility is designed. The Telecommunications Industry Association (TIA) has defined a standard for data center design, TIA-942, which includes architectural, security, electronic, mechanical, and telecommunications requirements (see Figure 2-7). TIA-942 defines the availability levels for the infrastructure systems that support servers, such as the power and cooling systems, with four tiers. (The 2014 release of TIA-942 replaced the term “tier” with “rating”; however, you might still encounter customers who use the “tier” terminology.)
• Tier 1 provides one distribution system and no redundant components.
• Tier 2 provides one distribution system with redundant components.
• Tier 3 provides multiple distribution systems, only one of which is active, and also redundant components.
• Tier 4 provides multiple active distribution systems, each with redundant components.
Tier 1 and Tier 2 systems are subject to both planned and unplanned downtime (Tier 2 systems are less so, due to the redundant components). Tier 3 and Tier 4 systems do not require planned downtime for system maintenance. Tier 3 systems are still vulnerable to some unplanned downtime, while Tier 4 systems are protected against at least one worst-case event.

Different systems within a data center can meet the requirements of different tiers. The data center’s tier, based on the system with the lowest rating, defines a guaranteed availability level:
• Tier 1 = 99.671% or annual downtime of 28.8 hours
• Tier 2 = 99.741% or annual downtime of 22 hours
• Tier 3 = 99.982% or annual downtime of 1.6 hours
• Tier 4 = 99.995% or annual downtime of 0.4 hours

Chapter 2—Activity 2

In this activity, you will consider the power and environment requirements that you should determine before designing a solution. You will further consider what you need to ask the IT director about the placement and arrangement of the products in the data center. Finally, you will consider what you need to know about the company’s safety and security regulations.

You are performing a site survey to ensure the site is ready for the new HPE equipment. You must understand the power and environment requirements for the solution you are recommending and ensure the customer’s data center can meet those requirements. You must also know where the new solution will be housed in the data center, and follow the safety and security regulations when you install the new solution.
1. What power requirements must you consider when installing a new solution in the data center?
2. What environmental requirements must you consider when installing a new solution in the data center?
3. What should you ask about the placement and arrangement of the products in the new solution?
4. What should you ask about the safety and security regulations?

You can check your answers by referring to Appendix B: Answers to Activities.

Find specific requirements for HPE products

Figure 2-8 Find specific requirements for HPE products

It is important to carefully consider the unique requirements of the equipment you are designing for. To find these requirements—which vary from product to product—visit the HPE Information Library at http://h17007.www1.hpe.com/us/en/enterprise/servers/solutions/info-library/index.aspx#.VtCVkPIrKUk. Here, you can find user and installation guides for a specific product or solution, which include site considerations and setup requirements. Figure 2-8 shows a sample of listings for Servers & Management software.

Summary

This chapter has given you strategies for discussing customer requirements, such as business continuity and availability, with key stakeholders. You have also considered the customer’s IT management processes and the need to assess how ready the customer is to transform them. Finally, you have considered the logistical considerations you must take into account for delivering, moving, and installing the solution.

Learning check

Review what you have learned by answering the following questions. Then check your answers in Appendix A: Answers to Learning Checks.
1. What is most likely to be a concern for an LOB manager?
a. That IT solutions follow best practices
b. That IT solutions meet security standards
c. That IT solutions meet their tactical requirements
d. That IT solutions support automated patch management

2. If a server provides 99.999% availability over a year, how much unplanned downtime can it experience?
a. 26.3 seconds
b. 5.3 minutes
c. 44 minutes
d. 8.7 hours

For answers, see Chapter 2 in Appendix A.

Chapter 3 Advanced Architecture for Server Solutions

EXAM OBJECTIVES
• Analyze the special needs of data, High-Performance Computing (HPC), and mission-critical workloads
• Given a customer’s specific requirements, architect a solution for data, HPC, and mission-critical workloads

Assumed knowledge

Before reading this chapter, you should have a basic understanding of the following:
• Design concepts such as server-to-storage ratio and scale-out deployments
• Server components, including processors, DDR3 and DDR4 memory, hard disk drives (HDDs), solid-state drives (SSDs), and RAID levels for storage volumes
• HPE ProLiant rack and blade servers and options for them, such as HPE Smart Array Controllers
• HPE BladeSystems, including interconnect modules and Virtual Connect (VC) modules

Chapter topics

In this chapter, you will analyze the requirements for data-driven organizations and learn how to architect solutions to meet these requirements. You will also consider requirements for HPC and mission-critical applications and review solutions for each one.

Architecture for data-driven organizations

In this section, you will look at the variety of workloads that have emerged in data-driven organizations, from traditional relational SQL databases to massive object storage solutions. You will consider the unique needs of each workload and learn about general strategies for meeting those needs. This discussion will lay the groundwork for later chapters, in which you learn in more detail how to architect the appropriate HPE server solution to meet the needs of various applications and workloads.

Data management challenges

Knowledge is power: data has become a key resource and potential revenue generator for almost every industry. But companies find it difficult to harness the complex and vast amounts of big data. Industry analysts characterize the emerging world of big data in three ways:
• Volume—Data has been growing exponentially for years, and it promises to continue to do so. Data is transforming from being counted in terabytes to being counted in zettabytes, and analysts estimate that people will have generated 40 ZB of data by 2020. To deal with the challenges of volume, customers require scalable solutions.
• Velocity—Users are constantly generating new data from which companies need to extract real-time value. Enterprises do not have the luxury of replicating different types of data from globally distributed sources into a single data warehouse and processing it every night. Cost and time limitations drive companies to process data more efficiently, and even in real time in some cases. Only in this way can the company gain true business value from its data. High-velocity data demands high-performance technologies, systems that can process data instantly in shrinking time windows, and systems that can scale on demand.
• Variety—Adding to the complexity of managing big data is its variety. In addition to structured data in relational databases, companies must store millions and even billions of unstructured data objects. Structured data is organized in a way that facilitates automated processing and searching. Unstructured data is not organized and does not facilitate automated processing or searching. Such data is becoming more and more common; in fact, roughly 85% of data today is unstructured. Examples of unstructured data include voice mail, memos and other correspondence, meeting notes, image files, audio and video files, and email messages. Companies need solutions that can store each type of data efficiently.
Some analysts add other Vs, including veracity (the accuracy of data, added by IBM) and value. However, most focus on volume, velocity, and variety.

Note that these data challenges apply not only to big data, usually associated with Hadoop, but to all of a company’s data assets. The following sections discuss many types of data workloads.

Scaling models required by different workloads

Figure 3-1 Scaling models required by different workloads

To fully leverage all of its data assets, a company must use the technology best suited for that asset’s value and data personality. You can classify data applications in two broad categories: ones that require scale-out compute and ones that require scale-up compute, as shown in Figure 3-1.

Scaling out involves deploying a high density of less powerful servers. The scale-out model has become popular in recent years because it often provides more flexibility and allows companies to grow in a more cost-effective manner. However, scale-up solutions still have a role to play for the right applications and workloads. These solutions use powerful servers with a large number of processors, a large memory capacity, and perhaps a great deal of storage. They deliver high performance, high availability, high reliability, and disaster tolerance for mission-critical workloads.

Parallelization on scale-up and scale-out systems

Figure 3-2 Parallelization on scale-up and scale-out systems

Parallelization lets an application take advantage of the compute resources that are available in either a scale-up or a scale-out solution. Some applications are easily parallelized while others can only be partially parallelized, if at all. For example, a big data analytics application might need to analyze millions of records to find each record that mentions a specific word. The application can easily split the task into smaller tasks, each of which analyzes a different set of records. This type of task is called embarrassingly parallel. If, however, the application then needs to total the number of mentions, this final part of the task cannot be parallelized because it depends on the completion of the previous tasks. (A short sketch of this pattern follows this discussion.)

Application designers can make demanding applications run more quickly on scale-out systems using distributed computing, in which a resource scheduling mechanism divides a process into multiple jobs and assigns the jobs to different servers. In later sections about data workloads that require scale-out compute, you will see the different approaches that applications can take to distributing tasks to nodes.

A scale-up system has many processors with multiple cores, each core essentially being a processor that shares memory with the other cores on the same processor. Even a server in a scale-out system might have two processors and several cores on each processor. Multi-threading lets the server take advantage of all of these processors and cores. A workload essentially consists of a series of operation instructions that a processor core executes in order. Instruction-level parallelization, illustrated in Figure 3-2, lets the server assign some of the operations in the series to different processor cores to execute in parallel. The server can only do this when the results of the operations do not affect each other. Therefore, instruction-level parallelization can only take advantage of a limited number of processor cores.

An application running on a single system can implement a higher degree of parallelization through multi-threading. The application process creates multiple threads for each part of the task, and each thread executes on a different processor core. (Time splitting lets multiple threads share a core, but each thread then takes longer to execute.) For example, a SQL database assigns a different worker thread to handle queries from each concurrent user. Different applications support different levels of multi-threading, depending on how easily parallelized they are, as well as on how developers chose to program the application.
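The analytics example above can be sketched in a few lines of Python. This is an illustration of the concept, not a production pattern: each worker process counts mentions in its own slice of the records (the embarrassingly parallel phase), and the final total is a serial step because it depends on every slice finishing first.

```python
from multiprocessing import Pool

def count_mentions(records, word="error"):
    # Count how many records in this slice mention the word.
    return sum(word in record for record in records)

if __name__ == "__main__":
    records = ["disk error on node 4", "job complete", "error: link down"] * 1000
    chunks = [records[i::4] for i in range(4)]             # split the work four ways

    with Pool(processes=4) as pool:
        partial_counts = pool.map(count_mentions, chunks)  # parallel phase

    total = sum(partial_counts)                            # serial reduction
    print(f"Total mentions: {total}")                      # -> Total mentions: 2000
```

On a scale-out system, the same split-then-reduce structure appears at the cluster level, with a scheduler assigning the chunks to different servers instead of to local processes.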

Modern data applications and workloads that require scale-out compute

Figure 3-3 Modern data applications and workloads that require scale-out compute

Now that you have an idea about some of the differences between scale-out and scale-up models, you will examine data applications that require scale-out compute (shown in Figure 3-3). Each application is best served by a specific technology:
• Object storage allows companies to achieve massive content storage.
• Virtualized storage helps to make block and file storage more cost effective and simpler to manage.
• Hadoop is designed for analyzing big unstructured data, or big data.
• Not only SQL (NoSQL) databases provide simple databases for large amounts of unstructured data.

Each technology is optimized across scale, performance, and cost-efficiency attributes to deliver a specific value proposition. The following sections explain each technology in more detail, characterizing the workload and explaining how scale-out compute best meets the workload needs, providing
• Distributed performance that can be aggregated across scale-out building blocks
• Density optimization that reduces the data center footprint and power consumption
• Direct-attached storage (DAS) for eliminating storage complexity and achieving better performance
• Configuration flexibility to reconfigure storage and compute ratios as necessary (as you will learn throughout this ebook)

Comparing block, file, and object storage

Figure 3-4 Comparing block, file, and object storage

Before you look at technologies for supporting block, file, and object storage, take a moment to compare these types of storage, which you see in Figure 3-4.

Block storage
Block storage allows devices such as servers or virtual machines (VMs) to access data on remote disk arrays at the block level. A storage area network (SAN) connects the devices together through a technology such as Fibre Channel (FC), a networking technology separate from Ethernet; FCoE, a technology that encapsulates FC for transmission over Ethernet; or iSCSI, a TCP/IP protocol that can run on Ethernet. The device is called an initiator; controllers for the disk arrays are called targets. The SAN helps an initiator discover targets and logical unit numbers (LUNs) on those targets, a LUN being a part of a disk drive, a disk drive, or a RAID set. To the initiator, the LUNs appear as local drives, which it is allowed to access at the block level. Whether FC, FCoE, or iSCSI is used, the array stores raw storage volumes, and the initiator imposes the file system. Block storage provides high performance for use cases such as providing the boot image for VMs.

File storage
File storage, or network attached storage (NAS), follows a client/server model. A NAS server hosts the data, as well as a file system for that data. The NAS itself might store the data on direct-attached storage (DAS) or block storage accessed through a SAN. When NAS clients connect to the NAS server, their OS views the file system as a mounted volume. When the client needs to read or write to a file, it must send the request to the NAS server, and the client interacts with the data at the file level, rather than the block level. The NAS server is responsible for serving the data and ensuring consistency as multiple clients connect. File storage does not provide as high a performance as block storage, but it is better suited for situations in which multiple clients need to share access to a file system. Traditionally, NAS supports only a single server or two servers acting in an active/passive failover design, so the NAS server can become a bottleneck for read and write IOPS. NAS clusters help to alleviate this issue.

Object storage

Object storage also follows a client/server model. A cluster of object servers stores data as generic objects over an underlying file structure. (You will learn more about objects in the next section.) Clients can read and write data at the object level. Generally, a client application performs the reads and writes rather than the OS treating the data as data on a mounted volume. Therefore, applications must be aware of the object storage solution. However, some object storage solutions allow the OS to view objects as files on a mounted volume.

Object storage for mass content: Object definition

You will now look at object storage in more detail, examining how it meets the needs for storing massive amounts of content. An object provides a flexible way to store data of any type or size. This unstructured data might be voice mail, memos and other correspondence, meeting notes, image files, audio and video files, email messages, or any other type of data.

In addition to the data, the object includes metadata, which provides contextual information about the object. The customizable metadata might specify information that helps to index the object, that informs clients about the object's usage, that marks the data as confidential for a specific user or security group, and so on. Each object also has a unique ID. Unlike a file system, object storage has no hierarchy, making the flat object solution very scalable.

Object storage for mass content: Example architecture (OpenStack Swift)

Figure 3-5 Object storage for mass content: Example architecture (OpenStack Swift)

OpenStack is an open source system for providing infrastructure as a service (IaaS) cloud computing. The OpenStack Swift component provides cloud-based object storage. In the Swift architecture (illustrated in Figure 3-5), a cluster of object storage servers hosts the storage devices, which are generally DAS. These devices form a ring, which does the following:
• Lists the location of each storage device (the IP address and TCP port for the object server and the physical device ID)
• Maps partitions to the storage device that should hold each partition—More precisely, the ring maps a replica of a partition to the storage device; each partition has three replicas (by default) on different devices.

• Specifies the length for hashes—When a server needs to determine the partition for storing an object, it hashes the object ID, which produces the ID for the partition to be used.

In short, the ring provides object storage servers with all the information that they need to replicate objects and distribute them across one another.

All client requests for objects begin with a request to a proxy server for the object location. The proxy server also stores the ring so that it can inform the client of the location. The client then reads and writes to the object through a direct interaction with the object server.

Swift further defines containers that store listings for the objects. A container can define policies for storing data, such as how data is replicated, allowing organizations to set up different tiers of service. Accounts store lists of containers. An account corresponds to a tenant, permitting multi-tenant cloud solutions.

Several object storage solutions exist, each with its unique architecture. However, most solutions have many features in common with the Swift architecture, including a map that helps replicate and distribute objects across many storage servers, proxy servers for informing clients of the location of objects, and the ability for clients to send object requests directly to object storage servers.

As this description has made clear, object storage is optimized for extreme scale. Object storage servers use relatively simple, cost-effective DAS. They follow a simple model for distributing data—and object delivery services—across many servers, making a density-optimized, scale-out solution a natural fit.
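The hash-to-partition idea can be illustrated with a short Python sketch. This is a simplified conceptual model, not Swift's actual implementation; the partition power, device list, and replica count are hypothetical values.

    import hashlib

    PART_POWER = 4                       # 2**4 = 16 partitions in this toy ring
    DEVICES = ["dev0", "dev1", "dev2", "dev3", "dev4", "dev5"]
    REPLICAS = 3

    def partition_for(object_id):
        # Hash the object ID; the top bits of the digest select the partition
        digest = hashlib.md5(object_id.encode()).hexdigest()
        return int(digest, 16) >> (128 - PART_POWER)

    def devices_for(partition):
        # Place each replica on a different device (a real ring also considers
        # zones and device weights when assigning replicas)
        return [DEVICES[(partition + i) % len(DEVICES)] for i in range(REPLICAS)]

    part = partition_for("account/container/object")
    print(part, devices_for(part))

Because every server computes the same hash from the same ring data, any object server or proxy can independently determine where an object and its replicas belong.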

Block and file storage: Virtualized block storage

Figure 3-6 Block and file storage: Virtualized block storage

Virtualized block storage, shown in Figure 3-6, can offer a more cost-effective solution than a traditional SAN. A traditional SAN storage array that has adequate capacity can be relatively expensive, and establishing a SAN can be quite complex. HPE StoreVirtual makes it possible for customers to replace SAN storage arrays with cost-effective servers with DAS. The StoreVirtual solution enables the servers to act as iSCSI targets, providing block storage to initiator servers or VMs over an Ethernet network.

Block and file storage: Virtualized file storage

Figure 3-7 Block and file storage: Virtualized file storage

Similarly, HPE StoreEasy helps to make NAS simpler and more cost-effective, as you see in Figure 3-7. A standalone NAS server has to scale up expensive hardware and still has the potential to become a bottleneck. HPE StoreEasy provides NAS clustering so that relatively cost-effective, simple-to-manage scale-out servers can serve a large number of NAS clients. For storage, these servers can use an HPE StoreVirtual solution or a block storage array; the StoreVirtual option delivers a solution built on cost-effective DAS.

Hadoop 2 for unstructured data analytics: Apache Hadoop 2 architecture

Figure 3-8 Hadoop 2 for unstructured data analytics: Apache Hadoop 2 architecture

You should understand a bit about how Hadoop is architected (see Figure 3-8) so that you know which HPE solutions to position for various components in a big data solution. At the foundation of the architecture lies the data itself. Hadoop Distributed File System (HDFS) handles distributing the data across storage nodes in a variety of file formats, including CSV, JSON, Optimized Row Columnar (ORC), and Parquet files. HBase might run on top of the file system, organizing the data into a columnar map or NoSQL database.

Data processing applications such as MapReduce2 (MR2) applications query the file system or database in order to complete data analysis jobs. Originally, MapReduce was the only framework for Hadoop data processing applications, and it was responsible for scheduling analysis jobs as well as completing them. In Hadoop 2, Yet Another Resource Negotiator (YARN) has taken over the scheduling functions. In addition, YARN permits the integration of other frameworks for data processing applications into the same resource scheduling framework. You will now look at the components of this architecture in more detail.

Hadoop 2 for unstructured data analytics: HDFS

Figure 3-9 Hadoop 2 for unstructured data analytics: HDFS

HDFS runs on a cluster of storage or data nodes. Designed for a scale-out approach, HDFS distributes the vast number of files required for a big data solution across multiple data nodes. To provide fault tolerance for single nodes, HDFS replicates data, typically by a factor of three. In other words, each file is stored on three different nodes, as shown in Figure 3-9. The nodes work together to handle data replication, reads, and writes.

A name node stores the file system metadata. When an application needs to access a file, it contacts the name node, which tells the application which data nodes store the file. Because the responsibility for responding to requests to read and write files is distributed across many nodes, the solution scales well.
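A toy Python model of the name node's role might look like the following sketch. This is illustrative only: real HDFS tracks individual blocks rather than whole files and places replicas rack-aware, and the paths and node names here are hypothetical.

    import random

    DATA_NODES = ["dn1", "dn2", "dn3", "dn4", "dn5"]
    REPLICATION = 3
    metadata = {}  # the name node's map of file -> data nodes holding replicas

    def write_file(path):
        # Choose three distinct data nodes to hold replicas of the file
        metadata[path] = random.sample(DATA_NODES, REPLICATION)

    def locate(path):
        # Clients ask the name node where the file lives, then read
        # directly from one of the listed data nodes
        return metadata[path]

    write_file("/logs/2016-01-01.csv")
    print(locate("/logs/2016-01-01.csv"))  # e.g., ['dn2', 'dn5', 'dn1']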

Hadoop 2 for unstructured data analytics: YARN

Figure 3-10 YARN Hadoop 2 for unstructured data analytics

Figure 3-10 illustrates that data processing or data analytics applications run on compute nodes using data in the HDFS.

Figure 3-11 YARN Hadoop 2 for unstructured data analytics

These applications are designed for parallel processing, so they require a solution to assign pieces of a task to available nodes, as shown in Figure 3-11. YARN and the application running on YARN work together to provide that solution. The YARN Resource Manager schedules the running of an application with a Node Manager on a Hadoop compute node. The Node Manager creates a container with the CPU, memory, and bandwidth resources allocated for the task. The container runs an Application Master, which is specific to the particular YARN application. The Application Master is responsible for dividing the task into pieces that it can assign to other compute nodes. Those compute nodes' Node Managers check with the YARN Resource Manager to determine which resources they can allocate to the task.

Hadoop 2 for unstructured data analytics: YARN applications

Figure 3-12 Hadoop 2 for unstructured data analytics: YARN applications

This architecture will make more sense when you consider an example, illustrated in Figure 3-12.

MR2 is a common model for applications. A retail company might have an MR2 application that helps it to analyze customer shopping patterns. This application could run a query about the average number of days until shoppers made their next purchase, based on what product they originally purchased. The first step of the job would involve mapping. Every compute node would be assigned a series of customer records to analyze. The compute node would fetch those records from storage and make a result file that maps keys (in this case, the product bought) to values (in this case, the number of days until the next purchase). For the next part of the analysis, the compute nodes need to shuffle their results such that the same compute node has the results for all the same keys. For example, one compute node might be assigned all of the results for books. This step is required so that the compute node has all necessary information for running its reduce step—in this case, totaling the number of days until the next purchase after all book purchases and then calculating an average.

Apache Spark is another general framework for applications that use parallel computing to process large data sets. Spark can run processes similar to MapReduce, as well as other types of processes, using a general framework in which applications apply actions to collections called resilient distributed datasets (RDDs). The application creates these RDDs by pulling in HDFS files or by transforming other RDDs with filters. Spark is optimized to speed up processing for frequently accessed, "hot" data, which is placed in the compute node's memory. For some purposes, Spark can operate much more quickly than MapReduce applications, particularly for in-memory processing. Spark can use YARN as its resource management component and draw on data stored in HDFS. It can also use different resource managers such as Apache Mesos.

As you can see, different YARN applications require relatively fewer or more processing and memory resources. Those resources are distributed across many nodes for extreme scalability and efficiency. Throughout this ebook, you will learn about flexible approaches for designing the right relationship between compute and storage.
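The map-shuffle-reduce flow of the retail example above can be sketched in a few lines of Python. This runs in a single process for illustration only; in a real cluster, the map and reduce steps execute in parallel on different nodes, and the purchase records here are invented.

    from collections import defaultdict

    # (product bought, days until next purchase) -- hypothetical records
    records = [("book", 10), ("book", 20), ("toy", 5), ("toy", 15), ("book", 30)]

    # Map: emit key/value pairs (spread across many nodes in a real cluster)
    mapped = [(product, days) for product, days in records]

    # Shuffle: group all values for the same key onto the same "node"
    shuffled = defaultdict(list)
    for product, days in mapped:
        shuffled[product].append(days)

    # Reduce: compute the average days-to-next-purchase per product
    averages = {p: sum(d) / len(d) for p, d in shuffled.items()}
    print(averages)  # {'book': 20.0, 'toy': 10.0}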

Simple NoSQL databases

Figure 3-13 Simple NoSQL databases

HDFS is designed primarily to deliver files to data processing applications that analyze "cold" data (data that is stored over time and accessed relatively infrequently) and that do not need to return immediate results. However, some applications need to produce results more quickly. In addition, rather than analyzing long sequences of records, some applications might need random access—the ability to read and write to different parts of files. Rather than run directly on HDFS, an MR2, Spark, or other data processing application can run on a database. That database helps to organize the data within a file system, and the proper type of database can significantly speed up operations for applications that require low-latency reads and writes to files.

You are probably familiar with SQL databases, the primary form of relational databases. A relational database consists of rows, each of which is called a record or a key, and columns with values. For example, a record might be a customer account. Columns could be customer name, email address, pending transaction IDs, and so on. In a relational database, every row has a value for every column. Relational SQL databases work with structured data, but Hadoop is designed for handling unstructured data. In addition, relational databases are optimized for reading and writing to a record as individual transactions.

NoSQL databases are designed to organize structured and unstructured data. NoSQL databases can take different forms, but in essence, they map a row, or a key, to values in a less rigid way than relational databases. Take HBase, the NoSQL database for Apache Hadoop, as an example (see Figure 3-13). Based on Google BigTable, HBase allows developers to create flexible tables (or maps) that meet their needs. The table has fixed column families, which organize related columns together. However, new columns can be freely added, and a row (or a key) can have values for whichever columns the developer chooses. The table is stored sparsely, meaning that when a row does not have a value for a column, the column simply does not exist for that row, and no space is consumed for it. A NoSQL column-oriented database is optimized for analytics, so you will often encounter customers who require such databases as part of their big data and analytics solution.

The HBase database is distributed across HBase Region Servers, each of which holds part of the database and is responsible for handling read and write queries to that part of the database. The Region Server operates as much as it can in memory, so the compute node that runs a Region Server requires generous memory. This scale-out approach provides extreme scale and efficiency.

Cassandra, another example of a NoSQL database, differs in some ways, but similarly consists of rows with values for a flexible number of columns that are organized into column families (called tables in Cassandra). Cassandra can run on HDFS or on a different file system such as Cassandra File System (CFS). And Cassandra also distributes parts of the database across compute nodes.
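The sparse, column-family layout described above can be modeled with nested dictionaries in Python. This is a conceptual sketch of the data model only, not HBase's storage format, and the table contents are hypothetical.

    # Row key -> column family -> column -> value; absent columns consume no space
    customers = {
        "cust001": {
            "profile": {"name": "Ada", "email": "ada@example.com"},
            "orders":  {"pending": "TX-17"},
        },
        "cust002": {
            "profile": {"name": "Lin"},   # no email column exists for this row
            "orders":  {},
        },
    }

    def get(table, row, family, column, default=None):
        # Missing columns are simply absent, mirroring HBase's sparse storage
        return table.get(row, {}).get(family, {}).get(column, default)

    print(get(customers, "cust002", "profile", "email", "<no value>"))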

Design decisions for Hadoop and NoSQL databases that run on HDFS

Now that you understand the characteristics of each scale-out application's workloads, you can look at the design decisions for some of the more complicated applications. For Hadoop and NoSQL databases that run on HDFS, you must choose:
• Whether the data and compute nodes are colocated or not
• How to plan the ratio of compute resources versus storage resources

You will examine each of these decisions in the next few sections.

Traditional architecture: Colocated compute nodes and storage nodes

Figure 3-14 Traditional architecture: Colocated compute nodes and storage nodes

Traditionally, Hadoop has operated under the principle of bringing compute to storage for data processing. That is, compute nodes are colocated on the data nodes in the form of servers with direct-attached storage (DAS), as shown in Figure 3-14. A YARN application can then assign a piece of a job to a node that stores the data for that job locally. This architecture made sense in the past, when network bandwidths did not allow moving remote data to compute nodes in a timely manner. Remember: processors can operate on data most quickly when the data resides in memory, and next most quickly when it is on local storage. Remote storage traditionally provides the slowest access.

However, this architecture has led to many inefficiencies as companies' data and data analysis needs have expanded. Colocating the compute nodes for an application with the data nodes constrains the data to that application. However, a company's needs are rarely met by one application. Therefore, isolated clusters with the same data proliferate, with one cluster running MapReduce applications, another running Apache Spark, and so on. Companies are already dealing with data explosions, and this inefficient model leads to unnecessary duplication, expense, and management complexity.

In addition, the traditional architecture treats compute and storage as one unit, so the two are forced to scale together. Traditionally, IT has scaled solutions with one spindle, or disk drive, per processor core. However, some workloads are computationally intense and would benefit from more cores per drive. Some applications, such as Apache Spark applications, might benefit from more memory. Other applications might benefit from more compute power. With the traditional architecture, you cannot design to meet these particular needs.

HPE Big Data Reference Architecture: Optimized compute nodes and storage nodes

Figure 3-15 HPE Big Data Reference Architecture: Optimized compute nodes and storage nodes

HPE has discovered that modern Hadoop applications experience better performance when compute nodes and data nodes are separated, as shown in Figure 3-15. This model allows you to design a compute layer with HPE servers that are optimized for intense data processing and analytics workloads. Now you can select servers that have the compute or memory resources that the particular analytics application requires—without worrying about the server's storage capacity. Equally, you can design a storage layer with HPE servers that are optimized for storing and delivering data. High-speed 10 GbE Ethernet provides high enough bandwidth that bringing data to the compute nodes does not interfere with performance—in fact, this fabric can provide higher bandwidth than some local storage subsystems.

This model returns flexibility and scalability to the data center. If the customer requires more compute power, you can add compute nodes to the solution. If the customer's data expands, you can scale the storage nodes. Perhaps even more crucially, you can avoid creating isolated clusters. If the customer has multiple analysis applications, you can plan a cluster of compute nodes for each while allowing the clusters to share the same storage. HPE has also found in testing that this model can enhance performance, increasing read IOPS by as much as 30%.

How to plan the ratio of compute to storage

With the HPE Big Data Reference Architecture, you now have the choice of how to balance compute resources and storage resources. The traditional Hadoop guideline—about one core per drive—can give you a starting point for planning, as the sketch after the following list illustrates. However, you will need to consider the particular needs of your customer. Factors that affect the ratio of compute to storage include
• The type of analytics that the customer intends to use
• The number of applications and jobs that the solution must handle
• How quickly the customer requires results
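As a back-of-the-envelope illustration of the one-core-per-drive starting point, the following Python sketch scales the core count up for CPU-bound analytics. All inputs are hypothetical and would come from the customer's actual capacity and workload requirements.

    # Starting-point sizing sketch based on the traditional one-core-per-drive
    # guideline (all inputs are hypothetical)
    drives_needed = 480          # drives required for the data capacity
    cores_per_server = 24
    cpu_bound_factor = 2.0       # raise above 1.0 for CPU-bound analytics

    cores_needed = drives_needed * cpu_bound_factor
    compute_servers = -(-cores_needed // cores_per_server)  # ceiling division
    print(int(compute_servers), "compute servers as a first estimate")  # 40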

How to balance compute resources versus storage resources: Application requirements

Figure 3-16 How to balance compute resources versus storage resources: Application requirements

You can classify data analysis tasks into two categories: CPU-bound tasks and IO-bound tasks (shown in the table in Figure 3-16). When most tasks are intensely CPU bound, you want a higher ratio of compute nodes to storage nodes. If tasks are IO bound, you can use a more traditional, balanced compute-to-storage node ratio. For IO-bound tasks, if you are using the HPE reference architecture with separated compute and storage nodes, keep in mind the need for 10 GbE speeds between compute and storage nodes.

Certain Hadoop frameworks for applications also require relatively more compute or memory resources. These include
• Hive—Hive is a data warehouse that acts much like a structured, SQL database built on top of HDFS. Hive provides metadata and indexing that can help to speed analyses and queries.
• Spark—Spark, as mentioned previously, is an alternative application framework to MapReduce, optimized for faster, more random queries.
• Solr—Solr provides indexing and searching for data in HDFS.

How to balance compute resources versus storage resources: Usage requirements

Also consider how the customer plans to use data analysis. Is the customer's big data solution primarily for archival with occasional analysis jobs? If so, you can plan a lower compute-to-storage ratio. Or does the customer plan to run many queries and analysis tasks at once? In the latter case, you must raise the compute-to-storage ratio so that enough compute nodes are available to run the processes.

Discuss, too, how quickly the customer needs results. The more quickly results are required, the more compute power and memory per TB you must provide.

Modern data applications and workloads that require scale-up compute

Figure 3-17 Modern data applications and workloads that require scale-up compute

You will now turn your attention to data applications and workloads that require a scale-up approach. As you see in Figure 3-17, these include structured databases used for business transactions, as well as in-memory databases. Both of these types of databases require the extreme performance, high availability, reliability, and disaster tolerance provided by a scale-up approach.

Structured database

Structured, or relational, databases consist of related tables. A table includes rows (which are called records) and fixed columns (which define parameters for those records). For example, a relational database might store customer records for a retail organization. Columns might include first name, last name, phone number, and so on. Every record has a value for every column. Applications can read and write data to the relational database using Structured Query Language (SQL). SQL databases are by far the most common form of structured, relational database.

Customers often use structured databases for business operations. These databases must support complex online transactional processing (OLTP). They might also be used for complex online analytic processing (OLAP). The next sections describe these workloads in more detail. Because the business operations supported by OLTP are often mission critical, the databases require a high-performance infrastructure optimized for high availability, disaster tolerance, and business continuance.

Structured database: OLTP

Applications can interact with databases in two ways: using online transaction processing (OLTP) or online analytics processing (OLAP). You will examine OLTP first. OLTP, or transactional, applications involve small, simple insert and delete operations on structured databases. A user making a purchase from an online retailer is an example of an online transaction. Data entry is another example.

OLTP databases typically have high performance demands. The database process might divide into many threads to handle many users and their concurrent transactions with the database. The application must be responsive and able to read and write data quickly because users typically interact with it in real time. Therefore, the multiple threads benefit from multiple processor cores, which speed their response time. Because OLTP applications are multi-threaded and must maintain data consistency, they work well with a scale-up model. In addition, they relate to critical business operations, making safeguards against data loss or corruption crucial.
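The small, atomic writes that characterize OLTP can be illustrated with Python's built-in sqlite3 module. This is a toy example only; production OLTP runs on server-class databases, and the table and values are hypothetical.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")

    # A typical OLTP transaction: one small write, committed atomically
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO orders (customer, total) VALUES (?, ?)",
                     ("cust001", 42.50))

    print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # 1

The atomic commit-or-rollback behavior is what preserves data consistency when many such transactions run concurrently.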

Structured database: OLAP

Figure 3-18 Structured database: OLAP

OLTP can be complemented by OLAP, which analyzes data in order to extract business intelligence from it. For example, OLAP might help a company analyze customer records in order to make better decisions about how to attract customers. Applications such as SAP Customer Relationship Management (CRM) often rely heavily on the insights from OLAP.

OLAP typically works with large datasets over a longer period of time than a fast OLTP transaction. Relational OLAP (ROLAP) runs queries on an OLTP database. However, an OLTP database is optimized for simple deletes and inserts to rows. Therefore, ROLAP does not provide the best performance for complex queries. Multi-dimensional OLAP (MOLAP) can combine and slice data in different ways, permitting complex queries and analysis. It operates on data in a data warehouse, which is a structured database designed to accommodate the different needs of analytics and BI. For example, the database is often column oriented. Companies must move data from the OLTP database to the OLAP warehouse using an extract, transform, and load (ETL) process on a daily or weekly basis (as illustrated in Figure 3-18).

To support the complex queries, OLAP data warehouses also require a high-performance, high-availability scale-up model.

In-memory database for real-time analytics

Figure 3-19 In-memory database for real-time analytics

Innovative new in-memory databases are designed to provide faster and more powerful analysis. SAP HANA is the most common example of an in-memory database, although some customers might use the in-memory capabilities of structured databases such as Microsoft SQL Server and Oracle.

As you learned, OLAP traditionally requires replication of datasets from an OLTP database, which takes time. Because companies only replicate the data periodically, queries run on out-of-date data. SAP HANA resolves this issue by establishing a single database for OLTP and OLAP. The SAP HANA database appears as one database to users, as shown in Figure 3-19. However, it includes a component optimized for OLTP and a component optimized for OLAP, into which up-to-date data is streamed.

An in-memory database holds OLAP datasets in memory. Because processors can operate on in-memory data much more quickly than they can on data on a local or remote disk drive, analysis runs much more quickly, and users can receive real-time results. Such databases require vast amounts of memory, of course, and generally high performance. If they support mission-critical processes, they must also provide high levels of reliability and availability. Thus, in-memory databases are suited to scale-up infrastructure.

Summary of compute requirements to address data challenges

Figure 3-20 Summary of compute requirements to address data challenges

The table in Figure 3-20 provides an at-a-glance summary of the characteristics of the workloads that you have explored throughout this section.

Optimized compute solutions for data-driven organizations

Figure 3-21 Optimized compute solutions for data-driven organizations

Figure 3-21 shows a summary of the HPE ISV partners who provide the different types of applications that you have explored. HPE also provides cloud and software solutions for data-driven organizations; however, these are not the focus of this ebook. Also note that HPE provides services—and, of course, you can deliver your own services to help customers meet their availability requirements.

HPE optimized compute portfolio for data-driven organizations

Figure 3-22 HPE optimized compute portfolio for data-driven organizations

HPE delivers the optimized infrastructure for these applications. Figure 3-22 summarizes the solutions optimized for scale-out compute, including:

• HPE Apollo 2000, 4000, 6000, and 8000 Systems
• HPE Moonshot Systems

It also shows the compute solutions optimized for scaling up, including HPE Integrity Superdome X Systems. HPE scale-up rack servers and Integrity blade servers can also provide scale-up compute, but they are not the focus of this ebook.

Architecture for HPC

You will now look briefly at the architecture for HPC applications.

High-performance computing (HPC)

HPC uses extremely complex computations to solve complex problems. HPC applications can model systems in which many factors interact in many ways in order to predict how the systems will behave. For example, an HPC application might model weather systems and predict that you will need an umbrella that night. Another HPC application might simulate an electronic chip to help engineers assess and improve its design.

To perform these computations, an HPC application requires vast amounts of computing power. This power is typically measured in floating-point operations per second (FLOPS); a floating-point operation is any operation that involves numbers with decimal points. HPC applications require systems that can perform at the level of teraFLOPS. (For some applications, you will need to know both the single-precision FLOPS, which refers to the rate for operations on 32-bit numbers, and the double-precision FLOPS, which refers to the rate for operations on 64-bit numbers.)
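As a rough illustration of how theoretical peak FLOPS figures are derived, consider the following Python sketch. The socket count, core count, clock speed, and FLOPs-per-cycle figure are all hypothetical; the per-cycle figure in particular varies by processor generation and instruction set.

    # Theoretical peak = sockets x cores x clock (Hz) x FLOPs per cycle
    sockets = 2
    cores_per_socket = 12
    clock_hz = 2.6e9
    flops_per_cycle = 16      # assumed double-precision figure for illustration

    peak = sockets * cores_per_socket * clock_hz * flops_per_cycle
    print(peak / 1e12, "teraFLOPS per server")  # ~1.0 teraFLOPS

Sustained performance on real workloads is always lower than this theoretical peak, which is why measured benchmarks matter when sizing a cluster.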

HPC clusters

Figure 3-23 HPC clusters

Today, HPC applications often run on clusters of powerful servers, each of which contributes processing power and memory to the overall task. A cluster consists of one or more management nodes and the worker nodes (see Figure 3-23). The compute nodes (sometimes called worker nodes) are the servers that contribute their processor cores, accelerators, memory, and disk space to performing computations. Each node runs a cluster-capable OS such as Linux CentOS or Microsoft HPC Pack 2012, which acts as the platform for the HPC application or applications and also enables the node to communicate with the other nodes.

Many HPC applications are programmed to break down jobs into smaller tasks, which might run at least partially in parallel. To run such a job correctly, the nodes must communicate closely. Applications use libraries known by the cluster OS—most commonly Message Passing Interface (MPI)—to program these communications. In other words, HPC often takes a scale-out approach similar to the approach that you examined with big data analytics. However, HPC focuses on processing power and complex computations on smaller sets of data.

Often, many users need to use the HPC cluster to run thousands of computations, or jobs, a day. Some jobs might take hours to complete, and the cluster has a finite set of resources. If users had to manually initiate a job on a set of compute nodes, they would have to constantly ask each other which resources they could use, and they would interfere with each other's work. A job scheduling or workload management program allows users to request HPC jobs and manages the assignment of available compute nodes to the jobs. The program might assign one or more nodes to a job. Some programs can assign a specific processor on a multi-processor server to a job, or even assign a core on a processor. The program might also be able to match a job to a node with the proper resources, such as a minimum processor speed or RAM size.

The scheduling program is only responsible for initiating the job on the right resources. After a parallelized application begins to run on the assigned worker nodes, MPI (or a similar interface) handles the synchronization of the job. Examples of scheduling programs include Adaptive Moab, Altair PBS Professional, and Univa Grid Engine.
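A minimal example using the mpi4py Python bindings shows the kind of coordination MPI provides (assuming mpi4py and an MPI runtime are installed; it would be launched with something like mpirun -np 4):

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()      # this process's ID within the job
    size = comm.Get_size()      # total number of processes across the nodes

    # Each rank computes a partial sum over its own slice of the work
    partial = sum(range(rank, 1000, size))

    # MPI synchronizes the ranks and combines their results on rank 0
    total = comm.reduce(partial, op=MPI.SUM, root=0)
    if rank == 0:
        print(total)  # 499500

The scheduler's only job is to place these ranks on suitable nodes; once they start, the MPI library itself handles the message passing and synchronization between them.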

HPE optimized compute portfolio for HPC

Figure 3-24 HPE optimized compute portfolio for HPC

The HPE Apollo 6000 and 8000 Systems are optimized for HPC at midrange and large scale, while the Apollo 2000 Systems can provide good HPC solutions at a smaller scale (see Figure 3-24). These modular solutions deliver vast amounts of computing power in a small physical footprint with power and cooling efficiency. Customers can easily scale out enclosures populated with mix-and-match compute options tailored to specific requirements. You will learn how to plan an HPC cluster using these solutions in the next chapter.

Architectures for mission-critical applications

In this section, you will consider how to use redundancy and resiliency in scale-up and scale-out architectures. You will also learn about RAS, which stands for reliability, availability, and serviceability.

Meeting availability requirements

Figure 3-25 Meeting availability requirements

When designing and planning availability for the server solution, you should be familiar with the concepts of resiliency and redundancy and their relationship to each other and to availability:
• Redundancy—The inclusion of multiple components that provide the same function
• Resiliency—The ability to quickly adapt to change and to recover from errors such as hardware failures

The two concepts are closely related. Redundancy provides the foundation for resiliency, while resiliency ensures that the redundant components do not go to waste by automatically adapting to failures and quickly accepting the viable, redundant alternative. For example, a server might have two redundant NICs, but it is only when the integrator sets up NIC bonding that the server can take advantage of the redundancy. Similarly, RAID lets a server's storage controller distribute redundant copies of data (or parity information) across disk drives so that a drive failure can occur without data loss.

You should also consider how the scaling model affects the best way to deliver availability. In a scale-up model, each server provides a critical service that other servers cannot. The server hardware should be optimized for reliability, availability, and serviceability (RAS). (The next sections describe RAS in more detail.) In a scale-out architecture (illustrated in Figure 3-25), on the other hand, multiple servers fulfill the same function, building greater availability into the solution as a whole. Typically, the cluster of servers can tolerate the loss of one node with minimal impact on the overall service delivery.

Defining RAS

A server optimized for RAS must deliver reliability. That is, it must detect and correct errors to ensure that data is never lost or corrupted. Further, the server must identify and contain uncorrectable errors, signaling other components so they can take the appropriate action.

The server must also provide availability, guaranteeing uninterrupted operation. Redundancy built into the hardware—extra processors, extra DIMMs, extra network adapters, and so on—helps to protect from unplanned downtime; however, the server must also have the resiliency to instantly and automatically fail over to a redundant component if an active one fails or must be deactivated. Further, the server must be able to isolate failing components to prevent issues from spreading. The system might also need to provide clustering features that allow for upgrades and maintenance on a single node without affecting the service.

Finally, the server must be serviceable. As well as handling failed components reactively, it should use predictive analysis to identify potentially failing components, deactivating these components so that the system can continue operating with the healthy ones without data loss or corruption. System partitioning should isolate workloads, making it simpler to maintain one workload without affecting others. As much as possible, the server should heal itself so that it can continue functioning until replacement components are installed. The system should also allow for hot-pluggable replacements that allow uninterrupted service.

RAS hardware features

To support mission-critical workloads, a server needs RAS features embedded throughout the hardware. Each system should work to ensure data integrity, to proactively detect errors, and to mitigate potential issues before they cause data loss or downtime.

The server should have a processor such as an Intel Xeon E7 processor that is designed for RAS. When a traditional processor detects a data error that it cannot correct—whether data in the memory or cache or data crossing a system bus—the processor produces a "Machine Check Exception" that can crash the system. A processor designed for RAS, on the other hand, will not produce an exception and crash. Instead, it will flag the bit with the error in order to contain the error and to inform the firmware and OS of the problem. The processor should also provide additional features for detecting, flagging, and containing various types of errors so that they do not propagate over the network or to storage. Many of these features involve informing firmware of the issue and having the firmware handle the error. Therefore, it is critical that the firmware support the RAS processor features; otherwise, the server will not benefit from them.

Enterprise servers typically have DIMMs that support error-correcting code (ECC). ECC uses extra bits to encode data along with parity information so that if a bit is corrupted, the memory can detect the problem and recover the bit. ECC protects memory from single-bit errors, in which one bit is flipped due to issues such as background radiation or a failing DRAM. ECC can also detect, although not correct, double-bit errors. This capability is called single error correcting and double error detecting (SECDED). For mission-critical workloads, though, SECDED is not enough. The memory must proactively detect multiple-bit errors and prevent them from accumulating. In addition, it must protect from persistent errors (such as those caused by a failing DRAM as opposed to background radiation). Persistent errors can cause multiple-bit errors to accumulate, resulting in corrupt data and a potential system crash. The memory must be able to deactivate the failing DRAM so that the DIMM can continue to function without data corruption using the healthy components.

All hardware paths within the server must work to ensure reliable data delivery. Transmitters should resend data if they do not receive acknowledgements from receivers, and receivers should use cyclic redundancy checks (CRCs)—a short code added to the data that will no longer match if the data changes—to verify the received data's integrity. The hardware should detect issues on a path and take steps to create a path that avoids bad wires. Other system components such as system clocks should be fully redundant and hot-swappable. Finally, all power and cooling systems should have redundant components so that the system can continue running optimally even if one or more components fail. Fans and power supplies should be hot-swappable to support simple serviceability.
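The CRC idea is easy to demonstrate with Python's standard zlib module: the check value no longer matches if even one byte of the payload changes. This is illustrative only; hardware links implement CRC in silicon, and the payload here is invented.

    import zlib

    payload = b"sensor reading: 42"
    crc = zlib.crc32(payload)            # transmitter appends this check value

    # Receiver recomputes the CRC; a mismatch means corruption in flight
    received = b"sensor reading: 43"     # one corrupted byte
    print(zlib.crc32(received) == crc)   # False -> request retransmission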

RAS software features

Some of the hardware features mentioned in the previous section involve flagging errors to be handled by the server firmware or OS. Thus, the firmware forms an integral part of the server's RAS features. In addition to helping to isolate and contain errors, the firmware should provide analysis engines for monitoring all hardware components. By detecting failing or failed components early, the firmware can prevent those components from causing issues. It deactivates the faulty component and perhaps helps the server instantaneously fail over to a redundant component. The system can then continue to operate using the healthy components, without risk of downtime or data corruption, until the installation of replacement components. For the server to continue working optimally, of course, the server must be designed with redundant components throughout. For example, it should have more processor cores than required for the workload in case some must be deactivated.

This section has given you an overview of the types of hardware and software RAS features that mission-critical workloads require. You will examine specific RAS features in Chapter 8, "HPE Integrity Superdome X."

Chapter 3—Activity

You will now return to the MTB scenario introduced in Chapter 2—Activity 1. You will learn more about MTB's initiatives and begin to assess ways to help MTB fulfill these initiatives. Last month, one of your colleagues held an executive briefing at MTB, which Jaggers, Deva, Walker, and Choi attended. At this briefing, you learned that MTB has decided on a new IT strategy:
• A software-defined data center (SDDC) is MTB's future direction. The executives considered a cloud solution, but they decided to aim toward SDDC.
• MTB is reworking its data center strategy.
• Deva will issue an RFI in the next few weeks.
• Walker and Choi are investigating their manufacturing execution system (MES), which is built on a transactional database. Employees complain that the system is not always responsive or available, so they cannot use it the way that it is intended.
• After fixing the issues with MES, Choi wants to enhance the solution with Business Intelligence (BI) analysis.
• HPC is another avenue that Deva is investigating. Currently, the R&D facilities of various operating companies within MTB purchase and manage their own HPC environments without the involvement of MTB's central IT. HPC clusters that have grown organically offer different levels of service, some performing well and others less so. Expanding clusters are causing IT sprawl.
• Manufacturing departments are becoming interested in wading into big data analytics. Although R&D facilities are using Teradata big data environments, the license period is ending. Also, manufacturing IT members are biased toward open source frameworks, and they want to use Hadoop on their choice of infrastructure. They have a lot of unstructured data that they would like to start storing in a more scalable way immediately. But they are still working on fully identifying their analytics needs and developing applications.

Now answer the following questions.
1. What approach would you recommend that MTB take for increasing the responsiveness and availability of the MES solution? Also, what would help MTB continue to scale in the future?
2. How well does MTB's current approach to deploying HPC applications fit with its desire to move toward SDDC? Should MTB change its approach and, if so, how?
3. What type of server infrastructure will meet the needs for manufacturing's Hadoop solution? What advantages does the HPE Big Data Reference Architecture provide?

You can check your answers in Appendix B: Answers to Activities.

Summary

This chapter has introduced you to various types of data applications, as well as HPC applications. You have learned how to architect server solutions that meet the particular needs of each type of application and workload. You also learned how server solutions can fulfill the RAS requirements of mission-critical workloads.

Learning check

Review what you have learned by answering these questions. Then check your answers in Appendix A: Answers to Learning Checks.
1. What characterizes OLTP workloads?
a. Very large datasets
b. Distributed datasets over multiple systems
c. The need for scale-up architectures
d. Computationally complex queries

2. A customer requires a solution for a mission-critical transactional database. Why do Intel Xeon E7 series processors provide a good fit for this workload?
a. These processors have built-in RAS features for workloads that cannot tolerate any data loss or corruption.
b. These processors provide the highest clock speed per core but relatively few cores—the best fit for transactional workloads.
c. These processors provide high performance for a low TCO, enabling fast scale-out for the mission-critical workload.
d. These processors are specifically designed for use with scale-out, clustered applications.

3. For which customer need does object storage provide the best solution?
a. Need to provide block-level access to remote drives
b. Need to store structured databases for transactional processing
c. Need to store billions of voice, video, and email files
d. Need to provide a remote drive from which VMs can boot

For answers, see Chapter 3 in Appendix A.

Chapter 4 HPE Apollo Solutions for HPC

EXAM OBJECTIVES
• Explain the features and benefits of HPE Apollo 2000, 6000, and 8000 solutions
• Position HPE Apollo 2000 and 6000 solutions for the right use cases and workloads
• Create an implementation plan for an HPE Apollo 2000 or 6000 solution, including plans for the proper performance, scalability, high availability, and management

Assumed knowledge

Before reading this chapter, you should have a basic understanding of the following:
• Advanced architectural concepts (which are outlined in Chapter 3, "Advanced Architecture for Server Solutions")
• Processors and memory (including DDR3 and DDR4), hard disk drives (HDDs), solid-state drives (SSDs), and RAID levels for storage volumes
• HPE ProLiant rack and blade servers and options for them, such as HPE Smart Array Controllers
• HPE BladeSystems, including interconnect modules and Virtual Connect (VC) modules

Chapter topics

This chapter begins with an overview of the HPE Apollo family. Then you will examine use cases for the solution. Finally, you will learn about planning the architecture before examining how to manage the Apollo family servers.

HPE Apollo 2000, 6000, and 8000 overview

This section introduces you to the HPE Apollo 2000, 6000, and 8000 families.

HPE Apollo 2000

Figure 4-1 HPE Apollo 2000

HPE Apollo 2000 Systems offer an alternative solution for smaller high-performance computing (HPC) clusters and for companies taking their first steps toward HPC. The HPE Apollo 2000 System, shown in Figure 4-1, is the enterprise bridge to scale-out architecture. It delivers twice the density of traditional rack-mount systems and the efficiency of a shared infrastructure, but maintains a familiar form factor—the same racks, cabling, serviceability access, operations, and system management. No retraining of personnel or cost of change is required to introduce efficient, space-saving, scale-out architecture.

The Apollo 2000 System brings HPE ProLiant Gen9 server technology, including iLO 4, into this 2U, multi-server chassis. Storage and I/O flexibility enable customers to optimize for performance or economy—the right compute for the right workload.

Apollo 2000 System offerings

Figure 4-2 Apollo 2000 System offerings

The Apollo 2000 System is a density-optimized, 2U shared infrastructure chassis for up to four independent, hot-plug ProLiant Gen9 servers. It has all the traditional data center attributes, including support for standard racks and cabling, as well as rear-aisle serviceability access (see Figure 4-2). A 42U rack fits up to 20 Apollo r2000 series chassis, accommodating up to 80 servers per rack.

Apollo 2000 System servers provide the flexibility to tailor the system to the precise needs of each workload, with a range of compute, I/O, and storage options. Apollo 2000 System servers can be "mixed and matched" within a single chassis to support different applications. A chassis can even be deployed with a single server, leaving room to scale as customers' needs expand.

The Apollo 2000 chassis comes with four new-generation, single-rotor fans, and an additional four fans can be added for redundancy. Power can be managed by the HPE Advanced Power Manager (HPE APM), an optional rack-level manager discussed in Chapter 9, "Monitoring and Managing HPE Solutions."

HPE ProLiant XL170r—Gen9 1U Node

Figure 4-3 HPE ProLiant XL170r—Gen9 1U Node

The ProLiant XL170r Gen9 Server (shown in Figure 4-3) is a 1U half-width, two-processor server with configuration options for the following:
• Performance and efficient central processing units (CPUs)—Intel Xeon E5-2600 v3 or v4 series processor options with choices from 4 cores to 22 cores, 1.6 GHz–3.5 GHz CPU speeds, and power ratings between 85 W and 145 W. Customers can also choose Intel Xeon E5-1600 v3 series processors with choices from 4 cores to 8 cores and 3.2 GHz–3.7 GHz CPU speeds.
• 16 memory DIMM slots with up to 512 GB of double data rate fourth generation (DDR4) memory at up to 2133 MHz.
• Two I/O slots for a choice of fabric and clustering options, including 1 GbE, 10 GbE, 40 GbE, 56 Gb/s FDR InfiniBand, and Fibre Channel (FC), with options for either one PCIe slot plus a FlexibleLOM or two PCIe slots.

The Apollo r2000 series chassis accommodates up to four independently serviceable ProLiant XL170r Gen9 servers, supporting up to 80 servers in a 42U rack.

HPE ProLiant Apollo XL190r—Gen9 2U Node

Figure 4-4 HPE ProLiant Apollo XL190r—Gen9 2U Node

The ProLiant Apollo XL190r Gen9 Server (shown in Figure 4-4) is a 2U half-width, two-processor server with configuration options similar to the XL170r for CPU and memory. However, this server adds additional PCIe slots in multiple configurations, providing support for additional expansion cards and for two integrated accelerators per server. The tray supports a variety of NVIDIA and AMD graphics processing units (GPUs) and Intel Xeon Phi coprocessors; you should check the tray's QuickSpecs for up-to-date information.

This server leverages Intel's latest Xeon E5-2600 v3 or v4 series processors, increasing performance by up to 30%–40%. It supports DDR4 HPE SmartMemory with speeds of up to 2133 MHz and a 512 GB maximum, boosting bandwidth and efficiency by up to 50% over previous-generation servers. The dense and flexible HPE Apollo 2000 chassis can also dramatically accelerate professional applications with the GPUs or coprocessors.

Apollo 2000 storage flexibility

Figure 4-5 Apollo 2000 storage flexibility

The Apollo 2000 has three chassis options (shown in Figure 4-5), with different storage configurations. The HPE Apollo r2200 Chassis includes 12 large form factor (LFF) hot-plug SAS or SATA HDDs or SSDs allocated equally across server nodes. The HPE Apollo r2600 Chassis includes 24 small form factor (SFF) hot-plug SAS or SATA HDDs or SSDs, also allocated equally across server nodes. The HPE Apollo r2800 Chassis provides 24 SFF hot-plug SAS or SATA HDDs or SSDs, but it lets customers flexibly map the desired number of drives to each node.

The ProLiant XL170r and XL190r servers have embedded SATA storage controllers. Customers can also purchase PCIe Host Bus Adapters (HBAs) for SAS connectivity, as well as Smart Array Controllers to add features such as HPE SmartCache to improve performance and RAID 10 to improve fault tolerance and uptime.

All Apollo 2000 chassis are built with the following:
• Four server slots per chassis
• Up to two 800 W/1400 W power supplies
• HPE Thermal Logic technology for lower power consumption and airflow
• Four single-rotor fans (standard) and options for four additional single-rotor fans for redundancy
• Improved power consumption and acoustics

HPE Apollo 6000

Figure 4-6 HPE Apollo 6000

HPE Apollo 6000 Systems are designed to help customers obtain the right performance for their HPC applications with the right economics (see Figure 4-6). The extremely dense system can deliver up to 20 servers in 5U, giving customers up to four times more performance per dollar and per watt while using 60% less rack space compared with traditional servers. The systems consist of several chassis that share dynamically allocated power, making it easy to scale the solution, as well as maximizing rack-level energy efficiency and simplifying management.

The modular system lets you choose the right compute, memory, fabric, and storage options for the customer's workloads. By tailoring the solution to the requirements, you enhance the performance while decreasing total cost of ownership (TCO) by as much as $3 million.

Apollo a6000 Chassis

Figure 4-7 Apollo a6000 Chassis

The HPE Apollo a6000 Chassis is designed with density optimization in mind to help customers manage and scale to their business computing demands. The modular chassis holds various compute server and/or accelerator trays to fit specific workloads (see Figure 4-7). Each chassis can hold up to 10 single-slot trays or up to 20 servers. Cooling concerns are reduced by five dual-rotor fans that share a cooling zone, and, as an additional feature, power can be managed by an HPE Advanced Power Manager (APM) option at the server, chassis, or power shelf level.

Quick stats include the following:
• 5U tall
• Fits a standard 19-inch rack; ideal for a 1.0 m depth rack
• Holds 10 single compute trays vertically
• Rear NIC cabling
• Five 80 mm redundant fans
• Connects to a power shelf for pooled power (no internal power)

The chassis has these features:
• One-slot and two-slot tray support
– 10 single-slot trays
– Five double-slot trays
• Mix-and-match trays
• Shared cooling
• 12 V DC power distribution
• Up to 5700 W per chassis

The chassis also offers these serviceability features:
• Front-serviceable trays
• Standard rear cabling
• Front-serviceable hot-plug drives
• Redundant, hot-plug fans

Apollo 6000 Power Shelf

Figure 4-8 Apollo 6000 Power Shelf

The HPE Apollo 6000 Power Shelf, shown in Figure 4-8, offers pooled power for rack-level efficiency, as well as N+N redundancy, to support your customers' data center needs. Depending on the power configurations of the trays within a chassis, the power shelf can support two to four fully populated HPE Apollo a6000 Chassis with maximum DC power of up to 15.9 kW. The HPE Apollo 6000 Power Shelf, with its redundant hot-plug power supplies, can also be configured for single- or three-phase input.

Quick stats for the shelf include the following:
• 1.5U tall
• Efficient pooled/shared power infrastructure
• Holds up to six power supplies
– 2650 W Platinum hot-plug (15.9 kW nonredundant)
– 2400 W Platinum hot-plug (14.4 kW nonredundant)
• Supports N+1 or N+N redundancy
• One power shelf can support up to three or four fully loaded enclosures, depending on power capacity per enclosure

Apollo 6000 server options

Figure 4-9 Apollo 6000 server options

The Apollo 6000 has three server options, shown in Figure 4-9. For single-threaded workloads, the HPE ProLiant XL220a Gen8 v2 Server has two single-socket servers in each front-accessible server tray. Both of the processors are Intel Xeon E3-1200 v3 processors, and each one has four dedicated DDR3 memory slots, each capable of holding up to 8 GB UDIMMs. Each server also has two hot-plug SFF drives and one Serial/USB/Video (SUV) port dedicated to it.

The HPE ProLiant XL230a Server delivers 2P performance while taking advantage of the Apollo 6000 System's modular flexibility and rack-scale efficiency. This server leverages Intel's latest Xeon E5-2600 v3 and v4 series processors, increasing performance by up to 70%, and DDR4 HPE SmartMemory, which boosts bandwidth and efficiency by up to 50% over previous-generation servers. The modular HPE Apollo a6000 Chassis accommodates up to 10 single-slot XL230a server trays to address various workload needs.

The HPE ProLiant XL250a Server delivers 2P performance with dual accelerators while taking advantage of the Apollo 6000 System's modular flexibility and rack-scale efficiency. This server also leverages Intel's latest Xeon E5-2600 v3 and v4 series processors and DDR4 HPE SmartMemory. The modular HPE Apollo a6000 Chassis can accommodate up to five double-slot XL250a server trays to address various workload needs. For acceleration, customers can choose from a variety of NVIDIA and AMD GPUs, as well as Intel Xeon Phi coprocessors; you should check the tray's QuickSpecs for up-to-date information.

HPE Apollo 8000

Figure 4-10 HPE Apollo 8000

For customers with the greatest HPC demands, HPE offers the HPE Apollo 8000, a supercomputer solution that is the water-cooled counterpart of the HPE Apollo 6000 (see Figure 4-10). The HPE Apollo 8000 can hold up to 144 densely packed, powerful compute nodes or 72 compute nodes with accelerators. It also holds InfiniBand switches to interconnect the nodes at lightning speed.

This solution packs so much computing power into a rack by using innovative water cooling to allow more powerful processors in a smaller space, differentiating it from the competition. The HPE Apollo 8000 water-cooled rack supports four times as many teraflops per square foot as air-cooled systems, for more than 250 trillion floating-point operations per second (TFLOPS) per rack. Not only is the HPE Apollo 8000 more powerful, it is also greener, delivering 40% more floating-point operations per second (FLOPS) per watt and consuming 28% less energy than air-cooled systems.

The HPE Apollo 8000 includes many patented innovations, such as dry disconnect servers. The cooling system is sealed off such that IT staff can remove servers for maintenance without disrupting the system. In addition to saving a company’s cooling costs, the HPE Apollo 8000 can actively help make the company greener in other ways. The company can recycle the water heated by the system and use it to heat the facility. In these ways, the HPE Apollo 8000 can save up to 3800 tons of CO2 per year (or the equivalent of 790 fewer cars).

The HPE Apollo 8000 can meet the needs of scientific organizations that need to perform research computing, climate modeling, and protein analysis. It can also provide product modeling, simulations, and material analysis for manufacturing companies—as well as meet many other supercomputing use cases. Partners cannot sell this solution, but you can refer customers who might benefit from an HPE Apollo 8000 solution to HPE for an assessment of their needs. Because the Apollo 8000 is not the focus of this chapter, this section will not go into detail concerning the 8000’s components and options. You can learn more about the HPE Apollo 8000 by visiting the HPE website.

HPE Apollo 2000 and 6000 use cases

You will begin by learning about the high-performance computing (HPC) use cases for which HPE Apollo 2000 and 6000 solutions are designed.

Why HPC: To out-compute is to out-compete

Figure 4-11 Why HPC: To out-compute is to out-compete

The HPE Apollo systems that you just reviewed are purpose-built to support enterprise HPC (see Figure 4-11). HPC no longer belongs only to large research facilities; enterprises across many verticals have recognized that to out-compute is to out-compete. By embracing HPC, they can bring better products to market more quickly. In fact, 97% of companies that had adopted supercomputing said they could no longer compete or survive without it. The competitive advantages extend from the enterprise level to the national level as well, with political leaders and governments recognizing the trend and encouraging the adoption of HPC.

HPC applications

Figure 4-12 HPC applications

Because so many enterprises are adopting HPC, you will find opportunities to design HPC solutions for customers across many verticals, as you see in Figure 4-12.

Electronics continue to make vast strides as smartphones get smarter, cars become more efficient and connected, and hardware manufacturers pack more and more power into smaller packages. Engineers could not keep up this pace without the aid of computers themselves. Manufacturers use computer-aided engineering and electronic design automation (EDA) to simulate and design better and better chips.

In the healthcare vertical, HPC helps researchers model systems at the level of the ecosystem or the molecule. Pharmaceutical researchers use HPC to design drugs that are safer and more effective. Scientists use HPC to develop greener ways to run the world—from more efficient solar chips to new types of batteries. HPC also powers their research in innovative fields such as genetics and computational fluid dynamics.

You see the products of HPC in almost any movie, where computer-generated imagery (CGI) effects convince us to believe the unbelievable. HPC also has a home in the music industry, where it is helping to produce higher-quality recordings. You should be ready to ask about the need for HPC in proposals for government and education entities, any of which might use HPC for research.

Financial institutions require HPC to inform decisions such as where to invest money or which types of loans to make. This particular type of HPC is called a Monte Carlo simulation—a simulation that informs decisions that are influenced by many variables, some of them random. But Monte Carlo simulations are not just about finance. A retail company might need to choose the best location to open a new branch. A software company might need to decide how much to devote to developing a particular project.

Note that you might also encounter customers who are looking for a cloud solution as a way to scale out and obtain the resources that they need for HPC.
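To make the Monte Carlo idea concrete, the following minimal Python sketch estimates the expected profit of opening a hypothetical retail branch under uncertain demand. Every number and name in it is an illustrative assumption, not data from HPE or any customer; a real simulation would model many more variables and run across many HPC nodes.

import random

def simulate_year():
    demand = random.gauss(10000, 2500)   # assumed customers per year
    revenue = demand * 12.50             # assumed average spend per customer
    costs = 90000 + demand * 4.00        # assumed fixed plus variable costs
    return revenue - costs

trials = [simulate_year() for _ in range(100_000)]
mean_profit = sum(trials) / len(trials)
loss_probability = sum(t < 0 for t in trials) / len(trials)
print(f"Expected profit: {mean_profit:,.0f}")
print(f"Probability of a loss: {loss_probability:.1%}")

Because each trial is independent, this kind of workload parallelizes almost perfectly, which is one reason Monte Carlo jobs scale so well across many processor cores and nodes.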

HPC application requirements

Figure 4-13 HPC application requirements

The wide range of HPC applications has many requirements in common (see Figure 4-13). They all demand the highest possible levels of performance and efficiency—the type of efficiency that density-optimized solutions such as HPE Apollo 6000 solutions can deliver without compromising performance. Customers need accessible solutions with options for smaller deployments if their application has fewer requirements. And the solution must be able to scale easily as requirements increase.

Demand for infrastructure optimized for the application

Figure 4-14 Demand for infrastructure optimized for the application

Although all HPC applications have performance and efficiency requirements in common, no single type of server hardware is the right fit for every HPC application because applications differ in their architecture. Customers need an infrastructure that is tailored to their application’s requirements, as illustrated in Figure 4-14.

A multi-threaded HPC application lets a compute node divide a job process into many threads. A server with multiple cores or—as you will learn in more detail a bit later—GPU or coprocessor accelerators can run the threads in parallel and complete the process more quickly. A node with fewer cores could still run the job, but the threads would have to time-share the cores, and the job would take longer to complete. Deploying nodes with many cores and accelerators might also increase performance because jobs can run on fewer nodes, decreasing the chance of the interconnect acting as a bottleneck.

Remember what you learned about HPC applications in an earlier chapter. These applications often divide tasks among worker compute nodes using a mechanism such as Message Passing Interface (MPI). Threading is a bit different from parallel processing across an HPC cluster. Threading applies to how a single node handles the process. Distributed processing applies if the node makes an MPI call to another node to help run the job. An HPC application can use both multi-threading and parallel processing.

EDA and Monte Carlo simulations tend to be single- or lightly threaded. This means that each node can only use one or a few processor cores due to the application architecture. Such applications get the best performance boosts from increasing the power of each processor core in preference to increasing the number of processor cores per node. In addition, a 1P server might even be able to execute the job more efficiently than a 2P server; the 1P server introduces less latency because it does not need to maintain cache coherency. You need to help your customers find the right fit for their HPC application.
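The distinction between threading and distributed processing is easier to see in code. Here is a minimal sketch of MPI-style distributed processing using the third-party mpi4py package (an assumption for illustration; many HPC codes call MPI from C or Fortran instead). Each rank computes a partial result, and a single MPI call combines them.

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()        # this process's ID within the cluster job
size = comm.Get_size()        # total number of processes in the job

# Each rank takes a round-robin slice of the work (the decomposition
# scheme here is an arbitrary choice for this sketch).
partial = sum(x * x for x in range(rank, 1_000_000, size))

# The MPI call: combine the partial results on rank 0.
total = comm.reduce(partial, op=MPI.SUM, root=0)
if rank == 0:
    print("Combined result:", total)

Launched with, for example, mpiexec -n 8 python job.py, the ranks might all be processes on one node or spread across many nodes; the code is the same, which is why interconnect latency matters so much for highly parallelized applications.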

HPE Apollo 2000 and 6000 architecture

This section focuses on HPC use cases addressed by the HPE Apollo 6000. It also touches on appropriate situations in which to deploy the HPE Apollo 2000. The next section guides you through architecting HPE Apollo 2000 and 6000 solutions, helping you to choose components to meet customer requirements, plan for rack-level efficiency, and scale out the design.

Tailoring to the workload

Figure 4-15 Tailoring to the workload

You will now learn more about tailoring the solution to the workload (see Figure 4-15). The next several sections give you guidelines for selecting compute trays, accelerators, memory, storage, and fabric components for the HPE Apollo 2000 and 6000 solutions.

Tailoring the compute tray to the workload: HPE Apollo 6000

Figure 4-16 Tailoring the compute tray to the workload: HPE Apollo 6000

The HPE Apollo a6000 Chassis provides ten compute tray slots, each of which you can populate with one of three compute trays (shown in Figure 4-16).

Optimized for single-threaded HPC, the HPE ProLiant XL220a Gen8 v2 compute tray includes two one-processor (1P) servers for a total of 20 per chassis. A chassis is 5U, so the servers have four times the density of traditional 1U servers. As you learned, single-threaded HPC applications (such as EDA, as well as some engineering, risk analysis, and life sciences applications) benefit from processors with higher clock speeds, even if those processors have fewer cores. The XL220a delivers the fastest clock speed—up to 3.7 GHz, or 4.1 GHz with Turbo Boost—with an Intel Xeon E3-1200 v3 four-core processor. Each core provides better per-thread performance than a core on a 2P server. Because 20 of these 1P servers fit in the same space as 10 2P servers, the system as a whole is optimized. For some single-threaded HPC applications, deploying these servers can improve efficiency by 35% over deploying 2P servers. And according to a SPECjbb2013-MultiJVM benchmark of June 2014, the XL220a is the industry-leading 1P server with 16,252 max-jOPS and 4721 critical-jOPS.

For lightly threaded and multi-threaded HPC, you can get more power from the HPE ProLiant XL230a Gen9 compute tray. This tray includes one 2P server for a total of 10 2P servers per chassis. These servers support the latest generation Intel Xeon E5-2600 v3 or v4 processors, which provide up to 70% more power and 36% more efficiency than the previous generation. Examples of HPC applications that run well on this tray include risk analysis (Monte Carlo simulation) and oil and gas seismic processing.

The HPE ProLiant XL250a Gen9 compute tray boosts performance for multi-threaded HPC applications. This tray features the same 2P server as the ProLiant XL230a tray, but adds support for up to two accelerators. You can select an NVIDIA Tesla accelerator tray, an Intel accelerator tray, or an AMD accelerator tray. You can then install up to two accelerators of the corresponding type in the tray (NVIDIA Tesla K40, Tesla K80, Tesla M60, Tesla M60 LAF, or GRID K1 Quad GPUs; Intel Xeon Phi 5110P or 7120P coprocessors; or AMD FirePro S9150 GPUs, as of the publication of this ebook). To make room for the accelerators, the XL250a is a double-width server tray. Therefore, the density of servers per chassis is lower—five 2P servers per chassis. However, for the right HPC applications, the accelerators can more than make up for this lower density. Examples of HPC applications that benefit from acceleration include seismic analysis, risk analysis, Monte Carlo simulation, weather simulation, and genomics. Note that some types of HPC, such as Monte Carlo simulation, could fit well on various processors; you should examine the needs of your particular customer’s application and use case. In a moment, you will learn more about how you can determine whether an application will benefit from accelerators.

Tailoring the compute tray to the workload: HPE Apollo 2000

Figure 4-17 Tailoring the compute tray to the workload: HPE Apollo 2000

As mentioned in the previous section, you should choose the Apollo 2000 System when the customer requires a smaller deployment. In addition, perhaps the customer is just getting started with HPC and wants a solution with a familiar form factor. The HPE Apollo 2000, shown in Figure 4-17, provides this familiar 2U form factor. It supports two options for compute trays:
• The ProLiant XL170r is quite similar to the XL230a. The XL170r also provides a 2P server with Intel Xeon E5-2600 v3 processors and is well suited to lightly multi-threaded HPC. The Apollo 2000 chassis can hold four of these trays.
• The ProLiant XL190r provides a 2P server with E5-2600 v3 processors as well as two trays for accelerator options similar to those of the XL250a (as of the publication of this ebook: NVIDIA Quadro K4000, NVIDIA Tesla K40 or K80 GPUs, NVIDIA GRID K2 RAF PCIe GPUs, NVIDIA GRID M60 RAF Dual GPUs, AMD S9150 accelerators, and Intel Xeon Phi 5110P coprocessors). An Apollo r2000 chassis can hold only two of these trays.

Why GPU and coprocessor acceleration

Figure 4-18 Why GPU and coprocessor acceleration

CPUs were designed to meet the needs of many different types of workloads, including single-threaded processes and multi-threaded ones. GPUs, on the other hand, were originally designed for just one purpose: rendering graphics. Rendering each pixel constituted one task, separate from other tasks, so GPUs were optimized for multi-threading, rendering as many pixels as possible in parallel, as illustrated in Figure 4-18. Many HPC applications also feature workloads that can be parallelized and divided into many threads. These applications benefit highly from running on a CPU that is enhanced with a GPU.

The NVIDIA GPUs can boost performance up to ten times, depending on the application. The Tesla K40 provides up to 4.29 single-precision TFLOPS and 1.43 double-precision TFLOPS with NVIDIA GPU Boost, 12 GB memory, and 288 GB/s memory bandwidth. The Tesla K80 dual GPU provides up to 8.73 single-precision TFLOPS and 2.91 double-precision TFLOPS with NVIDIA GPU Boost, 24 GB memory, and 480 GB/s memory bandwidth. Refer to NVIDIA materials for the latest specifications and information on other GPUs. The AMD S9150 provides 5.07 single-precision TFLOPS, 2.53 double-precision TFLOPS, 16 GB memory, and up to 320 GB/s memory bandwidth.

Instead of GPU accelerators, you can install Intel coprocessors. Each coprocessor consists of a dense group of cores (60 for the Intel Xeon Phi 5110P and 61 for the Intel Xeon Phi 7120P) and solid memory and memory bandwidth (8 GB and 320 GB/s for the 5110P; 16 GB and 352 GB/s for the 7120P). These coprocessors, like GPUs, are also optimized for highly parallelized tasks and can speed those tasks with up to 1.2 double-precision TFLOPS. See Intel materials for more precise benchmarks.

Before you choose accelerators, it is crucial that you discover whether your customer’s application is architected to take advantage of that accelerator. NVIDIA and Intel provide searchable lists of such applications:
• http://www.nvidia.com/object/gpu-applications.html
• https://software.intel.com/en-us/xeonphionlinecatalog
The NVIDIA site provides estimates of how much the GPU will accelerate the performance for the application. All of these GPUs and coprocessors, including the AMD FirePro S9150, are OpenCL 1.2-compliant. (OpenCL is an open standard for developing parallel computing, graphics, and other types of applications that can run on a variety of hardware.) You can look up libraries and applications that use OpenCL at https://www.khronos.org/opencl/resources.

After you know that your customer’s application can use the acceleration, you can consider whether you need one of the higher memory and performance options. Also remember to check the particular accelerators that are supported by the XL server.
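Because all of these accelerators are OpenCL 1.2-compliant, one practical way to verify that a host can see them is to enumerate its OpenCL devices. The following minimal sketch uses the third-party pyopencl package (an assumption; it also requires the vendor’s OpenCL drivers to be installed):

import pyopencl as cl

# List every OpenCL platform (NVIDIA, AMD, Intel, ...) and its devices.
for platform in cl.get_platforms():
    for device in platform.get_devices():
        print(platform.name, "|", device.name,
              "| compute units:", device.max_compute_units,
              "| global memory (GB):", round(device.global_mem_size / 2**30, 1))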

Tailoring memory to the workload: Capacity

Figure 4-19 Tailoring memory to the workload: Capacity

You will now move on to the next choices for tailoring the compute tray options: choosing the number and type of DIMMs (see Figure 4-19). You will need to work closely with the customer to determine the memory capacity requirements for their application. To keep the computation running as quickly as possible, the application needs to be able to work with data in memory rather than on a drive. Some HPC applications work with smaller sets of data, while others work with very large ones. By increasing the capacity of the memory to hold as much of the dataset as possible, you can improve the performance of the application.

Also consider the number of processor cores, because all of the cores share the same memory. A multi-threaded HPC application can use the cores intensively. In addition, HPC schedulers often allocate jobs per processor core. If a processor might be handling several different jobs on its cores, you should take care to plan enough memory so that the jobs do not contend too much, which would decrease the performance of the solution. In other words, you would try to plan enough memory to hold the dataset for several jobs. As a general rule, provision at least 2 GB per core. Preferably, provision at least 4 GB or even 8 GB per core, depending on the application’s demands. Note, though, that this is only a guideline intended to give you a minimal starting point for planning. Understanding the dataset size and number of jobs per processor is critical. Later in this chapter, you will also learn a bit about benchmarking application needs.

The XL220a compute tray supports up to 32 GB of RAM per processor, which is often enough for single-threaded applications. You should generally provision up to this level to get the best performance. If your customer requires more memory, you can select the XL230a instead. The XL170r, XL190r, XL230a, and XL250a compute trays have four memory channels with two slots each on each processor. Currently, the XL230a and XL250a support DIMMs with capacities up to 64 GB, for up to 512 GB of RAM with one processor and 1024 GB with two processors, providing ample memory for processors with many cores. The XL170r and XL190r currently support DIMMs with capacities up to 32 GB, for 256 GB with one processor and 512 GB with two.
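A quick calculation turns the per-core guideline into a per-node figure. In this sketch, the core counts and GB-per-core targets are assumptions you would replace with the customer’s actual processor choice and application profile:

def node_memory_gb(sockets, cores_per_socket, gb_per_core):
    # Total memory to provision so that jobs scheduled per core
    # do not contend for the shared memory.
    return sockets * cores_per_socket * gb_per_core

# Example: a 2P tray with assumed 12-core processors at 4 GB per core.
print(node_memory_gb(sockets=2, cores_per_socket=12, gb_per_core=4))  # 96 GB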

Tailoring memory to the workload: Performance

Figure 4-20 Tailoring memory to the workload: Performance

Often HPC requires you to maximize performance, so you should select higher-speed memory, as you see in Figure 4-20. For the XL170r, XL190r, XL230a, and XL250a compute trays, you should also consider which type of memory to install: registered DIMMs (RDIMMs) or load-reduced DIMMs (LRDIMMs). LRDIMMs generally provide better performance at the cost of somewhat higher energy use. Also note that standard rather than low-voltage memory provides better performance.

Performance also depends on how you distribute the memory. To obtain the best performance, you should balance the DIMMs (UDIMMs, RDIMMs, or LRDIMMs) across the memory channels on the processor. For example, if you need 64 GB for an XL230a processor, select four 16 GB DIMMs—one for each channel—rather than two 32 GB DIMMs. You must install the memory in the correct DIMM slots based on which processors you are using, how many DIMMs you are using, and the number of ranks the memory provides. Visit http://h22195.www2.hp.com/DDR4memoryconfig to obtain valid memory configurations.

See Table 4-6 in the “Supplemental content” section at the end of this chapter for an overview of the compute tray options; for details, refer to the compute tray’s QuickSpecs.
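The channel-balancing rule can be expressed as a simple search: prefer one DIMM per channel before doubling up. This sketch assumes four channels per processor and a generic set of DIMM sizes; the HPE memory configuration tool above remains the authoritative source for valid layouts:

def balanced_config(target_gb, channels=4, dimm_sizes=(8, 16, 32, 64)):
    # Try one DIMM per channel first, then two per channel.
    for per_channel in (1, 2):
        for size in dimm_sizes:
            if size * channels * per_channel == target_gb:
                return channels * per_channel, size
    return None  # no exact balanced fit; revisit the capacity target

count, size = balanced_config(64)     # 64 GB per processor, as above
print(f"{count} x {size} GB DIMMs")   # 4 x 16 GB, one per channel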

Tailoring storage to the workload

Figure 4-21 Tailoring storage to the workload

Because HPC typically works within a cluster of compute nodes, each of which might need access to the same files, shared storage plays a crucial role. However, local storage can still be important to the functioning of the application. For example, the application might use local drives for temporary files to which it needs to read and write quickly during a particular job, as illustrated in Figure 4-21. For both types of storage, you need to consider the vast demands that HPC can place on storage. HPC calls for both high performance and high capacity.

First, consider the performance needs. As you know, storage performance is generally measured in random input/output operations per second (IOPS), which measures how many different read or write requests the storage can accommodate per second, and in sequential IOPS, which measures how quickly the drive can deliver a sequence of data such as a complete file. For local storage, the random IOPS versus sequential IOPS demands depend largely on how the application works. You will consider various factors in the next section.

HPC can create very high demands for random IOPS in the shared storage because, as each compute node works on its job or portion of a job, the node accesses the shared storage. Many different nodes might access a shared drive at the same time, asking for different files or portions of files. If computations involve accessing many different small files—as does, for example, the physical design portion of EDA—the random IOPS must be particularly high. If the HPC application calls for nodes to work with large files, high sequential IOPS might be important as well. HPC can also create large capacity demands. The application might be working with vast data sets and large, complex file systems.

The next sections give some guidelines for maximizing IOPS, particularly random IOPS. Storage I/O can be the slowest part of a job, so enhancing performance can pay off in speeding up the job’s runtime. (On the other hand, storage I/O might only form a small part of the job, in which case performance increases are less important. Consider the particular needs of the customer application as you optimize.) The next sections also point to ways that shared storage can scale to meet the needs of large HPC clusters that work with a great deal of data.

Tailoring local storage to the workload: IOPS and throughput

Figure 4-22 Tailoring local storage to the workload: IOPS and throughput

You will now look more closely at planning the local storage on each compute node. First, consider some of the options that you have for different types of drives. In the HPE Apollo 6000 Systems, each XL compute tray has its own drives. The HPE Apollo 2000 chassis, on the other hand, provide the drives for their compute trays—depending on the chassis, either allocating the same number of drives to each tray in a fixed manner or allocating them flexibly, as you learned earlier. In either case, the compute trays support both SAS and SATA HDDs, as well as SATA SSDs and SAS SSDs (SAS SSDs are not currently supported on the XL220a). Figure 4-22 indicates generally how these options compare in the performance that they provide.

SSDs cost more than HDDs, but they outperform HDDs in several important areas: they provide higher sequential IOPS and much higher random IOPS. Consider how the customer and HPC application will be using this storage. If the local storage is intended for purposes unrelated to the HPC application, you can propose lower-performance options. When the HPC application is using the local storage for temporary files, though, optimizing for performance can be critical.

Assess how intensively the HPC application will use the local drives. Does it need to read from or write to the drives frequently? In that case, the higher-cost SSDs might be worthwhile for the customer. Will the application need to read from and write to different portions of the file throughout the computation? In this case, the local storage must deliver high random IOPS. Or will a particular job bring a small file into its memory, use its memory, and then write a result to the drive at the end of the computation? In this case, sequential IOPS might be more important. In either case, SSDs deliver the best performance.

If you need to propose HDDs as a less expensive alternative, always recommend enterprise-class drives. Select the higher rotations per minute (RPM) option to optimize random IOPS. Also consider the protocol, which affects throughput. The 12Gbps options, of course, provide higher throughput and also higher sequential IOPS, which depend to a large degree on the throughput. Traditionally, SAS drives provide better performance and reliability, but a high-capacity SAS drive is more expensive than a SATA drive with the same capacity.

If the customer requires the highest performance for reading and writing local data, consider adding an HPE Value Endurance (VE) PCIe Workload Accelerator to the compute tray. These accelerators increase the IOPS for connected SSDs and can provide very low latency and four times more transactions per server.
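The interplay between IOPS and throughput comes down to I/O size. The following sketch, using assumed round numbers rather than any real drive’s specifications, shows why small random I/O stresses IOPS while large sequential I/O stresses throughput:

def throughput_mb_per_s(iops, io_size_kb):
    # Throughput is simply operations per second times bytes per operation.
    return iops * io_size_kb / 1024

# Small random I/O: high IOPS, modest throughput.
print(throughput_mb_per_s(iops=50000, io_size_kb=8))    # ~391 MB/s
# Large sequential I/O: far fewer, bigger operations dominate throughput.
print(throughput_mb_per_s(iops=2000, io_size_kb=512))   # 1000 MB/s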

Tailoring storage to the workload: Other considerations

Figure 4-23 Tailoring storage to the workload: Other considerations

You should also discuss the reliability requirements with the customer. How mission critical is data stored on local drives? In many cases, files in local storage are copied to shared storage, but the company might have special requirements. Also consider the endurance requirements. You might want to propose high-endurance SSDs because the drives often get a lot of use as temporary files are saved to them over many jobs. Note that HPE provides SSDs that are optimized for different purposes, whether read-intensive, write-intensive, or mixed-use. You should discuss which types of use the customer’s HPC application requires. (Note that you cannot reach the maximum capacity indicated in Figure 4-23 with some varieties.)

You should have now selected the type of drive. Next, determine the required capacity. Discuss with the customer whether the drives will be used for temporary files only or whether files will accumulate on them. Sizing the local storage to accommodate the full temporary needs can speed up the job by ensuring that the node does not have to interact with shared storage many times throughout the job. The HPC application might give guidelines as to the local needs. For example, an EDA application might require twice as much local storage space as memory.

As you see in Figure 4-23, the compute trays for the Apollo 6000 servers support high-capacity options for both HDDs and SSDs, so you should be able to meet the customer requirements no matter which type of drive you have selected based on the performance requirements. The Apollo r2000 Chassis can provide an even larger amount of local storage capacity for their compute trays, depending on the drive type and chassis, as shown in Table 4-1. As always, this table and the capacity information in Figure 4-23 are provided for your convenience; you should check QuickSpecs for the latest information.

Table 4-1 Local storage for HPE Apollo 2000

Also note that the compute trays include an embedded controller for the drives (the HPE Dynamic Smart Array B140i SATA controller). However, you might propose a Smart Array controller instead, which is supported in a PCIe expansion slot. As you know from prerequisite training, Smart Array controllers provide additional benefits such as HPE Secure Encryption and SSD Smart Cache. The customer might also require the Smart Array controller to support SAS drives. As of the publication of this ebook, supported controllers besides the embedded controller include the following:
• For the XL220a—HPE Smart Array P430/2G and 4G SAS controller
• For the XL230a and XL250a—HPE Smart Array P440 SAS controller
• For the XL170r and XL190r:
  – HPE Smart Array P440/4G Controller
  – HPE Smart Array P441/4G Controller
  – HPE Smart Array P840/4G Controller
  – HPE Smart Array P841/4G Controller
For more details, refer to the compute tray’s QuickSpecs.

Assessing how the current environment affects shared storage and fabric choices

Your final choices for tailoring the solution to the customer scenario include selecting network adapters for the compute trays and adding additional components, such as servers to host shared storage or top-of-rack (ToR) switches to support the HPC interconnect fabric. These choices depend on the customer’s current environment, which you should assess during conversations with the customer. If the customer already has an HPC application, your questions should reveal what type of shared storage the application uses and how compute nodes reach that storage. You should also discover the type of interconnect fabric, if different from the fabric used to connect compute nodes and storage. With this knowledge, you can ensure that your final choices for the solution fit with the customer’s current environment. You can also use surveys to assess the customer’s satisfaction with the current environment and general expectations.

The request for proposal (RFP) might also include updating the storage and fabric components of the solution, so you need to be ready to architect that portion of the solution or to work with a team member to architect it. The next sections provide guidelines for assessing the shared storage solution, proposing a new solution if necessary, and also proposing network adapters that fit the customer’s environment and requirements.

Shared storage approaches that you might encounter

Figure 4-24 Shared storage approaches that you might encounter

You will now look at some of the shared storage options that you might encounter, which are explained in the sections below. It is important that you understand these options (shown in Figure 4-24) to ensure that your solution fits with them. You will probably encounter Network Attached Storage (NAS) most often, although parallel distributed storage is becoming more common.

Storage area network (SAN) shared disk

A SAN provides block storage through a technology such as FC or iSCSI. To the compute node operating system (OS), the disks in the FC array appear as local drives, which the OS is allowed to access at the block level. A SAN can provide high random IOPS, as well as high sequential IOPS, with the sequential IOPS depending largely on the network bandwidth. However, although FC provides a limited degree of access control, block storage technologies were not designed to manage multiple nodes accessing the same shared disks. Each compute node connected to the SAN requires a shared disk solution to manage shared access. For example, Oracle Cluster File System (OCFS) helps a node lock a file before altering it.

For this solution, all compute nodes require a connection to the SAN. If the SAN uses iSCSI and the compute nodes use Ethernet for their interconnect, the compute nodes can use their interconnect for the storage traffic as well. Compute nodes traditionally required FC HBAs to reach an FC SAN. However, they can now use Converged Network Adapters (CNAs), which carry both Ethernet and FC over Ethernet (FCoE) traffic, allowing the nodes to use the same links for the interconnect and for storage. Note that the fabric infrastructure must also support FCoE as well as the Data Center Bridging (DCB) technologies that provide low latency and lossless delivery.

To minimize the number of nodes that connect to the SAN, some HPC applications distinguish between compute nodes and IO nodes. The application is designed to allow compute nodes to direct their file requests to IO nodes. Only the IO nodes connect to the SAN and run the shared disk solution. If your customer’s application takes this approach, you will need to determine whether to propose a solution for the IO nodes, which often have different requirements from the compute nodes, since you need to optimize them for serving files rather than for running computations.

Traditional Network Attached Storage (NAS) solution

In a NAS solution, compute nodes receive access to files on shared storage drives through a NAS server. As long as the node is set up as a NAS client, the shared drive appears as a local drive to the OS just as it does in a SAN shared disk solution, allowing the HPC application to call up files without special coding. However, the compute nodes only access storage at the file level, not the block level, and a NAS server controls each node’s access. This solution can be more reliable than a shared disk solution, in which a misbehaving node might improperly write to a file.

Network File System (NFS) is the typical NAS solution for Linux nodes. Each compute node is an NFS client to the NFS server. The NFS server holds the file system and connects to the shared storage drives, which are generally directly attached to the server—although they could be attached through a SAN. The NFS server is responsible for serving all files to the compute nodes. The NFS server’s ability to meet random IOPS demands depends on the capabilities of its local disks as well as the server’s ability to handle sessions with many clients. You can optimize sequential IOPS for this solution by increasing throughput across the path between compute nodes and the shared storage.

Scale-out solutions

Traditional NAS allows only one NAS server (or perhaps one NAS server and a standby server for high availability) per file system. Because a single NAS server cannot always meet the high IOPS and capacity needs of HPC, companies are looking for ways to scale out. HPE IBRIX Fusion is an example of a scale-out NAS solution, which combines traditional NAS protocols such as NFS with a clustered file system. The cluster of NAS servers connects to shared storage in a SAN. Because nodes have many more servers to address their needs, the storage solution can scale much further.

In an architecture that is perhaps more common (because it eliminates the need for a SAN), a cluster of servers can each contribute its local disk drives to the solution, and data is striped across these disks. This approach is called a parallel file system. Lustre is one of the most common parallel file systems for Linux HPC environments. Another example is GlusterFS, which can use a shared SAN or local disk drives.

Lustre includes metadata servers (which store information such as file names, directory names, and file access rules) and object servers (which store the actual files). Compute nodes are clients. When a client needs to access a file, it first contacts a metadata server to learn where the file is stored. It then contacts the multiple object servers that store pieces of the file. Because clients interact with many servers in parallel and because data is distributed across each server’s drives, the solution can scale predictably with the addition of more servers. Like NFS, Lustre enables the compute node OS to view shared files as local.
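A toy example helps show why a parallel file system scales where a single NFS server does not. This sketch round-robins the stripes of a file across object servers; the stripe size and server count are assumptions, and real Lustre layouts are considerably more sophisticated:

def stripe_map(file_size_mb, stripe_mb=1, servers=4):
    # Map each stripe-sized chunk of the file to an object server.
    placement = {}
    for offset in range(0, file_size_mb, stripe_mb):
        server = (offset // stripe_mb) % servers   # round-robin placement
        placement.setdefault(server, []).append(offset)
    return placement

# A 10 MB file lands on all four object servers, so a client can read
# from them in parallel instead of queuing behind one NAS server.
for server, offsets in sorted(stripe_map(10).items()):
    print(f"object server {server}: stripes at MB offsets {offsets}")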

Options for when the customer needs a shared storage solution: HPE ProLiant SL4540

Figure 4-25 Options for when the customer needs a shared storage solution: HPE ProLiant SL4540

If the customer wants to keep their current storage solution, you can move on to selecting network adapters, keeping in mind what you learned about how compute nodes reach the shared storage. In some cases, though, you will need to propose a shared storage solution.

HPE ProLiant SL servers are optimized for scale-out NAS and parallel distributed file solutions. For example, an HPE ProLiant SL4540 provides up to three powerful compute nodes and up to 60 LFF drives (one-node configuration). You can add the required number of SL servers to the HPE Apollo 6000 rack or scope out a rack to serve multiple HPE Apollo racks.

You can choose one of three models (see Figure 4-25). The 1x60 model has one server node with 60 drives. The 2x25 model has two nodes, each of which has 25 drives, for 50 total. The 3x15 model has three nodes, each of which has 15 drives, for 45 total. The models with fewer server nodes offer more capacity per node but can serve fewer clients. To maximize random IO, you would choose a model with more nodes.

Other training provides guidelines on planning HPE ProLiant SL solutions, but Tables 4-2, 4-3, and 4-4 below show the maximum capacity to give you an idea of the number of systems that you will require.

Table 4-2 Maximum storage capacity for HPE SL4540 1x60 Model

Disk type   Protocol   Maximum capacity*
HDD         SATA       360TB (60 x 6TB)
HDD         SAS        360TB (60 x 6TB)
SSD         SATA       48TB (60 x 800GB)

*Two slots for SFF drives are also provided, adding up to 2TB (HDD) or 1.6TB (SSD)

Table 4-3 Maximum storage capacity for HPE SL4540 2x25 Model

Disk type   Protocol   Maximum capacity*
HDD         SATA       300TB (50 x 6TB)
HDD         SAS        300TB (50 x 6TB)
SSD         SATA       40TB (50 x 800GB)

*Four slots for SFF drives are also provided, adding up to 4TB (HDD) or 3.6TB (SSD)

Table 4-4 Maximum storage capacity for HPE SL4540 3x15 Model

Disk type   Protocol   Maximum capacity*
HDD         SATA       270TB (45 x 6TB)
HDD         SAS        270TB (45 x 6TB)
SSD         SATA       36TB (45 x 800GB)

*Six slots for SFF drives are also provided, adding up to 6TB (HDD) or 4.8TB (SSD)

Options for when the customer needs a shared storage solution: HPE Apollo 2000 local storage

Figure 4-26 Options for when the customer needs a shared storage solution: HPE Apollo 2000 local storage

If you are architecting an HPE Apollo 2000 solution and you need to propose a shared storage solution, you might choose to use the local storage for this purpose (see Figure 4-26). These HPC clusters tend to be relatively small, and the Apollo r2000 chassis provides a higher density of storage, sometimes enabling it to meet the cluster’s needs. The HPE Apollo 6000 compute trays each have their own storage. The HPE Apollo r2000 chassis, on the other hand, provides the storage to installed trays:
• HPE Apollo r2200 chassis—Provides 12 LFF SATA or SAS HDDs or SSDs, equally distributed (three per XL170r server or six per XL190r)
• HPE Apollo r2600 chassis—Provides 24 SFF SATA or SAS HDDs or SSDs, equally distributed (six per XL170r server or 12 per XL190r)
• HPE Apollo r2800 chassis—Provides 24 SFF drives like the r2600; however, you can choose how many to allocate to each server

The HPE Apollo r2800 chassis can be a good choice for the HPC solution. You can select one or two servers to act as the file servers or hosts for shared storage and assign all or most of the drives to them. HPE Apollo 2000 Systems can also act as alternatives to the HPE ProLiant SL4540, providing shared storage for HPE Apollo 6000 HPC clusters.

Tailoring fabric to the workload: Options

Figure 4-27 Tailoring fabric to the workload: Options

You are now ready to select adapters for the solution. The HPE Apollo a6000 Chassis provides ten Innovation Zones, where you install the fabric options (shown in Figure 4-27). Each Innovation Zone is dedicated to a compute tray slot, and you can mix and match options to select the right one for each tray. For each XL220a, XL230a, or XL250a compute tray, you can either install an IO module, which includes two 1GbE ports, or a Dual FlexibleLOM riser, which supports up to two FlexibleLOM cards. For each XL230a or XL250a tray, you can alternatively install a PCIe/FlexibleLOM riser, which supports one FlexibleLOM card and one card with a PCIe form factor. For example, you could install an FC HBA for connecting to an FC storage array.

HPE Apollo XL170r and XL190r servers for r2000 chassis also support FlexibleLOM cards, as well as other network adapters. You must choose the appropriate risers to support the FlexibleLOM cards, and your choices also affect how many PCIe expansion slots you can use for components such as Smart Array Controllers, HBAs, and accelerators (for the XL190r).

The FlexibleLOM cards and IO network adapters include options for four-port 1GbE, two-port 10GbE, two-port 10GbE FlexFabric (which might provide special features described on the next page), and two-port InfiniBand. For details about the exact adapter models, refer to the compute tray’s QuickSpecs.

Tailoring fabric to the workload: Choosing options

Figure 4-28 Tailoring fabric to the workload: Choosing options

To choose between the adapter options, you need to consider what you learned about the customer’s existing environment as well as collect information about the HPC application requirements. For highly parallelized, multi-threaded HPC applications, which run jobs across many nodes in a cluster, the interconnections between compute nodes can act as a bottleneck, slowing down the computation. Properly provisioning the interconnections, on the other hand, can effectively give the cluster a performance boost. Consider questions such as these:

• Does the customer have an existing fabric solution in which your proposal must fit? If so, you must, of course, select adapters that match the current solution.

• If you have more freedom in the proposal, does the customer have a preference toward Ethernet or toward InfiniBand? Does the customer IT staff have more experience with Ethernet? If so, an Ethernet adapter that can meet the performance requirements might be the best choice.

• How parallelized is the application? Do nodes need to share large amounts of data with each other at very low latency? Or are nodes running batch jobs that run largely independently? Whenever HPC applications are highly parallelized, such as with Message Passing Interface (MPI), the interconnect must deliver high throughput and low latency. The InfiniBand options for HPE Apollo 6000 compute trays can provide 10 Gbps or 56 Gbps. InfiniBand also delivers extremely low latency: it avoids the traditional IO stack and instead uses Remote Direct Memory Access (RDMA) to connect nodes at the memory level, essentially extending the internal fabric between nodes. Ethernet can provide high speeds, but traditionally it has higher latency. However, if the customer prefers Ethernet, certain HPE FlexFabric 10GbE adapters can support RDMA over Converged Ethernet (RoCE), which reduces latency (see Figure 4-28). Note that Converged Ethernet uses DCB technologies to ensure the low latency and lossless delivery required for RoCE. Make sure that the fabric infrastructure also supports these technologies. The FlexFabric adapters also feature offloading of traffic processing, which prevents precious compute power from being consumed by processing traffic. If the compute nodes are running independent batch jobs, 1GbE might meet their needs adequately. However, keep in mind the nodes’ storage needs as well (as discussed below).

• What type of remote storage solution are you planning? And how does the HPC application use files retrieved from this storage? If the application interacts often with the shared storage, low latency is a must. What size files does the application work with? Larger files require more bandwidth (higher speeds). Will a single NIC, or a pair of redundant adapters, provide enough bandwidth for the interconnect and the storage traffic?
  – If the customer requires an FC SAN solution, you might want to select FlexFabric adapters that support FCoE. The two 10GbE ports can then provide both the interconnect and the SAN connection, as well as redundancy for both of these connections.
  – If the customer requires a NAS or a parallel distributed file solution, the compute nodes generally connect to the servers using Ethernet (although InfiniBand might be used in some cases). If you are planning to use Ethernet for both the interconnect and shared storage, plan for 10GbE, not 1GbE. If you are using InfiniBand for the interconnect, you will probably need to plan two cards for each compute node: one InfiniBand and one Ethernet.

• What type of availability is required? Many HPC applications or their management applications have mechanisms for dealing with the loss of a single node within a cluster. If not, the loss of a node could compromise the completion of an important task that has taken hours to compute. Based on the capabilities of the customer’s software, decide whether one port on each server is sufficient for the interconnect or whether two are required for better resiliency. If you are using Ethernet, you can bond the adapters (NIC teaming), and you might want to use a mode such as Link Aggregation Control Protocol (LACP) that allows both adapters to be active to increase throughput.

Options for when the customer needs a fabric proposal

Figure 4-29 Options for when the customer needs a fabric proposal

As discussed earlier, you might need to deliver top-of-rack (ToR) switches as part of the solution, ensuring the proper technologies for maximizing cluster performance. Figure 4-29 shows an example design for an HPC cluster with many servers, which are housed in several HPE Apollo a6000 chassis. Several HPE ProLiant SL servers provide the shared storage. This design uses a pair of HPE FlexFabric 5930 switches, just one example of an HPE switch model that meets these needs. Using Intelligent Resilient Framework (IRF), these switches can act as a single virtual switch, giving you the freedom to connect one NIC in a bonded pair to one switch and the other NIC to the other switch for better resiliency. Because the two switches act as one, the bonded adapters can use a bonding mode that requires aggregation on the switch side, such as LACP, so the solution provides load balancing on the bonded adapters as well.

The two 5930 switches could support a smaller cluster of about 100 nodes on their own. If you need to scale the cluster beyond that, the ToR switches could connect to another tier of switches (such as HPE FlexFabric 7900 switches) in a leaf-spine topology. The switches support Transparent Interconnection of Lots of Links (TRILL) to ensure that the full bandwidth on the uplinks is used. Note that an HPC interconnect often requires a nonblocking or near-nonblocking design, so network designers will need to plan the ratio of 40G uplinks to 10G server links accordingly.

These switches also support FCoE, so they can support servers that use FCoE to connect to shared FC or FCoE storage. They can even connect directly to native FC storage using flexible ports that can operate in FC or Ethernet mode (see Figure 4-30).

Figure 4-30 Options for when the customer needs a fabric proposal with FCoE support

If you have selected InfiniBand for the interconnect, you must ensure that the customer has Mellanox FDR edge switches that support the same adapter type (10 Gb/s or 56 Gb/s). The edge switches will generally need to connect to core director switches in a nonblocking fat tree topology. Examples of this topology are shown in Figure 4-31 and Figure 4-32. Note that you can establish one link or two links per server.

Figure 4-31 Options for when the customer needs a fabric proposal, InfiniBand with one link per server

Figure 4-32 Options for when the customer needs a fabric proposal, InfiniBand with two links per server
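The nonblocking planning mentioned above for both the leaf-spine and fat tree designs reduces to a ratio of server-facing bandwidth to uplink bandwidth. This sketch uses assumed port counts and speeds for illustration:

def oversubscription(server_ports, server_gbps, uplinks, uplink_gbps):
    # Ratio of downstream (server) bandwidth to upstream (uplink)
    # bandwidth; 1.0 or less means the design is nonblocking.
    return (server_ports * server_gbps) / (uplinks * uplink_gbps)

# 48 x 10G server links with 12 x 40G uplinks: nonblocking.
print(oversubscription(48, 10, 12, 40))   # 1.0
# The same server links with only 4 x 40G uplinks: 3:1 oversubscribed.
print(oversubscription(48, 10, 4, 40))    # 3.0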

Creating a design for tiered capabilities

Figure 4-33 Creating a design for tiered capabilities

Generally, an HPC cluster features many nodes with identical hardware configurations. Sometimes, though, you can help the customer optimize performance with the right economics by creating a plan with different types of nodes to meet the needs of an application with varying requirements.

For example, an EDA application often runs not only many smaller jobs but also a few larger ones that combine pieces of a simulation. If the customer has a solution for scheduling jobs to run on servers with the correct resources, you might design two types of servers, most with less memory and some with more memory. Remember that you can also mix and match compute options. Again, you can maximize the efficiency of the plan by creating different tiers of servers, some with more compute power and some with less, as required by the mix of workloads in the customer environment, as illustrated in Figure 4-33.

Chapter 4—Activity 1

Next, take time to review what you have learned by designing an HPE Apollo solution for the following customer scenario.

Your sales partner has discovered an opportunity with an automotive company. This company relies on its EDA application to design more efficient, safer, and more powerful cars. Designers are beginning to complain that their jobs are backing up, and wait times are interfering with their ability to meet deadlines. The company is looking for a server infrastructure upgrade to improve performance for the EDA application.

Current solution

The solution that you will replace consists of:
• 120 2P servers:
  – Four-core 2.5GHz Intel Xeon E5-2609 v2 processors
  – 16 GB RAM on each processor (32 GB total per server)
  – Four 1 Gbps ports (two per processor)
• One NAS (and a standby NAS) with 120 TB storage (including replicated data)

Workload

Assume that you have discussed the workload requirements with the customer and discovered:
• The customer uses an array of Synopsys EDA tools, including IC Compiler, Design Compiler, Saber, Proteus, and more (see Table 4-5).
• The company’s IC Compiler is set up for light threading with a maximum of four threads. It does not use the Design Compiler Ultra edition that permits multi-threading. Other tools also tend to be single-threaded.
• Most jobs are scheduled to run on individual nodes. Some IC Compiler jobs, however, use distributed computing.
• Most jobs use and produce quite small files, between 8 KB and 16 KB. The final simulations, though, produce very large files.

Table 4-5 Example Synopsys EDA applications

Application       Description
IC Compiler       Tool for designing physical chips
Design Compiler   Tool for enhancing and speeding up IC Compiler
Saber             Platform for simulating and validating systems
Proteus           Tool for analyzing proximity effects on full chips and building a correct model

Requirements

The customer wants to reduce the run time for jobs by 30%. IT administrators find it difficult to monitor their existing solution. However, after extensive analysis, they believe that both inadequate processing power and inadequate memory are delaying jobs.

The company also wants to increase shared storage to 200 TB (including replicated data) and provide 100 GB of local storage on each compute node for temporary files. The NAS is acting as a bottleneck. The customer wants to migrate to a distributed file system with a cluster of file servers to handle the requests. You need to propose hardware for this solution.

The solution can tolerate the loss of a node, whether because of a problem with the hardware or with a link that connects to the server. The solution should continue to operate as normal if up to one power supply fails.

Design questions

As you review these questions, record your answers. Refer to Table 4-6 in the “Supplemental content” section at the end of this chapter for an overview of the compute tray options; for details, refer to the compute tray’s QuickSpecs.

1. Does the customer’s application support GPU or coprocessor acceleration? Visit these links to search for the application:
• http://www.nvidia.com/object/gpu-applications.html
• https://software.intel.com/en-us/xeonphionlinecatalog
• https://www.khronos.org/opencl/resources

2. Which compute tray will you recommend for the solution and why?

3. How much memory will you recommend for each processor? How much in total for your selected compute tray?

4. Visit http://h22195.www2.hp.com/DDR4memoryconfig. Use the tool to plan how you will configure the DIMMs. If you are given multiple choices, choose a design that will help optimize performance. Record your configuration and explain the reasoning behind your choices. If you are planning to use HPE XL230a or XL250a trays, select DDR4 and then choose the Apollo system and your tray. The tool does not currently support the HPE XL220a compute trays. However, you can plan the memory for a Gen8 server with similar options to get an idea of a valid configuration. Select DDR3, HPE ProLiant DL servers, and HPE DL320 Gen8 v2. You can then plan the memory for one server. Remember that the XL220a compute tray contains two 1P servers, which must each have the same memory configuration. Therefore, double the design that you select when you record it.

5. You will need to test to come to a final decision about how many compute trays to propose. At this point, about how many compute trays and HPE Apollo 6000 chassis will you plan to propose?

6. You have several options for drives that meet the customer’s capacity requirements. What are some additional questions that you can ask the customer to help you make these choices?

7. For the shared storage solution, which server model will you choose? How many are required to meet the customer’s needs? Justify your choices.

8. Will you use InfiniBand, 10GbE, or 1GbE? How many ports will you plan for each server? Explain your reasoning. Also list further questions that you might ask the customer if you cannot make a choice.

You can check your answers by referring to Appendix B: Answers to Activities.

Planning for power at the rack level

Figure 4-34 Planning for power at the rack level

You now know how to plan the building blocks of your solution. But HPC clusters need to scale out; customers often require hundreds or even thousands of processors. As the first step in scaling out the solution, you can install several Apollo a6000 Chassis in a rack. The chassis are designed to transform the rack into one easily managed, efficient unit.

The chassis itself does not provide any power. Instead, an external power shelf powers two to four fully loaded chassis or up to six partially loaded chassis (see Figure 4-34). The shelf intelligently and dynamically allocates power to each chassis as required. By pooling power across multiple systems, the Apollo 6000 Systems save the customer power and cooling costs as well as valuable data center space.

Begin to plan the power by thinking of the rack as you design the unit. You will probably be able to support four to six chassis in the rack. Choose a number in that range for your initial plans. You can then scale that number up or down as you plan more precisely how many chassis a power shelf can support for your customer’s requirements. Depending on these requirements, you might need multiple power shelves in a rack.

First, fully scope out the components that you plan to install in each chassis, including all the compute, memory, storage, and fabric options covered in the previous sections. Then collect additional requirements with questions such as these:
• What input voltage does the customer’s site use?
• What level of power redundancy does the customer require? N+1 redundancy allows the shelf to fully power the solution even if one of the power supplies fails. N+N redundancy delivers even higher availability: a backup power supply for each power supply.
• Will you use single-phase power or three-phase power? Three-phase power delivers power as three alternating currents, each phase shifted by one-third of a cycle. This phase shifting ensures that the voltage is never zero, which allows power-hungry devices to draw power more efficiently. Keep these requirements in mind:
  – You must use either single-phase or three-phase power for all devices powered by the same shelf.
  – You must select PDUs that support the correct phase.
  – Three-phase power supports N+N redundancy but not N+1 redundancy.

When you have the answers to these questions, you are ready to plan the power solution using the HPE Power Advisor. You can download the advisor from http://www.hpe.com/info/poweradvisor/download. You can also use the online advisor, which is always up to date, at http://www.hpe.com/info/poweradvisor/online.
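Before turning to the HPE Power Advisor, you can sanity-check the shelf math yourself. This sketch uses the 2650W supply figure from this chapter (six of them give the 15.9kW nonredundant maximum); the per-chassis load is an assumed placeholder for the figure the Power Advisor would give you:

import math

def supplies_needed(total_load_w, psu_w=2650, redundancy="N+N"):
    n = math.ceil(total_load_w / psu_w)   # supplies needed to carry the load
    if redundancy == "N+1":
        return n + 1                      # one spare supply
    if redundancy == "N+N":
        return 2 * n                      # a full backup set
    return n

# Three chassis at an assumed 2400W each = 7200W of load.
print(supplies_needed(7200, redundancy="N+N"))   # 6 supplies: fills one shelf
print(supplies_needed(7200, redundancy="N+1"))   # 4 supplies

If the N+N answer exceeds the six supplies a shelf holds, you would reduce the number of chassis per shelf or plan an additional shelf.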

Guidelines for testing the solution

Figure 4-35 Guidelines for testing the solution

After you have made your initial choices for the solution, you are ready to test its performance. You might begin by benchmarking performance on compute trays in one chassis. Figure 4-35 lists examples of tools for measuring metrics that are important to HPC applications. However, keep in mind that specifications such as these only give you a starting point for verifying that the solution will meet the customer’s needs. Each HPC application is unique, and parallelized HPC applications rely on interoperations between multiple nodes. Only testing the customer’s application can give a true sense of how your solution will perform.

You should develop a proof of concept (POC) solution. Then work with the customer to select a few typical jobs, including ones that require fewer resources, ones that require an average amount of resources, and ones that require the most resources. Determine how many processors or processor cores the customer plans to run these jobs on. Then scope out a POC solution with the required number of processors—perhaps one chassis or one rack. Run the jobs and assess whether the run time is what the customer expects or whether you need to adjust the plan. If the latter, you can use server diagnostics such as those provided by HPE Insight Cluster Management Utility (CMU) to assess what is slowing down the job. (You will learn more about Insight CMU later in this ebook.) Is the CPU, the memory, the disk I/O, or the network acting as the bottleneck? When you know the answer, you can plan which resources you need to enhance.
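As a flavor of the kind of micro-benchmark you might run during a POC, the following sketch measures rough memory copy bandwidth with numpy, in the spirit of the STREAM benchmark. It is only a rough indicator; the dedicated tools in Figure 4-35 and, above all, the customer’s own jobs are what matter:

import time
import numpy as np

n = 50_000_000                    # two float64 arrays of ~400 MB each
a = np.zeros(n)
b = np.ones(n)

start = time.perf_counter()
a[:] = b                          # the STREAM-style "copy" kernel
elapsed = time.perf_counter() - start

moved_gb = 2 * a.nbytes / 1e9     # one read plus one write per element
print(f"Approximate copy bandwidth: {moved_gb / elapsed:.1f} GB/s")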

Chapter 4—Activity 2 You will now scope out your solution more fully while also planning power for it. You will use the HPE Online Power Advisor for this purpose. (Keep in mind that the tool could have changed since this ebook was published.)
1. What should you discuss with the customer before planning the power for the HPE Apollo 6000 solution?
2. Access the tool at http://www.hpe.com/info/poweradvisor/online.
3. You might need to activate Silverlight.
4. Agree to the License Agreement (see Figure 4-36).

Figure 4-36 HPE Online Power Advisor: License Agreement 5. Create a profile by filling out your name and email and selecting your country. Then click OK (see Figure 4-37).

Figure 4-37 HPE Online Power Advisor: Profile Information 6. The customer data center uses 220VAC for the input voltage, as shown in Figure 4-38.

Figure 4-38 HPE Online Power Advisor: Input voltage 7. In the navigation pane on the left, expand racks (see Figure 4-39).

Figure 4-39 HPE Online Power Advisor: Racks 8. Choose a 47U Intelligent rack and name it HPC rack, as shown in Figure 4-40. Then click OK.

Figure 4-40 HPE Online Power Advisor: Select the Rack Description 9. Expand Enclosures > HPE Apollo Enclosures and select HPE Apollo 6000, as you see in Figure 4-41.

Figure 4-41 HPE Online Power Advisor: Enclosures 10. Select the chassis that appears in your rack. Click the Config button at the top of the window (see Figure 4-42).

Figure 4-42 HPE Online Power Advisor: Config
11. Your plan probably calls for more than four enclosures. Begin by planning to support four enclosures on one power shelf.
12. Select Single for the Power Phase and choose the power redundancy based on the customer requirements.
13. Now configure the enclosures (see Figure 4-43). For the purposes of the activity, plan the same configuration for all enclosures and compute trays.

Figure 4-43 HPE Online Power Advisor: General Configuration Select All Enclosures same as 1 and click Config, as shown in Figure 4-44.

Figure 4-44 HPE Online Power Advisor: Enclosure Configuration 14. Choose the tray that you selected in the previous activity and click Add (see Figure 4-45).

Figure 4-45 HPE Online Power Advisor: Tray Configuration 15. Select enough trays to fill the enclosure. 16. Click Config, as shown in Figure 4-46.

Figure 4-46 HPE Online Power Advisor: Tray Configuration 17. Select All Trays same as 1, as shown in Figure 4-47.

Figure 4-47 HPE Online Power Advisor: Tray Configuration 18. In the real world, you might need to gather more information. For the sake of the activity, assume that you have discussed options with the customer and decided on:



• If you are proposing an XL220a, a four-core 3.5 GHz processor, which is the Intel Xeon E3-1241 v3

• If you are proposing an XL230a or XL250a, a 12-core 2.5 GHz processor, which is the Intel Xeon E5-2670 v3

19. Click Add. Select 1 for an XL230a or XL250a and 2 for an XL220a (see Figure 4-48).

Figure 4-48 HPE Online Power Advisor: Model Configuration 20. Choose the memory options based on the configuration you determined in the previous activity. 21. Click Add and select 1 or 2 (see Figure 4-49).

Figure 4-49 HPE Online Power Advisor: Model Configuration

HPE Apollo 2000 and 6000 management You will now learn about the onboard management options for HPE Apollo solutions.

Overview of management tools

Figure 4-50 Overview of management tools HPE Apollo solutions provide built-in tools to help you manage the solutions at the system (server and chassis) level. They also support tools that help you to manage at the rack level and the solution level (see Figure 4-50). This section covers the chassis-level tools. Chapter 9, “Monitoring and Managing HPE Solutions,” covers the rack- and solution-level tools, which also support other HPE ProLiant servers.

HPE Apollo management modules

Figure 4-51 HPE Apollo management modules The HPE Apollo a6000 Chassis provides a Management Module with an iLO port through which administrators reach iLO functions on the servers (shown in Figure 4-51). The HPE Apollo 6000 Management Module simply aggregates the iLO connections for ProLiant XL servers installed in the Apollo chassis. Administrators still contact and manage each server at its own iLO IP address, just as they would a traditional rack server. If the customer wants to control functions such as power for XL servers on a wider scale, you should propose the HPE Advanced Power Manager (APM), which is covered later in this ebook.

The HPE Apollo 2000 chassis supports an optional Rack Consolidation Management (RCM) module. Companies have the option of installing the RCM and using its iLO port to reach XL170r or XL190r iLO functions or of using a dedicated iLO port on each server node. The RCM also provides a port for connecting to APM.

Planning iLO connections

Figure 4-52 Planning iLO connections To allow customers to take advantage of the iLO Management Engine, you must establish the correct connections. (However, if the customer is planning to use HPE APM, APM will provide the iLO connections instead, as described later in this ebook.) As you design the connections for these iLO ports, keep in mind that you should typically isolate the management network from the network that the Apollo servers are using. Plan to add a 1GbE switch (such as an HPE Aruba 3800) for the iLO connections. Each Apollo 6000 management module or Apollo 2000 RCM has two iLO ports. The two ports are not bonded together, which means that if you connect them incorrectly, you could create a loop. Take care to follow the directions below carefully. You can choose to connect each Apollo chassis directly to the switch that you selected for the iLO connections. Connect only one port on each chassis to avoid loops. As an alternative design, you could connect several Apollo a6000 Chassis in a daisy chain and then connect the final chassis to the switch. You would then use both iLO ports on most of the chassis, as shown in Figure 4-52. This design does not introduce a loop, and it uses fewer ports on the network switch; a single switch could support many chassis in many racks. However, this design is less fault tolerant. If one connection fails, the customer can no longer reach the iLO management engines on any chassis below the failed connection. For some customers, the increased availability is well worth the limited expense of purchasing 1 Gbps switches with enough ports for all of the chassis. Other customers do not require high availability for the iLO functions and prefer to use fewer ports.

Chapter 4—Activity 3 You will now use the HPE Proposal Web to prepare a presentation of your solution benefits. As you do, keep in mind what you have learned in product discussions:
• The CEO is very concerned about the environment and making operations as green as possible.
• IT managers found it difficult to assess how resources in the existing solution were being utilized.
• Some decision makers are worried that they would not be able to get a powerful enough solution in just a few racks.
• Decision makers need to demonstrate the cost effectiveness of the solution that they choose.
Make sure to address these concerns, as well as to list other benefits. You will use the HPE Proposal Web to help demonstrate the solution’s value. Instructions for accessing and using HPE Proposal Web are provided below. You can also draw on your power plan. You require access to the HPE Partner Portal to access HPE Proposal Web. If you do not have such access, skip this activity.

HPE Proposal Web
1. Log into the HPE Partner Portal at https://partner.hpe.com.
2. Select My Workspace > Create Proposals.
3. Click Go (see Figure 4-53).

Figure 4-53 Proposal Web 4. Click Partner Login (see Figure 4-54) and enter your credentials again.

Figure 4-54 Proposal Web log on

5. Choose your language portal (see Figure 4-55).

Figure 4-55 Proposal Web: choosing a language 6. Click the Wizards tab. 7. Choose Enterprise Server, Storage, Networking, and Solutions Wizard (see Figure 4-56).

Figure 4-56 Proposal Web: Enterprise Group Wizards 8. Select the components for your solution (the Apollo solutions are included with ProLiant servers). Then click Next. See Figure 4-57.

Figure 4-57 Proposal Web: Enterprise Group Wizards 9. Choose your precise models (see Figure 4-58). Then click Next.

Figure 4-58 Proposal Web

10. Continue clicking through the wizard, customizing the elements as you choose.
11. Download your proposal.
12. Customize the proposal to remove, for example, mention of compute trays that you are not proposing.

Summary In this chapter, you have learned why so many companies are turning to HPC solutions to obtain a competitive edge. You examined HPC use cases and saw how different applications have different needs. You then learned to address these needs with density-optimized HPE Apollo solutions, tailoring compute, memory, storage, and fabric to the workload. Finally, you explored ways to simplify managing, monitoring, and provisioning the solutions.

Learning check Review what you have learned by answering these questions. Then check your answers in Appendix A: Answers to Learning Checks.
1. You are planning to propose an HPE Apollo 6000 System, and you have determined that a customer’s HPC application will benefit from GPU acceleration. Which compute tray should you propose?
a. HPE ProLiant XL190r
b. HPE ProLiant XL220a
c. HPE ProLiant XL230a
d. HPE ProLiant XL250a

2. A customer tells you that its HPC application uses a SAN shared disk solution. What should you make sure to include in your proposal?
a. HPE FlexFabric adapters that support FCoE (or FC HBAs) for the compute tray
b. HPE ProLiant SL4540 server to act as a NAS
c. HPE Apollo 2000 System to connect to the SAN
d. PCIe riser and HPE Smart Array Controller P430 or P440 for the compute tray

For answers, see Chapter 4 in Appendix A.

Supplemental content Table 4-6 is provided for use during the Activities. You should check the latest QuickSpecs for the most up-to-date information. Table 4-6 HPE Apollo 6000 compute tray options

Chapter 5 HPE Apollo 4000 for Data-Driven Organizations
EXAM OBJECTIVES
• Briefly describe the HPE Apollo 4000 portfolio
• Position HPE Apollo 4000 solutions for the right use cases
• Create an implementation plan for an HPE Apollo 4000 solution, including plans for the proper performance, scalability, and high availability

Assumed knowledge Before reading this chapter, you should have a basic understanding of the following:
• Processors, including DDR3 and DDR4 memory, hard disk drives (HDDs), solid-state drives (SSDs), and RAID levels for storage volumes
• HPE ProLiant rack and blade servers and options for them such as HPE Smart Array Controllers
• HPE BladeSystems, including interconnect modules and Virtual Connect (VC) modules
• Server management and maintenance, including experience with iLO, Intelligent Provisioning, UEFI, HPE Insight Remote Support, HPE Insight Online, HPE Smart Update Manager (SUM), and HPE Insight Control server provisioning (ICsp)
• HPE OneView capabilities

Chapter topics This chapter begins with an overview of the HPE Apollo 4000 family. You then review the scenarios that call for HPE Apollo 4000 solutions. Finally, you learn about decision points for architecting the solution.

HPE Apollo 4000 overview Begin with the overview of the HPE Apollo 4000 family.

HPE Apollo 4000 Family—Purpose-built big data servers

Figure 5-1 HPE Apollo 4000 Family—Purpose-built big data servers Starting from the Apollo 4200 Gen9 server and moving up to the Apollo 4510 Gen9 server, this server family handles data-intensive workloads that range from Hadoop analytics to object storage, as you see in Figure 5-1. The Apollo 4200 Gen9 is a dense storage server that comes in the familiar 2U form factor. The large form factor (LFF) version is perfect for object storage deployments, reaching over 4.4 PB per rack (20 servers per 42U rack). The small form factor (SFF) model is ideal for Hadoop analytics, content delivery, and other applications requiring fast spindles. The Apollo 4500 Gen9 Servers come in 1- or 3-node configurations. The Apollo 4510 1-node configuration is purpose-built for object storage. It delivers a total of 68 drives per system, up to 544 TB per system, and over 5 PB per rack when it uses 8 TB drives. The Apollo 4530 3-node configuration is purpose-built for Hadoop analytics. The Apollo 4530 supports Intel Xeon E5-2600 v3 processors, as well as up to 16 dual in-line memory modules (DIMMs), for up to 1024 GB memory per system. It has 15 drives per node (45 total drives per system).

Apollo 4200

Figure 5-2 Apollo 4200 The HPE Apollo 4200 Gen9 Server was designed as a versatile, entry-level, density-optimized big data server that integrates seamlessly into traditional enterprise data centers with the same rack dimensions, cabling, and serviceability and uses the same administration procedures and tools (see Figure 5-2). This makes it the ideal bridge system for enterprises wanting to start implementing purpose-built big data server infrastructure today and scale in affordable increments. This 2U rack server has industry-leading storage capacity of up to 224 TB with up to 28 hot-plug LFF hard disk drives/solid-state drives (HDDs/SSDs) per server. It can also be configured for performance and throughput to cover the range of big data solution technologies from object storage to data analytics and HPC data-intensive applications. For high-performance computing, there are options for top-bin CPUs; integrated accelerators (GPUs and coprocessors); and high-performance, low-latency cluster and networking input/output (I/O).

HPE Apollo 4200 Gen9 Server model options

Figure 5-3 HPE Apollo 4200 Gen9 Server model options The HPE Apollo 4200 Gen9 Server, shown in Figure 5-3, offers market-leading storage capacity among 2U storage servers, supporting up to 28 LFF hot-plug drives or 54 SFF hot-plug drives. For object storage customers looking for the lowest cost-per-gigabyte economics, the LFF model fits up to 224 TB of storage using 8 TB LFF HDDs. A 42U rack of Apollo 4200s can fit over 4.4 PB. For high-performance Hadoop-based workloads, the HPE Apollo 4200 can support up to 54 serial-attached SCSI (SAS) drives with 12 Gb/s throughput and speeds of 15 K revolutions per minute. The Apollo 4200 Gen9 Server is a perfect match for parallel processing applications like Hadoop, with up to two Intel Xeon E5-2600 v3 processors per server that can reach up to 18 cores, as well as Xeon E5-2600 v4 processors. For object stores needing fast performance with small objects, or in-memory data processing for analytics software, the Apollo 4200 offers up to 1024 GB of double data rate fourth generation (DDR4) memory (16 DIMM slots with DIMMs up to 64 GB). Other features include
• Embedded 2x1 Gb Ethernet, FlexibleLOM expansion slot
• Up to five peripheral component interconnect express (PCIe) slots to give flexibility for future I/O upgrades
• Embedded HPE Smart Array P840ar Controller
– Supports up to 16 drives with two ports
– Provides point-to-point connectivity to SSDs to reduce latency
– Includes Smart Storage Battery
• Standard rack depth (31.5 inches)
• HPE Integrated Lights-Out 4 (iLO4) to significantly simplify server monitoring, manage fault tolerance, and provide prefailure warnings for drives

• Single hot-plug drives for easy swapping, addition, and removal
• Option to equip up to 10 fans for redundancy
• Familiar HPE hard drive, networking, memory, and controller options for a seamless transition to a density-optimized infrastructure

Familiar 2U form factor designed for dense storage

Figure 5-4 Familiar 2U form factor designed for dense storage Figure 5-4 shows the architecture of the 2U form factor in the Apollo 4200 Gen9 servers. This example is an LFF model.

Apollo 4510—Purpose-built HyperScale object storage server

Figure 5-5 Apollo 4510—Purpose-built HyperScale object storage server

The HPE Apollo 4510 System is purpose-built for object storage solutions, as shown in Figure 5-5. Customers can deploy cost-effective HPE Apollo 4510 Systems optimized to meet the needs of their object storage solution requirements at any scale. HPE Apollo 4510 Systems can be configured to form the foundation platform for the whole variety of big data object storage solutions—from cost-effective, high-capacity content repositories that address petabyte-scale data volumes to the tuned responsiveness required for content distribution systems. The space-saving storage capacity—of up to 544 TB per system and 5.44 PB per 42U rack—can grow to meet object storage solution needs at any scale, up to hundreds of petabytes and more. The HPE Apollo 4510 System is ideal for a wide variety of object storage solutions, ranging from collaboration and content distribution, to content repositories and active archives, to backup repositories and cold storage, and everything in between. The HPE Apollo 4510 System is an ideal platform for the variety of object storage solutions supported by the HPE HyperScale Data Eco-System partners, including Cleversafe, Scality, Ceph, and OpenStack/Swift; it also forms the building blocks for HPE’s own Helion Content Depot. The HPE Apollo 4510 brings HPE ProLiant Gen9 server technology into its 4U, one-server, density-optimized chassis. It includes
• New levels of rack-scale storage server density
– Up to 68 hot-plug HDDs or SSDs per server—544 TB capacity
– Up to 8 TB SAS/SATA drives
– Up to 5.44 PB in a 42U rack (based on 10 Apollo 4510 Systems with 68 8 TB LFF SAS HDDs per system)
• A shorter chassis form factor that allows 1 additional chassis in a 42U rack
– Up to 10 Apollo 4510 Systems in 42U
• Up to four PCIe slots with flexible performance and I/O options to match the variety of object storage workload response and throughput criteria
• Up to 16 DIMMs per node with DIMMs up to 64 GB, for 1024 GB total memory

Apollo 4530—Massive Density for Hadoop and Big Data Analytics

Figure 5-6 Apollo 4530—Massive Density for Hadoop and Big Data Analytics

The HPE Apollo 4530 System, shown in Figure 5-6, is purpose-built for big data analytics. It can be configured to optimally match technology requirements for economical large-scale Hadoop-based data analytics, or it can be configured for more complex compute-intensive analytics with high-performance processors, up to 1024 GB DDR4 memory per server, SSDs, high-performance disk controllers, and fast, high-capacity I/O options. The HPE Apollo 4530 System is ideal for the wide variety of big data analytics solutions. This includes parallel Hadoop-based data mining to develop a 360-degree view of customers to improve the cost-effectiveness of advertising and promotion, increase web commerce sales with “next-product-buy recommendations,” and even provide “crowd-sourced quality control” by matching product return data with social media sentiment information. The Apollo 4530 System brings HPE ProLiant Gen9 server technology into this 4U, three-server, density-optimized, shared-infrastructure chassis. It also provides:
• More storage capacity per server and per rack with 8 TB LFF HDDs:
– Up to 45 LFF top-loading hot-plug HDDs or SSDs
– Up to 8 TB SAS/SATA drives
– Up to 120 TB per server
– Up to 10 chassis per 42U rack with 30 servers
– Up to 3.6 PB capacity
• CPU choices to optimize for performance or economy
– E5-2600 v3 or v4 series
– 4–20 cores (1.6 GHz–2.6 GHz CPU speed)
– Power ratings between 55 and 135 Watts
– Up to 1024 GB DDR4 memory at up to 2133 MHz
• Up to 5 PCIe slots with flexible performance and I/O options
• Up to 16 DIMMs per node with DIMMs up to 64 GB for 1024 GB total

HPE Apollo 4500 Gen9 (2S/4U)

Figure 5-7 HPE Apollo 4500 Gen9 (2S/4U)

Table 5-1 lists the features found in the Apollo 4510 and 4530 Gen9 servers (which are shown in Figure 5-7).

Table 5-1 HPE Apollo 4500 Gen9 server features
• Processors: Up to 2 Intel® Haswell EP E5-2600 v3 or v4, up to 135 W, C610 Series Chipset
• Memory: 16 DIMMs (eight per processor), registered, DDR4 (1866/2133) with ECC
• Drive support:
– 60 LFF SAS (12 Gb)/SATA (6 Gb) drives in the 1-node configuration (1x60), with an 8 LFF option in back
– 15 LFF SAS (12 Gb)/SATA (6 Gb) drives per node on the 3-node backplane (3x15)
– Support for SFF drives in a converter
– PMC Belmont SAS expander
• Network: Dual-port 1 GbE with FlexibleLOM support
• Expansion: Up to four low-profile PCIe Gen3 slots
– CPU0: One ALOM x8 LP @x16 slot (Gen3, 25 W); one Smart Array with HBA x8 LP @x16 slot (Gen3, 25 W); one Smart Array with HBA x8 LP @x16 slot (Gen3, 75 W)
– CPU1: Two x8 LP @x16 slots (Gen3, 75 W)
• I/O: Front: 2 external USB ports per node, video, and power/health/UID buttons and LEDs
• Management: iLO4 plus one optional dedicated iLO NIC port
• Other features: 4U chassis height, hot-plug redundant fans, HPE Gen9 Flex Slot power supplies (AC and DC versions)
• Compute: HPE ProLiant XL450 Gen9

HPE Apollo 4500 Gen9 chassis—Top view

Figure 5-8 HPE Apollo 4500 Gen9 chassis—Top view Figure 5-8 shows the architecture of the 4500 Gen9 chassis from the top view.

HPE Apollo 4500 Gen9 chassis—Rear view

Figure 5-9 HPE Apollo 4500 Gen9 chassis—Rear view Figure 5-9 shows the 4500 chassis from the rear. Notice the placement of the management module, power supplies, and slots for PCI Express Gen3. The rear also includes the FlexibleLOM slot.

HPE Apollo use cases You will now examine the use cases for HPE Apollo 4000 servers, as well as review the workloads for which they are designed.

Object storage use case

Figure 5-10 Object storage use case Companies that are dealing with exploding amounts of unstructured data cannot take traditional approaches to storing that data. Block storage is simply too expensive for the petabyte scale that

customers require to store their billions of objects. Typically, customers are archiving the data for infrequent access, so they do not require the performance of block storage, which is optimized for heavy read-writes and speedy I/O. On the other hand, tape (traditionally used for data archival) is too slow, failing to provide timely access to the data when required. Object storage provides the right balance for customers with petabytes of unstructured data, lowering the total cost of ownership (TCO) per gigabyte. However, to deliver the balance of right performance and right economics that the customer is looking for, the object storage solution must be built on the right infrastructure (see Figure 5-10). Customers who attempt to build petabyte-scale object storage solutions on “white box” hardware often find that the solution fails to deliver the required performance and reliability. IT staff must manage a complex set of components that might not work well together, and the company loses in higher operating expenditures the savings that it gained in lower capital expenditures. HPE Apollo 4000 solutions solve these woes with tested hardware that provides the right performance and management simplicity. As you learned earlier, customers have a choice of flexible hardware. These solutions also provide simple HPE Secure Encryption, a key requirement for many enterprises.

Big data analytics and NoSQL use case

Figure 5-11 Big data analytics and NoSQL use case A large majority of customers, 75%, agree that insights from big data reduce costs and increase revenue (see Gartner 2013 CEO Study). Therefore, customers are highly motivated to find big data solutions that help them to harness their data for day-to-day decision-making processes and for competitive value (see Figure 5-11). For example, companies often have a great deal of data from business transactions with their clients. They can mine this data for patterns that could tell them the best times of day to contact clients, the most effective marketing campaigns, and so on. Customers recognize and are eager to obtain these potential benefits. Analytics and business intelligence are the top technology priorities for small and medium-sized business chief information officers (SMB CIOs). (See Annual CIO Study, Gartner 2014.) But companies are finding it increasingly difficult to extract value from data. According to Forrester, 66% of customers find doing so very or extremely challenging (see Forrester Study 2015). As much

as customers need the right big data analytics software, they need the right infrastructure bolstering that software. Many companies’ issues with big data stem from a lack of servers that are optimized for the correct workloads. HPE Apollo 4000 solutions are purpose-built to support big data analytics, giving customers faster results and the real insights that they need to make day-to-day decisions. In this way, they help customers achieve a real return on investment (ROI) on their data.

HPE Apollo 4000 architecture This section teaches you how to design HPE Apollo 4000 solutions for object storage and big data.

Traditional big data architecture

Figure 5-12 Traditional big data architecture Traditionally, Hadoop has operated under the principle of bringing compute to storage for data processing, as shown in Figure 5-12. That is, compute nodes are colocated on the data nodes in the form of servers with direct attach storage (DAS). A YARN application can then assign a piece of a job to a node that stores the data for that job locally. HPE has designed a new, more flexible architecture that divides compute from storage. This architecture is discussed in a later chapter. This chapter focuses on customers who want to take the traditional approach. The next sections guide you through the decision points for designing a big data analytics solution.

Selecting the HPE Apollo 4000 model for big data analytics

Figure 5-13 Selecting the HPE Apollo 4000 model for big data analytics You learned about the HPE Apollo 4000 models earlier in the chapter. In Figure 5-13, you see an at-a-glance view of the two models that are recommended for big data analytics: the HPE Apollo 4200 SFF and the more powerful HPE Apollo 4530 System with up to three ProLiant XL450 servers. The Apollo 4200 is intended as a bridge to big data for customers who want a traditional 2U server and a smaller-scale big data analytics solution. As shown in Figure 5-13, the 4530 System provides a higher density for HDD storage capacity (HDDs being the typical choice for this solution), as well as more processing power and memory, enabling it to meet the demands of more complex analytics applications. (Note that capacity values are accurate as of the publication of this ebook and provided for your convenience. However, HPE might add new HDDs and SSDs; you should check QuickSpecs for the latest values.)

Scoping big data storage needs

Figure 5-14 Scoping big data storage needs You and the customer should discuss the storage capacity that their big data solution requires. These

discussions will cover the current amount of data that the customer has, as well as the ingest rate—the rate at which the data is expanding (see Figure 5-14). For example, perhaps the customer currently has 300 TB of data and is ingesting data at the rate of 1.4 TB per day. If the solution should continue to meet the needs for two years, then about 1.33 PB of storage capacity is required. From the desired capacity, you can calculate the capacity that you must deliver with the HPE Apollo 4000 solution. First, the storage capacity that is usable by the Hadoop distributed file system (HDFS) is generally about 90% of the raw storage capacity. Next, take into consideration that HDFS will replicate the data, typically three times. You must also take into account that applications will need to store files temporarily as part of the analysis job. For example, as compute nodes complete map tasks, they store the result files to the file system so that they will be available for shuffling and analysis during the reduce phase. You should leave about 25% of the space free for these files. To calculate the required storage capacity, multiply the desired capacity by four and then divide by 0.9. (The factor of four covers the three-times replication plus the 25% free space, because 3 ÷ 0.75 = 4.) For example, if the customer needs 1.33 PB capacity, the solution should provide at least 5.91 PB. However, HPE recommends that customers decrease the capacity requirements by using compression. Discuss with the customer what forms of compression they intend to use. Often customers use Google Snappy (an open source compression library) for data that is frequently accessed because, although Snappy does not compress data to the smallest file possible, it optimizes compression and decompression times. Gzip (another compression library) can compress data to a greater degree, but takes longer to do so. The amount of space saved through compression depends on the type of files that are being compressed. For example, Google cites Snappy as having compression ratios of about 1.5–1.7 for plain text and 2–4 for HTML. Files that are already compressed (such as JPEGs) cannot be further compressed. You will need to agree with the customer on the amount of space that will be made available by compression, taking care to estimate for a worst-case scenario. For example, suppose that the customer is using Snappy and has a mix of file types; most of the data is HTML, but some data is compressed images. You agree on planning for a 1.5 compression factor. Instead of providing 5.91 PB, you will provide 3.94 PB. You might also need to add a bit of space for supporting files. (The servers support two micro SSDs or M.2 2280 SSDs, which you could use for the image.) Note that you do not need to plan for RAID; instead, the disk drives act as Just a Bunch of Disks (JBOD). HDFS handles replication and distribution of data. (On the other hand, you might want to put some other data, such as the OS or HDFS metadata, on disks set up for RAID 10 to provide redundancy for that data.)
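This arithmetic is easy to script so that you can rerun it as the customer’s numbers change. The following minimal Python sketch simply encodes the rules of thumb above (three-way replication, 25% scratch space, 90% HDFS-usable capacity, optional compression); the input figures are the worked example’s, not QuickSpecs data.

```python
def hdfs_raw_capacity_pb(desired_pb, replication=3, free_space=0.25,
                         hdfs_usable=0.90, compression=1.0):
    """Raw capacity to provision for a target amount of stored data."""
    effective = desired_pb / compression       # data volume after compression
    factor = replication / (1 - free_space)    # 3x replicas / 75% = the "x4"
    return effective * factor / hdfs_usable    # only ~90% of raw is HDFS-usable

desired = 0.300 + 1.4 / 1000 * 730             # 300 TB + ~1.4 TB/day for 2 years
print(round(desired, 2))                                         # ~1.32 PB
print(round(hdfs_raw_capacity_pb(desired), 2))                   # ~5.88 PB raw
print(round(hdfs_raw_capacity_pb(desired, compression=1.5), 2))  # ~3.92 PB raw
```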

Choosing the drive type As you have learned, SSDs provide faster random I/O and sequential I/O than HDDs. However, the performance differences are greatest in random I/O; HDDs can also provide good sequential I/O. MapReduce applications and many analytic applications that operate on HDFS read files as a whole, making sequential I/O most important. SSDs might provide somewhat better performance than HDDs,

but for many big data analytics purposes, the difference in performance is not worth the more significant difference in cost per byte. Thus, HDDs often provide the best choice for meeting the customer’s capacity requirements at the right TCO. The HPE Apollo 4000 family does support SSDs, which you should choose for specialized requirements or a subset of the data. For example, the customer might want to store the most frequently accessed data on SSDs. For some shuffle-heavy applications, you can improve performance by placing the intermediate result files for MapReduce jobs, which need to be written and shuffled during the course of the analytic process, on SSDs. In addition, some big data applications such as NoSQL databases or Interactive Hive require faster, random access to data, and the more significant increases in performance might make the SSDs worthwhile to your customer.

Scoping compute and memory requirements for big data analytics The HPE Apollo 4200 and ProLiant XL450 servers support flexible compute and memory options so that you can match them to the workload. As a base processor that works for most environments, you might choose the Intel Xeon E5-2650 v3, which provides 10 cores that operate at 2.3 GHz. With two such processors in a server with 15 disks, this choice provides over a 1:1 core-to-spindle ratio (the typical minimum requirement recommended by Hadoop). If the customer has CPU-bound applications such as Impala, Spark, and Solr Search, you can choose a processor with more cores or with more cores and a higher clock speed. (Table 5-2 shows examples of CPU-bound tasks.) As a general starting guideline, big data analytic workloads require at least 4 GB memory per core. This guideline means that the 10-core processor requires at least 40 GB memory, and a 2P Apollo 4200 or XL450 server requires at least 80 GB. To maximize memory performance, you also need to follow the recommendation of balancing DIMMs across all memory channels (four per processor). Therefore, you should generally round up to using at least eight 16 GB DIMMs, for 128 GB total; the sketch after Table 5-2 illustrates the arithmetic. Again, you have the flexibility to provision more memory. For example, you could use 32 GB DIMMs instead, increasing the capacity to 256 GB. You can scale as high as 1024 GB per server by using 64 GB DIMMs in both slots of every memory channel. Scale the memory up when the server must support memory-bound applications such as Interactive Hive, Impala, Spark, or HBase or other NoSQL databases. As always, you should refer to HPE reference architectures for the customer application when possible.

Table 5-2 Examples of CPU-bound tasks
• Classification
• Clustering
• Complex data mining
• Feature extraction
• Natural language processing
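The memory rounding described above can be captured in a few lines. This minimal Python sketch encodes only the guidelines from the text (4 GB per core, four channels per processor, channel-balanced DIMM counts); substitute your own processor and DIMM choices.

```python
import math

def min_memory(cores_per_cpu, cpus=2, gb_per_core=4,
               channels_per_cpu=4, dimm_gb=16):
    """Round the 4 GB/core guideline up to a channel-balanced DIMM count."""
    need_gb = cores_per_cpu * cpus * gb_per_core       # e.g., 10 * 2 * 4 = 80 GB
    channels = channels_per_cpu * cpus                 # 8 channels on a 2P server
    dimms = math.ceil(need_gb / dimm_gb / channels) * channels
    return dimms, dimms * dimm_gb

print(min_memory(10))   # (8, 128): eight 16 GB DIMMs, one per channel, 128 GB
```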

Other solution components

Figure 5-15 Other solution components Remember that a Hadoop solution requires a server for the management node, through which users submit queries, and two head nodes, which provide active/standby Resource Managers and other services. HPE recommends that you use three rack servers, such as HPE ProLiant DL360 servers, for these roles (see Figure 5-15). Typically, a cluster uses an isolated private network for communications between all worker nodes, the management node, and head nodes. A connection to an extract, transform, load (ETL) network is required for ingesting the data. Discuss how the customer plans to have the cluster ingest data. Some server administrators dual-home all data nodes on the cluster network and the ETL to distribute the ingesting work. In this case, you would need to ensure that you work with the network architect to plan the HPE Apollo 4200 or XL450 server FlexibleLOM or I/O adapters to support these requirements. Other administrators use an edge node or two redundant edge nodes to handle ingesting and staging all the data for the cluster. This design protects other worker nodes from the external network. If your customer wants to use an edge node or nodes, remember to add an HPE DL360 server for each edge node. This server must have disks with enough capacity to handle the ingest rate.

Guidelines for testing big data analytics As always, you should develop a Proof of Concept (POC) that demonstrates to the customer the performance and the efficiency of the HPE solutions and your design. The POC should match your

design as closely as possible. Before you run the test, it is also important that you tune the nodes to better support the application. Table 5-3 lists HPE reference architecture documents that explain the tuning guidelines. This tuning will ensure the best results from the test. You should also recommend that the system integrator complete the same steps for the final solution so that it operates most efficiently.

Table 5-3 HPE traditional big data reference architectures
• Cloudera: HPE Verified Reference Architecture for Cloudera Enterprise 5 on HPE Apollo 4530 with RHEL
• Hortonworks Data Platform: HPE Verified Reference Architecture for Hortonworks HDP 2.2 on HPE Apollo 4530 with RHEL

You are then ready to test. Benchmarking tools provide generic metrics—for example, the throughput for reads and writes to the HDFS cluster. Table 5-4 lists some benchmarking tools for big data and analytics.

Table 5-4 Example benchmarking tools
• NoSQL databases: Yahoo! Cloud Serving Benchmark (YCSB) tests throughput for read/write queries to the database
• HDFS: TestDFSIO tests throughput and average I/O rate for reads and writes to HDFS
• HDFS and MapReduce: TeraSort tests the time for sorting data (one large job); MRBench tests the average time for completing many small jobs
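If you script the benchmark runs, you can repeat them identically after each tuning change. The sketch below drives typical upstream Hadoop invocations of TestDFSIO and TeraSort from Python; the jar paths are assumptions and option names vary by Hadoop version and distribution, so check your distribution’s documentation before using it.

```python
# Illustrative benchmark driver; the jar paths are assumed and option
# names can differ by Hadoop version/distribution -- treat as a template.
import subprocess

TESTS_JAR = "/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-tests.jar"
EXAMPLES_JAR = "/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples.jar"

def run(*args):
    print(">>>", " ".join(args))
    subprocess.run(args, check=True)

# TestDFSIO: write, then read back, 10 files of 1 GB each
run("hadoop", "jar", TESTS_JAR, "TestDFSIO",
    "-write", "-nrFiles", "10", "-size", "1GB")
run("hadoop", "jar", TESTS_JAR, "TestDFSIO",
    "-read", "-nrFiles", "10", "-size", "1GB")

# TeraSort: generate 100 million 100-byte rows (~10 GB), then sort them
run("hadoop", "jar", EXAMPLES_JAR, "teragen", "100000000", "/bench/tera-in")
run("hadoop", "jar", EXAMPLES_JAR, "terasort", "/bench/tera-in", "/bench/tera-out")
```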

Benchmarks might have a role to play in your testing, but you are more precisely attempting to determine how well your customer’s application runs. Plan several tests using the customer applications with datasets of various sizes, including one that meets or exceeds the customer’s maximum needs. You should also choose tests that place various demands on the solution, including worst-case scenario demands. For example, for an HBase test, you might run read-heavy tests and write-heavy tests, as well as balanced read-write tests. You should also test how the solution handles a high degree of random I/O requests. After you run the test, determine whether the execution time and other metrics are acceptable or whether you need to adjust the solution. The application that you are testing might provide you with valuable metrics for this purpose. For example, Hortonworks Data Platform (HDP) uses Ambari to collect and expose metrics; the Cloudera Manager also tracks metrics. Table 5-5 gives examples of some metrics that you might examine as you test. You can find a complete list of Hadoop metrics at https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/Metrics.html.

Table 5-5 Example metrics

HBase metrics:
• regionserver.Server.blockCacheEvictedCount: Number of blocks that had to be evicted from the block cache due to heap size. If this stays at 0, all of your data fits completely into the HBase blockcache (stored in region server memory), which is the most desirable case.
• regionserver.Server.blockCacheExpressHitPercent: The percentage of time that requests with the cache turned on hit the cache. Values under 100 mean that hot data being processed cannot entirely fit into the blockcache. If the number is too far below 100, scale up the number of compute nodes.
• regionserver.Server.storeFileSize: Aggregate size of the store files on disk. Make sure this value is similar on all region servers in order to properly balance the HBase load.
• regionserver.Server.blockCacheFreeSize: Number of bytes that are free in the blockcache. This value indicates how much of the cache is used. It is a good indicator of whether your data is “warmed” by moving it into cache, so a low value is good.
• regionserver.Server.readRequestCount: The number of read requests received. You can use this metric to see how many requests the solution is handling.
• regionserver.Server.flushQueueLength: Current depth of the memstore flush queue. This metric should stay about the same over time. If it increases, the node is falling behind with clearing memstores out to HDFS.

Metrics for any YARN application:
• QueueMetrics PendingMB and QueueMetrics PendingvCores: The current memory or CPU resource requests that are not yet scheduled. A high number might indicate that you need to scale out the number of nodes so that they provide more memory or cores.
• QueueMetrics running_0, running_60, running_300, and running_1440: The current number of applications whose elapsed time is less than 60 minutes, between 60 and 300 minutes, between 300 and 1440 minutes, and more than 1440 minutes. You can use these metrics to determine whether jobs are completing in the customer’s desired execution time.
• AppsSubmitted, AppsRunning, AppsPending, and AppsCompleted: Number of applications that have been submitted to the resource manager for scheduling, that are running, that are waiting to be scheduled, and that are completed. You can use these metrics to determine whether the solution can handle the required number of jobs. For example, you can see how many applications are running when the number of applications pending begins to reach an unacceptable level.
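Ambari and Cloudera Manager present these metrics in their consoles, but Hadoop daemons also expose them directly through a JSON JMX servlet, which is convenient for quick POC checks. A minimal sketch follows; the ResourceManager host name is a placeholder, 8088 is the default ResourceManager web port, and exact attribute names (for example, PendingVCores) can vary across Hadoop versions.

```python
# Read YARN QueueMetrics from the ResourceManager's /jmx servlet.
# The host is a placeholder; 8088 is the default RM web port, and
# attribute names may differ slightly between Hadoop versions.
import json
import urllib.request

RM = "http://resourcemanager.example.com:8088"

with urllib.request.urlopen(RM + "/jmx") as resp:
    beans = json.load(resp)["beans"]

for bean in beans:
    if "QueueMetrics" in bean.get("name", ""):
        print(bean["name"])
        for key in ("PendingMB", "PendingVCores", "AppsPending",
                    "AppsRunning", "AppsSubmitted", "AppsCompleted"):
            if key in bean:
                print(f"  {key}: {bean[key]}")
```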

Object storage architecture

Figure 5-16 Object storage architecture You learned about object storage in a previous chapter. Figure 5-16 reviews the basic design for an object storage solution. A cluster of object storage servers stores objects, which the servers distribute and replicate across their disks based on rules dictated by the application. When clients need to read or write to an object, they send a request to a front-end server, which might be called a proxy server, connector, gateway server, or something else depending on the application. The server tells the client where the client can access an object by using a map that is often called a ring. Often the architecture calls for two such servers for redundancy. After a client knows the location of an object, it obtains the object directly from the object storage server. You can generally use traditional rack servers such as HPE ProLiant DL360p servers for the proxy server role. This chapter focuses on the design for the HPE Apollo 4200 or 4510 Systems, which play the role of object storage servers. Be aware that the architecture shown in Figure 5-16 is highly simplified. Each object storage application, such as OpenStack Swift for cloud deployments, Ceph, Scality RING, and Cleversafe, has its own architecture and terminology. For example, OpenStack Swift also defines account and container servers, which help to ensure that data for different tenants is isolated and that data is stored according to customizable policies. Nonetheless, the applications generally follow a model such as the one shown here. You can refer to specific reference architectures for more precise information about each solution’s architecture, as shown in Table 5-6.

Table 5-6 HPE object storage reference architectures
• Ceph: Ceph on HPE Apollo 4200/4500 System Servers
• Scality: Scality RING on HPE Apollo 4200
• Cleversafe: Cleversafe on HPE Apollo 4500

Selecting the HPE Apollo model for object storage

Figure 5-17 Selecting the HPE Apollo model for object storage The HPE Apollo 4510, which supports one ProLiant XL450 server, is purpose-built for object storage. As you see in Figure 5-17, this system is optimized for storage density as opposed to processor or memory density. (Note that values are accurate as of the publication of this ebook. However, HPE might add new HDDs and SSDs; you should check QuickSpecs for the latest values.) The Apollo 4200, on the other hand, is designed for more general-purpose storage solutions, so it balances processors, memory, and storage. The HPE Apollo 4200 System often makes a good choice for customers who need entry-level object storage solutions at around the 1 PB scale. Even for larger solutions, some customers might prefer the Apollo 4200, which provides less storage per node; customers might not want the solution to have to recover up to 544 TB of data if a server goes down. (The solution recovers data by copying it from other replicated copies.) On the other hand, the Apollo 4510 provides the best density and optimization for object storage.

Scoping object storage capacity

Figure 5-18 Scoping object storage capacity

To scope the capacity that an object storage solution requires, you should follow a process similar to the one for scoping a big data storage solution, as shown in Figure 5-18. First, discuss the needs with the customer, making sure that the customer informs you about how quickly data is accumulating and how long the customer expects this solution to accommodate the new data without additional scaling. Then multiply this data by three to account for the standard replication factor. Make sure, though, to discuss the replication factor with the customer because some customers might choose to use a different factor. Also take into account that your systems will need some disk space for other purposes, such as storing an image and various supporting files. For example, OpenStack Swift requires all object storage servers to store the ring that maps data to its location. Other object storage solutions have similar requirements. You should plan to reserve 6%–10% of the disk space for this purpose. (The servers provide two M.2 2280 SSDs, which you could use for the image.) When you have taken all of these factors into account, the ratio of required capacity to stored data will be just over 3:1. You can look to HPE reference architectures for requirements tailored to the application. For example, Ceph recommends a ratio of about 3.2:1. Finally, discuss with customers whether they want you to take compression into account when you propose a solution. As you learned earlier, compression might reduce the size of stored data to a lesser or a greater degree, depending on the type of compression and the type of files. In many cases, you will not need to plan for a redundant array of independent disks (RAID) because the object storage application handles distributing and replicating data. However, you should look at the guidelines for the customer’s particular application. In some cases, RAID 0 might be recommended.
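As with the big data sizing, you can capture these rules in a few lines. The sketch below is illustrative only: the 8% reserve is one point inside the 6%–10% range discussed above, the 1.8 PB input is an assumed figure, and the replication factor must be confirmed with the customer.

```python
def object_store_raw_pb(stored_pb, replicas=3, reserve=0.08):
    """Raw capacity for an object store: replicas plus reserved support space."""
    return stored_pb * replicas / (1 - reserve)

raw = object_store_raw_pb(1.8)   # e.g., 1.8 PB of objects (assumed figure)
print(round(raw, 2))             # ~5.87 PB of raw capacity
print(round(raw / 1.8, 2))       # effective ratio ~3.26:1, "just over 3:1"
```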

Planning servers, drive types, and drive number You will also need to choose a type of drive for the solution. Midline HDDs, using either SAS or SATA as the customer prefers, provide the best option for most object storage. Some cases do exist, though, in which the greater performance and reliability of SSDs might pay off. For example:
• Account and container data for OpenStack Swift when accounts contain millions of containers or containers have listings for millions of objects
• Ceph journals: Clients can perform random read-writes to the journal, which are then synchronized to the object, enabling better performance and consistency (in case multiple clients are accessing a file at once). Journals can be colocated with the objects, or they can be located on separate SSDs. Using the SSD option enables faster read-writes, but adds to the cost of the solution and might make design and failover more complicated. Refer to the Ceph on HPE Apollo 4200/4500 System Servers reference architecture for more details.
You can use the required storage capacity and the maximum storage capacity for the selected HPE Apollo model to begin planning the number of systems for the solution. For example, you have determined that you need to provide 5.8 PB of data, and you have selected HPE Apollo 4510 Systems. Each Apollo 4510 System provides up to 544 TB, so 11 systems are required.

This estimate provides a starting point. You sometimes need to plan more servers that each support less than the maximum capacity:
• Discuss how the customer intends to use the object storage solution. Will many clients need fast responses to their requests? In that case, you might choose to use lower-capacity drives. Each drive will have fewer demands on it, so the system can provide better disk I/O.
• Also examine the object storage solution’s recommendations for the minimum number of cluster members. These recommendations might affect how well the application is able to protect data from loss. For example, most clusters should have at least three physical systems so that replicated data is distributed across three different systems.
• You should also round up requirements so that every object server has an identical disk configuration with the same number and type of disk drives. For example, to provide the 5.8 PB of required storage capacity, you should plan 11 fully loaded HPE Apollo 4510 Systems, which actually provide 5.984 PB, as the sketch below illustrates. This best practice helps to ensure consistency and good performance.
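A minimal sketch of that rounding step follows; the 544 TB per fully loaded Apollo 4510 comes from the text above, and decimal units (1 PB = 1000 TB) are assumed.

```python
import math

def systems_needed(required_pb, per_system_tb=544):
    """Identical, fully loaded systems needed to reach a required capacity."""
    n = math.ceil(required_pb * 1000 / per_system_tb)
    return n, n * per_system_tb / 1000   # system count, capacity provided

print(systems_needed(5.8))   # (11, 5.984): 11 Apollo 4510s deliver 5.984 PB
```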

Planning compute, memory, and fabric for object storage
Compute
You should pay careful attention to the processors that you recommend, particularly for HPE Apollo 4510 Systems, which support a higher density of disks per processor. Each drive represents a device to which clients can make requests. The more cores and the higher the clock speed that a processor provides, the better the processor is able to handle these requests. Discuss how the customer intends to use the unstructured data. Is the solution primarily for data archival—“cold” data? Or will clients interact with the data to a fair degree—“hot” data? In the latter case, plan for more powerful processors with more cores.
Memory
The general industry guideline for object storage memory is about 0.5 GB RAM per 1 TB of storage. If an HPE Apollo 4510 System is at full 544 TB capacity, it will require about 272 GB, or perhaps 256 GB for a balanced configuration. Discuss whether the solution will include certain more frequently accessed objects. If so, adding more memory will provide more room for servers to store these objects in a cache and could improve performance.
Fabric
HPE Apollo 4200 and 4510 Systems support FlexibleLOM and PCIe expansion cards, which include 1 GbE and 10 GbE options. To choose between 1 GbE and 10 GbE, consider the need for a speedy recovery in case of a failure. Replicating 1 TB of data across 1 GbE links takes about three hours. The same process can take about 20 minutes with 10 GbE connectivity. Also consider how “hot” or “cold” the solution is. Hot solutions require greater network bandwidth.
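The recovery-time comparison is straightforward to estimate. This minimal sketch assumes that roughly 80% of the link’s line rate is usable for the copy (a common rule of thumb, not an HPE figure):

```python
def replication_hours(data_tb, link_gbps, efficiency=0.8):
    """Rough time to copy data over one link at ~80% of line rate."""
    bits = data_tb * 8e12                       # decimal TB -> bits
    return bits / (link_gbps * 1e9 * efficiency) / 3600

print(round(replication_hours(1, 1), 1))       # ~2.8 hours on 1 GbE
print(round(replication_hours(1, 10) * 60))    # ~17 minutes on 10 GbE
```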

Guidelines for testing object storage solutions The industry does not have widely accepted benchmarks for object storage performance. However, you can still create a POC and demonstrate to the customer how well the HPE Apollo 4000 solution

will perform. Remember to include all components, such as HPE ProLiant DL360p servers to act as proxy or gateway servers. Test the solution with a variety of request scenarios, including
• All GET requests (reads), all PUT requests (writes), and then a mix of GETs and PUTs based on the customer expectations
• Requests (GETs, PUTs, and then a mix) for files of different sizes
For each test, continue to scale up the number of requests while monitoring the performance. You want to see that throughput scales more or less linearly as requests are added. Also monitor latency to determine when it begins to rise over an acceptable level for the client (usually three or so seconds). Make sure that the solution can handle the required number of requests with acceptable latency. If you detect issues, you can use a solution such as HPE Insight CMU or OS tools to check resource utilization and determine what is acting as the bottleneck.
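Many teams drive these POC tests with purpose-built load generators, but even a small script can produce useful latency numbers. The sketch below is a toy PUT-latency probe using the third-party requests package; the endpoint URL is a placeholder, and a real object store (Swift, Ceph RADOS Gateway, and so on) will also require authentication.

```python
# Toy PUT-latency probe for a POC. The endpoint is a placeholder and real
# deployments need credentials; scale the loop up while watching latency.
import time
import requests

ENDPOINT = "http://proxy.example.com:8080/v1/AUTH_test/bench"  # assumed URL

payload = b"x" * (1024 * 1024)          # 1 MiB test object
latencies = []
for i in range(100):
    start = time.perf_counter()
    requests.put(f"{ENDPOINT}/obj-{i}", data=payload)
    latencies.append(time.perf_counter() - start)

latencies.sort()
print(f"median {latencies[len(latencies) // 2]:.3f}s  "
      f"p99 {latencies[int(len(latencies) * 0.99) - 1]:.3f}s")
```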

Chapter 5—Activity 1 You will now examine a customer scenario and plan an HPE Apollo 4000 solution to meet the customer’s needs. In this plan, you will
• Design a solution to host big data analytics
• Architect the solution to meet the customer’s needs, including
– Providing enough storage capacity
– Meeting the compute needs

Scenario A retailer operates a chain of grocery stores throughout a region. The company has a great deal of data about inventory, customers, purchases, and so on. The company is just venturing into big data solutions and plans to deploy Cloudera Hadoop. The customer wants a more scalable and reliable way to store data. The customer also wants to start analyzing that data to make more informed decisions. For example, the customer hopes to learn more about the most loyal customers and the highest-spending customers so that marketing can make better decisions about how to brand the company. The retailer has a relatively small data center with traditional rack servers. The CIO has seen projects fail before due to outdated infrastructure. She wants to ensure that the new big data solution is a success and is pushing the purchase of servers specifically designed to meet the needs of such a solution.

Workload requirements You have discussed the workload requirements with the customer and discovered that
• The customer currently has 1.4 PB of data and an ingest rate of 1 TB a day. The customer wants the storage solution to provide the necessary capacity for one year before needing to scale out.
• HDFS will use the typical three-times replication rate.
• The customer plans to use MapReduce2 applications to analyze data on a weekly basis. Currently, the customer has just a few standard queries that it will run each week, and the queries can take hours to complete.

Select HPE products While discussing needs with key decision makers, you have determined that this customer has a strong bias toward the traditional Hadoop architecture. You decide to propose an HPE server solution to support a traditional big data architecture.
1. What are two HPE server solutions that might fit this customer’s needs?
2. What should you discuss with the customer to help you determine which of these servers you should propose?

You can check your answers by referring to Appendix B: Answers to Activities.

Scope the storage requirements Record your answers to these questions.
3. What should you discuss with the customer when planning how you can underprovision storage based on the fact that the data will be compressed?
4. Assume that you and the customer have agreed that you can plan on a compression factor of 1.5 (in other words, a 1.5 MB file will take up 1 MB). How much storage capacity should you plan for the data nodes? Remember to take into consideration the current data, the ingest rate, the replication factor, and space for result files. (Refer to the “Workload requirements” section above.)
5. Assume that you have discussed the factors that you listed in the first part of this activity and that you have decided to propose HPE Apollo 4530 Systems. Which ProLiant XL server do you propose for this system, and how many can you propose per system?
6. How many HPE Apollo 4530 Systems will you propose? More than one answer could be valid, but think about how you would justify your answer and what you would discuss with the customer to help you make your choice.
7. You learned about baseline processors and memory for a solution such as this, as well as ones that meet enhanced needs. Do you think that this customer has baseline or enhanced needs?
8. You will propose two processors per XL server. Based on your answer to the previous question, which of these processors provides the best choice? (Refer to the information presented in the chapter to remind yourself of the baseline recommendations. Note that the HPE Apollo 4530 System supports more types of processors; you can find a complete list in the QuickSpecs.)
a. Intel Xeon E5-2698 v3 with 2.3 GHz frequency and 16 cores
b. Intel Xeon E5-2690 v3 with 2.6 GHz frequency and 12 cores
c. Intel Xeon E5-2650 v3 with 2.3 GHz frequency and 10 cores
d. Intel Xeon E5-2603 v3 with 1.6 GHz frequency and 6 cores

9. Based on your answer to question 4, how much memory capacity will you plan for each XL server in the Apollo System? The server has 16 DIMM slots (four memory channels with two slots each on each processor). It supports 4/8/16/32 GB RDIMMs and 8/16/32/64 GB LRDIMMs. Which DIMMs will you propose? Although a complete solution has more aspects that you must plan, you have planned the fundamental components for the HPE Apollo 4530 solution. You will plan a big data solution in more depth in Chapter 7 when you learn about the HPE Big Data Reference Architecture. You can check your answers by referring to Appendix B: Answers to Activities.

Summary This chapter has introduced you to the HPE Apollo 4000 family and the use cases for which it is

optimized. You have learned how to design best practice solutions to meet customer requirements for big data analytics based on the traditional Hadoop architecture and for object storage solutions.

Learning check Review what you have learned by answering these questions. Then check your answers in Appendix A: Answers to Learning Checks.
1. Which HPE Apollo server is purpose-built for object storage?
2. Which type of disk drive meets the typical requirements for HDFS?
For answers, see Chapter 5 in Appendix A.

Chapter 6 HPE Moonshot Solutions
EXAM OBJECTIVES
• Briefly describe the HPE Moonshot portfolio
• Position HPE Moonshot solutions for the right use cases
• Explain options and best practices for designing the networking component of an HPE Moonshot solution

Assumed knowledge Before reading this chapter, you should have a basic understanding of the following:
• Processors, including DDR3 and DDR4 memory, hard disk drives (HDDs), solid-state drives (SSDs), and RAID levels for storage volumes
• HPE ProLiant rack and blade servers and options for them such as HPE Smart Array Controllers
• HPE BladeSystems, including interconnect modules and Virtual Connect (VC) modules
• Server management and maintenance, including experience with Integrated Lights-Out (iLO), Intelligent Provisioning, UEFI, HPE Insight Remote Support, HPE Insight Online, HPE Smart Update Manager (SUM), and HPE Insight Control server provisioning (ICsp)
• HPE OneView capabilities

Chapter topics This chapter introduces you to the HPE Moonshot portfolio, its solutions, and the customer use cases that they were specifically designed to address. It then covers general information that you need to know as you design any HPE Moonshot solution, including how to architect the networking components and how to manage the solution. The next chapter gives you specific guidance in designing HPE Moonshot solutions for particular use cases and workloads.

HPE Moonshot overview
This topic introduces you to the HPE Moonshot product and its components.

HPE Moonshot System

Figure 6-1 HPE Moonshot System
The HPE Moonshot System, shown in Figure 6-1, is a huge leap forward in infrastructure design. It delivers breakthrough efficiency and scale by aligning just the right amount of compute, memory, and storage to get the work done. The idea is very simple—replace general-purpose processors with more energy-efficient Systems-on-Chip (SoCs) containing integrated accelerators tailored for specific workloads.
The Moonshot Chassis incorporates everything that is a common resource in a traditional server—power, cooling, management, fabric, switches, and network uplinks are all shared across 45 hot-pluggable server cartridges in a dense form factor. This enables massive scale-out without a corresponding increase in complexity and management overhead. It gives customers the right compute for their workloads at the right economics so that they can get the most out of their infrastructure. With HPE Moonshot, customers can
• Optimize application performance—Avoid paying for IT they are not fully utilizing by using the best solution for their workload
• Realize breakthrough economics—Make better use of their data center space and power while reducing complexity
• Accelerate business innovation—Respond more quickly to business needs and stay on the leading edge of technology

HPE Moonshot components

Figure 6-2 HPE Moonshot components
HPE Moonshot converges compute, storage, and networking within a single chassis. The HPE Moonshot 1500 Chassis houses 45 server cartridges, each of which provides one processor or four processors. Each processor is called a cartridge node and is a server with its own operating system (OS). A Moonshot 1500 Chassis fully populated with four-processor (4P) cartridges has a maximum density of 180 servers in one chassis (see Figure 6-2).
The chassis provides a dense fabric that interconnects the cartridges to each other; it also has two switch modules and two uplink modules to provide cartridges with external connectivity. The chassis houses and manages all power and cooling elements for the cartridges, creating an efficient and green solution.
A chassis management (CM) module provides a single point of access to the essential chassis functions and server management functions. It consolidates iLO functions for the chassis and all installed components. This single connection vastly simplifies the management network required to manage servers at scale. This module contains the logic that controls chassis functions, such as power distribution and cooling. The iLO CM provides command line interface (CLI) and graphical user interface (GUI) management interfaces, as well as a representational state transfer (REST) interface and Intelligent Platform Management Interface (IPMI) for scripting. You will learn more about these components throughout this chapter.
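Because the iLO CM exposes a REST interface, routine monitoring can be scripted. The following minimal Python sketch polls the CM over HTTPS; the hostname, credentials, and resource path are illustrative assumptions, so consult the iLO CM API documentation for the actual resource model.

```python
# Hedged sketch of scripting against the iLO CM REST interface.
# The address, credentials, and URI below are assumptions for illustration.
import requests

ILO_CM = "https://ilo-cm.example.net"         # hypothetical CM address
session = requests.Session()
session.auth = ("Administrator", "password")  # placeholder credentials
session.verify = False                        # lab only; verify certs in production

response = session.get(f"{ILO_CM}/rest/v1/Chassis")  # assumed collection URI
response.raise_for_status()
print(response.json())  # inspect the returned resource to find health data
```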

HPE Moonshot 1500 Chassis front and rear view

Figure 6-3 HPE Moonshot 1500 Chassis front and rear view

Front view
The HPE Moonshot 1500 Chassis (shown in Figure 6-3) is the essential foundation to unlock total cost of ownership (TCO) savings. The efficient design enables 45 hot-plug server/storage cartridges and 2 low-latency network switch cartridges in a 4.3U chassis (a 5U bezel option is available). A 42U rack can easily hold 9 HPE Moonshot 1500 Chassis. The hot-plug cartridges are architected for efficiency, flexibility, and density. Because a chassis can hold 45 servers (1P cartridges) or 180 servers (4P cartridges), a standard 42U rack can hold up to 1620 servers (or 405 servers in a rack using the single-node server cartridges).
The switch modules are centrally located to enable high-bandwidth and low-latency switching. They connect to uplink modules (visible from the rear view), as well as to the cartridges. You will look at the internal switching fabric in more detail later.
At a glance, the front panel display conveys the health status of the HPE Moonshot System: the individual health for each cartridge in the chassis, as well as the general system health. UIDs are located on the front panel display and on each cartridge.

Rear view
The HPE Moonshot 1500 Chassis builds upon the HPE ProLiant SL family by sharing power supplies and cooling for energy efficiency and cost savings. And like the HPE ProLiant BL family, Moonshot adds the benefits of embedded network switching to enable cable consolidation and right-sized networking capabilities for optimizing switch port costs.
The HPE Moonshot 1500 Chassis supports up to 45 cartridges (as many as 180 servers), all sharing the 5 dual-rotor fan modules and 2–4 common-slot power supplies. The fan modules are dual-rotor fans, for a total of 10 fans capable of up to ~4500 W of cooling. Only two power supplies are required, but up to four power supplies can be added to achieve redundancy. Currently, HPE supports common-slot power supplies for the HPE Moonshot System, but check for updates in the QuickSpecs.
The HPE Moonshot network uplink modules are paired and matched to corresponding switch cartridges in the chassis. The switches/uplink modules are stackable and provide the standard rear cabling of rack-mount servers, but with the cable consolidation of a top-of-rack (ToR) switch. From the rear, you can also access the HPE Moonshot 1500 iLO CM module, which, as you learned, provides a single point of access to the essential chassis functions and server management functions.

HPE Moonshot application-focused silicon

Figure 6-4 HPE Moonshot application-focused silicon
HPE Moonshot cartridges are built around SoCs. An SoC integrates all the elements found on a traditional server motherboard—processor, memory, video card, management interface, adapters, and storage controller—on a single chip (as shown in Figure 6-4). HPE Moonshot supports 1P cartridges, which have one SoC, and 4P cartridges, which have four SoCs. It is the SoC's small form factor that allows the HPE Moonshot chassis to host such a high density of servers, saving the customer power, space, and cost.
HPE has also designed each SoC to focus on a specific type of workload, such as video processing. The SoC design includes space for future features created by Independent Hardware Vendors (IHVs) to distinguish their offerings by meeting new needs. When customers need to handle a future use case, they can simply purchase the new cartridge tailor-made for it.

ProLiant cartridge options

Figure 6-5 ProLiant cartridge options
Figure 6-5 and Table 6-1 provide at-a-glance information about the cartridges available at the time of the publication of this ebook. You will learn more about selecting cartridges in the next chapter.
Table 6-1 HPE Moonshot cartridges

Highly flexible fabrics

Figure 6-6 Highly flexible fabrics
HPE Moonshot is designed with four communication fabrics to reduce complexity and enable flexibility. (See a summary of the benefits in Figure 6-6.) The fabrics are connected via a passive baseboard/backplane for low cost, high reliability, and future expansion.
• The Ethernet fabric is made possible by a standard Moonshot switch cartridge and an optional, second switch cartridge. The stackable, low-latency switches are on separate fabrics for isolation and redundancy.
• The storage fabric enables optimization of CPU cores to storage, from minimum-storage applications to storage-rich applications, and from multiple servers sharing a single drive to multiple drives dedicated to a single server.
• The Moonshot CM module manages the infrastructure power and cooling, but its biggest benefit is a single point of management for the 45 cartridges via a dedicated management fabric with point-to-point connections to each module in the chassis.
• The Moonshot architecture was designed with an integrated 2D torus cluster fabric that allows point-to-point connectivity from cartridge to cartridge. The traces are simply copper traces, and the protocol and functionality can be driven by the requirements of the applications and cartridges installed.
There are 29 lanes for the four fabrics: 1 for management, 8 for Ethernet (4+4, cartridge to switch for external connectivity), 4 for storage (2+2), and 16 for the 2D fabric (4 x 4, cartridge-to-cartridge for local connectivity). You will learn about these fabrics in the networking section of this chapter (with the exception of the storage fabric because you cannot sell storage cartridges at the time of the publication of this ebook).

Storage options

Figure 6-7 Storage options
HPE Moonshot cartridges provide some local storage despite their small form factor. The m300 cartridges give customers a choice of HDDs with capacities up to 500 GB. This cartridge also provides M.2 SATA SSDs. Other cartridges support M.2 SATA SSDs of various capacities; refer to Table 6-1 for details.
Some applications require greater amounts of storage than the cartridges can store locally. The HPE Moonshot cartridge adapters support iSCSI initiators, allowing the cartridges to connect to HPE 3PAR StoreServ Systems. The StoreServ Systems deliver highly scalable, high-performance, and easily managed external block storage to the HPE Moonshot System (see Figure 6-7). The StoreServ Systems work with m300, m350, m700, m710, and m710p cartridges.
Alternatively, you can propose HPE density-optimized storage servers such as the HPE Apollo 4000 family (or ProLiant SL servers) to provide external storage. The Apollo 4000 servers can also provide iSCSI block storage through HPE StoreVirtual (supported for m700, m710, and m710p cartridges), or they can support file or object storage, depending on the customer needs.

HPE Moonshot use cases
In this topic, you will examine different HPE Moonshot use cases based on application requirements.

Why HPE Moonshot

Figure 6-8 Why HPE Moonshot
As IT services have penetrated further and further into day-to-day business operations, the applications and workloads hosted in a modern data center have proliferated—not only in number but also in variety. HPE has recognized that customers can no longer rely on all-purpose servers to meet the needs of every application. At the same time, provisioning separate infrastructures for each application is inefficient, expensive, time-consuming, and complicated.
HPE Moonshot solutions reconcile these conflicting needs. They consist of a variety of options for cartridges, each designed to deliver the best performance for a specific type of workload. But the cartridges also share storage and networking within the Moonshot chassis, adding up to a dense, efficient solution that is simple to deploy and manage, as illustrated in Figure 6-8.

HPE Moonshot use cases

Figure 6-9 HPE Moonshot use cases
You can design HPE Moonshot solutions that are specialized for the four types of applications shown in Figure 6-9. A brief overview of the use cases is provided below; the next chapter delves into more detail on each.

Big data and analytics
Data is increasing at an exponential rate. According to the IDC Digital Universe Study, Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East, by 2020 every person will be producing 1.7 megabytes of information every second. Companies need solutions to ingest data, store data, and then explore and analyze the data for business value. HPE Moonshot offers cartridges customized for big data analytics.

Mobile workspace
Always-connected Millennials make up an increasingly large segment of the workforce—and it is not just Millennials who work more productively when they can receive network access anywhere, at any time, and from any device. Companies are discovering that they can gain more by embracing a mobile and Bring Your Own Device (BYOD) environment than by fighting it. Hosted desktop and application streaming solutions allow employees to run enterprise-class applications, hosted in the data center, on the devices of their choice. But these solutions can place high demands on the servers that host them, particularly as the number of users scales. HPE Moonshot solutions provide the required performance and density.

Media processing
Various HPE Moonshot cartridges are also designed to deliver to users the rich, high-quality media content that they demand. Use cases include high-definition video processing and streaming, gaming, and general content delivery.

Web infrastructure
Any company that hosts a website knows that revenue increases with user traffic, but keeping pace with demand can be difficult. With its ability to host a mix and match set of cartridges, HPE Moonshot can host an entire, multi-tiered Web infrastructure in a single chassis, creating an efficient, integrated, and highly scalable multi-tiered solution.

HPE Moonshot Partner Program

Figure 6-10 HPE Moonshot Partner Program
HPE partners with IHVs such as Intel and AMD to give customers a choice of processors that meet their computing needs. HPE also partners with Independent Software Vendors (ISVs), such as those shown in Figure 6-10, to give customers the peace of mind that comes from tested solutions. You should keep an eye on the current list of Moonshot ISV partners so that you can easily identify opportunities for proposing HPE Moonshot solutions. For example, you might encounter a customer who is seeking a server refresh to improve the scalability of their Citrix XenApp application sharing solution. You would know that HPE and Citrix tests have demonstrated stellar performance for the HPE Moonshot 1500 Chassis with m710 cartridges for this scenario. You could refer to a technical white paper for help in architecting a similar solution for your customer, and you could share the HPE and Citrix test results as a compelling proof point for the HPE value in your proposal. Visit the HPE website for a current list of partners.

HPE Moonshot networking
Before you dive into the details of designing HPE Moonshot solutions for hosting specific applications, you need to learn a bit more about the solution architecture and general design guidelines. This chapter focuses on these general aspects, beginning with architecting the networking component of the solution. The next chapter covers architecting solutions for specific workloads.

2D torus—Cartridge-to-cartridge connections

Figure 6-11 2D torus—Cartridge-to-cartridge connections
Certain cartridges can connect together in a 2D torus. These connections are provided by the chassis fabric—four 10 Gbps lanes per connection—and do not require the use of an internal switch. A 2D torus consists of nodes that connect in rings in two directions, as you see in Figure 6-11. The Moonshot fabric supports a 3x15 2D torus, in which fifteen cartridges connect together in a ring. Each of those cartridges also connects to two cartridges in a ring in the other dimension, and each of those two cartridges is part of another fifteen-cartridge ring. Thus, each cartridge connects to four other cartridges, and all 45 cartridges have multiple high-bandwidth paths to all other cartridges within eight or fewer hops.
Only cartridges that are designed for use cases that require high-speed server-to-server communication, such as HPC, support the 2D torus connections (currently the m800 cartridges).
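The "eight or fewer hops" claim can be sanity-checked with a short calculation: the worst-case hop count in a torus is the sum of the worst-case distances around each ring. A minimal sketch:

```python
# Verify the worst-case hop count in a 3 x 15 2D torus.
def ring_distance(a, b, size):
    # Shortest distance around a ring of the given size.
    d = abs(a - b) % size
    return min(d, size - d)

def torus_distance(p, q, rows=3, cols=15):
    return ring_distance(p[0], q[0], rows) + ring_distance(p[1], q[1], cols)

worst = max(
    torus_distance((r1, c1), (r2, c2))
    for r1 in range(3) for c1 in range(15)
    for r2 in range(3) for c2 in range(15)
)
print(worst)  # prints 8: at most eight hops between any two cartridges
```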

Cartridge-to-switch connections

Figure 6-12 Cartridge-to-switch connections
To connect cartridges to the data center network, the HPE Moonshot chassis provides slots for two switches (see Figure 6-12). Each installed switch module has four 10.3 Gbps lanes to each cartridge. These lanes connect to adapters on the cartridge. Different cartridges have different adapters, as you will see in the next section.

The switch module also connects to an uplink module over 16x 10.3 Gbps lanes. The uplink module provides the switch with its uplink ports, used to connect to the data center network. A switch module and uplink module are always installed as a pair. By disaggregating the uplinks from the internal switch, HPE Moonshot gives customers greater flexibility for deploying their choice of external interconnects and future-proofing their investment.
Only one switch module and uplink module pair is required. However, if only one switch module is installed, each cartridge can use only the half of its ports that connects to that module. The second switch and uplink module pair provides redundancy or connectivity to a second network. Multiple modules can be stacked within or across multiple chassis, reducing the cost of ToR switches and providing failover in the event of a switch or an uplink failure. Table 6-2 lists the features of each switch and uplink module.
Table 6-2 HPE Moonshot Switch and Uplink Modules
HPE Moonshot Switch Modules | Features | Intended cartridges
Moonshot-45G Switch Module | The 45G Switch Module, together with the HPE Moonshot-6SFP Uplink Module, provides 1 GbE network connections to cartridges within the HPE Moonshot 1500 chassis. | m300
Moonshot-45XGc Switch Module | The 45XGc Switch Module, together with the HPE Moonshot-4QSFP+ Uplink Module, provides 10 GbE network connections to cartridges within the HPE Moonshot 1500 chassis and 40 Gb/10 Gb connectivity external to the chassis. | m400, m710, m710p
Moonshot-180G Switch Module | The 180G Switch Module provides 1 GbE network connections to up to 180 nodes in the HPE Moonshot 1500 chassis. | m350, m700, m800
HPE Moonshot Uplink Modules | Features | Switches supported and intended cartridges
Moonshot-6SFP Uplink Module | Use up to two HPE Moonshot-6SFP Uplink Modules, each with six 10 GbE SFP+ ports. Each uplink module delivers 60 GbE of aggregate bandwidth to connect the HPE Moonshot System to an external network. | 45G or 45Gc; m300
Moonshot-16SFP+ Uplink Module | Use up to two HPE Moonshot-16SFP+ Uplink Modules, each with 16x 10 GbE SFP+ ports. Each uplink module delivers 160 GbE of aggregate bandwidth to connect the HPE Moonshot System to an external network. | 180G or 45XGc; m350, m400, m700, m710, m710p, m800
Moonshot-4QSFP+ Uplink Module | Use up to two HPE Moonshot-4QSFP+ Uplink Modules, each with four 40 GbE QSFP+ ports. Each uplink module delivers 160 GbE of aggregate bandwidth to connect the HPE Moonshot System to an external network. | 180G or 45XGc; m350, m400, m700, m710, m710p, m800

Cartridge connectivity options

Figure 6-13 Cartridge connectivity options
The cartridge adapters that connect to the switch modules depend on the type of cartridge (see Figure 6-13). HPE Moonshot supports two types of 1P cartridges: ones with two 1 GbE ports and ones with two 10 GbE ports. All 4P cartridges have two 10 GbE ports per node, for eight in total. For all cartridges, one port on each node connects to one switch module and the other port connects to the second module (if installed).

Selecting switch modules

Figure 6-14 Selecting switch modules
You will now learn guidelines for selecting the correct switch module based on the cartridge type (illustrated in Figure 6-14). Begin by considering situations in which all cartridges installed in the HPE Moonshot chassis are the same type. (Here, type refers to 1P G, 1P 10G, or 4P, not the precise cartridge model.) If you are using 4P cartridges, you must select HPE Moonshot 180G Switch Modules, which provide enough ports for each of the four processors on the 45 cartridges. If the chassis includes all 1P G cartridges, you have two options for switches that provide 45 1G ports.

Choose the HPE Moonshot 45G Switch Module for basic switching features such as VLANs, IP routing, and support for Quality of Service (QoS). Choose the HPE Moonshot 45Gc Switch Module (based on the HPE Comware OS) when you need these basic features as well as advanced data center switching features such as the following:
• Transparent Interconnection of Lots of Links (TRILL), which lets switches interconnect on many links without creating loops
• OpenFlow, which enables switches to be controlled by a software-defined networking (SDN) solution
For 1P 10G cartridges, select HPE Moonshot 45XGc Switch Modules, which are based on the same OS as the 45Gc switch modules but provide 10G connectivity. Note that you must choose the same switch module type for both module slots.

Selecting switch modules for chassis with mixed cartridge types
You can mix and match cartridges of different models within the same chassis, tailoring the chassis to the different workloads that a solution requires. For example, a mobile workspace solution might use m710p cartridges to provide hosted desktops but m300 cartridges to host controllers. If any of these cartridges are different types, such as 1P G and 1P 10G, you need to take care to select switches that support both.
The 4P cartridges can only connect to the HPE Moonshot 180G Switch Module, while 1P G and 1P 10G cartridges can work with any of the modules. When a cartridge and a switch support different speeds, the lower speed is used. That is, although a 10G cartridge can be supported by a gigabit switch (45G, 45Gc, or 180G), it only receives a 1 Gbps connection from this switch. When the 45XGc switch supports a mix of 1P cartridges, it provides 10G connections for 10G cartridges, but only 1G for 1G cartridges. Finally, while a 180G switch module can support 1P cartridges, it only provides such a cartridge with one connection because the cartridge itself has only two ports (one of which connects to the other switch).
Based on these rules, you must select the 180G Switch Modules whenever the chassis includes 4P cartridges. For this reason, you might try to avoid mixing 4P and 1P 10G cartridges in the same chassis because the 10G ports will only be able to operate at 1 Gbps. Of course, if the cartridges do not require the higher speed connections, then mixing these types of cartridges is permitted. When you need to mix 1P G and 1P 10G cartridges in the same chassis, select the 45XGc switch modules, which can support both types of cartridges and allow the 1P 10G cartridges to benefit from the full bandwidth on their ports. These selection rules are sketched as a decision function below.
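A minimal sketch encoding the rules above; the type labels are this chapter's categories, not product identifiers.

```python
# Sketch of the switch-selection rules; cartridge types follow the chapter's
# categories ("4P", "1P_10G", "1P_G"), not precise cartridge models.
def select_switch_module(cartridge_types):
    types = set(cartridge_types)
    if "4P" in types:
        # 4P cartridges require 180 downlinks; any 10G cartridges drop to 1 Gbps.
        return "Moonshot-180G"
    if "1P_10G" in types:
        # Serves 10G cartridges at 10 Gbps and 1G cartridges at 1 Gbps.
        return "Moonshot-45XGc"
    # All 1P G: 45G for basic features, 45Gc for TRILL/OpenFlow.
    return "Moonshot-45G or Moonshot-45Gc"

print(select_switch_module(["1P_G", "1P_10G"]))  # Moonshot-45XGc
print(select_switch_module(["4P", "1P_10G"]))    # Moonshot-180G
```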

Selecting uplink modules

Figure 6-15 Selecting uplink modules
On its own, an HPE Moonshot switch module does not have any external ports for connecting to the data center network. You must select an uplink module to provide these ports, as shown in Figure 6-15.
The HPE Moonshot 45G and 45Gc Switch Modules support the HPE Moonshot-6SFP Uplink Module. This module provides six 10 GbE connections, meaning no oversubscription for traffic flowing from the 45 cartridges (the typical flow for most use cases).
For the HPE Moonshot 45XGc and 180G Switch Modules, you have two options: the HPE Moonshot-16SFP+ Uplink Module (which provides 16 10 Gbps SFP+ ports) and the HPE Moonshot-4QSFP+ Uplink Module (which provides four 40 Gbps QSFP+ ports). Both uplink modules support the same total bandwidth. Take the infrastructure at the customer's data center into account as you choose a module. Using 40 GbE links requires fewer upstream ports; on the other hand, the customer infrastructure might not yet support 40 GbE to the rack because 40 GbE connections require different cabling with more fibers. For example, 40GBASE-SR4 uses MPO cables. These cables have 12 fiber strands, eight of which are used: four for transmitting and four for receiving.
If the customer has 10 GbE now, but wants the flexibility to upgrade to 40 GbE in the future, select the 4QSFP+ Uplink Module. You can install a QSFP+/SFP+ adapter kit, which converts a 40 GbE QSFP+ port into a 10 GbE SFP+ port, enabling the module to fit into the existing infrastructure until the upgrade. Or, for short-range connections up to 5 meters, you can use QSFP+ to 4x10G SFP+ Direct Attach Copper splitter cables. These splitter cables give the customer four 10 GbE links per QSFP+ port.
Note
Please disregard any compatibility error message while using a QSFP+/SFP+ adapter.

To choose between the 16SFP+ Uplink Module and the 4QSFP+ Uplink Module, also consider whether you plan to implement stacking (180G) or IRF (45XGc) on the switch modules. You will learn more about these features in a moment. For now, simply know that if you want to use these features, you must dedicate some links to establishing the stack or Intelligent Resilient Framework (IRF) fabric:
• At least four 10 GbE links on a 16SFP+ Uplink Module
• At least one 40 GbE link on a 4QSFP+ Uplink Module
These links are then no longer available for uplinks. It is best practice to establish at least two links between two-member stacks or fabrics to prevent a situation in which the stack or the fabric splits. If you are concerned about oversubscription, you might choose the 16SFP+ Uplink Module so that you can establish multiple stacking or IRF links without having to dedicate a full 80 Gbps to these links. Note, however, that for some situations 80 Gbps is sufficient. You can also use just one 40 GbE stacking or IRF link. You should then, however, make sure that the network administrator sets up the proper mechanism for dealing with split stacks or fabrics (Multi-Active Detection [MAD] on the 45Gc and 45XGc switches) in case that single link fails. Table 6-3 provides information about oversubscription based on the uplink module (a short calculation sketch follows the note below).
Table 6-3 Switch module oversubscription based on uplink module
Switch Module | Uplink Module | Oversubscription | Oversubscription with stacking/IRF
45G or 45Gc | 6SFP | None | 1.1:1 (two 10 GbE links for stacking/IRF)
180G | 16SFP+ | 1.1:1 | 1.3:1 (four 10 GbE links for stacking/IRF)
180G | 4QSFP+ | 1.1:1 | 1.5:1 (one 40 GbE link for stacking/IRF); 2.3:1 (two 40 GbE links for stacking/IRF)
45XGc | 16SFP+ | 2.8:1 | 3.8:1 (four 10 GbE links for stacking/IRF)
45XGc | 4QSFP+ | 2.8:1 | 3.8:1 (one 40 GbE link for stacking/IRF); 5.6:1 (two 40 GbE links for stacking/IRF)

Note
The Moonshot-6SFP and 16SFP+ Uplink Modules support various HPE SFP transceivers, SFP+ transceivers, and Direct Attach Cables (DACs). The Moonshot-4QSFP+ Uplink Modules support various HPE QSFP+ MPO SR4 transceivers, QSFP+ DACs, QSFP+ to 4x10G SFP+ DAC splitter cables, and QSFP+/SFP+ adapters. Always refer to the latest module QuickSpecs for a list of the transceivers and cables qualified and certified to work with a module. Transceivers and DAC cables from other manufacturers will be accepted, but they will not be supported by HPE.
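The ratios in Table 6-3 follow from dividing cartridge-facing bandwidth by the uplink bandwidth that remains per switch module. A minimal sketch (values in Gbps; printed ratios are approximate, matching the rounding in the table):

```python
# Approximate per-module oversubscription: downlink bandwidth divided by
# uplink bandwidth left over after dedicating stacking/IRF links.
def oversubscription(downlink_gbps, uplink_gbps, stacking_gbps=0):
    return downlink_gbps / (uplink_gbps - stacking_gbps)

# 45XGc with a 16SFP+ Uplink Module: 45 x 10 GbE down, 16 x 10 GbE up
print(oversubscription(45 * 10, 16 * 10))           # 2.81 -> ~2.8:1
print(oversubscription(45 * 10, 16 * 10, 4 * 10))   # 3.75 -> ~3.8:1

# 180G with a 4QSFP+ Uplink Module: 180 x 1 GbE down, 4 x 40 GbE up
print(oversubscription(180, 4 * 40))                # 1.125 -> ~1.1:1
print(oversubscription(180, 4 * 40, 40))            # 1.5 -> 1.5:1
```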

Providing redundancy and avoiding broadcast storms

Figure 6-16 Providing redundancy and avoiding broadcast storms

A redundant network design without the proper technologies in place can cause broadcast storms (illustrated in Figure 6-16): network switches continually duplicate and forward broadcasts and multicasts to each other until these packets consume all bandwidth on links—essentially an unintentional denial of service (DoS) attack. In a moment, you will examine methods for properly designing network topologies that provide redundancy and high bandwidth:
• Use link aggregation (either manual or LACP-based). You can only aggregate links that connect one switch to the same other switch. However, as you will see, a stack or IRF fabric counts as a single switch for link aggregation.
• Enable a spanning tree protocol such as Rapid Spanning Tree Protocol (RSTP) or Multiple Spanning Tree Protocol (MSTP). RSTP and MSTP handle loops across the broadcast domain, blocking redundant paths, which automatically reactivate in case of failure.
• Enable TRILL. TRILL provides a better solution than spanning tree for a modern data center because it allows switches to load balance traffic across many links in a swiftly converging topology.
• Place links in different VLANs such that no one VLAN has a looped topology. A VLAN defines a broadcast domain. As long as VLANs segment the looped topology such that no loops exist within a VLAN, no broadcast storm occurs. Often, though, you want redundant connections to carry the same VLAN, so this chapter will not examine this method in more detail.
You might use some of these methods in conjunction—for example, creating a link aggregation on multiple ports that connect to the same switch and also enabling TRILL to handle redundant link aggregations that connect to different switches. The next sections provide some example network designs, demonstrating when and how you should use the most common methods.

Redundant design without stacking or IRF

Figure 6-17 Redundant design without stacking or IRF
This first design, shown in Figure 6-17, applies to situations where you want cartridge nodes to use their two ports for redundancy within the same subnet. You also do not want to stack the switch modules or implement IRF.
For this design, each node's OS must bond its two ports in a mode that does not require awareness from the connected switches. You can configure the ports in active/standby mode. Only one port will forward and receive traffic. The other port will be on standby in case the active port fails. You can assign the active NIC role to port 1 on half of the cartridge nodes and to port 2 on the other half so that both switches are handling traffic during normal operation (a sketch of this split follows this design). If the workload demands bandwidth from both ports, the bonded ports could load balance traffic; however, the load balancing must use a mechanism that does not require aggregation on the switch side. For example, a node with a Linux OS can use balance-tlb mode or balance-alb mode. Windows Server 2012 can use switch-independent mode with load balancing.
On the uplink side, both switch modules take advantage of all their uplink ports to connect to data center switches, probably ToR switches. Although not shown in the illustration for simplicity, the data center switches connect the Moonshot chassis at high speeds to other Moonshot chassis and HPE servers in the rack, as well as to the data center core. The same holds true for the designs shown in the next sections. In this example, each 6SFP Uplink Module has three connections to one data center switch and three connections to another switch. You must aggregate the three links using either a static aggregation or LACP; LACP is generally preferred, and the connected switch should support that protocol. You can create similar designs for the 16SFP+ and 4QSFP+ modules.
Also enable spanning tree on both switch modules. Now the modules block links in order to create only one path to the spanning tree root, which is somewhere upstream in the data center. As you see in Figure 6-17, a link aggregation counts as a single link for spanning tree. If one of the active link aggregations fails entirely—for example, a data center switch fails—a module can automatically unblock the proper links.
Note that you should leave the spanning tree priority on each module at the default to prevent the switch module from being elected root (0 is the highest priority). You should typically also define the downstream ports that connect to servers as STP edge ports, which enables the ports to come up more quickly and prevents disruptions during failover events. When you do this, make sure that the redundant server ports are not bridged, which would introduce a loop. Remember: the ports should be bonded, and bonded in a mode that does not require awareness on the switch modules.
RSTP and MSTP have failover times of just under a second rather than times counted in milliseconds, which today's applications often require. Therefore, this design is shown for your reference, but you should usually use one of the faster converging methods for managing loops, described in a moment.
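As an illustration of the active/standby split described above, the following sketch alternates the active port across the 45 cartridge slots so that both switch modules carry traffic during normal operation; the slot-based split is only an example of one way to balance the assignments.

```python
# Illustrative assignment of the active NIC per node: odd slots use port 1
# (switch module A), even slots use port 2 (switch module B).
active_port = {
    slot: ("port1" if slot % 2 else "port2")
    for slot in range(1, 46)   # 45 cartridge slots
}
print(active_port[1], active_port[2])  # port1 port2
```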

Redundant design with TRILL: 45Gc or 45XGc

Figure 6-18 Redundant design with TRILL: 45Gc or 45XGc
Instead of using RSTP or MSTP on the HPE Moonshot 45Gc and 45XGc switches, you can set up TRILL—as long as the connecting switches in the data center support this protocol (see Figure 6-18). Like the spanning tree protocols, TRILL prevents traffic from looping in a broadcast domain. However, TRILL fails over more quickly and tends to be more stable than a spanning tree protocol. It also allows switches to use all of their links, creating a lower latency, higher bandwidth, and more scalable topology. On the switch modules, you enable TRILL and set up the downlink ports as TRILL access ports.
You might use this design when the data center switches are not implementing IRF. Again, although not shown in Figure 6-18, the data center switches connect the Moonshot chassis to other resources. In this design, the switch modules act independently—perhaps because you want to ensure that all uplink bandwidth is available for traffic (none used for stacking or IRF links). Therefore, you set up the cartridge node ports as you did in the previous example: bonded in a mode that does not require the upstream switch to know about the aggregation.

Redundant design using stacking or IRF

Figure 6-19 Redundant design using stacking or IRF
You can combine multiple HPE Moonshot 45Gc or 45XGc Switch Modules into a single IRF fabric, as you see in Figure 6-19. Similarly, you can combine two HPE Moonshot 45G or 180G switches into a single switch stack. The stack or fabric:
• Is managed from one interface
• Shares a control plane (which is proxied to each member in an IRF fabric)
• Appears as one switch to other switches and routers
As mentioned earlier, you must dedicate some of the uplink ports to stacking or IRF ports, which connect the modules together. Table 6-4 shows the minimum number of stacking or IRF links for each switch module. You could add more links if you expect that the links will need to carry more traffic between the modules, but in the design illustrated above, the links should carry minimal traffic. Refer to the switch module documentation for details about which port IDs you can use for the links.
Table 6-4 Minimum required stacking or IRF links
Switch Module | Uplink Module | Minimum required stacking or IRF ports for a two-member stack
45G or 45Gc | 6SFP | Two
180G or 45XGc | 16SFP+ | Four
180G or 45XGc | 4QSFP+ | One (two recommended to reduce chances of a split)
In this example, the modules are connecting to an IRF fabric composed of two data center switches, so the design is simple. You place all links across the modules in a single aggregation. If one module or all of its uplinks fail, failover to links on the other module occurs in milliseconds. Switch modules support a maximum of 32 links per link aggregation, so you can aggregate all links on both modules no matter what type of uplink module is used.

Figure 6-20 Redundant design using stacking or IRF
If the modules are connecting to multiple data center switches that are not in an IRF fabric, you need to implement another method for managing loops: RSTP/MSTP or TRILL.
When you use stacking or IRF, illustrated in Figure 6-20, each cartridge node can implement NIC bonding in any mode that meets the customer needs. If the use case calls for load balancing traffic on both ports, you can use LACP mode (sometimes called 802.3ad mode). You then configure the stack or IRF fabric to establish a link aggregation to each node using one port on each module. Active/standby mode is also supported but might increase the traffic on the stacking or IRF links because traffic incoming from the uplinks might arrive on the module connected to the standby port; the traffic must then cross the stacking links to reach the module connected to the active port.
When you plan connections for a stack or an IRF fabric that you intend to use for redundancy, create a balanced design, as shown in the figure, in which each switch has the same number of connections to each upstream or downstream device. This design prevents undue congestion on the stacking or the IRF links. During normal operation, when traffic arrives on either member of the stack or the fabric, that member can forward the traffic on a local link in the link aggregation connected to the destination device. When you follow these guidelines, using stacking or IRF on two modules should not affect the networking performance—and it will enhance redundancy. Note that, during some failover situations, traffic will need to cross the stacking or IRF link because, for example, half of the cartridge node traffic will still be arriving on a module with failed links.

(Network administrators could create a policy on a 45Gc or 45XGc switch to shut down the ports that connect to cartridge ports if all uplink ports fail.)

Expanding stacking or IRF across multiple chassis

Figure 6-21 Expanding stacking or IRF across multiple chassis
When you want to use stacking or IRF to enhance resiliency, you should plan the topology as you saw in Figure 6-20, with two members per stack or fabric and links balanced across the modules. Sometimes, however, customers want to use stacking or IRF to reduce the number of data center switch ports required to connect to the Moonshot chassis and perhaps to eliminate the need for a ToR switch entirely.
For example, the customer shown in Figure 6-21 has four Moonshot chassis in a rack. The chassis are populated with 1P 10G cartridges and use Moonshot-45XGc Switch Modules installed in slot A together with 4QSFP+ Uplink Modules. This customer does not require redundancy for individual cartridge node links, so modules are not installed in slot B. However, the customer does want link redundancy at the chassis level, so if each switch module were acting on its own, at least eight 40 GbE ports would be required to connect all chassis in the rack.
When you combine the switch modules on all these chassis into a single stack or fabric, the modules can share uplinks. For example, you could establish four uplinks, one on each module. All four chassis have link redundancy because if one of the chassis' uplinks fails, it can use the stacking or IRF links to reach the other modules' uplinks. However, only half the number of ports is required for supporting the rack, so the customer might be able to deploy end-of-row or middle-of-row (EoR) switches instead of ToR switches. If the solution can tolerate more oversubscription for cartridge traffic, you could reduce the uplinks further.
The increased oversubscription is a tradeoff in this design, as is increased latency. Some outgoing traffic must traverse stacking or IRF links before reaching the uplinks; similarly, incoming traffic might need to traverse stacking or IRF links before reaching the module that connects to the destination cartridge node. To ensure adequate performance, it is recommended that you use a ring topology as shown in Figure 6-21 and that you carefully consider how much bandwidth is required on the stacking links. When traffic flows from cartridge nodes, a switch module that has an uplink in the egress link aggregation forwards the traffic on its local uplink. If it does not have a local uplink, it forwards the traffic on a stacking link to the closest member with an uplink. However, incoming traffic might arrive on any module because the upstream switch decides on which link in the aggregation to send it. Then the traffic must cross the stacking or IRF links to reach the connected cartridge. In many Moonshot use cases, distributing the uplinks across the stack or fabric so that every module has or is close to an uplink reduces traffic on the stacking or IRF links. If the traffic patterns for your use case differ, you might need to add bandwidth to the stacking or IRF links. Inadequate bandwidth can cause lost packets.
Table 6-5 shows the maximum number of members per stack or fabric permitted on various switch modules. Keep in mind, though, that performance can decrease as the number of members increases, particularly for the 45G Switch Modules.
Table 6-5 Members per stack
Switch Module | Maximum number of members per stack or IRF fabric
45G | 9
180G | 2
45Gc and 45XGc | 4

Using VLANs

Figure 6-22 Using VLANs
Often a chassis is populated with cartridges that are part of a scale-out solution, and all cartridges belong to the same network. Sometimes, though, you will need to place different cartridge nodes in different networks. You enforce the network divisions at the switch module level by assigning the node ports to the correct virtual local area network (VLAN) without tagging (node ports are access ports on 45Gc and 45XGc modules), as illustrated in Figure 6-22. You can assign different uplinks to different VLANs without tagging if you want to dedicate certain uplinks to certain cartridges' traffic. If you want cartridges to share uplink ports, assign all the cartridge VLANs to the uplink ports (or aggregations) using tagging (the uplink ports are trunk ports on 45Gc and 45XGc modules).

Chapter 6—Activity
Now take some time to complete an activity in which you will review what you have learned about selecting switch modules and uplink modules, as well as properly designing redundant connections. You can check your answers by referring to Appendix B: Answers to Activities.

Select switch modules
You will now answer several questions. You need to select the switch modules for several HPE Moonshot chassis. For each scenario below, select the best switch module. Assume that the customer wants the highest bandwidth connections and the most advanced switch functions possible in the scenario. You can use Table 6-6 to review the cartridge capabilities.
Switch module choices
a. HPE Moonshot 45G Switch Module
b. HPE Moonshot 45Gc Switch Module
c. HPE Moonshot 45XGc Switch Module
d. HPE Moonshot 180G Switch Module

Scenarios
1. The chassis has 45 m710p cartridges.
2. The chassis has 45 m350 cartridges.
3. The chassis has 45 m300 cartridges.
4. The chassis has 30 m400 cartridges and 15 m800 cartridges.
5. The chassis has 3 m300 cartridges and 42 m710 cartridges.

Table 6-6 HPE Moonshot cartridges

Select uplink modules
Listed below are three choices for uplink modules. Think about and record reasons to choose each type of module. (The reasons for choosing some modules might be quite straightforward; in other cases, you might need to consider more factors.)
1. HPE Moonshot 4QSFP+ Uplink Module
2. HPE Moonshot 16SFP+ Uplink Module
3. HPE Moonshot 6SFP Uplink Module

Plan redundant topologies
When you plan the connections for an uplink module, you must be careful to plan the correct technologies. Figures 6-23, 6-24, 6-25, and 6-26 show several ways to connect switch modules to data center switches, some more redundant than others. Some of the designs also use IRF fabrics. Sketch these simple figures on a blank piece of paper. Then draw a circle around the links that you would combine in a link aggregation. Also label the figure with any technology that you would implement, such as TRILL or RSTP.

Figure 6-23 Topology 1

Figure 6-24 Topology 2

Figure 6-25 Topology 3

Figure 6-26 Topology 4 You can check your selections in Appendix B: Answers to Activities.

HPE Moonshot management
Next, you will learn how to manage and provision HPE Moonshot solutions.

HPE Moonshot iLO CM module

Figure 6-27 HPE Moonshot iLO CM module
The iLO functions for a Moonshot chassis' cartridges are managed through an HPE Moonshot iLO CM module. The iLO CM acts as the single Ethernet gateway for the cartridges. The CM communicates with satellite controllers (SCs), which are embedded throughout the chassis, as shown in Figure 6-27. For example, each cartridge has an SC, as does the switching fabric. The chassis also has more than 1500 sensors, which collect information for the CM.
Administrators can access the CM through a serial port and receive access to the CM CLI. Or they can connect through the iLO port (after the CM has received an IP address) and access the CLI through SSH or the GUI through HTTPS. The device-neutral design allows a common interface to ARM SoC cartridges and x86 cartridges. As new types of cartridges are invented, the device-neutral architecture can enable the inclusion of those cartridges into the management fabric. From the management interfaces, administrators can
• Monitor cartridge and switch health
• View logs
• Monitor and manage power utilization
• Set cartridges' boot settings
• Manage the CM itself, including updating firmware for the module and SCs, setting the CM's IP address, and managing the accounts for users allowed access to the CM
The iLO CM also provides a serial console connection to each cartridge node's virtual serial port (VSP). The iLO CM hosts Intelligent Platform Management Interface (IPMI) and the REST API for the cartridge nodes, allowing scripting for monitoring and maintenance tasks. Chapter 9, "Monitoring and Managing HPE Solutions," explains more about the REST API, which is the preferred API.
In short, the iLO CM is quite similar to an HPE BladeSystem Onboard Administrator (OA), with which you should be familiar from prerequisite training. Table 6-7 shows the various privileges that administrators can assign to iLO CM users and their consistency with existing iLO systems.
HPE Moonshot chassis support a single iLO CM module. If the module fails, the iLO and management functions are unavailable until the module is replaced. However, the cartridges and internal switches continue to function, and production services are undisturbed.
Table 6-7 HPE Moonshot iLO CM privileges
Privilege | Comparison to existing iLO-based platforms
Remote Console Privilege | Allows access to cartridge node Virtual Serial Ports (VSPs)
Boot Priority | Allows configuration of node-level PXE/HDD boot settings
Power and Reset | Remains consistent with iLO
Configure CM | Provides the same privileges as the Configure iLO option
Administer User Accounts | Remains consistent with iLO

Planning iLO connections

Figure 6-28 Planning iLO connections
As you can for HPE Apollo solutions, you can choose to connect each chassis iLO CM port to a network switch, or you can daisy chain several Moonshot chassis together and connect only one to a network switch. The first option provides better availability for iLO CM functions, while the second option conserves ports. Figure 6-28 illustrates how to use the iLO and Link ports on the chassis to daisy chain them together. (Administrators also have to enable the daisy chain function from the iLO CM CLI.) In either case, it is recommended that you use a separate switch for iLO traffic, apart from the switches used for production traffic.
Be very careful to avoid loops. On a standalone chassis, connect only the iLO port to the network switch, not both the iLO and Link ports. In a daisy chain, connect the iLO port on only one chassis to the network switch.

Access to HPE Moonshot cartridge node

Figure 6-29 Access to HPE Moonshot cartridge node
The HPE Moonshot chassis is headless, without any connectors for a video console, keyboard, or USB device. Administrators and system integrators manage the chassis and its components through the CM exclusively. They can access a node's virtual serial port (VSP) through the iLO CM CLI (SSH session only). Using the VSP, they can set boot settings, configure a one-time boot for a node, and perform other basic tasks. This access is illustrated in Figure 6-29.
Administrators can also configure settings such as cartridge node boot options through the iLO CM GUI. From the GUI, administrators can also establish a Remote Console session with keyboard, mouse, video console, and virtual media—however, they can only do so if the cartridge node is linked with a Moonshot Remote Console Administrator (mRCA).

HPE mRCA

Figure 6-30 HPE mRCA
You should typically recommend at least one mRCA for the solution. An mRCA, when installed in an HPE Moonshot chassis cartridge slot, links to a cartridge node and provides Remote Console access to that node, as shown in Figure 6-30. Which node is linked to the mRCA depends on the slot in which the mRCA is installed. HPE recommends installing the cartridge that you want to manage in slot 41. If the cartridge is a 1P cartridge, or if you want to manage the first node on a 4P cartridge, install the mRCA in slot 44. To manage one of the other nodes on a 4P cartridge installed in slot 41, place the mRCA as follows:
• Node 2 = slot 40
• Node 3 = slot 42
• Node 4 = slot 38
Although these slots are recommended, the mRCA can be installed in different slots and link to different cartridges and nodes. To look up in which slot you should install the mRCA cartridge to link to a particular node—or in which slot you should install a cartridge to link to an mRCA already installed in a specific slot—visit http://h17007.www1.hpe.com/us/en/enterprise/servers/mrca/index.aspx#.VquEUdUrKM8. A sketch of the recommended slot mapping follows this section.
With the mRCA installed, administrators and integrators can access the iLO CM from their own management station and launch a Remote Console session with the linked cartridge node. They now have keyboard, console, and mouse access to the cartridge node. The mRCA also provides virtual media, so administrators can mount an image to the cartridge node from media connected to their device. With access to the virtual media, they can also easily create a golden image, installing all the necessary applications and configuring the correct settings for the customer solution. Integrators can then use tools such as Microsoft Windows Deployment Services (WDS) to capture the image (they must customize a capture boot file to enable the capture process to proceed correctly on a headless device). They could then set up a preboot execution environment (PXE) as described later in this section, using the captured golden image as the install image file. The mRCA also provides a Debugging Tool, which administrators can use to troubleshoot a node.
If the customer primarily wants to use the mRCA for the initial deployment of the solution, one mRCA is sufficient. Administrators can install the mRCA in slot 44 of one chassis and create the golden image on node 1 of whichever cartridge is installed in slot 41. If the customer plans to use the mRCA for debugging, you might recommend leaving slots 41 and onward open in one chassis (with blanks installed). Then the administrators can simply install the mRCA in slot 44 and move any cartridge that needs to be debugged to slot 41. If the customer has 4P cartridges, leave slots from 38 onward open in one chassis. Then administrators can move the mRCA to slot 40, 42, or 38 to debug node 2, 3, or 4 on a cartridge in slot 41. To support more extensive debugging, you might recommend one mRCA per HPE Moonshot chassis. The mRCA supports x86 cartridges. For a current list of supported cartridges, visit http://www.hp.com/go/moonshot.
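The recommended slot mapping for a 4P cartridge in slot 41 can be captured in a small lookup, as sketched below; for other placements, use the HPE slot-mapping page referenced above.

```python
# Recommended mRCA slot for each node of a 4P cartridge installed in slot 41,
# per the guidance above.
MRCA_SLOT_FOR_NODE = {1: 44, 2: 40, 3: 42, 4: 38}

print(MRCA_SLOT_FOR_NODE[3])  # install the mRCA in slot 42 to reach node 3
```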

Provisioning options

Figure 6-31 Provisioning options
HPE Moonshot cartridge nodes can use PXE to boot an image from the network for their initial OS installation. The sections that follow provide more information about ensuring that PXE works for headless Moonshot cartridges. Keep in mind, though, that especially if you are architecting a large solution with multiple HPE Moonshot chassis, you should propose one of two solutions for speeding the provisioning process: HPE Moonshot Provisioning Manager (MPM) or HPE Insight Cluster Management Utility (CMU), as you see in Figure 6-31. HPE Insight CMU provides monitoring capabilities in addition to the provisioning ones and is geared for larger deployments. Chapter 9, "Monitoring and Managing HPE Solutions," covers these solutions. The mRCA cartridge provides Remote Console (including Virtual Media) access to a node on a linked server cartridge and is recommended for creating the golden image that will be deployed to other nodes through one of the other methods.

Planning for a network deployment without HPE Provisioning Manager or Insight CMU

Figure 6-32 Planning for a network deployment without HPE Provisioning Manager or Insight CMU
If the customer does not use HPE MPM or Insight CMU, integrators set up PXE using much the same process as for other ProLiant servers. However, some special considerations apply. For a Windows installation, they must set up a PXE server, DHCP server, and DNS server (which could all be the same server) and load the proper boot, install, and driver files on the PXE server (as shown in Figure 6-32). Windows Deployment Services (WDS) is a common PXE server for a Windows environment, but HPE Moonshot supports other solutions. Integrators must also use HPE Moonshot Windows Deployment Packs (MWDPs) to customize the boot and install files with the proper drivers and settings for supporting a particular HPE Moonshot cartridge. For example, the MWDP turns on Windows Emergency Management Services (EMS), which allows the OS to install on the headless cartridge nodes. Integrators must also create answer files that allow a cartridge node to go through the installation process without user interaction. When using WDS and other solutions with similar capabilities, integrators should create a pre-staged solution in which each cartridge node's MAC address is bound to the correct boot and install files. To obtain the MAC addresses, integrators can access the iLO CM and generate a list of the addresses.
A Linux installation similarly requires a PXE server with the proper boot and configuration files, a DHCP server, a TFTP server, and an HTTP, NFS, or FTP server to deliver the OS installation files. Again, the same server might provide all of these services. The PXE configuration files require some special settings for the headless environment. Integrators will probably want to use an automatic installation process using kickstart, preseed, or AutoYaST files (the precise file type depends on the type of Linux OS). Integrators will need to customize these files to support the headless environment. A sketch of per-node PXE pre-staging follows this section.
The HPE Moonshot cartridge nodes are configured to boot from PXE by default. However, integrators can access the CM to set up booting from a local HDD or SSD or from iSCSI. Complete instructions for deploying a supported Windows or Linux OS through the network are provided in the Operating System Deployment on HPE ProLiant Moonshot Server Cartridges User Guide.
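As a sketch of the pre-staging step for a Linux deployment, the script below generates one PXELINUX configuration file per cartridge node from a MAC address list exported from the iLO CM. PXELINUX looks up files named 01-<mac-with-dashes>; the MAC addresses, paths, and kickstart URL are illustrative assumptions, and the serial console argument reflects the headless-environment settings mentioned above.

```python
# Hedged sketch: per-node PXELINUX configs keyed by MAC address.
from pathlib import Path

macs = ["8c:dc:d4:00:00:01", "8c:dc:d4:00:00:02"]   # assumed example MACs
tftp_root = Path("/var/lib/tftpboot/pxelinux.cfg")  # assumed TFTP layout
tftp_root.mkdir(parents=True, exist_ok=True)

for mac in macs:
    config_name = "01-" + mac.lower().replace(":", "-")
    (tftp_root / config_name).write_text(
        "default install\n"
        "label install\n"
        "  kernel vmlinuz\n"
        "  append initrd=initrd.img "
        "ks=http://deploy.example.net/node.ks "   # assumed kickstart URL
        "console=ttyS0,115200\n"                  # serial console: headless nodes
    )
```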

Microsoft System Center Configuration Manager integration (SCCM)

Figure 6-33 Microsoft System Center Configuration Manager integration (SCCM)
HPE Moonshot solutions integrate with Microsoft SCCM (shown in Figure 6-33), another option for the PXE deployment solution. SCCM helps to automate the deployment of a Windows OS to multiple nodes. It also allows integrators to create application packages and deploy those to nodes. Integrators can streamline the provisioning process by using scripts to add HPE Moonshot nodes to SCCM. HPE has published a guide to help integrators successfully navigate the deployment. See HPE Moonshot Integration with Microsoft System Center Configuration Manager (SCCM).

Summary
This chapter has explained how HPE Moonshot solutions provide compact, converged compute, storage, and networking that is tailored to the workload. You also learned general principles for architecting HPE Moonshot networking. And you examined the many options that customers have for managing Moonshot solutions. The next chapter teaches you how to design HPE Moonshot solutions for particular workloads.

Learning check
Review what you have learned by answering these questions. Then check your answers in Appendix A: Answers to Learning Checks.
1. When is an HPE Moonshot 180G Switch Module required for an HPE Moonshot 1500 Chassis?
a. Whenever the customer wants to install an HPE Moonshot 4QSFP+ Uplink Module
b. Whenever the chassis has a mixture of 10 GbE and 1 GbE cartridges
c. Whenever the customer wants to use both ports on a cartridge node
d. Whenever the chassis includes any cartridges with four processors

2. An architect plans to connect an HPE Moonshot chassis to data center switches as shown in Figure 6-34. How should the architect plan to configure the four 10 GbE ports to prevent a loop?

a. As a link aggregation that includes all four ports
b. As two link aggregations, each of which includes the two ports that connect to one of the data center switches
c. As four separate ports, with the two ports that connect to one switch assigned to one VLAN and the two ports that connect to the other switch assigned to another VLAN
d. As four separate ports, all of which are assigned to the same VLAN

Figure 6-34 Exhibit for learning check
3. How can administrators contact the VSP for an HPE Moonshot cartridge node?
a. Through the cartridge's serial port
b. Through the iLO CM CLI
c. At the cartridge node's iLO IP address
d. At the cartridge node's IP address on its first port

For answers, see Chapter 6 in Appendix A.

Chapter 7 HPE Moonshot Solutions for Particular Workloads
EXAM OBJECTIVES
• Position HPE Moonshot cartridges for the right use cases and workloads
• Create an implementation plan for the following solutions, including plans for the proper performance, scalability, and high availability:
– Big data and analytics solution
– Video processing solution
– Mobile workspace solution
– Web infrastructure solution

Assumed knowledge
Before reading this chapter, you should have a basic understanding of the following:
• Processors, including DDR3 and DDR4 memory, hard disk drives (HDDs), solid-state drives (SSDs), and RAID levels for storage volumes
• HPE ProLiant rack and blade servers and options for them, such as HPE Smart Array Controllers
• HPE BladeSystems, including interconnect modules and Virtual Connect (VC) modules
• Server management and maintenance, including experience with Integrated Lights-Out (iLO), Intelligent Provisioning, UEFI, HPE Insight Remote Support, HPE Insight Online, HPE Smart Update Manager (SUM), and HPE Insight Control server provisioning (ICsp)
• HPE OneView capabilities

Chapter topics
This chapter teaches you how to architect HPE Moonshot solutions for four use cases:
• Big data and analytics
• Video processing
• Mobile workspace
• Web infrastructure

Big data and analytics
This topic explains how to architect HPE solutions for big data and analytics.

HPE Big Data Reference Architecture

Figure 7-1 HPE Big Data Reference Architecture
As you learned in Chapter 3, the HPE Big Data Reference Architecture (illustrated in Figure 7-1) can improve flexibility and scalability, while enhancing performance, by separating compute nodes from storage nodes. You are then free to select nodes that are optimized for each role. In addition, you can control the balance of compute to storage nodes. Finally, you can allow multiple compute clusters running different applications to share the same Hadoop Distributed File System (HDFS) cluster, eliminating the sprawl of duplicated data. You will now learn in more detail how to design this architecture.

The storage solution: HPE Apollo 4200

Figure 7-2 The storage solution: HPE Apollo 4200
The HPE Apollo 4000 family provides the ideal solution for the storage nodes (data nodes), which run a file system such as HDFS to store and serve the data. These servers are purpose-built for big data and object storage. The Apollo 4200 System, shown in Figure 7-2, is typically the best choice for storage nodes when you divide the compute and storage nodes. This system supports up to two processors in the Intel Xeon E5-2600 v3 or v4 family, with a wide array of choices ranging from 4 to 22 cores. Because an HPE Moonshot System will provide the compute, you can choose a mid-range option. The reference architecture for Cloudera recommends a 10-core E5-2660 v3 processor. You also have many options for disks. You can choose either a large form factor (LFF) 4200 model, which supports up to 28 LFF disks, or a small form factor (SFF) model, which supports up to 54 disks. Table 7-1 indicates the server’s maximum storage capacity for various types of storage as of the publication of this ebook (always check the latest QuickSpecs for updates); a short sketch of the capacity arithmetic follows the table. SATA HDDs will typically work well for this solution. These disk drives optimize capacity over performance, in keeping with the storage nodes’ role of storing large amounts of data. The compute nodes, on the other hand, which operate on data and need to write and perhaps shuffle result files, use high-performance SSDs. The Apollo 4200 System supports two embedded Smart Array controllers, the HPE Flexible Smart Array P840ar and the HPE Dynamic Smart Array B140i. The P840ar provides many RAID levels (RAID 0, 1, 5, 6, 60, and ADM); 4 GB of flash-backed write cache (FBWC), used with the HPE SmartCache feature to enhance performance for writes; HPE SSD Smart Path, which enhances read performance on SSDs; and optional HPE Secure Encryption.

Table 7-1 Maximum local storage capacity for HPE Apollo 4200

Disk type | Protocol | Form factor | Maximum capacity when all disks are this type
HDD | SATA | SFF | 108 TB (48 + 6 rear x 2 TB)
HDD | SATA | LFF | 224 TB (24 + 4 rear x 8 TB)
HDD | SAS | SFF | 108 TB (48 + 6 rear x 2 TB)
HDD | SAS | LFF | 224 TB (24 + 4 rear x 8 TB)
SSD | SATA | SFF | 207.36 TB (48 + 6 rear x 3.84 TB)
SSD | SATA | LFF | 44.8 TB (24 + 4 rear x 1.6 TB)
SSD | SAS | SFF | 86.4 TB (48 + 6 rear x 1.8 TB)
SSD | SAS | LFF | 44.8 TB (24 + 4 rear x 1.6 TB)
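To make the table’s arithmetic explicit, here is a minimal sketch in Python. The bay counts and drive capacities are taken from Table 7-1; this is an illustration of the calculation, not a substitute for the QuickSpecs.

```python
# Maximum raw capacity is simply (front bays + rear bays) x per-drive capacity.
# Bay counts below come from Table 7-1; verify them against the QuickSpecs.

def max_capacity_tb(front_bays, rear_bays, drive_tb):
    """Return the maximum raw capacity in TB for one Apollo 4200."""
    return (front_bays + rear_bays) * drive_tb

# SFF model: 48 front + 6 rear bays; LFF model: 24 front + 4 rear bays.
print(max_capacity_tb(48, 6, 2.0))   # SFF SATA/SAS HDD -> 108.0 TB
print(max_capacity_tb(24, 4, 8.0))   # LFF SATA/SAS HDD -> 224.0 TB
print(max_capacity_tb(48, 6, 3.84))  # SFF SATA SSD     -> 207.36 TB
```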

The compute solution: HPE Moonshot cartridges

Figure 7-3 The compute solution: HPE Moonshot cartridges
HPE offers several Moonshot cartridges that are optimized for data processing and analytics applications such as YARN applications (see Figure 7-3). Each cartridge is tuned for a slightly different workload, as you can see in Table 7-2. The HPE ProLiant m710 and HPE m710p cartridges are optimized for data processing and analytics applications of many types. They are the recommended cartridges in HPE Reference Architectures for Cloudera and for Apache Spark. They also provide excellent performance for NoSQL databases such as HBase and Cassandra. In addition to their excellent processing power and memory, these cartridges have two 10 GbE ports that support Remote Direct Memory Access (RDMA) over Converged Ethernet (RoCE), which helps to reduce latency for communications between the cartridges and the storage. Both cartridges offer a similar set of capabilities; however, the HPE m710p has a more powerful CPU and GPU accelerator than the m710 does. For real-time analysis, choose the HPE m800 cartridges. These cartridges bring the premium performance of Digital Signal Processing (DSP) cores to the dense, efficient, and highly scalable Moonshot chassis. Real-time analysis demands not only high performance, but also high-bandwidth, low-latency data exchange between compute nodes. The m800 cartridges feature high-bandwidth links between the four nodes on the cartridge and also take advantage of the high-speed 2D torus connections within the Moonshot chassis. The HPE m300 and m350 provide a lower cost alternative for basic distributed analytic applications. The m300 provides more processing power per core than the m350, but the m350, with four nodes per cartridge, provides a higher density solution.

If the customer needs faster results for queries through in-memory analytics—that is, operating on datasets in memory rather than on disk—choose the HPE m400 cartridges. The m400 cartridges meet these needs with 64 GB RAM, the highest memory per processor of the HPE Moonshot cartridges. They also support 10 GbE connectivity.
Table 7-2 HPE Moonshot cartridges for big data analytics

Figure 7-4 Selecting local storage for compute nodes
Many of the cartridge node features are fixed, being tuned already for the workloads for which that cartridge is designed. As you see in Figure 7-4, you do have choices, though, for the amount of local storage. Most of the cartridges support SSDs, which provide high performance for the analytic applications. The m300 gives a choice between a high-capacity HDD (up to 1 TB) and a higher performance 240 GB SSD. For analytics, generally choose the SSD. SSDs provide the low-latency, high-speed I/O required for the intermediate result or shuffle files created by many Hadoop applications. For Spark applications, they provide high-speed I/O for any data that does not fit in memory. In either case, you should typically select the higher capacity SSDs to ensure that these files can fit on the SSDs. The performance offered by SSDs might be particularly important for I/O-bound jobs such as sorting, grouping, or transforming data. For a NoSQL server, consider the maximum dataset size for a table. All of the servers together should be able to hold this data in their local storage. Again, you should typically select the highest capacity SSD for this workload. Then you will not have to add more compute nodes simply to obtain more storage.

Three recommended designs

Figure 7-5 Three recommended designs
You have selected the HPE Moonshot cartridges and the HPE Apollo 4000 servers for the solution. Next you must scope out the rack. See the recommended designs in Figure 7-5. Architects for big data and analytics solutions often take storage capacity requirements as the starting point for planning. In the traditional architecture, in which the same servers provide compute and storage, compute provisioning is tied closely to the storage requirements. Generally, architects would select a server that provided one core per disk and then scope out the number of servers based on the storage requirements.

For customers who want a traditional balance of compute and storage, the HPE Big Data Reference Architecture offers a balanced rack design. This rack consists of three HPE Moonshot 1500 Chassis, each with 45 m710/m710p cartridges, for 135 cartridges (540 cores) total. (Note that you could adjust the cartridge type for a particular use case such as real-time processing, as you learned earlier.) The rack also has five HPE Apollo 4200 servers. The Apollo 4200 can support up to 224 TB (using 28x 8 TB SATA HDDs) for 1.12 PB per rack. However, some reference architectures call for the Apollo 4200 to use 4 TB SATA HDDs, which adds up to 112 TB per server and 560 TB per rack. The second configuration supports a lower storage capacity but the same I/O and compute power, making the rack a bit “hotter.” After you decide on the capacity for the rack, multiply out the number of required racks based on the customer’s capacity needs. Remember that the HPE big data solutions give you more flexibility in design; compute is no longer constrained by storage, but can scale at the right rate for the customer. Perhaps the customer needs fast or even real-time results for queries. Perhaps the customer has many applications with complex processing demands. Such a customer requires a “hot” solution with more compute power as compared to storage capacity. A hot rack includes seven HPE Moonshot 1500 Chassis (315 nodes with 1P cartridges or 1260 nodes with 4P cartridges) and one HPE Apollo 4200 (112 TB or 224 TB). Other customers might have a great deal of data, but place fewer computational demands on the data. For example, they might be focused more on data archival with limited analysis of stored data. Such a customer would have overprovisioned compute under the traditional model. You can help the customer lower total cost of ownership (TCO) by suggesting a “cold” rack design. This rack provides a great deal of storage, with seven HPE Apollo 4200s and a single HPE Moonshot 1500 Chassis (45 nodes with 1P cartridges or 180 nodes with 4P cartridges). You can also make your own custom mix, but follow the guideline of eight total components (Moonshot 1500 Chassis or Apollo 4200 Systems) per rack. For example, perhaps you have opted to provision the Apollo 4200 servers with 112 TB, but this makes a balanced rack a bit hotter than the customer needs. You could plan two full HPE Moonshot chassis for every six Apollo 4200 Systems.
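To make the rack math concrete, here is a rough sizing sketch in Python, assuming the balanced-rack guideline above (five Apollo 4200 servers per rack) and the per-server capacities just discussed; the 2 PB requirement is purely illustrative.

```python
import math

def balanced_racks_needed(required_tb, tb_per_apollo=224, apollos_per_rack=5):
    """Return the number of balanced racks needed for the raw capacity."""
    rack_capacity_tb = tb_per_apollo * apollos_per_rack
    return math.ceil(required_tb / rack_capacity_tb)

# Example: 2 PB of raw capacity.
print(balanced_racks_needed(2000))       # 8 TB HDDs (1120 TB/rack) -> 2 racks
print(balanced_racks_needed(2000, 112))  # 4 TB HDDs (560 TB/rack)  -> 4 racks
```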

Strategies for selecting a mix of compute and storage

Figure 7-6 Strategies for selecting a mix of compute and storage

If you are unsure of whether the customer needs a hot, cold, or balanced design, you can attempt to determine more precisely how many compute nodes are required, as shown in Figure 7-6. Discuss the number of data analysis application instances the customer expects to run at once and the maximum amount of time in which an application should finish executing a job, whether that is an hour, two hours, or more. Each HPE Moonshot chassis loaded with 45 m710/m710p cartridges can contribute a certain amount of processing power, memory, and local storage to running an application. For example, if the customer is using an MR2 application, HPE recommends up to 14 map tasks and 7 reduce tasks per m710/m710p cartridge. This means that a Moonshot chassis would support up to 628 concurrent map tasks (630 minus 2 for necessary management processes), and a rack of three chassis could support 1888 (1890 minus 2). (A sketch of this arithmetic follows Table 7-3.) These limits are designed to ensure that each map task or reduce task has enough memory for the default heap size (about 2 GB for map tasks and about 3 GB for reduce tasks, with reduce tasks being limited until map tasks are complete). The application runs much better when the heap can be loaded fully into memory rather than the server needing to swap data to and from the SSD. Note also that all together a chassis can hold 21.6/43.2 TB of data locally (m710p cartridges support the higher number); a rack of three chassis could hold 64.8/129.6 TB. Based on the dataset size and number of tasks involved, how long would it take a rack of three chassis to execute an average application job and a maximum job? Or if you need more than five Apollo 4200 Systems to meet the storage needs, how long would it take the multiple racks to run the applications? If the customer wants the application to run more quickly, you must make each rack “hotter.” For example, you might plan a rack of four Moonshot chassis to four Apollo 4200 Systems, increasing the compute-to-storage ratio from 0.6 to 1, nearly a factor of two. You would then need to plan more racks to deliver the required total storage capacity. You might also discuss which resource is most likely to cause a bottleneck for the type of application that the customer is using: CPU, memory, local storage, or network bandwidth. If the customer already has a solution in place, administrators can use that solution’s tools, such as Cloudera Manager, to monitor resource usage. Scope the solution to meet the needs for the bottleneck resource. Another way to think about the requirements is to consider whether the application is running CPU-bound or I/O-bound tasks. You learned about this strategy in Chapter 3. Table 7-3 reminds you of CPU-bound tasks versus I/O-bound ones. Also keep in mind that Impala, Spark, and Solr Search applications tend to be CPU bound. If most tasks are intense CPU-bound ones, you might want to make the rack a bit hotter. If most tasks are I/O bound, make sure that you plan for 10 GbE RoCE connections and sufficient storage nodes to handle the requests. You might, for example, plan more HPE Apollo 4200 Systems using lower than maximum capacity HDDs. If the application creates intermediate files, make sure to select the maximum size SSD for compute nodes.

Table 7-3 Examples of task types

CPU bound | I/O bound
Classification | Sorting
Clustering | Grouping
Complex data mining | Data transformation
Feature extraction |
Natural language processing |
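Here is the promised sketch of the task-capacity arithmetic, assuming the HPE guideline of up to 14 map and 7 reduce tasks per m710/m710p cartridge, with 2 map slots reserved for management processes as described above.

```python
def cluster_task_capacity(chassis, cartridges_per_chassis=45,
                          maps_per_cartridge=14, reduces_per_cartridge=7,
                          reserved_map_slots=2):
    """Return the concurrent MR2 task capacity for a compute cluster."""
    cartridges = chassis * cartridges_per_chassis
    return {
        "max_map_tasks": cartridges * maps_per_cartridge - reserved_map_slots,
        "max_reduce_tasks": cartridges * reduces_per_cartridge,
    }

print(cluster_task_capacity(1))  # one chassis   -> 628 map tasks
print(cluster_task_capacity(3))  # three chassis -> 1888 map tasks
```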

When planning an HBase or other NoSQL solution, consider the maximum dataset size for read or write queries. The solution performs much better when the dataset fits on the local SSDs (tens to hundreds of times better, based on HPE tests). How many cartridges are required for the full dataset to fit on the cartridges’ total SSD capacity? For example, if a maximum dataset is 20 TB, 45 cartridges with one 480 GB SSD each should meet the needs. For customers who want very fast results, you might scale out the solution a bit more so that more of the dataset fits in the memory (which increases performance 1.5 to 4 times compared to when data exceeds the memory but fits on the SSD, according to HPE tests). Reference architectures and tools provided by the application provider (shown in Table 7-4) can also provide insight in scoping the solution. A sizing sketch follows the table.

Table 7-4 HPE big data reference architectures

Solution | Reference architecture
HBase | HPE Verified Reference Architecture for running HBase on HPE BDRA
Cloudera | HPE Verified Reference Architecture BDRA and Cloudera Enterprise implementation
Hortonworks Data Platform | HPE Big Data Reference Architecture: Hortonworks Data Platform reference architecture implementation
DataStax Cassandra | DataStax Enterprise on HPE Moonshot System with HPE ProLiant m710 Server Cartridges
MapR | HPE Verified Reference Architecture BDRA and MapR Distribution implementation
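Here is the promised sizing sketch for the SSD-fit rule of thumb, a minimal illustration assuming one SSD per cartridge; the 20 TB dataset matches the example in the text.

```python
import math

def cartridges_for_dataset(dataset_gb, ssd_gb_per_cartridge=480):
    """Cartridges needed so the maximum dataset fits on aggregate local SSDs."""
    return math.ceil(dataset_gb / ssd_gb_per_cartridge)

# Example from the text: a 20 TB maximum dataset on 480 GB SSDs.
print(cartridges_for_dataset(20000))  # -> 42, so a 45-cartridge chassis fits it
```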

Reviewing the need for other solution components

Figure 7-7 Reviewing the need for other solution components

The same requirements for additional big data solution components, which you examined in Chapter 5, “HPE Apollo 4000 for Data-Driven Organizations,” hold true for the HPE big data architecture (see Figure 7-7). In this architecture, though, the compute nodes do not need to connect to the extract, transform, load (ETL) network, even if the solution does not use edge nodes. The storage nodes would handle ingesting data in that case.

Planning the networking connections

Figure 7-8 Planning the networking connections
Providing low-latency connections between the compute and the data nodes will help the applications to perform better. The m710 and m710p cartridges support 10 GbE with RoCE, so if you are using them, make sure to select the HPE 45XGc switch modules to ensure that the cartridges obtain the 10 Gbps speeds. For all cartridges, you should generally plan bonding or teaming the ports using load balancing so that each node can use both of its ports. If the customer wants to use Link Aggregation Control Protocol (LACP) mode, remember to plan stacking or Intelligent Resilient Fabric (IRF) links between the two switch modules. The compute nodes should all be in the same network, and the virtual local area network (VLAN) for that network can be applied on the top-of-rack (ToR) switches rather than on the switch module. Plan two HPE ToR switches, such as 5930 switches, combined in an IRF fabric to support nonblocking speeds between the HPE Apollo 4200 servers and cartridges in the rack’s multiple Moonshot chassis. Figure 7-8 shows an example in which the 45XGc switch modules connect together on one IRF link (make sure that network administrators implement multi-active detection [MAD] to avoid a split fabric). They then have six 40 GbE links for connecting to the 5930 IRF fabric with an LACP link aggregation. The 5930 IRF fabric also has two 10 GbE links to each HPE Apollo 4200 server. The 5930 IRF fabric will have at least twelve 40 GbE ports (six on each switch) available for uplinks. If the customer’s needs require you to plan multiple racks, the ToR switches can connect on these uplinks to an aggregation pair of 5930 switches in another IRF fabric.
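A quick arithmetic check of the design in Figure 7-8 can reveal the oversubscription ratio from the cartridge edge ports to the chassis uplinks. The link counts below are the ones described above; treat this as an illustrative sanity check rather than a formal capacity plan.

```python
def aggregate_gbps(links, gbps_per_link):
    """Return the aggregate bandwidth of a link group in Gbps."""
    return links * gbps_per_link

chassis_uplinks = aggregate_gbps(6, 40)      # 45XGc modules -> 5930 IRF: 240 Gbps
cartridge_edge = aggregate_gbps(45 * 2, 10)  # 45 cartridges x 2 ports: 900 Gbps
apollo_links = aggregate_gbps(2, 10)         # per Apollo 4200 server: 20 Gbps

# Oversubscription from cartridge edge ports to chassis uplinks:
print(cartridge_edge / chassis_uplinks)      # -> 3.75 (3.75:1)
```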

The network administrator will also need to plan how the management node will link to an external network and how edge nodes or storage nodes will link to the ETL network. The ToR switches could provide these links, using VLANs to isolate the networks from the cluster network. Remember also to include switches with 1 GbE edge ports in the design to provide iLO connections.

Guidelines for testing
The guidelines for testing this solution are similar to those for testing a solution that uses the traditional big data architecture. They are repeated here for your reference. Your proof of concept (POC) should match your design as closely as possible, including the HPE Moonshot 1500 Chassis loaded with the correct components, the HPE Apollo 4000 servers, and the HPE data center switches. The HPE Discovery Lab provides you with a secure environment for testing applications on an HPE Moonshot System. You can access the lab through a virtual private network (VPN) from any location. To learn more about the lab and to set up a time to use it, visit http://www8.hp.com/us/en/products/servers/proliant-server.html?compURI=1536877#.VrYR2tsrKM8. Before you run the test, it is also important that you tune the nodes to better support the application. Table 7-5 lists HPE reference architecture documents that explain the tuning guidelines. This tuning will ensure the best results from the test, and you should also recommend that the system integrator complete the same steps for the final solution so that it operates most efficiently.

Table 7-5 HPE big data reference architectures

Solution | Reference architecture
HBase | HPE Verified Reference Architecture for running HBase on HPE BDRA
Cloudera | HPE Verified Reference Architecture BDRA and Cloudera Enterprise implementation
Hortonworks Data Platform | HPE Big Data Reference Architecture: Hortonworks Data Platform reference architecture implementation
DataStax Cassandra | DataStax Enterprise on HPE Moonshot System with HPE ProLiant m710 Server Cartridges
MapR | HPE Verified Reference Architecture BDRA and MapR Distribution implementation

You are then ready to test. Benchmarking tools provide generic metrics—for example, the throughput for reads and writes to the HDFS cluster. Table 7-6 lists some benchmarking tools for big data and analytics, and a sketch of driving one of them from a test harness follows the table.

Table 7-6 Example benchmarking tools

Solution | Benchmarking tool | Description
NoSQL databases | Yahoo! Cloud Serving Benchmark (YCSB) | Tests throughput for read/write queries to the database
HDFS | TestDFSIO | Tests throughput and average I/O rate for reads/writes to HDFS
HDFS and MapReduce | TeraSort | Tests time for sorting data (large job)
HDFS and MapReduce | MRBench | Tests average time for completing many small jobs
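Here is the promised sketch of driving one of these benchmarks from a simple Python harness. The jar path is hypothetical—it varies by Hadoop distribution and version—and the TestDFSIO arguments shown are the commonly documented ones; adjust both for your cluster.

```python
import subprocess

# Hypothetical path: varies by Hadoop distribution and version.
JOBCLIENT_TESTS_JAR = ("/opt/hadoop/share/hadoop/mapreduce/"
                       "hadoop-mapreduce-client-jobclient-tests.jar")

def run_testdfsio(mode="-write", n_files=45, file_size="1GB"):
    """Run TestDFSIO and return its console output (throughput, I/O rate)."""
    cmd = ["hadoop", "jar", JOBCLIENT_TESTS_JAR, "TestDFSIO", mode,
           "-nrFiles", str(n_files), "-fileSize", file_size]
    return subprocess.run(cmd, capture_output=True, text=True).stdout

# Write phase first, then a read phase against the same files.
print(run_testdfsio("-write"))
print(run_testdfsio("-read"))
```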

Benchmarks might have a role to play in your testing, but you are more precisely attempting to determine how well your customer’s application runs. Plan several tests using the customer applications with datasets of various sizes, including one that meets or exceeds the customer’s maximum needs. You should also choose tests that place various demands on the solution, including worst-case scenario demands. For example, for a NoSQL test, you might run read-heavy tests and write-heavy tests, as well as balanced read-write tests. You should also test how the NoSQL solution handles a high degree of random I/O requests. After you run the test, determine whether the execution time and other metrics are acceptable or whether you need to adjust the solution. The application that you are testing might provide you with valuable metrics for this purpose. For example, Hortonworks Data Platform (HDP) uses Ambari to collect and expose metrics; the Cloudera Manager also tracks metrics. Table 7-7 gives examples of some metrics that you might examine as you test, and a sketch of evaluating them follows the table. You can find a complete list of Hadoop metrics at https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/Metrics.html.

Table 7-7 Example metrics

Application | Metric | Meaning
HBase | regionserver.Server.blockCacheEvictedCount | Number of blocks that had to be evicted from the block cache due to heap size constraints. If this stays at 0, all of your data fits completely into the HBase blockcache (stored in the cartridge node memory), which is the most desirable case. If you see values too far above 0, you need more compute nodes so that each can handle a smaller amount of data that fits in the memory.
HBase | regionserver.Server.blockCacheExpressHitPercent | The percentage of time that requests with the cache turned on hit the cache. Values under 100 mean that the hot data being processed cannot fit entirely into the blockcache. If the number is too far below 100, scale up the number of compute nodes.
HBase | regionserver.Server.storeFileSize | Aggregate size of the store files on disk. Make sure this value is similar on all region servers in order to properly balance the HBase load.
HBase | regionserver.Server.blockCacheFreeSize | Number of bytes that are free in the blockcache. This value indicates how much of the cache is used. It is a good indicator of whether your data is being “warmed” by moving it into cache, so a low value is good.
HBase | regionserver.Server.readRequestCount | The number of read requests received. You can use this metric to see how many requests the solution is handling.
HBase | regionserver.Server.writeRequestCount | The number of write requests received. You can use this metric to see how many requests the solution is handling.
HBase | regionserver.Server.flushQueueLength | Current depth of the memstore flush queue. This metric should stay about the same over time. If it increases, the node is falling behind with clearing memstores out to HDFS.
Any YARN application | QueueMetrics PendingMB, QueueMetrics PendingvCores | The current memory or CPU resource requests that are not yet scheduled. A high number might indicate that you need to scale out the number of cartridges so that they provide more memory or cores.
Any YARN application | QueueMetrics running_0, QueueMetrics running_60, QueueMetrics running_300, QueueMetrics running_1440 | The current number of applications whose elapsed time is less than 60 minutes, between 60 and 300 minutes, between 300 and 1440 minutes, and more than 1440 minutes. You can use these metrics to determine whether jobs are completing in the customer’s desired execution time.
Any YARN application | AppsSubmitted, AppsRunning, AppsPending, AppsCompleted | Number of applications that have been submitted to the resource manager for scheduling, that are running, that are waiting to be scheduled, and that are completed. You can use these metrics to determine whether the solution can handle the required number of jobs. For example, you can see how many applications are running when the number of applications pending begins to reach an unacceptable level.
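Here is the promised sketch of evaluating the HBase metrics above during a POC. The thresholds are illustrative assumptions, not HPE-published values, and the metrics dictionary stands in for values you would pull from Ambari, Cloudera Manager, or the Hadoop metrics API.

```python
def review_hbase_metrics(metrics):
    """Flag scaling concerns based on the Table 7-7 guidance (illustrative)."""
    findings = []
    if metrics.get("regionserver.Server.blockCacheEvictedCount", 0) > 0:
        findings.append("Blocks evicted from blockcache: consider more compute nodes.")
    if metrics.get("regionserver.Server.blockCacheExpressHitPercent", 100) < 100:
        findings.append("Cache hit rate under 100%: hot data does not fit in blockcache.")
    if metrics.get("regionserver.Server.flushQueueLength", 0) > 10:  # assumed threshold
        findings.append("Flush queue growing: node is falling behind flushing to HDFS.")
    return findings or ["No scaling concerns detected."]

sample = {"regionserver.Server.blockCacheExpressHitPercent": 97.5}
print(review_hbase_metrics(sample))
```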

Chapter 7—Activity 1
In this activity, you will design a big data solution using HPE Moonshot and HPE Apollo 4000 systems. In this plan, you will
• Design a solution to host big data analytics
• Architect the solution to meet the customer’s needs, including
– Providing enough storage capacity
– Meeting the compute needs
– Adding HBase
Scenario
The scenario for this activity is similar to that for the Chapter 5 activity. A retailer operates a chain of grocery stores throughout a region. The company has a great deal of data about inventory, customers, purchases, and so on. The company is just venturing into big data solutions and plans to deploy Cloudera Hadoop. The customer wants a more scalable and reliable way to store data. The customer also wants to start analyzing that data to make more informed decisions. For example, the customer hopes to learn more about the most loyal customers and the highest-spending customers so that marketing can make better decisions about how to brand the company. The retailer has a relatively small data center with traditional rack servers. The CIO has seen projects fail before due to outdated infrastructure. She wants to ensure that the new big data solution is a success and is pushing the purchase of servers specifically designed to meet the needs of such a solution. However, unlike the previous scenario, this customer is open to your suggestions about the best architecture for the company’s needs. The customer looks forward to adding more analytics applications as the company begins to reap the benefits of the insights into the data. The customer wants to maintain the ability to scale the solution and the flexibility to add more types of analytics applications.
Workload requirements
You have discussed the workload requirements with the customer and discovered that
• The customer requires 5.23 PB raw storage capacity (which you have calculated using the replication rate and other guidelines discussed in Chapter 5).
• The customer plans to use MapReduce2 applications to analyze data on a weekly basis. Currently, the customer has just a few standard queries that it will run each week, and the queries can take hours to complete.
• In the future, the customer plans to develop applications for faster queries.

Select HPE products
You will propose a solution based on the HPE Big Data Reference Architecture to this customer. Match an appropriate product to each role in the solution. You can use the same product more than once. You do not have to use every product.
Products
a. HPE DL360
b. HPE Apollo 4200
c. HPE m700 cartridge
d. HPE m710p cartridge

Solution role
1. Active and standby head nodes
2. MR2 worker nodes (compute nodes)
3. HDFS data nodes (storage nodes)
4. Management node
Also answer this question:
5. The customer wants to distribute data ingestion rather than use an edge node. Which products require dual-homed connections? Will you need to set up VLANs on the Moonshot switch modules?
Scope the storage requirements
Record your answers to these questions.
6. How many HPE Apollo 4200 Systems will you propose? More than one answer could be valid, but think about how you would justify your answer and what you would discuss with the customer to help you make your choice.
7. When a customer is deploying a new Hadoop solution, you should generally begin with a balanced rack design for the storage nodes and MR2 compute nodes. How many balanced racks are required to meet the storage needs?
Scope the compute nodes
8. How many Moonshot chassis will you deploy to fill the balanced racks?
9. Assume that you are proposing 960 GB SSDs for the HPE m710p cartridges. How much data can each HPE Moonshot chassis store locally? How much data can the compute node cluster store locally?
10. How many map tasks can the cluster run at once? How many reduce tasks can it run at once?
11. What would be reasons for you to adjust your design to have relatively more or fewer Moonshot chassis? What might you discuss with the customer?

Run tests
12. You have created a POC with your proposed servers, which you have set up with a solution as close to the customer’s as possible. What guidelines will you follow as you conduct tests?
Describe the benefits of the HPE Big Data Reference Architecture
13. Begin to outline a presentation for winning the CIO and other decision makers to your side using the HPE Big Data Reference Architecture. (You will learn more about tools that you can use, such as the HPE Alinean TCO/ROI Calculator, in Chapter 10.)
Add HBase
After the initial successful rollout of the Cloudera Hadoop solution, the customer wants to support faster analysis on smaller, more random datasets. The customer decides to add HBase. You need to scope a solution for the HBase Region Servers. You discuss requirements with the customer, who indicates that the dataset size is 100 TB.
14. Which Moonshot cartridges will you plan for the HBase Region Servers?
15. How many Moonshot cartridges will you plan for the HBase Region Servers? How many chassis?
You can check your answers by referring to Appendix B: Answers to Activities.

Video processing
You will next learn how to architect HPE Moonshot solutions to support video processing and delivery, as well as other types of content delivery.

Video transcoding and streaming
Various HPE Moonshot solutions are optimized for audio and video transcoding. Transcoding uses a codec to convert raw audio and video files, which contain analog signals, into a compressed digital stream. The codec defines how the signal is sampled, how color is encoded, which bit rate is used for the stream, and so on. Examples of video codecs include H.265, H.264, H.263, H.262, Microsoft WMV, and Google VP6. Customers who require video transcoding solutions include film companies, TV companies, and other creators of media content. Some content creators deliver the content themselves using over-the-top (OTT) streaming, which refers to any delivery of content, including audio and video, directly through the Internet rather than through a multi-system operator (MSO) such as a cable or broadcast company. Some streaming service companies focus on streaming content generated by others. You have probably streamed video from Netflix, Hulu, and Amazon—all examples of OTT streaming services—many times. Such companies also require transcoding solutions. Rather than host their own web services, a company might use a Content Delivery Network (CDN) solution. The CDN provider, typically located in a data center near a major Internet service provider (ISP), hosts the company’s web content and guarantees a level of availability and responsiveness for the web services. As multimedia components have become more common in web services, CDNs have also had to move into the realm of video processing and streaming.

Types of video transcoding

Figure 7-9 Types of video transcoding
You should learn a bit more about the various types of video transcoding, since your architecture decisions might differ based on the type. Figure 7-9 shows the architecture for a Harmonic transcoding solution (Harmonic is an HPE Moonshot partner). Other transcoding vendors have similar architectures.
Live transcoding
A live transcoding solution receives live content, which it must immediately transcode and package for streaming. The transcoding process is not a simple one-to-one conversion. A video streaming server must be able to serve a variety of clients with a variety of needs. Different web browsers use different codecs to present videos. One client might request the video at high resolution and another at lower resolution. The server might need to adjust the bit rate for streams, not just for different clients but also for the same client if the client’s connection speed changes. (OTT companies prefer to throttle the transmission speed just ahead of the viewer so that they do not pay the content provider for content that no one views.) The transcoding server must ensure that a properly formatted video stream is available for all of these clients. (Some refer to transrating for adjusting the rate, transsizing for adjusting the resolution, and transcoding for converting only the format; however, most people refer more generally to any alteration as transcoding.) The application might separate the transcoding and streaming roles, or the same server might play both. In any case, not only must the transcoding servers transcode and package multiple streams, they must do so in real time without the lags that cause users to leave negative comments and look for a new provider. The demands on the servers’ computational power can be intense.

File-based transcoding

File-based transcoding refers to non-time-sensitive transcoding of video files that are then stored in a video library rather than streamed immediately. For example, a TV company might have a transcoding farm to transcode files that will need to be streamed by a multi-screen video on demand (VOD) solution at a later date. Or a company might need to reformat its entire video library into a more efficient codec. Or a company might provide disaster recovery services for content creators and continuously receive new video files that it must transcode. In modern data centers, a transcoding farm works in parallel to transcode the video file, often into multiple formats with various bit rates and resolutions. The same farm can also package the file into a streaming format, ready for delivery to end users by a streaming server.

Transcoding types for which HPE Moonshot cartridges are tailored

Figure 7-10 Transcoding types for which HPE Moonshot cartridges are tailored
In order to meet the vast computational demands, video transcoding technology developers have taken several approaches, evolving beyond simply using the general-purpose central processing unit (CPU) of an x86 machine. Some developers program for custom application-specific integrated circuits (ASICs), which are hardware architectures specially designed for that application only. These ASICs can deliver excellent performance, but they require specialized hardware dedicated to the video transcoding application. Other developers are programming to make use of graphics processing units (GPUs). As you learned in a previous chapter, a GPU can accelerate easily parallelized processes, and video transcoding fits this description. Intel Quick Sync Video technology provides general transcoding and streaming features that use the Intel Iris Pro GPU. FEI is an open framework for enabling applications to use the Intel Iris Pro GPU. The software vendor can program within this framework to deliver their own customized, advanced video transcoding and streaming features. HPE Moonshot m710 and m710p cartridges, with their Iris Pro GPUs, are optimized for any video transcoding and streaming application that uses Quick Sync or FEI to make use of the GPU. In fact, based on HPE tests, they can increase performance up to 20 times per rack unit compared to traditional servers. They can also work for applications that use the CPU only. Although they do not provide these applications with as much extra power as they bring to Quick Sync- and FEI-based ones, they might increase performance up to 4.2 times per rack unit as compared to traditional servers. The Moonshot solutions are not intended for applications that require custom ASICs.

Both the m710 and m710p provide excellent performance, but the m710p gives a performance boost of about 20% beyond the m710, allowing it to support more video streams or to transcode files more quickly. Figure 7-10 provides examples of HPE partner independent software vendors (ISVs) that use Quick Sync or FEI. As you see, these vendors also have applications that use the CPU only. You should investigate the application that the customer needs HPE servers to support and determine whether it is designed to use GPUs. HPE Moonshot m800 cartridges provide four nodes, each with an ARM core and eight digital signal processing (DSP) cores, enabling them to handle transcoding on the CPU. Select these cartridges also for various forms of voice over IP (VoIP) transcoding and telecommunications use cases. Thomson Video Network is an example of a video transcoding and delivery company that partners with HPE and that has developed a reference architecture using HPE Moonshot m800 cartridges. You can visit this link to view applications, including video transcoding codecs, that can benefit from the m800 DSP cores: http://www.ti.com/lsds/ti/processors/dsp/applications.page. Table 7-8 provides more specific information about these cartridges.
Table 7-8 HPE Moonshot cartridges for transcoding

Scoping the number of required cartridges for video transcoding
How you scope the size of the solution depends on the intended workload. The sections below provide general guidelines that you can use as a starting point. Remember to test your solution, as described later in this section. Also, you should generally overprovision by 20%–25% to allow for expansion.
File-based transcoding
For file-based transcoding, you will follow guidelines similar to those you use for other parallelized, non-time-critical applications such as high-performance computing (HPC) and big data analytics. Consider how many minutes of files need to be transcoded each day and attempt to determine how large a transcoding farm is required to handle those files. Customers often measure file-based transcoding performance in real-time ratio, which is the duration of the source video divided by the transcoding time. For example, a transcoding farm with a 0.5 real-time ratio would be able to transcode a 90-minute video in three hours. The real-time ratio is affected not only by the total compute power and memory delivered by the solution but also by the complexity of the job. As mentioned earlier, the solution might need to transcode a video file into several different formats as part of the same job. Therefore, when discussing needs with the customer, make sure that you understand the type of conversion that they require. As a starting point for your estimates, you can refer to results from Harmonic, an HPE Moonshot partner for file-based transcoding solutions. Harmonic found that three m710 cartridges provided a real-time ratio of 1 for a transcoding job that output eight different files with different bit rates and resolutions. In other words, the three cartridges can transcode one hour of video in one hour. If the customer needed 24 hours of video transcoded a day (assuming 24-hour-a-day operation), the three cartridges could provide it. A fully loaded Moonshot 1500 chassis could transcode 360 hours of video a day, assuming near linear scaling.
Live transcoding and streaming
When the customer wants servers for live transcoding, you should scope the size of the solution based on the number of video streams that the solution must support. The precise number that a cartridge can support at any moment depends on a variety of factors such as the particular solution, the codecs, and so on. HPE has tested the m710p as supporting up to 136 HD streams per rack unit and the m710 as supporting up to 110 HD streams per rack unit. The HPE Moonshot 1500 chassis is 4.3U and has 45 cartridges, so each m710p cartridge supports up to 13 streams and each m710 up to 10.
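To illustrate the file-based sizing logic, here is a minimal sketch based on the Harmonic result just cited (three m710 cartridges yielding a real-time ratio of 1, so roughly 1/3 per cartridge for that specific job); the 120-hour daily workload and 25% headroom are illustrative assumptions.

```python
import math

def transcoding_cartridges(hours_per_day, ratio_per_cartridge=1/3,
                           headroom=0.25, hours_of_operation=24):
    """Cartridges needed to clear a daily transcoding backlog, with headroom."""
    required_ratio = hours_per_day / hours_of_operation
    cartridges = required_ratio / ratio_per_cartridge
    return math.ceil(cartridges * (1 + headroom))

# Example: 120 hours of source video per day with 24-hour operation.
print(transcoding_cartridges(120))  # -> 19 cartridges
```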

Identifying other CDN workloads for HPE Moonshot

Figure 7-11 Identifying other CDN workloads for HPE Moonshot
HPE Moonshot solutions’ ability to meet customers’ content delivery needs extends beyond video transcoding, as you see in the examples in Figure 7-11. The world of online gaming introduces its own complexities. Not only does the server need to deliver high-resolution graphics, it must do so in an interactive, highly responsive manner. When supporting one of the popular massively multiplayer online (MMO) games, the server’s processing power is taxed with the need to interact with multiple users, each with their own browser, connections, and capabilities. Online gaming companies need servers that can meet the requirements while also permitting easy scaling as the company adds subscribers. HPE m710p cartridges, with their top-of-the-line GPUs, are recommended for most gaming workloads. For customers with less intensive requirements, such as smaller numbers of players or lower resolution graphics, m700 cartridges can meet the needs. In either case, the HPE Moonshot System provides the extreme density and scalability that customers require. Extreme file transfer refers to the transfer of very large files of perhaps several terabytes. These are often files involved in the production of media content. For example, a media company might need to transfer digital video files from the place of production to a centralized data center. HPE m710p and m710 cartridges are optimized for these workloads, as well as for the video processing and streaming workloads. In fact, the same company might require both. As you learned, CDN providers deliver content on behalf of other companies. A CDN provider might need to deliver video, games, or extreme files, in which case, you should plan an HPE Moonshot solution for that provider much as you would for a company hosting its own delivery services. Sometimes, the CDN provider is focused on less taxing workloads, such as the delivery of more traditional web content. If you are designing a solution for this type of CDN, the m700 cartridges can provide a more cost-effective alternative to the m710 and m710p cartridges.

Planning storage

Figure 7-12 Planning storage
The video processing or other CDN solution often requires external storage. A file-based transcoding solution reads input files from this storage and outputs files to it. Typically, select SL servers or Apollo 4000 servers such as the ones that you would use for big data storage nodes, shown here in Figure 7-12.

Meeting the networking needs for file-based transcoding

Figure 7-13 Meeting the networking needs for file-based transcoding
As you learned in the previous chapter, the switch modules that you select depend primarily on the selected cartridges. In most cases, you will be using m710p or m710 cartridges, so you should select the HPE Moonshot 45XGc switches and either HPE Moonshot 16-SFP+ Uplink Modules or 4-QSFP+ Uplink Modules, based on the considerations covered in the previous chapter. A file-based transcoding solution generates significant traffic between the HPE Moonshot cartridges and storage. First, you should set up the solution in such a way that cartridges can use both of their adapters. Typically, you should combine the switch modules in an IRF fabric and set up LACP NIC bonding on the cartridge node adapters. These adapters support Remote Direct Memory Access (RDMA) over Converged Ethernet (RoCE) for low-latency communications with the storage, and together they provide 20 Gbps bandwidth. In a single-chassis design, the chassis 40 GbE uplinks can connect directly to storage. (The HPE reference architecture for Harmonic WFS Xpress calls for this design.) If multiple chassis must connect to the storage, you can add HPE 5930 ToR switches to the plan to aggregate links for multiple chassis. And if the solution must extend across racks, you can connect the ToR switches to aggregation-layer HPE 5930 switches or other HPE data center switches, as you see in Figure 7-13.

Meeting the networking needs for live transcoding and other content delivery

Figure 7-14 Meeting the networking needs for live transcoding and other content delivery
For live transcoding and streaming, as well as online gaming and content delivery networks (CDNs), you must plan a connection to shared storage, as well as a connection to data center switches for traffic to flow toward external clients. Because the streams are often destined for end users with limited bandwidth, the streaming files are typically much smaller than the original files being transcoded. A high-definition (HD) stream typically consumes between 5 Mbps and 20 Mbps. Therefore, even if an HPE Moonshot chassis is operating at full capacity and supporting 585 streams, only 11.7 Gbps is required. Two 40 GbE links, or even two 10 GbE links if the customer data center only supports 10 GbE, should be sufficient, as shown in Figure 7-14. In the latter case, though, you might want to plan for four 10 GbE links (two on each switch module) for failover situations. For gaming and other CDN workloads, discuss the traffic needs with the customer. However, their bandwidth needs will probably be less than those for HD video streaming. You might need to add more uplink bandwidth between the data center network and a Moonshot chassis that supports cartridges used for extreme file transfer.
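A quick worked check of the uplink arithmetic above, assuming a worst case of 20 Mbps per HD stream:

```python
def required_gbps(streams, mbps_per_stream=20):
    """Worst-case bandwidth demand for a number of concurrent HD streams."""
    return streams * mbps_per_stream / 1000

demand = required_gbps(585)   # fully loaded m710p chassis -> 11.7 Gbps
uplink = 2 * 10               # two 10 GbE uplinks -> 20 Gbps

print(f"Demand {demand} Gbps vs uplink {uplink} Gbps: "
      f"{'sufficient' if demand < uplink else 'add uplinks'}")
```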

Guidelines for testing video processing and content delivery use cases

You should test a cartridge with the customer’s application to verify that the cartridge can handle the desired workload under a variety of conditions. As mentioned before, you can perform these tests in the HPE Discovery Lab. Plan a mix of workloads, including worst-case scenarios, such as all HD streams for a video transcoding application. When you assess performance, you should be aiming for a particular metric that the customer has defined as the required level of performance. For example, for file-based transcoding, you might calculate the real-time ratio: the length of the video file divided by the time that transcoding takes. File-based transcoding, like HPC and big data analytic applications, is designed to operate in parallel across many servers, so it is important that you test on your planned number of cartridges. After you assess the real-time ratio provided by these cartridges, you can adjust the number of cartridges up or down as required. Of course, you might not necessarily adjust down. For example, if 45 cartridges provide a better real-time ratio than the customer requires, you might propose the faster performance as a benefit of your solution. For live transcoding, you can monitor CPU, memory, and networking usage as you add streams to a cartridge, stopping when one of these resources reaches near-maximum utilization. (Monitoring networking I/O is less important; CPU or memory is most likely to be the bottleneck.) You might also monitor from the client side because, in the end, it is the user’s experience that matters. The customer might have a particular metric related to the end user’s experience that the solution must provide. For example, you might need to monitor the wait-time-to-watch-time ratio. The streaming workloads isolate cartridges more than the file-based transcoding workloads do. Each cartridge streams to a number of users on its own. It is still often a good idea to test performance with a fully loaded chassis. Generally, though, cartridge performance scales linearly within a chassis. In other words, if one m710 handles 10 streams, a chassis should handle 450 streams. A monitoring sketch follows.
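Here is the promised monitoring sketch: an illustrative server-side loop using the third-party psutil package to sample CPU and memory while a load generator ramps up streams on a cartridge node. The 90% threshold is an assumption, not an HPE guideline.

```python
import psutil  # third-party package: pip install psutil

def monitor_until_saturated(threshold=90.0, interval_s=5):
    """Sample CPU and memory; return once either nears the threshold."""
    while True:
        cpu = psutil.cpu_percent(interval=interval_s)  # blocks for interval_s
        mem = psutil.virtual_memory().percent
        print(f"cpu={cpu:.0f}% mem={mem:.0f}%")
        if cpu >= threshold or mem >= threshold:
            return cpu, mem

# Run on the cartridge node while the load generator adds streams.
```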

Chapter 7—Activity 2
You will now complete an activity in which you will
• Design a video transcoding solution for a TV studio company
• Architect a solution to meet the customer’s needs, including
– Transcoding up to 300 hours per day
– Storing the video library
Scenario
Your sales partner has found an opportunity for selling HPE infrastructure to a TV studio. Recognizing that more and more viewers are turning to the Internet to watch TV, this studio is seeking to gain a competitive edge by allowing subscribers to stream their shows on demand. The company needs a file-based transcoding solution to convert analog video to a digital format and to package it for streaming. The company has selected Harmonic to provide the software solution, but tests have shown that the company’s existing general-purpose servers cannot handle the transcoding workload. You must architect a solution to meet the needs.
Workload
The file-based transcoding solution will consist of a Harmonic WFS controller that controls a transcoding farm of servers. These servers run ProMedia Carbon and ProMedia Xpress, which can convert video to many different formats. Separate servers in the farm package the transcoded video for multi-screen video on demand (VoD). The WFS solution manages distributing jobs to servers and allows for automated and batch processing. The solution will output video in several bit rates and resolutions using the H.264 video codec.
Requirements
The studio has several channels with 24-hour programming, which needs to be prepared for streaming in advance of the content’s broadcast date. The studio also has a large backlog of video to convert for on-demand viewing. Finally, the studio has a number of promotional clips and other content that must be prepared for streaming. Decision makers have determined that the transcoding solution must be able to package 300 hours per day. Because Harmonic WFS supports automation and batch jobs, a day will be considered 24 hours. (The servers can go down for scheduled outages, but normal operation will be 24 hours a day.) The company has an FTP server for providing the source files. The company also already has servers that meet the needs for the combined Harmonic Packager and Origin streaming servers. You must plan servers to host the transcoding components. You also need to plan a server or servers to support the video library to which the transcoding farm will output files. The company requires 200 TB of storage (including duplicated data).

Select cartridge models
Record your answers to these questions based on the scenario above.
1. Select an HPE Moonshot cartridge model for the transcoding farm. Explain your choice.
2. The solution also requires a server for the WFS controller, WFS manager, and SQL database roles. The controller must meet these specifications:
– Processor: Intel or AMD, 3.0 GHz (can include Turbo); quad-core preferred
– Memory: 12 GB or higher
You want to host the controller in the Moonshot chassis to conserve rack space and deliver a complete solution with Moonshot. Select a cartridge that meets these needs. You might have more than one option. Explain your choice.
Run tests
You are now developing a POC. The customer has told you that they want to separate the transcoding role from the packaging role. The packaging role is less intensive, and the customer tells you to provide two servers to play that role. But you need to test in order to determine how many cartridges are required for transcoding. You will begin by testing how long it takes one cartridge to transcode a file and one cartridge to package the file. (This section gives example test results for the purpose of the activity only. In the real world, results might vary based on the cartridge that you use and the customer’s workload.)
1. What information do you need to discuss with the customer to set up the test?
2. As you discussed needs, the customer told you that the Harmonic services must be installed on either Microsoft Windows 2008 or Microsoft Windows 2012 R2. Visit http://www8.hp.com/us/en/products/servers/management/operating-environments/os-supportmatrix.html and determine which OS is supported.
3. You have discussed the necessary information and set up the POC. You discover that the m710 cartridge running ProMedia Carbon and ProMedia Xpress can transcode one 60-minute video into all required output formats in 180 minutes. What is the real-time ratio?
4. What real-time ratio does the customer require?
5. How many cartridges should you provide to transcode files?
6. How many Moonshot chassis should you provide? Remember to include the cartridge for hosting the WFS controller.
Plan storage
Create a plan for how you will provide storage for the solution. Include the connections between the HPE Moonshot solution and the storage.
You can check your answers to the questions in this activity by referring to Appendix B: Answers to Activities.

Mobile workspace
In this topic, you will learn about designing HPE Moonshot solutions to support mobile workspace applications.

Mobile workspace use case

Figure 7-15 Mobile workspace use case
Millennials—who have grown up with constant, ready access to laptops, tablets, and smartphones—make up an increasingly large segment of the workforce. They—as well as many of their older coworkers—work most efficiently when they can access their work from any device, whether they are in their own cubicle, meeting with a coworker, or visiting a customer site. At the same time that employees want to work over the network on applications and projects that move with them from device to device, they also want to avoid the frustration of lags and poor performance. Many companies have realized that they can benefit from, rather than struggle against, the Bring Your Own Device (BYOD) trend. In a BYOD environment, users’ devices essentially become terminals for applications and services that are hosted in the data center. A BYOD environment can help the customer to save costs through investing less in managed devices. Users can move more freely through an open workspace, inviting collaboration and again helping the company save money through smaller space requirements. BYOD can also help to protect sensitive data, because the data is hosted in the data center. However, if employees are to gain the benefits of anywhere access, they must truly be able to use any applications anywhere, as Figure 7-15 suggests. The solution must be flexible enough to cover a wide range of applications. It must also give users the same high-quality experience that they expect at a traditional workstation across a variety of devices. Meeting these challenges can be complex, and the IT staff needs a solution that helps them to manage both users and devices easily without limiting their choices. The solution must also help to enforce the proper security and isolation, preventing sensitive corporate data from transferring to a user device improperly.

Different needs on different devices

Figure 7-16 Different needs on different devices
The BYOD or mobile workspace solution must adapt to the fact that users have different needs and perform different tasks on different devices, as shown in Figure 7-16. Users tend to create content on more traditional, larger devices such as laptops, traditional company-managed desktops dedicated to them, or company-managed devices shared with coworkers. On tablets, smartphones, and other small glanceable, pocketable, grab-and-go devices, users tend to consume content using apps.

Mobile workspace technologies

Figure 7-17 Mobile workspace technologies
Companies can choose mobile workspace solutions that meet the needs of the various ways their employees work, as you see in Figure 7-17. Employees such as bank tellers or call center operators tend to work with a few, primarily text-based applications. Session or application virtualization works well for these types of tasks. A remote virtualized application runs on a server in the data center. Users receive on-demand access to the application through a client. This client logs in with the server and accesses the remote application using a display protocol that lets the users interact with the application much as they would with an application installed locally. Virtualized applications can cross the consume-and-create spectrum presented earlier, and they can run on a wide array of devices, including smartphones, tablets, laptops, and desktops. Citrix, whose XenApp is a common example of a virtualized application solution, is an HPE Moonshot partner.

Many office workers spend most of their day working with email, spreadsheets, and word processors. These workers benefit from a virtual desktop infrastructure (VDI) solution. Like application virtualization, VDI enables users to interact with an environment running on a remote server. However, with VDI, this environment is a complete OS as opposed to one application. Data center servers host virtual machines (VMs), each of which is set up with the basic applications and tools that employees require. Employees then log in to their remote VM from whatever device they choose. Traditional VDI can deliver good performance, user experience, and a low cost per seat for employees who use simple office applications, but it falls short of meeting the needs for a large segment of employees. These employees require access to multiple applications that feature graphics and multimedia elements. For example, they might be web designers or software programmers. They might be sales professionals who need to participate in video conferences. The list goes on. All of these workers require applications that support robust web content, smooth video, and hardware-assisted graphics—applications that fall on the “create” side of the spectrum that you examined earlier. Traditional VDI cannot give these users the essential combination of CPU and graphics performance. A hosted physical desktop solution, such as that provided by Citrix XenDesktop or Leostream (HPE Moonshot partners), allows users to access a desktop that is hosted in the data center. As with VDI, the user logs in to the desktop remotely. However, the desktop is not a VM but rather a physical machine running just that user’s OS. Because the machine has exclusive access to the physical hardware, it can better run graphics-intensive applications. Hosted physical desktop solutions run on laptops and desktops, whether those belong to the user or to the company, and whether the physical device is dedicated to one user or shared. Sometimes the hosted physical desktop solution is called Hosted Desktop Infrastructure (HDI). Both VDI and HDI work well with HPE Thin Clients, clients designed to act as terminals for remote desktops without allowing data to leave the data center. As an alternative to hosted physical desktops for users who run highly demanding applications such as medical imaging, computer-aided design/computer-aided manufacturing (CAD/CAM), or oil and gas simulations, graphics-accelerated VDI can provide a good option.

Technologies for which HPE Moonshot is tailored

Figure 7-18 Technologies for which HPE Moonshot is tailored
Figure 7-18 highlights the two technologies for which HPE Moonshot solutions are tailored: physical hosted desktop and application virtualization. HPE offers ConvergedSystem solutions for supporting the other technologies, but this ebook does not cover those solutions. Figure 7-18 also shows how the relevant technologies compare in terms of scale, cost, security, and per-user compute power. As you see, the more secure the solution and the more compute power it provides per user, the greater the cost and the harder the solution is to scale. Hosted physical desktop provides good security and user isolation, as well as significant per-user compute power. Application virtualization offers the best scalability but less security and lower performance because it does not isolate individual users or dedicate resources to them.

Selecting the correct HPE Moonshot cartridges

Figure 7-19 Selecting the correct HPE Moonshot cartridges
For a physical hosted desktop solution, such as a Citrix XenDesktop solution or a Leostream solution with Connection Broker, select HPE ProLiant m710/m710p or m700 cartridges. Often, customers' physical hosted desktop environments use the CPU for all applications, including rich media ones, leading to slow performance and a poor user experience. Both m700 and m710 cartridges provide GPUs, dramatically improving performance for users who need to run media-rich applications such as video conferencing or design applications, as you see in Figure 7-19. The m710/m710p provides more power per user and 10 GbE ports. The four-node m700, on the other hand, provides a higher density solution and 1 GbE ports.
The HPE ProLiant m710/m710p and m700 cartridges are also optimized for application virtualization solutions such as Citrix XenApp. Once again, the m710 cartridges provide more power, as well as higher speed connectivity. The m710p provides more power still, particularly for GPU-accelerated workloads. See Table 7-9 for more specific information about these cartridges.
Unlike the big data and video processing workloads, the mobile workspace workloads do not typically call for high-capacity local storage or connectivity to external storage. You could add a small SSD to the proposal to hold the image for a local boot, as well as to provide some local storage for users in a hosted physical desktop solution.
Table 7-9 HPE Moonshot cartridges for mobile workspace solutions

Scoping the number of required cartridges for mobile workspace applications

Figure 7-20 Scoping the number of required cartridges for mobile workspace applications
You should find it fairly straightforward to scope the number of cartridges required for a physical hosted desktop solution. (Figure 7-20 provides a summary of the guidelines.) The intent of such a solution is to provide one machine per user, so you simply need to know the number of users who require desktops—which the customer should be able to tell you. Discuss with the customer decision makers whether they have included room for expansion, new hires, and so on in their estimate. Remember that the m710/m710p has one node, so you should plan one cartridge per user. The m700, with its four nodes, can support four users per cartridge. Recommend providing about 20%–25% more cartridges than are currently required to allow for future growth.
To plan an HPE Moonshot solution to support application virtualization, you need to know the maximum number of users who will concurrently access the application, as well as the types of applications. The customer should be able to give you the maximum number of concurrent users based on user surveys. You should discuss the application types with the customer and classify them as either normal applications or media-rich ones.
Citrix and HPE have tested HPE m710 cartridges running Citrix XenApp. Based on these tests, one m710 cartridge can support about 50 users who are using applications such as word processors and spreadsheets. It can support about 40 users who are running more media-rich applications. The chassis performance scales quite linearly, so 45 cartridges can support about 2250 normal users or 1800 rich-media users. Media-rich applications include Adobe Photoshop, CAD applications, and other applications that render 2D and 3D graphics. The m700, m710, and m710p all provide GPUs, but the m710 and m710p provide progressively more power. The m710/m710p cartridge GPU supports OpenGL 4.2. Visit https://en.wikipedia.org/wiki/List_of_OpenGL_programs for a list of more applications that use OpenGL. As you see, the GPU enables the Moonshot cartridge to support almost as many rich-media users as normal users—differentiating it from other servers and making Moonshot a great fit for companies with rich-media users.
Use these values to begin planning the number of cartridges for your customer. Remember, though, the importance of testing with your customer's precise applications and requirements. Also remember the best practice of leaving room for 20%–25% growth. For example, suppose your customer needs to support 4000 rich-media users. You should plan for 5000 users, which at about 40 users per cartridge comes to 125 m710 cartridges, nearly filling three Moonshot 1500 Chassis. You should then test your solution.
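The scoping arithmetic above is simple enough to capture in a short script. The following is a minimal sketch, not an HPE tool: the helper function is hypothetical, the 45-cartridge chassis capacity and the growth guideline come from this section, and the per-cartridge user counts are rule-of-thumb values that you should replace with figures validated in your own testing.

```python
import math

CARTRIDGES_PER_CHASSIS = 45  # HPE Moonshot 1500 Chassis capacity

def scope_cartridges(users, users_per_cartridge, growth=0.25):
    """Estimate cartridges and chassis for an application virtualization plan.

    users_per_cartridge comes from testing; the rough guideline for an m710
    running Citrix XenApp is ~50 normal users or ~40 rich-media users.
    """
    planned_users = math.ceil(users * (1 + growth))  # leave room for growth
    cartridges = math.ceil(planned_users / users_per_cartridge)
    chassis = math.ceil(cartridges / CARTRIDGES_PER_CHASSIS)
    return planned_users, cartridges, chassis

# Worked example from the text: 4000 rich-media users at ~40 users per m710
print(scope_cartridges(4000, 40))  # (5000, 125, 3)
```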

Planning for the solution infrastructure

Figure 7-21 Planning for the solution infrastructure
The hosted physical desktop or application virtualization solution typically includes a few additional servers beyond the hosted desktops or application servers themselves. These servers help to manage the solution, set up the sessions, and so on. The example in Figure 7-21 shows that for Citrix solutions, these servers might include a controller; Provisioning Services (PVS), which deploys OS images; XenMobile, which manages mobile devices and their applications; and NetScaler, a gateway that helps to optimize application delivery. For Leostream, these servers might include load balancers and a Connection Broker cluster. The customer should be able to provide you with a list. Remember to plan cartridges to support these additional servers. The processing demands on these servers are typically less intense, and they might be able to run as VMs. You can add one or two HPE m300 cartridges to one of the planned chassis to host these VMs.

Planning the network connections

Figure 7-22 Planning the network connections
For mobile workspace solutions, network bandwidth is the resource least likely to cause a bottleneck, whether you are using cartridges that support 10 GbE or 1 GbE connections. Even if users are running very graphics-intensive workloads, their network bandwidth requirements should not exceed about 30 Mbps. Therefore, an m700 cartridge can easily support its four users on its 1 GbE ports, and a 6-SFP+ Uplink Module should be able to provide more than enough bandwidth for all 45 cartridges, even if you only use some of its ports. A cartridge supporting application virtualization for many very media-rich users might need a bit more than 1 Gbps of bandwidth. You might want to set up both 1 GbE ports on m700 cartridges for load balancing. The m710/m710p cartridges, with their 10 GbE ports, should not have difficulty meeting the requirements (see Figure 7-22).
In this type of solution, most of the traffic flows between the cartridges and the external network that connects to users. Work with the network architect to plan the uplink bandwidth. For example, for a chassis with m700 cartridges, you might plan to use two links on each 6-SFP+ Uplink Module to stack the switches and the other four for uplinks. For m710/m710p cartridges, you might use all four ports on a 4-QSFP+ Uplink Module, or, if you are using two switch modules, you could use two ports on each module for IRF and the other two on each module for uplinks.
Remember to discuss the customer's availability requirements. If the customer wants to provide link redundancy for each cartridge node, you must include two switch modules and accompanying uplink modules, even if they are not required for bandwidth.
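As a quick sanity check on these guidelines, you can compare the aggregate per-user demand against the planned uplink capacity. The sketch below is a hypothetical planning aid rather than an HPE sizing tool; it assumes the roughly 30 Mbps per user figure cited above and the example of four 10 GbE uplinks on a 6-SFP+ Uplink Module.

```python
def uplink_utilization(users, mbps_per_user, uplink_gbps):
    """Return the aggregate demand in Gbps and its ratio to uplink capacity."""
    demand_gbps = users * mbps_per_user / 1000.0
    return demand_gbps, demand_gbps / uplink_gbps

# 45 m700 cartridges x 4 nodes = 180 users at ~30 Mbps each,
# against four 10GbE uplinks (40 Gbps total)
demand, ratio = uplink_utilization(180, 30, 4 * 10)
print(f"{demand:.1f} Gbps demand, {ratio:.1%} of uplink capacity")  # ~5.4 Gbps, ~13.5%
```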

Guidelines for testing
The HPE Discovery Lab remains a great resource for running tests and demonstrating the excellent solution performance to your customer. Set up a chassis in the intended configuration and plan your tests.
For a hosted physical desktop solution, you are primarily testing your selected cartridge. Load the cartridge with the applications that the customer has told you users require. Then log in to the hosted desktop and run each application, attempting to use the application as much like a user as possible. Run multiple applications at once, as users tend to do. Use the OS tools to monitor the CPU and memory. Also assess your experience: Is the application responsive? Does it lag? If you detect any issues or overutilization of the CPU or GPU, you might propose an HPE ProLiant m710 or even m710p rather than an m700.
To test an application virtualization solution, you need to simulate many users running the virtualized application. Select a tool such as Login VSI for this purpose. Begin with one cartridge and simulate a certain number of users, such as 10 or 20. Monitor metrics such as response time—the ultimate indicator of whether the solution is functioning well. The customer might give you a particular response time that cannot be exceeded—for example, three seconds. You can also track other metrics, including processor utilization, memory utilization, network utilization, disk throughput, and GPU utilization (for rich application delivery), to get a sense of how many resources the application instances demand. Continue to add users until one of the resources (most likely CPU) reaches near full utilization or the response time becomes too long. You can see that a resource is becoming a bottleneck because its utilization will stop rising and plateau. Record the number of users at this point as the maximum number of users that the cartridge can support. You can then scope out the number of cartridges required. You should also test with multiple cartridges to demonstrate how the solution scales.
As you perform the tests, you should also monitor power usage in the iLO CM. One of the primary benefits of an HPE Moonshot solution is that it helps to reduce power and cooling costs. Therefore, the relatively low amount of power consumed as the solution delivers high performance can provide a compelling selling point for your proposal. This recommendation also holds for other types of Moonshot solutions.
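One way to formalize the stopping rule described above is to record (user count, response time) pairs as the load tool ramps up and then take the largest user count that still meets the customer's response-time ceiling. A minimal sketch with made-up sample data (the function and figures are illustrative, not from an actual test):

```python
def max_supported_users(samples, max_response_s):
    """samples: list of (users, avg_response_seconds) from a ramp-up test."""
    passing = [users for users, resp in samples if resp <= max_response_s]
    return max(passing) if passing else 0

# Hypothetical ramp-up results measured against a 3-second ceiling
samples = [(10, 0.8), (20, 1.1), (30, 1.6), (40, 2.4), (50, 3.8)]
print(max_supported_users(samples, 3.0))  # 40
```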

Chapter 7—Activity 3
In this activity, you will plan an HPE Moonshot solution to provide application virtualization to your customer. To create this plan, you will
• Gather information about the customer
• Plan a mobile workspace solution for the customer with:
– A rapidly growing workforce
– Users who want to use a mix of devices

Scenario
A company that designs specialized hardware for electronics has a rapidly expanding workforce, particularly in the design department, but also in sales and other departments. The company is outgrowing its floor space. At the same time, the primarily young and tech-savvy workforce prefers to use a mix of tablets and traditional desktops to do their work.
One ongoing challenge is that developers working on the go could lose data. The latest debacle involved a laptop that an employee left behind on the metro in Washington, D.C. Weeks of design work was lost, costing the company hundreds of thousands of dollars. Worse, if an employee lost clients' private data, the company could fail an audit and be fined.
To avoid disasters like this and to boost productivity, the CFO and CISO want to shift the model and provide employees with thin clients that act as terminals for virtualized applications. But the CIO is cautious. A pilot program with virtual desktop solutions has been plagued with issues. The IT team has struggled with capacity and sizing as well as with managing multiple vendor relationships. Support tickets have resulted in finger pointing, and plans for moving past the pilot change every time the stakeholders discuss them.
Record your answers as you review the questions related to the tasks below. You can check your answers by referring to Appendix B: Answers to Activities.
Gather information
1. What should you discuss with the customer to gain a better idea about the workload and the solution requirements?
2. What should you discuss to set the CIO's mind at ease about the solution?
After your further discussions, you have collected the information below. The company has decided to use Citrix XenApp for the application virtualization solution. At this point, the pooled resources include
• Microsoft Office Professional Plus 2013, which includes applications such as Word, Excel, PowerPoint, OneNote, Outlook, Publisher, Access, and Skype for Business

• Adobe Reader XI
• Doro PDF printer
• SolidWorks eDrawings Viewer, a 3-D design application that supports OpenGL in hardware as well as in software
• Internet Explorer
• Adobe Photoshop
The solution must support 4000 users:
• 2000 designers, who use Microsoft Office Professional Plus 2013, including Access and Skype for Business, eDrawings Viewer, and Adobe Photoshop
• 1000 sales professionals, who use Microsoft Office Professional Plus 2013, including Skype for Business for video conferences
• 750 marketing employees, who use Microsoft Office Professional Plus 2013 and Adobe Photoshop
• 250 HR, accounting, and reception employees, who use Microsoft Office Professional Plus 2013
The company does not require redundancy for individual application server links. However, link redundancy is required for the solution as a whole. Because many employees will use design applications, which involve large files, the company would prefer 10 GbE for the servers. The chassis can connect to the data center network infrastructure on 40 GbE links; the customer would prefer to consolidate ports as much as possible. The customer network administrators tell you to ensure that uplinks have no more than 8:1 oversubscription.
The company plans to use hosted applications that are installed on the VDA servers locally. Each server will need at least 90 GB of local storage for the server image, for installed applications, and for temporary files while hosting sessions. The servers will also need to connect to a NAS, which the company already has set up.
Plan the solution
1. Which applications will benefit from GPU acceleration?
2. Which type of HPE Moonshot cartridge should you recommend for the XenApp Virtual Desktop Agent (VDA), which supports the user sessions?
3. What is your initial estimate of the number of cartridges required for the XenApp VDA servers?
4. Remember also that the company is growing. What percentage does HPE recommend adding to the solution to account for this growth? How many cartridges total should you recommend for the VDA servers?
5. The company also requires three servers to host VMs for the XenApp controller and NetScaler. Which cartridge provides the best choice for hosting these VMs?
a. m300
b. m700
c. m800

6. With the three additional cartridges, how many total cartridges have you planned? How many HPE Moonshot 1500 chassis are required?
Plan the networking
Based on the requirements that you gathered, plan the following:
• Switch modules
– Type of switch module
– Number of modules per chassis
• Uplink modules
– Type of uplink module
– Number of modules per chassis
• IRF or stacking
– Do you plan to use this feature?
– If you do, which modules will you combine, and how will you link the modules?
• Number of uplinks to connect
Plan provisioning
1. Server administrators explain that they plan to use WDS to provision cartridges with their Windows Server 2012 R2 images. They will then use Citrix Machine Creation Services (MCS) to deliver an image with the proper applications to the VDA servers. MCS creates a thin-provisioned clone of a master image, which the hypervisor on the VDA server hardware uses to create a VM for the VDA server. The alternative solution is Citrix Provisioning Server (PVS), which uses a dedicated server to deploy images to physical servers or VMs. Why might you recommend that the customer use PVS instead of MCS?
Run tests
1. You have set up a POC with your proposed cartridges and deployed the proper OS and customer applications. What guidelines will you follow as you conduct tests?

Web infrastructure
HPE Moonshot can deliver an all-in-one, scale-out web infrastructure solution. HPE Apollo 2000 solutions (another family of HPE density-optimized servers) can also meet the requirements for such a solution.

Web infrastructure hosting demands
Web services might not impose the same processing and memory requirements as the other workloads that you have examined in this chapter, and you and your customers are probably very familiar with designing solutions for them. However, the changing world has imposed new demands.

More and more users are living online—always connected through a smartphone. For many companies, the web presence is becoming an increasingly important revenue generator, whether by driving sales, subscriptions, advertisement revenue, social media presence, customer loyalty, brand awareness, or a mix of these purposes. To continue to enjoy these benefits, customers need to be able to scale out their web services as quickly as demand grows, and they must do so simply and cost effectively. Many customers are considering a public cloud web hosting solution to provide the on-demand scalability that they require. However, they hesitate to lose the control and security that come with having their own dedicated hardware.

HPE density-optimized solutions

Figure 7-23 HPE density-optimized solutions
HPE density-optimized solutions allow customers to balance these demands, as you see in Figure 7-23. They retain control of the infrastructure, but they can scale out efficiently, compressing infrastructure that used to require racks into one or a few chassis. If the customer wants a true cloud experience, you can even propose an HPE Helion CloudSystem solution to control the Moonshot solution. These solutions are discussed in Chapter 9, “Monitoring and Managing HPE Solutions.”

Selecting the correct HPE Moonshot cartridges for web infrastructure

Figure 7-24 Selecting the correct HPE Moonshot cartridges for web infrastructure
Use HPE ProLiant m300 or m350 cartridges for the web infrastructure (see Figure 7-24). The m300 supplies greater compute power and memory per node. However, the m350 also provides good performance for this type of workload, and it delivers higher density with four nodes per cartridge. Review the information in Table 7-10 for more details about these cartridges.

Table 7-10 HPE Moonshot cartridges for web infrastructure

| Cartridge | m300 | m350 |
| --- | --- | --- |
| Workload | Web infrastructure | Higher-density web infrastructure |
| CPU: Number of processors | 1 | 4 |
| CPU: Processor type | Intel® Atom™ Processor C2750 | Intel Atom Processor C2730 |
| CPU: Frequency per core | 2.4 GHz | 1.7 GHz |
| CPU: Cores per processor | 8 | 8 |
| GPU | None | None |
| Memory: DIMM type | 1600 MHz DDR3 UDIMM/SO-DIMM, ECC | 1600 MHz DDR3 SO-DIMM, ECC (4 DIMMs, 4 embedded DRAM) |
| Memory: Capacity | 32 GB (4x 8 GB) | 64 GB (8x 8 GB); 16 GB per processor |
| Network: Integrated NIC | 2x 1 GbE | 8x 1 GbE (2x 1 GbE per processor) |
| Network: Intra-cartridge | None | None |
| Network: Cartridge-to-cartridge | None | None |
| Storage: Local | One of: 500 GB SATA HDD, 1 TB SATA HDD, 240 GB SATA SSD, or 32/64 GB M.2 2242 SSD | Either 4x 32 GB or 4x 64 GB M.2 2230 SATA SSD |
| Storage: External capabilities | iSCSI software initiator | iSCSI software initiator |

TCO/ROI Solutions.
3. Click Go (see Figure 10-4).

Figure 10-4 TCO/ROI Solutions page
4. Under Create New Analysis, expand HPE Servers and select HPE Hyperscale Business Value Calculator v2.3.
5. Enter names for the company and the analysis.
6. Click Create a New Analysis (see Figure 10-5).

Figure 10-5 Create a New Analysis
7. Read through the tutorial if you like.
8. Click the Analysis Selection tab.
9. Select Pre-Sales – Prove Value (see Figure 10-6).

Figure 10-6 Analysis Selection tab
10. Click Proceed to Analysis.
11. Fill in the fields based on the scenario and your plan; refer back to Chapter 4—Activity 1. (Assume that testing indicated that your plan is adequate.) Use Dedicated Power and Cooling and a max kW per rack of 7 kW. Run the analysis for three years (refer to Figure 10-7).

When you specify the number of servers, remember that each ProLiant XL220a Gen8 tray has two servers. Specify the number of servers, not the number of trays.

Figure 10-7 HPE Hyperscale Business Value Calculator
12. Compare with SuperMicro, which is the competitor that the customer is considering. The SuperMicro SD-5038ML-H8TRF is also a one-processor server, so you should specify the same number of servers as for the HPE solution.
13. Scroll down to the results (see Figure 10-8). Note the high-level comparison, including the number of racks required, the number of cores provided, the number of watts consumed, and the TCO. Begin planning how you will use this information in your presentation to the CFO.

Figure 10-8 Results—Three Year Analysis
14. You can click the Configuration button under the ProLiant XL servers to adjust the plan if you like (see Figure 10-9). For example, you could change the support pack or the TOR switches, or you could add HPE Insight CMU to the plan.

Figure 10-9 Configuration tab
15. Click the Assumptions tab to see a breakdown of the assumed costs (see Figure 10-10). You can adjust these to reflect the customer's situation more closely. For example, you can adjust the cost of power.

Figure 10-10 Assumptions tab

16. Click the Financial Results tab to review details for the TCO comparison.
17. For which types of costs is the HPE solution more expensive? For which types of costs is it less expensive? Begin to plan how you will discuss the comparison with the CFO.
18. You can clear a check box for any type of cost to remove it from the comparison. Explore clearing various check boxes. Then select them all again.
19. Click the graphic comparing the TCO for the solutions to enlarge it (see Figure 10-11).

Figure 10-11 TCO graph
20. Use this graphical representation to draft an explanation for the CFO about the difference in TCO of an HPE Apollo 6000 solution after just three years.
21. The automotive company might be interested in an HPE Financial Services option. Click the graphic to enlarge it (see Figure 10-12).

Figure 10-12 HPE Financial Services option
22. Use this graph to help you draft an explanation of how this service can help to decrease the impact on the customer's cash flow for a single year.
23. Click Create a Report.
24. Use the Word template to create the report (see Figure 10-13).

Figure 10-13 Create a Report
25. You can open the report and edit it. In the real world, be sure to edit the report to customize the results for your customer.

Talk with the CFO
Use the Alinean tool report and the notes from the previous task to answer the following questions.
1. How does the HPE Apollo 6000 solution help to support the company's environmental initiatives?
2. Based on what you learned earlier about the benefits of HPE Apollo 6000 and ProLiant XL220a systems, how will the HPE solution deliver a favorable ROI?
3. Will you recommend the solution to the CFO?
4. Prepare a pitch to convince the CFO of the benefits of this solution.

Summary
Financial metrics tell a story in numbers about a company's cash flow and financial health. Each financial metric provides different information about the company and reveals a characteristic of the bigger picture that might not be apparent from reviewing individual financial figures. It is also invaluable to evaluate these metrics over time, including across economic downturns, and relative to competitors.

Certain terms are regularly used when working with financial statements. Knowing the definitions of these terms is key to extracting the information most helpful to you when working with a customer. Financial statements show the profitability of the business and its financial position at a specified date. This information helps you understand how to position an IT solution relative to the customer's budget.
HPE provides tools that help you analyze a company's financial position, including the following:
• HPE Converged Infrastructure Business Value Calculator
• HPE Hyperscale Business Value Calculator
• HPE Client Virtualization ROI Calculator
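These calculators build on standard financial formulas. Net present value (NPV), for example, discounts a project's expected cash inflows and outflows at the company's desired rate of return; a positive result indicates a desirable investment. A minimal sketch with hypothetical figures (the numbers below are illustrative only):

```python
def npv(rate, cashflows):
    """cashflows[0] is the initial outlay (negative); one entry per year."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

# Hypothetical: a $100,000 server investment returning $45,000 per year
# for three years, discounted at a 10% desired rate of return
print(round(npv(0.10, [-100_000, 45_000, 45_000, 45_000]), 2))  # 11908.34
```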

Learning check
Review what you have learned by answering these questions. Then check your answers in Appendix A: Answers to Learning Checks.
1. Which financial term describes the periodic rental payment, expressed as a percentage (or decimal equivalent) of equipment cost? It is used to calculate payments, given the cost of equipment (for example, 0.0240 on equipment cost of $10,000 requires a monthly payment of 0.0240 × $10,000 = $240).
a. Internal rate of return (IRR)
b. Net present value (NPV)
c. Lease rate factor (LRF)
d. Cumulative average growth rate (CAGR)

2. Which financial term describes the type of business outlay that is reflected on the company's balance sheet as assets? It creates a depreciation expense on the income statement for each year of the asset's depreciable life. This depreciation expense lowers reported income (profit), thereby creating a tax savings for each of these years.
a. Cumulative average growth rate (CAGR)
b. Operating expenditure (OPEX)
c. Net present value (NPV)
d. Capital expenditure (CAPEX)

3. Which type of business outlay addresses spending on predictable, repeatable costs for items or services that are not registered as assets and that are not depreciated? It impacts reported profit and taxes on earnings only in the single reporting period it is incurred.
a. Internal rate of return (IRR)
b. Total cost of acquisition (TCA)
c. Operating expenditure (OPEX)
d. Gross Profit (GP)

4. What is the monetary amount by which an asset is valued in business records, a figure not necessarily identical to the amount the asset could bring on the open market? It could also be used to designate the sum of the assets on a portfolio or within a company.

a. Net investment value (NIV)
b. Net book value (NBV)
c. Fair market value (FMV)
d. Total cost of acquisition (TCA)

For answers, see Chapter 10 in Appendix A.

Chapter 11 Practice Exam
Introduction
This practice exam is designed to test your readiness for the HPE0-S22 exam. The HPE0-S22 exam tests candidates' knowledge and skills in architecting advanced HPE server products and solutions. Topics covered in this exam include advanced server architectures and associated technologies, as well as their functions, features, and benefits. Additional topics include analyzing the server market, positioning HPE server solutions to customers, demonstrating server-related business acumen, and explaining how the HPE Transformation Areas relate to HPE server products and solutions.

Exam details
The following are details about the exam:
• Exam ID: HPE0-S22
• Number of items: 60
• Item types: Multiple choice (single response); multiple choice (multiple responses); matching
• Exam time: 90 minutes
• Passing score: 70%

HPE0-S22 testing objectives
The exam is designed to validate that candidates can successfully meet the following objectives. The percentage next to each of the main objectives indicates how the objective is weighted in the exam.

15% Foundational server architectures and technologies
• Determine optimal processors for specific use cases and operational workloads.
• Determine interconnect (networking, storage) technologies based on customer/solution requirements.
• Explain the benefits of APIs.

25% Functions, features, and benefits of HPE server products and solutions
• Differentiate and position the HPE server product offerings, architectures, and options.
• Explain the functions and benefits of HPE health and fault technologies.
• Compare and contrast management tools.
• Given a customer environment scenario, recommend and substantiate which HPE management tools optimize administrative operations.

20% Analyzing the server market and positioning HPE server solutions to customers
• Determine an approach to address customers' business requirements (TCO, ROI, IRR, NPV, TCA, CapEx, OpEx, HPE Financial Services, and so forth).
• Explain how the four HPE Transformation Areas relate to given server solutions.

40% Planning and designing HPE server solutions
• Given a scenario with changed customer requirements, recommend modifications to the implementation plan.
• Given a customer's storage infrastructure (for example, iSCSI, Fibre, NAS, DAS), determine an appropriate configuration for server deployment.
• Given a customer's networking infrastructure, determine an appropriate configuration for server deployment.
• Determine the customer's internal/external storage capacity and performance requirements.
• Given a scenario, determine the customer's IT maturity and recommend next steps.
• Given an anticipated performance bottleneck, determine an appropriate design solution.

Practice exam questions
As you take this practice exam, remember to read all the choices carefully because there might be more than one correct answer. Answers and explanations are provided at the end of this chapter.
1. A server architect is planning an HPE Apollo 6000 solution for a customer's weather modeling high performance computing (HPC) application. For the initial design, the architect needs to select the proper compute tray for the workload. Which question helps the architect determine whether the ProLiant XL250a could be a better fit than the XL230a?
a. Does the customer's application support GPU acceleration?
b. Does the application have high memory requirements?
c. Which GPU vendor does the customer prefer?
d. Does the customer have a preference for InfiniBand or Ethernet fabrics?

2. An architect is proposing an HPE BladeSystem solution with Virtual Connect FlexFabric-20/40 F8 Modules. The customer needs an external Fibre Channel (FC) storage solution for the blade servers. The solution must be simple to deploy and easy to manage, and the customer also wants to reduce equipment. The architect is proposing an HPE 3PAR StoreServ System. What should the architect propose for connecting the BladeSystem to the storage?
a. Connecting the Virtual Connect modules to Ethernet switches that support iSCSI and connect to the StoreServ System
b. Adding FC SAN switch modules to the BladeSystem and setting up a SAN to connect to the StoreServ System
c. Directly connecting the Virtual Connect modules to the StoreServ System
d. Adding Virtual Connect 16Gb 24-port Fibre Channel Modules to the BladeSystem and directly connecting them to the StoreServ System

3. An architect is proposing an HPE Moonshot 1500 System that has 40 m710p cartridges and five m300 cartridges. The architect is now planning the switch module solution. The customer requires
– The highest possible bandwidth for the m710p cartridges
– Advanced data center technologies such as TRILL
– LACP NIC bonding on the cartridge ports
Which switch solution should the architect propose?
a. One Moonshot 45Gc switch and one Moonshot 45XGc switch
b. Two Moonshot 45XG switches
c. One Moonshot 45G switch and one Moonshot 45XG switch
d. Two Moonshot 45XGc switches

4. An architect is proposing several HPE c7000 Blade Enclosures that are managed by HPE OneView. The customer has developed an in-house management solution for inventorying and tracking assets. The customer wonders how the new HPE solution will fit with its existing management solution. What should the architect explain?
a. The customer should program the in-house solution to receive information about the new servers using the HPE OneView REST API.
b. The customer should import the SNMP templates used by OneView into its in-house management solution.
c. The customer should replace the in-house solution with HPE OneView, which provides all the capabilities that the customer requires.
d. The customer should program the in-house solution to use SOAP to communicate directly with blade servers.

5. An architect is planning an HPE Moonshot System for a customer who needs a hosted desktop solution. The hosted desktops are for graphic designers who run a variety of rich media applications. Decision makers have emphasized that they want the solution that provides the best performance and user experience. Which cartridges should the architect propose?
a. m400 cartridges
b. m700 cartridges
c. m710p cartridges
d. m800 cartridges

6. An architect is designing an HPE Apollo 2000 solution for a customer and needs to choose the chassis. What is one reason to propose the r2800 chassis versus the r2600 chassis?
a. The customer needs a higher density of servers per chassis.
b. The customer needs flexibility in allocating local drives to servers.
c. The customer needs a higher density of local drives per chassis.
d. The customer needs the ability to aggregate server connections.

7. A customer needs a new server solution to host its transactional database and its business intelligence application, both of which are growing in size. The database is licensed per processor core. Which solution should the architect propose to help both improve performance and reduce licensing costs?
a. HPE Integrity Superdome X with an nPartition scoped to the size of each application
b. HPE Integrity Superdome X with a VMware ESXi virtual machine to host each application
c. HPE Moonshot with multiple cartridges scaled out to meet the needs of each application
d. HPE Moonshot with a dedicated m710p cartridge per application

8. A customer has HPE BladeSystem and HPE 3PAR StoreServ solutions that are managed by HPE OneView 2.x. VMware ESXi hosts are deployed on the blade servers, which use StoreServ for storage volumes. What is one benefit of HPE OneView for vCenter for this customer?
a. Integrated management of the StoreServ solutions from vCenter
b. A OneView Dashboard integrated into vCenter Operations Manager
c. Automated deployment of HPE StoreOnce to StoreServ Systems
d. A self-service portal for deploying cloud workloads to the hosts

9. A customer has been using HPE Virtual Connect Enterprise Manager (VCEM) to manage its Virtual Connect solutions. The customer has been adding more HPE servers and storage solutions to its data center and now wants to deploy HPE OneView 2.x to manage all HPE servers, storage, and VC modules centrally. What should the architect explain about how to make this change successfully?
a. Administrators should add the VC modules to OneView so IT staff can manage the solutions from both VCEM and OneView, as they choose.
b. Administrators should remove VCEM and then migrate VC module management to OneView.
c. Administrators should discover VCEM from OneView, which will manage the VC domains through VCEM.
d. Administrators should add a OneView license to VCEM before deploying OneView to ensure that OneView and VCEM can integrate successfully.

10. An architect is proposing an HPE Apollo 6000 solution to a customer. The solution includes multiple racks of Apollo chassis. Although the customer does not require advanced monitoring capabilities at this point, the customer needs to simplify and accelerate the deployment of images to Apollo servers. The customer also wants a tool for simplifying the maintenance of servers centrally. Which HPE solution should the architect propose?
a. HPE Cluster Management Utility (CMU)
b. HPE OneView 2.x
c. HPE Onboard Administrator (OA)
d. HPE Smart Update Manager (SUM)

11. An architect is working with a customer who is considering an HPE server solution. The customer needs to assess whether it is worthwhile for the company to invest in the solution. Which value helps the customer compare investment alternatives, taking into consideration the company's desired rate of return, as well as each investment's expected cash inflows and outflows?
a. Cumulative average growth rate (CAGR)
b. Total cost of ownership (TCO)
c. Net present value (NPV)
d. Lease rate factor (LRF)

12. An online retailer collects a great deal of information about its customers and their purchases in the form of structured databases, as well as emails, messaging boards, and social media content. The retailer is looking for ways to become more competitive. Which transformation area should the architect focus on?
a. Transform to a hybrid infrastructure
b. Protect the digital enterprise
c. Empower the data-driven enterprise
d. Enable employee productivity

13. Refer to Figure 11-1.

Figure 11-1 Exhibit for item 13
The architect was planning to connect HPE Apollo 6000 Management Module iLO ports to the network as shown. Customer decision makers then indicated that they want a more highly available design for iLO functions. How should the architect change the design?
a. Remove the links between the chassis and connect each chassis to the management network switch on one iLO link.
b. Remove the links between the chassis and connect each chassis to the management network switch on two iLO links.
c. Connect the second port on the bottom chassis to the management network switch.
d. Connect the second port on the bottom chassis to a different management network switch and make sure that both management switches are connected on the same VLAN.

14. An architect is proposing several HPE Moonshot Systems for supporting the Cloudera distribution of Hadoop MapReduce 2 and several HPE Apollo 4200 servers for supporting the Hadoop Distributed File System (HDFS). The Moonshot Systems use:
– m300 cartridges
– Moonshot-45Gc Switch Modules
When testing the application on the proposed solution, the architect discovers high latency for disk IO during the shuffle phase. What should the architect consider to improve performance for the solution?
a. Replacing the 45Gc modules with Moonshot-45XGc Switch Modules
b. Replacing the HPE Apollo 4200 servers with HPE Integrity Superdome X servers
c. Ensuring that the m300 cartridges are using the highest capacity SSDs
d. Adding more DDR4 memory to the m300 cartridges

15. Refer to Figures 11-2, 11-3, and 11-4.

Figure 11-2 Exhibit 1 for item 15

Figure 11-3 Exhibit 2 for item 15

Figure 11-4 Exhibit 3 for item 15
An architect is proposing HPE Apollo 6000 Systems with ProLiant XL220a compute modules for a customer's high performance computing (HPC) application. For each compute module, the architect plans two nodes, each with:
– One Intel Xeon E3 1200 v3 series processor with four cores at 3.5 GHz
– Two 8GB DIMMs (16 GB total)
– Two 400GB SSDs (800 GB total)
The architect tests the application on the proposed solution and discovers the results shown in the exhibits. What should the architect consider changing to resolve potential performance issues?
a. Replace the SSDs with higher capacity HDDs.
b. Add another processor to each node.
c. Select processors with more cores.
d. Add more memory capacity to each node.

16. An architect is proposing an HPE Integrity Superdome X System for a customer's business intelligence application. The application needs to have access to block-level storage for data mining. What should the architect plan to fulfill this requirement?
a. External Network Attached Storage (NAS) on a server such as HPE Apollo 4200
b. External Fibre Channel (FC) storage such as HPE 3PAR StoreServ
c. HDDs local to each blade on the application's nPartition
d. An SSD storage blade within the application's nPartition

17. An architect is proposing an HPE Moonshot System with HPE Moonshot-45XGc Switch Modules. The Moonshot System requires these connections on each uplink module:
– Four 10GbE connections to HPE Apollo 4200 servers in the same rack
– Two 40GbE connections to top of the rack (TOR) switches
Which uplink solution should the architect propose?
a. Moonshot-16SFP+ Uplink Modules with QSFP+ and SFP+ transceivers
b. Moonshot-6SFP+ Uplink Modules with QSFP+ and SFP+ transceivers
c. Moonshot-4QSFP+ Uplink Modules with QSFP+ transceivers and DAC splitter cables
d. Moonshot-4QSFP+ Uplink Modules with QSFP+ transceivers, QSFP+/SFP+ adapter kits, and SFP+ adapters

18. Match each member of the HPE Apollo 4000 Family with a typical situation for proposing it.
a. Apollo 4200
b. Apollo 4510
c. Apollo 4530
___ The customer requires a solution that provides both compute and storage for a complex data analytics application.
___ The customer needs a server for hosting its Scality object storage solution.
___ The customer is just getting started with big data analytics and needs an entry-level solution.

19. A customer requires a solution for supporting the Cloudera distribution of Spark. The architect is proposing:
– Three HPE Moonshot Systems with m710p cartridges for supporting Spark
– Five HPE Apollo 4200 servers for supporting the Hadoop Distributed File System (HDFS)
Based on the typical requirements for the application, what should the architect consider changing about the proposal?
a. Adding more Moonshot Systems
b. Adding more HPE Apollo 4200 servers
c. Replacing the m710p cartridges with m300 cartridges
d. Replacing the Apollo 4200 servers with Apollo 4530 servers

20. A customer needs a solution for a SAP HANA database. Which solution should the architect propose?
a. HPE Moonshot System
b. HPE Apollo 4510
c. HPE Apollo 4530
d. HPE Integrity Superdome X

Practice exam answers
This section provides answers and explanations for the practice exam. If you need to review a topic in more detail, see the provided reference.
1. A is correct. A primary distinguishing feature of the XL250a, as opposed to the XL230a, is its support for GPU or coprocessor accelerators. Therefore, architects should ask whether the customer's application supports GPU acceleration early in the design process.
B is incorrect. Both the XL230a and XL250a modules support the same maximum memory, so this question does not help the architect select the right module for the workload.
C is incorrect. This might be an important question later, but the XL250a supports GPUs from several vendors, whereas the XL230a does not support any GPUs at all. This question is less important at this point.
D is incorrect. Both of the modules in question support InfiniBand or Ethernet adapters, so this question does not help the architect choose between the modules.
To review topics related to this question, refer to “HPE Apollo 2000 and 6000 architecture” in Chapter 4.
2. C is correct. Virtual Connect FlexFabric-20/40 F8 Modules support direct attach to 3PAR StoreServ Systems. On the downlink side, servers use FlexFabric adapters for both traditional data and storage traffic. This option eliminates a great deal of SAN equipment and is simple to set up.
A is incorrect. The customer requires FC storage, not iSCSI storage.
B is incorrect. This option would require FC adapters on the blade servers and additional SAN equipment. It does not meet the customer's requirements for a simple solution.
D is incorrect. This option would require FC adapters on the blade servers, so it does not meet the requirements to eliminate as much equipment as possible.
This item tests whether you have the required knowledge from prerequisite training, including Architecting HP Server Solutions (ASE-level training). You should review features of BladeSystems and Virtual Connect modules, as well as other topics, to prepare for the exam.
3. D is correct. The customer wants the highest bandwidth possible for the m710p cartridges, so the architect must propose 45XG or 45XGc switches, which support 10GbE. (These switches will also support the m300 cartridges, although the m300 cartridges will only receive 1GbE connectivity.) The 45XGc is the correct choice because it supports the advanced technologies. Finally, the customer requires two of these switches to support both of the cartridges' ports, which will use NIC bonding. (The switches can use their IRF technology to support LACP NIC bonding.)
A and C are incorrect. Both switch modules must be the same type.
B is incorrect. The 45XG switches do not support the advanced technologies that the customer requires.
To review topics related to this question, refer to “HPE Moonshot networking” in Chapter 6.
4. A is correct. The HPE REST API accepts calls from applications and helps customers automate server monitoring, management, and maintenance using applications of their choice. Servers' iLO engines support the REST API, and so does HPE OneView. In this case, the servers are managed by HPE OneView, so the application should use the OneView REST API. HPE offers a number of tools, such as a Python library, for helping customers to script to this API.
B is incorrect. Importing SNMP templates won't help the in-house application integrate inventory information from OneView.
C is incorrect. The customer wants to leverage the in-house application, and the architect should explain to the customer how HPE helps.
D is incorrect. The OneView REST API offers the simplest way to integrate inventory and server status information into the in-house application.
To review topics related to this question, refer to “HPE REST API” in Chapter 9.
5. C is correct. The m710p cartridge has a powerful GPU and is designed to provide high performance for rich media hosted desktop solutions.
A and D are incorrect because those modules do not provide GPUs and are not recommended for hosted desktop solutions.
B is incorrect. The m700 cartridges are suitable for some hosted desktop solutions and have the advantage of providing a higher density (four desktops per cartridge). However, this customer has indicated that performance is the most important consideration. Therefore, the m710p is the better choice.
To review topics related to this question, refer to “Mobile workspace” in Chapter 7.
6. B is correct. The r2600 allocates a fixed number of local drives to each server, whereas the r2800 chassis allows flexible mapping of any number of drives to any server.
A is incorrect. Both chassis support the same number of servers.
C is incorrect. Both chassis support the same total number of drives.
D is incorrect. Neither chassis aggregates server connections.
To review topics related to this question, refer to “HPE Apollo 2000 and 6000 architecture” in Chapter 4.
7. A is correct. Integrity Superdome X solutions provide the scale-up approach that transactional databases require. The nPartition technology allows customers to hard partition the system. The customer saves money by licensing only for the cores in the nPartition.
B is incorrect. Database vendors such as Oracle and Microsoft do not accept soft partitioning, such as VMware ESXi. When the system uses soft partitioning, the database must still be licensed for all cores on the system. This is one benefit of nPartitioning that architects should communicate to the customer.
C and D are incorrect. A scale-out solution such as Moonshot does not meet the needs for transactional databases.
To review topics related to this question, refer to “HPE Integrity Superdome X solution architecture” in Chapter 8 and “Architecture for data-driven organizations” in Chapter 9.
8. B is correct. HPE OneView for vCenter integrates a OneView Dashboard into vCenter Operations Manager, helping to improve resource monitoring and troubleshooting.
A and C are incorrect. HPE OneView for vCenter does not integrate storage management or deployment of StoreOnce into vCenter.
D is incorrect. For a self-service portal for deploying cloud workloads, the customer requires an HPE Helion CloudSystem Enterprise solution.
This item tests whether you have the required knowledge from prerequisite training, including Architecting HP Server Solutions (ASE-level training). You should review features of OneView thoroughly to prepare for the exam.
9. B is correct. The customer wants to manage the BladeSystems and VC modules from OneView. Simultaneous management from VCEM and OneView is not supported. Therefore, administrators will need to remove VC domains from VCEM before migrating them to OneView. Because VCEM is no longer managing any of the VC domains, the customer can remove it.
A is incorrect. Simultaneous management from VCEM and OneView is not supported. (OneView can monitor VC domains that are managed by VCEM.)
C and D are incorrect. OneView does not discover VCEM, nor do the solutions integrate. Instead, the customer should choose between OneView and VCEM, and in this case, migrate to OneView.
This item tests whether you have the required knowledge from prerequisite training, including Architecting HP Server Solutions (ASE-level training). You should review features of OneView thoroughly to prepare for the exam.
10. A is correct. Although the customer does not need all of CMU's monitoring capabilities at this point, CMU is the only solution listed that meets the customer's requirements. (HPE CMU provides cloning capabilities that will simplify and accelerate the deployment of images to Apollo servers. HPE CMU also provides advanced monitoring capabilities.)
B is incorrect. HPE OneView 2.x does not support the Apollo 6000's XL servers.
C is incorrect. Apollo chassis do not use OAs.
D is incorrect. SUM helps to simplify and accelerate firmware updates, but not image deployment.
To review topics related to this question, refer to “HPE hypervisor server provisioning and management solution” in Chapter 9.
11. C is correct. Net present value (NPV) helps customers to compare investment alternatives. The customer defines its desired rate of return and applies this rate of return to the project's expected cash inflows and outflows, which returns the NPV. A positive NPV represents a desirable project.
A, B, and D are incorrect; these terms refer to different concepts. To review the definitions of these terms, as well as of NPV, see Chapter 10.
12. C is correct. This customer can become more competitive by analyzing both structured data and the unstructured data in emails, messaging boards, and social media in order to gain actionable insights. The “empower the data-driven organization” transformation area helps customers with such initiatives.
A is incorrect. The “transform to a hybrid infrastructure” transformation area also helps customers to become more competitive, but it does so by increasing their agility and the speed at which they can deploy applications, not by helping them extract value from data.
B is incorrect. The “protect the digital enterprise” transformation area deals with protecting data from attack, loss, or corruption.
D is incorrect. The “enable employee productivity” transformation area increases productivity with solutions such as hosted desktops.
To review topics related to this item, refer to “HPE Transformation Areas for the new idea economy” in Chapter 1.
13. A is correct. The original design is best for customers who want to conserve ports on management network switches. This customer, though, wants higher availability for the iLO connections, and the original design connects several chassis on one link. Connecting each chassis individually means that the failure of one link will only affect one chassis. The Management Module iLO ports do not support spanning tree, so it is crucial not to connect the ports in a loop. Therefore, the links between the chassis must be removed.
B, C, and D are incorrect. All of these designs introduce a loop, which could disrupt network connectivity.
To review topics related to this item, refer to the “HPE Apollo 2000 and 6000 management” section in Chapter 4.
14. C is correct. Local SSDs can provide lower latency for frequent reads and writes such as those that occur during the shuffle phase. Selecting the highest capacity SSDs for the m300 cartridges helps to ensure that as much of the shuffle data as possible fits on those local drives, rather than remote ones.
A is incorrect. The Moonshot-45XGc Switch Modules do support m300 cartridges, but they still only provide 1GbE to them. Using these modules would not increase network performance.
B is incorrect. The Apollo 4200 servers, as opposed to the Superdome X Systems, are the more appropriate solution for supporting HDFS. The Apollo 4200 servers provide a great deal of storage, while Superdome X Systems are optimized for scale-up processing power and memory, as well as high availability features.
D is incorrect. Adding more memory to a cartridge is not possible.
To review topics related to this item, refer to “Big data and analytics” in Chapter 7.
15. D is correct. The exhibits show that the compute modules are running out of memory well before other resources are consumed. The architect should plan to add more capacity by adding more DIMMs.
A is incorrect. Neither the scenario nor the exhibits indicate that the drives have inadequate capacity.
B is incorrect. The exhibits indicate that CPU usage is still acceptable. In addition, the XL220a compute module supports a maximum of one processor per node (two per module).
C is incorrect. The exhibits indicate that CPU usage is still acceptable.
To review topics related to this item, refer to “HPE Apollo 2000 and 6000 architecture” in Chapter 4.
16. B is correct. The blades for HPE Integrity Superdome X Systems do not support local drives. For block storage, the system must use external FC or iSCSI storage.
A is incorrect because NAS provides file, not block, storage.
C and D are incorrect; local HDDs and storage blades are not supported on these systems.
To review topics related to this item, refer to “HPE Integrity Superdome X solution architecture” in Chapter 8.
17. C is correct. Moonshot-4QSFP+ Uplink Modules can support both 40GbE and 10GbE connections. Each uplink module needs to provide two 40GbE connections, which uses two of the four ports. To provide the four 10GbE connections on the remaining two QSFP+ ports, the module must use a DAC splitter cable. This cable supports up to four 10GbE connections on one QSFP+ port. DAC splitter cables can only extend up to 5 m, but these connections are within the same rack, so distance does not pose an issue.
A is incorrect. Moonshot-16SFP+ Uplink Modules do not support QSFP+ transceivers for 40GbE connections.
B is incorrect. Moonshot-6SFP+ Uplink Modules do not support QSFP+ transceivers for 40GbE connections, and these modules are used with different switch modules from the Moonshot-4QSFP+ Uplink Modules.
D is incorrect. The QSFP+/SFP+ adapter allows a QSFP+ port to be used as a single SFP+ port. But only two QSFP+ ports remain after fulfilling the 40GbE connectivity requirements, and this customer requires four SFP+ ports.
To review topics related to this item, refer to “HPE Moonshot networking” in Chapter 6.
18. Match each member of the HPE Apollo 4000 Family with a typical situation for proposing it.
a. Apollo 4200
b. Apollo 4510
c. Apollo 4530
c: The customer requires a solution that provides both compute and storage for a complex data analytics application.
b: The customer needs a server for hosting its Scality object storage solution.
a: The customer is just getting started with big data analytics and needs an entry-level solution.
To review typical uses for HPE Apollo servers, refer to Chapter 5.
19. A is correct. Spark tends to be CPU and memory-bound, which means that it benefits from a relatively higher ratio of compute nodes to storage nodes. Adding Moonshot Systems without adding Apollo 4200 servers raises the compute ratio.
B is incorrect. Adding Apollo 4200 servers would raise the storage to compute ratio, but this application probably needs more compute and memory.
C is incorrect. Changing to m300 cartridges would not enhance the processing power or the memory.
D is incorrect. The Apollo 4530 servers could provide more processing power and memory on the storage nodes, but this design provides that power on the Moonshot compute nodes instead.
To review topics related to this item, refer to “Big data and analytics” in Chapter 7.
20. D is correct. SAP HANA is an in-memory database, which requires a scale-up approach to function well. HPE Integrity Superdome X solutions provide this approach.
A, B, and C are incorrect. These are scale-out systems that are optimized for different workloads.
To review topics related to this item, refer to “Architecture for data-driven organizations” in Chapter 3.

Appendix A: Answers to Learning Checks

Chapter 1 learning check
1. What is one way that an SDDC differs from a traditional data center?
a. It focuses on functionality.
b. It helps IT act as a cost center.
c. It focuses on usability and experience.
d. It enables project delivery to occur in 9 to 12 months.

2. Which HPE solution is part of the scale-up compute portfolio? a. HPE Moonshot b. HPE Integrity Superdome X c. HPE Apollo 2000 d. HPE Apollo 6000

Chapter 2 learning check 1. What is most likely to be a concern for an LOB manager? a. That IT solutions follow best practices b. That IT solutions meet security standards c. That IT solutions meet their tactical requirements d. That IT solutions support automated patch management

2. If a server provides 99.999% availability over a year, how much unplanned downtime can it experience? a. 26.3 seconds b. 5.3 minutes c. 44 minutes d. 8.7 hours
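The downtime options above follow directly from the availability percentage. As a quick back-of-the-envelope check (a minimal sketch; the conversion itself is standard arithmetic):

```python
# Convert an availability percentage into allowed downtime per year.
MINUTES_PER_YEAR = 365.25 * 24 * 60

def downtime_minutes_per_year(availability_percent):
    """Minutes of unplanned downtime per year implied by an availability %."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR

for nines in (99.9, 99.99, 99.999):
    print(f"{nines}% -> {downtime_minutes_per_year(nines):.1f} minutes/year")
# 99.999% ("five nines") works out to roughly 5.3 minutes per year.
```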

Chapter 3 learning check 1. What characterizes OLTP workloads? a. Very large datasets b. Distributed datasets over multiple systems c. The need for scale-up architectures d. Computationally complex queries

2. A customer requires a solution for a mission-critical transactional database. Why do Intel Xeon E7 series processors provide a good fit for this workload? a. These processors have built-in RAS features for workloads that cannot tolerate any data loss or corruption. b. These processors provide the highest clock speed per core but relatively few cores—the best fit for transactional workloads. c. These processors provide high performance for a low TCO, enabling fast scale out for the mission-critical workload. d. These processors are specifically designed for use with scale-out, clustered applications.

3. For which customer need does object storage provide the best solution? a. Need to provide block-level access to remote drives b. Need to store structured databases for transactional processing c. Need to store billions of voice, video, and email files d. Need to provide a remote drive from which VMs can boot

Chapter 4 learning check 1. You are planning to propose an HPE Apollo 6000 System, and you have determined that a customer’s HPC application will benefit from GPU acceleration. Which compute tray should you propose? a. HPE ProLiant XL190r b. HPE ProLiant XL220a c. HPE ProLiant XL230a d. HPE ProLiant XL250a

2. A customer tells you that its HPC application uses a SAN shared disk solution. What should you make sure to include in your proposal? a. HPE FlexFabric adapters that support FCoE (or FC HBAs) for the compute tray b. HPE ProLiant SL4540 server to act as a NAS c. HPE Apollo 2000 System to connect to the SAN d. PCIe riser and HPE Smart Array Controller P430 or P440 for the compute tray

Chapter 5 learning check 1. Which HPE Apollo server is purpose-built for object storage? HPE Apollo 4510 System 2. Which type of disk drive meets the typical requirements for HDFS? SAS or SATA HDDs

Chapter 6 learning check 1. When is an HPE Moonshot 180G Switch Module required for an HPE Moonshot 1500 Chassis? a. Whenever the customer wants to install an HPE Moonshot 4-QSFP+ Uplink Module b. Whenever the chassis has a mixture of 10GbE and 1GbE cartridges c. Whenever the customer wants to use both ports on a cartridge node d. Whenever the chassis includes any cartridges with four processors

2. An architect plans to connect an HPE Moonshot chassis to data center switches as shown. How should the architect plan to configure the four 10GbE ports to prevent a loop? a. As a link aggregation that includes all four ports b. As two link aggregations, each of which includes the two ports that connect to one of the data center switches c. As four separate ports with the two ports that connect to one switch assigned to one VLAN and the two ports that connect to the other switch assigned to another VLAN d. As four separate ports, all of which are assigned to the same VLAN

Figure 6-1 Exhibit for learning check 3. How can administrators contact the VSP for an HPE Moonshot cartridge node? a. Through the cartridge’s serial port b. Through the iLO CM CLI c. At the cartridge node’s iLO IP address d. At the cartridge node’s IP address on its first port

Chapter 7 learning check 1. What distinguishes the HPE Big Data Reference Architecture from a traditional big data architecture? a. The HPE architecture relies on a scale-up approach to compute resources. b. The HPE architecture features separate compute and storage nodes, each optimized for their role. c. The HPE architecture brings compute to data by co-locating compute resources on storage nodes. d. The HPE architecture ensures that YARN applications always run on the same server that stores the data to be analyzed.

2. A customer needs a high-density hosted physical desktop solution. Why might the m700 cartridge provide a better solution for this customer than an m710 cartridge? a. This cartridge has a two-port 10GbE adapter, so it can stream more data to users. b. This cartridge has a more powerful GPU than the m710, enabling it to support more media-rich applications. c. This cartridge provides more memory than the m710, enabling it to support more users. d. This cartridge provides four nodes, so each cartridge can support four physical desktops.

3. For which type of video transcoding are HPE m710 and m710p specifically designed? a. Video transcoding that relies on the CPU alone b. Video transcoding that uses Intel Quick Sync or FEI GPU acceleration c. Video transcoding that is designed for hardware processing with custom ASICs d. Video transcoding that is optimized to use DSP cores

Chapter 8 learning check 1. For which customer need does HPE Integrity Superdome X provide a good fit? a. Storing HDFS files for a big data analytics solution b. Supporting a CRM application c. Providing live transcoding of high definition (HD) video streams d. Supporting a NoSQL database on top of HDFS files

2. What is one rule that architects must follow when planning network adapters for the BL920s Gen9 blades in an HPE Integrity Superdome X? a. Only one blade in each nPartition should have a FlexibleLOM card. b. The blades in the lowest numbered slot of each nPartition must use the same mezzanine cards. c. Every blade in the nPartition must have the same number of mezzanine cards. d. Every blade requires at least one FlexibleLOM card.

3. What are two management tasks supported by the HPE SD OA? (Select two.) a. Registering a support case with HPE Support b. Viewing 3D graphical displays of CPU, memory, and other resource utilization on BL920s blades c. Viewing actions taken by the Error Analysis Engine to automatically mitigate potential issues d. Auditing the firmware on all blades in an nPartition for consistency and updating them as required e. Setting up storage controllers on the SAN arrays to which nPartitions connect

Chapter 9 learning check 1. Which solution enables customers to monitor the temperature for HPE scale-out servers at the rack level? a. HPE ICsp b. HPE APM c. HPE mRCA d. HPE Moonshot iLO CM

2. What advantage does the HATEOAS model for the HPE REST API provide? a. Scripts remain valid for different types of systems, including future ones. b. Developers have a list of precise URLs for each resource that they need to contact. c. Users can authenticate securely by submitting a hash value for their password. d. The client does not need to trust the certificate on the system hosting the REST API.
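The idea behind option (a) is that a HATEOAS client discovers resource URLs from the API’s own responses instead of hard-coding paths, so scripts keep working when resource locations change. A generic illustrative sketch follows; the "links"/"href" layout and the root URL are hypothetical placeholders, not the documented HPE REST schema:

```python
import requests
from urllib.parse import urljoin

def get_linked_resource(root_url, link_name):
    """Follow a link advertised by the API root rather than hard-coding
    the resource URL. The 'links'/'href' structure is a hypothetical
    placeholder used only to illustrate the HATEOAS pattern."""
    root = requests.get(root_url).json()
    href = root["links"][link_name]["href"]
    return requests.get(urljoin(root_url, href)).json()

# Hypothetical usage: discover the systems collection from the root.
# systems = get_linked_resource("https://10.0.0.5/rest/v1", "Systems")
```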

3. You have connected an HPE Apollo 6000 chassis to an HPE APM. How should you handle the iLO ports on the Apollo 6000 chassis? a. Connect both iLO ports in the same VLAN to which APM’s Ethernet port is connected. b. Connect one and only one iLO port in the same VLAN to which APM’s Ethernet port is connected. c. Avoid connecting either iLO port to the data center network. d. Connect one iLO port to APM and the other port to another HPE Apollo chassis.

Chapter 10 learning check 1. Which financial term describes the periodic rental payment, expressed as a percentage (or decimal equivalent) of equipment cost? It is used to calculate payments, given the cost of equipment (for example, 0.0240 on equipment cost of $10,000 requires a monthly payment of 0.0240 × $10,000 = $240). a. Internal rate of return (IRR) b. Net present value (NPV) c. Lease rate factor (LRF) d. Cumulative average growth rate (CAGR)
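The parenthetical example in the question generalizes directly; a minimal sketch of the arithmetic (values taken from the question text):

```python
def monthly_payment(equipment_cost, lease_rate_factor):
    """Periodic rental payment = lease rate factor x equipment cost."""
    return lease_rate_factor * equipment_cost

# Example from the question: a 0.0240 lease rate factor on $10,000.
print(monthly_payment(10_000, 0.0240))  # 240.0 per month
```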

2. Which financial term describes the type of business outlay that is reflected on the company’s balance sheet as assets? It creates a depreciation expense on the income statement for each year of the asset’s depreciable life. This depreciation expense lowers reported income (profit), thereby creating a tax savings for each of these years. a. Cumulative average growth rate (CAGR) b. Operating expenditure (OPEX) c. Net present value (NPV) d. Capital expenditure (CAPEX)

3. Which type of business outlay addresses spending on predictable, repeatable costs for items or services that are not registered as assets and are not depreciated? It impacts reported profit and taxes on earnings only in the single reporting period it is incurred. a. Internal rate of return (IRR) b. Total cost of acquisition (TCA) c. Operating expenditure (OPEX) d. Gross Profit (GP)

4. What is the monetary amount by which an asset is valued in business records, a figure not necessarily identical to the amount the asset could bring on the open market? It could also be used to designate the sum of the assets on a portfolio or within a company. a. Net investment value (NIV) b. Net book value (NBV) c. Fair market value (FMV) d. Total cost of acquisition (TCA)

Appendix B: Answers to Activities
Chapter 1
This section provides answers for the Chapter 1 activity.
Chapter 1—Activity
Transform to a hybrid infrastructure
All of the HPE server solutions can help a customer transform to a hybrid infrastructure. The HPE ProLiant and Moonshot servers can help the customer take the next step toward self-service delivery because they work with the HPE Helion CloudSystem solutions. You might have chosen any two of many benefits, including:
• HPE server solutions are optimized for the workload, making it easier for customers to align the IT infrastructure with requirements.
• HPE server solutions use open standards to prevent vendor lock-in and enhance flexibility.
• HPE ProLiant and Moonshot solutions can be controlled by HPE Helion CloudSystem, letting IT easily deploy the right workload to the right location on the fly.
• HPE Moonshot and Apollo solutions are scalable, making the data center more agile.
• HPE Apollo 2000 solutions deliver twice the density of traditional rack solutions, making the solution scalable and flexible.
• HPE Apollo 6000 solutions are designed for scalability and efficiency at rack scale, delivering a TCO savings of $3M per 1000 servers over 3 years.
• HPE Moonshot solutions provide extreme scale-out and efficiency, enabling customers to innovate and to speed their time to market with new services while reducing costs and energy use.
Protect the digital enterprise
HPE Integrity servers, including Integrity Superdome X servers, are the primary solutions for protecting the digital enterprise. However, all ProLiant servers support this transformation area. You might have chosen any two of many benefits, including:
• HPE Integrity Superdome servers boost infrastructure reliability with unique HPE features such as self-diagnosing and self-healing.
• HPE Integrity Superdome servers help customers distinguish themselves from the competition with stringent SLAs because they offer 20 times more reliability with 60 percent less downtime than other x86 platforms.

• HPE Integrity Superdome X supports hardware partitioning, which enhances reliability over software-only virtualization.
Empower the data-driven organization
HPE Apollo, Moonshot, and Integrity Superdome X solutions help to empower data-driven organizations. You might have chosen any two of many benefits, including:
• As the solution grows, HPE Moonshot continues to deliver excellent throughput, supporting a growing user base with up to 1.7 times more operations per second than traditional 2U 2P rack servers.
• HPE Moonshot helps customers to scale their big data solution affordably, offering a 66 percent lower TCO than traditional servers.
• HPE Apollo 4200 servers provide more storage density than any other 2U server: up to 28/54 hot-plug LFF/SFF HDDs or SSDs, depending on the model.
• HPE Apollo 4000 servers can be configured for storage and for performance.
• HPE Apollo 4000 servers support up to 16 memory DIMM slots with up to 1024 GB, delivering the performance required for in-memory data processing for near real-time analytics.
• The HPE Apollo 4530 System is purpose-built for big data analytics. It can be configured to optimally match technology requirements for economical large-scale Hadoop-based data analytics, or it can be configured for more complex compute-intensive analytics with high-performance processors.
• HPE Integrity Superdome X provides a way for enterprises with the most critical and demanding business processing, decision support, and database workloads to gain the benefits of an x86 platform.
Enable employee productivity
HPE Moonshot solutions help to enable employee productivity by supporting application delivery and hosted desktop infrastructure. You might have chosen any two of many benefits, including:
• HPE Moonshot solutions support up to 2000 users in a single chassis.
• HPE Moonshot’s high-density design reduces space, cooling costs, and the energy footprint for the solution.
• HPE Moonshot boosts cost-efficiency by using the right compute for each specific workload, so there are no wasted resources.
Chapter 2
This section provides answers for Chapter 2 activities.

Chapter 2—Activity 1
Pain points
1. Which statement best describes the insight you gained about the customer’s pain points? a. Although MTB wants to improve its global governance of IT, it still has massively distributed behavior. b. Replacing MTB’s aging data center infrastructure with HPE solutions will ease the customer’s difficulties while allowing it to continue functioning with separate business units. c. MTB has reached the cloud-readiness stage, but needs help moving from a CAPEX to an OPEX model. d. MTB’s number one priority is to document the different IT governance and procurement policies in various business units.

HPC, R&D, and Big Data
2. What does this information tell you about HPC and MTB? (Select two.) a. MTB’s new manufacturing process and the lead from your friend tell you that HPC is a hot topic within MTB. b. Only one or two MTB business units and operating companies are looking at HPC. c. If you can bring HPC into the SDDC environment, a flexible pool of HPC resources might benefit MTB in general and also allow for better control through centralization. d. MTB might be interested in the HPC solution, but only because you have demonstrated that it could result in cost savings.

3. MTB’s big data environment for clinical trials currently resides on the Teradata platform. This environment is starting to become a bottleneck, so Teradata has recently submitted a proposal for expansion. One of your coaches told you that the proposal on the table is for $26M. What can you do? a. Explain that MTB would save money by switching to an all-HPE platform. b. Propose a solution that can offload data from the Teradata environment, allowing MTB to extend the life of the current environment without performing a complete migration or paying Teradata a large amount of money. c. Do nothing. MTB is clearly invested in the Teradata platform and it would cost the company more money to integrate another vendor’s solutions.

Players 4. Which question is most appropriate for each decision maker? (Match the question to the decision maker.) a. Jaggers (CEO) b. va (CIO) c. lker (CFO) d. oi (CTO)

d Could you tell me more about how developers are using HPC? What do they do when they cannot get the compute resources they need to run a job? b What is the biggest stumbling block stopping IT from deploying HPC environments that meet manufacturing’s insatiable demands at the pace they require? a A year from now, what do successful R&D and manufacturing departments look like to you? How will they be using HPC to get products on shelves more quickly?

c I am hearing R&D and manufacturing say that they need more HPC compute power to finish their projects. Would you be interested in giving them that without expanding the data center physical footprint and power costs?
Chapter 2—Activity 2
Answers will vary; however, here is a list of considerations.
1. What power requirements must you consider when installing a new solution in the data center?
• Consumption requirements – Power per server or enclosure
• Power distribution, including: – Power distribution units (PDUs) – Intermediate distribution units – Uninterruptible power supply (UPS) – Backup generator
• Dual power sources
• Circuit breaker sizing
• Grounding
2. What environmental requirements must you consider when installing a new solution in the data center?
• Heating, ventilation and air conditioning (HVAC) systems: – Humidity – Temperature – Airflow – Filtering dust/pollution
• Acoustic noise
• Fire protection systems
3. What should you ask about the placement and arrangement of the products in the new solution?
• Is there a space planned for the new products?
• Are the products being added to an existing rack or to a new one?
• Is the new solution replacing old products? What is the plan for removing those?
• Does the data center have a cable management system?
• How are racks and products within them oriented? Does the data center use a hot aisle/cool aisle design?
• Does the customer have a standard way of labeling racks and products?

• If you can use an elevator, how big is the elevator? Is there a freight elevator? Will the products fit inside the elevator? If you must move the equipment up or down stairs, how wide are the stairs? How sharp are the turns? How high and wide are doorways?
• What time can the equipment be moved? Does it need to be moved after work hours?
• How many people are required and what equipment is needed to move the products?
4. What should you ask about the safety and security regulations?
• How will we access the data center? Will we require badges or temporary ID cards? Will we require an escort?
• What safety policies do you have for people moving heavy equipment? Are a certain number of people required? Do we need to use specific equipment?
• What safety policies do you have for people working with electrical equipment? Do we need to wear any protective items?
• Where are the emergency exits?
Chapter 3
This section provides answers for the Chapter 3 activity.
Chapter 3—Activity
1. What approach would you recommend that MTB take for increasing the responsiveness and availability of the MES solution? Also, what would help MTB continue to scale in the future?
They should scale up this type of workload by migrating the databases to servers with more processors and memory. Adding memory to the database can, in particular, improve its performance. MTB wants to add BI in the future. MTB could do that by exporting the data to an OLAP data warehouse. (Or MTB might choose to move the data into a big data analytics solution.) Alternatively, the company might want to plan for the future by moving to an SAP HANA database, which can support OLTP and OLAP together. In any case, the company needs a scale-up server infrastructure. To help MTB continue to scale easily in the future, servers should ideally offer the flexibility to add memory and processors. They should be easy to provision, deploy, and manage, perhaps with automation solutions. You’ll learn in later chapters how HPE Integrity Superdome X lets customers flexibly add blades to an enclosure. Administrators can use a GUI to easily add those blades to an existing server (called an nPartition). Because MTB wants to improve availability, the servers should also offer redundant components and RAS features from the hardware and the firmware to the OS.
2. How well does MTB’s current approach to deploying HPC applications fit with its desire to move toward SDDC? Should MTB change its approach and, if so, how?
MTB wants to move toward an SDDC. Currently, the HPC solutions are not in line with this initiative because MTB has non-standardized, sprawling, siloed, and rigid HPC solutions. Standardization is one of the first steps on the way toward SDDC, which MTB should implement before it can begin to virtualize infrastructure resources and automate them.

MTB IT decision makers should profile the HPC applications and determine why some of the organically grown clusters are not performing well. They should define profiles for infrastructure that is tailored for the application requirements. For example, they should determine whether applications run better on processors with more cores but a lower clock speed or on processors with fewer cores but a higher clock speed. They might decide that they need a few different types of servers, optimized for different types of jobs. Solutions like HPE Apollo enclosures can provide a standard modular building block with mix-and-match compute options. HPC scheduling software could help to match jobs to the correct tailored resources. To combat IT sprawl, the HPC cluster infrastructure should be density optimized, packing a great deal of processing power into a small physical footprint. It should also be ready to support automated provisioning and management for when MTB moves closer to SDDC. The HPE REST API, supported by HPE Apollo servers, ensures that this is the case.
3. What type of server infrastructure will meet the needs for manufacturing’s Hadoop solution? What advantages does the HPE Big Data Reference Architecture provide?
MTB is still developing its analytics applications and determining its requirements. The traditional architecture would force MTB to choose a compute-to-storage ratio now. But the HPE architecture allows the company to deploy compute resources separately from the storage resources. Manufacturing can immediately start moving its unstructured data to HDFS to ease the pain of storing the data. Later, when the analytics applications have matured, if IT decision makers discover that they need more compute power, they can easily deploy more compute nodes without uprooting any of the existing solution.
Chapter 4
This section provides answers for Chapter 4 activities.
Chapter 4—Activity 1
1. Does the customer’s application support GPU or co-processor acceleration? Visit these links to search for the application:
• http://www.nvidia.com/object/gpu-applications.html
• https://software.intel.com/en-us/xeonphionlinecatalog
• https://www.khronos.org/opencl/resources
Neither NVIDIA nor Intel lists Synopsys as supporting their acceleration. Synopsys has an ASIP Designer that supports OpenCL, but other tools do not. Acceleration probably will not benefit this customer.
2. Which compute tray will you recommend for the solution and why?
The HPE XL220a compute tray is probably the best choice. This tray is optimized for single-threaded HPC such as most of this customer’s EDA applications. For these applications, a processor with the fastest clock speed per core, as opposed to a processor with many cores, provides the best performance. The XL220a processors operate at a much higher clock speed than the customer’s current processors. The XL220a could also provide good performance for the lightly-threaded IC Compiler, or you

might suggest mixing in some XL230a servers.
3. How much memory will you recommend for each processor? How much total for your selected compute tray?
In the current solution, each processor has 16GB memory, and insufficient memory is already causing jobs to run more slowly. You should plan at least 24GB memory for each processor and preferably 32GB (64GB total).
4. Visit http://h22195.www2.hp.com/DDR4memoryconfig. Use the tool to plan how you will configure the DIMMs. If you are given multiple choices, choose a design that will help optimize performance. Record your configuration and explain the reasoning behind your choices.
To optimize performance, you should create a balanced design in which you use both channels for each processor. If you are planning 32GB per processor, the choice is simple. You should install four 8GB UDIMMs for each processor. If you are planning 24GB per processor, you could install 4GB in one slot and 8GB in the second slot for each channel. You should use standard, not low-voltage, UDIMMs.
5. You will need to test to come to a final decision about how many compute trays to propose. At this point, about how many compute trays and HPE Apollo 6000 chassis will you plan to propose?
The customer wants to speed up computing tasks by about 30 percent. You know that three issues are slowing down jobs at this point: – CPU resources – Memory resources – NAS bottleneck
You are planning to resolve each of these issues. The XL220a compute tray should provide about 1.4 times the compute power (3.5GHz versus 2.5GHz, for example), which would reduce computing task time by roughly 30 percent, assuming that compute power is the only factor. Of course, this is not the case, but you are also planning to resolve other issues. You might start with a plan for delivering the same number of processors that the customer currently has. To deliver 240 processors, you must plan 12 HPE Apollo 6000 chassis, each with 10 XL220a compute trays, each with two processors. You would then test to determine how well the solution handles the load and add more compute trays if necessary. You might also propose a few more trays.
6. You have several options for drives that meet the customer’s capacity requirements. What are some additional questions that you can ask the customer to help you make these choices?
You can ask questions such as these: – How frequently will the application read or write to the drive during a computation? – How frequently will the drive be used throughout the day? – Will the application read more data or write more data, or do an even mix of both? – How mission-critical is data stored on the drive?
7. For the shared storage solution, which server model will you choose? How many are required to meet the customer’s needs? Justify your choices.

You can choose one of the HPE SL4540 models. SSDs deliver the best speed, but they are more expensive. You might plan for all SSDs or propose a tiered alternative in which some less frequently accessed data is stored on HDDs. For the all-SSD plan, you could plan four SL4540 1x60 models, each of which provides 50TB on SSDs; or, you could plan five SL4540 2x25 models, each of which provides 40TB on SSDs. If you are planning to store some data on HDDs, you could plan fewer models. Alternatively, you could choose HPE Apollo 2000 Systems (about 10).
8. Will you use InfiniBand, 10 GbE, or 1 GbE? How many ports will you plan for each server? Explain your reasoning. Also list further questions that you might ask the customer if you cannot make a choice.
Questions that you would ask to determine the requirements include: – How much data do applications, particularly IC Compiler, which can use distributed computing, generate over the wire? – Does the application need high speed or low latency? – Do nodes require link redundancy?
The XL220a compute trays do not support InfiniBand. You must choose between 10 GbE and 1 GbE. This solution might require high speeds and low latency between compute nodes, depending on how much IC Compiler is using parallel, or distributed, computing. (Other jobs run largely independently and should not generate much traffic.) You should discuss these needs with the customer. The nodes will also use the Ethernet connection for transferring files to and from shared storage. Most files are small, but transfers are frequent. You should probably suggest 10 GbE adapters. You should probably plan for two ports per server (four ports per compute tray). However, you might be able to stay with one port per server if the HPC application can tolerate the loss of a node well.
Chapter 4—Activity 2
This activity does not have answers.
Chapter 4—Activity 3
Your presentation will be unique. For your reference, a section at the end of this appendix gives an example of the report that you output with the HPE tools.
Chapter 5
This section provides answers to the activity in Chapter 5.
Chapter 5—Activity
Select HPE products
1. What are two HPE server solutions that might fit this customer’s needs?

The HPE Apollo 4200 System or HPE Apollo 4530 System could meet the needs.
2. What should you discuss with the customer to help you determine which of these servers you should propose?
You should have listed ideas such as these (some of the answers to these questions were listed in the scenario): – What types of analysis does the customer intend to apply to the data? – Does the customer plan to introduce more complex analytics in the future? – How much data does the customer intend to store? – Does the customer have a preference for a 2U versus a 4U chassis? – What type of scalability does the customer require?
Scope the storage requirements
3. What should you discuss with the customer when planning how you can under-provision storage based on the fact that data will be compressed?
You should explain that HPE recommends that data be compressed both on the compute and storage nodes. You should discuss the type of compression that the customer plans to use. You should also discuss the type of files stored with HDFS and what percentage of the stored data each type composes. All of these factors affect how much data can be compressed. You should agree on a compression factor, attempting to underestimate how much data will be compressed so that you avoid under-provisioning.
4. Assume that you and the customer have agreed that you can plan on a compression factor of 1.5 (in other words, a 1.5MB file will take up 1MB). How much storage capacity should you plan for the data nodes? Remember to take into consideration the current data, the ingest rate, the replication factor, and space for result files.
5.23PB
The customer currently has 1400 TB and will ingest 365 TB over the next year. The solution must support 1765 TB. Multiply 1765 by 4 to allow for a replication factor of 3 and space for result files. Divide 7060 by .9 to account for the extra space required due to the fact that only 90 percent of the raw storage is usable by HDFS. Divide 7844 by 1.5 to take into account compression.
5. Assume that you have discussed the factors that you listed in the first part of this activity and that you have decided to propose HPE Apollo 4530 Systems. Which ProLiant XL server do you propose for this system, and how many can you propose per system?
You can propose three ProLiant XL450 servers per Apollo 4530 system.
6. How many HPE Apollo 4530 Systems will you propose? More than one answer could be valid, but think about how you would justify your answer and what you would discuss with the customer to help you make your choice.
Each XL450 server can provide 120TB capacity when it uses 8TB Midline SATA HDDs. (Each server can provide 60TB capacity if you plan to use 4TB HDDs instead.) Each Apollo 4530 System, then, can support 360TB (or 180TB). To meet the customer’s needs, you should propose 15 (or 30) systems, which slightly exceeds the requirements. However, you should create balanced systems.
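The sizing steps from question 4 are easy to capture as a reusable helper. A minimal sketch, using the values stated in the answer above (4x for replication plus result files, 90 percent usable raw storage, and the agreed 1.5 compression factor):

```python
def hdfs_raw_capacity_tb(current_tb, ingest_tb, replication_overhead=4,
                         usable_fraction=0.9, compression_factor=1.5):
    """Raw HDFS capacity required, following the steps in question 4."""
    total = current_tb + ingest_tb              # 1400 + 365 = 1765 TB
    replicated = total * replication_overhead   # 1765 x 4 = 7060 TB
    raw = replicated / usable_fraction          # 7060 / 0.9 = ~7844 TB
    return raw / compression_factor             # 7844 / 1.5 = ~5230 TB

print(f"{hdfs_raw_capacity_tb(1400, 365) / 1000:.2f} PB")  # 5.23 PB
```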

7. You learned about baseline processors and memory for a solution such as this, as well as ones that meet enhanced needs. Do you think that this customer has baseline or enhanced needs? This customer probably has baseline needs. The customer is only planning to run analyses once a week and does not need the analyses to finish quickly. The customer is using MapReduce2 and has not indicated any particularly complex computational requirements. 8. You will propose two processors per XL server. Based on your answer to the previous question, which of these processors provides the best choice? (Refer to the information presented in this chapter to remind yourself of the baseline recommendations. Note that the HPE Apollo 4530 System supports more types of processors; you can find a complete list in the QuickSpecs.) a. Intel Xeon E5-2698v3 with 2.3GHz frequency and 16 cores b. Intel Xeon E5-2690v3 with 2.6GHz frequency and 12 cores c. Intel Xeon E5-2650v3 with 2.3GHz frequency and 10 cores d. Intel Xeon E5-2603v3 with 1.6 GHz frequency and 6 cores

9. Based on your answer to question 4, how much memory capacity will you plan for each XL server in the Apollo System? The server has 16 DIMM slots (four memory channels with two slots each on each processor). It supports 4/8/16/32GB RDIMMs and 8/16/32/64GB LRDIMMs. Which DIMMs will you propose? The baseline recommendation is for 128 GB of memory (64 GB per processor). As you have learned in previous chapters, it is best practice to fill each memory channel. Therefore, you should plan to fill at least eight slots. You could plan eight 16GB RDIMMs or LRDIMMs. Chapter 6 This section provides answers to the activities in Chapter 6. Chapter 6—Activity Select switch modules Switch module choices a. HPE Moonshot 45G Switch Module b. HPE Moonshot 45Gc Switch Module c. HPE Moonshot 45XGc Switch Module d. HPE Moonshot 180G Switch Module

Scenarios 1. The chassis has 45 m710p cartridges. c 2. The chassis has 45 m350 cartridges. d 3. The chassis has 45 m300 cartridges. b

4. The chassis has 30 m400 cartridges and 15 m800 cartridges.
d
The 180G module is required to support the m800 cartridges. To obtain the highest speeds for the m400, you would actually need to install those cartridges in a different chassis and then use the 45XGc Switch Module (c).
5. The chassis has 3 m300 cartridges and 42 m710 cartridges.
c
Select uplink modules
6. HPE Moonshot 4-QSFP+ Uplink Module
This module is a valid choice for a chassis with HPE Moonshot 45XGc or 180G Switch Modules. Reasons to choose it include:
– You want to maximize bandwidth while minimizing the number of ports required to connect the Moonshot chassis to the customer network infrastructure.
– You want to give the customer the flexibility to use 10GbE now (through splitter cables or adapters) but upgrade to 40GbE in the future.
7. HPE Moonshot 16-SFP+ Uplink Module
This module is a valid choice for a chassis with HPE Moonshot 45XGc or 180G Switch Modules. Reasons to choose it include:
– The customer infrastructure is set up for 10GbE at the edge.
– The design does not require a lot of uplink bandwidth, and it will be more cost effective for the customer to purchase this module and use just a couple of 10GbE links.
– The design calls for the switch modules to be linked by IRF or stacking. You want redundancy for the IRF or stacking links, but you want to keep more than 80Gbps for uplinks, and you want the data center to have sufficient 10GbE ports on TOR or EOR switches.
8. HPE Moonshot 6-SFP+ Uplink Module
This module is required when you are proposing an HPE Moonshot 45G or 45Gc switch.
Plan redundant topologies
The figures below show how you should plan the topologies.

Figure B-1: Topology 1

Figure B-2: Topology 2

Figure B-3: Topology 3

Figure B-4: Topology 4
Chapter 7
This section provides answers to the activities in Chapter 7.
Chapter 7—Activity 1
Select HPE products
Match an appropriate product to each role in the solution. You can use the same product more than once. You do not have to use every product.
Products
a. HPE DL360
b. HPE Apollo 4200
c. HPE m700 cartridge

d. HPE m710p cartridge

Solution role
1. Active and standby head nodes a
2. MR2 worker nodes (compute nodes) d (or c)
3. HDFS data nodes (storage nodes) b
4. Management node a
Also answer this question:
5. The customer wants to distribute data ingestion rather than use an edge node. Which products require dual-homed connections? Will you need to set up VLANs on the Moonshot switch modules?
The HPE Apollo 4200 Systems, which act as the data nodes, require dual-homed connections. The compute nodes in the HPE Moonshot 1500 chassis do not. Because all cartridge nodes belong to the same network (the Hadoop cluster network), you do not need to set up VLANs on the Moonshot switch modules. The TOR switches can handle the VLANs, isolating the Hadoop cluster network from the ETL and external network.
Scope the storage requirements
Answer these questions.
6. How many HPE Apollo 4200 Systems will you propose? More than one answer could be valid, but think about how you would justify your answer and what you would discuss with the customer to help you make your choice.
Your answer depends on several choices. You can use 8TB SATA HDDs for the storage nodes and achieve 224TB capacity per system. Every storage node should provide the same capacity, so you should fill 24 systems to capacity, providing slightly more than the customer’s minimum requirements (5.376 PB). Because Hadoop is rack-aware, you might even want to suggest 25 systems so that each rack is even, and data can be easily balanced across it. Alternatively, you could use 4TB SATA HDDs. In this case, you would need 47 Apollo 4200 Systems. Again, you might want to suggest a few extra systems to make racks even for 50 total. The customer is going to continue ingesting data, so this would allow the customer to wait a bit longer before scaling out again. Because this solution has twice as many Apollo 4200 Systems but stores the same amount of data, it will support higher I/O. If you use the balanced rack design, it will also run relatively hotter because it has more racks and HPE Moonshot chassis per TB of stored data. You should discuss how the analytics applications will handle data to help you make these choices.

Are the applications I/O-bound? If so, you might want to increase the IOPS for the total solution by using the lower-capacity disk plan. (You might then decrease the compute node to storage node ratio if the CPU requirements are not large.) Also consider the recovery time and bandwidth required if an entire Apollo 4200 System fails. The more data each system supports, the longer the recovery time and the more bandwidth consumed.
7. When a customer is deploying a new Hadoop solution, you should generally begin with a balanced rack design for the storage nodes and MR2 compute nodes. How many balanced racks are required to meet the storage needs?
This answer depends on your previous answer. If you are proposing 25 Apollo 4200 systems, you need five racks. Or, if you are using 4TB drives and 50 Apollo 4200 systems, you need 10 racks.
Scope the compute nodes
8. How many Moonshot chassis will you deploy to fill the balanced racks?
Your answer depends on your previous answers. It might be 15 (three chassis in five racks) or 30 (or perhaps 29 if you didn’t plan for exactly even racks).
9. Assume that you are proposing 960GB SSDs for the HPE m710p cartridges. How much data can each HPE Moonshot chassis store locally? How much can the compute node cluster store locally?
Each HPE Moonshot chassis can store 43.2TB locally. Your answer to the second question will differ depending on whether you are planning to propose 15 or 30 chassis (or 29). In the first case, the cluster can store about 648TB locally. In the 30 chassis design, the cluster could store about 1296TB locally—close to the customer’s full data.
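The drive arithmetic behind questions 6 and 9 can be checked quickly; a minimal sketch (the 28-LFF-drive configuration, the 45-cartridge chassis, and the roughly 5.23 PB minimum are the values used in the answers above):

```python
import math

def apollo_4200_nodes(required_tb, drive_tb, drives_per_node=28):
    """Apollo 4200 storage nodes needed to reach a raw capacity target
    (8 TB x 28 LFF drives = 224 TB per node in this scenario)."""
    node_tb = drive_tb * drives_per_node
    return math.ceil(required_tb / node_tb)

print(apollo_4200_nodes(5230, 8))   # 24 nodes at 224 TB each
print(apollo_4200_nodes(5230, 4))   # 47 nodes at 112 TB each

# Local SSD capacity per fully populated Moonshot chassis:
print(45 * 0.960)                   # 43.2 TB (45 cartridges x 960 GB)
```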

10. How many map tasks can the cluster run at once? How many reduce tasks can it run at once? Your answer will differ depending on your previous answers. If you are proposing 15 full chassis, the solution can run 9448 map tasks at once or 4725 reduce tasks at once. With 30 full chassis, the solution can run 18,898 map tasks or 9448 reduce tasks.

11. What would be reasons for you to adjust your design to have relatively more or fewer Moonshot chassis? What might you discuss with the customer? If you planned to use 4TB drives, you already have deployed relatively more Moonshot cartridges per TB stored data. You might plan to reduce the compute node ratio accordingly if the customer does not need this extra power. In fact, the Cloudera reference architecture, which suggests using 4TB drives, also recommends a 2:6 Moonshot chassis to Apollo 4200 ratio rather than a 3:5 ratio. You should discuss the number of applications that the customer plans to run at once and how complex the applications are (how many tasks they typically involve). Discuss the type of tasks. Are they CPU bound tasks such as complex data mining and feature extraction? In these cases, the customer might require a higher ratio of Moonshot cartridges. Run tests

12. You have created a POC with your proposed servers, which you have set up with a solution as close to the customer’s as possible. What guidelines will you follow as you conduct tests?
You should discuss performance expectations with the customer. For example, you should

understand the type of analytics that the customer plans to use and how long the customer expects queries will take to complete. You should run typical queries and demanding “worst-case scenario” queries on the full cluster, if possible, to see how the solution operates as a whole. Otherwise, you can show the performance of a single chassis, scaling down the dataset size accordingly. This customer uses Cloudera, so you can use the Cloudera Manager to monitor metrics as you test. You might use the Pending vcore and Pending MB metrics to see whether resources become oversubscribed. If these metrics rise over 0, an application is waiting to schedule resources because the processor cores or memory are all in use. You might want to add a higher ratio of Moonshot chassis to the solution. You can also view the metrics that show how long an application has been running to see whether the runtime fits the customer expectations. You can also test the HPE Apollo 4200 Systems’ performance by running queries on progressively more data and checking the datanode metrics. If the read and write operations stop increasing as compute nodes read more data, you might have detected a bottleneck. You might want to spread out data across more disks and systems. Describe the benefits of the HPE Big Data Reference Architecture

13. Begin to outline a presentation for winning the CIO and other decision makers to your side using the HPE Big Data Reference Architecture. (You will learn more about tools that you can use, such as the HPE Alinean TCO/ROI Calculator, in Chapter 10.)
Your presentation should touch on points such as these:
• The main reason that companies have traditionally located compute nodes with data nodes was to reduce latency. However, companies can now easily deploy 10GbE networks, which, properly planned, can operate with latency as low as that of local storage systems.
• Using the same systems to store data and analyze the data has a number of disadvantages. – Trying to optimize the system for processing power, storage capacity, and disk I/O might leave you with a system that does not perform any of these tasks as well as it could. – You are forced to scale processing power and storage together, which makes scaling more expensive and rigid. – You are planning to add new types of applications in the future. It is likely that those applications will be analyzing almost exactly the same data as the current applications. However, if you need to create a new cluster for new types of applications, the traditional approach forces you to replicate all of your data for the new cluster, consuming expensive storage.
• The HPE Reference Architecture gives you compute nodes that are optimized for processing power, scalability, low power consumption, and relatively low TCO for their power. It gives you storage nodes that are optimized for storage capacity and disk I/O.
• HPE has discovered that this architecture can improve read IOPS by 30 percent while also delivering excellent analytics performance.
• If you find that you want to run queries faster in the future or introduce more CPU-bound applications, you can easily add compute nodes without having to add storage.

• Similarly, if you want to add more data but do not need more compute power, you can easily scale out the storage nodes. • You have the flexibility to deploy a new cluster that draws on the same data as the existing one, saving you the costs of replicating data and the complexity of managing isolated clusters. Add HBase

14. Which Moonshot cartridges will you plan for the HBase Region Servers? The m710p cartridges are probably the best option with their higher processing power and local storage capacity. The m710 cartridges could provide a less expensive alternative.

15. How many Moonshot cartridges will you plan for the HBase Region Servers? How many chassis?
To provide 100TB of SSD local storage (required for excellent performance), you must plan 105 cartridges, each with a 960GB SSD. The 105 cartridges will fit in three HPE Moonshot 1500 chassis.
Chapter 7—Activity 2
Select cartridge models
1. Select an HPE Moonshot cartridge model for the transcoding farm. Explain your choice.
The m710 and m710p provide GPUs, which can help to speed transcoding for Harmonic Xpress. (Alternatively, the m800 provides DSP cores, which can accelerate video transcoding into the H.264 codec.) You might research the reference architecture, which recommends the m710.
2. The solution also requires a server for the WFS controller, WFS manager, and SQL database roles. The controller must meet these specifications: – Processor: Intel or AMD, 3.0GHz (can include Turbo); quad-core preferred – Memory: 12GB or higher
You want to host the controller in the Moonshot chassis to conserve rack space and deliver a complete solution with Moonshot. Select a cartridge that meets these needs. You might have more than one option. Explain your choice.
The m710p best fits these specifications. It has an Intel quad-core processor with a base clock speed of 2.9GHz and a Turbo speed of 3.8GHz. It more than meets the memory requirements with 32GB.
Run tests
1. What information do you need to discuss with the customer to set up the test?
You need to discuss precisely how the customer plans to transcode files, including the number of output formats and the specifications for those formats. You should gather as much information about the solution as possible. Try to imitate the FTP server and the bandwidth between this server and the WFS controller, for example. You should also see if the customer can provide you with a sample of raw video of the kind that will be transcoded.
2. As you discussed needs, the customer told you that the Harmonic services must be installed on

either Microsoft Windows 2008 or Microsoft Windows 2012 R2. Visit http://www8.hp.com/us/en/products/servers/management/operating-environments/os-supportmatrix.html and determine which OS is supported.
The Microsoft Windows 2012 R2 OS is supported.
3. You have discussed the necessary information and set up the POC. You discover that the m710 cartridge running ProMedia Carbon and ProMedia Xpress can transcode one 60-minute video into all required output formats in 180 minutes. What is the real-time ratio?
.33
4. What real-time ratio does the customer require?
12.5
5. How many cartridges should you provide to transcode files?
38
12.5/.33 ≈ 37.5. However, you must, of course, round up to an integer.
6. How many Moonshot chassis should you provide? Remember to include the cartridge for hosting the WFS controller.
You must provide one chassis.
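The ratio arithmetic from questions 3 through 5 looks like this in code (a minimal sketch of the calculation described above):

```python
import math

def realtime_ratio(source_minutes, transcode_minutes):
    """Minutes of source video transcoded per wall-clock minute."""
    return source_minutes / transcode_minutes

per_cartridge = realtime_ratio(60, 180)      # 0.33 per m710 cartridge
required = 12.5                              # customer's required ratio
print(math.ceil(required / per_cartridge))   # 38 cartridges (37.5 rounded up)
```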

Plan storage
Create a plan for how you will provide storage for the solution. Include the connections between the HPE Moonshot solution and the storage.
You should plan to connect the single chassis directly to the storage solution. You can use HDDs on an SL4540 3x15 to provide enough capacity. Alternatively, you could use an Apollo 4200. Your plan should resemble this:
Figure B-5: Plan storage for the video transcoding solution
Chapter 7—Activity 3

Gather information
1. What should you discuss with the customer to gain a better idea about the workload and the solution requirements?
You need to discuss which applications the XenApp solution will support, as well as which applications can benefit from GPU acceleration. You also need to know how many users use the various applications. This information will help you to scope the correct number of cartridges. You might also want to discuss how employees use the applications. Do employees work mostly at the same times? Are there busier times of the day when all employees are working at once or bursts of activity when all users log in?
You can discuss the local storage requirements for the application servers. After provisioning, will the server need to boot from a local disk? Will applications be installed on the server locally? Will applications save any data locally? Will the VDA server be connected to network-attached storage (NAS)? Which of the 120GB, 240GB, or 480GB SSDs best fits the company’s needs?
You also should discuss the customer’s availability requirements. Can the solution tolerate the loss of a cartridge node if a link to that node fails? Can it tolerate loss of connectivity to an entire chassis?
You should discuss the bandwidth requirements. How much bandwidth does the customer expect each user will require? Considering that a cartridge may host 40 to 50 sessions, will a cartridge require 10GbE or 1GbE links? What level of oversubscription can the solution tolerate? How will the solution fit into the data center infrastructure? Does the company support 40GbE or 10GbE to the rack? Should you propose a TOR switch?
You should also discuss how the customer IT staff plans to set up and provision the VDA services on the Moonshot cartridges. What type of management access does the staff require? How automated are the IT processes?
2. What should you discuss to set the CIO’s mind at ease about the solution?
You should discuss what problems users have been experiencing in the pilot. You should explain that the Moonshot solution uses physical GPUs to prevent performance issues. HPE partners with Citrix, and Citrix has demonstrated excellent performance for XenApp for companies like this one. You can also discuss support options that HPE provides, such as Datacenter Care, in which HPE becomes the single point of contact.
Plan the solution
1. Which applications will benefit from GPU acceleration?
Adobe Photoshop, SolidWorks eDrawings Viewer, and Microsoft Skype for Business video conferencing
2. Which type of HPE Moonshot cartridge should you recommend for the XenApp Virtual Desktop Agent (VDA) servers, which support the user sessions?
You should choose m710 cartridges. The m710 cartridge is a good choice due to the need to support 3D design, which could tax the GPU as well as take up more bandwidth. The customer mentioned a preference for 10GbE connections. The m710p is an alternative.
3. What is your initial estimate for the number of cartridges required for the XenApp VDA servers?

99 m710 cartridges Most users (3,750) use media-rich applications such as Photoshop, eDrawings Viewer, and Skype for Business video conferencing. As a rule of thumb, you can estimate that an m710 can support 40 such users, so 94 cartridges are required for these users (3750/40 = 93.75). The remaining 250 employees have lower requirements. You can estimate one cartridge per 50 users, so you need another five cartridges for 99 total. 4. Remember also that the company is growing. What percentage does HPE recommend adding to the solution to account for this growth? How many cartridges total should you recommend for VDA servers? HPE recommends adding 20 to 25 percent. Because this company is growing rapidly, you should probably add 25 percent to 99, bringing the total to 124 cartridges. 5. The company also requires three servers to host VMs for the XenApp controller and Netscaler. Which cartridge provides the best choice for hosting these VMs? a. m300 b. m700 c. m800
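The cartridge estimate in questions 3 and 4 follows from the rules of thumb above (40 media-rich users or 50 light users per m710, plus 25 percent growth headroom); a minimal sketch:

```python
import math

def vda_cartridges(heavy_users, light_users,
                   heavy_per_cartridge=40, light_per_cartridge=50,
                   growth=0.25):
    """Estimate m710 cartridges for XenApp VDA servers."""
    base = (math.ceil(heavy_users / heavy_per_cartridge)    # 94
            + math.ceil(light_users / light_per_cartridge)) # + 5 = 99
    return math.ceil(base * (1 + growth))                   # 124

print(vda_cartridges(3750, 250))  # 124
```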

6. With the three additional cartridges, how many total cartridges have you planned? How many HPE Moonshot 1500 chassis are required?
You should have planned 127 cartridges, which will fit in three chassis.
Plan the networking
Based on the requirements that you gathered, plan the following:
• Switch modules – Type of switch module – Number of modules per chassis
You should select HPE Moonshot 45XGc switches to provide the 10GbE connectivity for the m710 cartridges. The customer does not require link redundancy for individual cartridges, so only one module per chassis is required.
• Uplink modules – Type of uplink module – Number of modules per chassis – Number of uplinks to connect
To meet the oversubscription requirements, the switch module must establish either six 10GbE links to the TOR switches or two 40GbE links (450/8 = 56.25). Because you are proposing the TOR switches, you can propose HPE 5930 switches, which support 40GbE to the Moonshot chassis. The HPE Moonshot 4-QSFP+ Uplink Module might be the better choice for the simplicity of fewer cables and ports. You need just one uplink module per chassis because you are using just one switch module.

• IRF or stacking – Do you plan to use this feature? – If you do, which modules will you combine and how will you link the modules?
You need to use two links on each chassis to meet the oversubscription requirements, so you don’t need to use IRF to consolidate. You could use IRF to make it simpler to manage the switch modules. You could combine all three switch modules using a ring topology and the two extra 40GbE ports not required for uplinks on each uplink module.
Plan provisioning
1. Server administrators explain that they plan to use WDS to provision cartridges with their Windows Server 2012 R2 images. They will then use Citrix Machine Creation Services (MCS) to deliver an image with proper applications to the VDA servers. MCS creates a thin-provisioned clone of a master image, which the hypervisor on the VDA server hardware uses to create a VM for the VDA server. The alternative solution is Citrix Provisioning Server (PVS), which uses a dedicated server to deploy images to physical servers or VMs. Why might you recommend that the customer use PVS instead of MCS?
Each HPE Moonshot m710 cartridge is a compact SoC designed to act as one VDA server. The cartridge can run a Microsoft Hyper-V hypervisor and host a VM for the VDA server. However, this design adds a layer of complexity and overhead, so typically it is better for the m710 to act as a physical server. MCS only works with virtualization, while PVS can stream an image to a physical server.
Run tests
1. You have set up a POC with your proposed cartridges and deployed the proper OS and customer applications. What guidelines will you follow as you conduct tests?
You should discuss performance expectations with the customer. For example, you should agree on a maximum response time. You should test with a tool such as LoginVSI, which simulates multiple user sessions. You should set up the tests such that LoginVSI uses the types of applications that the customer indicates employees will use. You should monitor resource utilization as you add users to the test. You can tell that a resource has become a bottleneck when you see that its performance stops increasing with the addition of more users and instead plateaus. You should also monitor response time. When the response time exceeds the customer requirements or when a resource, such as CPU, reaches a bottleneck, you have determined the true maximum number of users that the cartridge can support. You should also test with multiple cartridges in a fully populated chassis to verify that the solution scales as expected. If the maximum number of users per cartridge differs from your estimate, you should adjust the number of cartridges in the plan accordingly.

Chapter 7—Activity 4

Table B7-1: HPE Moonshot transcoding performance

                                            HPE Moonshot solution                    Equivalent traditional server solution
Solution                                    1 chassis (43 ProLiant m710 cartridges)  14 x DL360 Gen9 servers
Performance (real-time transcoding ratio)   14.1                                     14.4
Rack units                                  4.3                                      14
Performance/U                               3.28                                     1.03
Power consumption                           3195W                                    5878W
Performance/W                               0.0044                                   0.0024
Cost                                        $333,462                                 $514,643
Performance/$                               4.23e-5                                  2.80e-5

Moonshot provides 3.2 times more performance/U, 1.8 times more performance/W, and 1.5 times more performance/$.

Chapter 8 This section provides answers for activities in Chapter 8. Chapter 8—Activity 1 Identify scenarios for HPE Integrity Superdome X * A healthcare institution has acquired several new hospitals. It is standardizing patient records on a single Electronic Healthcare Record solution and needs to update its infrastructure for this solution. ___An automotive manufacturer needs a better platform for an Electronic Design Automation (EDA) solution. * An automotive manufacturer needs a better platform for its SAP Enterprise Resource Planning (ERP) solution. ___A media company is beginning to live stream content digitally and needs a solution that can meet the high demands of live transcoding. * A retail organization wants to mine structured data in its data warehouse for help in making business decisions. It plans to deploy an Oracle OLAP cube for this purpose. ___A social media site is deploying Cloudera HBase to organize its unstructured data for faster processing and analysis. * A financial institution is rolling out a new Temenos core banking solution so that it can provide more online banking services to its customers. * A government department has developed an application in-house for processing license fees. Recently the servers hosting the application experienced unplanned downtime.

___ A software development company wants to boost employee productivity and reduce capital expenditures by creating an open floor space for developers. As part of the initiative, the company wants to move from traditional managed desktops to desktops hosted in the data center.
* A call center uses a resource management solution to manage its workforce. However, the solution is reaching its capacity, and the call center needs a refresh.
* A retail organization uses sensors to track inventory and plan just-in-time stocking. But the current solution sometimes loses data, causing managers to make incorrect decisions.
___ A university is attempting to virtualize its data center services, as well as create a self-service model for deploying applications.

Discuss deployment drivers

You can list a great many different questions along these lines:
• Attempts to discover whether the customer needs the scalability or relatively low TCO of Integrity Superdome X:
  – How quickly is the company growing? (You might be able to investigate this information on your own.)
  – Are employees using the solution more often?
  – Has the company recently acquired or merged with another company?
  – Is this project part of a larger project to standardize hardware?
• Attempts to discover whether the customer needs the performance and high-quality experience provided by Integrity Superdome X:
  – How responsive is the current solution?
  – Do employees ever complain to IT about the ERP solution?
  – Do employees avoid using the ERP solution because it responds too slowly?
  – Can the current solution handle the complex ERP functions that bring the most business value?
• Attempts to discover whether the customer needs the availability of Integrity Superdome X:
  – Has the current solution lost any data? How did the data loss affect business processes? Did it lead to wasted employee time? Did it lead to lost revenue?
  – Has the current solution experienced unplanned downtime? How did the downtime affect business processes? Did it lead to wasted employee time? Did it lead to lost revenue?

Chapter 8—Activity 2

Plan nPartitioning and memory

You are planning to propose an HPE Integrity Superdome X server to meet the customer's needs. Answer these questions about the solution.

1. Your sales partner has told you that the SAP ERP solution must support 30,000 users. What additional information might you want to discuss with the customer to flesh out this requirement?

You should determine whether this number takes future expansion into account or whether you should pad the number further. You should also verify whether the user count is the ultimate factor by which the solution's success will be judged or whether the customer will also use a metric such as SAP Application Performance Standard (SAPS). You might suggest other questions as well.

2. Estimate whether an HPE Integrity Superdome solution with the customer's proposed specifications will meet the customer's needs.
a. Visit http://global.sap.com/solutions/benchmark/sd2tier.epx.
b. Search for Integrity Superdome and find a row that lists the customer's solution.
c. Examine the specifications and the number of supported users. Do you feel confident that a solution with the customer's minimal requirements will meet the needs? Do you want to propose exceeding the requirements in any way?

A 16-processor solution with 4TB supports up to 100,000 users, so an 8-processor one should support 30,000 users well and still provide good performance in case a processor fails. 1TB of memory might be adequate, or you might propose an alternative 2TB solution to provide more capacity and redundancy.

3. Which server blade and processor will you propose? Table 8-2 at the end of Chapter 8 gives an overview of options. More than one option might be valid. Explain your choice.

The customer wants Xeon E7 v3 processors, so you must select HPE BL920s Gen9 blades. You have several choices for processor. You should select a processor with many cores for the massively multithreaded application. The Intel Xeon Processor E7-4850 v3, E7-4880, or E7-4890 could be good choices.

4. How many nPartitions will you create?

You should create two nPartitions, each with four blades and eight processors.

5. Which license does the HPE Integrity Superdome X solution require?

The Advanced Partition License.

6. Which slots will you combine in each nPartition?

One nPartition uses slots 1, 3, 5, and 7. The other nPartition uses slots 2, 4, 6, and 8.

7. How much memory will you recommend? Create a table similar to Table 8-1. Remember that you must install memory in increments of 16 DIMMs (one DIMM in each of eight channels for two processors). You can use this strategy to plan:
  – Fill in the total capacity that you require.
  – Divide that capacity by the number of blades in the nPartition.
  – In Figure 8-18 in Chapter 8, find the row with the value that matches the capacity required per blade (round up).
  – Fill in the rounded-up capacity per blade. Then fill in the other cells using the table.

You might recommend 1TB or 2TB.

Table B8-1: Memory plan

| DIMM capacity | Number of DIMMs per blade | Capacity per blade | Total capacity per nPartition |
|---|---|---|---|
| 16GB or 32GB | 16 | 256GB or 512GB | 1TB or 2TB |
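The rounding strategy in question 7 is straightforward to mechanize. Here is a minimal Python sketch of the calculation, using one valid combination from Table B8-1 (a 2TB nPartition built from 32GB DIMMs); the numbers are illustrative of the method.

```python
# Divide the target nPartition capacity across its blades, then round
# each blade up to a whole 16-DIMM increment (one DIMM per channel
# across two processors), per the strategy in question 7.
import math

DIMM_GB = 32              # chosen DIMM size (16GB or 32GB are the options)
DIMMS_PER_INCREMENT = 16  # minimum installable memory increment per blade
blades_per_npar = 4
target_tb = 2             # target capacity per nPartition

per_blade_gb = target_tb * 1024 / blades_per_npar
increment_gb = DIMM_GB * DIMMS_PER_INCREMENT
increments = math.ceil(per_blade_gb / increment_gb)

print(f"Per blade: {increments * DIMMS_PER_INCREMENT} x {DIMM_GB}GB DIMMs "
      f"= {increments * increment_gb}GB")
print(f"nPartition total: {increments * increment_gb * blades_per_npar}GB")
```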

Chapter 8—Activity 3

Figure B-6: Mapping between a blade's FlexibleLOM or mezzanine ports and the ICM bays

Chapter 8—Activity 4

1. Refer to the customer requirements, the FlexibleLOM requirements in Chapter 8, and Table 8-3 in Chapter 8. Select a FlexibleLOM option and indicate the number that you will install.

Select an HPE Ethernet 10Gb 2-port 560FLB Adapter. The customer doesn't need FlexFabric features such as FCoE because you're proposing direct FC connections to a 3PAR storage array. You must install one adapter per blade, so you will plan four FlexibleLOMs for each nPartition even though this provides more ports than the customer requires.

2. Which interconnect module bays provide the uplinks for these ports?

Bays 1 and 2.

3. Which module or modules will you plan for these bays? Refer to Table 8-4 at the end of Chapter 8 for options.

The HPE 6125XLG is probably the best choice. It aggregates the links into 40GbE links for fewer cables. (However, you could also use a 10GbE pass-through module if the customer wants to control networking at the ToR switch.) You should plan two modules for redundancy purposes, because redundancy is very important for this customer.

4. How will you set up NIC teaming on each nPartition? What should you discuss with the network administrator for the switch module configuration?

All 16 ports can be bonded. You can discuss with the customer what type of NIC teaming they prefer. Active/standby could work because the server has enough bandwidth. If the customer wants load balancing, LACP is probably the best choice. LACP mode requires the switch modules to be aware of and implement LACP as well. The network administrator would need to combine the switch modules into an IRF fabric.

5. The system needs high-bandwidth connectivity to the 3PAR storage array: a two-port 16Gbps FC adapter on each server blade. What is a valid mezzanine slot for the adapters?

This adapter is a Type A adapter that can work in any of the slots. You might choose the first mezzanine.

6. Which interconnect module bays provide the uplinks for ports installed in this mezzanine?

If you choose the first mezzanine, bays 3 and 4.

7. Which switch modules will you plan for these bays?

You can choose either SAN switch. The Brocade 16Gb/16c Embedded SAN Switch provides more uplink FC bandwidth.

8. What requirement for an nPartition's image can affect your plan for FC connections?

At least one port per nPartition must be dedicated to FC boot. This port must connect through the interconnect SAN switch module to a supported SAN solution for booting the nPartition.

Chapter 8—Activity 5

Your presentation will be unique. It should mention RAS features such as:
• Firmware first
• Self-healing capabilities
• Error Analysis Engine monitoring of components and deactivation of components before failure
• Xeon E7 features such as error containment
• Superdome X firmware being specially designed to work with the Xeon E7 features and resolve detected errors
• Memory scrubbing
• Enhanced double device data correction (DDDC+1)

• Clock redundancy
• Fault-tolerant fabric
• Error isolation through hardware partitioning
• Health Repository to inform admins of issues
• Error Analysis Engine integration with HPE Insight Remote Support

Chapter 9

This section provides answers for the Chapter 9 activities.

Chapter 9—Activity 1

In this activity, you will review what you have learned about iLO in prerequisite training. Prepare a presentation about iLO as if you were presenting the benefits to a customer. Point out how iLO aids in:
• Setting up and provisioning servers
• Monitoring servers
• Diagnosing and troubleshooting servers
• Connecting to support services

Your presentation should touch on benefits such as:
• HPE Intelligent Provisioning removes many of the complexities of provisioning bare metal servers, helping integrators to bring systems online faster.
• The iLO management engine allows agentless management. Hardware monitoring and alerting are available as soon as the server is powered on and has an Ethernet connection, regardless of the OS state. The monitoring can help administrators optimize the solution, while alerting can help administrators find issues and prevent unplanned downtime.
• Active Health automatically monitors the ProLiant servers' status across 1600 data points, also helping administrators to find issues and prevent unplanned downtime.
• HPE iLO integrates with HPE Insight Remote Support. It monitors servers and sends proactive alerts to administrators, helping them to prevent or quickly resolve issues.

Chapter 9—Activity 2

Read each scenario below. Then select the solution or solutions that you should recommend to meet the customer needs. You can select more than one letter if the scenario calls for more than one solution.

Solution choices
a. HPE Moonshot Provisioning Manager (MPM)
b. HPE mRCA
c. HPE Insight Cluster Management Utility (CMU)

d. HPE Helion CloudSystem Enterprise
e. HPE Advanced Power Manager (APM)
f. HPE Superdome Onboard Administrator (SD OA)
g. HPE Smart Update Manager

Scenarios

1. You are proposing one HPE Moonshot Chassis to a customer who needs an application virtualization solution. The customer needs a quick way to provision cartridges with Windows Server 2012 R2, the XenApp Desktop Virtual Agent (DVA), and other supporting tools. The customer does not have another solution for this provisioning process and wants you to provide it.

a, b — The system integrators can use HPE mRCA to provision one cartridge with Windows Server 2012 R2, XenApp DVA, and other supporting tools. Then they can use HPE MPM to clone that image to other servers. HPE Insight CMU (c) is an alternative option to MPM, but the customer is focused on provisioning just one chassis, so MPM is probably a better solution.

2. You are proposing five HPE Apollo 6000 chassis to a customer. The customer is focused on simplifying operations for the small IT team and wants a solution that allows admins to monitor hardware components and set power policies for servers in all five chassis at once.

e — HPE APM simplifies rack management by enabling customers to connect to a single console to manage the connected Apollo chassis. You could also suggest Insight CMU (c), but that doesn't meet the specific requirements of the customer.

3. You are proposing an HPE Integrity Superdome X solution to a customer. The customer needs an easy-to-use management solution that allows admins to create nPartitions and to set up notifications in cases of component errors.

f — The SD OA offers a built-in, always available platform and partition management system.

4. A customer has a data center with HPE Integrity Superdome X and HPE Apollo 6000 Systems. The customer needs a single solution for patching and updating the firmware on all the systems.

g — HPE SUM supports deploying updates to these systems.

5. You are proposing three HPE Moonshot Chassis to a customer. The customer wants to set up a self-service solution for deploying web server VMs to the cartridges. Your solution must also help the customer provision the cartridges with the hypervisor.

a (or c), d, (b) — HPE MPM will provision the hypervisor OS. HPE Helion CloudSystem Enterprise provides the self-service solution. Again, HPE Insight CMU is a possible alternative to MPM. (Note that the scenario doesn't specifically call for HPE mRCA, but it is often a good tool to include with any proposal.)

6. You are proposing three HPE Moonshot Chassis to a customer. The customer's server administrators need a way to troubleshoot and debug a cartridge that is not functioning correctly.

b — HPE mRCA provides a Debugging Tool.

7. You are proposing 12 HPE Moonshot Chassis and 6 HPE Apollo 4200s for a big data and analytics solution. The customer wants to get the solution up and running as quickly as possible. The customer also wants a way to monitor the Moonshot cartridge and Apollo XL server performance and resource utilization.

c (and b) — HPE Insight CMU provides the desired features. (Note that the scenario doesn't specifically call for HPE mRCA, but it is often a good tool to include with any Moonshot proposal.)

Chapter 10

Chapter 10—Activity

This section only includes the steps at which you took notes for your presentation. It provides indications of the types of notes that you might have made. Of course, the answers will not match yours precisely, and the values indicated might differ from yours if you are using a different currency or measurement system.

1. Scroll down to the results. Note the high-level comparison, including the number of racks required, the number of cores provided, the number of Watts consumed, and the total cost of ownership. Begin planning how you will use this information in your presentation to the CFO.

Figure B-7: Results – Three Year Analysis

The HPE solution uses fewer racks (2 versus 3) but provides more cores per rack (560 versus 320). It also draws less power per rack (19,250W versus 20,000W, or 750W less), and the power savings grow when you consider that the HPE solution only uses two racks. The SuperMicro solution draws 60,000W (20,000 x 3 racks), while the HPE solution draws only 38,500W (19,250 x 2 racks). Therefore, HPE saves the customer 21,500W and 279,454 pounds of carbon a year. In terms of TCO, the HPE solution is $203,919 lower, or about 11 percent lower.
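A quick way to double-check the power and carbon figures above is to recompute them from the per-rack values the calculator reports:

```python
# Recompute the deltas from the per-rack values in the report.
hpe_racks, sm_racks = 2, 3
hpe_watts_rack, sm_watts_rack = 19_250, 20_000
hpe_carbon_rack, sm_carbon_rack = 250_210, 259_958  # lbs of carbon per year

watts_saved = sm_racks * sm_watts_rack - hpe_racks * hpe_watts_rack
carbon_saved = sm_racks * sm_carbon_rack - hpe_racks * hpe_carbon_rack
print(f"Power saved:  {watts_saved:,} W")        # 21,500 W
print(f"Carbon saved: {carbon_saved:,} lbs/yr")  # 279,454 lbs
```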

2. For which types of costs does the HPE solution provide a higher cost? For which types does it provide a lower cost? Begin to plan how you will discuss the comparison with the CFO.

The HPE solution has slightly higher initial expenses, or capital expenditures (CAPEX), in most areas. However, the ongoing operational expenditures (OPEX) are lower.

3. Use this graphical representation to draft an explanation for the CFO about the difference in TCO of an HPE Apollo 6000 solution after just three years.

You should point out the lower TCO and the areas in which the HPE solution provides the most significant savings: power and cooling costs, power infrastructure costs, and networking (if you have accepted the default option).

Figure B-8: Total Cost of Ownership graph

4. Use this graph to help you draft an explanation of how this service can help decrease the impact on the customer's cash flow for a single year.

The graph shows how HPE Financial Services (FS) helps to spread the expenditures over three years. Some customers prefer this model so that the solution has less of an impact on a single year's budget.

Figure B-9: HPE Financial Services option

Talk with the CFO

Although this activity does not have fixed answers, this section provides some ideas for answers that you might have provided.

1. How does the HPE Apollo 6000 solution help to support the company's environmental initiatives?

The report shows that the HPE Apollo 6000 solution has significantly lower power usage than the competing solution, which means that the solution will use fewer energy resources and create a lower carbon footprint. The SuperMicro solution draws 60,000W (20,000 x 3 racks), while the HPE solution draws only 38,500W (19,250 x 2 racks). Therefore, HPE saves the customer 21,500W and 279,454 pounds of carbon a year.

2. Based on what you learned earlier about the benefits of HPE Apollo 6000 and ProLiant XL220a systems, how will the HPE solution deliver a favorable ROI?

Other companies with similar EDA applications have found that they could increase performance by 35 percent with the HPE Apollo 6000 and ProLiant XL220a solution. This increase in performance translates to greater employee productivity because designers don't have to wait as long for their jobs to complete. Greater productivity leads, in turn, to the ability to bring products to market more quickly. The customer might have a metric for calculating the value of each added hour of employee productivity.

3. Will you recommend the solution to the CFO?

Yes, you should recommend the solution. Although the initial capital expenditures (CAPEX) for the HPE solution are higher than those for the competing solution, the lower operating expenditures (OPEX) give the HPE solution a lower TCO within three years. In fact, the customer will save 11 percent with the HPE solution. If the customer keeps the solution longer than three years, the difference in TCO becomes even greater. (The CFO might not mind that the HPE solution has a higher CAPEX, as opposed to OPEX, because CAPEX can be depreciated and deducted from taxes each year over the solution's lifetime.)

4. Prepare a pitch to convince the CFO of the benefits of this solution.

Your pitch will be unique, but it should incorporate the concepts in the questions above, as well as the elements that you planned during the Chapter 4 activities.

Example report from Chapter 4—Activity 3

Proposal to [Customer] for Enterprise Servers, Storage and Networking from My Company

[Insert benefit/theme statement here – up to 3 lines]

July 9, 2016

[Insert Customer Project Reference Number (if applicable)]

Important Notice

The information (data) contained on all sheets of this document/quotation constitutes an offer from My Company and NOT from Hewlett Packard Enterprise Company (hereinafter "HPE"). It contains confidential information of [your company] and/or of HPE and is provided for evaluation purposes only. In consideration of receipt of this document, the recipient agrees to maintain such information in confidence and to not reproduce or otherwise disclose this information to any person outside the group directly responsible for evaluation of its contents, unless otherwise authorized by [your company] in writing. Neither My Company nor HPE shall have any liability to the recipient as a result of the use of the information provided. Only a mutually agreed-upon written definitive agreement, signed by the authorized representatives of the parties, shall be binding on My Company. Additional terms of use and disclosure not specified in this Notice may be added by My Company. If there are any concerns, questions, or issues regarding this Confidentiality Notice, please contact your My Company sales representative.

Table of Contents

Executive Summary

The HPE ProLiant Gen9 portfolio delivers the right compute for the right workload at the right economics—every time.

According to recent IT surveys from leading industry analysts and consultants, business executives believe that information technologies will play a substantial role in transforming their business over the next five years. At the same time, many do not believe that their IT organizations are able to deliver services rapidly enough for their desired business outcomes. Thus, the gap between business demand for simple, fast, cost-effective, value-added services and IT supply continues to grow. Bridging that gap will help you deliver compelling business outcomes with faster, value-added services at greater efficiency for your business.

It is time to bridge the gap by thinking about IT in an entirely new way. A new approach is needed, starting with the heart of your infrastructure—compute—the vast pool of processing resources that can be located anywhere, scaled to any workload, and available at all times to fuel business growth. Business transformation begins with compute transformation because compute runs the applications that run your business.

HPE ProLiant Gen9: A New Approach to Compute

In the compute era, processing is not defined in terms of discrete systems and silos. To move your IT infrastructure in lock-step with your business, the focus is on relentlessly driving the lowest cost, fastest time, and highest value of service delivery. HPE ProLiant Gen9 servers are designed for this new era. Hewlett Packard Enterprise (HPE) gives you the simplicity and freedom to build IT that is truly the best fit for your business.

Breakthrough innovation in cost of service, time to service, and value of service

Redefine Compute Economics

As data centers grow, power and cooling costs take an ever larger bite out of the IT budget. Automated energy optimization features of HPE ProLiant servers help you lower your use of space, power, and cooling. For example, you can:
• Get 2x more compute per watt per dollar using proven and reliable HPE SmartMemory and 12Gb SAS solid-state drives (SSDs), which go through a rigorous qualification process of up to 2.4 million test hours.1
• Use 50 percent less space and gain back 60 percent savings in energy costs with HPE StoreVirtual VSA, plus co-locate apps and storage on servers to lower capital expenditures (CAPEX) by 80 percent.2

These industry-leading innovations free up time and save money, both of which can be reallocated to other projects to drive both innovation and more efficient operations.

Boost Business Performance

Only HPE helps deliver services at the speed your organization demands, with data center infrastructure technologies that boost workload performance, enabling your data center to meet the needs of today and tomorrow.
• Faster memory performance with HPE DDR4 SmartMemory at up to 2,133 MHz—14 percent better memory performance for HPE ProLiant rack and tower servers and 33 percent better memory performance for HPE ProLiant blade servers.3
• One million IOPS supported with 12Gb controllers.4

Whether you are addressing technical computing challenges, enabling cloud deployments, delivering intelligent storage, or powering design automation and data analytics, HPE ProLiant servers allow you to enjoy better-than-ever performance.

Accelerate Service Delivery

Speeding up service delivery will help you keep pace with workload demands while lowering costs. Combined with the introduction of HPE ProLiant Gen9 is HPE Server Management, an industry-leading infrastructure management innovation. HPE Server Management applies a software-defined approach to converged management and is best suited for managing HPE BladeSystem and HPE ProLiant rack and tower servers. HPE Server Management offers out-of-the-box integration with HPE, VMware®, Microsoft®, and Red Hat® enterprise management solutions, as well as easy integration with many other management products. Architected to include open, industry-standard RESTful application programming interfaces (APIs) that enable IT staff to quickly and securely customize provisioning of the Gen9 portfolio, HPE Server Management also provides a common language and interface for integrating into cloud-based environments like OpenStack. HPE Server Management is designed to be:
• Simple—One platform for converged management leads to a 50 percent reduction in management tools to license, learn, operate, and maintain.5 And it is simple enough to interoperate with your onsite IT management standards.
• Automated—Ushering in a software-defined approach to infrastructure management leads to a 66 percent increase in IT service delivery speed, enabling real competitive advantage and improved service-level agreement (SLA) performance.6
• Agile—It now takes just minutes (vs. hours) to update firmware across hundreds of servers, allowing for enterprise data center management at scale and speed.7 And it is agile enough to scale down to meet the needs and budget of small- to medium-sized business (SMB) IT operations.

What HPE Customers Are Saying

"One of our biggest server challenges is getting our systems up and running in the least amount of time. As a global provider of business and entertainment gaming solutions, we need to make sure our systems are set up for our customers as fast as possible. Internally, we have very tight testing schedules on a wide variety of platforms, and time is money, so any reduction in steps and time for provisioning directly affects the bottom line.

"We're excited about the software advantages of these new HPE ProLiant Gen9 servers with UEFI and Intelligent Provisioning. We were able to configure our servers using UEFI in fewer steps than legacy BIOS, and we were able to upgrade firmware automatically before installing the OS with Intelligent Provisioning. As a result, we effectively reduced server deployment time by 30%. In our line of work we deploy hundreds of servers a year; HPE is saving us valuable time and resources that can go towards innovation."
— Mike Owens, Manager, IT Lab Services, Bally Technologies

"With HPE compute we can scale up to millions of transactions a day, deploy servers in virtualized environments in seconds, and provision applications in minutes, all while anticipating our future needs. This has allowed us to cut down our mobile application development and deployment life cycle by 30% to 90 days (on average) and grow revenue 50% year over year to our technology and software consulting subsidiary, Redstone Consulting Group LLC."
— Harry Gunsallus, EVP and CIO, Redstone Federal Credit Union

Why HPE?

Only HPE delivers a converged platform to provision, deploy, automate, monitor, and troubleshoot all of your IT infrastructure resources. This gives you the ability to meet your operational goals in the most cost-effective and reliable manner, while lowering the cost of your services. HPE ProLiant Gen9 builds on the advancements in Gen8 and continues to lower your operating expenses, increase your performance, improve efficiency, and reduce your support costs. Built on the experience of over 37.1 million ProLiant servers and 4.4 million blade servers shipped,8 the number one server brand for 76 consecutive quarters, and 75 years of leadership, HPE ProLiant Gen9 servers give you the competitive advantage you need to manage your business. When you invest in the latest HPE ProLiant technology, you are getting the best return for your budget and the highest quality experience over the lifecycle of your systems. Run your business with confidence when upgrading to the next generation of HPE ProLiant technology.

HPE Apollo 6000 System

The Apollo 6000 System optimizes performance, efficiency, and total cost of ownership (TCO) across racks for single-threaded workloads like electronic design automation (EDA), risk modeling, and life sciences. By taking a rack-level solution approach, it delivers more performance per core using less energy in less space than traditional servers. Right-sized for single-threaded applications, the Apollo 6000 System packs up to 160 servers per 48U rack, and its external power shelf dynamically allocates power to help maximize performance and energy efficiency to lower costs. Flexibility in networking connections allows you to further optimize performance and costs by selecting the connectivity option that is just right for your workload.

Rack-scale efficiency and flexibility

Maximize Performance per Core, per Watt, and per Square Foot

The HPE Apollo 6000 System gives you the flexibility that leads to savings:
• Per core—The ProLiant XL220a Server tray has two 1P servers per tray with Intel® Xeon® E3-1200 v3 series processors with up to four cores, increasing performance per core by up to 35 percent for single-threaded applications over a 2P blade. The ProLiant XL230a Gen9 Server tray has one 2P server per tray with high-performance Intel® Xeon® E5-2600 v3 series processors delivering up to 70 percent more processor performance, with up to 36 percent more efficiency than the previous generation.9
• Per watt—The external power shelf supplies up to six chassis, and the HPE Advanced Power Manager dynamically allocates power to save on energy.
• Per square foot—With 10 slots for server, storage, and/or accelerator trays per 5U chassis, you can fit up to 160 servers in one 48U rack, using 60 percent less space than competing blades (the rack-level arithmetic is sketched after the chassis description below).
• With flexibility—HPE rack-level innovations provide the flexibility to fit 20 servers in the space of 5 traditional servers (5U) and power up to 120 servers with a single power shelf. The HPE Networking Innovation Zone also allows for NIC and FlexLOM options to fit your workload needs while increasing cost savings.
• With savings—Take advantage of compute, storage, and accelerator tray options as they become available, in the same modular HPE Apollo a6000 Chassis.

HPE Apollo a6000 Chassis

The HPE Apollo a6000 Chassis is designed with density optimization to help manage scaling as your business computing demands grow. When your business depends on single-threaded applications such as electronic design automation (EDA) and financial services Monte Carlo simulations, the HPE Apollo a6000 Chassis and the servers it supports give you the power you need. With the ability to grow up to 20 servers (depending on server trays), you have ultimate flexibility. Cooling concerns are reduced by five dual-rotor fans sharing cooling zones, and power is managed by an HPE Advanced Power Manager (APM) and the HPE Apollo 6000 Power Shelf. Together they provide the performance you need at a price you can afford. Each HPE Apollo a6000 Chassis is built with the following:
• Up to 10 tray bays per chassis
• Up to 10 network I/O module bays per chassis
• Up to four (4) 12V DC input cables, providing up to 5760 W of power for the chassis (depending on power supplies)
• HPE Thermal Logic technology for lower power consumption and airflow
• Five (5) dual-rotor fan assemblies as standard for redundancy and improved power consumption and acoustics
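To make the density figure concrete, here is a rough Python sketch of the rack-level arithmetic behind the "160 servers per 48U rack" claim. The packing is an illustrative layout, not an HPE reference design, and it assumes four chassis per power shelf (the figure used in the power shelf description that follows).

```python
# Illustrative rack packing behind the "160 servers per 48U rack" claim.
# Assumptions: XL220a trays (2 x 1P servers), 10 trays per 5U chassis,
# 1.5U power shelves, four chassis per shelf (per the section below).
CHASSIS_U, SHELF_U, RACK_U = 5, 1.5, 48
TRAYS_PER_CHASSIS, SERVERS_PER_TRAY = 10, 2
CHASSIS_PER_SHELF = 4

chassis = 8                                 # target: 8 chassis per rack
shelves = -(-chassis // CHASSIS_PER_SHELF)  # ceiling division -> 2 shelves
used_u = chassis * CHASSIS_U + shelves * SHELF_U
servers = chassis * TRAYS_PER_CHASSIS * SERVERS_PER_TRAY
print(f"{servers} servers in {used_u}U of a {RACK_U}U rack")  # 160 in 43.0U
```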

HPE Apollo 6000 Power Shelf

The HPE Apollo 6000 Power Shelf can support up to four HPE Apollo a6000 Chassis,10 with up to 15.9 kW of DC power. The HPE Apollo 6000 Power Shelf, with its redundant hot-plug power supplies, can be configured for single-phase or three-phase input. It offers efficient and redundant power for your data center to support high-performance computing (HPC).
• 1.5U tall
• Efficient pooled/shared power infrastructure
• Holds up to six power supplies
• Supports N+1 or N+N redundancy

HPE ProLiant XL220a Gen8 v2 Server

The HPE ProLiant XL220a Gen8 v2 Server has two single-socket servers in each front-accessible server tray. Each node delivers value for performance with Intel® Xeon® E3-1200 series processors, 4 DDR3 DIMMs, 32 GB maximum memory, and up to two small form factor (SFF) hot-plug hard disk drives (HDDs) or solid-state drives (SSDs). Run up to 20 of these hot-swap 2 x 1P servers in an HPE Apollo 6000 System—160 in a standard 48U rack—with rear cabling for cold-aisle serviceability. Built-in Gen8 technologies increase system uptime:
• HPE SmartMemory prevents data loss and downtime with enhanced error handling and improves workload performance and power efficiency.
• HPE SmartDrive technology improves serviceability and prevents data loss with features such as an icon-based status display.
• A dedicated iLO connection provides faster and more secure data transmission.

HPE ProLiant Apollo 6000 Servers at a glance

HPE Advanced Power Manager

HPE Advanced Power Manager makes it easy to see and manage server, chassis, and rack-level power from a single console, while reducing spend on rack infrastructure and flexing to meet workload demands with dynamic power allocation and capping. The HPE Advanced Power Manager (HPE APM) is an optional rack-level solution. HPE APM automatically discovers hardware components and enables server-level power on and off, server metering, aggregate dynamic power capping, configurable power-up dependencies and sequencing, consolidated Ethernet access to all resident iLOs, and asset management capabilities.

HPE Integrated Lights-Out (iLO)

The HPE iLO management processor is the core foundation for the HPE iLO portfolio. HPE iLO management processors for HPE ProLiant servers simplify server setup, engage in health monitoring of power and thermal control, and promote remote administration for every HPE ProLiant server. HPE iLO is the intelligence that drives and manages all the server components and how they interact with each other. With HPE iLO 4, you can have the same homogeneous server management experience across all HPE ProLiant Gen8 servers.

High Performance Computing Services (HPS)

HPE has a global team of award-winning HPC services experts available to help design, deploy, manage, and support your HPC environment and processes, including consulting, integration, outsourcing, and support.
• HPE Datacenter Care is ideal for HPC environments, giving large-scale IT environments the flexibility and economies of scale to manage Hewlett Packard Enterprise and non-Hewlett Packard Enterprise hardware and software environments effectively.
• HPE Financial Services makes getting IT infrastructure easier than ever, helping to ensure that your organization can grow within your budget.

Acknowledgements

"We are seeing up to 35 percent performance increase in our Electronic Design Automation application workloads. We have deployed more than 5,000 of these servers, achieving better rack density and power efficiency, while delivering higher application performance to Intel silicon design engineers."
— Kim Stevenson, Intel CIO

HPE Advanced Power Manager

The HPE Advanced Power Manager is a rack-level hardware and software solution that allows you to manage power without compromises. It automatically discovers hardware components and enables server metering, aggregate dynamic power capping, configurable power-up dependencies and sequencing, consolidated Ethernet access to all resident HPE Integrated Lights-Out (iLO) management engines, and asset management capabilities.

Improve power efficiency for HPE HyperScale solutions

By collecting real-time operating data from power distribution systems and consolidating it with server utilization data, it provides the visibility needed to identify stranded power capacity and enable safe and effective use of shared server infrastructure. That allows you to significantly reduce total data center energy consumption. The benefits of this approach are many:
• Get a single management view—See and manage shared infrastructure, server, chassis, and rack-level power from a single console.
• Simplify and save—Avoiding spend on serial concentrators, adaptors, cables, and switches can save you more than 80 percent in purchase costs.11
• Flex to meet workload demands—Dynamic power allocation and capping capabilities keep you agile.
• Save valuable data center space—The small 1U unit can be mounted inside, on the top, or on the side of the rack so you can determine the best use of data center space.

You can further enhance your power savings and efficiency in hyperscale environments with these complementary HPE solutions:

HPE Power Advisor
This easy-to-use tool allows you to accurately estimate data center power requirements for your HPE server and storage configurations at the system, rack, and multi-rack level.

HPE Insight Cluster Management Utility
An efficient, easy-to-use, and robust utility for the management of HPC and hyperscale clusters, which supports multiple TOP500 supercomputer sites.

HPE iLO Management
Performs Active Health and Agentless Management, allowing extensive monitoring without impacting performance.

HPE FlexFabric 5930 Switch Series

Overview

The HPE FlexFabric 5930 Switch Series is a family of high-density, ultra-low-latency, 10GbE and 40GbE top-of-rack (ToR) data center switches. The switch series is part of the HPE FlexFabric data center solution, which is a cornerstone of the FlexNetwork architecture. The FlexFabric 5930 Switch Series is ideally suited for deployment at the aggregation or server access layer of large enterprise data centers, or at the core layer of medium-sized enterprises. With the increased pace of deploying virtualized applications, the adoption of software-defined networking, and the growth of server-to-server traffic, many data centers now require spine and ToR switch innovations that will meet their requirements. The HPE FlexFabric 5930 is optimized to meet the increasing requirements for higher-performance server connectivity, convergence of Ethernet and storage traffic, the capability to handle virtual environments, and ultra-low latency.

Data center innovation in a 40GbE ToR switch

Features and Benefits

Quality of Service (QoS)
• Powerful QoS features:
  Flexible queue scheduling—Including Strict Priority (SP), WRR, WDRR, WFQ, SP+WRR, SP+WDRR, SP+WFQ, Configurable Buffer, Time Range, Queue Shaping, and CAR with 8kbps granularity.
  Packet filtering and remarking—Packet filtering at L2 (Layer 2) through L4 (Layer 4); flow classification based on source MAC address, destination MAC address, source IP (IPv4/IPv6) address, destination IP (IPv4/IPv6) address, port, protocol, and VLAN.

Data Center Optimized
• Flexible high port density—Enables scaling of the server edge with 10GbE and 40GbE spine and ToR deployments to new heights with a high-density 32-port fixed-port switch in a 1RU design, a 2-slot modular design with two 40GbE QSFP+ ports, and a 4-slot design; support for 10GbE SFP+, 10GBASE-T, converged-port 1/10GbE or 4/8Gbps Fibre Channel, and 40GbE ports.
• High-performance switching—Cut-through and nonblocking architecture delivers low latency (~1 microsecond for 10GbE) for very demanding enterprise applications; the switch delivers high-performance switching capacity and wire-speed packet forwarding.
• Higher scalability—HPE Intelligent Resilient Framework (IRF) technology simplifies the architecture of server access networks; up to nine HPE 5930 switches can be combined to deliver unmatched scalability of virtualized access layer switches and flatter two-tier networks using IRF, which reduces cost and complexity.
• Advanced modular operating system—Comware v7 software's modular design and multiple processes bring native high stability, independent process monitoring, and restart; the OS also allows individual software modules to be upgraded for higher availability and supports enhanced serviceability functions such as hitless software upgrades with single-chassis ISSU.
• TRILL and EVB/VEPA—TRansparent Interconnection of Lots of Links (TRILL) is supported, including support of TRILL with IRF and TRILL ECMP with up to 8 paths; support for Shortest Path Bridging (IEEE 802.1aq) with ECMP up to 8 paths; Edge Virtual Bridging with Virtual Ethernet Port Aggregator (EVB/VEPA) provides connectivity into the virtual environment for a data center-ready environment.
• Reversible airflow—Enhanced for data center hot-cold aisle deployment with reversible airflow, for either front-to-back or back-to-front airflow.
• Redundant fans and power supplies—Internal redundant and hot-pluggable power supplies and dual fan trays enhance reliability and availability.
• Lower OPEX and greener data center—Reversible airflow and advanced chassis power management.
• Data Center Bridging (DCB) protocols—Support for IEEE 802.1Qbb Priority Flow Control (PFC), Data Center Bridging Exchange (DCBX), IEEE 802.1Qaz Enhanced Transmission Selection (ETS), and Explicit Congestion Notification (ECN) for converged FCoE, iSCSI, and RoCE environments.
• FCoE support—Supports T11 standards-compliant FC-BB-5 Fibre Channel over Ethernet (FCoE), including FCoE Initialization Protocol (FIP), FCP, Fibre Channel enhanced port types VE, TE, and VF, NPV, NPIV, Fabric Name Server, RSCN, Login Services, name-server zoning, Per-VSAN Fabric Services, FSPF, Standard Zoning, and Fibre Channel Ping.
• Jumbo frames—Frame sizes of up to 10,000 bytes on Gigabit Ethernet and 10 Gigabit ports enable high-performance remote backup and disaster-recovery services.

• VXLAN support—VXLAN Layer 2 gateway support for up to 4k tunnels.

Manageability
• Full-featured console—Provides complete control of the switch with a familiar CLI.
• Troubleshooting—Ingress and egress port monitoring enable network problem solving; traceroute and ping enable testing of network connectivity.
• Multiple configuration files—Allow multiple configuration files to be stored to a flash image.
• sFlow® (RFC 3176)—Provides wire-speed traffic accounting and monitoring.
• SNMP v1, v2c, and v3—Facilitate centralized discovery, monitoring, and secure management of networking devices.
• Out-of-band interface—Isolates management traffic from user data plane traffic for complete isolation and total reachability, no matter what happens in the data plane.
• Remote configuration and management—Delivered through a secure command-line interface (CLI) over Telnet and SSH; Role-Based Access Control (RBAC) provides multiple levels of access; Configuration Rollback and multiple configurations on the flash provide ease of operation; remote visibility is provided with sFlow and SNMP v1/v2c/v3 and is fully supported in the HPE Intelligent Management Center (IMC).
• ISSU and hot patching—Provides hitless software upgrades with single-unit In-Service Software Upgrade (ISSU) and hitless patching of the modular operating system.
• Autoconfiguration—Provides automatic configuration via DHCP.
• NTP, SNTP, and PTP support—Synchronize timekeeping among distributed time servers and clients; support for Network Time Protocol (NTP), Simple Network Time Protocol (SNTP), and Precision Time Protocol (PTP) IEEE 1588v2 (2008).

Resiliency and High Availability
• HPE Intelligent Resilient Framework (IRF) technology—Enables an HPE FlexFabric to deliver resilient, scalable, and secured data center networks for physical and virtualized environments; groups up to nine HPE 5930 switches in an IRF configuration, allowing them to be configured and managed as a single switch with a single IP address; simplifies ToR deployment and management, reducing data center deployment and operating expenses.
• IEEE 802.1w Rapid Spanning Tree Protocol—Increases network uptime through faster recovery from failed links.
• IEEE 802.1s Multiple Spanning Tree—Provides high link availability in multiple-VLAN environments by allowing multiple spanning trees.
• Virtual Router Redundancy Protocol (VRRP)—Allows groups of two routers to dynamically back each other up to create highly available routed environments.
• Hitless patch upgrades—Allows patches and new service features to be installed without restarting the equipment, increasing network uptime and facilitating maintenance.
• Ultrafast protocol convergence (< 50 ms) with standards-based failure detection—Bidirectional Forwarding Detection (BFD) enables link connectivity monitoring and reduces network convergence time for RIP, OSPF, BGP, IS-IS, VRRP, MPLS, and IRF.
• Device Link Detection Protocol (DLDP)—Monitors link connectivity and shuts down ports at both ends if unidirectional traffic is detected, helping prevent loops in STP-based networks.
• Graceful restart—Allows routers to indicate to others their capability to maintain a routing table during a temporary shutdown, significantly reducing convergence times upon recovery; supports OSPF, BGP, and IS-IS.

Layer 2 Switching
• MAC-based VLAN—Provides granular control and security; uses RADIUS to map a MAC address/user to specific VLANs.
• Address Resolution Protocol (ARP)—Supports static, dynamic, and reverse ARP and ARP proxy.
• IEEE 802.3x Flow Control—Provides intelligent congestion management via PAUSE frames.
• Ethernet link aggregation—Provides IEEE 802.3ad link aggregation of up to 128 groups of 16 ports; supports LACP, LACP Local Forwarding First, and LACP Short-time; provides a fast, resilient environment that is ideal for the data center.
• Spanning Tree Protocol (STP)—Supports STP (IEEE 802.1D), Rapid STP (RSTP, IEEE 802.1w), and Multiple STP (MSTP, IEEE 802.1s).
• VLAN support—Provides support for 4,096 VLANs based on the port, MAC address, IPv4 subnet, protocol, and guest VLAN; supports VLAN mapping.
• IGMP support—Provides support for IGMP Snooping, Fast-Leave, and Group-Policy; IPv6 IGMP Snooping provides Layer 2 optimization of multicast traffic.
• DHCP support at Layer 2—Provides full DHCP Snooping support for DHCP Snooping Option 82, DHCP Relay Option 82, DHCP Snooping Trust, and DHCP Snooping Item Backup.

Layer 3 Services
• Address Resolution Protocol (ARP)—Determines the MAC address of another IP host in the same subnet; supports static ARPs; gratuitous ARP allows detection of duplicate IP addresses; proxy ARP allows normal ARP operation between subnets or when subnets are separated by a Layer 2 network.
• Dynamic Host Configuration Protocol (DHCP)—Simplifies the management of large IP networks and supports client and server; DHCP Relay enables DHCP operation across subnets.
• Operations, administration, and maintenance (OAM) support—Provides support for Connectivity Fault Management (IEEE 802.1ag) and Ethernet in the First Mile (IEEE 802.3ah); provides additional monitoring that can be used for fast fault detection and recovery.

Layer 3 Routing
• Virtual Router Redundancy Protocol (VRRP) and VRRP Extended—Allow quick failover of router ports.
• Policy-based routing—Makes routing decisions based on policies set by the network administrator.
• Equal-Cost Multipath (ECMP)—Enables multiple equal-cost links in a routing environment to increase link redundancy and scale bandwidth.
• Layer 3 IPv4 routing—Provides routing of IPv4 at media speed; supports static routes, RIP and RIPv2, OSPF, BGP, and IS-IS.
• Open Shortest Path First (OSPF)—Delivers faster convergence; uses this link-state routing Interior Gateway Protocol (IGP), which supports ECMP, NSSA, and MD5 authentication for increased security and graceful restart for faster failure recovery.
• Border Gateway Protocol 4 (BGP-4)—Delivers an implementation of the Exterior Gateway Protocol (EGP) utilizing path vectors; uses TCP for enhanced reliability for the route discovery process; reduces bandwidth consumption by advertising only incremental updates; supports extensive policies for increased flexibility; and scales to very large networks.
• Intermediate System to Intermediate System (IS-IS)—Uses a link-state IGP, which is defined by the ISO organization for IS-IS routing and extended by IETF RFC 1195 to operate in both TCP/IP and the OSI reference model (Integrated IS-IS).
• Static IPv6 routing—Provides simple, manually configured IPv6 routing.
• Dual IP stack—Maintains separate stacks for IPv4 and IPv6 to ease the transition from an IPv4-only network to an IPv6-only network design.
• Routing Information Protocol next generation (RIPng)—Extends RIPv2 to support IPv6 addressing.
• OSPFv3—Provides OSPF support for IPv6.
• BGP+—Extends BGP-4 to support Multiprotocol BGP (MBGP), including support for IPv6 addressing.
• IS-IS for IPv6—Extends IS-IS to support IPv6 addressing.
• IPv6 tunneling—Allows IPv6 packets to traverse IPv4-only networks by encapsulating the IPv6 packet in a standard IPv4 packet; supports manually configured, 6to4, and Intra-Site Automatic Tunnel Addressing Protocol (ISATAP) tunnels; an important element for the transition from IPv4 to IPv6.
• Policy routing—Allows custom filters for increased performance and security; supports ACLs, IP prefixes, AS paths, community lists, and aggregate policies.
• Bidirectional Forwarding Detection (BFD)—Enables link connectivity monitoring and reduces network convergence time for RIP, OSPF, BGP, IS-IS, VRRP, MPLS, and IRF.
• Multicast routing—PIM Dense and Sparse modes provide robust support for multicast protocols.
• Layer 3 IPv6 routing—Provides routing of IPv6 at media speed; supports static routing, RIPng, OSPFv3, BGP4+ for IPv6, and IS-ISv6.

Additional Information
• Green IT and power—Improves energy efficiency through the use of the latest advances in silicon development; shuts off unused ports and utilizes variable-speed fans, reducing energy costs.

Management

• USB support—File copy allows users to copy switch files to and from a USB flash drive.
• Multiple configuration files—Store easily to the flash image.
• SNMPv1, v2c, and v3—Facilitate centralized discovery, monitoring, and secure management of networking devices.
• Out-of-band interface—Isolates management traffic from user data plane traffic for complete isolation and total reachability, no matter what happens in the data plane.
• Port mirroring—Enables traffic on a port to be simultaneously sent to a network analyzer for monitoring.
• Remote configuration and management—Available through a command-line interface (CLI).
• IEEE 802.1AB Link Layer Discovery Protocol (LLDP)—Advertises and receives management information from adjacent devices on a network, facilitating easy mapping by network management applications.
• sFlow® (RFC 3176)—Provides scalable, ASIC-based wire-speed network monitoring and accounting with no impact on network performance; this allows network operators to gather a variety of sophisticated network statistics and information for capacity planning and real-time network monitoring purposes.
• Command authorization—Leverages RADIUS to link a custom list of CLI commands to an individual network administrator's login; an audit trail documents activity.
• Dual flash images—Provides independent primary and secondary operating system files for backup while upgrading.
• Command-line interface (CLI)—Provides a secure, easy-to-use CLI for configuring the module via SSH or a switch console; provides direct real-time session visibility.
• Logging—Provides local and remote logging of events via SNMP (v2c and v3) and syslog; provides log throttling and log filtering to reduce the number of log events generated.
• Management interface control—Provides management access through a modem port and terminal interface, as well as in-band and out-of-band Ethernet ports; provides access through the terminal interface, Telnet, or Secure Shell (SSH).
• Industry-standard CLI with a hierarchical structure—Reduces training time and expenses; increases productivity in multivendor installations.
• Management security—Restricts access to critical configuration commands; offers multiple privilege levels with password protection; ACLs provide Telnet and SNMP access control, while local and remote syslog capabilities allow logging of all access.
• Information center—Provides a central repository for system and network information; aggregates all logs, traps, and debugging information generated by the system and maintains them in order of severity; outputs the network information to multiple channels based on user-defined rules.
• Network management—HPE Intelligent Management Center (IMC) centrally configures, updates, monitors, and troubleshoots.
• Remote intelligent mirroring—Mirrors ingress/egress ACL-selected traffic from a switch port or VLAN to a local or remote switch port anywhere on the network.

Security
• Access control lists (ACLs)—Provide IP Layer 3 filtering based on source/destination IP address/subnet and source/destination TCP/UDP port number.
• RADIUS/TACACS+—Eases switch management security administration by using a password authentication server.
• Secure Shell—Encrypts all transmitted data for secure remote CLI access over IP networks.
• IEEE 802.1X and RADIUS network logins—Control port-based access for authentication and accountability.
• Port security—Allows access only to specified MAC addresses, which can be learned or specified by the administrator.

Convergence
• LLDP-MED (Media Endpoint Discovery)—Defines a standard extension of LLDP that stores values for parameters such as QoS and VLAN to automatically configure network devices such as IP phones.

Warranty and Support
• 1-year warranty—See hpe.com/networking/warrantysummary for warranty and support information included with your product purchase.

Example report from Chapter 10—Activity

HPE Hyperscale Business Value Calculator Analysis Report

March 14, 2016

Input Assumptions

| Name of the Organization | Lab cars |
| Industry Classification | Automotive |
| Datacenter Geographic Location | United States |
| Deal ID | None |
| Time frame of Analysis | Three Years |
| Hyperscale Server | HP ProLiant XL220a Gen8 |
| Alternative ProLiant Server | None |
| Competitive Server | SuperMicro SD-5038ML-H8TRF |
| Virtualization | No |

Overall TCO Summary

Please note that your analysis is saved in your folder for further reference when you revisit the tool.

HP ProLiant XL220a Gen8

Servers per Chassis/Enclosure: 20
Total Racks Used: 2
Max kW/Rack: 21
No of Chassis: 12

| | Per Server | Per Chassis | Per Rack |
|---|---|---|---|
| No of Cables | 0 | 2 | 14 |
| Cores | 4 | 80 | 560 |
| Weight (lbs.) | 15 | 300 | 2,100 |
| Operational Power (Watt) | 138 | 2,750 | 19,250 |
| Carbon Usage (lbs.) | 1,787 | 35,744 | 250,210 |

SuperMicro SD-5038ML-H8TRF

Servers per Chassis/Enclosure: 8
Total Racks Used: 3
Max kW/Rack: 21
No of Chassis: 30

| | Per Server | Per Chassis | Per Rack |
|---|---|---|---|
| No of Cables | 2 | 16 | 160 |
| Cores | 4 | 32 | 320 |
| Weight (lbs.) | 8 | 62 | 622 |
| Operational Power (Watt) | 250 | 2,000 | 20,000 |
| Carbon Usage (lbs.) | 3,249 | 25,996 | 259,958 |
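To turn these per-rack rows into a solution-level comparison, you can roll them up and derive a density metric such as watts per core; a short Python sketch:

```python
# Roll the per-rack rows above up to solution totals and compare density.
solutions = {
    "HP ProLiant XL220a Gen8":    {"racks": 2, "cores": 560, "watts": 19_250},
    "SuperMicro SD-5038ML-H8TRF": {"racks": 3, "cores": 320, "watts": 20_000},
}
for name, s in solutions.items():
    total_cores = s["racks"] * s["cores"]
    total_watts = s["racks"] * s["watts"]
    print(f"{name}: {total_cores} cores, {total_watts:,} W, "
          f"{total_watts / total_cores:.1f} W per core")
```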

TCO Cost Elements Breakup: SuperMicro SD-5038ML-H8TRF vs HP ProLiant XL220a Gen8

This Chart compares the TCO of SuperMicro SD-5038ML-H8TRF and HP ProLiant XL220a Gen8

HPEFS IT Investment Solution vs Purchase

HPE Financial Services helps make it easier and more affordable to move to new technology.

Impact of Cashflow

| | Year 1 | Year 2 | Year 3 |
|---|---|---|---|
| Purchase | $579,799 | $0 | $0 |
| HPEFS IT Investment Solution | $180,897 | $180,897 | $180,897 |
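The cash-flow table lends itself to a quick comparison. The sketch below totals the HPEFS payment stream and adds a simple present-value view; the 5 percent discount rate and end-of-year payment timing are illustrative assumptions, not figures from the report.

```python
# Compare the up-front purchase with the HPEFS payment stream.
# The 5% discount rate and end-of-year timing are assumptions.
purchase = 579_799
payments = [180_897, 180_897, 180_897]
rate = 0.05

total = sum(payments)  # $542,691
pv = sum(p / (1 + rate) ** yr for yr, p in enumerate(payments, start=1))
print(f"Total financed outlay:   ${total:,}")
print(f"Present value at 5%:     ${pv:,.0f}")
print(f"Up-front purchase price: ${purchase:,}")
```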

Next Steps

For additional information, you can access the following site or get in touch with your HPE representative: http://h17007.www1.hp.com/us/en/iss/110111.aspx

Annexure

The annexure presents side-by-side charts for the HP ProLiant XL220a Gen8 and the SuperMicro SD-5038ML-H8TRF covering: Hardware and Networking Summary; Acquisition Cost—Servers; Acquisition Cost—Software; Acquisition Cost—Network; Facilities Cost—Power and Cooling; Facilities Cost—Power Infrastructure; Facilities Cost—Space Infrastructure; Hardware Support Cost; and Installation Cost.

Data Center Related Costs Comparison

Energy Use

| | Select Alternative ProLiant Server | HP ProLiant XL220a Gen8 |
|---|---|---|
| IT Load (kWh) | 0 | 0 |
| Non-IT Electrical Loads (kWh) | 0 | 0 |
| Electrical Losses (kWh) | 0 | 0 |
| Cooling equipment (kWh) | 0 | 0 |
| Total electrical power (kWh) | 0 | 0 |
| PUE | 0.00 | 0.00 |
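The Energy Use rows feed a standard PUE calculation: total facility power divided by the IT load. Because this report shows zeros for the deployment, the sketch below uses hypothetical kWh values purely to show how the PUE row is derived.

```python
# PUE = total facility power / IT load. All kWh values are hypothetical.
it_load = 337_000   # kWh drawn by servers, storage, and network gear
non_it = 4_000      # lighting, office loads, and other non-IT draw
losses = 21_000     # electrical distribution losses
cooling = 88_000    # cooling equipment

total = it_load + non_it + losses + cooling
print(f"Total electrical power: {total:,} kWh, PUE = {total / it_load:.2f}")
```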

Facility Size

| | Select Alternative ProLiant Server | HP ProLiant XL220a Gen8 |
|---|---|---|
| Data center size (ft²) | 0 | 0 |
| Infrastructure size (ft²) | 0 | 0 |
| Total Facility size (ft²) | 0 | 0 |

Initial and Operational Expenses

| | Select Alternative ProLiant Server | HP ProLiant XL220a Gen8 |
|---|---|---|
| Initial Costs | $0 | $0 |
| Annual Costs (Three Years) | $0 | $0 |
| Total Costs (Three Years) | $0 | $0 |

1. HPE internal lab testing. The 2.4 million hour test quantity is derived from a combination of drive qualification test plans, specifically HDDQ spec—supplier responsibility to perform, HDDQ spec—HPE responsibility to perform, RDT—Reliability Demonstration Test spec, CSI integration test spec, and pilot test requirements. Test conducted July 2014.
2. Based on HPE internal comparative analysis of publicly available data from major competitors, June 2013.
3. Up to 14 percent better performance is based on a similar capacity DIMM running on an HPE server compared to a non-HPE server with DDR4. Up to 33 percent better performance is based on a similar capacity DIMM running on an HPE server compared to a non-HPE server with DDR4.
4. Internal performance lab testing using Iometer and the HPE Smart Array P840 with RAID 0, 4k random reads, Microsoft Windows® 2012 R2; testing is ongoing with changes in firmware. Number is current as of 21 July 2014.
5. Comparing HPE OneView 1.10 vs. the traditional approach to server and storage management requiring eight tools. HPE OneView replaces Intelligent Provisioning, Array Configuration Utility, iLO 4, Virtual Connect Manager/VCEM, HPE Systems Insight Manager, HPE Smart Update Manager, HPE Onboard Administrator, and HPE 3PAR array management. HPE internal, Houston, Texas, May 2014.
6. 66 percent faster problem resolution time for HPE Insight Remote Support—initiated cases for hardware vs. traditional phone support, based on HPE internal call center data, Q4 2011.
7. Performing iLO firmware updates of 200 systems in 380 seconds, compared against previous generations and competitors. Based on HPE internal estimates, Houston, Texas, U.S.A., July 2014.
8. IDC Worldwide Quarterly Server Tracker for 1Q15, May 2015.
9. Intel.com/performance
10. Depending on the power configuration of the chassis.
11. Internal HPE estimate.
