461 - Telecommunication Service Provisioning
461 Telecommunication Service Provisioning and Delivery in the Electrical Power Utility
Working Group D2.26
April 2011
Members
Mehrdad MESBAH, Convenor (France), Robert EVANS (Australia), Dugald BELL (Australia), Jan PIOTROWSKI (Poland), Jorge MENDES (Portugal), Pedro GAMA (Portugal), Ion NEDELCU (Romania), Matjaz BLOKAR (Slovenia), Paul SCHWYTER (Switzerland), Paul RENSHALL (United Kingdom), Claudio TRIGO de LOUREIRO (Brazil), Elton BANDEIRA de MELO (Brazil), Suzana JAVORNIK VONCINA (Croatia), Janine LEIFER (Israel), Masayuki YAMASAKI (Japan), Kazuto IWASAKI (Japan), Eva LASSNER (Hungary), Lhoussain LHASSANI (Netherlands), Olav STOKKE (Norway), Jorge FONSECA (Portugal), Pedro MARQUES (Portugal), Danilo LALOVIC (Serbia), Juan Antonio GARCIA LOPEZ (Spain)
Copyright © 2011 “Ownership of a CIGRE publication, whether in paper form or on electronic support only infers right of use for personal purposes. Are prohibited, except if explicitly agreed by CIGRE, total or partial reproduction of the publication for use other than personal and transfer to a third party; hence circulation on any intranet or other company network is forbidden”.
Disclaimer notice “CIGRE gives no warranty or assurance about the contents of this publication, nor does it accept any responsibility, as to the accuracy or exhaustiveness of the information. All implied warranties and conditions are excluded to the maximum extent permitted by law”.
ISBN: 978-2-85873-150-3
CONTENTS
1 INTRODUCTION
2 COMMUNICATION SERVICE IN THE POWER UTILITY
2.1 Introduction
2.2 EPU Communication Services
3 OPERATIONAL APPLICATIONS
3.1 Protection Communication
3.1.1 State Comparison Protection Schemes (Command Schemes)
3.1.2 Teleprotection Signalling Systems
3.1.3 Analog Comparison Protection Schemes
3.1.4 System-wide Protection Schemes
3.2 Energy Management, SCADA and WAMS Communications
3.2.1 SCADA Communications
3.2.2 Inter-Control Centre Communications
3.2.3 Remote Control Centre Operator Consoles
3.2.4 Generation Control Signaling
3.2.5 Wide Area Monitoring System (PMU Communications)
3.3 Remote Substation Control and Automation
3.4 Operational Telephone System
3.5 Settlement and Revenue Metering and Customer Communications
3.5.1 Energy Metering in the Deregulated Environment
3.5.2 Customer Metering, Advanced Metering Infrastructure
3.5.3 Advanced distribution applications and Smart Grid
4 OPERATION SUPPORT APPLICATIONS
4.1 Collaborative Multi-media Communications
4.2 On-line Documentation
4.3 Substation Automation Platform Management
4.4 Condition and Quality Monitoring Communications
4.5 Substation data Retrieval
4.6 Mobile Workforce Communications
5 SECURITY, SAFETY AND ENVIRONMENTAL MONITORING
5.1 Security of Sites and Assets
5.1.1 Video-surveillance of sites
5.1.2 Site Access Control
5.1.3 Environmental Hazards Monitoring (Sites and Assets)
5.1.4 Intruder Detection
5.2 Human Safety & Operational Security
5.2.1 Earth Connection Monitoring
5.2.2 Isolated Worker Safety Communications
5.2.3 Public Warning Communications
5.2.4 Hydraulic Structure Operation and Maintenance Applications
5.3 Cyber-Security applications communication
6 OPERATIONAL CONSTRAINTS AND SERVICE LEVEL AGREEMENTS
6.1 Operational Coverage and Topology
6.2 Time Constraints
6.3 Availability Constraints
6.4 Service Survivability and Resilience
6.5 Service Security Constraints
6.6 Service Integrity
6.7 Future Sustainability, Legacy Openness and Vendor Independence
6.8 Environmental Constraints
6.9 Defining Service Level Agreements
7 DISASTER RECOVERY AND SERVICE CONTINUITY
7.1 Introduction
7.2 Threats and Risk Management and Risk Assessment
7.3 Business Continuity Plan and Disaster Recovery Plan
7.4 Project Design Criteria
7.4.1 Back-up Facilities
7.4.2 Power Supply Independence
7.4.3 Network Redundancy
7.4.4 Countermeasures against Natural Disasters
7.5 Enhancing the emergency response capacity
7.6 Disaster Information Systems
8 TELECOM SERVICE DELIVERY MODELS
8.1 Introduction
8.2 EPU Profiles - Telecom Service Users
8.2.1 Coordinating or Operating Bodies without Network Assets
8.2.2 Transmission System Operator (TSO) or Transmission Utility
8.2.3 Distribution Utility
8.2.4 Generation Utility
8.3 Telecom Asset Ownership Profiles
8.3.1 Introduction
8.3.2 Physical layer assets
8.3.3 Transport network assets
8.3.4 Application service networks and platforms
8.4 Telecom Service Provider – Relationship to the User
8.4.1 Integrated to the Operational User (Type A)
8.4.2 Sister Entity to the Operational User (Type B)
8.4.3 Affiliated Service Company (Type C)
8.4.4 Independent Service Contractor (Type D)
8.4.5 External Telecom Service Provider (Type E)
9 FEDERATING OF SERVICES ON THE PRIVATE INFRASTRUCTURE
9.1 Introduction
9.2 Process and Organization Issues
9.3 Technical Solutions
9.3.1 Fibre Separation in Optical Cables
9.3.2 Wavelength Separation through C- or D-WDM
9.3.3 Bandwidth Separation through PDH/SDH
9.3.4 Virtual Network Separation (MPLS VPN or Ethernet VLAN)
10 MANAGEMENT OF TELECOM SERVICE AND INFRASTRUCTURE
10.1 Introduction - Need for a Management System
10.2 Present State Assessment and Target Definition
10.2.1 Telecom Business Maturity Modeling
10.2.2 Management Process Maturity
10.3 Management Frameworks & Best Practices
10.3.1 Introduction
10.3.2 ITIL Framework
10.3.3 NGOSS – Frameworx
10.3.4 Business Process Framework eTOM
10.3.5 Relating ITIL to eTOM Framework
10.4 Towards a Utility Telecom Management Framework
10.4.1 Introduction
10.4.2 Utility Telecom Management Operations Map (uTOM)
10.5 Upstream Management
10.5.1 Introduction
10.5.2 Policy Definition & Business Planning
10.5.3 Strategic Deployment and Tactical Adjustments
10.5.4 Business Development, Service Offer and Service Migrations
10.6 Operational Management
10.6.1 Customer/User Relation Management
10.6.2 Communication Service Management Process
10.6.3 Network Resource & Infrastructure Management
10.6.4 Provider/Contractor Relationship Management Process
10.6.5 Enterprise Processes impacting Telecom Service Delivery
10.7 Management Tools and Information Systems
10.7.1 Introduction
10.7.2 Element & Network Management Systems
10.7.3 Operation Support Systems (OSS)
10.7.4 Inventory & Configuration Data Base
11 COST CONSIDERATIONS
12 FURTHER ACROSS THE HORIZON
12.1 Power System Evolution - Smart Grid
12.2 EPU Organization and Environment
12.3 Communication Service Provider Environment
12.4 Telecom Technology Evolutions
12.5 Information System Evolution - Cloud Computing
APPENDICES
A1. IP Voice in Utility Telecoms
A2. Sharing Mobile Emergency Service (TETRA)
A3. Satellite Communications in Power Utilities
A4. Disaster Counter-measures – Learning from US 2005 Hurricanes
A5. Survey of Electric Power Dimensioning Practice in EPU Data Centres
A6. Deploying a Management Framework – Western Power
A7. ITIL Management Framework
A8. TM Forum NGOSS - Frameworx
A9. List of Acronyms
REFERENCES
1 INTRODUCTION
Some 15 years ago, a technical brochure prepared by CIGRE Study Committee 35 started with the following lines, summarizing the situation and the opportunities facing the power utilities in terms of telecommunications as seen in the early 1990s:
“Over the last few years an increasing number of utilities have seen their telecommunication activities being influenced by technological and operational change. The choices that utilities are faced with today may have a significant impact on their future development.
Traditionally, power utilities have used telecommunications networks primarily for control and operation of the power system. “Standardized” telecommunications equipment could not always be used in an operational environment. Leasing telecommunication services, on the other hand, was unacceptable in many cases because of low availability figures and service level from the “monopolized” public operator or simply because the public telecommunications network did not have sufficient geographic coverage to reach all the utility’s substations. The use of telecommunication network for other purposes was generally prohibited by legislation.
As a consequence, the planning of future network developments was relatively straightforward. The networks were planned and designed to meet the utilities’ particular operational needs and this decided the extent of investment in new network infrastructure. The type of network infrastructure provided was also influenced by operational needs with specialized equipment such as Power Line Carrier (PLC) being used extensively.
Although these criteria still play an important role in many networks today, the focus has now shifted to cover a broader range of issues:
• The requirement for increased capacity, speed and response time of the operational and administrative services
• The need to improve the efficiency of network maintenance and quality of service
• The need to reduce dependence on vendor specific equipment
• The opportunities provided by the liberalization of the telecommunications market
• The opportunity to outsource all or parts of the telecommunications infrastructure or service to third parties”
(Extracted from TB107, Power System Communications in the High Speed Environment [1])
In the 15 years since these lines were written, the essential issues have not changed, but the situation is radically different:
• The creation of the electricity market has introduced new participants and hence the need for new communication services. A new group of services has been formed that can be called “Business and Market” communications. The requirement to communicate with the electricity customers and with individual producers is driving important metering infrastructure projects which may represent an opportunity to implement other utility or commercial services.
• The great majority of electrical power utilities have implemented extensive optical fibre networks with SDH as the core technology, providing the capacity, speed and response time needed for their operational requirements.
• The technological orientation for a packet data communication layer is no longer a subject of discussion. The ubiquity of the fully mature Ethernet and IP technologies, and the strong industry support for them, make them the natural complementary layer for providing new services in the electrical utility. The multi-service capability of IP technology, as discussed extensively in CIGRE Technical Brochure TB249 [2], allows the creation of a single Integrated Service Network (ISN) covering the different operational, operation support and corporate IP requirements, or of multiple networks, each dedicated to one of these families of Utility communications. At present, IP connection to the electrical substation is a strong requirement for many new applications and has been the subject of Technical Brochure TB321 “Operational Service using IP Virtual Private Networks” [3] and of further ongoing work. In order to carry these IP connections and other more time-critical communications, Utilities implement new Ethernet transport over SDH, over fibre, or over an MPLS core. Wide Area Ethernet transport has been the subject of a separate CIGRE publication [4].
• Many “new” directions in the mode of service delivery, which were still to be explored at the time, have today been sufficiently evaluated to allow a new analysis in the light of more than a decade of experience. Many pioneers in seeking commercial revenue from the operational network infrastructure have evolved into “standard telecom Service Providers”, moving away from their original goal. Those who have maintained their original scope have survived due to particular legal, legislative or practical contexts which are interesting to explore.
• New problems and operational issues have appeared due to outsourcing or due to the provision of commercial services in the liberalized telecommunication market. A previous CIGRE publication, TB108 “Business Opportunities for Utilities in the Telecom Market” [5], published in 1997, needs to be reviewed in the light of these new issues and problems. The effects of commercial service on the operational service provision are to be analyzed.
• Finally, the move from the “monopolized” Public Telecom Operator to many competing Telecom Service Providers has radically changed the cost/performance and quality of service objectives for procured services. There has been a clear change of orientation from a uniform quality objective towards a competitive, “avoid non-contractual performance to reduce cost” strategy. The relationship between the Service User and the Service Provider and the principle of a Service Level Agreement between the two are of great interest and need to be covered.
At present, Electrical Power Utilities (EPU) are increasingly dependent upon the existence of fast, secure and reliable communications services. These services interconnect the participants, platforms and devices constituting the technical, commercial, and corporate processes of the Utility across the different sites. The communication services are provisioned, managed and maintained in different ways depending upon different quality constraints, cost and regulatory imperatives and company policy considerations. The services can be integrated together into a common network or provided through completely separate networks. The associated telecommunication organization of the Utility varies correspondingly among power utilities.
Moreover, adopting a particular telecom service provisioning model is not an irrevocable decision. It is often re-examined and reviewed in the light of new situations, some of which are as follows:
• New company policy and orientation
• Regulatory issues and requirements
• Mergers and dislocation of activities
• Availability or loss of adequate telecom services to be procured
• Requirement for new services or change of scale in the existing services, incompatible with the present provisioning model
• Lack of satisfaction from the services obtained through the existing provisioning model
• Major capital investments and running costs required for refurbishment and extension of existing facilities
• Technological changes in telecommunications and in power system technology
• Lack of qualified staff and the ageing of the concerned technical work-force
The present document is not about telecommunication technology, which is extensively covered in other CIGRE published literature and on-going work. It focuses on the delivery of communication services associated with operational applications of the EPU, and takes a new look at how these services are provided. It covers quality requirements, architectural aspects, as well as the related organizational and management issues across different types of EPU. Management and maintenance of telecom network infrastructure and services are also covered from a Utility process and organization point of view rather than from a technological view, which is again extensively covered in the literature and subject to very fast evolution.
This Technical Brochure is mainly addressed to:
• EPU decision makers who assess telecommunication service provisioning strategies in view of the changing nature and scale of requirements
• EPU Telecom service providing entities who need to adapt to those same changing requirements
• Telecom Service Provider offspring of EPUs who have over time “forgotten” the service imperatives of their EPU operational customers
2 COMMUNICATION SERVICE IN THE POWER UTILITY
2.1 Introduction
The term “service” is widely used, often with a very loose definition, which may lead to confusion and misunderstanding. We shall therefore start with some definitions that are used throughout the document. These definitions are illustrated in figure 2.1.
Service is the perception by a User of a process implemented by a Provider. A “Service Provider” deploys a telecom infrastructure and corresponding management processes in order to offer Communication Services satisfying the requirements of its user community. The user perceives the service as a “network cloud” providing communication connectivity for its user applications and processes. ITU-T E800 defines the service as a set of functions offered to the user by an organization [6].
A communication service is delivered at a Service Access Point with a certain Quality of Service (QoS) as stipulated through a Service Level Agreement (SLA) between the Service Provider and the Service User. A service level agreement can be formally stipulated or implicit between the provider and the user. The process of assuring that the terms of the SLA are met is called Service Management. It relies upon proper Infrastructure Management and Maintenance performed by the Service Provider.
[Figure 2.1 diagram elements: Service Access Point; User; Application Platform; Service Management; Infrastructure Management & Maintenance; Dedicated Telecom Infrastructure; Procured Service; Communication Service; User Application]
Figure 2.1 – Communication Service Model
A Public Telecom Operator’s mission is to provide telecom services to its customers. Telecom service is in this case the end product and the final commodity for which the whole organization is working. The Service User to Service Provider relationship is therefore straightforward, as shown in figure 2.2. The Telecom Operator designs and commercializes a catalogue of standard telecom services based on a market survey and from then on has no particular concern regarding the customer applications employing these services, other than continually adapting the catalogue to the market’s evolution.
An Electrical Power Utility (EPU), on the other hand, provisions telecom services essentially for its own requirements. The provisioning process may be multi-layer, employing dedicated infrastructures or procured services at different levels and presenting multiple User-Provider relationships as illustrated in figure 2.2. In this case, the catalogue of services must be based on detailed analysis and characterization of EPU applications in terms of communication requirements.
[Figure 2.2 diagram. Panels: Public Telecom User / Provider Relationship; Electrical Power Utility User / Provider Relationship. Labels: Process User; Service Platform & Service Network; Service User; User; Provider; Customers; Telecom Connectivity; Service Provider; Public Telecom Operator; Transmission Medium; Fibre Connectivity]
Figure 2.2 – User-Provider Relationships
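The definitions of this section (a service delivered at a Service Access Point, with a QoS stipulated in an explicit or implicit SLA, and Service Management as the process of checking that the SLA terms are met) can be illustrated with a small data model. This is only a minimal sketch: the class names, the two QoS attributes (delay and availability) and the numerical values are hypothetical choices made for illustration and are not prescribed by this brochure.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ServiceLevelAgreement:
    """QoS terms agreed between Service Provider and Service User.

    The attributes are illustrative assumptions, not a list taken from the text.
    """
    max_delay_ms: float       # upper bound on end-to-end delay
    min_availability: float   # required fraction of time the service is usable
    explicit: bool = True     # False models an implicit (unwritten) SLA


@dataclass(frozen=True)
class Measurement:
    """QoS observed at the Service Access Point over a reporting period."""
    delay_ms: float
    availability: float


def sla_violations(sla: ServiceLevelAgreement, observed: Measurement) -> List[str]:
    """Service Management check: return the names of the SLA terms not met."""
    violations = []
    if observed.delay_ms > sla.max_delay_ms:
        violations.append("delay")
    if observed.availability < sla.min_availability:
        violations.append("availability")
    return violations


# Hypothetical figures, chosen only to exercise the check.
example_sla = ServiceLevelAgreement(max_delay_ms=10.0, min_availability=0.9999)
print(sla_violations(example_sla, Measurement(delay_ms=12.5, availability=0.99995)))
# prints ['delay']
```

The same check applies whether the SLA is a formal contract or an implicit understanding; only the `explicit` flag differs, which reflects the statement above that an SLA may be formally stipulated or implicit.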
2.2 EPU Communication Services
Communication services in the EPU can be identified according to the applications that they address. In particular, wherever the “Service User” entities and the telecom service providing entity are tightly related, there is a one-to-one correspondence between applications and communication service, resulting in an “application-oriented” definition of communication services (e.g. SCADA or Protection communication services mean communication services respecting the requirements of SCADA or Protection applications). The communication Service Provider is assumed to be sufficiently familiar with the applications to apply the necessary precautions in the delivery of the required service (i.e. implicit SLA). Whenever a new application is introduced or the requirements of an application change, the user and provider must seek a new common understanding of the service requirements. On the other hand, where communication service is provided by an external or formally separate entity, then the service provision contract (explicit SLA) defines the service attributes according to the provider’s “service catalogue” (e.g. Platinum, Gold, Silver, etc.). The Utility user must then decide upon the suitable SLA for his applications. In this report, we have identified communication services by the applications that they serve. Consequently, we define communicating applications related to the operation of the power system in chapters 3, 4 and 5, then the constraints and the required qualities in chapter 6,
9
before relating applications and constraints in section 6.9 (figures 6.6 and 6.7) allowing SLA specification whichever provisioning scheme is adopted (as characterized in chapter 8). The following application-oriented service categories can be identified: 1. Operational Services – These communication services enable the coordination and exchange of information between the staff, devices and platforms directly involved in operational applications and processes used to operate, control and protect the power system and its constituents. The processes are necessary for the proper accomplishment of the Utility’s primary mission and therefore their communication services are referred to as “mission-critical”. 2. Operation Support Services – Closely related to the Power system Operation, there exists increasingly a group of applications related to the maintenance and support of the Power System infrastructure. This includes voice and data applications for the field maintenance staff connecting them with central offices, servers and data sources allowing them to perform their tasks, as well as remote monitoring and surveillance applications. These are collectively referred to as Operation Support Applications and their telecom requirements as Operation Support Communication Services. 3. Security, Safety and Environmental Services – A whole group of new applications related to the security of utility staff, public safety, Utility site security, and environmental security is emerging in many countries due to growing security and environmental concerns and consequent regulatory constraints. These applications which were previously considered as part of the operation support are increasingly considered as a distinct class of applications with extensive communication requirements and constraints. 4. 
Corporate Communication Services – These communication services are related to the administrative applications of the Power Utility as a Corporate Enterprise, covering the administration and corporate needs of the Utility organization and its employees (including those located in Operational sites). 5. Business and Market Communication Services – The Power Utility needs to exchange information with its external Market partners and its power customers. This includes communications between power generators, distribution companies, national and other country TSO, trading platforms and energy consumers. The required communication services are referred to as Business and Market Communication Services. Smart metering and Demand-side Management communications are part of this class of communication services. 6. Commercial or U-Telco Communication Services – The Power Utility or an affiliated entity may provide commercial communication services as a source of revenue to other Utilities, to institutional customers (e.g. government or community offices), to telecom Service Providers, or to multi-site companies (and in certain cases to individual customers). The service may cover subscriber premises access (DSL), core communications, or both. Providing U-Telco communication services can be assimilated to the service provision model of a public telecom operator. Criticality of communication services in the Power Utility can be assessed through the consequences of service loss and degradation. It is clear that a high degree of criticality can be
attributed to the operational services. However, it should be noted that the operational applications are not the only critical processes in the Power Utility. Security and human safety related applications also present a high level of criticality. The same can be said about communications related to Utility Business and Market activities, where the financial consequences of a loss of communication can be tremendous. Corporate communications may tolerate longer periods of programmed unavailability, in particular for maintenance purposes. The loss of commercial communications has the same degree of criticality as for other public telecom operators: it results immediately in a loss of revenues, in the non-respect of contractual obligations if prolonged, and in the long run in a loss of customers.
The performance objectives and the Quality of Service also differ among these service types. Many operational services, such as Protection Relay applications, have extremely severe time delay and communication integrity constraints, whereas the other communication service types are mainly transactional with less severe time sensitivity. On the other hand, business and market communication services involve access beyond the perimeter of the power company and may raise more severe security issues.
The first three groups of services described above may collectively be called “operation-related” services (serving “operation-related” EPU applications). The present brochure focuses on these services even if other services are often mentioned, in particular when their provision interferes with (or influences) the way in which operation-related services are provisioned.
[Figure 2.3 – Power Utility Telecommunications. Diagram dividing Electrical Power Utility Telecommunications into: Operations-Related (Industrial) Communications, comprising Operational, Operation Support, and Security & Safety Communication Services; the IT-oriented Enterprise Network, comprising Administrative/Corporate and Business/Market Communication Services; and Service-Provider oriented Commercial/U-Telco Communication Services.]
Considering the organizational diversity of EPUs and their different sizes, activities, and regulatory constraints, the exact perimeter of each category and the allocation of individual
EPU applications to these categories can vary to some extent and may evolve with organizational changes. Some of the factors that influence these allocations are as follows:
• Security policy – The definition of separate security domains in the EPU and the consequent allocation of applications to these security domains can determine the communication service allocation. Applications that are part of the same security domain shall exclusively use the same group of communication services.
• Organization – The organizational entity in charge of a group of applications may require exclusive usage of a service or of a group of communication services.
• Company strategy – Grouping of communication services may depend upon the company’s strategy, for example to merge corporate and operation-related IT and telecoms, or to merge the communications provision of corporate and market-related applications.
• Regulatory issues – Regulation authorities may prevent operational applications from sharing communication services with non-operational ones, or may impose full separation of the U-Telco activities.
The grouping of different applications’ communication services strongly impacts the service integration strategy of the company as described in Chapter 9 on Federating of Services.
3 OPERATIONAL APPLICATIONS
3.1 Protection Communication
Power system faults disrupt normal power flow by diverting current through a short-circuited connection and collapsing power system voltage. Power system fault clearing requirements are very important design and operational criteria for power systems.
• Faults can cause damage and breakdown to power apparatus such as circuit breakers, transformers and cables. The repair work, or full replacement in case of destruction, is very costly and may take considerable time.
• Faults can also cause severe operational disturbances resulting in collapse of power delivery and blackout for regions and, in severe cases, even for several countries. The heavy reliance of modern society on electric power consuming devices for business activities, safety, lighting, heating, communication and many other conveniences makes severe disturbances and blackouts unacceptable.
• Transients due to faults in the power system can also adversely affect sources of generation and customer loads.
• Faults can also have legal and financial consequences.
  o The manufacturer can be held responsible for the consequences of a faulty device (e.g. a breaker not acting correctly). Fault recorders can be used for this purpose.
  o Customers may have to be compensated for the “Customer lost minutes”, and the company can be penalized by the Regulation Authority.
Consequently, faults must be detected and isolated very quickly. Electric power system generators, transformers, busbars, and power lines are monitored by Protective Relays designed to detect faults and operate isolating devices designed to interrupt damaging fault current.
Protection performance requirements specify the balance between the conflicting goals of dependability and security:
• Dependability goals require maximum sensitivity and fast response time to detect and clear all faults quickly with very low probability of a failure to trip.
• Security goals require maximum selectivity and slow response time to minimize the probability of spurious operation leading to an unwanted trip on a faultless circuit. Security is an issue during fault conditions as well as during normal, faultless conditions.
Therefore, the implementation of a Protection scheme should result in dependable operation of only those relays protecting the faulted unit, and secure non-operation of the relays during non-fault conditions and when faults occur on adjacent power system units. This balance is met only through proper protection scheme design, proper relay and equipment selection, and proper connection and setting of these relays and equipment to achieve appropriate sensitivity and coordination.
When protection schemes detect a fault on the equipment or line they protect, they signal (or “trip”) isolating devices, called circuit breakers, to open, in order to isolate the faulty segment of the system and restore normal voltage and current flow in the power system. When the protection scheme and circuit breakers operate properly, the fault is isolated within the required fault-clearing time.
Protection applied on extremely high voltage systems, where fault-clearing times are most critical, typically detects faults and operates in about one to two cycles (or even less than one cycle in certain cases). Circuit breakers operate in one to three cycles. The combination of high-speed protection schemes and fast circuit breakers can interrupt a fault in about two cycles, although more common fault-clearing times range from three to six cycles.
Many protection applications require the real-time transfer of electrical measurements, signals and commands between electrical substations to enhance or to enable the trip/operate decision.
• Protection systems for substation units (generators, busbars, transformers, etc.) can normally meet the fault clearing requirements without using telecommunication. Telecom services may be needed in this case only to command a circuit breaker at a remote end if a local circuit breaker has been economized (Direct tripping) or exists but fails to interrupt fault currents (Breaker Failure).
• Protection schemes for HV lines generally need to exchange information with the protection device at the far end of the line to meet fault clearing requirements. Communication between the protection devices may be the basis for fault detection, as in the case of a Current Differential Protection, or may be needed to ensure that time response and selectivity requirements are met, as in Permissive Distance Protections.
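The cycle-based operating times quoted above translate into milliseconds once the system frequency is fixed. The following is a minimal sketch of that arithmetic; the function names are illustrative and not taken from the brochure.

```python
def cycles_to_ms(cycles: float, system_hz: float) -> float:
    """Convert a duration expressed in AC cycles to milliseconds."""
    return cycles * 1000.0 / system_hz


def clearing_time_ms(relay_cycles: float, breaker_cycles: float,
                     system_hz: float) -> float:
    """Relay operating time plus breaker operating time, in milliseconds."""
    return cycles_to_ms(relay_cycles + breaker_cycles, system_hz)


if __name__ == "__main__":
    for hz in (50.0, 60.0):
        fastest = clearing_time_ms(1, 1, hz)   # 1-cycle relay, 1-cycle breaker
        slowest = clearing_time_ms(2, 3, hz)   # 2-cycle relay, 3-cycle breaker
        print(f"{hz:.0f} Hz: {fastest:.1f} ms to {slowest:.1f} ms")
```

On a 50 Hz system one cycle lasts 20 ms, so a two-cycle clearing time leaves roughly 40 ms for fault detection, any signalling between line ends, and breaker operation combined.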
The Teleprotection function is the part of the Protection system that adapts the signals and measurements from the Protection to the telecommunication channel. It may be integrated into the protective device or the telecommunication access equipment, or it may constitute a standalone device.
If telecommunication fails, backup protection schemes still ensure that power system faults will be cleared, but they may not be cleared within specified time requirements. The probability of uncontrollable power swings and partial or complete system blackout then increases significantly.
Protection communications between substations are at present carried through transparent dedicated telecom circuits ranging from analogue circuits (e.g. PLC) to sub-E1 or E1 circuits multiplexed into an SDH transmission system, a dedicated wavelength or a dedicated fibre. The communication requirements of different protection schemes have been described in detail in [7]. Their evolutions, in particular their interfacing and transport over an Ethernet connection, are currently being assessed in CIGRE JWG D2/B5-30 and shall be the subject of a separate Technical Brochure.
Building additional generating stations or transmission lines is generally the other alternative to reduce the probability of fault-induced blackouts, but it is significantly more costly than reinforced protection schemes with adequate telecommunication services. This is the reason why Protection Relaying applications can, on their own, justify the implementation of
dedicated telecommunication infrastructures with particularly severe constraints in terms of the quality of communication service.
3.1.1 State Comparison Protection Schemes (Command Schemes)
State comparison protection schemes use communication channels to share logical status information between protective relay schemes located at each end of a transmission line. This shared information permits high speed tripping for faults occurring on 100 percent of the protected line.
The logical status information shared between the relay terminals typically relates to the direction of the fault, so the information content is very basic and generally translates into a “command”, requiring very little communication bandwidth. Additional information such as “transfer tripping” of a remote breaker (to isolate a failed breaker) and recloser blocking may also be sent to provide additional control.
Even if the communication requirements for state comparison protection schemes are considerably less stringent than for Analog Comparison Protection schemes (described in the next section), the “command transmission time” is of great importance because the purpose for using communication is to improve the tripping speed of the scheme. Also, variations in transmission time are better tolerated in state comparison schemes than in the Analog Comparison protection schemes.
Communication channel security is essential to avoid false signals that could cause incorrect tripping, and communication channel dependability is important to ensure that the proper signals are communicated during power system faults, the most critical time during which the protection schemes must perform their tasks flawlessly.
Comparing the direction to the fault at one terminal with the direction to the fault at the other terminal permits each relay scheme to determine if the fault is within the protected line section, requiring the scheme to trip, or external to the protected line section, requiring the scheme to block tripping.
If it were possible to set relays to see all faults on their protected line section, and to ignore faults outside of their protected line section, then there would be no need for communication schemes to assist the relays. However, protection relays cannot be set to “see” faults only within a precise electrical distance from their line terminal. They are imprecise because of many factors, including voltage and current transformer errors, relay operating tolerance, line impedance measurement errors and calculation tolerance, and source impedance variations. The primary relay elements used to detect line faults are therefore set to see or reach either short of the remote line terminal (this is called under reaching), or to see or reach past the remote line terminal (this is called over reaching). Communication for state comparison protection schemes must therefore be designed to provide safe, reliable, secure, and fast information transfer from one relay scheme to another. The communication scheme must also be able to transmit information in both directions at the same time. The amount of information required to transfer between relay schemes depends on the relay scheme logic.
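The direction comparison described in this section can be illustrated with a deliberately simplified sketch. It assumes each terminal reports a single fault direction and that a fault is internal only when both ends see it in the forward direction (towards the line). Actual schemes add zone reach, timers and channel supervision, none of which is modelled here. The `Direction` type and function names are hypothetical.

```python
from enum import Enum


class Direction(Enum):
    FORWARD = "forward"   # fault seen towards the protected line
    REVERSE = "reverse"   # fault seen behind the terminal
    NONE = "none"         # no fault detected


def fault_is_internal(local: Direction, remote: Direction) -> bool:
    """Both line ends must see the fault in the forward direction."""
    return local is Direction.FORWARD and remote is Direction.FORWARD


def decide(local: Direction, remote: Direction) -> str:
    """Return the simplified scheme decision for one line terminal."""
    return "TRIP" if fault_is_internal(local, remote) else "BLOCK"


if __name__ == "__main__":
    print(decide(Direction.FORWARD, Direction.FORWARD))  # internal fault
    print(decide(Direction.FORWARD, Direction.REVERSE))  # fault beyond remote end
```

Only one logical state per direction has to cross the channel, which is why these schemes need very little bandwidth but depend heavily on transmission time, security and dependability.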
The terminology used to describe these state comparison protection schemes is basically defined according to the impedance zone monitored by the protection relay as presented below.

CIGRE Terminology                                          Alternate Name
Intertripping Underreach Distance Protection               Direct Underreach Transfer Tripping (DUTT)
Permissive Underreach Distance Protection                  Permissive Underreach Transfer Tripping (PUTT)
Permissive Overreach Distance Protection                   Permissive Overreach Transfer Tripping (POTT)
Accelerated Underreach Distance Protection                 Zone Acceleration
Deblocking (or Blocking) Overreach Distance Protection     Directional Comparison Unblocking (or Blocking)

Figure 3.1 - State Comparison Protection Schemes [7]
3.1.2 Teleprotection Signalling Systems
Teleprotection signaling is the function of transforming the state information transmitted by the Protection Relay (e.g. a Binary Command) into a signal suitable for transmission over a telecommunication channel, and of restituting the information to the remote Protection Relay or remote Circuit Breaker in a secure and prompt manner. Teleprotection signaling is associated with the communication of State Comparison Protection schemes and all direct tripping applications. The telecommunication channel is typically an analog circuit over PLC, or a digital sub-E1 or E1 over a multiplexed digital communication system or a dedicated fibre (or wavelength).
The operational performance of a Teleprotection signaling system can be defined through the following parameters:
• Security is the ability to prevent communication service anomalies from restituting a Command at the remote end when no command has been issued. Security is expressed as the Probability Puc of “unwanted commands” (command condition set at the receiving end for a time duration longer than a specified limit). Security is related to the communication service integrity (error performance) and the Teleprotection Signaling system’s error detection capability.
• Transmission time is the maximum time (Tac) for the delivery of the command at the remote end, after which it is considered as having failed to be delivered. This is a constraint on the time performance of the communication service, not only in terms of nominal value but as a guaranteed limit.
• Dependability is the ability to deliver all issued commands at all times without any statistical considerations. It is expressed as the Probability Pmc of “missing commands” (issued commands not arriving at the remote device, arriving too late or with a time duration shorter than a specified limit). This sets a very severe constraint on the availability and error performance of the communication service, challenging such telecom service concepts as “errored seconds” and “degraded minutes” being counted in the available time of a communication service.
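As an illustration of how the three parameters above interact, the sketch below classifies entries of a hypothetical test log as missing or unwanted commands and derives empirical ratios for Pmc and Puc. The `Trial` record, the threshold names and the idea of counting per log entry are assumptions made for this example, not a standardized measurement procedure.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Trial:
    sent: bool                 # a command was issued at the sending end
    received: bool             # a command condition appeared at the receiving end
    delay_ms: float = 0.0      # delivery delay (meaningful when sent and received)
    duration_ms: float = 0.0   # duration of the received command condition


def is_missing(t: Trial, t_ac_ms: float, min_duration_ms: float) -> bool:
    """Issued command that is lost, later than Tac, or too short."""
    if not t.sent:
        return False
    if not t.received:
        return True
    return t.delay_ms > t_ac_ms or t.duration_ms < min_duration_ms


def is_unwanted(t: Trial, max_spurious_ms: float) -> bool:
    """Command condition longer than the limit although nothing was issued."""
    return (not t.sent) and t.received and t.duration_ms > max_spurious_ms


def estimate(trials: Sequence[Trial], t_ac_ms: float, min_duration_ms: float,
             max_spurious_ms: float) -> Tuple[float, float]:
    """Return the empirical (Pmc, Puc) ratios over the log."""
    sent = [t for t in trials if t.sent]
    idle = [t for t in trials if not t.sent]
    p_mc = (sum(is_missing(t, t_ac_ms, min_duration_ms) for t in sent) / len(sent)
            if sent else 0.0)
    p_uc = (sum(is_unwanted(t, max_spurious_ms) for t in idle) / len(idle)
            if idle else 0.0)
    return p_mc, p_uc
```

A command delivered after Tac counts as missing exactly like a lost one, which is why the transmission time has to be a guaranteed limit rather than a nominal value.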
3.1.3 Analog Comparison Protection Schemes
Analog Comparison Protection is based on the transmission and comparison of electrical parameters between the ends of a protected line. The “analog” values that are compared across the line are, in particular, current samples, although other schemes (e.g. Phase Comparison) also exist.
Current Differential Protection (longitudinal current differential) is applicable to any overhead line or underground cable at all voltage levels and is used in particular for:
• Very short lines and cables where the low impedance makes the adjustment of settings difficult for the use of Distance Relays.
• Multi-terminal lines where the intermediate in-feeds modify the impedance seen by the Distance Relays. The observed impedance then depends not only on the distance to the fault but also on the in-feed from the remote terminals, making an accurate impedance measurement impossible.
• Situations where only current transformers are installed at each end of the line (no voltage transformers).
• EHV transmission lines where series capacitors may create protection issues.
The transfer of “analog” samples between the ends of the protected line can be performed in several ways, the most common, at present, being the use of digital communications. The instantaneous current values at each end of the power line are sampled, converted to digital data and transmitted towards the other terminals with a sample rate ranging from 12 to 60 samples per cycle. Although the communication interface is generally a standard ITU-T (or EIA) interface, it should be noted that the time, integrity and availability constraints for these services are far from standard telecommunication practice. Direct optical fibre connection between protection terminals, or wavelength multiplexing of the optical protection signal, can also be used with enhanced reliability where a dedicated fibre or wavelength is available.
Current differential Protections are particularly time-sensitive as their operation is based upon the comparison of current samples collected from a remote point with those measured locally at the same instant of time in order to detect a fault. An error in sample timing or in the delay compensation mechanism results in a differential current that increases the risk of unwanted tripping. Modern systems provide a global time stamping of samples through GPS synchronization. However, the older generation relays, still largely deployed, use the total “round-trip” transfer time to calibrate the time difference between local and remote current samples, assuming that the go and return times are strictly equal. This makes the system very sensitive to any difference between the two transfer times and therefore requires the same routing for both directions of communication. This is a strong constraint on the operation of the communication network and the mode of resilience employed for the communication channel.
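The sensitivity to unequal go and return times can be quantified with a short worked sketch. It assumes the relay estimates the one-way delay as half the measured round trip, so the timing error equals half the asymmetry, and that both line ends carry equal 1 pu load currents with no fault. The function names are illustrative.

```python
import math


def sync_error_ms(go_ms: float, return_ms: float) -> float:
    """Timing error when the one-way delay is assumed to be half the round trip."""
    return abs(go_ms - return_ms) / 2.0


def phase_error_deg(error_ms: float, system_hz: float) -> float:
    """Phase shift of the fundamental caused by a timing error."""
    return 360.0 * system_hz * error_ms / 1000.0


def apparent_differential_pu(phase_deg: float) -> float:
    """Magnitude of the difference of two equal 1 pu phasors shifted by phase_deg."""
    return 2.0 * math.sin(math.radians(phase_deg) / 2.0)


if __name__ == "__main__":
    err = sync_error_ms(go_ms=3.0, return_ms=1.0)   # 2 ms asymmetry
    phase = phase_error_deg(err, system_hz=50.0)
    print(f"timing error {err:.1f} ms, phase error {phase:.1f} deg, "
          f"apparent differential {apparent_differential_pu(phase):.3f} pu")
```

Under these assumptions, a 2 ms asymmetry on a 50 Hz system yields a 1 ms timing error, an 18 degree phase error and an apparent differential current of about 0.31 pu on a healthy line. This illustrates why both directions must follow the same route.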
3.1.4 System-wide Protection Schemes
System-wide Protection operates over a wider area than power line protections. It consists of measuring units at different locations across the power system, which sample, in a synchronized manner, vector measurements of voltage values (Synchronized Phasors or Synchrophasors) and transmit the information to a central equipment which takes protection decisions (Wide Area Protection & Control System, WAP&C). It should be noted, however, that not all system-wide protection systems are based on synchrophasor measurements.
System-wide Protection can be used to implement an “Adaptive Protection Scheme: a protection philosophy which permits and seeks to make adjustments automatically in various protection functions in order to make them more attuned to prevailing system conditions”.
System-wide Protection can also be used to prevent power system disturbances such as overload, power swing and abnormal frequency or voltage. These schemes are called System Integrity Protection Schemes (SIPS), also known as Remedial Action Schemes (RAS) or Special Protection Systems (SPS). They consist of automated systems that protect the grid against system emergencies, minimizing the potential and extent of wide outages through automatic measures such as load shedding, generator shedding or system separation.
The telecommunication requirements for these system-wide protection schemes are similar to those for Current Differential Protections described above, but they involve moving large volumes of information across a whole sub-network rather than between the adjacent nodes at the ends of a transmission line. This implies time-constrained and fully predictable wide area network services. The required overall operating time is less than a few hundred milliseconds, within which the protection system transmission time should be less than several tens of milliseconds, and the propagation delay across the telecommunication system at most several milliseconds.
3.2 Energy Management, SCADA and WAMS Communications
Energy Management covers all functions necessary to monitor and control the operation of the power network. Control Centres need to exchange information with generating stations, substations, other Control Centres, other Utilities, power pools and non-Utility generators. The information to be exchanged comprises real time and historical power system monitoring data including control, scheduling and accounting data.
The reliability of the electricity supply depends ultimately on the security and reliability of the Energy Management System and its ability to exchange information. The Control Centre and its communications need therefore to be highly secure and reliable.
The power system architecture and its operational organization often include different hierarchical levels of Control Centres as well as geographically distinct Back-up facilities and Remote Operator positions implicating inter-Control Centre and remote workstation communications.
3.2.1 SCADA Communications
SCADA communication consists of the periodic exchange of short data messages between a central platform in the Control Centre and the Remote Terminal Units (RTU) in the electrical substations. The messages comprise status indications, measurements, commands, set-points and synchronizing signals that must be transmitted in real time and require high data integrity, accuracy, and short transfer time.
SCADA systems for power transmission and for distribution networks generally differ in their requirements, cost objectives and hence suitable communication solutions. The number of outstations and their corresponding size, cost, volume of traffic, and geographical dispersion are very different in the national transmission grid and in regional distribution networks. The time constraints and the required level of availability, fault tolerance and data integrity are also different. As a consequence, transmission grid SCADA communication is often implemented through a broadband private network with point-to-point circuits or multi-point circuits with a small number of RTUs (typically 2 or 3) per circuit, while in distribution networks (in particular for the MV level) lower capacity solutions such as UHF Multiple Address Radio systems (MARS), spread-spectrum licence-free radio systems or procured services (GPRS, VSAT, etc.) prevail.
Still today, the most widely employed communication mode for the substation RTU remains the Asynchronous Serial link through an RS232 interface, polled by the central control platform. The communication protocol associated with this mode has been standardized as IEC 60870-5-101 (IEC101), although many other protocols are still in use in legacy systems. The great advantage of Serial link SCADA is its conceptual simplicity. Its major drawbacks are its lack of flexibility (e.g. for back-up control centre connection) and its cumbersome installation, in particular at the Control Centre.
Packet switching has been applied to SCADA services since the late 1980s, essentially to save leased aggregated bandwidth on Control Centre links and to enhance system flexibility and resilience. SCADA networks have been implemented over X.25 packet switching, Frame Relay systems and ATM in certain countries. However, the worldwide popularity of SCADA communications over packet networks has been due to IP communications.
SCADA RTU communications are migrating to the TCP/IP-based protocol IEC 60870-5-104, generally called IEC104. The RTU communicates through an Ethernet LAN access interface at 10 or 100 Mbps, although the bandwidth allocated to each RTU communication often remains around 10 kbps. Legacy RTUs may be connected through a Terminal Server encapsulating Serial data. The use of TCP/IP considerably enhances the flexibility of the SCADA communication system, facilitating the relocation of an RTU or the switch-over of RTU communications to a back-up facility.
The migration process for large legacy SCADA networks from existing serial communications to TCP/IP is a major concern in many Utilities. This process may extend over many years, and does not necessarily cover at the same time the replacement of the RTU, its communication interface, the telecommunication network and the Control Centre Front-end facilities. Moreover, new RTUs dispersed across the network may be TCP/IP while the existing ones may remain serial linked until their programmed end-of-life.
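The difference between polled serial circuits and per-RTU IP bandwidth allocation can be made concrete with a rough sizing sketch. Every numeric input below (message sizes, line rate, turnaround time, 11 bits per character for start, 8 data, parity and stop bits) is an assumption chosen for illustration. Only the figures of 2 or 3 RTUs per circuit and around 10 kbps per RTU come from the text above.

```python
def poll_cycle_s(n_rtus: int, request_bytes: int, response_bytes: int,
                 baud: float, bits_per_char: int = 11,
                 turnaround_s: float = 0.0) -> float:
    """Time to poll every RTU once on a shared asynchronous serial circuit."""
    per_rtu_s = (request_bytes + response_bytes) * bits_per_char / baud + turnaround_s
    return n_rtus * per_rtu_s


def aggregate_kbps(n_rtus: int, per_rtu_kbps: float = 10.0) -> float:
    """Bandwidth to reserve on a Control Centre link for TCP/IP-connected RTUs."""
    return n_rtus * per_rtu_kbps


if __name__ == "__main__":
    # Hypothetical multi-point circuit: 3 RTUs, 15-byte poll, 60-byte reply, 9600 baud.
    print(f"serial poll cycle: {poll_cycle_s(3, 15, 60, 9600, turnaround_s=0.05):.2f} s")
    # Hypothetical control centre serving 200 RTUs over IP.
    print(f"aggregate IP bandwidth: {aggregate_kbps(200):.0f} kbps")
```

The serial poll cycle grows with each RTU added to a circuit, whereas the IP allocation scales only the aggregate link capacity. This is consistent with the small number of RTUs per serial circuit mentioned above.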
3.2.2 Inter-Control Centre Communications
Communications between Control Centres are necessary for connection to back-up facilities (e.g. for database synchronization), to other Control Centres (e.g. for dispatch coordination), or to other platforms (e.g. for market management applications). The primary purpose of the Inter-Control Centre Communications is to transfer data between control systems and to initiate control actions.
These communications are assured through the Inter-Control Centre Protocol (ICCP), standardized as IEC 60870-6 and also known as Telecontrol Application Service Element 2 (TASE.2), although earlier protocols may still be in use in certain older systems. ICCP uses an underlying transport service, normally TCP/IP over Ethernet. The required bandwidth for an ICCP link is generally around 2 Mbps (E1), which can be provisioned over an SDH network, although lower capacity links (64-128 kbps or even lower) have been used where no fibre and SDH capacity is available. The time constraint for an ICCP connection is of the order of hundreds of milliseconds, which rarely constitutes a constraint in an IP/Ethernet network over a digital communication infrastructure.
Security is the fundamental issue in implementing ICCP connections. An inadequately protected ICCP connection may form an open door to the control of the nation-wide energy network.
3.2.3 Remote Control Centre Operator Consoles
Remote connection of Operator positions to the Control Centre platform also requires high speed communications. An Ethernet connection with a throughput of 2-10 Mbps generally allows an adequate quality communication link for connecting these remote workstations.
3.2.4 Generation Control Signaling
Automatic Generation Control (AGC) signals are sent by the Control Centre to the different generation plants to increase or reduce their power production so as to maintain the frequency and tie-line flows. Generation Control signals are either part of a fast regulation loop (1-5 seconds) or slower “step-up” and “step-down” signals, transmitted from the Control Centre to Generation Plants/Units through dedicated communication channels (or TCP/IP). Their availability and security are therefore essential to the proper operation of the power system.
3.2.5 Wide Area Monitoring System (PMU Communications)
Wide Area Measurement and Monitoring provides a GPS-synchronized snapshot of the power system through the acquisition of complex parameters (amplitude and phase) across the power network. It enables a better visibility of power flow across the system incorporating dispersed generation and multiple Utilities. The collected complex parameters are bus voltages, line currents, etc.
The Phasor Measurement Unit (PMU) is the acquisition device in the HV substation, collecting time-tagged phasors. Measurements are transmitted to a central platform, generally through a Phasor Data Concentrator (PDC), for different applications. The different levels of wide area applications have very different requirements in terms of information exchange and consequently telecommunication service [8].
• Post-incident analysis and static modelling applications are offline systems where collected data is used to analyze the cause of an event or to adjust the behaviour model for a system. Data can be collected continuously, daily or only on request. The communication service can be a TCP/IP file transfer service with no real time constraint.
• Visualization and Situational Awareness applications collect data from sites and display them for human operator observation. These applications, which constitute the great majority of present day systems, have time requirements which are those of a human operator and must additionally present a level of sample loss imperceptible to the human operator. In terms of communication service, unacknowledged UDP/IP is an adequate solution in this case, whether through a dedicated network or a public provider.
• Monitoring & Decision Support systems use collected data to produce analytical information helping operators respond to grid events and to position the grid for improved security and resilience. Stability diagrams and corresponding voltage collapse margins, as well as different monitoring applications (Voltage & Frequency stability, Power Oscillations, Line Temperature, etc.) are among these applications. Monitoring and decision support applications have time constraints which are similar to power system SCADA. This is achievable through UDP over a private IP network or a Service Provider VPN through a carefully specified SLA.
• Closed Loop Applications are those which incorporate collecting of data from the grid, processing, automatic recognition of a pattern, and remedial action upon the grid. The systems are used for emergency situation control and special protection applications as described earlier in section 3.1.4. Closed loop synchrophasor applications are not yet widely implemented and their critical real-time nature necessitates particular attention to time control. Furthermore, the decision to act automatically upon the network in real time means that the data set (from different locations and the sample stack from each point) must be complete, that is to say almost lossless. Providing lossless data across a telecom network generally implies error recovery, which is constrained by time limitations.
PMU operation is specified by IEEE C37.118, which defines phasor construction using the GPS-satellite timing signal, as well as the phasor’s data format. The exact data volume associated with the transmission of a data packet from a PMU varies depending on the incorporated parameters and the way each of them is coded (i.e. floating point or not, etc.) but can be assumed to be around 80 – 100 octets. This data volume is to be transferred across the network at a rate which is governed by the sampling frequency of the PMU. The sampling frequency is expressed as a number of (or a fraction of) AC cycles. It typically ranges from 25 (or 30) samples per second, corresponding to one sample every two cycles, to 100-120 samples per second, corresponding to two samples every cycle (Nyquist Rate). This latter rate allows the processing of the signal corresponding to the AC fundamental wave. The required communication throughput is then somewhere in the range of 16 – 100 kbps for a 50 Hz power system, although PDC links may require from a few hundred kbps up to 1 Mbps or more.
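The throughput range quoted above follows directly from the packet size and reporting rate. The sketch below reproduces that arithmetic for the payload only; protocol overhead (UDP/IP, Ethernet) is not included, and the number of PMUs behind a PDC is a hypothetical value.

```python
def pmu_payload_kbps(frame_octets: int, frames_per_second: float) -> float:
    """Payload throughput of one PMU stream in kbps (1 kbps = 1000 bit/s)."""
    return frame_octets * 8 * frames_per_second / 1000.0


def pdc_payload_kbps(n_pmus: int, frame_octets: int,
                     frames_per_second: float) -> float:
    """Payload throughput on a link carrying n_pmus identical PMU streams."""
    return n_pmus * pmu_payload_kbps(frame_octets, frames_per_second)


if __name__ == "__main__":
    print(pmu_payload_kbps(80, 25))       # smallest packet, one sample per two cycles at 50 Hz
    print(pmu_payload_kbps(100, 100))     # largest packet, two samples per cycle at 50 Hz
    print(pdc_payload_kbps(10, 100, 100)) # hypothetical PDC link aggregating 10 PMUs
```

An 80-octet packet at 25 samples per second gives 16 kbps and a 100-octet packet at 100 samples per second gives 80 kbps of payload. Ten such streams already approach 1 Mbps, in line with the PDC link figures above.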
Type of application | Operation time | Latency | Data availability | Telecom service
Wide-area Visibility and Situational Awareness (display of voltage, phase power swing and line loading; help operator understand what is happening in RT, in a region or for a grid asset) | Human operator, minutes | Seconds | Sample loss not perceivable | UDP/IP on dedicated network or public provider
Decision support systems and Security assessment (analytical data helping operators respond to grid events; repositioning the grid for improved security; stability diagrams and voltage collapse margins; monitoring of Voltage & Frequency stability, Power Oscillations, Line Temperature) | Minutes or below | Seconds | Sample loss tolerated | UDP over private IP network or VPN with well specified SLA
System-level and grid asset models (static & dynamic), e.g. power plant models | Off-line, minutes | N/A | Non-critical | TCP/IP File Transfer
Closed Loop Applications (Emergency Situation Control and Protection Functions; Remedial Action (RAS) and Special Protection Systems) | Seconds or below | 10-100 ms | Very critical | Ethernet VLAN with fast recovery

Figure 3.2 – Wide Area Applications communication service requirements [mes]
3.3 Remote Substation Control and Automation The HV substation is evolving into a networked environment around Ethernet (IEC61850) which will rapidly become the main interfacing technology for all data exchange applications in the electrical power substation. Even if the interactions in the automation and control of the substation are at present local, the connection of the substation automation platform to other substations and/or to remote monitoring and control platforms is rapidly becoming a requirement. Many Digital Substation Control (DCS) platforms employ “Scada RTU”-type communication protocols such as IEC104 for data exchange with the EMS/SCADA environment.
3.4 Operational Telephone System
Highly reliable and secure voice communications are required for load dispatching and for network switching operations. At control centres, generating stations and switching substations, voice facilities are needed to allow operational staff to communicate quickly and efficiently. At times of disturbance on the system, the need for operational staff to communicate can be urgent. Normally, a private, highly secure operational telephone system is needed to provide the required facilities. The voice facilities for operational use include:
• Direct (hotline) telephone lines from the Control Centre to all major operational sites’ control rooms.
• Switched telephone service through PBX and a closed numbering scheme.
• Additional redundancy and operation in situations of site isolation.
• Interconnection with the public telephone network.
• Voice and data traffic.
• Mobile radio voice facilities for access to operational staff who visit facilities. (Mobile workforce communications is treated in a separate section).
The Operational telephone service is today evolving into IP telephony and is increasingly becoming an Ethernet-transported data service with particular time and bandwidth requirements. Some of the specific features of operational voice service are as follows:
• Access restriction – Use of the operational voice service is confined to operational staff and not accessible to unauthorized users.
• High availability – Voice service access for the operational staff, and in particular the access of the Control Centre to the network substations and generating plants, is essential and must present a very high availability through adequate route resilience and equipment duplication.
• Resilience/fault tolerance – The voice service must remain available even in the event of network faults, node failure and route unavailability. In particular, a star-structured network in which the failure of a single node may jeopardize the system is not acceptable. Multiple homing (at least dual homing) of secondary sites and a mesh interconnection of the main nodes is generally required to achieve the required level of fault tolerance.
• Transfer to Backup Control Centre – In emergency situations leading to the migration of power system control to a Back-up Control Centre, the telephone network must rapidly adapt in order to transfer the telephone calls for the Control staff to the Back-up facility. This transfer must be possible even if the communication equipment in the main control centre is no longer operational (e.g. fire, flood or power breakdown).
• Very rapid call connection – The call establishment time must be in line with the operational emergency situations in which the voice communication may become necessary. In particular, the structure of the telephone network (number of cascaded transits) and the employed signaling scheme may greatly influence the call connection speed.
• Priority functions – These functions allow critical communications to be established even when all voice network resources are occupied. This can be performed through Forced Releasing of facilities which are used by less critical communications, or by reserving the usage of certain facilities (e.g. communication channels) for priority calls only. Similarly, critical calls can “Break-in” into an established communication of a busy “called party”. Priority status can be attributed permanently to a given user line (i.e. Control Operator), or obtained dynamically through a code for a given communication.
• Caller identification and Call Queuing – Control centre operators need to identify automatically the source of incoming calls and to establish queues of “in progress” communications in order to interact with many sites, in particular at times of power system emergencies. In-progress and queued calls must be accessible and transferable between different Control centre operator positions.
• Mobile voice – Control centre facilities and large power plant telephone systems must have the capability to connect mobile voice terminals to fixed telephone extensions. Depending on the implemented mobile radio network, these connections may require the existence of PTT (Push to Talk) facilities and associated conversion of Half-Duplex to Full-Duplex voice communications.
• Ability to pre-select conference calls – The voice system must present the capability to establish pre-configured conference calls, in particular between operational staff in the control centre, in multiple substations and maintenance staff.
• Call Recording – Control centre voice facilities include voice recorders which constantly record all operator communications; the recordings are archived periodically. These call recordings are essential in order to establish the sequence of events and instructions given by the Control Operators in emergency situations.
Appendix 1 presents some examples of IP telephony usage in Utility Operational Voice systems.
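The two priority mechanisms listed among the features of the operational voice service (forced release of less critical calls, and channels reserved for priority calls only) can be illustrated with a small model. This is a hedged sketch only: the class names, the return strings and the policy of releasing the oldest non-priority call first are assumptions made for illustration and do not describe any particular PBX or IP telephony product.

```python
class Call:
    def __init__(self, caller, priority=False):
        self.caller = caller
        self.priority = priority


class TrunkGroup:
    """Toy model of a group of voice channels with priority handling."""

    def __init__(self, channels, reserved_for_priority=0):
        self.channels = channels
        self.reserved = reserved_for_priority  # channels kept free for priority calls
        self.active = []                       # calls in progress, oldest first

    def request(self, call):
        free = self.channels - len(self.active)
        if call.priority:
            if free > 0:
                self.active.append(call)
                return "connected"
            # Forced release: drop the oldest non-priority call (assumed policy).
            for existing in self.active:
                if not existing.priority:
                    self.active.remove(existing)
                    self.active.append(call)
                    return "connected after forced release of " + existing.caller
            return "blocked"
        # Ordinary calls may not consume the reserved channels.
        if free > self.reserved:
            self.active.append(call)
            return "connected"
        return "blocked"


group = TrunkGroup(channels=3, reserved_for_priority=1)
print(group.request(Call("office-1")))                      # ordinary call
print(group.request(Call("office-2")))                      # ordinary call
print(group.request(Call("office-3")))                      # refused: last channel is reserved
print(group.request(Call("control-centre", priority=True)))
print(group.request(Call("dispatcher", priority=True)))     # triggers a forced release
```

In this model an ordinary call is refused as soon as only reserved channels remain, while a priority call is refused only when every channel is already held by another priority call.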
3.5 Settlement and Revenue Metering and Customer Communications
Energy metering is the exchange of time integrated Energy Data at a commercial interface or boundary used for energy charging and billing.
3.5.1 Energy Metering in the Deregulated Environment
The opening of the electricity market and exchanges between countries, together with the consequent introduction of new players and roles in the power delivery system, modifies the requirements regarding the metering information. The transmission grid operator performs energy metering at the HV grid access point in order to feed appropriate information, with an adequate level of confidentiality, to the different market participants and to enable settlement and reconciliation processes as well as invoicing of its transmission services towards the distributors. Metering data may be used for the following purposes at the Transmission Operator side:
• Invoice the grid access service,
• Calculate and invoice (or pay) imbalances,
• Calculate the compensation for losses on the network,
• Pay for system services (frequency, voltage),
• Check and pay offers on the Balancing Mechanism.
At the customer side (Distributors, Industrial Customers, Generators):
• Sell or buy energy on the market,
• Check and control the load curves (comparison with the supplier invoice),
• Optimize the access contract,
• Control the process in real time by direct access or through remote reading,
• Make offers on the Balancing Market.
Customers with several plants connected to the network require complete metering data for each of their plants in order to get the global load profile. The data from distribution substations are used for the calculation of Distribution System Operator (DSO) losses, to set up the national load curve, and to calculate imbalances acting on the DSO network (spatial alignment and temporal reconciliation). Metering is also used by the Balance Responsible Entity in order to maintain balance on its perimeter and to check the imbalances [9].
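The requirement that a multi-plant customer needs metering data from every plant before a global load profile can be built can be sketched as follows. This is an illustrative example under stated assumptions: the function name, the plant names and the figures are hypothetical, and the profiles are assumed to be equal-length lists of energy values (kWh) per metering interval.

```python
from typing import Dict, List


def global_load_profile(plant_profiles: Dict[str, List[int]]) -> List[int]:
    """Sum per-interval energy (kWh) over all plants of one customer.

    Raises ValueError if any plant has missing intervals, because a partial
    data set would understate the customer's global load.
    """
    lengths = {len(profile) for profile in plant_profiles.values()}
    if len(lengths) != 1:
        raise ValueError("metering data incomplete: plants have different interval counts")
    # zip(*...) walks the profiles interval by interval.
    return [sum(interval) for interval in zip(*plant_profiles.values())]


# Hypothetical example: three plants, four metering intervals each.
profiles = {
    "plant_A": [120, 130, 125, 110],
    "plant_B": [60, 65, 70, 55],
    "plant_C": [20, 20, 25, 30],
}
print(global_load_profile(profiles))
```

Refusing to aggregate an incomplete data set, rather than silently summing what is available, is a design choice made for this sketch; an actual settlement process would apply the estimation and substitution rules of the applicable market code.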
3.5.2 Customer Metering, Advanced Metering Infrastructure Customer-related communications and revenue metering are not operational applications but represent enormous potential for telecommunication services in the power utility. They can
enable, in certain cases, other operational and monitoring applications in particular in the distribution network where the number of communication nodes is extremely high and the cost that can be attributed to the access for each node is very small. Distribution SCADA for the secondary network (e.g. 33/11 kV) and monitoring of MV transformers are typical examples of operational applications profiting from the deployment of advanced metering infrastructures. Advanced Metering Infrastructure (AMI) which covers the overall system composed of consumer data acquisition and collection as well as bidirectional communication with the electricity provider, is a further step from simple remote reading of customer meters. Several AMI projects in different countries are assessing new technologies for dedicated network coverage beyond the EPU sites perimeter (MV/LV PLC, meshed networking of packet radio systems, etc.) and some telecom/internet operators are working towards new service offers to occupy this promising market segment.
3.5.3 Advanced distribution applications and Smart Grid Many new applications allowing better coordination of the end energy consumer and the dispersed generation of electrical power with the overall power delivery system, under the banner of Smart Grid, require bidirectional communications between the centrally located control platform and dispersed consumer/producer premises. These applications which include demand response, selective load curtailment and dynamic and negotiated control of consumer power limitations must be served in terms of communications with a variety of telecom services whose requirements vary considerably according to the envisaged scenarios, with different impacts on the telecom service delivery mode of the EPU. These are further discussed under Chapter 12 “Further across the Horizon”.
4 OPERATION SUPPORT APPLICATIONS
4.1 Collaborative Multi-media Communications
Site working process in the Power Utility is changing with the new information system and IT practices. For the execution of their site duties, site personnel and the intervening staff require expert support, remote diagnostics and reporting facilities. The following constitute some of the required services:
• Networked office applications (e.g. mail and calendar systems, file transfer, remote database access, intranet),
• Work-order and ERP solutions (e.g. project control and time registration),
• Collaborative voice service, often called “switched” or “PBX” voice (increasingly evolving into an IP-data service as a consequence of switch technology change, network change and also in the objective of cost reduction and new features),
• Video conferencing facilities and electronic white-board in branch offices and control centres and PC-based video-streaming in the dispersed sites.
These applications require the secure extension of the corporate enterprise applications from the branch office to the operational sites, while remaining fully isolated from the operational applications. IT-support may be effectively administered from a corporate central site.
4.2 On-line Documentation
Documentation is an essential base for efficient management of utility infrastructure. Previously, the site support staff found all necessary information to carry out their tasks at the substation either in the documentation residing at site (equipment maintenance manuals and schedules, drawings, etc.) or in their briefcase. Increasingly, an extensive amount of support information for the field intervention staff is available in centralized servers that can be accessed on-line when required. Pictures and video add particularly useful information in the dispersed environment of the power delivery system. These applications require a broadband network in order to achieve acceptable response times. The introduction of inexpensive GPS equipment and commercial mapping applications makes Geographical Information Systems (GIS) an important tool for field-based maintenance personnel. Connecting to maintenance applications in the substations and downloading accurate maps, pictures and work orders can save considerable time. However, the use of GIS and the increasing automation of data acquisition of power line infrastructure (e.g. laser scanning) lead to rapidly growing data volumes and the need for scalable ICT infrastructure. On-line documentation is a well identified requirement and an existing facility in certain Utilities. The use of networking is to be coordinated through the Security policy of the Power Utility.
4.3 Substation Automation Platform Management
The monitoring and configuration management of the different components of the substation automation system (protection relays, feeder bay controllers, etc.) is still often performed locally in the substation and through the substation controller. It can be assumed that network-wide communications will be required to perform platform management tasks remotely, either through native TCP/IP or through encapsulation and terminal servers. In this case, a TCP/IP traffic flow can be assumed, resulting from the supervision data relating to the operational performance, health and condition of the SAS, as well as file transfers due to configuration data and parameter settings.
4.4 Condition and Quality Monitoring Communications
Primary assets of the power system (circuit breaker, power transformer, etc.) generate condition monitoring data through their associated electronic intelligence which needs to be collected for maintenance requirements, and for determining duty cycle, capability and loading ability. Asset condition monitoring enables the safe and efficient use of network components to their real end of life, at full efficiency, without disruption of service due to asset failure, environmental risks, or unnecessary preventive replacement. Secondary assets of the electrical substation related to the measurement, protection and control as well as their respective power supply (including that of the telecom equipment) also need to be monitored through a remote platform. Condition monitoring in the substation generates a large volume of non-real time data to be transferred continuously to one or multiple remote platforms, hence creating the necessity for a “monitoring network” across the telecommunication infrastructure. The associated architecture is indeed utility-dependent but can be implemented conveniently through web-service, with servers residing in the substation or at some other location. Environment monitoring applications in the power system are performed for two different purposes:
• Protect substation assets and premises (temperature monitoring, substation fire detection, etc.)
• Protect the environment from industrial risks and hazards related to the substation assets, e.g. chemical pollution detection, etc.
These are part of the security and safety applications as described in section 5.1.3.
Energy Quality Monitoring applications transfer data related to electrical power parameters at commercial interfaces or boundaries where energy is transferred between different power actors and may be sanctioned through financial penalties according to contractual quality of service criteria.
4.5 Substation data Retrieval Another group of data exchange applications in the power utility operational environment concerns the non-real time transfer of power process data captured in the electrical substation to data analysis platforms and engineering staff for the evaluation of events and confirmation of device configurations. Substation process data retrieval is typically a web-service
application, making formatted data available when required. Data to be transferred is as follows:
• Event Reports – typically log files and reports generated by an event recorder or historical system which provides information on the change of state of operational equipment.
• Oscillography Files – typically event triggered fault records generated by a protection device or fault recorder. These may contain digital events and analogue waveforms.
• Device Parameters and Settings – data files uploaded to provide information on the actual configuration of a device.
4.6 Mobile Workforce Communications EPUs make extensive use of mobile communications in the management and support of their infrastructure. In addition to traditional voice services for field-based operational workforce, the evolution of working practices is leading increasingly to mobile data networking applications connecting the maintenance staff to their support base, on-line documentation, and workshop applications such as spare parts database. In particular, systems can largely benefit from dedicated mobile data applications [10]. The operational mobile terminal units are evolving towards Smartphones and Tablet Computers for which ruggedized field-proof versions appear on the market. A number of projects for advanced mobile workforce communications have been reported. A particularly interesting one in Japan [11] is dedicated to disaster recovery. Another experimental project has developed a “wearable terminal” that keeps field workers in continuous contact with support personnel and enables the real-time transmission of camera images and audio from the work site to the support base which can then provide precise support through voice interaction using a headset [12]. These permanent contact systems can also provide checklists and templates for the elaboration of on-line realtime reporting and hence improve work safety, accident prevention and work efficiency. The implementation of data-rich mobile communication systems requires indeed the existence of high availability, disaster-resistant, high throughput wireless connectivity. More than other communication services in the Utility, the provision model for mobile workforce communications is often under assessment. There is no doubt that the most economical solution is to use public mobile services which provide a high level of geographical coverage through extensive deployment of Base stations and related infrastructure, as well as continuous roll-out of new data services and applications. 
The EPU pays for its communications, not for the infrastructure providing coverage. However, an essential application of the mobile system, its usage by maintenance teams during power outages and in disaster recovery situations, is severely constrained by the insufficient power autonomy of public base stations and by severely degraded service accessibility and performance when the network is heavily loaded (e.g. during a disaster situation). Deploying a “security grade” private mobile radio system (e.g. TETRA) with a high coverage is costly, and the roll-out of new data services and applications cannot keep pace with that of its public counterpart. This may lead utility staff to use public mobiles even if the company is equipped with its own private mobile facilities [10]. Many Electrical Utilities, in particular those in the distribution sector, own or share with other critical users some type of private “security grade” mobile network, although the issue remains largely a subject of internal discussion. Appendix A2 describes one such experience in Portugal where a shared TETRA service is used.
5 SECURITY, SAFETY AND ENVIRONMENTAL MONITORING
Security and safety applications in the Power Utility are not strictly speaking operational because they do not consist of processes used to operate, control and protect the power system. Still they are increasingly critical and subject to great attention in many Utilities. Many applications are not yet implemented in a generalized manner in most Utilities, but risk awareness and concern are growing and this leads to regulatory pressure for implementing risk mitigation applications. Security & Safety applications can be classified as follows:
• Protect Sites and Power System Assets from unauthorized access
• Protect the Public from access to dangerous sites and apparatus and from the impacts of the power system (e.g. dam water discharge alerts)
• Protect the environment from the pollution or undesirable impacts of the power components
• Protect the Utility staff from the operational risks
• Protect the Utility from Cyber threats on operational (or non-operational) systems
5.1 Security of Sites and Assets
5.1.1 Video-surveillance of sites
Growing concerns in recent years and the resulting regulatory obligations over the integrity and security of national critical infrastructures and the danger from sabotage and intentional damage have led many power utilities to implement substantial systems for remote video surveillance of substations and other operational installations. The substantial increase of unmanned installations and night-time surveillance amplifies this requirement. Traditional video surveillance equipment based on proprietary solutions has been, and still is, rather expensive. The introduction of relatively inexpensive, semi-intelligent IP-based cameras opens a new road to better control of the exterior border of substations. Surveillance cameras using Video over IP are widely used. Ideally, High Definition video would be used in order to provide the necessary resolution; however, the traffic volume makes these systems very difficult to implement in a generalized manner. Video-surveillance may also be used to prevent the possibility of damage from high voltage installations to the public (e.g. electrical sites and apparatus inadvertently left accessible to the public). This is particularly important in street-side power distribution transformer and switch shelters. Remote video monitoring of installations is a tremendous source of data traffic across the power network and can drive major telecom network rehabilitation.
5.1.2 Site Access Control Electronic site access control systems are increasingly used to control, register and monitor the physical access to operational sites. Smart electronic identity cards and biometrical authentication are becoming part of the security and safety policy. Electronic access control
allows differentiated accessibility in time and across locations for different classes of staff (operational staff, service contractors, maintenance, etc.). These applications require fast and reliable data communications for authentication and access registration.
5.1.3 Environmental Hazards Monitoring (Sites and Assets) Environmental awareness and concern is leading power utilities to implement remote environmental monitoring systems for producing an early warning on incidents and hence to avoid environmental impact. Typically, these applications are linked to environmental sensors for detecting fire, smoke, floods, gas and chemicals and provide alarm information to remote monitoring platforms.
5.1.4 Intruder Detection Protecting sites like power plants or substations from intruders has always been a main concern for Utilities, not only for the site protection itself but also for human and animal safety. Fences and guards were in the past the only solutions, but with increasingly unmanned installations, intrusion detection systems are being introduced. The classical intrusion detection system is composed of sensors (radio, laser, dry contacts, …) connected to a local collector unit that monitors the sensor states and, in case of a detection, sends online notifications to the Security Operational Centre (SOC) of the Utility and local security forces. More recent developments use video-surveillance cameras and image analyser software that alerts SOC operators in case of image pattern changes.
5.2 Human Safety & Operational Security 5.2.1 Earth Connection Monitoring Regulatory obligations on human safety impose in certain countries that the physical connection to Security Earth must be visible by the staff during intervention on HV apparatus in a substation or on a power line. Video-monitoring may be used in certain cases to fulfill this regulatory obligation. In this case, a video camera can provide a remote “visual” verification of ground connections. This application remains local and its service requirements are to be taken into account with other local area communications.
5.2.2 Isolated Worker Safety Communications Health and safety regulating authorities in many countries provide particular rules to deal with the case of workers whose duties bring them to work alone in a site, in order to protect them from the consequent hazards [13]. Emergency situations may arise due to sudden onset of a medical condition, accidental work-related injury or death, attack by an animal, exposure to elements, or by becoming stranded without transport, food or water. A person is considered as alone at work when he cannot be seen or heard by other persons who can provide assistance if necessary. The Power Utility must ensure that a means of communication is available in the event of an emergency to enable the employee to call for help, and also ensure that a procedure for regular and systematic contact with the employee at pre-determined intervals has been established.
1. Fixed Telephone service – The simplest communication service for isolated workers is indeed the accessibility of a telephone at site, provided that the person is able to reach the telephone in an emergency. This service must be available not only to the operational staff but also to external parties contracted for specific tasks in the Utility premises. The telephone system must provide an emergency number accessible to all categories of users.
2. Radio communications – Different categories of two-way radio systems are in use in different Utilities for traveling staff and for employees working in large sites such as power plants. Mobile workforce communications are covered in section 4.6 and may also be used for assuring safety during the trip to site, depending upon coverage constraints. Usage of private mobile radio as a means of assuring worker safety needs careful location of base stations and identification of shadow areas, as well as an adequate procedure to assure the supply of charged batteries. Public mobile phone is used in some Utilities as a cost-saving alternative, but may present serious drawbacks with service coverage and service continuity, particularly when the Power Utility employees are intervening to re-establish electrical power in a region, due to the extremely short power autonomy of the public base stations.
3. Satellite Communication systems – Satellite phones overcome the problems of public mobile phone in poorly covered areas and the dependence on local power supply of the base stations. Satellite systems also allow implementing “Location Beacon Systems” determining the location of the employee through GPS and signaling this location to an operational base permanently. Care should be taken however as their operation is affected by damage to aerials, failure of vehicle power supplies, or vehicle damage.
4. Personal security systems – These portable wireless transmitters are permanently in communication with a central receiver and may include a non-movement sensor that will automatically activate an alarm transmission if the transceiver has not moved within a certain time.
5. Emergency location beacons – When working in particularly remote areas, emergency location beacon systems which are automatically activated in emergency situations may be used. These systems do not depend upon the vehicle power supply and are not exposed to the same risk of damage as satellite communication systems.
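The non-movement alarm of a personal security system described in the list above amounts to a watchdog timer that is reset by every detected movement. The sketch below is an illustration only: the class name, the use of seconds as the time unit and the 120-second timeout are assumptions, not characteristics of any specific product.

```python
class NonMovementMonitor:
    """Raises an alarm condition when no movement is seen for timeout_s seconds."""

    def __init__(self, timeout_s, start_s=0.0):
        self.timeout_s = timeout_s
        self.last_movement_s = start_s  # treat start of shift as the last movement

    def report_movement(self, now_s):
        # Called whenever the wearable's motion sensor detects movement.
        self.last_movement_s = now_s

    def alarm_due(self, now_s):
        # True once the worker has been motionless for the whole timeout.
        return now_s - self.last_movement_s >= self.timeout_s


monitor = NonMovementMonitor(timeout_s=120)   # hypothetical 2-minute timeout
monitor.report_movement(now_s=100)
print(monitor.alarm_due(now_s=200))   # 100 s without movement
print(monitor.alarm_due(now_s=220))   # 120 s without movement
```

In a real deployment the alarm would be transmitted to the central receiver; the communication service then has to deliver it reliably, which is the requirement of interest in this section.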
5.2.3 Public Warning Communications Hydro Power Generation Utilities have extensive security communications related to the monitoring of Dam installations and their associated equipment. The alert of flooding due to water release to avoid dam overloading or lightning warning during thunderstorms are some of the capabilities of Dam Management Systems. A Dam Management System implemented by Kyushu Electric Power Company in Japan [14] is presented in figure 5.1. The system collects meteorological data (rainfall, etc.) and dam status information (e.g. water level and inflow) from different stations to the dam management centre which transmits discharge gate operation commands to release water in the event of excessive rainfall or high water level in the reservoir. In this case, the dam security system
operates the Discharge Alert devices. Sirens and information display boards notify the public that water will be discharged, preventing water-related accidents. Kyushu Electric Power also informs its customers through a specific website about the location and time of lightning strikes in the Kyushu region. The lightning strike forecast is derived from field values measured by sensors across the operating region. This information is then sent to and processed at a central point, and is used as the criteria of judgment for the generation of lightning, thunder, and launching, making possible protection against lightning and safety control.
Figure 5.1 – Kyushu Electric Power Company Dam Management System [14]
5.2.4 Hydraulic Structure Operation and Maintenance Applications Hydro Generators have flood warning and security surveillance applications that are discussed elsewhere in this technical brochure. They also have applications to ensure the integrity of their dams and other hydraulic structures and for the provision of support services to operators, maintenance crews and field inspection staff. Hydro Generators need to monitor parameters relating to the long term stability of dams and so have regular inspection programs for measuring discharge from seepage points and also for measuring dam deformation as recorded by strain gauges spread across the entire dam. Typically this data is manually recorded during inspections and keyed in or uploaded back in the office to the applications utilizing this data. Inspection data can now be uploaded from the many site monitoring points by the inspector inputting the data directly to the companies’ corporate network using a Wi-Fi connection back to a local Wi-Fi “AP” Access Point. An Access Point, with a range of up to several hundred meters, will typically adequately cover the dam and provide very economical “last mile” service delivery. This same network can also be used for operational visual monitoring purposes using IP video cameras so that decisions can be made in advance of when problems escalate to the stage that
they are picked up by conventional SCADA alarms. (e.g. amount of debris building up at trash racks or in the vicinity of critical flow monitoring weirs). IP video cameras are also used as confirmation of SCADA alarms in flood control situations and hence provide extra information for difficult operational decisions or where associated events such as fallen trees, rain or snow impedes access to the site. Field staff can also use such a network to access water level and gate position data to enable onsite calibration of transducers without the need to tie up office staff in relaying values back to the field staff. They can also send photographs, etc directly back to specialist engineering staff from remote locations thereby sometimes saving travel time to remote locations for specialist staff. In all of the above, security policies and technologies have to be rigorously applied due to the openness of Wi-Fi networks. Typically technologies such as WPA2, strong authentication, IPSec tunnels, firewalls, etc are mandatory because any Wi-Fi network has to be regarded as completely untrusted – in effect the same as an internet connection. Many Hydro Generators also have UHF/VHF radio networks that support their operations by transporting data regarding stream flows as well as snow and meteorological parameters. This data is used in applications to optimize the Hydro Generator’s operations and also to account for the use of water as part of their water license conditions. Lastly Hydro Generators also have protection systems which are used for the reduction in impact caused by the failure of a hydraulic structure, particularly for the detection of a burst or significantly leaking penstock. These systems are usually based on differential flow measurement (e.g. at either end of the penstock) with automatic control back to guard gates. 
Similar to electrical protection systems, reliability is critical for this application and hence duplication of systems and communications facilities is inevitably employed. However a much longer transmission delay (e.g. seconds) is acceptable for these protection systems due to the relatively slow speed of operation of hydraulic gates. Hydro Generator hydraulic assets are often located in remote areas outside the coverage of telecommunications providers. However in some cases the main hydraulic structure may be able to be connected to a telecommunications Service Provider and used as a local “hub” to aggregate the communications required to support the above applications.
5.3 Cyber-Security applications communication
The exchange of information across the different sites of the power utility has made it essential to implement numerous security barriers and intrusion detection/prevention systems across the network. As the number of security-related devices and systems increases, it becomes necessary to reinforce them through coordinated administration, supervision and efficient log processing. Some Power Utilities such as RTE in France have set up, within a national structure, a Security Operational Center (SOC) which prevents and reacts to security-related events [15]. The missions of the SOC are as follows:
1. Centralized Administration of Security Devices – guarantees the homogeneity of filtering rules for data exchange protocols and access checking through the translation of security policies. Centralized administration sets security rules for firewalls, updates security software, and manages security-oriented VPNs.
2. Centralized Supervision of Security Devices – provides an image of the current status of specific security network equipment and applications (network card, memory, CPU, hard disks, application processes) and enables preventive or curative measures. Centralized supervision allows quick identification of breakdowns and generates warnings for SOC operators.
3. Security Event Log Analysis – Detailed analysis of security events on the company network gives a correlated, real-time picture of what has happened. Information is collected from all security devices and normalized, aggregated and/or correlated according to security policies to provide, in real time, diagnosis of intrusions or intrusion attempts. Beyond real-time indicators, this supplies security reports, notably through dashboards, which may be used for investigating and understanding incidents or trends.
The operation of the described process requires a “Security Management Communication Service” with a high level of reliability and security (through encryption), covering all sites of the Utility in which Security devices and systems are implemented.
6 OPERATIONAL CONSTRAINTS AND SERVICE LEVEL AGREEMENTS
6.1 Operational Coverage and Topology
Telecom coverage and service access to different operational sites is the first condition for the adoption of any particular service provisioning solution.
• HV Grid sites are accessed in a very cost effective manner through a dedicated telecommunication infrastructure, based on fibre over HV lines (and PLC). Considering the peripheral location of HV substations and power plants, a Telco is not always in a position to provide access with the required capacity.
• Utility offices in an urban environment with no proximity to the HV grid cannot be accessed directly through the dedicated network, necessitating the installation of new underground cables, urban microwave, or other wireless “last mile” connections. This approach may be unfeasible or costly. These sites are often more economically served by public telecom operators sharing infrastructure with other customers.
• Hydroelectric power generation plants and off-shore wind farms often have no other service alternative than dedicated communications or VSAT services. On the other hand, dispersed generation facilities on customer premises may be covered by public telecom operators.
The topology of the network infrastructure has direct influence on the performance and fault tolerance that can be expected from the communication service [16].
• The number of transit and switching nodes to be crossed determines the time delay and the availability for the connection. Direct links through dedicated fibre over the HV line are most favoured for critical protection applications.
Telco services are often discarded in this case because a direct physical link cannot be established (the Telco topology is based on criteria such as customer concentration, availability/cost of leased fibres, and site facilities).
• The possibility of establishing two independent routes between two access points of the network determines the fault tolerance that can be incorporated into the network design. Typically, a Control Centre towards which a great number of communications converge cannot be located on a secondary spur of the communication network.
Fig 6.1 – Topology of Telco and dedicated network and their correlation to the Power system (figure labels: Control Centre, Power Plant, Substation, 400 kV Line, Provider Nodal Switch, Service Provider, Dedicated Network)
6.2 Time Constraints
Deterministic and controlled time behaviour for communications of time-sensitive applications is one of the major reasons leading utilities to deploy and maintain dedicated telecommunication networks. In these networks, the time performance (as well as availability and fault tolerance) can be adapted to the requirements of each application through an appropriate choice and blending of technologies and proper topological structuring. These time control methods are further described in this section.
On the other hand, when public telecom services are employed, time control is rarely part of the Service Level Agreement (SLA) of the Service Provider. Generally, the Service Provider cannot commit contractually to anything better than 20-30ms communication delay and is therefore excluded as a solution when faster applications such as protection relaying are to be carried.
Time behaviour of a communication service can be characterized through a number of different parameters as follows:
a. Time Latency (delay)
Time latency is the absolute delay introduced by the communication network into an application. Time latency is an important constraint wherever a remote command (or remote information needed for elaborating a command) is to be received in constrained time. Time latency also matters where a bi-directional exchange is to be established with limited waiting time at the receiving end.
• Communication channels used for protection relay applications may need absolute time latency as low as 5 to 10 ms depending upon the protection scheme and the power system’s fault clearance time (around 80 - 100ms depending on the voltage level).
• SCADA system overall performance can be degraded by a high time latency or even made completely inoperable through RTU communication time-outs.
• Voice communication can be seriously degraded by high time latency (more than 150ms) through echo.
Absolute time latency problems may be avoided through an appropriate design based on Time Division Multiplexed circuits (e.g. SDH), constrained usage of switching and routing, and the absence of traffic queuing for critical applications. For delay-critical services delivered over IP, pre-established static routes may be employed to ensure guaranteed end-to-end performance. It should be noted that the “real-time” requirements of SCADA RTU communications are generally in the range of seconds, whereas transmission times across a well-designed SCADA Ethernet/IP infrastructure are an order of magnitude smaller. The main issue here is therefore the number of intermediate nodes in the routing of SCADA information as well as the time for any encapsulation and concatenation.
b. Time Predictability (delay variations) and Timing Jitter
Time predictability determines the delay variation of a communication channel. It defines the capability to predict the time delay of the communication network, independently of the traffic load from other services being carried across the network, and whatever the network’s state. Time predictability assigns a probability distribution around a nominal time delay value and therefore a maximum acceptable delay. Protection applications (in particular current differential schemes) and voice services are particularly sensitive to delay variations. Time predictability is achieved by avoiding traffic queues which generate variable service times and by imposing constrained routing (e.g. maximum number of hops, pre-determined back-up route, or no route resilience). In the case of an SDH system, ring protection must be avoided or carefully analyzed.
c. “Go/Return Path” differential delay
Time coherence among the remote points of a distributed application is sometimes achieved by initiating a remote loop-back and measuring the go-and-return transfer time. Assuming that the go and return times are equal, the transfer time between the two sites is taken as half the measured round-trip time. This type of transfer time estimation is used in older generation differential protection relays and also in absolute time clock distribution systems. It renders the systems very sensitive to Go/Return path delay variations. In the case of current differential protection systems, a maximal differential delay of less than 400 microseconds is often necessary. Differential delay is controlled by avoiding store-and-forward with traffic queuing and variable service time, and by using bi-directional route switching (i.e. when a fault is detected in one direction of communication, both directions must switch to the same alternate bi-directional route).
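The sensitivity to go/return asymmetry can be illustrated with a short calculation. The sketch below is not taken from the brochure: the function names, the sample path delays and the 50 Hz system frequency are assumptions chosen for illustration. It shows that halving the round-trip time misestimates the one-way delay by half the go/return difference, and what that error represents as an apparent phase shift between the two line ends.

```python
def loopback_estimate_error_s(go_delay_s: float, return_delay_s: float) -> float:
    """Error of the 'half the round trip' estimate relative to the true go delay."""
    estimated_one_way = (go_delay_s + return_delay_s) / 2.0
    return estimated_one_way - go_delay_s


def phase_error_degrees(time_error_s: float, system_frequency_hz: float = 50.0) -> float:
    """Apparent phase shift caused by a time alignment error at the given frequency."""
    return 360.0 * system_frequency_hz * time_error_s


# Hypothetical channel: 2.0 ms go path, 2.4 ms return path (400 microsecond asymmetry).
error_s = loopback_estimate_error_s(0.0020, 0.0024)
print(f"one-way estimate error: {error_s * 1e6:.0f} us")
print(f"apparent phase error at 50 Hz: {phase_error_degrees(error_s):.1f} degrees")
```

With these assumed figures the estimate is off by half the asymmetry, which is why route changes affecting only one direction are a concern for this class of application.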
d. Restoration time
Transporting operational traffic imposes a time limit on service restoration following a network fault condition. Some operational services require Hitless Switching (no loss of service during the switchover from a normal configuration to a backup state). Restoration time depends upon the employed communication technology and the topological complexity of the network:
• End-to-end alternate route switch-over for each critical service is very fast but does not assure a high level of resilience.
• Ring protection across an SDH network (e.g. SNCP protection) can restore in less than 50ms.
• The restoration mechanism of Ethernet, the Spanning Tree Protocol (STP), has a convergence time which depends upon the complexity of the Ethernet mesh. The restoration time may be too long for SCADA communications. More elaborate options such as Rapid Spanning Tree (RSTP) reduce this time.
• Routing in an IP network is based on different algorithms which re-establish the routing table in each node after any network configuration or status change. Typically RIP-based routing requires around one minute for restoring communications while an OSPF-based system can restore in 5-10 seconds.
• Multi-Protocol Label Switching (MPLS) enables the network to restore service using pre-established alternate routes for each Virtual Private Network (VPN) and can in this way considerably reduce the restoration time.
The time behavior of the communication network is determined by the following aspects:
1. TDM versus Packet network – The migration from conventional TDM (Time Division Multiplexing) networks towards Ethernet and IP considerably increases the network’s bandwidth efficiency (avoiding idle bandwidth), flexibility (network interfaces and routing) and resilience. However, it is also a major source of concern for the control of time behavior. Assembling data packets before transmission and store-and-forward of the packet at each intermediate node cause additional buffering delay which increases with the packet size and with the number of transit nodes. Furthermore, the dynamic data routing needed for network resilience gives rise to delay variation and lack of time predictability. Moreover, the fully deterministic behaviour of TDM is replaced by the statistical behaviour of the packet-switched multi-service network, where queuing and traffic profiles determine the overall delay.
2. Multi-service Integration – Bandwidth efficiency in packet networks is achieved through integrating multiple traffic streams into the same packet network. Priority queuing mechanisms are employed to assure better time performance for more critical services. This implies that one or more traffic queues are established at each routing/switching node. A queue-based store-and-forward communication system provides essentially a “Best Effort” service with statistical time characteristics. This issue is often masked by over-dimensioning of the network and by application time requirements that are an order of magnitude less stringent than what the network delivers.
3. Network Resilience versus Fixed Routing – Network resilience ensures the continuity of service in the presence of network faults, but at the same time renders the routing of communications indeterminate. Time predictability is generally sacrificed for improved resilience. Traffic streams which are sensitive to delay variations must generally be treated separately, without the service restoration mechanisms described previously under “restoration time”.
4. L1/L2/L3 Partitioning and Topological Structuring – In order to provide adequate time performance to critical services while maintaining bandwidth efficiency, flexibility, cost and resilience, it is necessary to design the network with an adequate level of “information forwarding” at physical, link and network layers.
• Direct or TDM connections for best time performance but at low bandwidth efficiency, flexibility and resilience.
• Ethernet Switching with Virtual Networking (VLAN) and priority assignment for fast transfer of information packets (frames).
• IP Routing for maximum resilience and multi-servicing.
Different network topologies can in this way be obtained for different services over the same telecom infrastructure, leading to different numbers of intermediate nodes at each layer.
5. Network Monitoring, SLA Management and Planning – If managed services are to be used for time-sensitive applications, it is important that the Service Provider be contractually committed through the SLA to meet the time constraints (absolute time latency, delay variations and restoration times), knowing that very often the SLA is not explicit enough to support many critical applications. Moreover, a contractual commitment on time performance must be continuously (or periodically) monitored and effectively enforced: time performance monitoring functions must be associated with the network’s performance management facilities in order to verify that contractual obligations are met.
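The buffering delay mentioned under “TDM versus Packet network” above can be approximated with two simple terms: the time to fill a packet from a constant-rate source, and the serialization time repeated on every store-and-forward link. The following is a minimal sketch under stated assumptions; the function names and the example figures (an 80-byte payload from a 64 kbps source, 1500-byte frames over five 2 Mbps links) are illustrative and not from the brochure, and queuing, processing and propagation delays are deliberately ignored.

```python
def packetization_delay_s(payload_bytes: int, source_rate_bps: float) -> float:
    """Time needed to fill one packet payload from a constant-rate source."""
    return payload_bytes * 8 / source_rate_bps


def store_and_forward_delay_s(frame_bytes: int, link_rate_bps: float, links: int) -> float:
    """Serialization delay accumulated when each link retransmits the whole frame."""
    return links * frame_bytes * 8 / link_rate_bps


# Hypothetical figures for illustration only.
fill_time = packetization_delay_s(payload_bytes=80, source_rate_bps=64_000)
transit_time = store_and_forward_delay_s(frame_bytes=1500, link_rate_bps=2_000_000, links=5)
print(f"packetization: {fill_time * 1e3:.1f} ms")
print(f"store-and-forward over 5 links: {transit_time * 1e3:.1f} ms")
```

Both terms grow linearly, the first with packet size and the second with packet size times the number of links crossed, which matches the qualitative statement in the text.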
6.3 Availability Constraints
Availability is a service-related statistical parameter which can be defined as the probability of proper operation of the network for a given information exchange. It is normally quoted as a percentage of “up” time of the service, or the percentage of time that the network can effectively forward traffic (e.g. 99.999%). It can be estimated theoretically and measured practically on a network-wide or per-circuit basis. Service Availability can be expressed as:
$$A_{Service} = 1 - \frac{\text{Mean Time To Restore Service}}{\text{Mean Time Between Service Outages}}$$
Some currently used values are given below:
Availability objective | Service Downtime | Example of Service
99.999% | 5.25 min/year (~5 hours / 57 years) | Protection Communication
99.99% | 52.5 min/year (~5 hours / 5.7 years) | SCADA, Operational Voice
99.9% | 525 min/year (~5 hours / 0.57 years) | Data Service
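The downtime figures in the table follow directly from the availability expression. As a hedged illustration (the function names are hypothetical and a 365-day year is assumed), the conversion can be sketched as:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def service_availability(mean_time_to_restore_h: float,
                         mean_time_between_outages_h: float) -> float:
    """Availability from the restore time and the time between outages (same unit)."""
    return 1.0 - mean_time_to_restore_h / mean_time_between_outages_h


def downtime_minutes_per_year(availability: float) -> float:
    """Expected unavailable minutes per year for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


for objective in (0.99999, 0.9999, 0.999):
    print(f"{objective:.3%} -> {downtime_minutes_per_year(objective):.2f} min/year")

# One 5-hour restoration every 57 years corresponds to roughly "five nines".
print(f"{service_availability(5, 57 * 365 * 24):.6f}")
```

The same annual downtime can therefore result from one long outage in decades or from many short ones, a distinction the single availability figure does not capture.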
However, this statistical parameter, widely used in public telecommunications, in computer systems and networks, and in defining SLAs, must be used with caution when applied to operation-critical services with an extremely low “service-to-idle” time ratio. As an example, consider a managed service with an apparently high contractual availability of 99.999 %. This gives an unavailability figure of 1E-5, and an unavailable time of about 315 sec per year. A Protection Relay system sending 50 trip commands in one year can be “unavailable” during 6 seconds each time and still respect 99.999% availability! For low duty cycle operation-critical services, it is more appropriate to use “service dependability” defined as the conditional probability of a service being available when it is
solicited. If we consider the unavailability of the service as being independent from (uncorrelated with) the probability of requiring the service, then:
Dependability = Availability x Service-to-Idle Time Ratio
However, the hypothesis that service unavailability is uncorrelated with service requirement is often far from evident for operational communications. Typically, a power system fault is a situation that initiates extensive exchange of information, but it also induces impairments in the operation of the communication system.
Unavailability in a communication service has different origins as follows:
a. Network infrastructure faults
b. Channel impairments (noise and interference, synchronization loss, fading, etc.) inducing error detection and data rejection. It should be noted that the borderline between Unavailable Time and Degraded Performance (Available) Time is very dependent upon the communication service and its specific constraints. This point is further discussed under Service Integrity (section 6.6).
c. Timing, queuing and priority impairments leading to unsuitable Quality of Service and late delivery of information
d. Works and maintenance in the network infrastructure without adequate measures to assure service continuity (excludes restoration works for the service itself).
Service availability can be improved in the following manners:
• Reduce the occurrence of network infrastructure faults. This can be achieved through more reliable and fault tolerant network devices and infrastructure resources.
• Reduce channel and timing impairments in the network. This can be done through suitable transmission link design, synchronization planning, traffic engineering, performance planning, and installation practice.
• Reduce the impact of network component faults and channel impairments on the delivered service. This can be achieved through a more resilient network design and duplicated access.
• Reduce the duration of service down-time following a network fault. This can be achieved through faster detection and localization of network faults, faster identification of impacted services and faster restoration of service through network maintenance. Reducing downtime is therefore dependent upon the service and infrastructure management and maintenance processes and organization, well-trained staff, well-dimensioned stocks of spares and adapted monitoring tools discussed in more detail in later sections.
6.4 Service Survivability and Resilience
Survivability is defined as the ability of the communication service to continue during and after a disturbance (network infrastructure fault). Many operation-critical communication services do not tolerate service interruption and down-time. In addition to the statistical concept of availability, it is essential to assure that, as a minimum, no single fault in the reliability chain shall jeopardize the application.
The concept of service survivability perceived at the user end translates into Resilience in the telecom Service Provider’s infrastructure. Resilience is the ability to provide and maintain an acceptable level of service in the presence of faults. It is achieved through a multi-level resilience model [17]:
1. Fault tolerant network elements with duplicated core components and duplication of access equipment at critical sites
2. Survivable topology (e.g. Ring and Mesh structures allowing alternate routing)
3. Disruption-tolerant end-to-end transport through protection switching and service restoration mechanisms (e.g. SDH MSP and Ring protection, Ethernet Spanning Tree Protocol (RSTP), IP routing mechanisms such as OSPF)
4. Fault Tolerant and adaptive applications and overlays – Main/backup end-to-end SCADA RTU communication circuits and Protection channels
5. Adapted Management strategy and system supervision through appropriate fault management tools dedicated to the operational services
The following points must be noted in relation to service survivability and consequently in relation to the design of resilience into the network:
• Service Restoration Time – This point has already been discussed under the Time Constraints section. Different levels of network resilience operate at different time scales which may or may not be compatible with the maximum acceptable outage duration of different applications. Typically, what is commonly called “Hitless Switching” signifies that an established connection for an application remains unaffected by the disruption time due to the switching of the communication path or resource.
• Routing Control – Designing resilience into the network generally signifies injecting a degree of uncertainty into the routing of information and the resources which are used for delivering the service. This in turn impacts the absolute time latency of the network and generates delay variation. In the most time-sensitive applications (e.g. Protection Relay communications), resilience is restricted to the highest level, that is to say to the application itself (main/backup end-to-end routes). The two communication channels for these applications must employ no common path, no common node and no common equipment, so that no single fault disrupts the application. This requirement generally requires a total control of the channels’ routing.
• Underlying infrastructure – Where the telecom Service Provider employs lower layer connectivity service delivered by another infrastructure provider to build the network, it becomes difficult or impossible to guarantee the independence of main and backup routes. In particular, when a public telecom operator and managed services are involved, tracking the full routing of connections can become extremely difficult.
• Coordination of Resilience - Applying different layers of resilience in the network to the same service without adequate coordination may cause unnecessary cost and complexity. As an example, main/back-up end-to-end application level fault tolerance, Spanning Tree protection on the Ethernet connectivity and ring protection in the underlying SDH network may all operate on the same network fault with different time scales to switch the traffic from the same topological path to another unique topological path.
• Dormant Faults - An important issue in assuring service survivability is the capability to detect latent faults in the back-up routes. A classical strategy adopted in power system telecom networks is the cross-over of Main and Backup routes for different applications as presented in figure (x) (e.g. Protection Line 1 / Line 2). This approach uses each of the two paths, and its associated resources, for one Main and one Back-up route, and therefore checks each path continuously to detect any anomalies.
• Management Facilities – In an increasingly resilient network, permanent network route monitoring becomes an essential part of the Network Management System. It lets the network supervisor track and observe the routing of information at different network infrastructure layers at a given instant of time. This considerably improves the “context-awareness” required to control network resilience and therefore service survivability. In addition, the management facilities determine the impact of network infrastructure faults on the service delivered to higher layers of the network and therefore allow the coordination of resilience at different levels.
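The “no common path, no common node” requirement stated under Routing Control lends itself to an automated check when the routing of both channels is known. The sketch below is illustrative only: it assumes each route is available as an ordered list of node names (the names are hypothetical), that both routes share the same two end sites, and it does not model shared equipment, ducts or underlying infrastructure leased from another provider.

```python
def links_of(route):
    """Undirected links traversed by a route given as an ordered list of node names."""
    return {frozenset(pair) for pair in zip(route, route[1:])}


def shared_resources(main_route, backup_route):
    """Return the intermediate nodes and the links used by both routes."""
    end_points = {main_route[0], main_route[-1]}
    common_nodes = (set(main_route) & set(backup_route)) - end_points
    common_links = links_of(main_route) & links_of(backup_route)
    return common_nodes, common_links


def is_fully_diverse(main_route, backup_route):
    """True when the routes share no intermediate node and no link."""
    common_nodes, common_links = shared_resources(main_route, backup_route)
    return not common_nodes and not common_links


# Hypothetical routes between two substations.
main = ["SubA", "N1", "N2", "SubB"]
backup_ok = ["SubA", "B1", "B2", "SubB"]
backup_bad = ["SubA", "B1", "N2", "SubB"]

print(is_fully_diverse(main, backup_ok))
print(shared_resources(main, backup_bad))
```

A check of this kind is only as good as the route information fed to it, which is the reason the text stresses permanent route monitoring at every infrastructure layer.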
Fig 6.2 – Double routing and crossed routing for SCADA and Protection communications [16] (figure panels: Indeterminate Routing, Controlled Routing; route labels N1, B1, N2, B2)
6.5 Service Security Constraints
Information security is a global issue covering the entire information and communication system. It is treated as a full end-to-end process through an appropriate Security Policy identifying security domains, taking measures to mitigate risks and devising periodic audit schemes to assess the effectiveness. The subject is extensively treated in other CIGRE publications [18] and by NERC CIP standards (North American Electric Reliability Corporation, Critical Infrastructure Protection).
This section only presents the security requirements for the “network cloud” expected by the Operational Service User so that the telecommunication connectivity shall not compromise the security level that the User’s information infrastructure has achieved. It is in particular valid when there is a separation between the telecom Service Provider and the operational application infrastructure. Security risk mitigation measures for the telecom Service Provider are presented in figure 6.3.
a. The physical access point to the communication service must be secured and possibly allow a network access user authentication (e.g. RADIUS).
b. The connectivity service across the network must be isolated from other services through dedicated physical bandwidth or through separate virtual private networks (VLAN/VPN). In case of multi-service IP integration, security barriers and intrusion detection must also be incorporated into the communication network. The connection across the network can further be protected through encryption.
c. Telecom network management platform can constitute a major vulnerability in the provision of communication services. Access to this platform must be secured through physical protection (access to the facilities), authentication and logging.
d. Access to the equipment constituting the nodes of the communication network is also a major source of vulnerability.
Restricted physical access to network equipment, secured HMI, and disabling of remote configuration and parameter setting are common measures to mitigate risks.
Figure 6.3 – Security risk mitigation measures for the telecom Service Provider (figure labels: User Access: Physical Protection, User Authentication; Network: Service Isolation (Physical, VLAN, VPN), Encryption, Firewall, Intrusion Detection; Network Node: Protected HMI, Physical Protection; Telecom Management System (TMS): Protected Access, User Authentication, Access Logging; Remote Access to TMS & Telecom Node HMI)
6.6 Service Integrity
Service integrity is the aptitude of the communication network to deliver the transmitted information without degradation, without loss and without on-purpose alteration. Integrity of information relating to on-purpose alteration has been covered under Service Security. The present section deals with degradation and loss of information due to channel impairments.
In digital communication networks, data communication channel integrity is characterized by error performance. The concept of long term bit error rate and error probability is widely used but does not contain any information regarding the distribution of errors in time. For many services the distribution of errors is more important than the actual number of errors. Three parameters are defined by ITU-T to describe the error performance of a 64kbps connection [19]:
1. Errored Second (ES) is a 1-second period that contains one or more errors. For data services the information is commonly transmitted in blocks containing error detection mechanisms. Blocks received with one or more transmission errors are subject to retransmission. In order to have a high throughput, it is necessary to minimize the number of errored blocks. The 1-second period was historically adopted as a compromise value for data block size.
2. Severely Errored Second (SES) is a 1-second period where the short term bit error rate evaluated over one second exceeds 10^-3. An SES can lead to a loss of synchronization and it is considered that the connection is unusable during the time interval.
3. Degraded Minute (DM) is a one minute period where the short term error ratio exceeds 10^-6. The degraded minute has been devised principally for digital telephony for which the mentioned error rate is the subjectively perceived boundary of virtually unimpaired transmission.
CIGRE used the above mentioned ITU-T definitions to set a power utility objective on the parameters fixed at 15% of international connection objectives.
It considered a reference connection composed of 5 hops between a remote substation and a control centre, and 20 hops in inter-control centre connections [20]:
End-to-end Error Performance Objective for 64kbps Channel
Parameter | ITU-T G.821 | CIGRE [20]
Errored Second (ES) | 8 % | 1.2 %
Severely Errored Second (SES) | 0.2 % | 0.03 %
Degraded Minutes (DM) | 10 % | 1.5 %
ITU-T defined the time limit between unavailability and degraded performance as 10 consecutive seconds. This means that if the bit error ratio exceeds 10^-3 (SES) for more than ten consecutive seconds, the connection is considered as unavailable for that time. Otherwise, the time interval is considered as available but with degraded performance as shown in figure 6.4. Unavailable time begins when ten consecutive SES are observed. It ends when no SES is
observed during ten consecutive seconds. These latter seconds count as available time. The counting of degraded minutes is carried out only when the connection is available (excludes unavailable periods).
Despite important technological changes in telecommunications, the error performance objective and its related definitions are still widely used in power utility networks, in particular for planning and testing 2Mbps connectivity through SDH infrastructure and the primary access multiplexing systems. However, it can be fully inadequate in some situations. For critical applications such as Protection Relay communication, “available time with degraded performance” is not a reasonable definition. A system which presents one SES every ten seconds (or even 1 SES every two seconds!) cannot be considered as “available with degraded performance”.
Figure 6.4 – Available time definitions [20] (figure labels: Total Time, Unavailable Time, Available Time; ES, SES, DM; Data Integrity Performance, Degraded Performance, Availability Performance)
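The ES/SES definitions and the ten-second rule described above can be expressed as a small classifier over per-second bit error counts. This is a sketch under stated assumptions rather than a reference implementation: the function names and the sample trace are hypothetical, degraded minutes are not computed, the ten SES that open an unavailable period are counted as unavailable time (as in ITU-T G.821), and the state is simply held when fewer than ten seconds remain in the trace.

```python
BIT_RATE = 64_000     # bits per second of the reference connection
SES_THRESHOLD = 1e-3  # a second whose bit error ratio exceeds this is an SES
WINDOW = 10           # consecutive seconds that open or close unavailable time


def classify_seconds(errors_per_second):
    """Label each second 'OK', 'ES' or 'SES' from its bit error count."""
    labels = []
    for errors in errors_per_second:
        if errors / BIT_RATE > SES_THRESHOLD:
            labels.append("SES")  # an SES also contains errors, so it is an ES too
        elif errors > 0:
            labels.append("ES")
        else:
            labels.append("OK")
    return labels


def availability_mask(labels):
    """True for each second that counts as available time."""
    mask = []
    unavailable = False
    for i in range(len(labels)):
        window = labels[i:i + WINDOW]
        if len(window) == WINDOW:
            if not unavailable and all(x == "SES" for x in window):
                unavailable = True   # unavailable time starts with this second
            elif unavailable and all(x != "SES" for x in window):
                unavailable = False  # available time resumes with this second
        mask.append(not unavailable)
    return mask


def summarize(errors_per_second):
    """Count available/unavailable seconds and the ES/SES within available time."""
    labels = classify_seconds(errors_per_second)
    mask = availability_mask(labels)
    available = [lab for lab, ok in zip(labels, mask) if ok]
    return {
        "available_s": len(available),
        "unavailable_s": len(labels) - len(available),
        "errored_s": sum(lab != "OK" for lab in available),
        "severely_errored_s": available.count("SES"),
    }


# Hypothetical trace: a 12-second burst of heavy errors, then one isolated error.
trace = [0] * 5 + [100] * 12 + [0] * 3 + [1] + [0] * 10
print(summarize(trace))
```

Note that a trace with nine consecutive SES repeated indefinitely never becomes “unavailable” under these rules, which is the weakness the text points out for protection communications.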
Packet mode Integrity
In packet mode communications, error detection coding and possible retransmission mechanisms prevent the great majority of transmission errors from being seen by the Service User. Impaired information packets are either lost (e.g. in UDP/IP) or corrected through retransmission (e.g. TCP/IP). A marginal amount of transmission errors, called residual errors, are undetected by the error detection mechanism and handed over to the user. Operational applications may have a specified objective for the Residual Error Probability (e.g. 10^-12 whatever the channel bit error rate for Telecontrol Commands), in particular where a residual error cannot be detected at higher layers of the information exchange system or by the application.
Furthermore, an information packet that arrives with a delay outside the application time constraints is considered as lost information. Consequently, even where a retransmission mechanism exists, a valid packet can still be considered lost if it arrives outside the application’s time tolerance.
The most common integrity check in a packet data network is therefore Lost Packets statistics through a simple “echo response” (Ping) across the network. Ping commands measure the round-trip time, record any packet loss, and print a statistical summary of the echo response packets received, together with minimum, mean, and maximum round trip times. The echo request can be sent with different lengths (number of bytes of accompanying data) to simulate typical packet lengths corresponding to an application.
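The point that a late packet counts as lost can be made concrete by post-processing echo measurements against an application deadline. The sketch below does not send any packets; it assumes the round-trip times have already been collected by some measurement tool, with `None` standing for an echo that never returned. The function name, the field names and the sample values are hypothetical.

```python
from statistics import mean


def echo_summary(round_trip_times_ms, deadline_ms):
    """Summarize echo results; replies slower than deadline_ms count as lost."""
    sent = len(round_trip_times_ms)
    if sent == 0:
        raise ValueError("at least one echo request is required")
    received = [rtt for rtt in round_trip_times_ms if rtt is not None]
    on_time = [rtt for rtt in received if rtt <= deadline_ms]
    return {
        "sent": sent,
        "network_loss_ratio": (sent - len(received)) / sent,
        "effective_loss_ratio": (sent - len(on_time)) / sent,
        "min_ms": min(received) if received else None,
        "mean_ms": mean(received) if received else None,
        "max_ms": max(received) if received else None,
    }


# Hypothetical measurement: one echo never returned, one returned far too late.
samples = [12.0, 15.0, None, 250.0, 13.0]
print(echo_summary(samples, deadline_ms=100.0))
```

The gap between the two loss ratios shows how much of the apparent loss comes from timing rather than from the network dropping packets.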
6.7 Future Sustainability, Legacy Openness and Vendor Independence
Utility operational applications and substation assets have in general a much longer service life-time than telecommunication services. A newly deployed substation application is expected to operate 10-15 years before being replaced. The adopted communication services for such an application are moreover expected to be “stable and field-proven” at the time of deployment. The communication solution, service or technology must therefore be sustainable well beyond the expected life-time of any new generation, mass market, consumer oriented service or technology.
If a public Telco service is employed, the service may disappear before the end-of-life of the power system application. If a dedicated telecom service is used, the interfacing units or equipment may no longer be supported by the manufacturer. Furthermore, the upgrade of communication system software release or core components, which is a routine operation in most communication networks, may be extremely difficult and may require long term intervention planning if critical applications such as power system protection relaying are to be carried over the network.
Similarly, implementing a new communication infrastructure requires the ability to connect many generations of power system applications, which makes legacy interfacing an essential issue. A new communication solution must provide a way to serve existing applications which may coexist with their replacement systems for a very long time.
In order to assure future upgrade and legacy openness, communication solutions and the corresponding service delivery scheme must not depend upon the use of any proprietary interfaces or technologies and must be as far as possible technology-independent.
6.8 Environmental Constraints
Many access points for operational services are at electrical substations, power plants, and other electrical installations. Communication equipment is therefore subject to the same electromagnetic environment as other electronic instrumentation. Service access interfaces must be adequately protected. Equipment cabinets must be closed, fitted with accessories (earth bar, surge protection, EMC filters, etc.) and wired according to substation installation codes and practices. Cables and wires running within the site perimeter can be a source of conducted disturbances to the communication system. A detailed specification for these EMC immunity aspects is given in [21]. Observing these precautions and practices is costly, and their necessity is not always immediately apparent. The consequences of neglecting them are only discovered the painful way, during a power system anomaly generating large transient currents and voltages.
Climatic control inside an electrical power site is often minimal, and any installed electronic equipment or its cabinet must withstand temperature, humidity, dust, etc. at levels which in general do not correspond to a telecom or IT environment. Closed equipment cabinets with an adequate degree of protection are generally required. Climatic aspects are also specified in relevant IEC standards.
It should be noted that adopting managed services through a public provider does not remove the expenditure and effort associated with these aspects, because access equipment must be installed at Utility sites.
Lastly, the impact of Earth Potential Rise (EPR) during a station earth fault needs to be taken into account when connecting a new telecommunications service to an HV substation or power station. As an example, an insulation breakdown of a 330 kV asset to earth with a typical fault current of 20 kA may cause the station earth mat to rise up to 8 kV or more above remote earth potential. The exact figure depends on many factors including actual fault current, soil resistivity, earth grid impedance, etc. The way the earth mat, fences and external connections were initially designed and interconnected would have ensured the safety of people on site and remote from the site. However, subsequently adding a new physical telecommunications connection without a proper understanding of EPR could create a very dangerous situation for staff at the station or remote from the station, due to the difference in earth potential that exists during the earth fault. This is easily solved by using optical fibre or radio solutions instead of connecting copper communications cables to HV stations. If there is no economical solution other than connecting an HV zone to a telecommunications Service Provider by a copper cable, then it is essential for safety reasons that appropriate isolation devices are used.
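The order of magnitude in the EPR example can be checked with a simplified Ohm's law relation, EPR = fault current x earth grid impedance. The sketch below is only illustrative: it assumes that the full 20 kA fault current returns through the earth grid (real earthing studies account for current division, soil resistivity and grid geometry), and the function names are hypothetical.

```python
def earth_potential_rise_v(fault_current_a: float, grid_impedance_ohm: float) -> float:
    """Simplified EPR: rise of the earth mat above remote earth, V = I * Z."""
    return fault_current_a * grid_impedance_ohm


def implied_grid_impedance_ohm(epr_v: float, fault_current_a: float) -> float:
    """Earth grid impedance implied by a given EPR and fault current."""
    return epr_v / fault_current_a


# Figures from the example in the text: 8 kV rise for a 20 kA fault.
impedance = implied_grid_impedance_ohm(8_000.0, 20_000.0)
print(f"implied earth grid impedance: {impedance:.2f} ohm")

# Under the same simplification, a hypothetical 1 ohm grid would rise to 20 kV.
print(f"EPR with a 1 ohm grid: {earth_potential_rise_v(20_000.0, 1.0) / 1000:.0f} kV")
```

Under this simplification the example corresponds to an earth grid impedance of about 0.4 ohm, which shows that even a low-impedance earth mat can rise by several kilovolts during a fault.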
6.9 Defining Service Level Agreements
Whatever the mode of provisioning of telecom services in the EPU, and whatever the relationship between the Service User and Provider (Formal, Semi-formal or Implicit), it is essential to assure a common understanding of the qualities and attributes of the delivered service. The contractual document that reflects these attributes, as well as the obligations and liabilities of the Service Provider towards the Service User, is called the Service Level Agreement (SLA).
An SLA allows the Service User to express the operational constraints of its application, as defined in the previous sections, to the telecom Service Provider and to obtain the provider’s assurance that the delivered service shall meet these requirements.
An SLA also allows the Service Provider to define the network resources and management processes (as presented in chapter 10 hereafter) that he must use in order to meet his contractual obligations towards the Service User. Furthermore, the Service Provider may use the SLA agreed with his service customers in order to specify the level of service that he expects from his own contractors and providers (e.g. underlying infrastructure or support services).
Finally, the SLA allows the Service Provider to know what obligations the Service User must meet so that the service can be delivered and maintained by the Service Provider. Examples may include the provision of rack or floor space for the Service Provider’s equipment, the provision of AC or DC power, access during and out of hours, third party insurance coverage, etc.
The precision and the exhaustiveness of the SLA become particularly important when the provider is multi-customer and multi-service, and increasingly so as the Utility moves towards a fully procured telecom service. However, multi-customer Telecom Service Providers such as public telecom operators very often provide a catalog of standard SLAs, none of which may meet the requirements of the EPU.
Standard “Operator SLAs” are usually not sufficiently precise to guarantee the
fulfillment of operational constraints as described previously, and the Service Provider may not be prepared to review his entire network’s operation mode and operational process to meet one customer’s SLA requirements. In this case, assessing the provider’s most appropriate SLA against the operational constraints of the EPU applications allows the gap to be estimated and the risk associated with its potential impact to be analysed.
The following checklist has been prepared to serve utilities for specifying or assessing SLAs in the EPU operational context.
Figure 6.5 - SLA checklist for EPU procuring telecom connectivity services
SLA Parameter | Description / Comments
Interface type | As required by the application (e.g. Optical Ethernet, G703, RS232). Choosing a physical interface such as Ethernet that can be scaled remotely results in easier expansion of services as the need grows.
Bandwidth and throughput | Guaranteed minimum and peak bandwidth available to the service, and degree of flexibility.
% number of packets allowed per service for each procured Quality of Service (QOS) level | It is important to set a policy at the edge of the EPU network to avoid exceeding the allowable limits; otherwise the policy on entry to the Service Provider’s network will either drop the packets or re-mark them to the lowest priority service, leading to poor service performance due to oversubscription of the service by the EPU itself. Conversely, the EPU wants the Service Provider to apply limiting policies on entry to its network in order to protect the EPU service from contention due to oversubscribed services from other Service Provider customers.
Time Latency (end-to-end delay) | For packet based services, this needs to be defined for each class of service. Voice services, for example, will be processed via separate low latency queues.
Delay Variation (Jitter) | For packet based services, this needs to be defined for each QOS level. As is the case with most data service parameters, it is usually expressed by the Service Provider as a monthly average. Consider how to manage the situation of high peaks that do not cause the monthly averages to exceed the Service Provider specifications. (High peak jitter can cause voice degradation or network convergence problems and still not hit the monthly average parameters.)
Go-Return delay difference | For certain protection relay communications. Asymmetrical delay will cause certain protection schemes to fail.
Service Restore Time on network change | The time required for automatic reconfiguration mechanisms to act upon the network and hence to restore service (e.g. Spanning Tree Protocol, SDH Ring Protection restore time, etc.).
Availability | Distribution, frequency, duration, and timing of service failures.
Integrity and Packet Loss | Specified for each procured class of service.
Power Faults Correlation | Critical services not impacted by power system disturbances. Precautions for not losing service during disturbance.
Figure 6.5 (continued) - SLA Checklist for EPU Procuring Telecom Connectivity Services
SLA Parameter | Description / Comments
Resilience and Routing Control | Control of the provider on the routes taken by services in normal time and on anomalies (determines the capability of establishing duplicated communications without common point of failure). The Service Provider and Service User need to agree on the routing protocol between their networks, and to set various metrics that impact on the resilience of the interconnected networks.
Power Autonomy | The time duration for which the service can be delivered in case of A.C. power outage.
Maximum Time to Restore Service | Service Provider’s ability to respond to service failures and carry out the necessary repairs within the maximum specified time. Different times will be defined for urban, regional, rural and remote locations depending on the location of Service Provider maintenance staff.
Dual Route Independence | Ability to guarantee that specified connections between two points never use the same equipment, cable segment, power supply, node or cable conduit.
Physical redundancy check | Specify the level of redundancy required, for example at network level, equipment level, or at specific locations.
Service Isolation & Security | Isolation between internal and external traffic, as well as between different internal services. Measures deployed by the provider to protect against the risks of interfering third parties (confidentiality, denial of service, integrity of information). An EPU will usually have to regard a Service Provider as “untrusted” and employ security techniques such as encryption.
Access Arrangements | Most EPUs have special rules for site access for security and safety reasons. These need to be communicated to the Service Provider and factored into his support of the service.
Qualified / Certified / Insured Workforce | Ensure that the Service Provider has sufficient depth in its workforce with the right number of personnel in the right locations to ensure that response time guarantees are realistic. Ensure appropriate insurances are in place to cover accidents by the Service Provider workforce when attending an EPU site.
Performance Reports / Fault Notification | Meaningful and comprehensible information to be provided in a timely fashion. An EPU should consider implementing their own monitoring tools to ensure the performance of the Services is appropriate. This is especially important for packet based services using different QOS levels.
Figure 6.5 (continued) - SLA Checklist for EPU Procuring Telecom Connectivity Services
SLA Parameter | Description / Comments
Penalties and Liability | While penalties may not compensate for loss of critical services, they do focus a Service Provider’s attention on the need to accurately monitor the SLA guarantees. Usually a Service Provider will exclude responsibility for contingent liabilities and cap their overall liability to a percentage rebate of fees paid. It is worth considering inserting a termination clause in the SLA that allows termination of the service for a sustained poor performance. At least this enables an EPU to engage a new Service Provider and potentially fix the problem using a different service if the current Service Provider continues not to remedy the problem.
Other Legal Conditions | Depending on the structure of the contracts (e.g. if there is a separate service provision contract or not) there may be other legal conditions that need to be covered in the SLA, including details for: confidentiality between the parties, intellectual property, compliance with all applicable laws (and governing law where the service is provided across jurisdictions), acceptance and payment, Force Majeure and Termination of contract provisions, to name the most common ones.
Figure 6.7, titled “Typical Communication Service Requirements for EPU Applications”, provides a cross reference of typical service requirements for the EPU applications discussed in Sections 3, 4 and 5. Figure 6.6 gives the meaning of the numbers 1, 2, 3 and 4 used in Figure 6.7.
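The Delay Variation (Jitter) entry of the checklist warns that a monthly average can hide damaging peaks. The sketch below illustrates the point with invented numbers: the sample values, the 5 ms average limit and the 30 ms peak limit are assumptions chosen for the example, not figures from any SLA, and `jitter_report` is a hypothetical helper name.

```python
from statistics import mean
from typing import Dict, Sequence


def jitter_report(samples_ms: Sequence[float],
                  mean_limit_ms: float,
                  peak_limit_ms: float) -> Dict[str, object]:
    """Compare delay-variation samples with an average limit and a peak limit."""
    average = mean(samples_ms)
    peaks = [s for s in samples_ms if s > peak_limit_ms]
    return {
        "mean_ms": average,
        "max_ms": max(samples_ms),
        "peak_count": len(peaks),            # samples above the peak limit
        "meets_average": average <= mean_limit_ms,
        "meets_peak": not peaks,
    }


# Hypothetical month: mostly 2 ms of jitter, with five 80 ms bursts.
samples = [2.0] * 995 + [80.0] * 5
report = jitter_report(samples, mean_limit_ms=5.0, peak_limit_ms=30.0)
print(report)
```

Here the average stays well within the assumed limit although five bursts exceed the assumed peak limit, which is why an EPU may want the SLA, or its own monitoring, to track peak or percentile values in addition to monthly averages.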
Figure 6.6 – Constraint Severity Notation
Criteria | 1 Lowest Severity | 2 Low Severity | 3 High Severity | 4 Highest Severity
Operational Coverage | Control Centres & Corporate Sites | Plants and stations & Control Platform | Along the grid (e.g. workforce) | Beyond the grid (Energy farms, customer sites, etc.)
Time Latency | 1 – 5 sec (Human operator) | 0.1 – 1 sec | Few cycles (20 – 100 msec) | Fraction of a cycle (5 – 20 msec)
Time Predictability, Delay Variation | Seconds | 0.1 – 1 sec | 10 – 100 msec | 1 – 10 msec
Differential Delay (go-return path) | May be through different telecom media | Uncontrolled over the same telecom system | Controlled Routing | Identical path, 200 µs
Restoration Time | Few Hours | Few Minutes | Few Seconds | 100 msec or less
Availability | 99% | 99.9% | 99.99% | 99.999%
Service Survivability & Resilience | Service may be lost in the event of anomalies | Survives one module or one link failure | Survives loss of one node or few links | Survives major system faults & disasters
Security Domain | Public | Un-trusted | In Confidence | Protected
Service Integrity | Lost data recovered (Acknowledge & Retransmission) | Not so sensitive to recurrent data error & loss | Tolerates some data loss | High data integrity is critical
Sustainability, Life-cycle Mgt. | Continuous upgrade (type IT) | Yearly upgrade | Multi-annual upgrade (Planned migration) | Constant over application asset lifetime
Environmental Class | Customer Premises, Admin Building, Control Centre | Power plant / Substation (Control & Relay Rooms) | Grid corridors | Switch-yard, Hydraulic Structure
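To give a feel for the four availability levels of Figure 6.6 (99%, 99.9%, 99.99% and 99.999%), the following sketch converts each percentage into the cumulative outage time it allows per year. It assumes a 365-day year and says nothing about how the outage time is distributed; the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, assuming a non-leap year


def annual_downtime_hours(availability_percent: float) -> float:
    """Cumulative outage time per year permitted by an availability figure."""
    return (1.0 - availability_percent / 100.0) * HOURS_PER_YEAR


# Severity levels 1 to 4 of the Availability criterion.
for level, availability in enumerate([99.0, 99.9, 99.99, 99.999], start=1):
    hours = annual_downtime_hours(availability)
    print(f"level {level}: {availability}% -> {hours:.3f} h/year ({hours * 60:.1f} min)")
```

The four levels correspond to roughly 87.6 hours, 8.76 hours, 53 minutes and 5.3 minutes of outage per year respectively, so each additional "nine" reduces the permitted outage by a factor of ten.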
Figure 6.7 - Typical Communication Service Requirements for EPU Applications
Applications | Coverage | Time Latency | Delay Variation | Differential Delay | Restoration Time | Availability | Survivability | Security Domain | Service Integrity | Life-cycle Mgt. | Environment Class
Operational Applications
Protection Communications Current Differential | 2 | 4 | 4 | 3-4 | 4 | 4 | 4 | 4 | 4 | 4 | 2
Protection Communications State Comparison (command) | 2 | 3 | 3 | 3 | 4 | 3 | 4 | 4 | 4 | 4 | 2
System-wide Protection (WAP&C) | 2 | 2-3 | 3 | 3 | 4 | 4 | 4 | 4 | 4 | 4 | 2
Remote substation control | 2 | 2 | 2 | 2-3 | 3 | 2 | 3 | 4 | 4 | 3 | 2
Operational Telephony | 2 | 2 | 2 | 2 | 3 | 2 | 3 | 4 | 2 | 3 | 2
SCADA RTU | 2 | 2 | 2 | 2 | 3 | 2 | 2 | 4 | 3 | 3 | 2
Generation Control Signaling | 2 | 2 | 1 | 1 | 3 | 2 | 2 | 4 | 4 | 3 | 2
Inter-control centre communication | 1 | 2 | 1 | 2 | 3 | 2 | 2 | 4 | 1 | 1 | 1
Remote Operator | 1 | 1 | 1 | 2 | 3 | 2 | 2 | 4 | 1 | 1 | 2
Synchrophasor visualization & monitoring (WAMS) | 2 | 1 | 1 | 1 | 2 | 1 | 2 | 4 | 2 | 3 | 2
Settlement and Reconciliation metering | 2 | 1 | 1 | 1 | 2 | 1 | 1 | 3 | 3 | 3 | 2
Smart Metering | 4 | 1 | 1 | 1 | 1 | 1 | 1 | 3 | 1 | 3 | 1
Figure 6.7 (continued) - Typical Communication Service Requirements for EPU Applications
Application groups: Operation Support; Security & Safety
Applications | Coverage | Time Latency | Delay Variation | Differential Delay | Restoration Time | Availability | Survivability | Security Domain | Service Integrity | Life-cycle Mgt. | Environment Class
Mobile Workforce Communications | 3 | 2 | 2 | 2 | 2 | 2 | 4 | 1-2 | 2 | 1-2 | 4
Collaborative Multimedia Comms. | 2 | 2 | 2 | 1 | 1 | 1 | 2 | 2-3 | 1-2 | 1 | 1-2
Automation Device Management | 2 | 1 | 1 | 2 | 2 | 1 | 2 | 4 | 3 | 3-4 | 2
Substation Data Retrieval | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 4 | 1 | 1-2 | 2
On-line Documentation | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 3-4 | 1 | 1 | 2
Condition Monitoring | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 3 | 1-2 | 3-4 | 2
Video-surveillance of sites | 2 | 1 | 1 | 1 | 1 | 1 | 3 | 3-4 | 2 | 3 | 4
Site Access Control | 2 | 1 | 1 | 1 | 1 | 1 | 4 | 3-4 | 1 | 3 | 2
Environment Hazard Monitoring | 2 | 1 | 1 | 1 | 1 | 1 | 4 | 3-4 | 1 | 3 | 2
Intruder Detection | 2 | 1 | 1 | 1 | 1 | 1 | 4 | 3-4 | 1 | 2-3 | 3-4
Isolated Worker Safety | 3 | 1 | 1 | 1 | 1 | 1 | 4 | 3-4 | 1 | 3 | 3-4
Public Warning Applications | 4 | 1 | 1 | 1 | 1 | 1 | 4 | 3-4 | 1 | 3 | 4
Hydraulic Stress O&M | 2 | 1 | 1 | 1 | 1 | 1 | 4 | 3-4 | 1 | 3 | 4
Cyber-security Applications | 2 | 1 | 1 | 2 | 1 | 1 | 4 | 4 | 1 | 1-2 | 2
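The gap assessment recommended in section 6.9 can be expressed with the severity notation of Figures 6.6 and 6.7. The sketch below is one possible way to do it, under stated assumptions: the requirement levels are taken from the Current Differential protection row (using the lower bound 3 where the figure gives 3-4), the provider's offer is entirely hypothetical and assumed to have been mapped onto the same 1-4 scale, and the dictionary keys and the function name `sla_gaps` are illustrative.

```python
from typing import Dict

# Required severity levels (1-4, higher is more demanding) for
# Current Differential protection, as read from Figure 6.7.
required = {
    "time_latency": 4,
    "delay_variation": 4,
    "differential_delay": 3,   # figure gives 3-4; lower bound assumed here
    "restoration_time": 4,
    "availability": 4,
}

# Hypothetical standard SLA of a provider, mapped onto the same scale.
offered = {
    "time_latency": 3,
    "delay_variation": 4,
    "restoration_time": 2,
    "availability": 3,
}


def sla_gaps(required: Dict[str, int], offered: Dict[str, int]) -> Dict[str, int]:
    """Return the shortfall (in severity levels) for each unmet criterion."""
    gaps = {}
    for criterion, need in required.items():
        have = offered.get(criterion, 0)  # not covered by the SLA counts as 0
        if have < need:
            gaps[criterion] = need - have
    return gaps


print(sla_gaps(required, offered))
```

Each reported shortfall is a starting point for the risk analysis described in the text: the EPU can then decide whether the gap is acceptable, must be mitigated by its own means, or rules out the procured service for that application.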
7 DISASTER RECOVERY AND SERVICE CONTINUITY
7.1 Introduction
Natural catastrophes and intentional disasters have considerably increased the awareness and concern about the vulnerabilities of all critical national infrastructures, including the power delivery system. Disaster Recovery Planning is therefore being incorporated into the organization of Electrical Power Utilities. Since the communication system is essential for the re-establishment of the power system after any major disruption, it must be particularly robust, geographically redundant, and tolerant to many anomalies in its constitution.
As a “National Critical Infrastructure”, the Utility is subject to state-specified obligations assuring rapid recovery of electrical power in case of major disasters. The telecommunication facilities and services are required for the restoration of the electrical power service through
• the power system automation and control system
• the communications of the Operation & Maintenance staff
The telecom infrastructure and service must therefore be conceived to:
1. Tolerate the loss of infrastructure in a node, in a link or in a region
2. Tolerate the loss of mains power supply for a relatively long duration
3. Redirect all substation communications (SCADA and voice) to a back-up Control Centre when required (e.g. in case of destruction or major damage of the Control Centre)
4. Include fast deployment communication systems (e.g. radio, satellite) for implementing temporary communication links and networks to replace the damaged or non-operating facilities or to constitute temporary relays for the Utility’s restoration staff
5. Provide specific information exchange facilities for disaster warning, staff coordination and recovery team communications.
Disaster Recovery is not a concept specific to telecommunications but a general plan covering all aspects of the Electrical Power Utility. As such, if telecom assets, infrastructure and staff are located within the perimeter of the Utility’s premises, then they are integrated into the Utility’s Disaster Recovery and Business Continuity Plan (DR/BCP). However, if telecommunication services are provided by a different entity, with staff and assets at other sites, then the coherence of the telecom Service Provider’s DR/BCP with that of the Utility must be assured and periodically audited.
7.2 Threats and Risk Management and Risk Assessment
There are many different threats that can negatively impact the normal operation of telecommunications services. Before defining recovery requirements in case of threat occurrence, a risk assessment is recommended. A risk assessment is a part of the global risk management process of the EPU, and it aims to ensure that all risks faced by the EPU are appropriately identified, understood and treated.
The decision making process for risk treatment relies on information about the threats and vulnerabilities that contribute to the likelihood of the risk occurring and the impact of its occurrence, compared with the cost of mitigating the risk and the risk appetite of the EPU. A considerable part of the threats and vulnerabilities faced by EPUs are disasters. Some common ones include the following:
• Natural disasters (floods, earthquakes, rainstorms, typhoons, snowstorms, etc.);
• Fire;
• Power failure;
• Terrorist attacks;
• Organized or deliberate disruptions;
• Theft;
• Major system and/or equipment failures;
• Human error;
• Cyber attacks or computer viruses;
• Legal issues;
• Worker strikes.
The risk assessment will show that different telecommunications assets have different risks associated with them, and that a correlation analysis of the different risks is needed. Some risks will impact many of the assets of a company, such as the risk of a massive fire destroying a main building and everything in it, or an earthquake destroying a large number of power lines, while others, like a flood in a data center, will only affect a group of assets. Based on the generic risk management process model, an EPU risk management framework was developed as shown in Figure 7.1 [22].
Figure 7.1 – EPU Risk Management and Risk Assessment Model [22] RA: Risk Analysis, RM: Risk Management
The four layers in Figure 7.1 illustrate the various hierarchical levels of the operations part of a typical EPU which operates power plants and/or electricity networks. Typically each level will have authority over one or more entities in the level below. Therefore, the single corporate entity will have authority for most business units, and each business unit will have responsibility for one or more power plant and/or electricity network, etc. As each level has authority over the entities in the level below, it will set certain business objectives for them, and monitor the level of achievement of these business objectives. We can use this same concept to elaborate how risk is assessed and managed within the corporate structure of EPU operations. Each level of the organization has a different set of objectives, and is therefore exposed to different sets of risks. However, as there are dependencies between each level in terms of objectives, there are also dependencies between each level in terms of risk. In order to take a holistic approach to risk management, it is important that these dependencies are recognized, and linked unambiguously through the use of a common framework or language for the identification and management of risks. At each level within the organization, Risk Assessment activities should take place, and this should result in risks being quantified and Risk Treatment actions being taken to bring certain identified risks down to acceptable levels (where they are not already at or below acceptable levels).
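As a hedged illustration of how risks can be quantified and compared with an acceptable level, the sketch below uses a simple likelihood x impact score. This scoring convention, the 1-5 scales, the acceptable score of 8 and the sample entries are assumptions made for the example; they are not taken from the framework of Figure 7.1.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Risk:
    name: str
    likelihood: int  # 1 (rare) to 5 (almost certain), hypothetical scale
    impact: int      # 1 (minor) to 5 (severe), hypothetical scale

    @property
    def score(self) -> int:
        return self.likelihood * self.impact


def needs_treatment(risks: Sequence[Risk], acceptable_score: int) -> List[Risk]:
    """Risks above the acceptable level, highest score first."""
    above = [r for r in risks if r.score > acceptable_score]
    return sorted(above, key=lambda r: r.score, reverse=True)


# Hypothetical register for a telecom site.
register = [
    Risk("Fire in main building", likelihood=2, impact=5),
    Risk("Flood in data center", likelihood=2, impact=3),
    Risk("Power failure", likelihood=4, impact=4),
]

for risk in needs_treatment(register, acceptable_score=8):
    print(f"{risk.name}: score {risk.score}")
```

Risk Treatment actions would then aim to lower the likelihood or the impact of the listed risks until their scores fall to the acceptable level, and the same register could be kept at each organizational level with links between dependent risks.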
7.3 Business Continuity Plan and Disaster Recovery Plan
It is recognized that having a Business Continuity Plan (BCP) and a Disaster Recovery Plan (DRP) is vital for the organization’s activities. In simple terms, the BCP and DRP are developed to help organizations keep their business running, completely or partially, in case of disaster, by defining roles and action plans for the recovery process so that recovery can be carried out more rapidly and efficiently. There are some good reasons for having a BCP/DRP [23]:
• Probability of disasters;
• Business reliance on telecommunications;
• Growing corporate and social responsibility;
• Standardization movements.
As a part of the BCP, a good DRP for telecommunications is needed to ensure an effective response to a disaster that affects telecommunications services and minimize the effect on the business. The major goals of the DRP are:
• Minimize interruptions to the normal operations.
• Limit the extent of disruption and damage.
• Minimize the economic impact of the interruption.
• Establish alternative means of operation in advance.
• Train personnel with emergency procedures.
• Provide for smooth and rapid restoration of service.
To achieve these goals, the most important factors that the plan shall take into account are:
• Communication
o Personnel - Notify all key personnel of a given problem and assign them tasks focused on the recovery plan;
o Customers - Notifying clients about problems minimizes panic.
• Tools: Be sure that the plan includes the identification of and access to all the tools needed for the recovery, such as manuals, procedures, applications, devices, privileges, etc.
• Backups: Backups should be stored in separate locations. If backup resources are taken offsite, these need to be recalled. If remote backup services are used, a network connection to the remote backup location (or the Internet) will be required;
• Facilities: Having backup sites (hot or cold) and mobile recovery facilities are also good options;
• Prepare your employees: During a disaster, employees are required to work longer, more stressful hours, and a support system should be in place to alleviate some of the stress. Prepare them ahead of time to ensure that work runs smoothly.
• Testing the plan: Provisions, directions and frequency for testing the plan should be stipulated.
After identifying the potential impacts of disaster, understanding the risks and constructing the BCP itself, the BCP must not only be established but also continually updated and maintained on a “Plan”, “Do”, “Check” and “Act” basis in order to achieve business continuity. This ensures that it remains appropriate to the needs of the EPU in terms of covering the measures and action plans required to meet the Recovery Time Objective.
7.4 Project Design Criteria
Given the importance of telecommunication services in the EPU environment, it is necessary to install flexible, safe and reliable communication systems that can carry information stably and keep functioning unaltered over time. With this requirement in mind, it is important to consider a set of global project design criteria for EPU telecommunications services.
7.4.1 Back-up Facilities
One of the most important resources in business continuity is having backup facilities, mainly for the continuity of operational services, but also for end business units and corporate services in case of Bulk Electric System (BES). Defining the backup facilities requirements implies knowing what services are essential, how quickly they shall be restored and for how long they will be needed. The following requirements shall be considered for backup facilities implementations [Based on NERC Standard EOP-008-1 - Loss of Control Center Functionality]:
• Select a safe and easily accessible location, especially with regard to natural disaster risk;
• Be sure that all needed data will be present at the backup facilities in case of disaster, through synchronous storage/database replication for the most critical information or through tape restore for the less critical;
• Include in the backup facilities all the tools and applications providing the visualization capabilities that give operating personnel situational awareness of the BES, and all those needed for the minimum business/corporate activities;
• Assure all data and voice communications needed for the regular service operation, including voice and control communications to the critical substations and power plants, and communications outside the organization including Internet access;
• Include reliable power sources such as access to redundant distribution power lines, diesel generators, diesel refill contracts, etc.;
• Be sure that all the physical and cyber security requirements applied to the main facilities and control centers are also guaranteed;
• Implement an operating process for keeping the backup functionality consistent with the primary control center;
• Do not forget to assure food and medical first aid.
As with all components of the BCP, the availability and functionality of the backup facility must be tested on a regular basis. However, if cost effective, its use as a “hot site” is recommended, for example in a distributed processing architecture or under hot/standby or metro-cluster architectures. Using backup facilities as a “hot site” can be very attractive in terms of permanent updating and training, and provides easy redundancy even in non-disaster events.
7.4.2 Power Supply Independence
An essential attribute of any operational telecommunication service is its continuity in case of AC power interruption. Although traditional telecom Service Providers are equipped with backup battery facilities, the dimensioning of these batteries, combined with the significant rise in traffic load and therefore in power consumption in disaster situations, may lead to the unavailability of the vital communications of the power system. The availability commitments of Telecom Service Providers, as specified in SLAs, are often insufficient to cover this essential requirement.
A dedicated infrastructure is generally composed of telecommunication assets located exclusively (or essentially) inside the electrical sites (e.g. substations), which are often equipped with diesel electric generators. Its power supply and backup facilities are dedicated to the Utility’s equipment (no unexpected load), are dimensioned according to Utility requirements and are, in general, shared with the critical facilities that need to communicate: if there is power for the RTU and Protection relay, then there is power for the associated communication equipment. Appendix A5 presents the results of a survey performed over some power utilities concerning their dimensioning of power supply for telecom and data center facilities.
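The effect of increased power consumption on battery autonomy can be estimated with a first-order energy balance. The sketch below is a simplification under stated assumptions: the 200 Ah capacity, 48 V bus, load figures and 80% usable fraction are hypothetical, and real dimensioning must also consider discharge rate, temperature and battery ageing.

```python
def battery_autonomy_hours(capacity_ah: float,
                           bus_voltage_v: float,
                           load_w: float,
                           usable_fraction: float = 0.8) -> float:
    """First-order autonomy estimate: usable stored energy divided by load."""
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    usable_energy_wh = capacity_ah * bus_voltage_v * usable_fraction
    return usable_energy_wh / load_w


# Hypothetical site: same battery, normal load versus a 50% higher load
# caused by increased traffic during a disaster.
normal = battery_autonomy_hours(200.0, 48.0, load_w=400.0)
disaster = battery_autonomy_hours(200.0, 48.0, load_w=600.0)
print(f"normal load: {normal:.1f} h, disaster load: {disaster:.1f} h")
```

With these assumed figures the autonomy drops from about 19 hours to about 13 hours, which illustrates why a battery dimensioned for normal traffic may not cover the outage duration expected in a disaster.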
7.4.3 Network Redundancy
In some cases, having no communications, or having bandwidth constraints, for some minutes or hours is not critical, but for most EPU operational services based on telecommunications, like SCADA or voice, a few minutes is too much. It is therefore important to improve the reliability and prompt recoverability of the network against damage, in particular damage that is unexpected or that affects equipment such as aerial optical cables, for which protective measures are difficult to take.
Some of the most common redundancy measures adopted and combined to build highly reliable and highly available telecommunications networks are:
• Duplicated equipment for the same purpose, sometimes from different vendors;
• Equipment with redundant architectures in terms of power supply, CPU, service cards, access ports, etc.;
• Physically independent communication media (e.g. optical fiber and radio);
• Different communication technologies (e.g. SDH and PLC);
• Alternative/Mesh network routes;
• Distributed processing systems;
• Third party telecommunication services;
• “Out of band” management systems.
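The benefit of the redundancy measures listed above can be estimated with the standard series and parallel availability formulas. The sketch below assumes that failures of the redundant paths are statistically independent, which is precisely what physically independent media and routes are meant to achieve; any common point of failure invalidates the parallel formula. The element availabilities used are hypothetical.

```python
from math import prod
from typing import Iterable


def series_availability(elements: Iterable[float]) -> float:
    """All elements are needed: availabilities multiply."""
    return prod(elements)


def parallel_availability(paths: Iterable[float]) -> float:
    """One working path is enough, assuming independent failures."""
    return 1.0 - prod(1.0 - a for a in paths)


# Hypothetical example: two independent routes, each available 99.9%.
single = 0.999
dual = parallel_availability([single, single])
print(f"single route: {single:.6f}, dual independent routes: {dual:.6f}")

# Terminal equipment common to both routes limits the end-to-end result.
end_to_end = series_availability([0.9999, dual, 0.9999])
print(f"with common terminal equipment at both ends: {end_to_end:.6f}")
```

Two independent 99.9% routes give about 99.9999% between them, but the shared terminal equipment in this example brings the end-to-end figure back to about 99.98%, which is why common points of failure deserve particular attention.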
7.4.4 Countermeasures against Natural Disasters
As with many other issues, prevention is the best solution. The following countermeasures against natural disasters should therefore be considered in order to reduce telecommunication network outages [24]:
• Selection of a safe location in the planning stages
Countermeasures taken in the network planning stages are important to reduce the risk of damage. Telecommunication equipment and facilities should be located in low-risk areas, and an appropriate transmission medium should be selected. For example, geological data from on-site surveys and past disaster records are checked in order to avoid sites at high risk of mudslides or of liquefaction due to earthquakes. In addition, for important communications, high-reliability media such as microwave links or Optical Ground Wire (OPGW) are selected, as they have higher disaster resistance than aerial optical cable.
• Design and manufacture of disaster-resistant equipment
Equipment for telecommunication networks is designed to tough specifications so as to be highly resistant to earthquakes and lightning, and the manufactured equipment is therefore highly resistant to damage. Microwave radio equipment, for example, has to pass impulse withstand voltage tests and sine-wave vibration tests near its natural frequency. No bit error is acceptable during the vibration test, so the equipment is required to have high stiffness.
• Disaster-resistant installation of equipment
Appropriate installation of equipment improves disaster resistance. Examples of installation methods that improve disaster resistance are given below:
o Earthquake-proof
- Bolt the top and bottom of the equipment firmly, or install the equipment in an earthquake-proof frame, to prevent deformation of the equipment by vibration;
- Install a flexible waveguide at the connection between rectangular waveguides and wireless equipment to absorb vibration.
o Lightning-resistant
- Adopt a meshed earthing system for the site to provide a low impedance path to earth;
- Equalize potential by connecting grounding wires;
- Lay out equipment so as to avoid surge attack;
- Adopt Surge Protective Devices (SPD) to protect equipment from surges.
7.5 Enhancing the emergency response capacity
Strengthening emergency response management and enhancing the emergency response capacity of the power communication network not only reduces service damage in case of disaster, but also enables a faster and more efficient recovery process. Some of the most important measures are listed below [25]:

a. Improve contingency plans for power communication under various categories and at various levels
The entities at various levels shall formulate contingency plans for different disasters in order to avoid interruption of power lines, or to ensure fast recovery of interrupted power lines, and to guarantee the safety and stability of the power network.

b. Prioritize
Knowing which services are most important for business continuity (for example dispatching telephone or SCADA), and designing the network and the recovery plans according to these priorities, helps to achieve a faster and more successful recovery process.

c. Strengthen the integration, storage and sharing of emergency response resources
Emergency response resources include the internal resources of the entities, the power industry, the public communication network, various domestic trade organizations, public security services, and relevant international organizations. Integrate all resources available for use and improve the efficiency of interlinking between the public network and other special communication systems. Given that the public network and special power communication networks are highly complementary in terms of laying mode, vigorous efforts should be made to cooperate with public network operators and other special network operators to study contingency plans and to sign relevant agreements with them, so that the public network serves as one of the channels of the power communication emergency response system. Work actively with public network operators on backing up and protecting each other and on supplying power lines to each other, in order to establish an independently functioning hardware platform with sufficient competition and effective integration in case of emergency, as well as intercommunication and interlinking, and set up a joint-action emergency response mechanism for resource sharing. Intensify efforts to establish and store emergency response materials and devices for dealing with disasters. List the important materials, equipment names, models, storage places and contact telephone numbers necessary for emergency response in the contingency plan and the relevant management rules.

d. Emergency response talent team
Optimize the maintenance system and set up a maintenance talent team for fast response in order to reduce business interruption time. Foster a technically sophisticated maintenance team and launch enhanced training on technology and process for maintenance personnel to mitigate operation accidents and reduce fault location time.
e. Adjust planning and improve communication network structure
Enhance the disaster resistance of OPGW or ADSS cables in terms of design standard, laying mode and coverage density on the basis of rational technical and economic comparisons. Multiple communication modes are available: besides the existing OPGW, ADSS and carrier wave, underground optical cable, microwave and satellite communication are added to enable a three-dimensional and diversified communication network. Conduct scientific evaluation and select locations rationally, improve the construction quality of communication facilities, tighten the applicable standards and enhance the capacity to resist natural disasters; arrange equipment rooms, base stations and transmission routes in a unified and rational manner to ensure the multi-route and multi-mode character of the communication network and to avoid damage. Reduce the length of transmission paths and the number of transmission nodes, and replace back-to-back switching with high-capacity equipment.

f. Adopt network deployment methods that improve the usability of the network
Enhance the network’s protection capacity to mitigate the impact of an optical fiber interruption or node failure, and replace large-loop network deployment with a small-loop-plus-small-loop deployment mode to limit the impact of an optical fiber interruption.

g. Adopt technologies with less risk of failure
The most unpredictable disasters, and sometimes the most devastating, are natural ones. Adopting technologies that are, at least in theory, less dependent on physical infrastructure that can be damaged can be an effective way to improve immunity to communication loss and to shorten recovery periods. Examples of such technologies are mobile radio and satellite applications.

h. Invest in disaster recovery support systems to improve response time
Develop and implement disaster recovery support systems on the basis that their introduction will enable rapid understanding of damage status, online information sharing through centralized management of damage status, planning and mobilization, and the provision of information to customers, relevant government offices and other entities outside the company. Having this kind of recovery support system accelerates the recovery plan.
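The prioritization measure (item b above) can be illustrated with a minimal sketch. It assumes a hypothetical service inventory in which each service carries an illustrative priority rank and an assumed restoration time; the names, ranks and durations are invented for the example and do not come from this brochure. Services are simply ordered by priority, with shorter restoration times first among equals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    priority: int          # 1 = most critical (illustrative rank, an assumption)
    restore_hours: float   # assumed effort needed to restore the service


def restoration_order(services):
    """Sort services by priority rank, then by shortest restoration time."""
    return sorted(services, key=lambda s: (s.priority, s.restore_hours))


# Hypothetical inventory used only to exercise the function
SERVICES = [
    Service("Corporate e-mail", 3, 2.0),
    Service("SCADA", 1, 4.0),
    Service("Dispatching telephone", 1, 1.5),
    Service("Video surveillance", 2, 3.0),
]

if __name__ == "__main__":
    for s in restoration_order(SERVICES):
        print(f"{s.priority}  {s.name}  ({s.restore_hours} h)")
```

In practice the ranking would come from the utility's own business continuity analysis; the sort key is only one possible tie-breaking rule.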
7.6 Disaster Information Systems
A Disaster Information System allows the Utility to acquire accurate information across the network in order to mitigate damage from disasters. Adopting this kind of system provides an efficient way to communicate directives and other types of information between different business locations, and to check the result of the communication, in the event of a large-scale disaster. These systems ensure prompt conveyance of information between the disaster countermeasures headquarters to be established at the head office and in the branch offices. These systems, including the system currently under construction, use IP networks.

Kyushu Electric Power Company in Japan has implemented such a Disaster Information System, in which meteorological information is transmitted to the Telecommunication Centre, which feeds different “Counter-measure Departments” that undertake inspection and reporting tasks and consequently take appropriate restoration measures. The system is based on an IP data network using the fixed infrastructure with a high degree of routing resiliency. Emergency radio and satellite communications and effective PDA mobile terminals also play important roles. The system collects accurate information on the disaster, which is provided to regions and mass media and supports the recovery from disaster damage.
Figure 7.2 – Kyushu Electric Power Company Disaster Information System [14]

Disaster recovery support systems that use mobile devices to receive inspection instructions give the recovery staff access to an online view of damage status and facilitate the formulation of recovery plans.
In Poland, a system called PERUN is in use at PSE-Operator (the Polish Power Grid Operator). It collects and displays information about lightning sent by the Weather Institute (Meteorology and Water Economy Institute – IMiGW). The information is refreshed every 10 minutes using the FTP protocol. Communication is through fixed point-to-point links and is protected against intruders. PERUN operates with two METEO-FTP servers in a cluster: the main one is in the Control Centre and the second in the Back-up Control Centre. The software program RAPOK makes it possible to view the lightning data on the country map at different scales. Up to 5 local computer stations can be connected to the system, which runs separately from the real-time systems; no data from PERUN is transmitted to other systems. To summarize, the storm information service is a stand-alone supporting service for dispatchers which provides aggregated 10-minute snapshots of the lightning fronts.

The use of disaster information systems enables the processing of a high volume of fault information and the prompt collection of service-interruption information. This makes it possible to accelerate recovery in areas hit by power outages during emergencies or disasters, and to quickly provide service-interruption information to customers, government bodies and other relevant entities outside the Utility in large-scale disasters.
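The idea of aggregated 10-minute snapshots, as described for PERUN above, can be sketched as follows. This is not the PERUN or RAPOK implementation, whose internals are not described here; it is a minimal illustration that assumes lightning strikes arrive as plain timestamps and counts them per 10-minute window. The function names and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime

WINDOW_MINUTES = 10  # snapshot length stated in the text


def window_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 10-minute window."""
    floored_minute = ts.minute - ts.minute % WINDOW_MINUTES
    return ts.replace(minute=floored_minute, second=0, microsecond=0)


def snapshot_counts(strike_times):
    """Count strikes per 10-minute window (keys are window start times)."""
    return Counter(window_start(ts) for ts in strike_times)


if __name__ == "__main__":
    # Hypothetical strike timestamps, for illustration only
    strikes = [
        datetime(2011, 4, 1, 12, 3, 10),
        datetime(2011, 4, 1, 12, 9, 59),
        datetime(2011, 4, 1, 12, 10, 0),
    ]
    for start, count in sorted(snapshot_counts(strikes).items()):
        print(start.isoformat(), count)
```

A real system would also carry the strike location so that the snapshot can be drawn on a map; that part is omitted here.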
8 TELECOM SERVICE DELIVERY MODELS
8.1 Introduction
Electrical Power Utilities (EPU) provision their required telecom services through different schemes and models depending upon several factors and business drivers, some of which are listed below:
a. Commercial availability and cost of adequate telecom services,
b. Number and dispersion of sites to cover and their communication traffic,
c. Company policy and regulatory issues concerning CAPEX and OPEX,
d. Disaster Recovery/Business Continuity and Security constraints,
e. Company policy and regulatory position on the provision of revenue generating commercial telecom services and the opportunity to recover investments through non-operational telecom services (e.g. recovering the cost of optical fibre infrastructure through leasing of dark fibres),
f. Organizational issues including in-house availability of skilled staff.

At first glance, there seem to be as many operating schemes as EPUs. However, further analysis lets us extract some common attributes in order to identify a number of common patterns in the industry, explaining the reasons behind each, their limitations, and their domain of validity. The analysis is based on 3 principal axes:
• The primary mission of the telecom Service User (i.e. EPU’s role or mission)
• Telecom assets ownership
• Telecom Service Provider’s relationship with the Service User
8.2 EPU Profiles - Telecom Service Users
Over the last 20-25 years, the EPU has undergone significant organizational change, largely influenced and driven by political and legislative policy. In many parts of the world, a main strategic goal has been to move away from the Vertically Integrated Utility (VIU), a usually government owned monopolistic organization, in order to create a competitive electricity market. This requires the unbundling of monopoly activities such as transmission and distribution and placing them into a regulated environment, whilst creating a commercially competitive environment for generation and supply activities. Differing national constraints lead to considerable variations from country to country in the structure of the competitive electricity market.
Figure 8.1 – Electricity Market Evolution
In Europe, the necessity to create a single electricity market brought new rules for the organization of the electricity business, the most important by far being the implementation of the transmission system operator (TSO) entity in charge of the operation of transmission grids, acting completely separately from generation and supply companies.
• In the case where the company owning a transmission system is part of a vertically integrated group, there are two options: ownership unbundling or, only in exceptional cases subject to the status of the company at a fixed deadline (September 3rd 2009) [26, 27], a right to set up a system operator that is independent from supply and generation interests and strictly monitored by the national regulator. In the ownership unbundling model, the electricity transmission network is operated and owned by one company “independent from supply and generation interests”, which assumes the incentives, responsibilities and liabilities for the network.
• If the transmission network belongs to a vertically integrated company, then transmission network operators have to be effectively separated from supply and generation activities without ownership unbundling. This model would enable companies to retain ownership of transmission networks provided that the networks were operated by a new independent transmission network operator.
An overview of the present landscape of transmission and/or system operation entities shows a large diversity of adopted solutions, as follows:
• Independent system operator model, where the system operator does not own the transmission assets but is ownership unbundled from the rest of the system, as e.g. in the US.
• Independent transmission system operator, fully unbundled from the rest of the system, which owns and operates transmission assets. This is the case in the majority of European countries.
• Legally unbundled transmission system operator, unbundled from the rest of the system, which owns and operates transmission assets. This model meets the EU requirements and can involve effective separation of transmission operation from the rest of the sector while transmission assets remain under the same ownership as generation or retail, e.g. France.
• A hybrid model where both the independent system operator and the transmission operation are ownership unbundled from the rest of the system. The independent system operator is asset-light, while the transmission operation has no system operation function. This is the case in the electricity markets of Chile and Argentina.
• A system and/or transmission entity embedded in the vertically integrated utility, e.g. traditional utilities in Europe. This is the model that Europe has sought to move away from in successive directives; however, it is still in de facto operation in some European electricity markets.
As can be seen, the most sensitive case in the new market organization, still under discussion in many countries, especially in countries with large and dominant vertically integrated companies, is the setup of this “monopoly manager”, that is, a TSO with or without assets but unbundled from generation and supply. This is why the efforts of the EU are still concentrated on finding an optimal compromise to convince the not yet convinced “giants” to accept a form of legal unbundling (see third bullet above).

A detailed description of ownership unbundling issues and the underlying reasons and regulations is beyond the scope of the present document and can be found for example in [28].

Similarly, in the Distribution domain, unbundling is in progress, resulting in the separation of operation from supply (in practice this is monitored by national regulators). In the European case, following EU Directives [26], the independent DSO was to be established by July 2007.

Connecting the aforementioned forms of EPU organization with the Telecom Service Provider profiles described in section 8.4, one can conclude that:
• Type A is common for all forms of EPUs
• Type B could be common for TSOs (independent and legally unbundled models), as well as for vertically integrated companies
• Type C for independent and legally unbundled TSOs only
• Type D for independent TSOs only
• Type E is commonly used in ISOs and Distribution Companies
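The correspondence between EPU forms and Telecom Service Provider types listed above can be written down as a small lookup table. The sketch below is only a reading aid: it treats the bullets as a strict mapping, which is a simplification of wording such as "could be common" and "commonly used", and the dictionary keys are illustrative labels introduced for this example. The meaning of types A to E is given in section 8.4 and is not reproduced here.

```python
# Provider types (A-E, defined in section 8.4) by EPU organizational form.
# The keys are illustrative labels, not terms defined by the brochure.
PROVIDER_TYPES_BY_EPU_FORM = {
    "independent_tso": {"A", "B", "C", "D"},
    "legally_unbundled_tso": {"A", "B", "C"},
    "vertically_integrated": {"A", "B"},
    "iso": {"A", "E"},
    "distribution_company": {"A", "E"},
}


def provider_types(epu_form: str) -> list[str]:
    """Return the provider types associated with an EPU form, sorted A to E."""
    try:
        return sorted(PROVIDER_TYPES_BY_EPU_FORM[epu_form])
    except KeyError:
        raise ValueError(f"unknown EPU form: {epu_form!r}") from None


if __name__ == "__main__":
    for form in PROVIDER_TYPES_BY_EPU_FORM:
        print(form, provider_types(form))
```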
Consequently, the term Electrical Power Utility covers at present a wide range of organizations whose telecom service requirements and dispersion of sites lead to different service provisioning models: Transmission System Operators (TSO), Regional or National System Operators, Energy Market Operators, Transmission Companies, Distribution Companies, Generation Companies, Regulators and Service Contracting Companies, etc.

Further consolidation within the industry has seen acquisitions and mergers take place, creating large national or multi-national utilities which operate various business activities within a complex regulated and un-regulated environment and in this respect assume several utility roles. The communication requirements and attributes of the company shall therefore be the sum of those for each utility mission.

Something that the resulting EPUs have in common is that they have to operate as an enterprise organization, and as such are accountable to a variety of stakeholders including:
• Parent Company and Investors
• Customers
• Regulators
• Partners and providers.

This accountability aspect is of great importance when analyzing telecommunication management issues, and in particular the “upstream” tasks of policy definition and business planning (refer to section 10.4.2).

Different Utility roles, in terms of their respective attributes and specificities as telecom Service Users, are presented in this section and summarized in Figure 8.2.
Figure 8.2 – Utility Roles as perceived from the communications point of view

Role of Utility: National or Regional Coordinating or Operating Body with little or no power network assets (e.g. US-type ISO)
Sites: Connect to Control Facilities (tens) dispersed across a country or a region
Communication Services & Applications: EMS/SCADA, WAMS and Voice facilities around the National Load Dispatch Centre and connection to
• Control Centres and Back-up facilities for Transmission Grid companies, large generators and distribution companies,
• Administrative facilities
• Market participants and Energy Trading platforms
Specificities: Based exclusively on third party telecoms

Role of Utility: Transmission Grids, TSO and other entities operating transmission network assets
Sites: HV substations and transmission lines; 10s to few 100s of sites dispersed across the country or a large region
Communication Services & Applications: Protection relaying; Security monitoring of installations/assets; Condition monitoring of assets; HV metering for Settlement; EMS/SCADA, WAMS and Voice facilities around the Control Centre and connection to
• Grid substations,
• Other CCs, Back-up facilities,
• Power Plants,
• Operation Support sites
• Administrative facilities
• Market participants and energy trading platform (TSO)
Specificities: Extensively based on dedicated telecom networks using mainly optical fibres

Role of Utility: Distribution Grid / Energy Retail Supply / Dispersed Generation
Sites: Distribution Automation: 1000s of sites; Demand Side Management / Smart Metering: 100s of thousands
Communication Services & Applications: Distribution Automation; Demand Side Management; Market communications; HV metering for Settlement; Asset monitoring & supervision (transformers, overhead and cabin-based switches, MV lines, etc.); Demand Side Management / Smart Metering; Connection to Administrative facilities; Security monitoring of installations and assets
Specificities: Third party, Radio, and some dedicated network based on pole mounted and u/g fiber, microwave, etc. Very large number of sites

Role of Utility: Generation Utility (HPP, TPP, Energy Farms, etc.)
Sites: Small number of sites
Communication Services & Applications: Intra-plant & Inter-plant; Between the plant and the Control Centre; Communications of the associated HV substation; Market communications
Specificities: HPP and Windfarm often in isolated areas, which may be a driver for dedicated facilities; TPP may be in more telecom accessible areas with both public and transmission grid telecom network
8.2.1 Coordinating or Operating Bodies without Network Assets
These Utilities play a role of global coordination across multiple power systems, assuring the global reliability of the system and the security of the power supply process, but have no power assets of their own (e.g. US-type ISOs). They may cover in their “system model” thousands of generating units and hundreds of thousands of grid telemetry data points. However, from a communication point of view, they only communicate with some tens of EPU Control Centres and with some major power plants.

System Operators with no transmission assets may use the telecommunication infrastructure of the coordinated power utilities whose power systems they dispatch, or third party telecom services. The communication is mainly performed using the Inter-Control Centre Protocol (ICCP) through an external Telecom Service Provider using IP VPNs or another procured telecom service. Two independent Service Providers may be employed in order to enhance service reliability.

It should be noted that these bodies, having no power system assets of their own, do not have Protection applications and often do not need any SCADA. A communication interruption with one or some nodes for a few minutes therefore has less impact on the operation of the system, other than possibly lower precision or visibility, as the node’s data can be estimated (possibly with less precision) through calculation.

On the other hand, it is very important to assure very high service survivability and fault tolerance for the Data/Control Centre and its stored information. This is attained through a geographically distinct Back-up Data/Control Centre and reliable Disaster Recovery/Business Continuity solutions. Large capacity, reliable interconnection links are required for database synchronization and hot-standby between the main and back-up data/control centres. This is often implemented through a duplicated Ethernet link through one or two distinct Service Providers.
Voice, computer networking and video facilities are also often implemented between the different facilities and may be interconnected to other EPU control centres as well as to other coordinating, regulating and operating centres. The ISO generally has little or no telecom infrastructure but may have extensive data network assets (routers, switches, cyber-security barriers and intrusion detection, communication servers, etc.) which are considered part of its IT platform and managed through its IT management facilities.

In certain countries, the National or Regional System Operator (or Coordination Centre) may have a role of operation and control over some power assets which are used by multiple “State Utilities” (e.g. interconnection links or very large generation facilities). In this case, the corresponding communication requirements and attributes are those of other asset operating utilities described in the following sections. The employed telecommunication infrastructure may be that of the connected Utilities up to a varying level (fibre, transmission, or data network level).

Similarly, for the European-type system operators, where the operating entity owns, has in long term concession, or operates transmission grid assets, the supplementary communication requirements are those of the following section on transmission grids.
8.2.2 Transmission System Operator (TSO) or Transmission Utility
A TSO is a system operator that owns, or has in concession, the totality or part of the HV power transmission network assets. Transmission grid companies and TSOs have control of the HV transmission lines interconnecting the great majority of their sites, which considerably impacts their telecom provision model. They generally own the most extensive dedicated telecom infrastructures among EPUs. This can be explained by a number of factors, listed below:
• A large number of power system sites (often a few hundred) dispersed across a large geographical area (often a whole country or a large region) interconnected through HV transmission lines,
• A network-wide, meshed right-of-way through their HV overhead transmission lines, which can support optical fibers at relatively low extra cost and which arrives directly at the power sites where communication is required (without any “last mile” issue), in addition to the possibility of transmitting smaller amounts of information on the overhead HV line conductors (PLC) or on pilot wires accompanying underground power cables,
• Some highly critical communication requirements which cannot easily be met by other telecom Service Providers (e.g. Protection Relay communications),
• A great diversity of operation-related critical communication requirements as described in previous sections, with a rapidly growing required bandwidth due to the distributed intelligence of the network, monitoring and maintenance of the power assets, and a geographically dispersed workforce.

In the great majority of Transmission Utilities (or their vertically integrated predecessors), operational telecommunication has historically been deployed through dedicated networks owned and controlled by the utility, mainly due to the inability of the telecom operators to provide the required services. The infrastructure was originally based on PLC and copper pilot wires, with possibly some microwave radio on larger traffic segments. Administrative communications, on the other hand, were often provided through the public operator.

The transmission lines’ right-of-way and the installation of high capacity optical fiber have greatly impacted the transmission EPU’s telecom organization:
• The relatively large capacity of the installed fiber has led to the integration of EPU Corporate communication services. This integration has often transformed the operational telecom entity into a separate Telecom Service Provider entity, serving both operation-related and corporate services. The separation of the telecom entity from the power system operations has often resulted in some boundary uncertainties, migration (and/or loss) of technical skills and misunderstandings on requirements (e.g. in the domain of protection communications). The previously informal relationship between telecom and electrical engineering in the substation environment generally evolves into a more formal “user to provider” relationship based on explicit specifications and SLAs. On the other hand, the ICT staff moving from a department inside the EPU in charge of a “support activity” to a separate telecom entity (i.e. a “core activity”) get higher responsibility, more value and more incentives for the quality of their work.
• The extensive right-of-way and/or extra fibre capacity in the cables, in conjunction with telecom deregulation, has led many Transmission Utilities into the provision of commercial telecommunication services to varying extents. This evolution has had further important impacts on the provision of operation-related services, as discussed in a separate section.
8.2.3 Distribution Utility
Distribution is probably the utility domain with the greatest diversity in EPU perimeter, size and organization. It covers the technical operation of the MV and LV infrastructures but also customer related activities such as the metering and billing of a large number of domestic and industrial consumers, customer call centres and commercial agencies. It increasingly covers technical and commercial issues related to the connection of small independent generators and energy producing consumers (individual wind and solar generators). In many cases, the distribution utility also covers the low level HV network, in which case it has some transmission grid type communication requirements. On the other hand, in some cases, customer relation activities such as metering, billing and contract management are performed by separate “Energy Retailers”. In some countries, “Metering Utilities” are being formed to carry out customer premises communication access for multiple utilities.

In terms of telecom service provision model, the distribution utility communications can be divided into the following segments:

a) HV Feeder Substations (HV/MV) – These sites, when controlled by the distribution utility, have similar attributes and requirements to the transmission grid as described above (e.g. teleprotection, SCADA, voice, etc.). Their communications are generally performed through dedicated optical fibre networks, microwave or procured services (e.g. E1/T1, IP VPN, etc.).

b) MV Asset sites (MV breakers, MV/LV transformers, etc.) – Remote operation of MV switches is the basis for distribution network automation and network fault isolation. It is therefore important that the communication service remains operational during a power outage and is capable of facing information avalanche situations when major disturbances occur in one region. Monitoring services (e.g. MV/LV transformer and site monitoring, site access control) can be performed through dedicated or procured services. Due to the very large number of sites, communications are often provided with an opportunistic approach, using whatever communications are available for other applications in any particular environment (e.g. dense urban area, residential suburban, industrial zone, rural, etc.). Dedicated radio networks, MV PLC, VSAT, public cellular voice and data service, PSTN, or public internet are some of the ways in which the required telecom services are provisioned. The deployment of Smart Grid solutions is changing the communication requirements at MV/LV level, leading to specific infrastructures at a very large scale and moving telecommunication services nearer to the producer/consumer (Prosumer).

c) Customer Premises – Communicating to the customer premises requires an extremely low cost of implementation per customer. Procured service versus dedicated radio or PLC is a subject of discussion, experimentation and pilot projects in many distribution utilities at the time this report is being prepared. The concept of smart metering and bi-directional communications between the network and the energy consumer is one of the cornerstones of the future power system and a major driver for future communication systems in the distribution utility. This may leverage the other operation-related communication requirements (e.g. MV monitoring and control).

d) Commercial Service Agencies – Communications between commercial offices, the billing centre and the distribution control centre are beyond the scope of the present document. However, implementing a dedicated backhaul communication infrastructure for distribution SCADA and metering often provides the opportunity to fulfil these high traffic requirements.

e) Mobile Workforce – Communication from control centres to mobile staff is particularly important in distribution utilities because of the far greater proportion of unmanned facilities. The service provisioning model is either dedicated or shared trunk mobile networks (e.g. TETRA), or public cellular mobile services. The first provides the required reliability and availability in emergency situations where the service is most required (e.g. extensive power outage), but at a high cost of implementation. The second, on the other hand, provides a far less costly solution with more advanced functionalities (data services, e-mail, on-line applications, etc.) and frequent updates and service enhancements, but is not necessarily available in major emergencies due to disaster situations. Ideally, the distribution staff would need public mobile service in day-to-day operation and a dedicated radio system for disaster recovery [11].
8.2.4 Generation Utility
In terms of communication attributes, Power Generation Utilities are the exact opposite of Distribution Utilities. Their communications are concentrated on very few sites, which may be either very accessible or very inaccessible. Thermal power plants (TPP) are often implemented in industrial zones in the proximity of urban centres (near to the consumption load) where communication services can easily be procured. On the other hand, hydroelectric power plants and wind farms are located in isolated areas (near to the energy source) where it is often impossible to purchase communication services from telecommunication carriers.

Another major difference between Generation Utilities and Transmission/Distribution Utilities is that each generation plant is by itself “the complete process”: the great majority of operation-related applications have all their constituents and their corresponding staff on the same site, resulting in extensive intra-plant communications. This means mainly local networking in thermal plants and relatively short range communications (a few km) over a hydroelectric complex.

Renewable energy production covers large Energy Farms (e.g. wind farms) on the one side and small “green power” producers (individual wind turbines, micro hydro generators spread along river beds, solar cells on the roof, bio-based generation, etc.) on the other. The required communications and the impact on the telecom service delivery are very different.
• Large Wind Farms (e.g. greater than 30 MW in Australia), like other large generation plants, must provide SCADA data to the System Operator, who may also need to perform set-point Generation Control, that is to say, to reduce the generation when required. As mentioned previously, large wind farms, either land-based or off-shore, are often located outside the coverage of public telecom Service Providers. The transmission utility’s telecom network must therefore be extended to provide access to these generation plants. The cost of communication access is part of the connection cost of the transmission lines to the Wind Farm. Communications from the remotely located control and surveillance platform are performed via an often unmanned communications hut connected locally to each wind generator unit. Wind farm communications include local automation, SCADA, metering, different monitoring and surveillance applications, voice access in off-shore wind turbines and in collecting and transforming substations, as well as wireless facilities for operation & maintenance staff. Optical fibres are often used in off-shore systems, with microwave radio as a backup between the off-shore and on-shore facilities.
• Small “green power” generators and “producing consumers”, on the other hand, generally inject power into the rural MV distribution network. Remote communication to these installations is generally covered by the Smart Metering system, which in particular includes remote disconnection functionality (e.g. for distribution network maintenance), status monitoring and frequently refreshed metering data. Many different service delivery modes have been used across distribution utilities, ranging from procured services (e.g. cellular data services, wired internet or switched telephone network) to light dedicated communication links (e.g. UHF radio, VSAT and broadband wireless data). Further communications may be required in the future perspective of Microgrids. The impact of these generators on the telecommunication delivery pattern is to be considered as a further attribute for Distribution Utilities.
A typical telecom service provision model for generation plants can therefore be as follows:
• Dedicated telecommunication facilities inside the plant covering automation applications, SCADA, monitoring applications, voice and computer networking services. Depending on the type of power plant, this may range from a LAN environment to a “Campus Network” of a few km.
• External communication services through the transmission grid network (using the power plant’s grid substation), or an access link to the transmission grid network, either dedicated or procured from a telecom operator. Power Generation Utilities constitute natural customers for U-Telco communication services.
8.3 Telecom Asset Ownership Profiles

8.3.1 Introduction
Ownership of major telecom assets is a determining factor in the EPU’s adoption of a telecom service delivery model, and in the EPU’s degree of control over its operation-related communications. Telecom assets in the EPU can be broadly classified into the following types:
• Physical layer assets – Optical fibre, OHL rights-of-way, cable trays, RF towers, frequencies.
• Transport network assets – All electronic equipment used for the core transport of information. We purposely separate the assets for the bulk transport of information from those used for multiplexing and interfacing of individual applications, which constitute an edge or distribution layer, even if in technological reality these two layers can at times be merged and the ownership of their assets is then difficult to separate.
• Application service network and platform assets – All specific systems delivering particular communication services through the core transport capacity (e.g. low capacity access multiplexing, Voice network, Teleprotection signalling, SCADA communication network, etc.).

We classify ownership patterns into three broad categories:
• Assets owned by the Service User (the operational entity in the EPU)
• Assets owned by the Service Provider (whatever its relationship with the user)
• Assets owned by another party (e.g. state-owned fibre, fibre leased from another provider, fibre, equipment or bandwidth belonging to another utility, etc.)
This section analyses some specificities in each case. Ownership criteria and issues for some common telecom assets as described in the section are summarized below:

Physical Layer Assets
  Asset Types: Optical Fibre; Conduits, Rights-of-way; RF Towers, Repeater Housing; Radio Spectrum and Licenses; Long Term Contracts
  Ownership Criteria & Issues: HV Transmission Lines; Civil Works & Access Rights; Suitable Premises; Regulatory Constraints; Legal Constraints

Transport Layer Assets
  Asset Types: Bulk Data Transfer Connections; Core Network Infrastructure; Narrowband Telecom Links (PLC, Radio, etc.)
  Ownership Criteria & Issues: Cost of Ownership; Bandwidth Requirement; Lifecycle Issues & Upgrades; Required Level of Control; Availability of Expertise & Skills

Application Layer Assets
  Asset Types: Service Multiplexing; Teleprotection Signalling; Voice and Data Servers; LAN/WAN Assets; Surveillance Systems, Platforms
  Ownership Criteria & Issues: Critical Applications Coupling; Cost of Ownership; IT Lifecycle Issues & Upgrades; Availability of Expertise & Skills

Figure 8.3 – Utilities Telecom Asset Ownership
8.3.2 Physical layer assets
By physical layer assets, we understand those telecommunication assets that allow physical connectivity to be set up between the communication sites.

Fibre, Cable, Right-of-way
The most determining physical asset is the optical fibre. On its own, it can lead an EPU to a particular mode of service provision or prevent an EPU from adopting one. Installing optical fibre cables between communication sites of the EPU necessitates “right-of-way”, that is to say underground or overhead corridors where the cable can be laid. This is a very precious asset that Transmission Utilities own thanks to their HV transmission lines. Optical fibre infrastructure can be provisioned by the EPU in one of the following ways:

1. Procure and install fibre cables through the EPU’s right-of-way corridors (overhead power lines, underground power cables, etc.). This is by far the most used scheme in Transmission Utilities, and in Distribution Utilities owning HV lines. Spare capacity can be used for corporate and other communication services, and spare fibres can be leased to external users to cover costs or for extra revenue. Possession of extra fibres may lead the EPU into building a U-Telco activity.

2. Jointly financed procure and install – This scheme is typically employed at the interconnection between two EPUs, e.g. a transmission line interconnecting two transmission utilities.

3. Fibre (or service) in exchange for right-of-way – The EPU’s fibre requirement being far lower than the capacity of an OPGW, it can grant a telecom carrier the right to draw multi-fibre OPGW cables in exchange for its required fibres in those cables. However, this scheme presents many issues concerning the maintenance of the OPGW, which is intimately related to the maintenance of the transmission line. Even if often envisaged (e.g. for immediate availability of financing when a sizable fibre infrastructure is needed), it often evolves into case 1 with leasing of extra capacity. However, where the telecom entity of the EPU moves into commercial service and becomes a distinct company, it may inherit the fibres and consequently the EPU’s right-of-way through a long term leasing contract, in exchange for fibres or services left to the EPU.

4. Swap with another fibre asset owner – This is typically used for providing route redundancy where the network’s topology does not provide the required resilience. The other asset owner can be another utility, a telecom carrier, etc. Access from the fibre asset owner to the EPU site may be an important issue. It should be noted that these swapping schemes may raise regulatory issues regarding the non-payment of taxes and duties.

5. Lease fibres in another EPU’s cables – This scheme is often employed at sites where a smaller footprint EPU connects to a more extended EPU. Some typical examples are:
   o generation plant using transmission utility fibres at the transmission grid substation,
   o distribution utility access to a national facility using transmission utility fibres.
   It can also be used to close a partially open telecommunication ring using assets belonging to a regional footprint utility. Dark fibres (rather than transmission links) are leased when the distances are sufficiently short to avoid intermediate regeneration and when high capacity is required (e.g. Gigabit Ethernet). Fibre leasing from another EPU is generally performed at co-located sites and therefore avoids the “last mile” issue encountered in other leasing schemes.

6. Lease from a fibre asset owner – This is the typical situation for EPUs that require a high degree of control over their telecommunication network but do not have the necessary right-of-way (or the initial capital investment) for installing their own fibres. Optical cables over transmission lines may also be state public property, as part of the line conductor in the case of OPGW, but conceded to the TSO or to the Telecom Service Provider entity of the power utility for its internal usage. This type of long term concession does not in general authorize the entity to which the cable is conceded to lease dark fibres.

Using leased fibre from an asset owner other than another EPU raises several important issues that need to be considered:

a) The topology of the resulting physical network depends upon the fibre owners’ network structures, leading to links far longer than necessary and often to a far from optimal overall network.

b) “Last Mile” issue – The distance from the fibre asset owner’s access point to the EPU premises, even if relatively short, needs right-of-way and non-negligible civil works, with significant cost and delay consequences.

c) Physical routing issue – The design of a fault tolerant transmission network is based on the knowledge of physical medium routing, which is controlled by the fibre asset owner. In particular, where fault tolerance is concerned, the two routes must not pass through any common nodes, cables or conduits. In the event of “incorrect” routing information or cable route changes, the fault tolerance of the whole system may be unacceptably compromised. The EPU has no means other than the “provider’s word” to keep track of changes or of the correctness of the routing information. Moreover, it is particularly hard to obtain two independent cable routes from the EPU premises to the cable provider’s “meshed network” (and not the provider’s access point).

d) Maintenance works – The EPU needs to control maintenance schedules, which is not the same thing as being informed of the dates of maintenance works. A multi-customer fibre provider cannot plan its works according to EPU requirements. If its own fibres are interrupted, the EPU requires immediate repair, which may lead to unscheduled interruption of other fibre users without prior notice. If other users’ fibres are interrupted, however, the EPU cannot accept unanticipated maintenance works. What is needed is a highly asymmetrical contract, which is in general unacceptable to the provider.

e) Cable reliability – The majority of fibre providers have underground cable infrastructures, which are particularly subject to cable cuts due to civil works, while the overhead OPGW normally used by the EPU is almost invulnerable. The extremely high levels of service availability required by EPU operation-related applications are very difficult to meet with the probability of cable cut that can be obtained from cable providers. (When using OPGW, cable availability can be neglected in comparison to equipment availability; with leased underground cable the situation is reversed.)

f) Multiplying the cable providers in order to meet the necessary coverage of the EPU multiplies the issues mentioned above and additionally creates an important issue of governance and contract management, with several contractors/sub-contractors in some cases along a single connection.

RF Physical Layer Assets
Another important category of physical layer assets in the EPU comprises those related to implementing radio transmission networks and links. We add repeater housing here, even if this can also apply to optical regenerator housing:

RF Towers in HV substation premises and power plant sites, including tower lighting and its associated power supply, are generally the property of the EPU. These towers, as well as other EPU structures which can serve as antenna support (e.g. electrical poles, power plant tower structures, etc.), may also be used by other radio network infrastructure owners such as cellular radio operators, as a source of extra revenue for the EPU.

Antenna support outside EPU premises (e.g. microwave repeaters or radio base stations) can be obtained through co-location on towers belonging to other radio networks. In particular, when wide zone coverage is required (e.g. UHF data systems, mobile trunk systems, etc.), the optimal location of radio relays for covering a given zone is often the same for all radio infrastructures, which facilitates co-location.

Repeater housing, including air-conditioning facilities and repeater power supply, can be the property of the EPU or its telecom Service Provider, or can be leased from an external party. Microwave link repeaters are often located on EPU premises, in which case they are generally EPU assets. UHF and other zone coverage base stations, on the other hand, are often on high sites, and may be in shared housing leased from another asset owner or telecom Service Provider. The maintenance of the facilities, in this case, is generally provided by the owner as an external service to the EPU. When using an externally provided power supply for radio relays, the autonomy of the power supply and the dimensioning of batteries are important issues for operation during power outages.

Frequency licences with narrow directivity (e.g. microwave radio links) and narrow bandwidth (UHF radio for SCADA or few-channel mobile systems) are generally applied for under the name of the EPU user and are therefore part of its assets. Licensed broadband spectrum with wide coverage, on the other hand, cannot be allocated in many countries for the exclusive usage of the EPU’s internal communications. It is, in this case, common to obtain shared usage with other Utilities (e.g. gas, water, other EPU) or other critical services. This is normally achieved by procuring services from a specialized operator, or by setting up a service that can be procured by other users. The latter case generally results in the separation of the Service Provider entity from the EPU operational entity.

Power Line Carrier – Narrow-band PLC, whether HV, MV or LV, is generally a dedicated asset for a particular Utility application (Protection, voice and SCADA in HV; device monitoring and metering for MV/LV). The physical coupling assets, consisting of line traps, coupling capacitors and Line Matching Units, and the frequency spectrum assets are the property of the EPU operational user. Particular attention has to be paid to the maintenance of these assets, which usually falls under the responsibility of HV line teams not belonging to the EPU’s telecom department (or to a fully separate telecom entity). With Broadband PLC, when it is used for multiple purposes and in particular when commercial services such as customer internet access are involved, the situation may be more complicated. A separate telecom service providing entity is then necessary to deal with this situation.
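The physical routing issue raised above for leased fibre (two routes intended for fault tolerance must not share any node, cable or conduit) lends itself to a simple automated check whenever the fibre owner supplies routing information. The following is a minimal sketch of such a check, not a description of any tool used by the utilities surveyed. The `Route` structure, the function names and all site, cable and conduit identifiers are hypothetical, and the result is only as reliable as the routing data declared by the provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Physical routing of one fibre path, as declared by the fibre owner (hypothetical model)."""
    name: str
    nodes: frozenset      # sites and splice points traversed
    cables: frozenset     # cable identifiers used
    conduits: frozenset   # ducts or trenches used


def shared_elements(a: Route, b: Route, endpoints: frozenset = frozenset()) -> dict:
    """Return the physical elements common to both routes, grouped by kind.

    The two end sites are excluded from the node comparison, since both
    routes necessarily terminate there.
    """
    common = {
        "nodes": (a.nodes & b.nodes) - endpoints,
        "cables": a.cables & b.cables,
        "conduits": a.conduits & b.conduits,
    }
    # Keep only the kinds for which something is actually shared.
    return {kind: items for kind, items in common.items() if items}


def is_diverse(a: Route, b: Route, endpoints: frozenset = frozenset()) -> bool:
    """True when the routes share no node, cable or conduit apart from the end sites."""
    return not shared_elements(a, b, endpoints)


# Hypothetical example: the backup route reuses node N2 and conduit D2.
ends = frozenset({"SubA", "SubB"})
main = Route("main", frozenset({"SubA", "N1", "N2", "SubB"}),
             frozenset({"C1", "C2"}), frozenset({"D1", "D2"}))
backup = Route("backup", frozenset({"SubA", "N3", "N2", "SubB"}),
               frozenset({"C3", "C4"}), frozenset({"D3", "D2"}))
overlap = shared_elements(main, backup, ends)
```

In this example `overlap` reports the shared node and the shared conduit, so the pair would not qualify as two independent routes. Re-running such a check each time the provider notifies a cable route change is one way to keep track of diversity, within the limits of the “provider’s word”.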
8.3.3 Transport network assets
Transport layer assets are those related to the bulk transfer of information. Optical and microwave radio communication equipment and core data network infrastructure constitute the basic elements of this layer. While the ownership of physical assets is often a determining factor in the telecom delivery scheme, transport layer assets can much more easily be procured by the EPU if it intends to own its assets. The asset ownership model at this layer is often based on the following factors:
• Ownership of underlying physical layer assets – When the physical layer assets are not under the EPU’s control, it is easier to accept lack of control over the transport layer (e.g. leasing an STM1, E1/T1 or Gigabit Ethernet connection rather than leasing dark fibre and repeater housing, power and maintenance). This may lead to more straightforward contract and SLA management and fewer interactions.
• Required communication bandwidth – Narrowband information transport on owned physical assets, e.g. HV PLC access to substations or wireless SCADA systems, is always performed with EPU-owned transport network assets. On the other hand, when the communication requirements are small compared to the capabilities of the available or suitable communication technology, bandwidth sharing with other users is necessary, either to justify cost or to overcome regulatory constraints (e.g. Broadband wireless data services, Satellite Communication Hub).
• EPU’s required level of control over the service – The more critical a communication service is in the EPU’s process, the more the EPU is inclined to keep full control of the associated transport layer assets (e.g. communication services for Protection Relaying applications).
• Total Cost of Ownership, Asset Life Cycle and Return on Investment (ROI) – The cost of implementing and maintaining a particular type of transport asset may lead the EPU to renounce its ownership. This is to be traded off against the requirement to keep full control. It should be added that, unlike physical assets, transport network layer assets have a much more limited life cycle, meaning that the ROI must be achieved in a shorter time.
• Required skills for managing and maintaining the transport assets – The EPU may simply not have the necessary skills, tools and organization to run a particular type of transport network, or the organizational capability to keep it up-to-date. Large core data networks and Network Operation Centre facilities to run bulk information transport are typical examples of “hard-to-maintain” assets.
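The trade-off between total cost of ownership and the limited life cycle of transport assets can be illustrated with a simple break-even calculation. The sketch below is an assumption-laden illustration rather than a costing method taken from this document: it ignores discounting, upgrades and residual value, the function names are hypothetical, and all monetary figures are invented units chosen only to make the arithmetic visible.

```python
from typing import List, Optional


def cumulative_cost(capex: float, annual_cost: float, years: int) -> List[float]:
    """Cumulative spend at the end of each year, from year 1 to `years`."""
    return [capex + annual_cost * year for year in range(1, years + 1)]


def break_even_year(own_capex: float, own_annual_opex: float,
                    lease_annual_fee: float, lifecycle_years: int) -> Optional[int]:
    """First year in which owning has cost no more than leasing.

    Returns None when ownership does not pay back within the asset
    life cycle, i.e. the ROI cannot be reached before renewal is due.
    """
    own = cumulative_cost(own_capex, own_annual_opex, lifecycle_years)
    lease = cumulative_cost(0.0, lease_annual_fee, lifecycle_years)
    for year, (own_total, lease_total) in enumerate(zip(own, lease), start=1):
        if own_total <= lease_total:
            return year
    return None


# Hypothetical figures: 400 units of investment, 40 per year to operate,
# against a leased service at 120 per year.
payback_long_life = break_even_year(400.0, 40.0, 120.0, lifecycle_years=8)
payback_short_life = break_even_year(400.0, 40.0, 120.0, lifecycle_years=4)
```

With these invented figures, ownership breaks even in year 5 when the equipment lasts eight years, but never does so if it must be renewed after four. This is the sense in which a shorter asset life cycle pushes the EPU towards procuring services rather than assets, subject to the requirement to keep control.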
8.3.4 Application service networks and platforms
Application service platforms, such as those necessary for Utility switched voice and data services, generally belong to the EPU or its telecom Service Provider. Those which are more intimately related to the operational process or to the operational sites, such as teleprotection signalling or SCADA communications, are the property of the EPU operations, while those that are shared between operational and non-operational activities, such as computer networking, voice, e-mail and intranet servers, are often procured and renewed by the Service Provider entity.
Here again, because of the short lifecycle of the assets (e.g. IT platforms) and a total cost of ownership mainly driven by the cost of upgrading and maintaining, the EPU is highly inclined to procure services rather than assets. This is therefore a typical area where the Service Provider is in a better position to invest and to obtain ROI.
8.4 Telecom Service Provider – Relationship to the User
The relationship between the EPU operation-related telecom Service User and the corresponding telecom Service Provider takes multiple forms and changes over time. Figure 8.4 presents schematically the main patterns encountered in the power utilities. It should be noted that within the same EPU we can find different schemes for different groups of services, different layers of telecom service, or different geographical areas. The pattern may change due to a change of EPU policy, regulatory changes, or the evolution of technologies. This section provides some insight into the reasons for adopting each scheme and the corresponding issues that may arise.
[Figure 8.4: five schematic panels, labelled A to E, each showing the position of the telecom service provider relative to the EPU’s Operations and Corporate Activities.]
A: Telecom is part of the operational activity. Corporate entity provisions telecom services separately.
B: Common Telecom (& IT) Services for both Corporate and Operational Applications.
C: TSP is a sister company to the EPU, providing services exclusively (or in priority) for the Power System.
D: EPU procures its telecom assets but operates them using an external Service Contractor.
E: Telecom services are procured under SLA by a TSP providing services to many customers.
Figure 8.4 – Telecom Service Provision Models in the EPU
8.4.1 Integrated to the Operational User (Type A)
This scheme is the most basic and historically the most employed form of telecom service provision in the EPU. It relies upon the total ownership of all telecom assets as described in the previous section, and in-house provision of skills for running the network, which can be designed, deployed and periodically refurbished through turn-key contracts, or gradually created through substation, transmission line and SCADA procurements. Providing telecom services as an integrated activity of the EPU operations has major advantages, which are particularly important where “market atypical” operations-critical requirements such as those of Protection communications are concerned:
• Full commitment – The network specifications in terms of performance, topology and capability perfectly reflect the user requirements. The telecom staff’s priority of the day is the operation staff’s current problems.
• Informal relationship – Telecom staff are direct colleagues of protection, substation automation and SCADA engineers. Performance issues and interface requirements, intervention scheduling and problem solving are not at risk of being compromised by misunderstanding. Interaction with telecom network management is through internal meetings, without any need for SLA and contract management.
• Maximal responsiveness – The intervention time of maintenance staff in case of service interruption is not prolonged by site access issues, and when multiple interventions at application system and telecom level are required, these can be arranged in minimal time with only internal field staff, likely to be based at the same field maintenance centre.
• Synchronized deployment – Addition or upgrade of telecom services when a new application is deployed or when the power system is extended need not be anticipated long in advance for provisioning of the necessary telecom assets and scheduling of works. Application and communication service can be provisioned together, or at least in a synchronized manner.
• Information Security – The telecom system and the corresponding organization and processes being an integral part of the EPU operations, they are covered by the same Security Policy. No coordination action or additional auditing is required to ensure that the Security Policy of the Service Provider does not compromise that of the EPU.
• Disaster Recovery/Business Continuity Planning (DR/BCP) – As for information security, the telecom organization and processes are an integral part of the EPU operations. No coordination or additional auditing is necessary to ensure that the DR/BCP of the provider does not compromise that of the EPU.

The main drawback of this service delivery scheme is the limited capability of a constrained telecom team operating inside the EPU operational entity. The team deals only with the operation-related telecom service requirements of the EPU and is therefore unable to implement more complex, more costly and more demanding technologies or management tools, or can do so only at a very high cost due to the small scale of the requirements.

Another particular concern for this model is the absence of performance and efficiency measurement from an SLA and cost perspective. The quality and cost of the delivered service are not truly assessed against any particular reference.
An integrated telecom service provision scheme can scale up to cover corporate or other communications inside and outside the EPU, but in this case, the evolution to a type B situation is almost automatic in order to cover assets and running costs for the corporate communications.
8.4.2 Sister Entity to the Operational User (Type B)
The normal position for an “internal” telecom Service Provider who delivers services to both operation-related and corporate enterprise applications is an entity independent from both. This position allows the delivery of services in a “semi-formal” relationship with a larger traffic volume and Service User base. The provisioning scheme allows the deployment of a core network common to operation-related and corporate services, and the use of data networking and IT specialist skills (necessary for the corporate communications) to implement new generation operation-support services. This scheme often provides the “minimum critical mass” necessary for the implementation of “enhanced” network and service management tools. The internal nature of the telecom Service Provider still allows a fair level of commitment, although the relationship is not as informal as in the type A scheme.
8.4.3 Affiliated Service Company (Type C)
Provision of external services (U-Telco), or simply the intention of creating a separate profit making company, can lead to the extraction of the telecom Service Provider from the Utility organization. The difference between the type B and type C schemes lies in the freedom of the company in investment and its consequent overall accountability. The company can in particular:
• Procure its own new assets or extend their capacity,
• Design new services,
• Extend its customer base to the competitive telecom market,
• Employ the skills it needs and pay competitive salaries to retain its staff,
• Sub-contract tasks and services to specialized contractors.

The relationship with the operation-related organization is more commercial, and rests on annual negotiations based on an SLA or service contract. Service management is formal, but in most cases the history of the telecom Service Provider was until recently shared with that of the operations entity, and informal relations and knowledge of the operational applications and people mask any shortcomings in the formal process. In time, more formal specifications and information exchange processes must replace the “ex-colleague corrective patches”.

Service commitment for operation-related services (whether based on SLA or not) remains the highest priority and is fundamentally different from SLA commitments towards U-Telco customers. In the former case, failing to deliver service may lead to enormous damage at the mother company EPU; in the latter case, only to limited financial sanctions for not meeting an SLA.
The freedom of the company in terms of development strategy, assets and human resources, and the extra income from sharing the infrastructure with other users (or providing services to external customers), normally result in a more cost-effective telecom service provision and should lead to lower service costs for the EPU. On the other hand, the telecom Service Provider must assume the responsibility for network planning, development and refurbishment of the communication network and service platforms in order to maintain the quality of the delivered service (e.g. mitigate asset aging) and to ensure that the infrastructure is capable of responding to new requirements (new services, increased bandwidth requirement, and service migration), provided that the EPU ensures the financing. This requires periodic assessment of EPU migration plans at the time of revision of the service catalogue and pricing.

However, delivering U-Telco services can also lead to telecom regulatory issues, in particular fair trade regulations loosening the preferential links with the EPU. Depending on the proportions that external service provision may take in comparison to the EPU service, the danger is that, in time, the affiliated telecom company may become simply a normal commercial service supplier resisting the specificities of the EPU’s operational services, as further discussed in chapter 10.
8.4.4 Independent Service Contractor (Type D)
An EPU requiring specific telecom services but not intending to maintain the necessary skills and organization may deploy a dedicated telecom infrastructure and have the network maintained by an external contractor. The perimeter of the service contract may vary according to the EPU’s in-house capabilities:
• Service Management
• Telecom Infrastructure Management
• Field maintenance

The contractor provides organization, processes and skills, may even absorb the EPU’s telecom staff, and can often better retain a skilled workforce through a more competitive salary policy than the EPU itself. On the other hand, the EPU loses technical know-how in the medium to long term, and consequently control of its network and of its contractor.

The contractor is bound by a Service Level Agreement governing its interventions and services, but is not responsible for the failure of ageing assets or their lack of performance: the renewal policy remains with the EPU employer, even if the contractor retains an advisory role in this respect. Typically, the service contractor must prepare a yearly development and refurbishment plan for the communication network and service platforms, based upon the EPU’s plan for application changes and the contractor’s survey of aging assets. The contractor can only assume the responsibility of maintaining the quality of the delivered service if the EPU accepts the refurbishment and new developments which ensure that the infrastructure is capable of delivering the service.
8.4.5 External Telecom Service Provider (Type E)
The least degree of EPU involvement in the delivery of the necessary telecom services is to procure them under an SLA from a multi-customer Telecom Service Provider such as the Public Telecom Operator.

Procuring telecom services relieves the EPU of procuring assets, deploying infrastructures, employing a skilled workforce, building processes and deploying tools for management and maintenance. However, as will be seen in chapter 10, the EPU still needs to manage the external Service Provider with adequate processes (and tools) and to adapt the procured communication resources to the requirements of its internal users. The infrastructure is extended, diversified, upgraded and renewed without any involvement from the EPU. However, extensions, new services and service migrations need to be planned long in advance to ensure that the provider is capable of delivering the new services (e.g. covering new sites, increasing capacity in remote areas, etc.). This is included in the yearly renewal or revision of service contracts.

This mode of service provisioning nevertheless presents many drawbacks, which mirror the advantages given in section 8.4.1 above. The EPU will have, in particular, to make considerable effort in the following domains:

1. Formally and precisely specify service requirements and constraints. It should be noted that terms and vocabulary do not have the same significance in public telecom and in the operational EPU context (e.g. availability), which may lead to misunderstandings with great consequences. Time behaviour and predictability of the connections may be an important point to consider.
2. Establish Service Level Agreements (SLA) and sanctions for not respecting them. It should be noted that non-respect of an SLA in the world of telecom is sanctioned by financial compensation bearing no proportion to the EPU’s risks due to lack of service.
3. Carry out Performance Measurement and SLA Monitoring with appropriate tools.
4. Provide considerable effort in contract and conflict management.
5. Implement application interfacing and service multiplexing in operational sites which the service operator cannot access.
6. Coordinate the Security Policy and Disaster Recovery/Business Continuity Plan of the Service Provider with those of the EPU, and perform audits to ensure that they are not compromised. In particular, power autonomy, that is the capability of the telecom service to be delivered in the event of a power outage through adequately dimensioned batteries, is of great importance for Disaster Recovery.
7. Schedule any extensions, changes and upgrades long in advance and negotiate in good time with the provider.
8. Avoid monopolies and dominant positions for any single telecom provider, which may increase its prices and decrease the quality of service.
9. Analyse service life expectancy carefully before making extensive use of a standard service delivered by a provider. Many cases can be enumerated where a standard telecom service used by an EPU has been abandoned or replaced by another service not equivalent for EPU usage (e.g. leased digital circuits used for protection relay communications).
10. Provide for a “safety certified” field maintenance workforce or a “safe location” for the provider’s assets.
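Performance measurement and SLA monitoring (item 3 above) ultimately come down to comparing measured service behaviour with the contracted target. The following minimal sketch computes availability over a reporting period from a list of outage intervals. It assumes a purely time-based definition of availability, which is only one possible reading of the term and precisely the kind of definition that the EPU and the provider need to agree on; the function names, the outage records and the target values are hypothetical.

```python
from datetime import datetime, timedelta


def measured_availability(outages, period_start, period_end):
    """Fraction of the reporting period during which the service was up.

    `outages` is a list of (start, end) datetimes. Intervals are clipped to
    the reporting period and are assumed not to overlap one another.
    """
    period_seconds = (period_end - period_start).total_seconds()
    down_seconds = 0.0
    for start, end in outages:
        clipped_start = max(start, period_start)
        clipped_end = min(end, period_end)
        if clipped_end > clipped_start:
            down_seconds += (clipped_end - clipped_start).total_seconds()
    return 1.0 - down_seconds / period_seconds


def sla_met(availability, target):
    """True when the measured availability reaches the contracted target."""
    return availability >= target


# Hypothetical 30-day reporting period with two outages (2 h and 30 min).
period_start = datetime(2011, 4, 1)
period_end = period_start + timedelta(days=30)
outages = [
    (datetime(2011, 4, 5, 10, 0), datetime(2011, 4, 5, 12, 0)),
    (datetime(2011, 4, 20, 3, 0), datetime(2011, 4, 20, 3, 30)),
]
availability = measured_availability(outages, period_start, period_end)
```

With these invented records the service was down for 9000 seconds out of 2 592 000, so a 99.5 % target would be met while a 99.9 % target would not. A single figure of this kind says nothing about when the outages occurred or which applications were affected, which is why the financial sanction attached to it bears little relation to the operational risk borne by the EPU.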
As stated at the beginning of this chapter, different telecom service provisioning modes often co-exist in the same EPU, depending on the nature of the services:
• When operation-related telecom services are provisioned through an integrated entity (type A), corporate communications are generally provided through procured service (type E).
• When operational and corporate services are integrated into the same provisioning model and organization (type B, C or D), Protection communications are often separated from this integrated approach and carried directly over separate fibres (or wavelengths).
Figure 8.5 summarizes the main service delivery modes in different types of EPU.

[Figure 8.5 is a matrix of EPU profiles against telecom provider profiles. Its cell layout was lost in extraction; the headings and cell notes that remain legible are listed below without their positions.]
Telecom Provider Profile: Integrated with the Operational User; Sister Entity to Operational User; Affiliated Service Company (to Service User or to Holding); Independent Service Contractor; Public Telecom Operator (Procured Service).
EPU Profile (Role): Independent System Operator; National / Regional System Operator; TSO / Transmission Grid Company; Generation Company; Distribution Network Operator.
Cell notes:
• Communicates essentially to other EPUs. Has no telecom infrastructure.
• Operation-related service with limited resources.
• Cost efficiency criteria. Provide service to IT/Corporate.
• Generally result of growth and separation of user-affiliated U-Telco. Provide services to independent profit centres.
• User or holding owns assets, but managed and maintained by contractor.
• PTO cannot provide coverage for HPP or off-shore wind farm.
• Wireless SCADA, Mobile workforce.
• Small number of high traffic urban sites for TPP.
• Use of backbone for Metering, Commercial offices & IT.
• Generally using customer access for providing other U-Telco services.
• Large number of sites with little traffic, Metering, Mobile comms.
Figure 8.5 – Examples of use for different telecom service provision schemes in the EPU
9 FEDERATING SERVICES ON THE PRIVATE INFRASTRUCTURE
9.1 Introduction
The high cost of implementing private broadband telecom infrastructures, whose potential bandwidth is well beyond the requirements of operational applications, encourages Power Utilities to seek Return on Investment by carrying non-operational traffic on the same infrastructure. This is further encouraged by Utilities’ ever increasing expenditure on non-operational communications, as well as by the potential to generate extra revenue through provision of commercial telecom services in the deregulated energy and telecommunication environment. However, carrying operational and corporate/commercial traffic on the same network impacts the organization, processes and investment plans.
Corporate enterprise and commercial services can be delivered through the same dedicated network that provides for EPU operational telecom service requirements, planned, managed and operated by a common “service providing” entity. In this case, the power system operation entities become privileged clients of the Service Provider. It should be noted that the primary purpose of the dedicated network is to provide adequate service to these critical clients. Failing to meet the commercial service level results in a financial penalty, but failing to provide an operational service may mean a major power system incident.
Corporate enterprise and commercial services can also be delivered through a fully separate network using dedicated fibres in the Utility’s optical cables. In this case, service and infrastructure planning, network architecture, as well as infrastructure and service management can be performed independently from those associated with the EPU’s operational services.
The commercial revenue generating telecom service in the former case generally concerns carrier services (surplus transmission capacity) or, increasingly, higher level services such as Ethernet or IP connectivity. In the case of fully separate services, the revenue generating activity can also concern leasing of dark fibres, which represents lower investment, organizational adaptation and risk, but also lower profit objectives.
The vital issue to consider here is how to ensure that the quality and security levels of the core business operational and operation support services are not degraded by these coexisting non-operational services. The present section enumerates the different solutions used by Power Utilities to ensure the coexistence and separation of services on the same infrastructure, together with their impact on organizational issues.
9.2 Process and Organization Issues
The adopted separation of operational, corporate enterprise and commercial (revenue making) telecom services depends upon process and organization issues, some of which are as follows.

Issue: Regulatory, License and Asset Ownership issues – Do regulations and/or concession contracts allow the surplus capacity of assets and infrastructures to be used for purposes other than operation?
Impact on service integration:
• Regulations may limit the use of assets to operational applications if declared as expenditure for secure delivery of power.
• Optical cable may be state-owned and conceded only for operational use (section 8.3 on asset ownership).
• Licenses may exclude usage beyond operational use.

Issue: Service Management Organization & Formal Processes – Is management of operational telecom services performed by the Power System Operation entity (Control Centre and Operational Areas) or by a separate Service Provider managing operational and non-operational communications?
Impact on service integration:
Telecom through power system operation entity
• Operation-oriented processes (focusing on the secure operation of the overall power system).
• Integrates the operation of platforms and applications (Protection, SCADA, etc.).
• Inadequate size and organization for managing corporate and/or commercial type services.
Telecom “Service Provider” entity
• Manages all telecom services (operational and non-operational).
• Service-focused process (provider-client organization), distancing from power system operational teams, platforms and applications.
• Can include the provision of U-Telco services on the private infrastructure.

Issue: Management of Expertise and Skilled Resources – How much telecom expertise, and of which type, exists (or is planned) in the organization?
Impact on service integration:
Telecom through power system operation entity
• Staff expertise is mainly power system oriented, with a good knowledge of SCADA and Protection.
• Multi-disciplinary field intervention staff, integrated into the regional power system maintenance organization, possibly with multiple skills (RTU or protection).
Telecom “Service Provider” entity
• Staff expertise is mainly IT-oriented and focused on network services (not on power system applications), often with intensive outsourcing.
• Regionally located field staff to provide full support and to assure the SLA.

Issue: Provider-client Service Management Platform
Impact on service integration: Providing large scale commercial and corporate communications over the dedicated telecom infrastructure implies heavy investment in “provider-client” service management platforms for maintaining, managing, configuring, provisioning and billing processes, in line with the business requirements (refer to section 6):
• NOC (24x7x365 Network Operations Centre)
• BSS (Business Support Systems)
• OSS (Operations Support Systems)

Issue: Network Scalability – Do the network architecture and structure allow growth to accommodate new services and users on the corporate/commercial side?
Impact on service integration: Providing non-operational services on the dedicated infrastructure must take into account the future growth of these services. Network design and planning can no longer be based on technologies, structures and dimensioning which would be used for operational purposes only.

Issue: Security Management
Impact on service integration: Integrating corporate and commercial services on the dedicated infrastructure implies that appropriate security processes and technologies assuring an adequate level of service isolation must be implemented.
9.3 Technical Solutions
Different technologies can be used to support Operational and Non-Operational services on the same telecom infrastructure with different levels of separation:
• Dedicated fibres
• Wavelength allocation (WDM)
• Dedicated bandwidth allocation (e.g. different bandwidth over SDH)
• Ethernet VLAN or MPLS VPN separation

[Figure 9.1: matrix of “Operational Services separated from” (Commercial Telecom Service; Corporate Data Network; High Speed Data Services (IP data); Protection Applications) against “Using” (Separate Fibres; Separate Wavelength; Separate Physical Bandwidth; Separate Virtual Network). The cell notes are listed below by separation technique; a row is named only where the extracted text makes it clear.]
Separate Fibres
• Commercial Telecom Service: most commonly used.
• Corporate Data Network: corporate ICT managed by a fully distinct entity.
• High Speed Data Services (IP data): core data network with Gbps capacity.
• Protection Applications: used for Protection communications between substations terminating a dedicated OPGW.
Separate Wavelength
• Used to economize fibre or optical devices for some (long) links.
• Same as Separate Fibres but in a fibre-constrained context (requires common optical network management).
• Protection Applications: used to separate Protection from a multi-service IP/MPLS network in a fibre-constrained context.
Separate Physical Bandwidth
• Not used except for some special access services and when one EPU sells bandwidth to another EPU or another Utility.
• Only specific corporate links implemented through dedicated telecom.
• IP access to HV substations.
• Protection Applications: used when Protection is integrated in an SDH/PDH dedicated network.
Separate Virtual Network
• Multi-service IP/MPLS network integrating operational, corporate and/or commercial services with VPN separation.
• Protection Applications: none at present.
Figure 9.1 – Commonly used Service Separation schemes employed in Electrical Power Utilities
9.3.1 Fibre Separation in Optical Cables
Using separate fibres in a Utility optical cable (e.g. OPGW) is the simplest solution for separating services and the most commonly used when sufficient fibre is available. Cables generally provide a fibre count in excess of 12 (or more often 24), which allows the usage of separate fibres as follows:
• For commercial and/or corporate services – mainly used when corporate and commercial services are provided by a separate entity, or when an EPU has to account for regulated and unregulated (commercial) activities using segregated accounting practices.
• For Protection – mainly used when the telecom service providing entity does not provide protection communication services.
• For Core IP/MPLS data network – mainly used when the telecom entity operates an SDH network and an IP/MPLS core data network.
9.3.2 Wavelength Separation through C- or D-WDM
Wavelength division multiplexing is increasingly in use in power utility communication networks. In this case, the objective is generally not to increase the transmission capacity per fibre (as for telecom operators) but rather to provide separation between services:
• Assure separation of operational, corporate and U-Telco services.
• Assure the coexistence of SDH multiplexing networks with packet-oriented IP/MPLS core data networks for which traditional SDH cannot deliver the required scalability. CWDM in this case is used as an overlay onto existing legacy PDH/SDH optical systems, thereby enabling additional services to be delivered while preserving the assets for current legacy systems.
• Deliver “pseudo-fibre” services to some operational users and applications, such as protection relay communications in certain cases.
The use of wavelength separation rather than fibre separation is also a particularly appropriate solution where extra fibre is not available, cannot be provisioned economically, or where existing extra fibres can be used in a more attractive manner (e.g. profitably leased). A typical example is to break into an existing trunk cable to enable delivery of more localized services in a region. The significant cost reduction and reliability improvement in recent years have made WDM technologies an accessible and economical solution in existing private optical networks for providing secure transparent connectivity services.
CWDM is the low cost technology for multiplexing up to 16 transmission channels over a wavelength domain from 1300 nm to 1610 nm with a channel spacing of 20 nm (ITU-T G.694.2 and G.695). Its main market is cost effective transport in metro networks and it can be used in different types of network configurations. The transmission range is often limited (around 40 km when using all 16 channels, and around 100 km when using only 8 channels).
DWDM, on the other hand, is a technology used for long distance and/or very high capacity links. It can typically carry from 32 to several hundred channels over a wavelength domain from 1530 nm to 1624 nm (ITU-T G.694.1) with a channel spacing of less than 1 nm. This requires precise wavelength control of the laser light source and temperature stabilization in the multiplexing units. The transmission range is around 80-100 km.
The WDM scheme adopted by utilities depends on the planned expansion of activities. In most cases, where the network is essentially dedicated to internal requirements (operational and non-operational) with limited external Service Provider capability, CWDM is the cost-effective solution, DWDM being reserved for larger capacity requirements and long segments requiring amplification. High power WDM systems have been reported allowing long spans without costly wavelength control [29].
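The channel figures quoted above can be checked with a short calculation. The following is a minimal sketch, not a planning tool: it assumes a first CWDM centre wavelength of 1310 nm (the text only gives the overall domain), and it assumes a 100 GHz DWDM frequency spacing, one of the spacings defined on the ITU-T G.694.1 grid. The function names are illustrative.

```python
# Sketch only: the grid parameters below are illustrative assumptions.
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact by definition


def cwdm_centre_wavelengths_nm(first_nm=1310, spacing_nm=20, channels=16):
    """Nominal centre wavelengths of an evenly spaced CWDM grid."""
    return [first_nm + i * spacing_nm for i in range(channels)]


def spacing_ghz_to_nm(spacing_ghz, centre_nm=1550.0):
    """Convert a frequency spacing into the equivalent wavelength spacing
    near centre_nm, using delta_lambda = lambda^2 * delta_f / c."""
    centre_m = centre_nm * 1e-9
    delta_m = centre_m ** 2 * (spacing_ghz * 1e9) / SPEED_OF_LIGHT_M_PER_S
    return delta_m * 1e9


if __name__ == "__main__":
    grid = cwdm_centre_wavelengths_nm()
    print(len(grid), "CWDM channels from", grid[0], "nm to", grid[-1], "nm")
    print("100 GHz near 1550 nm is about",
          round(spacing_ghz_to_nm(100), 2), "nm")
```

Under these assumptions, sixteen channels at 20 nm spacing end at 1610 nm, and a 100 GHz DWDM spacing corresponds to roughly 0.8 nm, consistent with the "less than 1 nm" figure quoted in the text.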
9.3.3 Bandwidth Separation through PDH/SDH
SDH and PDH technologies constitute the existing (or legacy) core of most EPU telecom networks deployed in the 1990s. Many still consider them a well-proven way to provide the low latency, the time predictability and the fast protection mechanisms required by many utility operational services. Their Time Division Multiplexing scheme is used to provide separation between different services inside the SDH (or PDH) bandwidth. Packet adaptations and specific mechanisms (e.g. Ethernet over SDH) allow packet-oriented traffic to share part of the TDM bandwidth efficiently.
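As an illustration of TDM separation, the sketch below reserves disjoint groups of containers for different services inside one SDH link. It assumes an STM-1 carrying 63 VC-12 containers (each sized for a 2 Mbit/s tributary); the service names and demands are hypothetical and the allocation policy (first fit, in order) is chosen only for simplicity.

```python
# Sketch only: service names and demands are hypothetical.
VC12_PER_STM1 = 63  # an STM-1 carries 63 VC-12 containers


def allocate_vc12(demands, capacity=VC12_PER_STM1):
    """Reserve a disjoint, contiguous block of VC-12 indices per service.

    demands maps a service name to the number of VC-12 containers it needs.
    Raises ValueError when the link capacity would be exceeded.
    """
    allocation = {}
    next_free = 0
    for service, count in demands.items():
        if next_free + count > capacity:
            raise ValueError(f"not enough VC-12 capacity for {service}")
        allocation[service] = list(range(next_free, next_free + count))
        next_free += count
    return allocation


if __name__ == "__main__":
    plan = allocate_vc12(
        {"protection": 2, "scada": 4, "corporate_ethernet": 40}
    )
    for service, containers in plan.items():
        print(service, "->", len(containers), "x VC-12, first index",
              containers[0])
```

Because each service owns its containers, traffic growth in one service (for example the Ethernet over SDH payload) cannot take capacity from another, which is the property that the text attributes to "dedicated channel" technologies.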
9.3.4 Virtual Network Separation (MPLS VPN or Ethernet VLAN)
Virtual Networking allows different user services to share a common packet network while remaining logically isolated through different labels recognized in the switching or routing nodes. It can be set up at different logical layers of the data network, but virtual separation performed at lower layers allows tighter control of performance such as time delay, predictability and service isolation.
Ethernet VLANs are logically separate Ethernet networks, each with its own broadcast domain, which share bandwidth (e.g. SDH payload capacity), cabling and infrastructure. Ethernet frames from one VLAN are not transmitted onto another VLAN. VLAN separation is widely used for EPU operational applications, in particular through Ethernet over SDH, e.g. TCP/IP SCADA and IP voice connections. The restricted broadcast domain provides a powerful security mechanism, limiting inter-VLAN exchange to central router sites with adequate traffic filtering and security barriers. Implementing VLANs over an Ethernet network is an appropriate solution when the number of virtual connections across the network remains moderate. When the network and the services it provides grow considerably in size (e.g. when providing commercial services), VLANs can be scaled up through an MPLS core network.
Multi-Protocol Label Switching (MPLS) implements different logical customer networks over the same physical network in a scalable manner. This provides a solution which is economically more attractive than building physical connectivity per service, although the latter may well be necessary for some specific or very critical services in order to guarantee QoS and service segregation. MPLS makes use of label switching: customer traffic is tagged at the ingress port, forwarded across the network using the label (not the destination address) and delivered at the egress port after removing the label.
This provides a simple way of building VPNs (Virtual Private Networks) on a multi-service network. These are conceptually equivalent to leased multi-point circuits, but with advanced features such as connection-oriented operation, traffic engineering, QoS, and easier addition and removal of sites/nodes. Because the VPN of each customer or service uses a uniquely identified label, the traffic of different customers/services is kept segregated, hence providing service isolation and security.
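The label switching steps described above (tag at ingress, forward on the label, remove at egress) can be sketched as follows. This is a deliberately simplified illustration with a single label per packet and a single core node; real MPLS VPNs typically use a label stack and signalling protocols to distribute labels. The VPN names, label values and table names are hypothetical.

```python
# Sketch only: one label per packet, one core node, hypothetical labels.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Packet:
    payload: str
    label: Optional[int] = None
    trace: List[int] = field(default_factory=list)  # labels carried en route


INGRESS_PUSH = {"scada_vpn": 100, "corporate_vpn": 200}   # ingress edge
CORE_SWAP = {100: 110, 200: 210}                           # core node
EGRESS_POP = {110: "scada_vpn", 210: "corporate_vpn"}     # egress edge


def ingress(packet, vpn):
    """Tag customer traffic with the label assigned to its VPN."""
    packet.label = INGRESS_PUSH[vpn]
    packet.trace.append(packet.label)
    return packet


def core(packet):
    """Forward on the label alone; the payload is never inspected."""
    packet.label = CORE_SWAP[packet.label]
    packet.trace.append(packet.label)
    return packet


def egress(packet):
    """Remove the label and hand the packet to the VPN it belongs to."""
    vpn = EGRESS_POP[packet.label]
    packet.label = None
    return vpn, packet


def send(payload, vpn):
    delivered_to, packet = egress(core(ingress(Packet(payload), vpn)))
    return delivered_to, packet.payload, packet.trace


if __name__ == "__main__":
    print(send("telecontrol frame", "scada_vpn"))
    print(send("office traffic", "corporate_vpn"))
```

In this sketch the two VPNs never share a label value at any hop, so a packet entering one VPN can only leave through the same VPN, which is the segregation property the text relies on.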
Providing IP connectivity for operational and corporate (and perhaps U-Telco) services over the same physical network bandwidth remains controversial and subject to discussion in many EPUs. While “dedicated channel”-based technologies such as SDH or WDM rarely raise the question of one service putting another at risk, there is a natural distrust of service provisioning models based upon shared resource environments such as MPLS, and defined security policies do need to be enacted to ensure that the separate VPNs are in fact securely delivered [30]. Some basic rules must be followed in order to permit coexistence of internal and external customer services in the same MPLS network:
• As MPLS is a “carrier oriented” backbone technology, it is highly recommended not to “mix” the carrier network with customer networks. Adopt the recommended MPLS topology consisting of “Provider”, “Provider Edge” and “Customer Edge” routers. Avoid connecting layer 2 customer equipment directly to the MPLS core network;
• Before configuring an MPLS service, one must ensure that all customer service requirements (availability, bandwidth, delay, maximum re-convergence time, etc.) are known and can be guaranteed. For some specific services MPLS may not be the best solution;
• Each customer service must have its own separate MPLS VPN service;
• Customer service level requirements must be guaranteed by implementing Quality of Service for service prioritization and data flow performance;
• MPLS Traffic Engineering must be implemented to use the network more efficiently and to deal with network congestion and strong changes in traffic patterns;
• The network must be protected from the security threats that may originate from customers. Configure and operate the network with a secure network management system, secure protocols and secure procedures;
• The network must be monitored in terms of equipment interconnection load, packet loss, availability of resilient connections and security events;
• Independent data networks may be envisaged for internal services (MPLS or simply IP) and external commercial services (MPLS). Such architectures get the best of both worlds at somewhat higher cost, providing complete segregation between internal and commercial services while taking advantage of the multi-service and multi-customer capabilities of MPLS.
To end this section, it should be noted that MPLS is now “an aging technology”, first developed a decade ago. It remains complex and does not fully respond to all requirements, in particular for delay- and jitter-intolerant services. A number of other Scalable Ethernet transport (or Carrier Ethernet) technologies have been specified and developed, fulfilling exclusively the transport objectives of MPLS:
• PBB-TE (Provider Backbone Bridge – Traffic Engineering), specified by IEEE (802.1Qay-2009), delivers WAN Ethernet services without the protocol complexity of MPLS. An example of such an implementation at experimental stage has been demonstrated in Argentina [31], elaborating a two layer architecture associating CWDM and PBB-TE to deliver new and legacy operational services.
• MPLS-TP (Transport Profile), under preparation by ITU-T/IETF (in 2010), is an MPLS-compatible network standard with SDH-like restoration and delay.
This subject is further discussed in section 12.4 Telecom Technology Evolutions.
10 MANAGEMENT OF TELECOM SERVICE AND INFRASTRUCTURE
10.1 Introduction - Need for a Management System
Delivering or provisioning telecom services in the modern EPU requires an extensive management effort, both at the high level (policy, planning, partnerships, etc.) and at the day-to-day operational level (organization, customer relation, service management, asset management, field maintenance, etc.). Management implies the association of people structured into an organization, adequate processes defining people’s roles and their interactions, and appropriate tools allowing the fulfilment and monitoring of the process, in order to provide a service with a quality and cost which satisfies the end user (customer). The success of a management set-up depends upon dimensioning the organization, process and tools adequately for the scope of activity, and upon matching their size and complexity to one another.
• An extremely light organization with minimal processes and elementary tools may be sufficient for dealing with a small network dedicated to a very limited scope of services. However, if the scope of delivery grows in scale, the light scheme becomes completely ineffective.
• A large and complex formal process can stifle an activity if it is imposed onto a small organization with a limited scope of service delivery. Large management tools in small organizations may absorb the workforce, which becomes occupied with the tools rather than with delivering the communication service.
• A large workforce composed of in-house and external contractor staff delivering services to multiple users at different geographical sites cannot work with informal arrangements as would a team of five technicians delivering services to their next door operational colleagues.
[Figure 10.1: the Management Scope related to the Management Organization, the Management Process and the Management Tools]
Figure 10.1 – Organization, Process and Tools adequacy to the Scope of delivery
In the case of EPU telecommunications, the management scheme has been trivial and implicit in the past due to the small scale, the simple and time-invariant nature of requirements, informal user relationships, and non-recovery (sometimes non-assessment) of costs. The typical service delivery model in this case has been the Type A or Integrated model described in section 8.4.
However, it is often observed that EPUs undertake major changes in the scale and scope of telecom service delivery, focusing on the infrastructure capacity and technology without paying sufficient attention to the associated management aspects. The organization moves from type A to types B or C (section 8.4) assuming that the same informal management process will remain sufficient. Alternatively, the EPU turns to external contractors to run and maintain the telecom network, or even to provide managed telecommunication services (types D and E), hoping that the contractor or provider will consider itself bound by the “implicit” constraints of proper operation of the EPU’s critical applications. The result is dissatisfaction and conflict.
Indeed, the investment in a standards-based, well-defined management framework is far from a trivial endeavour. In particular, managing the network cannot be reduced to purchasing high cost sophisticated network management tools.
Some of the reasons justifying a redesign of the management system and processes in EPU Telecom Service Delivery are given below. The triggering condition is generally the intention to change the scope of service delivery (e.g. customer base), the scale of services or the organization, or the evolution of the service and/or infrastructure beyond its initial deployment objectives:
• Complexity of the network and its underlying infrastructure, necessitating the cooperation of several complementary skills.
• Multiplicity of service provisioning possibilities, including different levels of external service contractor intervention.
• Reduced workforce and less interchangeability of staff, implying that roles and responsibilities must be more clearly defined.
• Geographical and functional separation of actors (executive and strategic planning, deployment, service operation, maintenance, and service usage), necessitating a clearer definition of what each actor expects from the other parties.
• Changing requirements driven by government programs for the foreseen electrical “network of the future” and its critical reliance on communications.
• Need to demonstrate compliance and efficiency to regulators (technical, economic, etc.).
The present chapter aims to produce guidelines for the definition of management processes governing the provision and delivery of telecom services in the EPU. It aims to establish a reduced framework, applicable to the domain of Utility telecommunications, comprising a well-defined set of processes to formalize the roles and the interactions. The framework is based upon existing Best Practices and existing Management Frameworks which, taken as they are, may appear too complex to be relevant in this context. It should be noted that the specified framework remains a “pick and choose” base to be used by each Utility according to its specificities and its position on the “business maturity” timeline described in the next section.
10.2 Present State Assessment and Target Definition
10.2.1 Telecom Business Maturity Modelling
A formal approach to process definition implies neither a “standard” process nor a unique organization model. These depend upon the exact scope and scale of the EPU in telecom service provision, as previously discussed in chapter 8. The same process can, in one EPU, be merged into the duties of one person and, in another EPU, be partitioned and represent the activities of a whole team. For example, a “type E” service provisioning (figure 8.4) mainly requires extensive supplier relation management, while a “type A” service provisioning mainly requires extensive infrastructure fault management.
In order to design a management process that is applicable and appropriate for a particular EPU, it is necessary to perform a “Business Maturity analysis” establishing a “stepped maturity model” which moves from the present state to a target state in the future and maps out how to progress between the two. Business Maturity models can be based on standard tools such as those elaborated for Smartgrid evolution (Carnegie Mellon University, Software Engineering Institute, [32]). In a basic model, the evolution stages are defined as follows:
1. Initiating – Exploring, building a Business Case.
2. Investing – Implementing extended scope services, developing new schemes.
3. Integrating – The deployed scheme is still being adapted in order to fit the organizational and business requirements.
4. Optimizing – A new scheme having been deployed successfully, fine adjustments are being applied to correct and optimize minor aspects.
5. Innovating – Having implemented an operating scheme, and based upon the acquired experience, the organization is exploring new directions.
Figure 10.2 presents an example of such a stepped maturity model applied to EPU telecom service delivery/provisioning.
The stepped Business Maturity Model can easily be employed in relation to the Service Delivery models previously defined in figure 8.4 to establish where an EPU stands in terms of telecom activity, and its target in terms of evolution considering its intentions and strategies. Naturally, a “type A” telecom activity will be located at the early steps of all scales due to its limited interaction with external parties, both customer and provider/contractor. On the other hand, telecom schemes in which service consumer and provider are fully separate (types C, D and E) are located at the far end of the scales.
Designing a management process also requires reference to existing needs in terms of services and constraints (SLAs) as well as the envisaged and planned evolutions (refer to the section on SLA). An important issue here is the manner in which a redesigned management process can be applied, and the migration path towards the new management system.
[Figure 10.2: four maturity scales, each divided into steps. The steps recovered for each scale are listed below from the earliest to the most advanced.]
Providing Service to External Customers & U-Telco Business Involvement: Operational Services only; Internal Multi-User Services; External Leasing of Facilities; External Managed Services; External Non-IP Wholesale; External IP Service Wholesale; External Retail Services.
External Contractor & Service Provider Involvement: Supplier Helpdesks & Warranties only; Support Contracts for sub-systems; Field Maintenance Contracts; Full Service Delivery Contracts; Procure Telecom Services.
Cost Recovery from Service Users: Overall telecom budget without repartition; Resource repartition per contributing entity; Service costs estimated but not recovered; Service costs recovered at no profit; Service price established as Service Catalog.
Telecom Liability to Service Users: Implicit, user application serves to define the QoS; Required QoS is defined & agreed upon; Relationship can get formal if problems arise; Formal SLA, no systematic measurement; Liable to SLA, must constantly produce formal proof.
Figure 10.2 - Example of Business Maturity Analysis stepped model
10.2.2 Management Process Maturity
The Business Maturity model discussed in the previous section locates the EPU telecom activity and involvement at present and in the future, but does not situate it in terms of present and future management practice. Moreover, the work process of an organization cannot be transformed overnight from the chaotic, informal, ad hoc relationships of a very small delivery unit into a fully mature, value-oriented, scalable organization with tools, processes and work practice that continuously align to business requirements and opportunities. This has led to the definition of Capability Maturity Models composed of different maturity levels. This “stepped model” can be used to assess the capabilities of the management processes in the EPU Telecom scheme, as illustrated in figure 10.3.
A maturity level is a defined evolutionary plateau for organizational process improvement. Each maturity level covers an important subset of the organization’s processes, preparing it to move to the next maturity level. The maturity levels are measured by the achievement of the specific and generic goals associated with each predefined set of process areas. A capability level for a process area, on the other hand, is achieved when all of the generic goals are satisfied up to that level [33].
It should however be noted that the “target state” is not necessarily the highest maturity/capability level for all management processes. Many processes may be fully acceptable at a reactive stage (repeatable), considering the cost and effort required for moving to a proactive stage. Moreover, however proactive the organization is, there will always be unexpected problems that call for an immediate reaction. It is much more important to check that the documented processes are effectively being followed and not constantly “worked around” because they are “not adapted”, “too costly” or require more effort than the organization can afford.

Capability Maturity Levels and corresponding EPU Telecom Service Management:
1 Initial – Processes are ad hoc and disorganized. Chaotic management of telecom services, generally responding to problems as they occur and are signaled by users.
2 Repeatable – Processes follow a regular pattern. Reactive management of service through monitoring of the telecom infrastructure alarms by one or multiple coordinated teams.
3 Defined – Processes are documented and communicated. Proactive service management following documented processes, including predictive and preventive actions on network performance and availability.
4 Managed – Processes are monitored and measured. Quantitative measurement of process effectiveness through KPI monitoring.
5 Optimized – Best Practices are followed and automated. Business value-oriented management capable of optimizing services in terms of efficiency through continuous improvement.
Figure 10.3 - Capability Maturity model based on COBIT and Carnegie Mellon CMM

Finally, setting up a management process framework must always be accompanied by some means of evaluating its effectiveness and the extent of its application. This allows continuous improvement and the necessary adjustments to the processes and to the organization in order to fulfil the overall goal of efficiently providing service of adequate quality to the customers.
This brief description has been extracted and compiled from a number of sources [33], [34], [35] and is only given as a starting point for further investigation. Auditing of management processes and maturity scales is part of the wider domain of IT governance, which is beyond the scope of the present document on telecom services. Interested readers may refer to the specialized literature, in particular that relating to COBIT and CMMI.
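A present state and target state assessment against the five levels of figure 10.3 can be recorded very simply. The sketch below is one possible way to do so, under the assumption that each management process is scored independently; the process names and scores are hypothetical. It reflects the remark above that the target is not necessarily the highest level for every process.

```python
# Sketch only: process names and scores are hypothetical.
from enum import IntEnum


class Maturity(IntEnum):
    INITIAL = 1
    REPEATABLE = 2
    DEFINED = 3
    MANAGED = 4
    OPTIMIZED = 5


def maturity_gaps(present, target):
    """Return, per process, the number of levels still to climb.

    Processes already at or above their target are left out.
    """
    return {
        process: int(target[process]) - int(level)
        for process, level in present.items()
        if target[process] > level
    }


if __name__ == "__main__":
    present = {
        "fault_management": Maturity.REPEATABLE,
        "supplier_relations": Maturity.INITIAL,
        "performance_reporting": Maturity.INITIAL,
    }
    target = {
        "fault_management": Maturity.REPEATABLE,  # reactive is acceptable here
        "supplier_relations": Maturity.DEFINED,
        "performance_reporting": Maturity.MANAGED,
    }
    for process, gap in maturity_gaps(present, target).items():
        print(process, "needs", gap, "level(s) of improvement")
```

Keeping the target per process, rather than one target for the whole organization, makes it explicit where a reactive stage has been judged sufficient.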
10.3 Management Frameworks & Best Practices
10.3.1 Introduction
Telecom management is the subject of numerous standards and is partitioned along different lines depending on the purpose of the partitioning. In particular, the FCAPS model and the TMN Logical Layered Architecture (LLA) must be mentioned:
• ITU-T M.3400 FCAPS Management functions (Fault, Configuration, Accounting, Performance and Security Management) have been used for grouping functions in the development of Network Management Systems associated with Telecom equipment [36].
• TMN Logical Layered Architecture (LLA) defined by ITU-T M.3010 (Element, Network, Service, and Business Management) has, on the other hand, been devised for extending the visibility of the management system through different layers of abstraction. It extends the NMS vision beyond the individual network elements to the network composed of many network elements, then across multiple networks to manage a whole service, and finally to the different telecom services constituting a telecom business [37].
For analyzing the overall management process for telecom services in the EPU however, one needs at the same time a functional view (Fault, Performance, etc.), a lifecycle view (long-term decision making, deployment and upgrades, day-to-day operation) and a logical layer view (Element, Network, Service and Business). FCAPS and LLA have been widely treated in previous CIGRE Brochures (e.g. [38]). In the present document we focus on Management Frameworks defining the business processes that must be deployed and their interactions and in further sections on applications, tools and information systems.
10.3.2 ITIL Framework

IT services pioneered the application of a formal approach to the specification of operation processes. ITIL (IT Infrastructure Library), initially developed by the UK Central Computer and Telecommunications Agency (CCTA) and supported by the IT Service Management Forum (itSMF), gives a set of guidelines for a structured, common-sense, process-driven approach to ensure close alignment between IT and business processes without imposing any “universal solution” on the process design and implementation for the management and delivery of IT services. Applied to a telecom Service Provider, ITIL service management principally defines the processes necessary for the delivery of the IT services (IT view) required for the fulfillment of the Telecom Provider’s Business Processes. These include the telecom Network Management System (NMS), trouble ticketing and incident management tools, IP Voice servers, intranet facilities, etc.

ITIL V2 (the commonly employed version) divides management business processes into functional groups (logical sets):

• Service Support
• Service Delivery
• ICT Infrastructure Management
• Security Management
• Business Perspective
• Application Management
• Software Asset Management
The replacement version V3 (since 2009) broadly covers the same scope but is organized in a “service life-cycle”-oriented manner (i.e. Service Strategy, Service Design, Service Transition, Service Operation, and Continual Service Improvement). Through a scalable and flexible “adopt and adapt” approach, ITIL is applicable to all IT organizations irrespective of their size or the technology in use. Many EPUs are at present using the ITIL framework (or standards based upon it) for the management and governance of their IT infrastructure and services, implying familiarity, practice and trained staff. This covers the corporate and operational information platforms in terms of IT service delivery, IT service support and other associated processes. Applying ITIL may therefore be an obligation through corporate policy, on the IT part of the EPU telecom provider activities and, if Telecom and IT organizations are merged, on common processes such as Service Support (e.g. Service Desk). ITIL is further described in Appendix 7 and in [39].
10.3.3 NGOSS – Frameworx

The TeleManagement Forum (TM Forum) developed a much wider framework called NGOSS (New Generation Operations Systems and Software) which in 2010 was upgraded to Frameworx. NGOSS - Frameworx is a “Solution Framework” for Telecom Service Providers with various entry points based upon the focus and needs of the framework’s user. This framework may be used as a reference for discussion on different aspects of management (our usage in this paper) or for defining the scope and boundaries of “management application” development projects. It covers the following components:

• Business Process Framework eTOM (Enhanced Telecom Operations Map, also published by ITU-T as Recommendations M.3050.x). The Business Process Framework may be used to support organizational analysis, launch process harmonization initiatives, and show points of interoperability between applications.

• Information Framework SID – This framework provides an enterprise-wide information reference model and vocabulary. It may be used by IT departments to tackle data diversity, or as a starting point for information-related projects such as database design.

• Application Framework TAM – Common language to describe systems and their functions. It may also be used as a procurement checklist for management tools.

• Integration Framework – Management application integration concepts and principles. The Integration Framework’s APIs (Application Programming Interfaces) and business services can be used to support application interoperability.
NGOSS-Frameworx and its constituents are described in more detail in Appendix 8 and in [40].
10.3.4 Business Process Framework eTOM

The Business Process Framework component of NGOSS-Frameworx, called the Enhanced Telecom Operations Map (eTOM), describes all the enterprise processes required by a Telecom Service Provider and analyses them to different depths (level 0, level 1, etc.). It can be used to analyze existing business processes, to identify redundancy or gaps in the current strategies, and to re-engineer processes, correcting deficiencies and adding automation [41].

Management business processes in eTOM are partitioned into three areas:

• Processes related to building a telecom activity (Strategy, Infrastructure and Product)
• Processes related to running a telecom activity (Operations)
• Enterprise processes which are not specific to the telecom activity (Enterprise)

The eTOM model further defines two main groups of interactions with the outside world:

• Interactions with telecom service customers
• Interactions with providers and suppliers of services

The eTOM model locates individual business processes across four transversals:

• Market, Product and Customer – This includes all upstream and operations processes which are directly in contact with the customer/user. It is the customer view of the Telecom Management. In this context, “product” means a productized service proposed by the provider to his customers (e.g. a communication service with a pre-established level of quality and price – a standard SLA).

• Service – This covers all processes required for the delivery and support of communication services to the provider's customers using owned resources and/or services procured from other suppliers and partners.

• Resources (Applications, Computing and Network) – This covers the management processes relative to network infrastructure and associated tools and applications which are employed for delivering the service. This transversal corresponds to the network view of the management process.

• Supplier/partner – This covers all processes concerning relationships with contractors, partners and suppliers whose services enable the organization to provide telecom services or to maintain the network infrastructure used for telecom service delivery.
At first glance, the eTOM Business Process Framework looks terribly complex and mainly relevant to the competitive, customer-driven Telecom Service Provider business with millions of customers, a very large number of assets, numerous support service suppliers and a large and continually renewed service portfolio. Modeling the relatively modest scope of EPU telecom service provision/delivery with the full strength of a Telecom Operator Business Process Modeling tool is not an obvious choice. However, it should be noted that the eTOM framework is structured for “pick and choose” of relevant sections according to the scale and size of the activity and may be adapted to requirements. Process elements representing the activity of a whole department in the case of a Public Telecom Operator may be part of the mission of one person or entity in the EPU telecom organization. eTOM identifies in a detailed manner the required interactions and allows the detection of potential areas of automation and IT support.
10.3.5 Relating ITIL to eTOM Framework

There is no incompatibility between ITIL and eTOM. They simply do not address the same scope: ITIL covers IT service management in any IT context (IT view) and eTOM covers Business Processes of a Telecom Service Provider (Business view). A detailed analysis of ITIL and eTOM in the context of telecom service delivery is performed by ITU-T Rec. M.3050 – Supplement 1 [42]. The document “supports the understanding that a common, integrated view can be derived so that either an eTOM-based or an ITIL-based solution can be understood in terms of the other perspective”. Some differences in vocabulary and process partitioning between ITIL and eTOM are presented in the same document, and the mapping of ITIL V2 primary processes on an eTOM map is presented in figure 10.4.
Figure 10.4 - eTOM L2 Operations processes and ITIL processes (TM Forum document GB921-V, Release 6) and [42]
10.4 Towards a Utility Telecom Management Framework

10.4.1 Introduction

The EPU telecom management processes can broadly be divided into two groups as shown in figure 10.5:

• Upstream Management comprises all the processes necessary for building the telecommunication policy and the provisioning model in conjunction with the power utility actors and the regulating authorities, setting up the business plan for the adopted service delivery organization, and deploying the necessary infrastructure and/or contracts. The outcome is an organization with staff, processes, assets and service contracts which allow, through day-to-day operation, the delivery of the necessary telecom services within the EPU. The deployment process also includes the migration planning from the existing scheme (or network) into the target service provisioning/delivery model and underlying infrastructure. This aspect is discussed in section 10.5.

• Current Operations consist in running a network and providing services to different users. It comprises different work tasks in relation to external Service Providers, in relation to EPU service customers, and in relation to the network services and infrastructures as well as associated tools.
[Figure: Upstream Management (Policy Definition, Business Planning, Deployment, Upgrade & Migration, Business Development) provides Organization, Process, Network Assets, Service Contracts and the Service Catalog to Current Operations (Customer Relations, Service Management, Infrastructure Management, Provider Management); Operational Data, Assessments, Audit Reports and Asset Requests flow between the two.]

Figure 10.5 – Management process for delivering/provisioning of telecom services
10.4.2 Utility Telecom Management Operations Map (uTOM)

Based on the considerations previously described in this chapter and keeping eTOM as a high-level reference, we can define the following attributes for a Utility Telecom Management framework:

1. The telecom entity must manage two interfaces whatever the delivery model:
• Interface to Service Customers/Users
• Interface to Service Providers/Contractors
Depending on the service delivery model (refer to fig 8.6), the services that the telecom entity procures through providers and contractors can be technical support, field maintenance, dark fibre, bulk transmission capacity or leased application-level connectivity.

2. A Service Provision layer allows the telecom entity to employ self-owned facilities and/or other provider services to deliver a satisfactory level of communication service to the customers/users. Even in the case of directly procuring application-level connectivity, a “thin” telecom service layer is still required to:
• Ensure that User requirements are correctly mapped to providers’ service obligations and delivered quality
• Associate multiple procured services from different providers in order to meet the availability and fault tolerance constraints of critical applications.
[Figure: the Service Provision Process sits between the Customer / User Relationship (towards the Customer / User) and the Provider / Contractor Relationship (towards the Provider / Contractor).]

Figure 10.6 – Basic model demonstrating Service Provisioning from (external) Providers and Service Delivery to (internal) Users
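The second role of the “thin” service layer, associating several procured services to meet an availability constraint, can be illustrated with a short calculation. The sketch below is not part of the brochure: it assumes the procured services fail independently (which must be checked in practice, for instance against shared cable routes or power supplies), and the availability figures and the target are hypothetical.

```python
from math import prod


def combined_availability(availabilities):
    """Availability of redundant services used in parallel.

    Assumption: failures are independent, so the combination is down
    only when every individual service is down at the same time.
    """
    if not availabilities:
        raise ValueError("at least one service is required")
    for a in availabilities:
        if not 0.0 <= a <= 1.0:
            raise ValueError(f"availability must be within [0, 1], got {a}")
    return 1.0 - prod(1.0 - a for a in availabilities)


def meets_target(availabilities, target):
    """True when the combined availability reaches the required target."""
    return combined_availability(availabilities) >= target


# Hypothetical figures, for illustration only.
provider_a = 0.995   # leased connectivity from a first provider
provider_b = 0.99    # diverse connectivity from a second provider
target = 0.9999      # requirement of a critical application

print(meets_target([provider_a], target))              # one service alone
print(meets_target([provider_a, provider_b], target))  # both combined
```

With these assumed figures a single procured service falls short of the target, while the combination of two independent services reaches it.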
3. The management model for Utility Telecom services must cover the whole life-time of the service provision cycle:
• Building a strategy for the provisioning & delivery of telecom services
• Building the capability to materialize the provisioning & delivery scheme
• Defining the perimeter of services to be proposed to the customers
• Operating the service

[Figure: life-cycle stages. Upstream Management: Strategy Building (build a strategy for the provisioning & delivery of telecom services: Define Policy, Identify Opportunities, Set-up Business Plan, Convince stake-holders, Set-up Strategic Partnerships, Solve Regulatory Issues); Capability Building (build the capability to materialize the delivery scheme) and Business Development (define the perimeter of services to be proposed to the customers), together covering Define Service Catalog, Deploy Network Resources, Adjust Service Pricing, Constitute Operation Team, Provision Services & Support, Specify Service Migrations, Deploy tools & Processes. Operations: Operating Scheme (operate & maintain the service and the infrastructure: Initialize Services, Deliver over Network Resources, Maintain Network Resources, Provide Support to Users, Invoice & Get Paid).]

Figure 10.7 – Telecom Management Process Life-cycle

The Upstream Management can be further divided along the time cycle into:
• Policy Definition & Business Planning (or Strategy Building)
• Strategic Deployment & Tactical Adjustment (or Capability Building)
• Business Development
This division corresponds to the eTOM verticals called S (Strategy & Commit), I (Infrastructure), and P (Product) adapted to the Utility Telecom context.

4. In the great majority of Electrical Power Utilities, telecom services are delivered through an owned telecom infrastructure requiring extensive network infrastructure management. The telecom management framework must therefore include a “Network Resource” management process layer. Adopting eTOM structures, this also includes the management of all IT tools and platforms associated with the telecom network and service management.

5. Operations management processes generally follow the same pattern, which is illustrated in figure 10.8 as “Initialize, Deliver, Provide Support and Get Paid”. This corresponds to the eTOM verticals called “Fulfilment, Assurance, Support and Accounting (or Billing)”.
• A user request (e.g. a new SCADA connectivity) is initialized on the basis of a previously defined “SLA” (type of interface, capacity, QoS, etc.).
• The service is delivered to the user from the moment it is registered, provisioned over the network and established. Support is given to the user at different levels for this new connectivity.
• The telecom provider entity may measure the usage of the network and invoice the SCADA entity for the communication service.
[Figure: uTOM. Upstream Management comprises Build Strategy (Strategy & Planning), Build Capability (Deploy & Adjust) and Build Service Offer (Business Development: Service Portfolio Evolution, Service Migration Planning). Operations, located between the Customer / User and the Provider / Contractor, comprises the layers Customer / User Relationship, Service Management, Resource Management and Provider / Contractor Relationship, crossed by the verticals Initialize, Deliver, Provide Support and Get Paid. Enterprise Processes: Security, BCP, Safety, Skill Mgt.]

Figure 10.8 – Utility Telecom Operations Map (uTOM)

6. In addition to the Upstream and Operational Management processes defined previously, a telecom Service Provider enterprise owns a number of general processes which are generally performed at the corporate enterprise level. These processes include Human Resources, Skill Management and Training, Risk Management, Financial Management, Quality Assurance, etc. In an Electrical Power Utility, these corporate activities are not specific to the telecom service providing entity and as such need not be covered in the present analysis. However, a number of these “Enterprise” processes, which can be particularly impacted by the telecom activity and by the model of telecom service provisioning/delivery, shall be enumerated and discussed.
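The operations pattern of attribute 5 (“Initialize, Deliver, Provide Support and Get Paid”) can be pictured as an ordered sequence of steps applied to each user request. The following is a minimal sketch under simplifying assumptions, not a description of any actual management tool: the class and field names are hypothetical, and the steps are treated as strictly sequential although in practice support and invoicing recur throughout the life of the service.

```python
from dataclasses import dataclass, field
from enum import Enum


class Step(Enum):
    """The four operations steps named in figure 10.8, in order."""
    INITIALIZE = 1
    DELIVER = 2
    PROVIDE_SUPPORT = 3
    GET_PAID = 4


@dataclass
class ServiceRequest:
    """Hypothetical record of one user request handled by the telecom entity."""
    user: str
    service: str
    sla: str
    history: list = field(default_factory=list)

    def next_step(self):
        """Return the step expected next, or None when the cycle is complete."""
        done = len(self.history)
        return Step(done + 1) if done < len(Step) else None

    def advance(self, step):
        """Record a step, refusing any step taken out of order."""
        expected = self.next_step()
        if step is not expected:
            raise ValueError(f"expected {expected}, got {step}")
        self.history.append(step)


# Illustrative walk through the cycle for the SCADA example of the text.
request = ServiceRequest(
    user="SCADA entity",
    service="new SCADA connectivity",
    sla="previously defined SLA (hypothetical reference)",
)
for step in Step:
    request.advance(step)
print([s.name for s in request.history])
```

Refusing out-of-order steps reflects the text's point that a request is initialized against a previously defined SLA before anything is delivered or invoiced.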
10.5 Upstream Management

10.5.1 Introduction

Over the last 20-25 years, the EPU has undergone significant organizational change, largely influenced and driven by political and legislative policy, generally moving away from the vertically integrated, government-owned monopolistic organization responsible for the generation, transmission, distribution and supply of electricity towards a competitive electricity market. This tendency, described in section 8.2, has a significant impact on the strategic decisions of the EPU as to the way to provision telecom and IT services:

• The EPU is a commercial organization whose goal may be not only delivering reliable and secure electricity but also generating profit through “diversified” activities and paying dividends.
• The stakeholders are no longer only the state representatives but also the Parent Company, the Investors, the Customers and the Regulation authorities.
In such a context, the tasks of Upstream Management for Telecom Services widen considerably. They cover all the processes resulting in the adoption of a particular telecom service delivery mode and the preparation of financing, facilities, organization, resources and partnerships for the activity to be sustainable, economically viable and approved by the different stakeholders. Figure 10.9 presents the main constituents of Upstream Management in the Utility Telecom context and figure 10.10 a typical process flowchart and organization for these tasks.
[Figure: Upstream Management within the uTOM, comprising Build Strategy (Strategy & Planning), Build Capability (Deploy & Adjust) and Build Service Offer (Business Development: Service Portfolio Evolution, Service Migration Planning), shown alongside Operations and Enterprise Processes.]

Figure 10.9 Upstream Management in the uTOM
Figure 10.10 Upstream Management Process Flowchart and Organization
10.5.2 Policy Definition & Business Planning

Telecom policy definition and business planning covers executive-level decision-making processes that lead to the long-term plan (5-10 years) for the evolution of telecom service provisioning in the EPU. Its primary aim is to outline a clear direction and framework for how a competitive telecom solution can be provided within the operating environment, thereby ensuring:

• Reliable and secure accomplishment of the EPU primary mission (e.g. power generation, coordination, and delivery), thus reducing the risks of human accidents, losing customers, sanctions and penalties, and costly damage to power system assets,

• Shareholders’ satisfaction (usually long-term growth) and good mid-term returns (dividends) at the EPU level,

• Regulator’s approval (essentially delivering “value for money” to the end customer).
The strategic policy on telecom services must allow the EPU to define the focus and strategic direction of the business over the next five-year period:

• Determine how the organization will deliver the services based on its existing resources (both capital and people),

• Define any requirement for the cessation or diversification of existing services, and identify opportunities or requirements for the creation of new services,

• Quantify any subsequent investment necessary to fulfil such objectives.
The strategic policy for telecom services depends upon a number of factors and information from a variety of sources. Some important factors and issues are enumerated hereafter:

Parent Company Policy (and Business Plan) – This will define the general framework and the orientations to be taken towards telecom services (e.g. diversify activities to other profitable businesses).

Regulatory Context – Telecom activities of the EPU must respond to the requirements of any Regulatory and Legislative frameworks to which it is bound. In the case of U-Telco business operations this includes the Telecommunications Regulator as well as the Energy Regulator. Regulatory constraints may have significant impact on the availability of funds to develop the telecom facilities, may create new security or safety obligations leading to extended telecom services or refurbishments and renewals, may favour certain delivery modes over others through CAPEX/OPEX distinctions, and may present an obstacle to the integration of corporate enterprise services with the operational services.

Organizational Opportunities – ICT Convergence (the technological trend removing the boundaries between information processing/storage and exchange), together with the similarity in the required skills, is driving utilities to consider the merger of these organizations, bearing in mind that the “cost and effort saving” is to be balanced against potential issues regarding certain processes and work constraints. Such a merger between IT and telecom activities can greatly impact the strategic orientations for telecom service delivery.
Similarly, the cost and effort for maintaining two independent telecom organizations within the utility perimeter to fulfil operational and corporate enterprise communications is driving many utilities to consider their merger.

Market Opportunities – Revenue-generating commercial telecom involvement through U-Telco business is often envisaged by EPUs whenever major enhancements are planned in the telecom infrastructure. Setting up a U-Telco business however necessitates the preparation of a distinct business case based on detailed potential market surveys and opportunity identification, which is beyond the scope of the present document. Combining the essentially “non-profit mission” of operational telecom service provision with the “profit-oriented” commercial service in the same organization may raise a number of organizational and regulatory issues that need to be carefully analyzed (e.g. business development in a competitive environment without compromising the critical service constraints of the power system, or access to financing for operational service development while maintaining competitive commercial activity).

Strategic Partnership Opportunities – Many utility services require coverage, workforce or investment which is economically unfeasible for the EPU on its own. Resource sharing (cable infrastructure, radio coverage infrastructure, emergency mobile network, or maintenance team) between multiple utilities or multiple critical Service Users can be a way around this economic obstacle.

Investment Context – Financing the deployment of a large telecom infrastructure can be performed through the company’s own funds, or through international loans. This latter type of financing cannot be applied to an essentially OPEX mode of telecom service delivery (e.g. procuring telecom services).
Similarly, obtaining financial support for developing the network for competitive commercial services requires demonstrated “Return on Investment”, which may not be required for financing operational service development. Combining the two may cause difficulties for obtaining either type of financing, unless clearly separated.

Resource Context – As previously stated, asset ownership, in particular optical fibres across transmission lines, is a determining factor in the selection of service provisioning modes and in cost assessment. It is also a determining factor for the feasibility and interest of U-Telco involvement and hence of the envisaged strategic orientations. Another highly decisive resource in the EPU telecom environment is the RF Spectrum. Assuring that adequate RF spectrum is allocated (or maintained) for the operational usage of Utilities is a major field of national and international action in many parts of the world, in particular in the US and Australia. Gaining (or keeping) access to specific parts of the spectrum is a political issue necessitating power utilities to voice their concerns collectively to legislators and regulators (e.g. UTC). Access (or lack of access) to adequate RF spectrum highly impacts the mode of delivery for mobile services and power distribution network communications.

Skills and Human Resource Context – Unavailability of sufficient workforce and/or skills for scaling up the telecom activity may lead to the merger of IT and corporate enterprise services with the operational telecommunications, or to outsourcing/contracting.
10.5.3 Strategic Deployment and Tactical Adjustments

These tasks, which have been collectively named “Building Capability”, transform strategic decisions on the provisioning of EPU telecommunication services into deployed solutions. They comprise a great amount of project and contract management. Strategic deployment may cover all or part of the following processes depending upon the Utility organization, regulatory context and the scope, scale and type of service delivery which is envisaged:

Building a Business Case – Building a business case is at the frontier between strategic decision making and deployment. It transforms the business plan into a detailed project usable for investment appraisal and for obtaining required approvals. The business case in general includes the benefits of the proposed project for the EPU, the estimated cost, the analysis of “Return on Investment” and the risks related to the project. In Utility telecom projects, it may also include the risks associated with “not deploying” the proposed project as well as the assessment of alternatives. Increasingly, the high level of necessary investments leads utilities to envisage ways to optimize the usage of assets, when regulatory issues can be overcome, through the integration of other services (e.g. corporate enterprise) or revenue-generating U-Telco type services.

Building Organizational Capability – A telecom workforce exists to some extent in all EPUs. It may form a specific organization or be incorporated into other operational entities. Organizational changes related to strategic deployments are often the result of major changes in the scope and scale of the delivered services. This may comprise technological change for which the workforce is not skilled and cultural changes needing time to be assimilated (e.g. formal relationships). Change in organization necessitates prior preparation of formal operational management processes as described in a further section.
The organizational change may be the change of service provisioning mode from procured telecom services to an EPU-operated dedicated network or vice versa. It can also be the change from an operations-incorporated activity into a multi-user internal Service Provider or affiliated service company. Building organizational capability may require new skills and more staff than exist in the present organization. This issue can be solved through:
• Employing workforce with required skills
• Extensive technical training programs
• Outsourcing skilled workforce
• Contracting certain support activities (e.g. maintenance)

Deploying Network Infrastructures – Rehabilitation projects leading to a major change of scale and scope of the EPU dedicated telecom network are often contracted in a turn-key manner. These projects often comprise design optimization, procurement, installation and commissioning of all necessary equipment, provision of power supplies, survey of existing facilities and integration with the existing network. Contracting them necessitates precise specification covering the responsibilities and liabilities of each party. Telecom network deployment projects are rarely “greenfield installations”. Rehabilitation projects and their network-wide surveys are often the opportunity to review the lifecycle issues of existing equipment, comparing the cost of replacing old equipment with the cost of upgrading and maintaining it. Timely replacement of old telecom assets may allow substantial economy on operating expenditure (spare parts, obsolescence management, site interventions and repairs, etc.) in addition to enhanced functionalities and performance. A major issue in large-scale telecom network rehabilitation projects is the migration of operational services from the existing network to the target network with minimal and scheduled service interruptions.

Building Service Contracts – Contracting of services for the deployment of network infrastructure and management tools, and/or operation, maintenance and support, necessitates precise specification. While turn-key implementation projects are often well specified, precise and covering the responsibilities and liabilities of each party, service contracts are often poorly defined. Service contracts are often based upon Service Level Agreements (SLA) describing the engagements of the contractor to intervene upon encountered anomalies within a constrained time: this contractual time defines the grade of the SLA (e.g. gold, premium, diamond, etc.). The contractor must provide people, organization, know-how, tools and process. However, the responsibility of the contractor, who has not designed the system, has not chosen and procured the equipment, and has not decided to maintain rather than to replace old assets, cannot be extended to the proper operation of the system with an acceptably low down-time. The contractor can only guarantee to intervene within the contracted time limit with skilled staff. Coordinating the contracts for equipment procurement, network implementation and operation & maintenance can prove to be extremely hard, leaving gaps in the overall responsibility. The liability of the contractor is another important issue. In general, the level of sanction cannot cover the potential loss: risk sharing cannot be back-to-back.
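The distinction made above, that an SLA grade constrains the contractor's intervention time rather than the resulting down-time, can be made concrete with a small check. This is an illustrative sketch only: the grade names are taken from the text, but the time limits attached to them are hypothetical values chosen for the example, as are the function and variable names.

```python
from datetime import datetime, timedelta

# Hypothetical intervention limits per SLA grade (not from any real contract).
INTERVENTION_LIMITS = {
    "gold": timedelta(hours=8),
    "premium": timedelta(hours=4),
    "diamond": timedelta(hours=2),
}


def intervention_within_sla(grade, reported_at, on_site_at):
    """True when the contractor intervened within the contracted time limit.

    Only the delay between the anomaly report and the intervention is
    checked; the duration of the outage itself is deliberately ignored,
    since the contractor does not guarantee it.
    """
    if on_site_at < reported_at:
        raise ValueError("intervention cannot precede the report")
    return on_site_at - reported_at <= INTERVENTION_LIMITS[grade]


# Illustrative incident: reported at 08:00, contractor on site three hours later.
reported = datetime(2011, 4, 1, 8, 0)
on_site = reported + timedelta(hours=3)
print(intervention_within_sla("premium", reported, on_site))
print(intervention_within_sla("diamond", reported, on_site))
```

A contract measured this way can be fully respected while the service remains unavailable for much longer, which is the gap in overall responsibility that the text points out.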
Building organizational capability may also result in the need to outplace existing staff previously involved with that particular service into the contractor’s workforce. This delicate human resource issue needs to be taken into account and treated at the upstream management level prior to the change. Moreover, it should be noted that the use of external resources in whichever scheme is indeed a way to acquire rapidly and flexibly the required skills for introducing new technologies, but the obtained skill and experience is not sustained in the company. Outplacement of skills and experience into external contractor companies is often an irreversible process and may represent a loss of control and capability in telecommunications, rendering the EPU dependent on its contractor.

Deploying Management Tools & Processes – Building organizational capability also requires the definition and application of new operational processes which will be appropriate for the new scope and scale of activity. These are described in a further section on operational management. Deploying management tools, beyond the vendor-specific equipment configuration and management platforms, represents an investment that many utilities are at present finding necessary. These tools consist of on-line and off-line information systems allowing interactions with the network infrastructure, with Service Users, with Service Providers and across the management organization. They include service desks, network and service configuration management databases, alarm and event management platforms, performance monitoring systems, incident management and work order dispatch systems, dynamic route tracking systems, security management and intrusion detection systems, etc. These IT platforms and applications are described in section 10.7 further in the document. Management tools employed in small and simple telecom networks are often trivial, “home-made” and may have marginal cost of maintenance. However, scaling up the network and formalizing the processes necessitates more elaborate, complex tools which represent cost and effort to deploy and to maintain.

Deployment Phasing and Migration Plan – An important task in transforming business plans and policies into deployed solutions is the elaboration of a phased deployment plan describing the step-by-step migration from the existing situation to the target solution. The deployment plan may in this way be extended over some years depending upon a number of parameters:
• Business plan requirement,
• User requirements anticipated in the Service Catalogue/Roadmap (see next section),
• Lifecycle of existing telecom assets on the network,
• Investment plan and availability of funds,
• Deployment capability and skills,
• Power delivery constraints and minimal disturbance planning.

Validation and Feedback – The process of deploying a telecom solution allows the practical validation of the long-term policy and its related strategic decisions. Parameters and factors which have not been taken into account and hypotheses which turn out to be invalid are in this manner identified. The feedback can be used to adjust the business plan and the strategic orientations. The deployment process must use previously defined metrics (KPI) and devise measurement capability to validate strategic decisions. On the other hand, operational management teams can provide, through their processes described further, valuable feedback used for identifying:
• Asset usage and potential optimization,
• Asset “replace or maintain” requirements based on operating costs,
• Existing service delivery costs,
• Service contractors and supplier performance,
• Security reinforcement requirements, etc.
Tactical Adjustment – EPU communication service requirements are not static and evolve in time. Anticipating new service requirements is described in the next section on Service Offer. Adjusting the capabilities of the network (or provisioning contract) in terms of coverage, bandwidth and network to meet requirements as they are encountered must, however, be treated with higher reactivity. System and process upgrades and optimizations are performed within the perimeter of the approved yearly budget. This is generally carried out by setting up projects and following their deployment using the operational management or contractor workforce. The scope may cover optimization, corrective action, upgrade/renewal of equipment and firmware upgrade.
10.5.4 Business Development, Service Offer and Service Migrations
If policy definition and business planning were likened to the decision to set up a restaurant at a certain location, assessing the catering mode (e.g. served at table, self-service, fast-food, etc.) and looking after legal issues, then strategic deployment would be employing staff, purchasing equipment and preparing the premises accordingly. In this case, this section would be about the content of the menu, attributing prices to different items and adjusting the menu in time to correspond to customer tastes and preferences. In the Telecom Operator world, this is called Telecom product marketing and lifecycle management and is mainly related to market survey. In the EPU’s operational telecom context, building the Service Offer corresponds to the analysis of user application requirements, the grouping of constraints and attributes into different categories of Service Level Agreements (SLAs), and hence the building of a “Service Catalogue”. Building the telecom service catalogue also comprises the determination of service prices based upon the initial network investments (cost of building the capability) and operating costs (operational management cost), the latter also including all external costs due to suppliers and contractors. The manner in which these costs are converted into service prices is governed by the profit strategy set up by the Business Plan (Strategy Building). Pricing of services may be used for direct invoicing of services to the users or, often, for the repartition of expenses:
• Between internal and external users
• Between operational and corporate users
• Between different operational entities
As for the restaurant menu, the communication service catalogue needs to be updated according to encountered and anticipated changes in the EPU’s applications. Typically, SCADA applications are in most EPUs migrating from serial point-to-point communication channels to Ethernet-based TCP/IP networking. This change, together with other applications’ migration to Ethernet, is reducing considerably the requirement for serial point-to-point communications, but increasing sharply the requirement for time-controlled wide area Ethernet. The service catalogue must therefore be adjusted correspondingly, leading to the deployment of adequate capabilities in terms of network, management tools and skills. Service migration planning is often to be phased according to a number of different factors:
• Application asset lifecycle and migration plan
• Extent of required infrastructure change and corresponding investment plan
• Deployment capability for the migration and readiness of the organization
• Other planned changes and extensions, allowing a grouped migration project, reducing costs and service interruptions
10.6 Operational Management
Operational Management consists of operating the telecom services, the network and its related resources on a day-to-day basis. It covers the current relationships of the service providing entity with its customers and providers/contractors, as well as the processes necessary for delivering telecom services, as represented in the figure hereafter.
[Figure 10.11 labels: Upstream Management; Customer / User; Customer / User Relationship; Service Management; Resource Management; Provider / Contractor Relationship; Enterprise Processes; Provider / Contractor; Initialize; Deliver; Get Paid; Provide Support]
Figure 10.11 – Utility Telecom Operational Management
The Operational management processes can be further related to an operational lifecycle track categorizing them into four types:
• Fulfillment (Fu) – Setting up and initialization of different schemes (users, services, channels, tools, etc.) based upon pre-established agreements and rules.
• Assurance (As) – Running the schemes set up previously in order to deliver the communication services.
• Support (Su) – Providing a response to different enquiries and requests for service, including the maintenance services.
• Accounting (Ac) – Determination and recovery of costs related to the delivery of telecom services to different Utility-internal or external Service Users according to rules which are set by the Upstream Management processes, monitoring the usage of communication resources according to contracted conditions and recovering of network revenues to settle the operational expenditure.
10.6.1 Customer/User Relation Management
This process involves all the different interactions that the Utility Telecom Service Provider must fulfil in order to obtain the satisfaction of its internal or external customers. Some constituents of this process are listed below:
10.6.1.1 Service Enquiry Desk (Su)
The Service Desk is the interface point for all User relations. In many cases, the service desk in the Utility Telecom organization takes the form of telephone and/or e-mail, although web-based automated service desks are also employed. It is current practice in many utilities, however, that the Service User bypasses the service desk and calls the technical staff directly, making it very difficult for the support organizations to keep track of their service calls and to justify their required resources through their load statistics. The bypassing of the service desk in Utility Telecom services is often cultural, stemming from the historical relationship between users and provider staff in previous operation schemes (telecoms embedded in the operational activity), and educational effort is required to establish the use of the service desk. The Service Desk tracking of User issues is also a valuable tool for the survey and monitoring of User satisfaction, providing statistical data for the analysis of services, resources and the evolution of User expectations.
10.6.1.2 User Order Handling (Fu)
User order handling is the process of initializing a new instance of service contract with a User/Customer based upon a pre-defined SLA. The user entity, in general, transmits a written (e.g. through e-mail) Customer Order (CO) to request the creation of a new connectivity based upon an existing service type which is known and pre-characterized by the Telecom Service Provider. For example, a request may concern the creation of SCADA IEC104 connectivity between a given substation RTU and the Control Centre.
Since the SCADA IEC104 communication service is part of the “standard service catalogue”, the Service Provider knows the interface type and the characteristics of the circuit as well as the exact terms of the Service Level Agreement (SLA) concerning this service. If the requested service does not correspond to a “standard service catalogue” SLA, then the User entity and the Service Provider entity must first define an appropriate SLA specifying interface type, throughput (or capacity), time constraints, availability, security and integrity constraints, fault tolerance, service restore time, etc. User order handling comprises the following tasks:
• Comprehend user requirements, determine feasibility and appropriate standard SLA
• Register the new user service for inventory, support, SLA monitoring and accounting purposes
• Transmit a Service Initialization Request (SIR) including SLA attributes to the Service Management layer (Service Configuration & Activation)
• Notify the User (acknowledgement and date of availability)
10.6.1.3 User Change Management (Fu)
User Change Management concerns the telecom provider’s handling of all Customer/User Change Requests. A user change request is in general a written demand concerning an existing
connectivity to modify its throughput, quality, interface type or routing. The change may be feasible from the telecom provider’s Operation Support System or may require field intervention. In both cases, the user change management process must include time coordination for performing the requested modifications and, in the latter case, also site access coordination for the field intervention staff.
10.6.1.4 User Problem Handling (As)
The telecom provider must comprehend the user problem expressed through the service desk, generally by telephone, create a trouble ticket and manage it up to the resolution of the problem and closing of the ticket. Proper management of user problems and an adequate method for their tracking and reporting are essential in a formal user/provider relationship.
10.6.1.5 User QoS/SLA Management (As)
The telecom provider must monitor the QoS for the communication service and elaborate a User Dashboard which allows quality reporting to the User/Customer. The measured quality is compared to the contracted SLA and violations are managed. User SLA monitoring allows the notification of users and fast resolution of quality degradations. For critical services such as protection relay communications, major application-level incidents and consequent contingencies can be avoided in this way.
10.6.1.6 Customer/User Entity Invoicing & Settlement (Ac)
Customer relationship management also includes the creation of invoices for each user entity (e.g. SCADA, Protection, Surveillance, etc.) based on the usage data collected at the Service Management level and on pre-established service pricing as defined in the Upstream management process. Managing customer entity invoice enquiries is also part of this process.
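The comparison of measured quality against the contracted SLA described under User QoS/SLA Management can be sketched as follows. This is a minimal illustration only: the three attributes chosen (availability, latency, restore time) and all class and function names are assumptions made for the example, not a format defined in this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SlaTargets:
    """Contracted limits for one service (hypothetical attribute set)."""
    min_availability: float   # fraction of time, e.g. 0.9999
    max_latency_ms: float     # end-to-end transfer time limit
    max_restore_min: float    # longest tolerated service restore time


@dataclass(frozen=True)
class Measured:
    """Values observed over one reporting period (hypothetical)."""
    availability: float
    worst_latency_ms: float
    longest_outage_min: float


def sla_violations(sla: SlaTargets, m: Measured) -> List[str]:
    """Return the name of every contracted limit that was not met."""
    found = []
    if m.availability < sla.min_availability:
        found.append("availability")
    if m.worst_latency_ms > sla.max_latency_ms:
        found.append("latency")
    if m.longest_outage_min > sla.max_restore_min:
        found.append("restore time")
    return found
```

An empty list means the period was compliant; a non-empty list could feed the User Dashboard and trigger user notification.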
10.6.2 Communication Service Management Process
Communication Service Management covers all the processes necessary for the utility telecom provider to deliver different services to its users employing owned and/or contracted resources.
10.6.2.1 Service Configuration & Activation (Fu)
Once a new instance of user service contract is initialized, it is necessary to design, configure and activate the service. In a small integrated delivery scheme, Customer Order Handling and Service Configuration and Activation can be integrated into a single process, i.e. the Customer Order is translated into a Service Order implicitly. In a more general, larger context, a Service Initialization Request (SIR) including SLA attributes is transmitted to the Service Configuration & Activation process. This process is in charge of designing the solution which fulfils the quality requirements. It shall comprise the association of different connections, the allocation of specific parameters, and the choice between different modes of communication (multiplexed, switched, different IP network planes, priority assignments and route resilience) based on prior knowledge of network connection availability and time performance, in order to meet the specified SLA requirements. A “Resource Order” comprising design attributes is transmitted to the internal Network Resource Management layer (Bandwidth & Capacity Provisioning) and/or to an external Resource Provider (for procured capacity). When the required resources are available, the Service Configuration & Activation process implements the end-to-end customer service, performs end-to-end tests, registers the new service in the Service Configuration Data Base (Service Inventory & Configuration Management), and activates the service.
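The design step that turns a Service Initialization Request into a Resource Order can be pictured with the sketch below. It is an assumption-laden illustration, not a prescribed method: the record fields, the selection rule (lowest latency first, then add a backup route if availability falls short) and the treatment of route failures as independent are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ServiceInitRequest:      # hypothetical SIR record
    service_type: str
    max_latency_ms: float
    min_availability: float


@dataclass(frozen=True)
class RouteOption:             # prior knowledge about one candidate connection
    name: str
    latency_ms: float
    availability: float


@dataclass(frozen=True)
class ResourceOrder:           # hypothetical Resource Order record
    service_type: str
    main_route: str
    backup_route: Optional[str]


def design_service(sir: ServiceInitRequest, options: List[RouteOption]) -> ResourceOrder:
    """Pick a main route, plus a backup route if needed, that meets the SLA."""
    usable = sorted((o for o in options if o.latency_ms <= sir.max_latency_ms),
                    key=lambda o: o.latency_ms)
    if not usable:
        raise ValueError("no route meets the time constraint")
    main = usable[0]
    if main.availability >= sir.min_availability:
        return ResourceOrder(sir.service_type, main.name, None)
    for backup in usable[1:]:
        # Assumption: the two routes fail independently of each other.
        combined = 1.0 - (1.0 - main.availability) * (1.0 - backup.availability)
        if combined >= sir.min_availability:
            return ResourceOrder(sir.service_type, main.name, backup.name)
    raise ValueError("no route combination meets the availability target")
```

In practice the independence assumption only holds when the two routes share no cable, site or power supply, which is exactly what the configuration data base is expected to reveal.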
10.6.2.2 Incident Management (As)
Incident Management in the EPU context is often treated as a merged Service and Network Management process when communication services are delivered exclusively (or almost exclusively) through internal resources. For the sake of generality, here we distinguish Service Incident Management (Service Problem Management) and Network (or Resource) Problem Management. Service Incident Management comprises essentially the creation and closure of Trouble Tickets when a service problem is encountered by the User (User Problem Handling) or notified by the Network Problem Management (through fault detection and service impact analysis). The Service Incident Management shall keep track of Service Problems, shall assess the impact on the User (service lost or only impacted, e.g. loss of resilience), and shall follow up to the resolution, determining the restoration time and the down-time for the service. Incident Management is a “normal time” process which can switch into a Disaster Management mode when the extent of the incident (or incidents) takes extraordinary dimensions. The boundary between Service Incident management and Network Resource Problem Management in this case disappears, giving an overall “crisis management” cell with close links to the field maintenance organization.
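A trouble ticket that records the impact on the User and yields restoration time and down-time, as described above, might look like the following sketch. The field names are hypothetical, and the rule that down-time is counted only when the service was actually lost (not merely impacted) is an assumption made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class TroubleTicket:
    service_id: str
    opened: datetime
    service_lost: bool                   # False: only impacted, e.g. loss of resilience
    restored: Optional[datetime] = None  # set when the ticket is closed

    def close(self, when: datetime) -> None:
        """Close the ticket at the time the service was restored."""
        if when < self.opened:
            raise ValueError("restoration cannot precede ticket creation")
        self.restored = when

    def restoration_time(self) -> Optional[timedelta]:
        """Time from ticket creation to resolution; None while still open."""
        if self.restored is None:
            return None
        return self.restored - self.opened

    def downtime(self) -> Optional[timedelta]:
        """Service down-time; zero if the service was only impacted."""
        elapsed = self.restoration_time()
        if elapsed is None:
            return None
        return elapsed if self.service_lost else timedelta(0)
```

Summing `downtime()` over the tickets of one service gives the raw figure needed for availability reporting against the SLA.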
10.6.2.3 Service Quality Management (As)
Assuring the Quality of Service delivered to the Users comprises different tasks:
• Produce Availability and Service Continuity plans
• Assess Service Resilience & Survivability
• Produce Time Performance, Traffic & Throughput Objectives in relation to SLA obligations
• Monitor end-to-end Quality of Service and produce Performance Reports
• Detect service degradations through monitoring and through User Problem Handling
• Analyze Quality issues and report to Incident Management
Service Quality Management sets up appropriate measurement tools and detection mechanisms in order to detect degradations before they impact the User applications.
10.6.2.4 Service Change Management (Su)
Service Change Management is the process continuing the User Change Management already described. It translates a User request for change into a series of steps employing standardized methods and procedures ensuring minimal disruption of services. In conjunction with the User and with the Network Change Management, a change planning schedule is defined together with a scenario which may include intermediate steps to avoid service interruptions. In the EPU context, the request for change may also be initiated by a change of the EPU-owned infrastructure (e.g. the power network structure modified due to some new substations being added). In fact, the EPU being at the same time the communication service end-user and the telecom provider entity’s transmission infrastructure provider, it is sometimes difficult to fully formalize the provider/user relationship concerning change management.
10.6.2.5 Service Inventory & Configuration Management (Su)
In its simplest form, Service Inventory in Utility telecoms is the set of telecom drawings and tables which represent the logical interconnections related to each application (e.g. SCADA RTU communications or operational voice network).
In a larger scale network, this is the information base containing all provisioned (or delivered) logical interconnection services together with their attributes, specific parameters and requirements, supporting sub-networks and downtime/restore obligations (Service Configuration Management Data Base, S-CMDB). Service configuration management is the process of keeping track of all services in a configuration data base. There may be a close coupling (or identity) between Service Configuration management and the Network Resource allocations across a Utility-owned network, performed in Network Configuration management.
10.6.2.6 Service Policing & Usage Metering (Ac)
Services are delivered to the User with certain capacity and throughput. Multiplexed circuits are limited in capacity by the interface and circuit allocation fixing the network bandwidth usage. However, Ethernet-based and IP services interfaced at high rates (e.g. 100 Mbps) may require service policing, limiting the traffic handling capability of the User interface well below the interface maximal capability. This is essential for avoiding the saturation of the network plane to which the service is allocated, and therefore for preserving the quality of service contracted through each SLA. Service policing is the process of limiting the throughput associated with each service; the limit can be increased at the User’s request without physical intervention at site (User Change Management process). Moreover, the Telecom Service Provider must capture the cost of service according to prior Service Pricing performed in the Upstream Management tasks (establishment of the Service Catalog & Pricing). Even if no internal cost settlement is performed, the estimation of the cost of service per user entity still provides valuable data which can be an important aid to decision-making.
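One common way of limiting a service's throughput below the interface rate is a token bucket; the sketch below uses it purely as an illustration, since the text does not prescribe a policing mechanism. The class name, the explicit timestamps and the drop-on-excess behaviour are assumptions of the example.

```python
class TokenBucketPolicer:
    """Admit frames up to a contracted rate, with a bounded burst allowance."""

    def __init__(self, rate_bps: float, burst_bytes: int) -> None:
        self.rate_bytes_per_s = rate_bps / 8.0
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)   # the bucket starts full
        self.last_time_s = 0.0

    def set_rate(self, rate_bps: float) -> None:
        """Change the contracted rate without touching the interface."""
        self.rate_bytes_per_s = rate_bps / 8.0

    def allow(self, now_s: float, frame_bytes: int) -> bool:
        """Return True if the frame conforms; a non-conforming frame is dropped."""
        elapsed = now_s - self.last_time_s
        if elapsed < 0:
            raise ValueError("timestamps must not go backwards")
        self.last_time_s = now_s
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(float(self.burst_bytes),
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if frame_bytes <= self.tokens:
            self.tokens -= frame_bytes
            return True
        return False
```

Calling `set_rate` corresponds to raising the limit on a User Change Request: the physical interface stays the same and only the policing parameter changes.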
10.6.3 Network Resource & Infrastructure Management
10.6.3.1 Bandwidth & Capacity Provisioning (Fu)
A Resource Order issued by Service Configuration & Activation is processed by the Network Bandwidth & Capacity Provisioning. This process allocates network bandwidth and/or capacity across multiplexed or switched sub-networks subject to resource availability and keeps track of resource usage and capacity. The allocated resource capacity is registered and managed by the Network Configuration Management described as another process in this layer. Similarly, the Bandwidth & Capacity Provisioning is in charge of recovering liberated bandwidth when a service is discontinued.
10.6.3.2 Network Problem Management (As)
Network Problem Management is the process that manages network incidents (a “problem” being an unknown underlying cause of one or more incidents). Network incidents can be reported through the Fault Management (equipment alarms), through the Service Incident Management, or through Performance Management processes. Network Problem Management shall localize and diagnose the root cause of network anomalies using different Management System components (Element Managers, Overall Event Management Systems, Performance Monitoring Systems, etc.) and shall identify a work-around or a permanent resolution for the problem. It shall issue appropriate work orders towards the Field Maintenance teams if corrective action in the field is required. The objective of Network Problem Management is to minimize the adverse impact of incidents and problems caused by network anomalies and to prevent the recurrence of incidents. When the dimensions or extents of network problems grow substantially, this process shall merge with the Service Incident Management into a Disaster Management process.
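The bookkeeping behind Bandwidth & Capacity Provisioning (allocate subject to availability, track usage, recover capacity when a service is discontinued) can be sketched for a single link as below. The class and its methods are hypothetical; a real network would hold one such record per link or sub-network in the configuration data base.

```python
from typing import Dict


class LinkCapacity:
    """Tracks bandwidth allocated to services on one link (illustrative only)."""

    def __init__(self, name: str, capacity_kbps: int) -> None:
        self.name = name
        self.capacity_kbps = capacity_kbps
        self.allocations: Dict[str, int] = {}   # service id -> allocated kbps

    def free_kbps(self) -> int:
        return self.capacity_kbps - sum(self.allocations.values())

    def allocate(self, service_id: str, kbps: int) -> bool:
        """Reserve bandwidth for a Resource Order; False if capacity is short."""
        if kbps <= 0 or service_id in self.allocations:
            raise ValueError("invalid or duplicate allocation request")
        if kbps > self.free_kbps():
            return False
        self.allocations[service_id] = kbps
        return True

    def release(self, service_id: str) -> int:
        """Recover the bandwidth of a discontinued service; returns kbps freed."""
        return self.allocations.pop(service_id, 0)
```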
10.6.3.3 Network Performance Management (As)
Network Performance Management ensures that the communication resources delivered to the service layer meet the performance objectives set for the particular class of connection. In particular, the time constraints, availability performance and throughput for each type of network connection must be assessed and monitored. Any performance degradation must be reported as a network incident (or event) for escalation towards the Problem Management.
10.6.3.4 Network Fault Management (Su)
Managing network faults and alarms is the most basic task for maintaining the EPU telecommunication network resource in operational state. This is generally performed through dedicated Element Managers associated with each type of equipment and possibly an SNMP-based “umbrella” management system, collecting alarm information from the different types of equipment constituting the network infrastructure. Network Fault Management is the principal source of information for the Problem Management process, allowing the localization and analysis of network anomalies. A particularly useful complement to equipment fault and alarm collection is the capability to determine the Service Impacts of network faults, which enables linking this process directly with Service Incident Management. Network faults collected through the fault management process originate from different layers of network infrastructure, often with overlay architecture (e.g. multiplexers and switches communicate over transmission equipment which connects over a fiber connectivity composed of cable segments, etc.). Proper visualization of alarms in an overlay mode facilitates the task of root cause analysis described in Network Problem Management.
10.6.3.5 Disaster Management (As)
Basically this is the same process as Incident Management and Problem Management, but in a situation of multiple critical incidents related to the same cause such as storm, flood, major breakdown, or attack.
When a disaster situation is declared, the applicable management process switches from the current incident management to a special process with particular tools, staff duty schemes and process flow, giving a primary role to infrastructure support with priorities based upon the continuity and criticality constraints of the different services.
10.6.3.6 Network Configuration Management (Su)
Network Configuration Management is the “Resource” counterpart of the Service Inventory & Configuration Management described earlier. It is in charge of keeping an updated view of all assets, their configurations and interactions across the network through the Network Configuration Management Data Base (N-CMDB). The N-CMDB must in particular label all assets and links with unique Identifiers and keep track of asset information (versions, ownerships, documentations, factory serial numbers and acceptance tests, etc.). Configuration management is in charge of ensuring that no modifications or additions, removals or replacements are performed without prior notification and approval. In particular, this process is in permanent coordination with the Change Management process and Field Works to assure that the records in the CMDB remain correct at all times. Regular reviews with planning and change management processes allow the Configuration Management to establish different projection views of the network infrastructure (e.g. the network configuration as running today, after the current works, by the end of the on-going project, by the end of the planned transformations, etc.).
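The overlay view of alarms mentioned under Network Fault Management relies on knowing which asset is carried by which lower-layer asset, information that the N-CMDB can supply. The sketch below shows one simple correlation rule under that assumption: an alarmed asset is kept as a root-cause candidate only if nothing beneath it is alarmed. The function name, the `carried_by` mapping and the asset identifiers are hypothetical.

```python
from typing import Dict, List, Set


def root_cause_candidates(carried_by: Dict[str, List[str]],
                          alarmed: Set[str]) -> Set[str]:
    """Return alarmed assets that have no alarmed asset in their supporting layers."""

    def has_alarmed_support(asset: str, seen: Set[str]) -> bool:
        # Walk down the overlay; 'seen' guards against cyclic records.
        for lower in carried_by.get(asset, []):
            if lower in seen:
                continue
            seen.add(lower)
            if lower in alarmed or has_alarmed_support(lower, seen):
                return True
        return False

    return {a for a in alarmed if not has_alarmed_support(a, {a})}
```

With a multiplexer link carried by a transmission section carried by a fibre segment, alarms on all three reduce to the fibre segment alone, which is where the work order would be directed first.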
10.6.3.7 Network Change Management (Su)
Network Change Management is the companion process of Service Change Management. In conjunction with Network Configuration Management, this process ensures that all network changes (add, modify, remove, replace) are performed using standard methods and procedures and with minimal disruption of services. In particular, the assessment of the impact of works on existing services, the intermediate steps, authorizations and time scheduling for service interruptions must be performed in coordination with Service Change Management and the impacted users.
10.6.3.8 Network Maintenance (Su)
Field maintenance of the telecom network infrastructure is the most demanding part of the provision of telecom services through internal network resources. Field maintenance requires managing a relatively large workforce and logistics, depending on the number and geographical dispersion of sites. The required intervention time of the maintenance staff determines the geographical radius that can be allocated to a single “maintenance base”. If the number of sites in this radius, and consequently the expected workload, is too low, then one of the following solutions must be envisaged:
• Share the “maintenance base” with other telecom infrastructures through contracting of field works to an external multi-customer service contractor
• Share the “maintenance base” with other EPU field activities (e.g. protection, SCADA, substation, etc.)
The first solution results in a fair level of telecom skills but generally requires costly electrical certifications and training for the “non-EPU” external maintenance staff. The second solution results in a relatively lower telecom specialization (multi-skill staff) but with all required electrical certifications and training.
10.6.3.9 Asset Lifecycle & Spare Management (Su)
Asset Lifecycle management is the process that keeps track of the state of all equipment and infrastructure components constituting the telecommunication facilities of the Utility telecom provider (ordered, received, under test, live, under repair, withdrawn, etc.). This also includes all Information Systems and IT platforms used for the management of the service and of the network. Asset Lifecycle management, in coordination with the different equipment suppliers, must anticipate the end-of-life of equipment and the non-availability of spare parts, and must constitute stocks of spares as necessary. It must also keep an inventory of non-allocated equipment and spares dispersed among different maintenance centres and spare stores.
10.6.3.10 IT Tools Management (Fu+As+Su)
Managing telecom service and network increasingly employs information systems and IT platforms as described in section 10.7, some of which have already been mentioned above. The associated IT governance and management is not something with which EPU telecom staff is generally familiar. Integrated IT and Telecom Service Provider entities are clearly advantaged in this respect and already possess the required methods, processes and skills. Managing dedicated IT is extensively treated in ITIL (see appendix) and the issue is a general one across the EPU. We therefore do not focus on this particular point. Some particularly important points are nevertheless recalled:
• Software Release & License Management
• Service Support Contracts for constituent Applications and Firmware
• Security & Patch Management
• Software Documentation & IT Configuration Management
10.6.3.11 Estimate cost of running the network infrastructure (Ac)
The operations management of the Utility telecom service provision must keep track of its running costs. This will in turn allow the upstream management to define the service prices for the “service catalogue” or to define the basis for cost repartition between internal user entities. The cost estimation must include the running costs for the associated management IT tools.
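As an illustration of cost repartition between internal user entities, the sketch below shares a running cost in proportion to a usage measure. Proportional sharing is only one possible rule and is assumed here for the example; the actual rule is set by the upstream management, and the function and entity names are hypothetical.

```python
from typing import Dict


def repartition_costs(running_cost: float, usage: Dict[str, float]) -> Dict[str, float]:
    """Split a running cost between user entities in proportion to their usage."""
    total = sum(usage.values())
    if total <= 0:
        raise ValueError("total usage must be positive")
    return {entity: running_cost * share / total for entity, share in usage.items()}
```

The usage measure could be allocated bandwidth, number of circuits or any other metered quantity collected by Service Policing & Usage Metering.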
10.6.4 Provider/Contractor Relationship Management Process
External Providers and Contractors (P/C) are increasingly employed in the telecom provision process of the EPU. Some of the major cases of their use are as follows:
• Field Maintenance & Support
• Special Communication Services (e.g. satellite & mobile services)
• Point-to-point bulk connectivity where the EPU does not have infrastructure
• Operation, Administration & Maintenance of the whole network
• Individual skilled worker to be integrated in the EPU telecom entity organization
10.6.4.1 P/C Service Contracts (SLA/OLA) Setting and Adjustment (Fu)
This process comprises the selection of contractors, determining the way they shall interact with the provision process and consequently setting up the contractual content of their services through SLAs.
10.6.4.2 Providers/Contractors Performance Management
The performance of external P/C must be monitored against the contractual SLA. Any non-conformity and degradation must be tracked and reported for resolution with the P/C. In particular, leased circuits, dark fibre and bulk capacity need to be monitored carefully as they directly impact the performance of the communication services delivered to Users and may result in a breach of the telecom provider’s own SLA obligations.
10.6.4.3 Provider/Contractor Problem Reporting (As)
The performance monitoring process described above may lead to the reporting of a problem, as would a number of other situations. In general, when external P/C are used in the Operations of the Utility Telecoms, it is necessary to define the process for reporting and resolution of their corresponding anomalies.
10.6.4.4 P/C Support Management (Su)
The Utility telecom provider must support its external providers and contractors and respond to their contractual requests.
Some typical P/C support management issues in the EPU context concern:
• EPU Site Access Permits
• Security clearance for contractor staff
• Electrical environment safety training & certification
10.6.4.5 P/C Cost Assessment and Invoice Settlement (Ac)
This process comprises the reception of Providers/Contractors invoices and the issuing of settlement orders after assessment and approval. An important task relating to Providers/Contractors relationship is the periodic assessment of their costs and the control of their invoices in order to maintain the overall cost of contracted services and resources at an appropriate level. This comprises the assessment of alternatives and the periodic renegotiation of contracts.
10.6.5 Enterprise Processes impacting Telecom Service Delivery
10.6.5.1 Security Management
The overall Security Policy of the Electrical Power Utility must be used to set up the necessary specific security rules and to ensure their application. If the telecom provider entity is not directly under the coverage of the Security Rules of the EPU, then it is necessary to define a Security Policy in coordination with the EPU Security Policy. Periodic audits must be applied to ensure that the Security level is suitably maintained.
10.6.5.2 DR/BCP
Disaster Recovery and Business Continuity planning have already been discussed in a separate chapter. The telecom provider entity must either be under the coverage of the EPU DR/BCP, in which case the entity must have appropriate processes to ensure their application, or have its own DR/BCP which must be elaborated in coordination with the EPU communication user entities’ plans. Test plans and periodic audits are required to ensure that the capabilities of the telecom provider are maintained in time (e.g. evacuation and switch-over to back-up facilities in a disaster situation).
10.6.5.3 Human Resources and Skill Management
Maintaining a suitably trained and skilled workforce which withstands the retirement of aging employees and the mobility of the younger skilled staff is an enterprise process with significant impact on the proper operation of the telecom provider entity. Two issues need particular attention in the present day EPU:
• Not losing knowhow and expertise with retiring staff
• Acquiring knowhow on new technologies when deployed
Process types (columns of the figure): Fulfillment (Setting up); Assurance (Delivery, Running); Support (Enquiry, Supervision & Maintenance); Accounting (Metering, Cost Repartition, Invoicing & Settlement)

User/Customer
• Fulfillment: User Order Handling; User Change Mgt
• Assurance: User Problem Handling; User SLA Management
• Support: Service Enquiry Desk (User Technical Support)
• Accounting: Customer/User Entity Invoicing & Settlement

Communication Service
• Fulfillment: Service Configuration & Activation
• Assurance: Incident Management; Service Quality Mgt
• Support: Service Inventory/Config. Mgt; Service Change Management
• Accounting: Service Policing & Usage Metering

Network Infrastructure & Resources
• Fulfillment: Bandwidth & Capacity Provisioning
• Assurance: Network Problem Mgt; Network Perf. Management; Disaster Management
• Support: Net. Configuration Mgt; Network Change Mgt; Fault Management; Network Maintenance; Asset Lifecycle & Spare Mgt; Management Tools Support
• Accounting: Estimate running cost of network infrastructure (+ tools)

External Providers/Contractors
• Fulfillment: P/C SLA Setting & Adjustment
• Assurance: P/C Performance Mgt (SLA Monitoring); P/C Problem Reporting (Supplier Relationship Mgt.)
• Support: P/C Support Management (Site Access Permits, Safety & Certification)
• Accounting: P/C Cost Assessment & Invoice Settlements

Fig 10.12 - “Current Operations” Process for Utility Telecom Service Delivery and Provision
10.7 Management Tools and Information Systems
10.7.1 Introduction
Building organizational capability to deliver communication services also requires the deployment of appropriate management tools and information systems. These tools are used as appropriate in support of the network’s operation, customer relations, revenue collection and provider relations. The size and complexity of the required management tools depend upon the scale of the telecom activity and hence the complexity of management processes, as discussed earlier in the chapter. Although the great majority of EPUs have deployed dedicated vendor-specific Element and Network Management Systems (E/NMS) allowing the supervision and maintenance of each silo of technology, few have yet implemented wider-scope management systems that give an overall view of the network and its delivered services and assist the management staff in interacting with it. Figure 10.13 illustrates the KEPCO management architecture as deployed in 2006 [43]. This architecture covers the integration of the SNMP-managed part of the network together with alarm management of legacy communication equipment. The architecture also includes a second level composed of an “integrated NMS” as well as data structures and applications for network documentation, fault management, work management and line configuration.
Figure 10.13 – Kyushu Electric Power (KEPCO) Management Architecture [43]
10.7.2 Element & Network Management Systems
Although the present chapter focuses on tools covering the overall infrastructure, the service and the user/provider interactions, and therefore goes beyond vendor sub-system management, it should be noted that the basic level of management tools presently employed in the majority of EPUs remains the vendor-specific, dedicated Element and Network Management Systems. These systems cover Fault, Configuration and Performance Management for a single-vendor telecom subsystem in an in-depth and optimized manner. In particular, they allow an adequately trained maintenance engineer to perform preventive and corrective maintenance of fibre cores as well as board-level fault diagnostics, to read and modify equipment configurations and to program end-to-end connections across the uniform sub-system with the required protection mechanisms. Dedicated Element and Network Management Systems are widely documented by telecom equipment suppliers and are therefore not detailed further in this section.
Fibre maintenance systems are specific SCADA systems based on OTDR measurements performed on dedicated pilot fibres or in-traffic fibres in order to detect changes of characteristics or to localize damage. These systems reduce fibre down-time and hence improve overall availability. They also improve the efficiency of the maintenance process and of work scheduling.
A particular category of network management systems increasingly employed in the EPU operation-related environment comprises those which manage IP/MPLS and Ethernet packet networks. These tools may be vendor-specific or more general monitoring platforms and, in addition to managing element faults and configuration, allow the following tasks:
• Discovery of devices with an IP address
• Route tracking for packets belonging to each virtual network
• Traffic load estimation on devices and links
• Cyber-security attributes and events (from router firewalls, etc.)
A number of Japanese EPUs have made particular developments for their IP Network Management [44]. The objective of these developments has been smoother operation and maintenance through an adapted graphical display of network states (VPNs and Label Switched Paths, LSPs), enabling a visual perception of route information. Additionally, the network management system can be operated through graphical operation windows, and the O&M staff can set up routers and L2 switches through wizards, which helps avoid wrong settings and operations. This function requires only simple operations for end-to-end settings and facilitates system expansion.
10.7.3 Operation Support Systems (OSS)
The Operation Support System (OSS) consists of on-line and off-line information systems and corresponding applications supporting the operational management organization in its interactions and in its visualization of the network infrastructure, current problem solving, incident management and service monitoring tasks. Some particularly important functional requirements for a telecom OSS in the EPU environment are listed below [43], [45]:
Management Function: Fault Supervision and Problem Management (Root Cause Analysis)
Scope:
° Collect alarms and events from the whole telecom network
° Filter, de-duplicate, translate and correlate alarms
° Enable prompt identification of the root cause of fault from multiple alarms in an overlay of telecom technologies
° Provide troubleshooting information on identified fault using a common faults database
° Notify on-duty staff of the occurrence of specific events (or combinations)
° Enable remote switching or activating of units
° Mask alarms generated by work, based on work schedules inputted in advance
Management Function: Service Monitoring
Scope:
° Determine the impact of network faults on delivered services
° Monitor the quality of delivered services and generate alert on detecting degradation
° Monitor service availability and down-time on faults and works and generate service dashboards
° Monitor the performance of SLAs
° Generate periodical reports on services
Management Function: Work Order Management
Scope:
° Provide support for planning and adjustments of work schedules by automatically extracting other work on complementary pair, including work on redundant routes.
° Determine the necessity of checks on service soundness including absence of failure on redundant routes prior to the commencement of planned work.
° Support for control offices to regulate work and/or manage the progress of work at each place responsible for maintenance when major failures occur.
Management Function: Trouble Ticketing & Incident Management
Scope:
° Create trouble tickets
° Assign incident resolution to a particular person/department
° Keep track of the incident with possible escalation
° Close incident and perform statistics on resolution times
° Formalized incident reporting
° Monitor & report on incident resolution
Management Function: Inventory & Configuration Information System
Scope:
° End-user (and customer) inventory
° Service Catalogue and SLA inventory
° Service inventory and provisioning
° Inventory data of logical resources
° Inventory data of passive and active physical resources
° Inventory of application software and firmware releases and licenses
° Spare and warehouse inventory management
° Supplier / Partner inventory
° Capacity planning, change management and migration planning
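Two of the fault supervision functions listed above, alarm de-duplication and masking of alarms generated by planned work, can be illustrated with a minimal Python sketch. The record layouts (`Alarm`, `WorkWindow`), the function names and the sample data are all hypothetical and chosen only for illustration; a real OSS would take these from its alarm collection and work scheduling modules.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Alarm:
    source: str          # equipment raising the alarm (hypothetical identifier)
    kind: str            # alarm type, e.g. "LOS" for loss of signal
    raised_at: datetime


@dataclass(frozen=True)
class WorkWindow:
    source: str          # equipment under planned work
    start: datetime
    end: datetime


def deduplicate(alarms):
    """Keep only the earliest alarm for each (source, kind) pair."""
    earliest = {}
    for alarm in sorted(alarms, key=lambda a: a.raised_at):
        earliest.setdefault((alarm.source, alarm.kind), alarm)
    return list(earliest.values())


def mask_planned_work(alarms, windows):
    """Drop alarms raised on equipment during a registered work window."""
    def is_masked(alarm):
        return any(
            w.source == alarm.source and w.start <= alarm.raised_at <= w.end
            for w in windows
        )
    return [a for a in alarms if not is_masked(a)]


# Illustrative data only: two repeated alarms on mux-A, one alarm on mux-B
# which is under planned work for one hour.
t0 = datetime(2011, 4, 1, 10, 0)
alarms = [
    Alarm("mux-A", "LOS", t0 + timedelta(minutes=5)),
    Alarm("mux-A", "LOS", t0),
    Alarm("mux-B", "LOS", t0 + timedelta(minutes=1)),
]
windows = [WorkWindow("mux-B", t0, t0 + timedelta(hours=1))]

unique = deduplicate(alarms)
remaining = mask_planned_work(unique, windows)
```

In this sketch only the first alarm on `mux-A` is left for the operator; correlation and root cause identification would operate on the reduced list.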
Figure 10.14 illustrates a comprehensive management platform architecture for delivering these OSS functions, as envisaged by Japanese utilities [43].
Fig 10.14 – Outline of functional configuration of new Network Management System [43]
[Figure not reproduced. Legible blocks: HMI & Terminal Functions; Task Support Function Group; Workflow Engine; Task Functional Component Group; Information Platform; Network Management Function (Integrated NMS); Network Management Data Base; Element Management Systems and their interfaces (SNMP, Syslog, proprietary).]
An outline of each section of the figure is given below:
• EMS, Integrated Network Management Function, and Network Management DB
Monitoring information from the NEs is collected by the EMS (Element Management Systems). The Integrated NMS determines faults and identifies affected lines and services using configuration information from the network management Data Base. Moreover, the configuration management information in this Data Base is synchronized with the configuration information residing in the managed equipment and in the different EMS.
• Task Support Functions, Functional Components, and Workflow Engine
Task support functions (such as service status and equipment status management) are used for the supervision task. These functions are realized by combining various functional components, such as the system down determination function. The configuration of these task support functions (the combination of functional components) is defined in the workflow engine based on the task flows.
• Information Platform
Functional components operate on information from the integrated NMS and the network management DB. The information platform hosts this information and the functional components as if they were on a common bus, and links the functional components through loose clustering, in which functional components are independent of each other and separated at a coarse grain.
• HMI Functions and Terminal Functions
Various HMI functions such as GUI and electronic notice boards are provided for visibility, operability and convenience.
10.7.4 Inventory & Configuration Data Base
User, Service and Network Inventory are basic information system requirements governing the EPU telecom facility and business. In service- and resource-related areas, the inventory information is composed of the configuration data covering existing assets and the allocation, implementation, installation, configuration, activation and testing of specific resources and services. The information may be used by different processes to alleviate specific service or resource capacity shortfalls, availability concerns or failure conditions. The relative importance and necessity of the different inventory-related activities depend upon the scale and scope of the EPU telecom delivery model. Regardless of these delivery model considerations, however, the efficiency of the inventory system (and hence the speed and efficiency of the enterprise process) lies primarily in the accessibility of data across different sources and data domains and in the association of these inventory information domains. Inventory and configuration information systems are examined here in more detail in support of an efficient implementation of EPU operational processes [46].
Besides interrelationship, another important aspect of inventory information is the quality of data. Since inventory-related activities are a combination of manual and (possibly) automated tasks, the only way to maintain a high level of data quality is to build the corresponding attention into the enterprise processes. Data quality becomes extremely important when some process automation is implemented at the operational level (e.g. automated service and resource provisioning, alarm correlation, impact and root cause analysis, etc.).
The functionality of an inventory system typically includes the following features:
• Inventory retrieval – allows other EPU information systems to retrieve information from the Inventory system, either through attribute matching or on a request basis.
• Inventory update notification – generates notifications for client information systems based on changes to the inventory data (e.g. object creation, object deletion, attribute value changes, etc.).
• Inventory update – a connected EPU information system may request that the Inventory be updated according to its own data change.
• Inventory reconciliation – the Inventory system may look for and discover changes to some of its data records across other EPU information systems and consequently update its records, raising an exception in case of discrepancies.
• Inventory information model – typically a lot of detail needs to be added concerning the entities to be managed. Specific details will depend on the particular service / resource / technology.
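The first four features can be sketched as a toy in-memory inventory. This is only an illustration under stated assumptions: the class name, method names, record layout and sample objects are hypothetical, and the reconciliation step here merely reports discrepancies rather than updating the records as a production system might.

```python
class Inventory:
    """Toy in-memory inventory; names and record layout are hypothetical."""

    def __init__(self):
        self._records = {}       # object id -> dict of attributes
        self._subscribers = []   # callbacks of client information systems

    def subscribe(self, callback):
        """Register a client system for inventory update notifications."""
        self._subscribers.append(callback)

    def _notify(self, event, object_id):
        for callback in self._subscribers:
            callback(event, object_id)

    def retrieve(self, **criteria):
        """Inventory retrieval through attribute matching."""
        return sorted(
            oid for oid, attrs in self._records.items()
            if all(attrs.get(k) == v for k, v in criteria.items())
        )

    def update(self, object_id, **attrs):
        """Inventory update requested by a connected information system."""
        event = "changed" if object_id in self._records else "created"
        self._records.setdefault(object_id, {}).update(attrs)
        self._notify(event, object_id)

    def reconcile(self, discovered):
        """Compare records with data discovered elsewhere; list discrepancies."""
        discrepancies = []
        for object_id, attrs in discovered.items():
            known = self._records.get(object_id, {})
            for key, value in attrs.items():
                if known.get(key) != value:
                    discrepancies.append((object_id, key, known.get(key), value))
        return discrepancies


# Illustrative usage with made-up objects.
events = []
inventory = Inventory()
inventory.subscribe(lambda event, oid: events.append((event, oid)))
inventory.update("port-1", site="Substation A", status="in service")
inventory.update("port-2", site="Substation A", status="spare")
inventory.update("port-2", status="in service")
matches = inventory.retrieve(site="Substation A", status="in service")
diffs = inventory.reconcile({"port-1": {"status": "faulty"}})
```

Returning the discrepancies instead of overwriting the records leaves the decision (automatic update or exception handling) to the calling process, which is one possible design choice among several.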
In practical implementations, these basic inventory system features are always part of a broader context which needs to be defined before the implementation. This broader context is defined by the business requirements at the initial step of an inventory project lifecycle. Business requirements are high-level use cases which originate from enterprise processes and their corresponding expected business results. The importance and influence of the inventory system are illustrated by the following examples:
1. Automatic generation of User Notifications – The EPU telecom service provider may be required, as part of its SLA, to automatically generate and send advance e-mail notifications to customers (or end users) of a possible service outage due to planned maintenance actions. User notification is based on “service impact analysis”, which can only be performed if the “service to network mapping” is available in the inventory information system; its precision depends on the degree of resolution available in the inventory system.
2. Circuit provisioning across a multivendor network – The EPU telecom service provider needs to document the creation of new circuits across a multivendor network and to issue the corresponding work orders. A “path finder” application containing decision rules needs access to inventory information in order to perform its task.
3. Root Cause Analysis – Problem management in the EPU requires the identification of the root cause of an avalanche of alarms received from many network assets at different levels of the infrastructure. Performing this task requires knowledge of the configurations and of the relationships between the alarm-generating assets, which are recorded in the inventory information system.
A comprehensive multilevel and multi-domain inventory and configuration data base fulfils these requirements. Data models differ for each implementation due to the types of entities needed as well as the level of detail needed for each entity in the data model.
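As a rough illustration of how examples 1 and 3 depend on inventory data, the following Python sketch performs a service impact lookup and a very simple root cause filter. The two mappings, the resource and service names, and the single-parent dependency model are assumptions made for the example, not a description of any actual inventory data model.

```python
# Hypothetical "service to network mapping": resources used by each service.
SERVICE_TO_RESOURCES = {
    "teleprotection line 1": ["fibre F1", "mux M1", "mux M2"],
    "SCADA RTU 7": ["fibre F1", "router R1"],
    "office telephony": ["router R2"],
}

# Hypothetical dependency records: the resource each resource relies on.
DEPENDS_ON = {
    "mux M1": "fibre F1",
    "mux M2": "fibre F1",
    "router R1": "fibre F1",
}


def impacted_services(resource):
    """Service impact analysis: services that use the given resource."""
    return sorted(
        service for service, used in SERVICE_TO_RESOURCES.items()
        if resource in used
    )


def root_causes(alarming):
    """Keep alarming resources whose upstream dependency is not alarming."""
    alarming = set(alarming)
    return sorted(r for r in alarming if DEPENDS_ON.get(r) not in alarming)


# An avalanche of alarms caused by a single fibre fault (made-up data).
causes = root_causes(["mux M1", "mux M2", "router R1", "fibre F1"])
to_notify = impacted_services("fibre F1")
```

With these sample records the alarm avalanche reduces to the fibre, and the users of the two services routed over it would be the ones to notify. The finer the mapping held in the inventory, the more precise both results become.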
There are several sources of inventory information as described below.
End-user (Customer) Inventory
The user inventory maintains records of all telecom service users, their interactions with the enterprise, any contracts established, and any other customer-related information required to support CRM and other processes. While a customer inventory is important for a U-Telco type of enterprise, the EPU can manage with less detail here, since its actual customers are internal departments of the EPU. Instead of an extended customer inventory, the EPU shall maintain a detailed end-user inventory.
Service Catalogue and SLA Inventory
The product offering (or service catalogue) inventory maintains records of all product offerings, their interactions with the enterprise, and any other product-offering-related information required to support CRM and other processes. The product offering inventory is also responsible for maintaining the association between customers and purchased product offering instances, created as a result of the Order Handling processes. The EPU type of operation is not market-competitive in nature but fulfils a demand. For that reason the product inventory can be simplified, but not omitted altogether. The concept of a product is still used in inventory system implementations as the vital point of association for the business. For similar reasons, since the EPU telecom provider is generally not involved in retail sales of services (even in the great majority of U-Telco contexts), maintaining a Sales Inventory is often not an EPU requirement.
Service Inventory and Provisioning
The service inventory maintains records of all service infrastructure and service instance configuration, version, and status details. It also records test and performance results and any other service-related information required to support Service Management & Operations.
The service inventory is also responsible for maintaining the association between customers’ purchased product offering instances and service instances, created as a result of the Service Configuration & Activation processes. Service Configuration shall support aggregate customer-facing services. It is necessary to distinguish clearly between services and networks/resources. Services can be viewed as being composed of a number of building blocks, e.g. bandwidth, security, maintenance package, SLA, QoS, and specific features such as voicemail. Service configuration can be derived from order details in addition to inherent business rules from service specifications and the service view in the Service Inventory Management application.
Resource Inventory and Configuration
The resource inventory maintains records of the network infrastructure comprising logical and physical resources as well as passive resources such as cabling, cable ducts, etc. It documents resource instance configuration, version, and status details as well as test and performance results and any other resource-related information required to support Resource Management & Operations. The resource inventory is also responsible for maintaining the association between service instances and resource instances, created as a result of the Resource Configuration and Provisioning Management processes.
Configuration and topology information of each resource domain in the Inventory Management System shall be kept coherent with that in the resource type’s database (e.g. Element Manager Database). When possible this shall be done by resource discovery applications or uploaded from dedicated Element/Network Management Systems through update or reconciliation.
Resource Inventory is the basis for a number of important applications including spare modules and warehouse management, capacity planning, change management and network migration planning. These latter applications are used to discover and manage underutilized or ‘stranded’ resources including cable pairs, wiring and distribution panels, and other passive resources.
Resource Inventory requirements also include those related to application software and firmware releases and licenses, whether for telecom management or for telecom service delivery platforms, which may or may not be associated with the more general IT asset management information systems of the EPU.
Supplier/Partner Inventory
The supplier/partner inventory maintains records of all commercial arrangements with suppliers/partners, and any modifications to them. It also records all details of contacts with suppliers/partners as well as commercial information, including details of supplier/partner products and services, required to support Supplier/Partner Relationship Management. The supplier/partner inventory is also responsible for maintaining the association between product instances, service instances and resource instances. This inventory is particularly important for EPUs utilizing connectivity provided by public telecom service providers, where the association between rented facilities and the EPU resource and configuration database is highly important.
11 COST CONSIDERATIONS
Cost assessment for different modes of telecom service provisioning can only be performed in a specific manner for a particular EPU, depending upon its existing resources in terms of infrastructure, skills, service procurement opportunities, regulatory issues and service requirements. Similarly, as stipulated in the previous sections, many hybrid provisioning schemes combining service procurement and in-house provisioning may be adopted to fit optimally in any particular case:
• Only particular telecom services may be procured (e.g. mobile services)
• Only particular telecom services may be delivered in-house (e.g. protection relay)
• Access to some particular sites or zones may be through procured services (e.g. urban sites)
• Only some services or layers of infrastructure may be externally provided (e.g. maintenance)
In the present section, we only enumerate some important cost parameters that need to be taken into account when performing such an assessment, considering three principal modes of service provisioning:
• Build and operate
• Build and contract services
• Procure service
Asset ownership investment issues on the regulatory side
The costs associated with the provision of telecom services can broadly be classified into Capital Expenditure (CAPEX) and Operation Expenditure (OPEX). The former comprises the costs associated with the implementation of a telecom network infrastructure, and the latter those of running and maintaining a telecom network or of procuring services. Regulatory constraints in the deregulated environment generally render high CAPEX and low OPEX more attractive, because the expenditure on infrastructure may be declared to the regulator as an improvement of the reliability and security of the power system and hence incorporated into the pricing of the service. It should however be borne in mind that the same favourable regulatory environment can end up being an obstacle to the integration of non-operational services (and/or commercial revenue generation through external customers) across the same infrastructure. Similarly, the rehabilitation of the operational telecommunication facilities can be eligible for financing through international funds, which may not be used for procuring externally provided services.
Perimeter of Communication Services and Requirements
When comparing costs and solutions, defining the perimeter of services and requirements is often a tricky issue:
• When provisioning through procured services is envisaged, utilities tend to limit their requirements to the minimal operational services currently in use. However, when planning to implement a dedicated infrastructure, utilities dimension the system not only for present services but also for estimated new site extensions, new applications and the estimated rise in application bandwidth usage during the system lifetime.
• Non-operational services are generally not included in procured service cost estimations although these services result in some cost to the utility, usually borne by another department such as a “Deregulated Business” department, or alternatively “absorbed” by a project or maintenance group so that the true costs are not known. These services are most often planned into the dimensioning of dedicated networks.
• There are often auxiliary applications and services that would be implemented if communications were available but are rarely deployed when supplementary channels need to be procured.
• When assessing procured service costs, it is usually the “commercially available service” nearest to the required quality which is taken into consideration. There is often an important gap with respect to the actual requirement, meaning that the utility accepts a lowering of its operational expectations. The high cost of actually meeting the operational constraints for atypical services (if possible at all) is rarely taken into account. In the “build” scenarios, however, it is considered that the operational expectation cannot be lowered as technical solutions for meeting the constraints generally exist.
• It should also be noted that quality constraints may be related to the proper operation of a system (e.g. time latency in Protection Relays), related to the respect of certain national regulations or international recommendations (e.g. fault tolerance), or purely cultural, related to the behaviour of the previously employed systems or “how we thought they were behaving” (e.g. invulnerable)!
Capital Expenditure
The cost assessment for the telecom infrastructure needs to be complemented with information concerning the expected useful lifetime of the system components. Optical cable can be expected to have a lifetime of 15-25 years, an RF tower even longer, but modern communication equipment may need to be replaced due to unavailability of spare parts or incompatibility with (or performance limitation of) more recent system components. Varying lifetimes complicate the cost assessment even more when a procured service counterpart is considered: how can one estimate the cost of a currently available commercial service in 15-25 years’ time, or the cost of adapting the operational applications to whatever telecom service is available in future?
Moreover, it should be noted that the cost of optical cable, RF towers and electronic communication equipment does not cover all the CAPEX of a “Build” project. Some other important expenditure items that are often under-estimated are as follows:
• Installation Wiring, Cable Trays, Distribution Panels
• Licenses and Rights of Way
• Power Supply, DC Distribution, Batteries, Back-up Generators, Fuel Tanks
• Cubicles and Environment Conditioning, EMC Protection
• Management Facilities
• Application interfaces and converters
• The opportunity cost of outage times where critical assets such as transmission lines need to be taken out of service, for example for the installation of OPGW
• Spares
Many of these items still need to be provided in a Procured Service provisioning scheme: the Customer Premises Equipment of many public telecom operators does not fulfil power system interface requirements and is not conditioned for installation in the substation environment, necessitating cost provision in the assessment for interface converters, multiplexers, routers and switches as well as cubicles, DC distribution (or AC UPS for Customer Premises Equipment) and installation items. Moreover, if the provider’s scope ends outside the perimeter of the electrical substation (e.g. for safety and security reasons), one may also need to add shelters and links to the substation interfacing points. Alternatively, the Service Provider will need to include the cost of gaining and maintaining site access accreditation where its staff do need to access electrical substations.
Operation Expenditure, Management and Operation Support
Running a dedicated telecommunication network is costly and requires different categories of skilled workforce, organization, processes and tools, as already described in the previous chapters of the present document. Network Management information systems and tools, like other IT systems in the Utility, can be particularly costly to acquire and to maintain. The acquisition of some more complex management tools may only be justified above a certain critical size of the telecom facilities to be managed. When this critical size is not attained in the utility, sharing these facilities with other dedicated networks can be an attractive solution, pleading in favour of outsourcing.
Similarly, field maintenance of a network dispersed across a wide geographical area, while respecting service restoration time constraints, can lead to a great number of local maintenance centres with skilled staff and spare parts that are under-utilized most of the time. The operational cost can rapidly become too high to sustain. Sharing field maintenance through multi-skill utility staff (SCADA, protection, substation, etc.) or through external service contractors (i.e. build and contract services) can considerably reduce this cost.
However, it should be noted that even a full procurement of “managed telecom services” does not eliminate the cost of management and operation. Referring to the uTOM model developed in the previous chapter, the external provider relieves the utility of the process of Telecom Resource Management (i.e. the telecom infrastructure), but the following costs must still be taken into account when provisioning solutions are compared:
• Cost of Service Provider Relationship Management – assuring that the provider’s contractual obligations in terms of quality of service are met (Service Level Management)
• Cost of Service Management – adapting the provider’s services to the utility user’s service requirements, e.g. through coupling of services from different providers to meet fault tolerance and service restoration times that a single provider cannot assure
• Cost of Utility Customer/User Relationship Management – assuring that the user is receiving a satisfactory telecom service
It is a common mistake to assume that the external telecom Service Provider will replace the utility in the fulfilment of the above-mentioned tasks. This often leaves an important gap in the service provision chain, leading to an unsatisfactory service at the user level and conflict with the external Service Provider.
A certain degree of cost sharing on operation support and management can be devised in multinational Utilities covering EPU activities in several countries. This can typically cover a centralized support scheme, common procurement and other administrative activities, although it is often constrained by different national regulations and legislative issues, as well as by different optimal technology and provisioning schemes.
Skill related costs
A cost item which is often neglected when performing cost assessments is the cost of maintaining in-house skills and expertise. A first-level superficial analysis may consider that contracting of services (management, maintenance, full procurement of bulk capacity or process connectivity) leads to important cost savings in that the utility no longer needs to maintain in-house skills and a skilled workforce. However, experience shows that losing skills in the utility generally leads to the build-up of a “provision monopoly” for one contractor (e.g. the Telecom Service Provider), a rise in service cost and a degradation of service, due to the inability of the Utility to change its Service Provider (all the knowledge of the service residing with the external contractor to be replaced). In some other circumstances, outsourcing of services is not a means of saving on skilled workforce, but a consequence of a lack of skilled workforce or of the inability to attract or retain it (e.g. due to unattractive salaries). Finally, it should also be noted that maintaining in-house telecom skills is costly, and may be even more costly if the maintained skills are only used for specifying and supervising purposes. An adequate provision of operation cost is to be allocated for training, transfer of knowledge between generations and acquisition of new skills in the face of technological migration.
Risk related costs
Cost assessment of telecom services often lacks consideration of the Risk parameter, even though this is precisely the reason to envisage anything other than the normal operator service, as described throughout this document. A “Cost versus Risk” assessment must examine the “Liability chain” from the end-user to the end supplier and contractor. The level of liability that any external telecom supplier is willing to accept is completely out of proportion with the risk that the utility may incur if the telecom service were not operational at critical times. It is often recommended to set the liability of the telecom operator as high as possible (despite the impact on the service price), not to cover the consequences, but to assure responsiveness. The cost of non-performance of the telecom service comprises the cost of possible damage to EPU assets, the cost of sanctions and penalties that the EPU may have to face, and the loss of image for lack of performance with its associated costs in the competitive market environment.
Cost Comparison and Selection of the solution
In order to properly analyse solutions which have different CAPEX and OPEX cost profiles, it is useful to use appropriate economic models that enable valid cost comparisons to be made. One such standard economic technique is known as “Net Present Value Analysis”. In principle this technique converts a stream of future payments (OPEX costs such as continuing telecom service fees, maintenance expenses, ongoing management expenses, etc.) into their present value (taking into account the time value of money caused by inflation and opportunity costs) so that they can be added to CAPEX costs. This makes it possible to compare directly solutions with different mixes of CAPEX and OPEX costs. It is also possible to carry out a sensitivity analysis for different scenarios, such as changes to the inflation rate during the lifetime of the telecom service delivery. A simple web search will find ample references to this technique, which can easily be implemented using a spreadsheet or “Discounted Cash Flow” tables.
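The principle can be sketched in a few lines of Python instead of a spreadsheet. All figures below (CAPEX, yearly OPEX, the 15-year horizon and the two discount rates) are invented purely for illustration, and the function names are hypothetical; payments are assumed to fall at the end of each year and to be discounted at a single constant rate.

```python
def present_value(payments, rate):
    """Discount a stream of end-of-year payments (years 1, 2, ...) to today."""
    return sum(p / (1 + rate) ** year for year, p in enumerate(payments, start=1))


def total_present_cost(capex, yearly_opex, rate):
    """CAPEX paid today plus the present value of the future OPEX stream."""
    return capex + present_value(yearly_opex, rate)


# Invented figures, in arbitrary currency units, over a 15-year horizon.
YEARS = 15
build = {"capex": 1000.0, "opex": [50.0] * YEARS}     # high CAPEX, low OPEX
procure = {"capex": 100.0, "opex": [150.0] * YEARS}   # low CAPEX, high OPEX


def compare(rate):
    """Return the present cost of the (build, procure) options at a given rate."""
    return (
        total_present_cost(build["capex"], build["opex"], rate),
        total_present_cost(procure["capex"], procure["opex"], rate),
    )


# Simple sensitivity analysis on the discount rate.
build_6, procure_6 = compare(0.06)
build_10, procure_10 = compare(0.10)
```

With these made-up numbers the “build” option has the lower present cost at a 6 % discount rate, while the “procure” option becomes cheaper at 10 %, which shows why the sensitivity analysis mentioned above matters.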
12 FURTHER ACROSS THE HORIZON
Technological “prophecies” in telecommunications and information-related domains have often proven to be far from correct. Breakthroughs or unexpected limitations, new market segments or the vanishing of previously promising markets, company strategies, and economic and political drivers have in the past resulted in the least probable options. This short section can therefore only mention a few current trends which, if not deviated from or abandoned, may have a significant impact on the evolution of the nature and delivery mode of telecom services in the EPU. Many of these trends, such as Advanced Metering Infrastructures (AMI) and Smart Grid initiatives of all kinds, are already well engaged, but along multiple tracks and without any certainty yet as to the track (or tracks) which will in the end be universally adopted. The following discussion can be considered as the basis for further reflection across the power industry.
Five axes of reflection can be identified for discussing the evolution of EPU telecom service:
• Power System Applications, Practices and Development – Smart Grid
• EPU Organization and Environment
• Telecom and Internet Service Provider Environment and Service Offer
• Telecom Technology Evolutions
• Information Systems Evolution – Cloud Computing
Subjects and domains discussed in this section may be candidate domains for further exploration through future Working Groups in the CIGRE Study Committee D2.
12.1 Power System Evolution - Smart Grid
Worldwide concern for sustainable development in recent years has created a number of environmental, economic and social initiatives related to energy generation, delivery and consumption:
• Reduce the generation of energy from fossil fuels and the production of CO2 due to energy-driven activities,
• Coordinate more closely the generation, delivery and consumption of energy in order to reduce the general need for energy production and to reduce the impact of high “spot” energy prices in deregulated energy markets.
An important orientation associated with these initiatives is the Smart Grid, whose target is to assure secure, reliable and optimal delivery of electrical power from generation to consumption point, in the deregulated power environment, through an enhanced use of information and communication technologies.
Some of the main attributes of this “Smart” Grid are given below. Their fulfilment relies heavily upon the existence of secure, reliable and fast communications with wide coverage of the concerned sites and devices:
a. Self Healing - The prompt reaction of the power network to changes, through a highly coordinated automation system and “network-aware” protection schemes for rapid detection of faults and power restoration. These communication-intensive applications and their service requirements have already been described in section 3.1.
b. Enhanced Visibility and Control - Increased security of the power delivery system through better visibility of power flow and the network state across the interconnected, multi-actor, competitive market. These applications, covering the visibility of the system from the remote control platforms together with new ways to organize, synthesize, dispatch and display large amounts of collected information, are already illustrated in section 3.2.
c. Empower Consumers - Empowering the consumer means incorporating consumer equipment and behaviour into the design and operation of the grid. This implies reinforced “Demand Response” and “Load Control” through information exchange with the energy consumer and potentially with the electrical appliances in consumer premises, making it possible to regulate consumption and to absorb peaks. The Advanced Metering Infrastructure (AMI) and Smart Metering programs constitute the basis for this reinforced relationship, requiring bi-directional “real-time” communication access from the Service Provider to the individual domestic and industrial consumers.
d. Tolerant to physical and cyber-security attacks - New utility security monitoring solutions and their communication requirements have been discussed in chapter 5. Security also becomes an important attribute of the communication service for utility applications, as described in section 6.6. Moreover, disaster recovery solutions such as geographically distinct back-up control platforms, fast deployment and mobile facilities, described in chapter 7, are often part of the mitigation plan of utilities against physical attacks, raising the issue of routing of communications to facilities, sites and staff across the network.
e. Accommodate Green Power - The secure integration of the considerably increased dispersed power generation, mainly large wind farms but also “energy producing consumers” (solar, wind, etc.), implies the incorporation of statistical data and meteorological forecasts into the power dispatch program, as well as reliable two-way communications with dispersed generators constituting virtual power plants. In the case of a wind or solar farm, the “local communications” of the power plant extend over a large perimeter, sometimes hardly accessible, covering an array of tens of individual generators.
f. Optimize Assets through Monitoring - Asset and environment monitoring for an increased lifecycle of assets without compromising system reliability has been discussed in chapter 4.
As can be observed, the different components enabling the “smartness” of the grid have already been covered in terms of communication services in the present document. What still remains unclear at the time of preparation of this document is the extent of deployment, coverage and associated investment of power utilities for each of these attributes. This may have considerable impact on the overall provisioning scheme of telecom services in the EPU. One such “scale-setting” application is the Advanced Metering Infrastructure for access to consumers. Depending on the adopted applications and their communication requirements, the distribution utility may be driven towards a procured telecom service or towards building an extensive dedicated telecom network.
The concept of microgrids, autonomous islands comprising generation and consumption, will redefine the Energy Management architecture and its consequent communication service requirements, pushing the concepts of the deregulated energy market down to the community level.
Finally the management of devices may also be considerably modified with the change of scale in the number of smart metering devices to be monitored, evolving from a power utility type “monitoring system” into concepts and platforms that mainstream telecom operators have long employed for payphones, ADSL and cable modems.
12.2 EPU Organization and Environment
Liability and Regulatory issues
In a great number of countries the power sector has already undergone, or is undergoing, important changes, moving from a state-owned vertically integrated utility with a public mission to a competitive market situation with investor-owned companies. A natural trend has therefore been a strong focus on Return on Investment, with consequences for all activities which do not contribute directly and immediately to the creation of added value. This single focus of attention is gradually being counter-balanced to some extent by costly incidents, regulatory bodies’ directives, and liabilities: Return on Investment is being complemented by the Risk of non-Investment. Competitive companies will have to invest extensively in such domains as cyber-security, surveillance and physical protection of sites and installations, primary asset condition monitoring, etc. in order to meet regulatory requirements (e.g. NERC). The safety of the maintenance workforce is also getting renewed attention, through innovative information tools as already described, in order to avoid accidents. Continuity of supply is increasingly a requirement enforced through substantial financial penalties, encouraging the power company to invest in its energy management, grid management and asset management facilities and in the efficiency of its maintenance. Reliable telecommunications in this context stand to benefit as an enabling infrastructure, requiring important investments, not necessarily to create a Return on Investment but to avoid important consequential losses. On the other hand, regulatory bodies encourage investment in operation-related communications by accepting the expenditure as necessary for the security of the power system (and hence reflected in the total cost of power). This approach may considerably impact the telecom strategies of the EPU.
Workforce and Expertise issues
The Power Utility workforce is changing in distribution, skills and behaviour, requiring revised ways of operating.
• Ageing Workforce - Many European countries are facing workforce ageing due to demographic reasons, with waves of retirement creating a loss of “legacy expertise” while “legacy systems and technologies” still remain in operation.
• Shrinking Workforce - Competitive strategies in companies have led to reduced workforce and to externalization, reducing their ability to conserve skills and expertise on a wide scale.
• Accelerated Turnover - Younger generation technical staff increasingly do not remain a long time in one activity, making it difficult to develop wide knowledge and experience, in particular where multiple disciplines are concerned (e.g. Telecoms and Protection Relaying).
• Information and Communication technology is changing extremely fast, creating a strong dependence on an inexperienced or external workforce often unfamiliar with the power system applications and environment. Technical options and decisions can be taken without a full understanding of consequences.
Based on these observations, two important directions of focus and investment can be identified for power utilities in the coming years to mitigate workforce and expertise issues:
• Training and simulation tools for the technical workforce
• Embedded intelligence in operation & maintenance tools - This includes enhanced network management and situation awareness systems and data-rich maintenance tools. Remote assistance of field workers through enriched information exchange terminals is described in section 4.6 and [12]. Some utilities are investigating Expanded Reality for their maintenance workforce.
12.3 Communication Service Provider Environment
a. Distribution Utilities are the largest EPU organizations (both present and potential) using Public Communication Services, in particular for Smart Metering and Customer Access applications. Fixed Internet Service Providers and Mobile Operators are extremely serious contenders for this large potential market, which can itself impact the service catalogue of these operators. In the vision of Public Communication Service Providers, energy management at the customer side is only one constituent of a far larger set of services covering Home Automation, information delivery and general accessibility.
b. IP Convergence of Services - It is at present established that all fixed services provided by Public Operators can be (and will be) delivered at lower cost over IP (even if still generally with lower reliability). Fixed communication services will in future be delivered exclusively over IP.
c. Short life-cycle services - Added-value services have much shorter lifecycles and, under the pressure of market and competition, need to be modified continuously. It becomes impossible for a Utility to base any of its “long life-cycle” applications on the usage of these services, unless the EPU is prepared to modify its deployed applications accordingly.
d. Separation of Service Platforms (voice, Internet, file transfer, video, etc.) - In order to allow ever faster design of new competitive communication services, the service delivery platform (Service Factory) is being separated from the network infrastructure (Network Factory). The former is then purely an IT platform and the latter a large Ethernet transport infrastructure (Carrier Ethernet). It is also likely that the two activities will be performed by different companies: e.g. Internet Providers becoming Telecom Service Providers and incumbent Telecom Operators becoming Telecom Connectivity or Network Providers [47], [48].
e. Carrier Ethernet - The telecom “network provider” of the future will be a provider of Ethernet connections (E-lines and E-LANs). Liberating the Network Providers from added-value service provision will probably let them focus on the quality of service of their Ethernet offer with less structural complexity. The requirement for higher quality of connection service, due to the increase of time-sensitive and non-transactional data, may lead to special transport networks using Scalable Ethernet solutions (MPLS-TP or PBB-TE) delivering specific SLAs.
f. Mobile and fixed services of premium quality/reliability may be created to serve specific “blue-light” applications. This can be performed through sharing of costs and infrastructures (frequency band, cable, workforce, etc.) among different users, creating specific operators as an alternative to dedicated infrastructures. The case study in Portugal described in Appendix A2 is an example of this trend.
g. Satellite communications for specific applications may lead to the creation of specific service contracts and Service Providers, as illustrated in the Brazilian case study in Appendix A3.
12.4 Telecom Technology Evolutions
Telecom technology evolution is continuously impacting the way EPU telecom networks are implemented. Solutions considered as appropriate for large operator networks are constantly shrinking in size and cost, making them available for dedicated network implementation. In particular, the bandwidth and capacity borderline between operator and dedicated networks, as well as between backbone core and edge, is constantly rising.
A long-lasting technological race between the IT world and the traditional telecom world is still in progress, with unpredictable “next round” winners. Moving away in the early 80s from the initial separation of transmission and switching technologies, we were constantly given the idea that the future is in integration. The domain of Switching, enriched with voice and data integration, gave us the “Integrated Digital Networks” where computer technology gradually swallowed the telephone switch. The IP network revolution generated the idea that the telecommunications network is just a larger-size computer network integrating a great number of services. The network of the future would be application servers connected to end users by a network of large IP routers with optical modules integrated into them. This vision, announcing the imminent death of SDH technology, seemed (or seems) particularly frightening to the EPU community due to the requirements and constraints of some applications such as Protection Relaying.
A different message is starting to be heard at present, with the mainstream telecom actors talking of the separation of Service Providers and Transport (Network) Providers as mentioned in the previous section. New Optical Transport technologies (OTN), integrating the qualities of SDH while providing much wider packet capability, are being developed and standardized by ITU-T. IP/MPLS, the previously declared champion, is being adapted to the transport network, shedding much of the complexity due to its data network functionalities (now in the Service Providers’ scope) and gaining deterministic behavior and time performance. These changes may greatly impact the future of dedicated network technologies and service separation architecture in the EPU. Two important domains related to this issue are Optical Networking and Ethernet Transport:
Optical Networking
Optical technology progress in recent years has rendered feasible and cost-effective in the EPU context, a number of solutions which were previously too complex, too costly or not sufficiently reliable. Wavelength Division Multiplexing once reserved for large mainstream networks has made its way into dedicated EPU networks. In particular, Coarse Wavelength Division Multiplexing (CWDM) is now a very accessible technology for federating different
services (or groups of services) and communication modes (refer to section 9.3.2). Wavelength multiplexing allows a new dimension in service integration, keeping legacy TDM systems and Protection circuits on separate wavelengths from new core data networks. Dual transport planes (packet and circuit), available in many items of equipment, complement the wavelength multiplex for building a compact hybrid transport for dedicated networks.
Optical networking, which brought SDH/SONET in the late 80s and early 90s, is now moving towards the Optical Transport Network (OTN) standardized by ITU-T (G.709 revised Dec 2009). Operators are deploying OTN equipment at backbone, metro and access levels to enhance packet networks. Inclusion of WDM and enhanced OAM gives the OTN significant advantages over SDH, and a combination of OTN and Ethernet (carrying IP) is proving to be a very attractive industry solution for converged networks [49]. ITU-T OTN provides considerable flexibility through implementing an Optical Control Plane that automates the allocation of wavelengths to communications. A Reconfigurable Optical Add Drop Multiplexer (ROADM) allocates routes across a number of different wavelengths automatically at the time of route configuration. This constitutes the basis of Optical Transport Networks delivering packets as well as TDM circuits (Constant Bit Rate, CBR, traffic) in a scalable manner, combining cost-effective and flexible bandwidth usage with the low deterministic delay and restoration times available through SDH [50].
On the other hand, EPUs need to consider the presently available experience data on the lifecycle issues of different fibre cable technologies, knowledge of which was previously only theoretical or based on little experience. If the optical cable lifecycle proves to be less favourable than predicted, then substantial extra investment may need to be made available in the years to come. Moreover, as traffic expectations rise continuously, the replacement cycle of the cables is shortened, requiring additional disturbance and down-time for the associated power transmission lines.
Carrier Ethernet Technologies
For a number of years, the mainstream telecom industry has promoted IP/MPLS as the scalable solution for the backbone core of all IP or Ethernet-based networks. Many Utilities, in particular those carrying a large amount of administrative (corporate) traffic or commercial UTelco services across their network, have deployed an MPLS core composed of a few nodes interconnected through high-bandwidth Ethernet links. However, it is argued at present that MPLS presents too much complexity for a pure carrier function and does not provide the necessary guarantees for time-sensitive traffic. What is required in EPU networks is to deliver Carrier Ethernet services (E-line, E-LAN) in a scalable manner without too much complexity while keeping SDH-like QoS. This will allow simple migration from the present stage of Ethernet VLANs over SDH to a more scalable solution if and when necessary. Provider Backbone Bridge Traffic Engineering (PBB-TE), defined in IEEE 802.1Qay, and the MPLS Transport Profile (MPLS-TP), being defined by ITU-T and IETF, follow this same objective of providing a simplified and time-constrained solution for scalable Carrier Ethernet provision. This is a technological step which will certainly interest EPU telecom networks in future [51].
Network Management and Situation Awareness
Another important technology change which will impact EPU networks is the one related to the management of network infrastructures. Management platforms are growing from Element and Network Management to Service and Business Management. Moreover, from a previous situation of monolithic management platforms with high cost and maintenance effort, which were often out of reach for EPU networks, we are moving to interconnected multi-vendor modular management platforms, allowing the association of event management, incident management, performance monitoring, network inventory, etc. from different vendors. TM Forum is leading the way towards common information models and standard application interfaces, allowing gradual deployment of management infrastructures using Service Oriented Architectures (SOA). This will allow the EPU to tailor and build up its management system according to its needs over many years.
12.5 Information System Evolution - Cloud Computing
Cloud Computing is a method for providing IT resources (hardware, software and data) as “services” through the Internet in a scalable and elastic manner. In a sense, cloud computing is not a new concept; it comes from the evolution and large-scale adoption of services that are already in place, such as equipment housing, centralized storage sharing and server virtualization. The present requirement to focus on business processes and to reduce investments and management costs, as well as faster and less expensive broadband Internet access, drives organizations to look for these cloud computing services. The scope of Cloud Computing is generally divided into three main types of services:
• Infrastructure-as-a-Service (IaaS);
• Platform-as-a-Service (PaaS);
• Software-as-a-Service (SaaS).
The EPU environment, with its specific security and reliability requirements governed by specific regulations, is unlikely to adopt public cloud services in an extensive manner, at least in the coming five to ten years. However, driven by standardization and integration of IT and industrial processes, not forgetting the “green IT” demand, the adoption of private clouds in EPUs, is already in place and will grow gradually. This will concern both IaaS (e.g. servers and storage) and PaaS (e.g. databases and web-service platforms). Assuming that cloud computing in EPUs will be based on a private cloud principle, the impact in telecom services will be more on the way they are built, managed and used (i.e. migration to the cloud) than the way the services are accessed. In this way, the direct impact that can be identified at this stage would be extra bandwidth and QoS requirements resulting from the transfer of servers and storage from distributed technical rooms to the main datacenters where the cloud will be “housed”. An important potential application and an issue to be considered is disaster recovery. At present, typical disaster recovery implementations require often a lot of manual procedures,
activation time, frequent tests and revisions for business continuity plans, and low efficiency of asset usage due to cold or hot standby solutions. The adoption of cloud architectures can change the way we look at disaster recovery solutions, enabling the adoption of semi-distributed IT/Telco architectures where the cloud and the processing capabilities are spread across the main and the backup datacenters and used on an active/active basis. This can be achieved through long distance storage interconnection, metro-clusters, real time long distance database synchronization, balanced Internet access, server farms, etc. This possible evolution may render disaster recovery simpler, more reliable, more efficient and (theoretically) cheaper, but will also need communications in line with those requirements, which will necessitate investment in (or usage of) more resources (e.g. fibers, wavelengths, MPLS, etc.).
At the time of preparation of this document, to our knowledge, EPUs are not yet using cloud computing but are investigating it and, in some cases, only starting to plan its implementation for corporate organizational information systems. On the migration into a Public Cloud, there are still many concerns and questions to be answered, relating mainly to reliability and security. Many of the questions resemble those already discussed for procured telecom services:
• Service availability and capacity - How can it be guaranteed that the Service Providers will meet not only the present needs but also the future ones as they appear? How can the EPU internal capacity management processes be matched with those of the providers? And what about customization: will the providers be as flexible towards the EPU as the EPU has to be towards its internal customers?
• Failure remediation and disaster recovery - Are providers ready to offer failure remediation schemes and guarantees in line with those already existing “in house”? Is the EPU's current typical SLA model applicable and sufficient? Are the providers able to offer disaster recovery with the same SLA commitments as those existing at present for internal customers? At what price?
• Transparency - Once we ask a certain provider for a certain service in the cloud, we are getting that service from the cloud. What is the cloud? Who is behind the cloud? Are there subcontractors? Who will really operate our services? How do we meet our requirements along the chain?
• Ownership and security - EPU infrastructures/applications/data will go out of its physical and logical perimeter. Who really owns the information, the EPU or its Service Provider(s)? Can we guarantee information privacy and confidentiality? Will other providers/partners/customers with whom an EPU has confidential agreements accept such a model? How do we guarantee the same security requirements and commitments that we have in place today? How can we define an effective security policy if we lose or reduce the capability of security policy enforcement and auditing? Will we be able to audit our “cloud providers”? Isn’t that a huge and perhaps ineffective task? In case of a security breach, how do we track it and who is responsible? How will we put in place risk management processes in this new reality?
• Maturity - Nowadays even “corporate services” like email, portals or file transfer can play a significant role in the critical processes of EPUs. Are providers mature enough to support critical services or will the support be on some kind of “best effort” basis?
• Regulatory issues - How will the cloud model meet the regulatory requirements of EPUs? What are the legal issues if we are under a public concession contract?
• ROI - The model is quite complex and does not readily give a real idea of costs and benefits. Little information is at present available on the return on investment. EPUs still have to demonstrate the interest of the financial model for their organizations.
• Regarding the communication services and network, the first issue will be Internet access, which must be similar to the current datacenter bandwidth access (Gigabit?) and implicitly meet the same communications quality of service and security requirements. Are ISPs able to provide service levels such as we have “in house” today? Should access to the cloud be available everywhere (fixed or mobile access)? Are ISPs prepared to commit to the necessary SLAs?
Further information on cloud computing as well as available platforms, providers and different services may be found in [52] and [53].
APPENDICES
A1. IP Voice in Utility Telecoms
Hungary
The Hungarian TSO, MAVIR, has operated a digital Operational Telephone Network since 1992, connecting the telephone exchanges of the NDC, the reserve NDC and the high-voltage substations (at present 2+26). Although this network is independent of the other telephone networks in use (the nationwide administrative telephone network of the EPUs and the public telephone network), these exchanges also act as concentrators, collecting the different types of lines to the dispatching desk. At first the connections among the exchanges operated through PLC and microwave links, and later over the optical network. From 1997 the network used a double tree structure (6 substation exchanges at a higher level than the others), the connections being 2 Mbps trunk lines and/or 4-wire E&M. Since 2007 this telephone network has operated as one of the districts of the nationwide administrative telephone network of the EPUs.
The IP migration of this operational telephone network began in 2009. For the NDC, the back-up NDC and 4 selected substations, LAN connections were developed in place of the double tree structure mentioned above. These LAN connections operate over double routes (through the NgSDH and SDH networks on 2 Mbps and 10 Mbps lines). The hardware changes are: routers, a Main Control Server/Enhanced Survivor Server at the NDC/reserve NDC (with an MPLS connection between them) and, at the 4 substations, gateways, Local Survivor Processors and port cards for analogue trunks and subscribers. The IP telephones at the substations operate from the Main Server or, if necessary, can also work from the Local Server. This migration will continue at the other substations.
Reasons for IP migration and expected results:
• Only IP telephone exchanges are now produced.
• If both connections are cut off, the substation sites can work independently and reach everywhere through the public network.
• Improvement in voice quality is not probable.
• Simultaneous use of the mobile service and the subscriber set is available.
• Hopefully easier management (access to a single central element is sufficient).
• Central management of the call-number lists stored in the telephone sets.
Portugal
The Portuguese TSO, REN, has a nationwide TDM Operational Telephone Network that provides voice and data services to the electricity and gas dispatch centres (main and redundant), electricity and gas substations and corporate buildings. Additionally, this telephone network is interconnected with the telephone networks of other electrical companies and with the public telephone network. In 2006 it was decided to start evolving the telephone network towards IP telephony, since a nationwide MPLS/IP network had been implemented that could provide IP VPNs for the interconnection of substations and dispatch centres into the same “WAN/LAN”. This project had the following guidelines and objectives:
• Analyze the differences between VoIP implementations from different vendors. In this case 3 different vendors were tested.
• Compare the implementation of VoIP systems in office and substation sites in terms of usability, redundancy and operation.
• Implement a call centre in order to support the Telecom and IT helpdesk activities.
The following sites were selected:
• 2 office buildings (400 users).
• 6 substations with an average of 50 user extensions each.
• IP Call Center implemented in the central site with 20 IP users distributed across 4 sites.
• 10 remote wind generation sites (8 extensions each: 6 analog modems and 2 IP phones).
The following requirements and general project options were adopted for the installation:
• Each IP gateway is registered in the central redundant system.
• Each site has a survival mode that allows internal services to be provided even without the central VoIP management system.
• For mobile service inside the substations the choice was DECT technology, since WiFi is more expensive and not suited to the industrial environment of the substations.
Several difficulties were found during the implementation. In summary, the following aspects capture the problems detected during the implementation and testing of the VoIP systems:
• As several models of modem and fax equipment are used across the network, communication problems were detected in some cases (“pass-through”/“relay” configuration).
• The IT network is composed of several elements (firewalls, DHCP servers, DNS servers, routers, switches, etc.); where there is interdependency between the VoIP network and all these components, complexity increases.
• Mobile communication inside the substation showed some problems, namely in the “hand-over” of calls between cells (1 vendor).
• Difficulty, and in some cases inability, to interconnect several services: BRI connections, analog connections (FXS, FXO and E&M).
• Certain configurations implemented in the central management system were found not to be reflected in the network equipment, giving differences between the local and central management systems (e.g. gateways registered through MGCP into the VoIP management system).
• Difficulties managing the site survival mode.
In summary, the usability of VoIP services in the substation environment requires different configurations and features in order to improve the overall system. The principal advantages that emerged from the project of implementing VoIP systems in different locations were:
• Use of the internal WAN/LAN network, with no provisioning of dedicated point-to-point TDM services.
• Reduced operational costs (on a per-call basis).
• More flexible usage (registration of VoIP phones, soft-phone installed on a notebook, etc.).
• Possible use of the SIP protocol, allowing the use of telephones from different vendors and a less expensive connection to the public operator.
• Integration of PC and Telephone (web pages for several configurations, integration with Outlook so that new functionalities such as “click-to-call” can be used, etc.).
• Use of corporate directories with addresses in the telephone.
Netherlands
The Dutch distribution network company, Stedin, which manages gas and electricity networks composed of around 200 communication nodes (10-150kV electricity and gas stations) and a total of 650 substations to cover, is deploying an IP telephony network for operational and operation-support voice communications. The company uses public mobile phone (GSM) services, which are however unavailable most of the time in case of emergency. Pagers (better availability than GSM) as well as the TETRA mobile system of the Public Safety services (Police, Fire brigade, Ambulance Services, etc.) are also used. In case of electricity failure and lack of public telephone services, the technical staff must travel to the nearest station and use the company’s dedicated telephone facilities. The dedicated telephone system, which is migrating to IP, can operate without connection to the public network, hence allowing communication between HV stations and Control Centres. The network uses redundant communication servers placed in the two geographically distinct control centres. A number of process efficiency features are sought from the new IP telephone network, including:
- A new numbering plan based upon geographical zones, electricity or gas, High Voltage or Medium Voltage, functional departments, etc.
- Calling party authentication and display/control of caller attributes (identification, formal qualifications, etc.) on the called party screen. This feature is important for working with multiple contractors.
- Additional functionality such as special conference calls, etc.
An outstanding issue to be solved at present is the connection of legacy voice modems to the IP telephony network.
Australia
Snowy Hydro Limited is a leading provider of peak, renewable electricity to the National Electricity Market in Australia. It owns and operates the 3800 megawatt (MW) Snowy Mountains Scheme, an integrated water and hydro-electric power system located in Australia's Southern Alps comprising sixteen major dams, seven major power stations, 145km of interconnected tunnels and 80km of aqueducts, as well as two gas-fired power stations totalling 620MW, both located in the southern Australian state of Victoria. There are also support offices including in Sydney and in Cooma, resulting in the need to provide voice services to locations spread over a distance of 1000km. In 2004 and 2005 a new voice system was implemented which carried both operational and corporate voice services and covered all sites belonging to Snowy Hydro Limited. The system chosen comprised a combination of IP enabled hybrid telephone exchanges interconnected with server based soft switches. This gained the benefits of IP based telephony while still preserving the resilience required for operational and safety telephony. Each site is interconnected by multiple routes comprising both IP (as a separate VRF on a QOS enabled MPLS network) and TDM (where available) or multiple IP routes where TDM is unavailable. The use of the hybrid approach enabled the condensation of two separate telephone networks (operational and corporate) into one system while not compromising resilience, but gaining the advantages of the IP voice services for certain applications. Office workers were provided with IP phones giving them mobility to work at different desk locations (eliminating costly moves and changes) or at different geographic locations using the same contact phone number. The main and backup control centres were set up with multi-appearance IP phones. This has enabled the use of different control centres without the need to notify other parties to use different contact numbers. 
TDM phones were also provided to cater for a failure of the IP network. Safety phones spread throughout the power stations were kept as analogue extensions but their resilience against link failures was enhanced by the provision of IP routes between locations that were additional to the TDM routes. The provision of IP routes also provided a means for the carriage of voice for free on data networks to achieve substantial cost savings through toll bypass. The main issues (additional to change control issues) that had to be solved during the implementation of the system were:
• Establishing a consistent QOS enabled underlying data environment across both internal networks and external service provider networks. In the case of Snowy Hydro Limited’s system, Expedited Forwarding was used for real time packets and Assured Forwarding for signalling. Policies were applied on entry to Service Provider networks in order to ensure that Snowy Hydro Limited did not cause voice degradation by exceeding the service levels procured (e.g. exceeding the service allocation of low latency queue packets). Service Providers re-mark packets exceeding the procured service level allocation to a lower priority, thereby resulting in a lower voice quality.
• There are fundamental limitations in the size of spanning trees that need to be catered for in the design of the LAN environment. This will impact the desktop phones if this design requirement is not catered for.
• Maintaining tight firewall security (deny all) was a challenge because of the lack of information often available for minimum firewall rule sets as opposed to opening up a large range of ports on the firewalls. E.g. too tight a policy can lead to “one way” voice calls.
• Echo issues often related to duplex mismatches in the data network or were associated with the use of specialised phones such as conference phones, cordless phones and headsets. Even room layouts can affect echo issues. Careful attention to duplexing and echo cancellation settings was required.
• Implementing voice quality monitoring systems was important so that the diagnostic information was available to resolve internal network issues as well as to substantiate issues with Service Providers.
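The QOS marking described in the first issue above can be illustrated with a short sketch. It is not taken from the Snowy Hydro implementation: the choice of AF31 for signalling is an assumption (the text only states "Assured Forwarding"), and the function names are hypothetical. The sketch shows how a DSCP code point maps to the IP header byte that a host application would set on a socket.

```python
import socket

# DSCP code points defined in RFC 3246 (EF) and RFC 2597 (AF classes).
DSCP_EF = 46    # Expedited Forwarding, used here for real time (voice) packets
DSCP_AF31 = 26  # Assured Forwarding class 3, low drop; ASSUMED class for signalling


def dscp_to_tos(dscp: int) -> int:
    """Return the IPv4 TOS / DS byte for a 6-bit DSCP value."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2  # DSCP occupies the upper six bits of the byte


def open_marked_udp_socket(dscp: int) -> socket.socket:
    """Create a UDP socket whose outgoing packets request the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


if __name__ == "__main__":
    print(hex(dscp_to_tos(DSCP_EF)))    # 0xb8
    print(hex(dscp_to_tos(DSCP_AF31)))  # 0x68
```

Marking at the host is only a request: as noted above, the Service Provider may re-mark packets that exceed the procured allocation, so the policy applied at the network edge still decides the treatment actually received.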
Over the last five years of operation, the system has proved to be extremely resilient. The main issues that have occurred at times have been caused on routes running through Service Provider networks as follows:
• Different performance measurement systems make it difficult to convince a Service Provider that the provided service does not meet SLA requirements.
• The use of monthly average SLA technical specifications which do not recognise the instantaneous nature of voice. For example, a Service Provider usually specifies monthly averaged jitter figures for each grade of data service, but small irregular instantaneous peaks in jitter may exceed the jitter buffer settings of IP telephones, causing a drop in voice quality. This latter issue is caused by a Service Provider that has over-subscribed its infrastructure.
• Changed ownership of a Service Provider resulting in changed service provision priorities, including not supporting the original SLA requirements.
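The mismatch between averaged SLA figures and the instantaneous nature of voice can be shown with a small numerical sketch. All figures below (sample values, the 10 ms averaged SLA limit and the 40 ms jitter buffer) are hypothetical and chosen only for illustration; the function name is likewise an assumption.

```python
from statistics import mean


def jitter_report(samples_ms, sla_average_ms, jitter_buffer_ms):
    """Compare an averaged SLA check with a per-sample jitter buffer check."""
    average = mean(samples_ms)
    overruns = [s for s in samples_ms if s > jitter_buffer_ms]
    return {
        "average_ms": average,
        "meets_average_sla": average <= sla_average_ms,
        "buffer_overruns": len(overruns),
        "worst_ms": max(samples_ms),
    }


if __name__ == "__main__":
    # Hypothetical measurements: mostly 5 ms jitter with three short 80 ms peaks.
    samples = [5.0] * 997 + [80.0] * 3
    print(jitter_report(samples, sla_average_ms=10.0, jitter_buffer_ms=40.0))
```

In this example the averaged figure stays well inside the SLA limit, yet three samples exceed the jitter buffer, each of which could be heard as a drop in voice quality. A percentile or peak based specification would expose what the average hides.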
A2. Sharing Mobile Emergency Service (TETRA)
EDP (Portugal) employs private mobile radio systems for voice and MV SCADA applications. These radio systems comprise:
• Analog VHF networks used for voice and data
• Analog mobile voice used only when public cellular (GSM) is down
• Analog data networks with better availability than GPRS (99% compared to 95% for GPRS)
However, EDP needs to find alternatives, mainly due to pressure from the regulator to abandon the VHF band and the desire to migrate to standard TCP/IP based SCADA protocols such as IEC 60870-5-104. In this context EDP signed a protocol with the Portuguese Government for setting up a pilot project using SIRESP, the government’s TETRA Network for Emergency and Security Forces. The objectives of EDP for the pilot project were to test the network’s capability to support TCP/IP SCADA (IEC 60870-5-104 light version), the coverage and ease of use of voice services, and network availability.
SIRESP (Sistema Integrado de Redes de Emergência e Segurança de Portugal) is the Portuguese Government TETRA Network for Emergency and Security Forces serving 55 000 users. It provides a very dense coverage of the national territory through:
• 532 BTS + 2 mobile BTS
• 4 switches (+ 2 in the islands + 1 for tests)
• 66 dispatch centers
SIRESP Service Level Agreement:
Parameter                                      Target
Availability (%)
  Operational                                  > 99.9
Coverage (%)
  Urban, suburban, highways and main roads     > 95
  Rural                                        > 90
  Buildings – Urban and suburban               > 80
  Buildings – Rural                            > 50
Call setup (ms)
  Same cell                                    < 500
  Switching delay                              < 100
Capacity (%)
  Rejected calls                               < 2.5
  Dropped calls                                < 0.5
The pilot project, carried out over a period exceeding twelve months, comprised:
• 30 RTUs with 104 protocol loaded firmware;
• Mobile terminals (installed in work vehicles);
• 9 handheld/portable radios;
• 1 dispatch console.
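To put the availability percentages quoted in this section into perspective (95% for GPRS, 99% for the analog data networks and more than 99.9% in the SIRESP Service Level Agreement), the following sketch converts an availability figure into the corresponding unavailability per year. It is simple arithmetic over an 8760-hour year; the function name is an assumption introduced for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def annual_downtime_hours(availability_percent: float) -> float:
    """Hours of unavailability per year implied by an availability figure."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return HOURS_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    figures = [
        ("GPRS", 95.0),
        ("Analog data network", 99.0),
        ("SIRESP SLA (lower bound)", 99.9),
    ]
    for label, availability in figures:
        hours = annual_downtime_hours(availability)
        print(f"{label}: {availability}% -> about {hours:.1f} h/year unavailable")
```

Under these figures GPRS may be unavailable for roughly 438 hours per year, the analog data network for about 88 hours, and a service meeting the SIRESP target for less than 9 hours. The calculation says nothing about how the outages are distributed in time, which matters as much for SCADA as the yearly total.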
A 2 Mbit/s Ethernet line was installed between the EDP SCADA system and SIRESP to guarantee connectivity between the two networks (for voice and data). The SIRESP network provides a fixed IP address to each unit and to the SCADA system command center.
Pilot Project Results
Voice Service:
• Good coverage (even in places where no GSM operator exists);
• Fast call setup (virtually instantaneous);
• Good audio quality;
• Users prefer handheld to vehicular mobile terminals (they prefer to carry the communication terminal with them wherever they go, e.g. up the poles, lines, etc.);
• Good network availability – During a tropical windstorm (up to 220 km/h) in the west region of Portugal all GSM operators went down (in the region). The SIRESP network provided EDP with the only means of communication in the region (in addition to the EDP VHF network).
SCADA (data) Service:
• Reduced availability of the incumbent 2 Mbit/s circuit between the SIRESP switch and the EDP SCADA system. We were not authorized to use our fibre optic network;
• High latency packet transmission time (from 700 ms to 1400 ms). We had to adjust protocol timers to cope with the high latency. Also, the SIRESP network changed from 1 fixed timeslot to 4 dynamic timeslots assigned to data, per site;
• Long drops without recovery - Radio attempts to re-establish the connection are unsuccessful (SN_ACTIVATE_PDP_CONTEXT_DEMAND) -> local reset is needed. This is the main reason for the ongoing pilot (it only ends when this problem is solved);
• Local Site Mode -> Useful for voice but not for data. Radios are locked to a single BTS (there is no alternative if it loses connection with the switch).
Constraints:
• SIRESP does not support terminal maintenance;
• SIRESP helpdesk is not familiar with data issues;
• CAPEX: TETRA terminals are expensive (vs GSM);
• OPEX: Monthly fee is 10 times higher than GSM (is the premium service worth it?)
A3. Satellite Communications in Power Utilities
Satellite communications constitute a valuable option for specialist utility applications where other services, either through deployed facilities or procured from telecom operators, may not be accessible, economically feasible or sufficiently disaster-proof. Some typical situations are:
• Black start-up of power installations – When power supply fails, telecom facilities are lost after a time which depends upon the battery and standby generator autonomy of the telecom infrastructure. Moreover, in disaster situations, telecom facilities can suffer destruction in the same way as the power system facilities.
• Access to remote facilities – Power generation installations (hydro plants, off-shore wind farms, etc.) may not be accessible through other deployed solutions in a cost-effective manner or through procured communications.
• Alternate communications – Satellite communications may serve as back-up to key telecom services where no other back-up path can be established.
• Fast deployment facilities – Satellite communication terminals can be integrated into fast deployment mobile units and transported when necessary to temporary work sites as illustrated in the Brazilian example presented in this section [m].
The usage of satellite communications for SCADA and monitoring applications has so far been limited, primarily because VSAT services, while being autonomous and reliable, are expensive in comparison to other technology options, and scaling a solution to support the SCADA platform for large utilities is cost prohibitive.
Consumer-based Broadband Service
Recent developments have employed existing consumer-based broadband products, tailored to utility company requirements, to provide new services designed specifically for the electricity and water utility market, in particular for SCADA. The widespread roll-out of such services is still in its infancy, but may provide the benefits of a satellite solution at costs which are comparable to established solutions such as terrestrial radio and GPRS. From a cost perspective, it is estimated that these services will be positioned between GPRS and standard VSAT (as shown in the adjacent diagram). For the end user, this means a VSAT service with a service availability of greater than 99%, for marginally more cost than GPRS which provides an availability of 75-95%.
[Diagram: price versus traffic for VSAT, consumer-based satellite systems and GPRS]
These new satellite IP services may support a wide range of monitoring and command applications where latency issues are not critical. They allow the benefits of satellite autonomy and ease of installation to be extended to a wider range of applications, and enable a much wider monitoring base, something which will be particularly beneficial in supporting the requirements of future smart grid applications. In addition, the satellite platform also offers the potential to support services and applications which require higher data rates such as video applications and CCTV monitoring; services which on a widespread basis may not have been cost effective in the past.
Consumer broadband services offered by satellite Service Providers are generally asymmetric, with the downlink being five or six times larger than the uplink, although other up/down link combinations should be feasible according to application requirements. An example of the standard hardware for this type of system includes a satellite modem, the iLNB and an 80cm satellite antenna. The equipment is generally compact, easy to interface and, because the consumer equipment is designed for self-installation, not costly to deploy. EMC hardening may however be necessary for installation in a severe EPU environment.
Services of this type require a high speed link between the hub station at the satellite operator’s location and the IP backbone. This link may be provisioned through an ISP. Alternatively, a second satellite link back to the EPU control platform can be provided, again using the low cost service. Such configurations depend on user-specific requirements for latency, network independence and cost.
Fast Deployment Mobile Units – Brazil
Tower collapse incidents are particularly harsh situations requiring complex logistics and intensive voice and data communications for moving workforce, materials and machines to site, deploying a support base, and performing works to restore the line. The unavailability of transmission services is heavily penalized by the power system operating organization and so the company is strongly motivated to reduce the average line restoration time through improved communications. In many cases, public cellular phone service is not available or proves to be very costly as the usage is predominantly long-distance.
Furnas Centrais Eletricas, a major Brazilian EPU, has developed container-based Telecom Mobile Units (TMU) for facing transmission line tower collapse scenarios [54]. The TMU provides local wireless voice and data and uses a satellite service to connect these remote facilities to the company’s core infrastructure. The container-based unit is of compact design allowing easy transport and positioning, even when installed on irregular ground without very rigorous levelling.
Furnas has negotiated an annual satellite service contract covering 30 days of usage per year for a fixed lump sum. Each time that service is required, the Service Provider must make the link available within a few hours of the service request. This delay is compatible with the time required to expedite the TMU to the incident site. Usage exceeding 30 days per year is billed at a contractual daily cost. The cost of the satellite service (including satellite equipment rental) in the present contract for a 256kbps link is reported to be approximately €15.000 per year, with a cost of €350 per extra day of use, and the quality of service is reported to be satisfactory.
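The contract structure described above (a lump sum covering 30 days of usage per year, plus a daily rate beyond that) can be expressed as a short calculation. The constants are the approximate figures reported in the text; the function name and the 45-day example are assumptions used only for illustration.

```python
ANNUAL_FEE_EUR = 15_000   # approximate lump sum reported for the 256kbps link
INCLUDED_DAYS = 30        # days of usage covered by the lump sum
EXTRA_DAY_FEE_EUR = 350   # contractual cost per day beyond the included days


def annual_satellite_cost(days_used: int) -> int:
    """Yearly cost in euros for a given number of days of satellite usage."""
    if days_used < 0:
        raise ValueError("days_used cannot be negative")
    extra_days = max(0, days_used - INCLUDED_DAYS)
    return ANNUAL_FEE_EUR + extra_days * EXTRA_DAY_FEE_EUR


if __name__ == "__main__":
    for days in (0, 30, 45):
        print(days, "days ->", annual_satellite_cost(days), "EUR")
```

With these figures a year with 45 days of usage would cost 15 000 + 15 × 350 = 20 250 euros, while any usage up to 30 days costs the lump sum only.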
Fast Deployment Telecom Mobile Unit used in Brazil [54]
A4. Disaster Counter-measures – Learning from US 2005 Hurricanes
The United Telecommunications Council (UTC), the international trade association representing the telecommunications interests of utility companies in the US, decided to undertake a formal survey of electric, gas and water utilities of all sizes to generate data about how telecommunication networks served the utilities of the US Gulf Coast during the disasters caused by the 2005 hurricane season. The aim of this action was, on the one hand, to make decision makers aware of the importance of very reliable utility telecommunication networks, especially during and after natural disasters, and, on the other hand, to provide arguments to other utilities for taking adequate counter-measures against similar or other catastrophic events in the future. What follows is an extract from the UTC Report [55], available in full from the UTC website (www.utc.org).
1. Background
During the months of August, September and October of 2005, the Gulf Coast of the United States and both coasts of Florida were belted by a series of three devastating hurricanes: Katrina, Rita and Wilma. Katrina put much of New Orleans under water; trees were flattened as well as homes, public buildings and communications towers by Category 4 winds. Loss of life from this storm was tragic, and unusually high. Rescue and restoration efforts were severely hampered by the loss of public switched telephone network (PSTN) and wireless carrier communications. Still recovering, Gulf States were again pelted by Hurricane Rita just weeks later and then by Wilma in October. While these storms were not quite as severe as Katrina, the rapid succession tested the limits of overstressed public safety and utility repair personnel.
It became clear as response efforts continued that the recovery of nearly all other infrastructures was dependent on electric power restoration. Moreover, all people have a vital need for safe drinking water, and public health demands reliable wastewater facilities. Within these industries, as always, coordination of repair crews, rapid restoration and interaction with public safety personnel depend upon working communications channels.
Communities of the Gulf Coast and Florida are served by a wide variety of electric, gas and water utilities, including many small regional companies and municipal power/water authorities whose service territories lie within those of larger entities. These include large, investor-owned electric power companies, electric generating companies and natural gas utilities.
2. Survey Questions and Results
The Utilities Telecom Council posed five basic questions to utilities in hard-hit areas to determine how their various communications systems behaved during and after the hurricanes, and analyzed the resulting data. The lessons learned, and the counter-measures that could be proposed to EPUs, are presented below.
2.1 Communications Systems Performance
Most utilities, regardless of service territory size or proximity to the centers of the storms, reported that their communications systems behaved well during the hurricanes. This is in stark contrast to the public switched telephone network (PSTN) in the region and wireless carriers, which suffered extensive loss of service and slow recovery time. The comparison points to the fact that communications systems, if built extremely well, can withstand the intense wind and/or flooding associated with these events; however, unlike public networks, EPU systems’ redundancies and robustness can be limited in size and scope, since they are designed and constructed to meet the specialized needs of a single entity or group of companies. Such construction would be cost-prohibitive for a commercial system designed to serve millions of the general public. Thus, in spite of the growth of various commercial communications networks, there will be a continued need for EPUs to maintain their own private communications networks for mission-critical functions, including backbone networks.
Most companies that suffered damage to their equipment and networks fixed it very quickly, generally within 24-48 hours following the passing of the hurricane. Such repairs were carried out according to detailed emergency plans, since communications networks are vital to power restoration and infrastructure recovery efforts being undertaken area-wide. Utilities reported that cellular and fixed networks were down, and that leased lines and satellite communications suffered damage. One small water district authority, depending entirely on cellular communications, suffered serious communications difficulties. The utilities that had the greatest communications difficulties were the smaller cooperatives and municipal authorities.
Lack of interoperability among utilities and between utilities and public safety remains a serious problem that hampers restoration efforts. Hurricane rescue and restoration efforts provide an example of the benefit that could be reaped from allocating a small amount of dedicated spectrum for EPUs, with systems to be built using an open architecture and made available to all emergency participants as needed for disaster recovery.
2.2 Private Mobile Radio, Backbone Survivor
The performance of private mobile radio (PMR) systems was very high during and after the storms. Unlike most commercial wireless systems, these networks are built specifically to weather such disasters and to continue to operate in extended power outages, in support of restoration crews carrying out extremely hazardous duties. However, the superior performance of these individual systems is offset by the lack of interoperability between systems and the lack of dedicated spectrum to share with other utilities and public safety. PMR is at this time the most critical tool of critical infrastructure communications in emergency situations. It provides the necessary mobility and quality of service as crews travel throughout the damaged service territory. In many cases during events such as Katrina, it provides the only means of wireless communications during the first critical days after storm impact. The overall performance of private utility communications systems during the catastrophic storms, in comparison with consumer networks, reinforces the industry position that private systems must be maintained, and encouraged for emergency response.
Due to the emphasis upon reliability for utility operations, it is clear that, for the foreseeable future, critical infrastructure entities should not depend upon commercial systems for core communications.
2.3 Utility fiber networks benefited from pre-planning
The performance of private and commercial fiber networks during the hurricanes demonstrates that utilities build more reliability and redundancy into fiber networks than their commercial communications network counterparts. The technology itself offers all system operators multiple opportunities to secure communications through the ability to deploy features such as intelligent, self-healing rings. The different performance levels of private and some commercial fiber networks throughout the Gulf Coast and Florida point to the fact that the entities have entirely different objectives in mind: utilities build for reliability because communications is critical to the functioning of the core business (electric, natural gas or water delivery), while commercial communications companies build to provide consumer services for the general population.
An important conclusion is that EPUs’ private fiber networks survived the storms due to pre-planning for worst case scenarios. They withstood the storms very well and, where there were problems, these were overcome by inherent features of the technology. These features were pre-planned and built into the networks. By contrast, commercial companies suffered more difficulties with broken cables than private, internal networks and had more difficulty restoring service. In addition, smaller utilities relying upon commercial fiber-based networks waited weeks for service to be restored.
2.4 Microwave systems survived the storms
Like fiber networks, microwave systems survived the storms very well, even the intense winds that accompanied these events. Although some damage to microwave towers and attachments was reported, this damage was not as extensive as it was to other types of communications towers such as generally taller broadcast towers. Additionally, utilities reported that they employed detailed backup plans, including employing mobile towers and safeguarding communications through redundant links. Any needed repairs, such as refocusing dishes to restore links, were accomplished in the days immediately following the storms. The Katrina event reinforced the conclusion among public safety, federal, state and local officials that EPUs, along with other emergency responders, have to be included in disaster planning.
2.5 Utilities and public safety need better coordination
The actions deployed for disaster recovery demonstrated that while utility communications systems behaved well during the hurricanes, there was little or no coordination with state or local public safety organizations aside from some informal sharing of resources. Usually, the utilities played the role of assisting public safety, and not the other way around. Besides the clear need for dedicated spectrum for EPUs and for advances in communications interoperability, EPUs responding to disasters should be included in any State or federally developed coordination process. EPUs have to be deeply involved with public safety and homeland security organizations, especially in the areas of critical infrastructure protection, cyber-security and interoperability effort, because of large enterprise ITC networks and critical control systems such as Supervisory Control and Data Acquisition (SCADA). Equipment that is not dependent on frequency assignments is extremely important to both EPUs and public safety, while again, unlike traditional public safety, EPUs have no dedicated spectrum on which to operate a next-generation communications system. This definitely is a long-term effort and one that will require massive investment by all parties concerned in new infrastructure and new user equipment.
A5. Survey of Electric Power Dimensioning Practice in EPU Data Centres
Introduction
A survey was performed with 9 responding Electrical Utilities from different countries across the world and covering different sectors of activity (Transmission, Generation and Distribution) in order to collect EPUs’ current practices regarding the dimensioning of electric power infrastructure in datacenters. In this document, Data Center should be understood as all kinds of rooms used to house servers, storage, critical workstations or telecom assets, and system signifies all kinds of IT or telecom equipment housed in datacenters.
Survey Results
Although it is hard to generalize, based on the answers provided by the questioned companies, it was possible to identify the following current practices and evolution opportunities:
• All systems housed inside the datacenter are considered critical and are therefore supplied through uninterrupted electric power sources;
• Beyond critical systems, some other equipment such as essential lights, access control devices and management/monitoring systems is also fed by the datacenter UPS. In some other cases this equipment has its own internal battery;
• UPSs are centralized and made up of different power modules in an N+1 redundancy scheme to avoid partial hardware failure and to permit partial maintenance. A centralized redundant UPS provides better management and maintenance capabilities than a distributed one and is a more cost-effective approach;
• The typical battery autonomy is:
o Control centers: From 2 to 10 hours;
o Telecom rooms: From 4 to 24 hours;
o IT Datacenters: From 1 to 4 hours.
• Predefined selective shutdown mechanisms are not current practice. In some cases this is done manually by shutting down redundant or non-essential systems, which can provide up to 20% more battery autonomy time;
• Every datacenter has standby emergency power generators, mainly diesel engines, but in some cases gas is also used. In case of power failure, some other essential services in the facility, such as air conditioning and lights, should also be fed by the generator. The generator starts automatically in less than one minute and runs without refill (considering full tank capacity) for:
o Control centers: From 1 to 7 days;
o Telecom rooms: From 3 to 7 days;
o IT Datacenters: From 1 to 4 days.
• The generator fuel tank is typically refilled on demand by preselected providers but without SLAs defined;
• The general redundancy model widely adopted is N+1, but in a few cases the 2N model is also used;
• To prevent failure due to accidents or human errors, and to reduce interference between power and data cables, some other aspects are also considered:
o Separated rooms for power systems;
o Double power cable routes combined with double distribution switchboards;
o Separated cable routes for power and data.
• Redundancy schemes typically consider:
o Double power transformers;
o Double UPS / rectifiers;
o Multiple battery strings;
o Double distribution switchboards;
o Double power distribution units and socket strips in the rack fed by different circuits coming from different switchboards;
o Double power supplies in equipment;
o Automatic transfer switches combined with redundant power circuits for equipment with a non-redundant power supply;
• “Different datacenters” may coexist in the same room. This leads to different power supply systems rather than an integrated approach. This is partly due to different asset owners operating without an integrated approach.
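The difference between the N+1 and 2N redundancy models mentioned in the survey results can be made concrete by counting the power modules to be installed for a given load. The sketch below uses the usual definitions (N is the minimum number of modules able to carry the load); the load and module ratings are hypothetical and the function name is an assumption.

```python
import math


def ups_modules_required(load_kw: float, module_kw: float, scheme: str = "N+1") -> int:
    """Number of UPS power modules to install for a load under a redundancy scheme."""
    if load_kw <= 0 or module_kw <= 0:
        raise ValueError("load and module rating must be positive")
    n = math.ceil(load_kw / module_kw)  # N: modules needed to carry the load
    if scheme == "N+1":
        return n + 1      # one spare module shared by the whole system
    if scheme == "2N":
        return 2 * n      # two complete, independent sets of modules
    raise ValueError("scheme must be 'N+1' or '2N'")


if __name__ == "__main__":
    # Hypothetical example: 70 kW critical load, 20 kW modules, so N = 4.
    print(ups_modules_required(70, 20, "N+1"))  # 5
    print(ups_modules_required(70, 20, "2N"))   # 8
```

The count illustrates why N+1 is the more widely adopted model: it tolerates the loss of one module at a modest extra cost, whereas 2N nearly doubles the installed capacity in exchange for two fully independent supply paths.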
The typical EPUs datacenter electrical distribution concept is illustrated in the following diagram:
Figure A5.1 – EPUs datacenter electric distribution concept
Based on the datacenter availability tier classification defined by the Uptime Institute, the typical electrical distribution topology implemented in EPU datacenters can be located between Tier II and Tier III levels, combining significant redundant components and partial concurrent maintenance capabilities.
Evolution opportunities:
The EPU’s essential role in present day society makes the availability of power supply to its IT and telecom infrastructure one of the critical components in guaranteeing the demanded service level. To achieve such a level, enabling maintenance activity without disrupting the power supply and minimizing the impact of hardware failure or human error, a set of evolution opportunities must be considered:
• Adopt a 2N redundancy scheme, at least in the more important facilities like control centres;
• To avoid loss of control of temperature and humidity in case of main power failure, it is recommended to use the UPS to feed a part of the air conditioning equipment;
• Deploy all electric systems, including distribution switchboards, outside the perimeter of the IT/Telco technical room;
• Battery autonomy shall be dimensioned taking into account the probability of failure of the medium voltage distribution network. Having truly redundant medium voltage power lines in the facility is an effective way to increase availability;
• Introduce automatic selective shutdown mechanisms to bring significant improvement of power autonomy for the most critical systems;
• Establish contracts with fuel suppliers with SLAs.
Adopting these evolution opportunities may in some cases not be cost effective, so EPUs must adapt them according to each datacenter’s importance and service requirements.
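The effect of selective shutdown on battery autonomy can be estimated with a deliberately simple model. The sketch assumes that autonomy is the usable battery energy divided by the load, ignoring discharge-rate effects, ageing and conversion losses, so it only indicates orders of magnitude. The battery size, load and function names are hypothetical.

```python
def autonomy_hours(battery_wh: float, load_w: float) -> float:
    """Idealised autonomy: usable battery energy divided by a constant load."""
    if battery_wh <= 0 or load_w <= 0:
        raise ValueError("battery energy and load must be positive")
    return battery_wh / load_w


def autonomy_gain(shed_fraction: float) -> float:
    """Relative autonomy increase when a fraction of the load is shut down."""
    if not 0.0 <= shed_fraction < 1.0:
        raise ValueError("shed_fraction must be in the range [0, 1)")
    return shed_fraction / (1.0 - shed_fraction)


if __name__ == "__main__":
    battery_wh, load_w = 40_000.0, 10_000.0  # hypothetical 40 kWh bank, 10 kW load
    print(autonomy_hours(battery_wh, load_w))                # 4.0 hours
    print(autonomy_hours(battery_wh, load_w * (1 - 1 / 6)))  # about 4.8 hours
    print(autonomy_gain(1 / 6))                              # about 0.2, i.e. 20%
```

Under this model, the figure of up to 20% more autonomy reported in the survey corresponds to shutting down about one sixth of the load, which gives an idea of how much non-essential equipment must be identified in advance for an automatic mechanism to be worthwhile.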
Companies participating in the survey:
o CEZ (Romania)
o EDP (Portugal)
o EPS (Serbia)
o FURNAS Centrais Eletricas (Brazil)
o IEC (Israel)
o KEPCO (Japan)
o REN (Portugal)
o Snowy Hydro (Australia)
o Statnett (Norway)
o Western Power (Australia)
A6. Deploying a Management Framework – Western Power
1. Introduction
This case study presents telecommunications management within Western Power and considers management organization, processes and tools both before and after the installation of Western Power's new telecommunications network management system (NMS). Western Power is compared against the models and concepts presented in the brochure from both maturity and compliance viewpoints. This case study does not consider the implementation of the new NMS, which is discussed in more depth in [45].
2. Background
Western Power is responsible for the transmission and distribution of electricity in the south west of Western Australia, otherwise known as the South West Interconnected System (SWIS). Western Power operates as an independent business that transmits power from power stations to residential, business and industrial customers. It also ensures equitable access to its electricity network for any new generators or retailers seeking to compete in the electricity market. As Western Australia's leading electricity transporter, Western Power employs over 2,500 staff and 700 contractors, supplies power to more than 930,000 customers and has over $3.5 billion in assets. Western Power owns a network that covers over 322,000 sq km with over 140 transmission substations; 58,000 transformers; 88,000 km of power lines and 721,000 power poles. The SWIS is shown as the highlighted area in the figure below.
3. Western Power’s Communications networks
Western Power owns and operates an operational communications network with over AU$300M in assets, over 250 dedicated communications sites, over 100 shared communication sites (within substations) and in excess of 2000 network elements. An overview map of WP’s communications network can be seen in the above figure, showing fibre (in orange) and microwave (in green).
The operational telecommunications network primarily exists to support the electricity transmission and distribution business and to meet other technical regulatory requirements placed upon Western Power (for example telecommunications circuits for electricity generators and retailers). In addition to these core requirements, the operational communications network provides a number of services for other corporate users, such as corporate LAN / WAN systems and non-operational asset monitoring systems.
The operational services carried on Western Power’s telecommunications network require high availability in order to meet their regulatory requirements, and the majority of the network is designed and built with this in mind. As a result, the operational telecommunications network is mostly duplicated.
Western Power’s operational communications systems provide the following services: teleprotection signalling; SCADA communications; telephony; voice mobile radio (VHF); Ethernet VLAN services; and distribution automation communications (UHF radio). WP utilises a number of communications bearers: optical fibre, microwave, pilot cable, power line carrier and third party carriers. The predominant technologies on these bearers are PDH, SDH and Ethernet.
Development of the communications network is primarily driven by transmission network augmentation; however, it is also heavily influenced by rapid changes in communications technology and the relatively short asset life of communications equipment.
4. Before the installation of a new tool-set
4.1 Telecommunications Service Delivery model and upstream management
Until 2006, Western Power was a government owned vertically integrated utility comprising generation, transmission, distribution and retail. The company was disaggregated in 2006, and the ‘new’ Western Power became a regulated monopoly transmission and distribution utility, continuing as state government owned. This case study only considers Western Power as it has been structured since 2006. Western Power can be most closely modelled as a telecommunications Service Provider of type A as defined in figure 8.4. The description of this type of Service Provider (as discussed in section 8.4.1) is an accurate fit to Western Power. The telecommunications services are owned, strategically managed, operationally managed, and delivered in line with an overall corporate model that is summarised in the diagram hereafter. Corporate management of the telecommunications asset and systems is undertaken using identical business processes to those utilised for e.g. substation or new transmission line build.
4.2 Business Process Models and Maturity
Western Power implemented, over the course of thirty years, an organically grown set of processes that evolved to directly meet the requirements of the (internal, operational) customer. There was never any conscious or planned attempt to align or streamline processes to meet any standard (such as the ITU models). These organically grown processes and management have generally followed, or been forced to follow, the models implemented for the operation of the electrical network, and have utilised similar tool-sets and systems. In the defence of Western Power, the processes are mature and well-documented; they are, however, extremely specialised and completely inflexible. These organically grown processes in turn led to organically grown tools to manage the telecommunications network. These management systems were deployed in an ad-hoc and reactive manner, driven by the needs of a new technology or to address capacity requirements. Western Power could be described as (generally) being at maturity level 3, ‘defined’, as per the COBIT models.
4.3 Management Tools and Information Systems
To monitor, control and document the telecommunications network, Western Power has operated electronic / database systems since the early 1980s and paper / offline electronic systems before then. Western Power operated a number of standalone, unconnected management platforms for operational performance management, fault management and configuration management.
5. The change journey (from a management perspective)
Due to extensive electrical system growth from 2004 to present (itself due to considerable economic growth in Western Australia), WP’s telecommunications network has also grown considerably. During 2007, a forecast of telecommunications network growth identified that WP would not be able to maintain a resilient network (to meet operational regulatory requirements) from 2009 onwards. Western Power embarked on a program to replace the existing tool-set of network management systems. The tripartite management scope as outlined in figure 10.1 was not fully considered as part of the initial scope; the focus was strongly on changing the tools in reaction to the business driver.
6. Following the implementation of a new tool-set
6.1 Telecommunications Service Delivery model and upstream management
The upstream management requirements and the way in which Western Power delivers telecommunications services have not experienced any underlying change (there was no perceived ‘issues’ driver, although it could be argued there may have been a ‘missed opportunity’ driver). During the implementation, and from a comparison to the models presented in this brochure, it is apparent that Western Power delivers a standard suite of telecommunications services when compared against other electrical utilities. The purpose of the initial implementation of the tool-set was to accommodate growth and to replace an ageing management platform. Its cost was justified through OPEX offset (people, maintenance of existing tools, and operational improvement through efficiencies in the toolset).
6.2 Business processes
The work undertaken to replace our current toolset with an integrated NMS has not dramatically changed or matured our business processes, except where there was a clash between the toolset and our ways of working. However, by the very nature of implementing an operational support system with modules and capabilities that can enable implementation of TMForum Frameworx, Western Power has been forced into some models (for example naming conventions). This has ‘opened our eyes’ to the full possibilities of Frameworx, and the associated cost benefits.
7. Future planning
Western Power, in line with every other EPU, is facing the upcoming challenge and opportunity presented by the Smart Grid. Western Power’s smart grid roadmap has clearly identified the need for considerable communications infrastructure to enable two-way communications between devices such as meters, electric vehicles, etc. and Western Power’s systems. The need for management tools to manage the explosive increase in communications devices is evident, and the response required in terms of business processes and structure will be considerable. It is believed that the management tools, and the ongoing alignment of business processes to the eTOM framework, will stand Western Power in good stead for the future.
A7. ITIL Management Framework
The IT Infrastructure Library (ITIL) is a major initiative in the field of IT management, developed in the late 1980s by the UK Central Computer and Telecommunications Agency (CCTA), which is at present incorporated in the Office of Government Commerce (OGC). It gives a detailed description of best practices and provides checklists, tasks and procedures that may be used by IT organizations according to their needs. Through its scalable and flexible “adopt and adapt” approach, ITIL is applicable to all IT organizations irrespective of their size or the technology in use. ITIL is not a standard, but rather a set of guidelines for a structured, common-sense, process-driven approach to ensure close alignment between IT and business processes. It recognizes that there is no universal solution to the process design and implementation for the management and delivery of IT services. As such, organizations, management systems and tools cannot be “ITIL compliant”, but may implement management processes assessed through ITIL guidance. The present edition of ITIL is V3 (May 2007), consisting of 26 processes and functions grouped under 5 volumes, arranged around the concept of a Service Lifecycle structure:
• Service Strategy
• Service Design
• Service Transition
• Service Operation
• Continual Service Improvement
However, ITIL V2 (2001), and in particular its first two components (Service Support and Service Delivery), remains the most commonly used edition. V2 grouped process guidelines according to different aspects of IT management, applications and services into 8 logical sets:
• Service Support
• Service Delivery
• ICT Infrastructure Management
• Security Management
• Business Perspective
• Application Management
• Software Asset Management
The logical sets have been complemented by two implementation guidelines:
• Planning to Implement Service Management
• ITIL Small-scale Implementation (for smaller IT units)
The structures of V2 and V3 are given in figures A7.1 and A7.2 hereafter. The international standard ISO/IEC 20000 (based on the British Standard BS 15000) describes an integrated set of management processes aligned with ITIL (but with reduced scope).
Figure A7.1 - ITIL V2 Processes (Warning - explanations are only indicative)

Service Support

Service Desk / Service Request Management (SD)
• Provide a single point of contact for Service Users and handle incidents, problems and questions
• Perform life-cycle management for Service Requests
• Keep the customer informed of progress and advise on workarounds
• Handle large volumes of telephone call transactions (Call Centre)
• Manage, co-ordinate and resolve incidents as quickly as possible at primary support level (Help Desk)
• Provide an interface for other activities (e.g. change requests, maintenance contracts, software licenses, SLM, CM, AM, FM and ITSC)

Incident Management (IM)
• Restore normal service operation (within SLA) as quickly as possible with the least possible impact on either the business or the user, at a cost-effective price
• An 'incident' is any event which is not part of the standard operation of the service and which causes, or may cause, an interruption or a reduction of the quality of the service

Problem Management (PM)
• Resolve the root causes of incidents
• Minimize the adverse impact of incidents and problems caused by errors within the IT infrastructure
• Prevent recurrence of incidents related to these errors
• A 'problem' is an unknown underlying cause of one or more incidents, and a 'known error' is a problem that is successfully diagnosed and for which either a work-around or a permanent resolution has been identified

Change Management (ChM)
• Ensure that standardized methods and procedures are employed for handling all changes (add, modify or remove)
• Ensure minimal disruption of services
• A Change Request (CR) is sent to the ChM and reflected into a Forward Schedule of Changes (FSC)

Release Management (RM)
• Ensure the availability of licensed, tested, and version-certified software and hardware, functioning as intended when introduced into existing infrastructure

Configuration Management (CM)
• Track all IT assets through the CM database (CMDB) containing assets, their configurations, and their interactions
• Configuration Planning and regular planning reviews
• Identification and labeling of assets. Recording of asset information (hard/software versions, ownership, documentation and other identifiers) with a business defined level of detail
• Control – Liaise with ChM to ensure that no Configuration Item is added, modified, replaced or removed without approved Request for Change, etc.
• Lifecycle monitoring (ordered, received, under test, live, under repair, withdrawn, etc.)
• Verification reviews and audits (physical existence, correct records in the CMDB and parts list). Check Release documentation before changes are made to the live environment.
Service Delivery

Service Level Management (SLM)
• Primary interface with the “customer” (as opposed to the “user” serviced by the Service Desk)
• Produce and maintain a Service Catalogue (standard IT SLAs)
• Monitor the IT service levels specified in the SLAs, ensure that the agreed services are delivered
• Establish Metrics and Monitor against benchmark
• Liaise with AM, CaM, IM and PM to ensure required service level & quality
• Ensure Operational Level Agreements (OLAs) with Support Providers

Capacity Management (CaM)
Match IT resources to business demands
• Application Sizing
• Workload Management
• Demand Management
• Modeling
• Capacity Planning
• Resource Management
• Performance Management

IT Service Continuity Management (ITSC)
• Perform Risk Assessment & Reduce disaster risks
• Ensure service recovery & Evaluate recovery options
• Business continuity planning
• Prioritize service recovery through Business Impact Analysis (BIA)
• Produce Contingency Plan
• Regularly test and review the plan

Availability Management (AM)
• Survey availability requirements
• Produce availability plan
• Monitor availability obligations
• Manage resilience

Financial Management (FM)
• Calculate & optimize service cost
• Recover costs from users

ICT Infrastructure Management

ICT Design and Planning
• Strategies, policies and Plans
• Overall & Mgt Architecture
• Requirement Specs & Tendering
• Business Cases

ICT Deployment Mgt
Design, build, test & deploy projects

ICT Operations
• Day-to-day technical supervision
• Incidents reported by users or Events generated by the infrastructure
• Often work closely with Incident Mgt & Service Desk
• Logging of all operational events
• Maintenance of operational monitoring & management tools

ICT Technical Support
Support infrastructure and service mgt with multiple levels of technical expertise

Security Management
Deploy Security Policy in the management organization

Business Perspective
• Business Continuity Mgt.
• Transforming Business Practice, Partnerships and Outsourcing

Application Management
• Improve the overall quality of software development & support
• Gather requirements for meeting business objectives

Software Asset Mgt. (SAM)
• Maintain software license compliance
• Track inventory and software asset use
• Maintain policies and procedures on software asset lifecycle
Figure A7.2 - ITIL V3 – “Service Life Cycle”-oriented and network-centric

Service Strategy
• Service Portfolio Management: Strategic thinking on how the portfolio should be developed in future
• Demand Management: Understand and influence customer demands
• IT Financial Management: Idem V2

Service Design
• Service Catalogue Management: V2: part of SLM
• Service Level Management: Essentially same as V2. Service review now in CSI
• Risk Management: Dispersed in several processes. V3: Coordinated process
• Capacity Management: Idem V2
• Availability Management: Idem V2
• IT Service Continuity Mgt: Idem V2
• Information Security Mgt: V3: Improved integration across Service Lifecycle
• Compliance Management: V2: Addressed within several processes
• IT Architecture Management: V2: Covered within ICT Design & Planning
• Supplier Management: V2: Covered within ICT Infrastructure Management

Service Transition
• Service Asset & Configuration Management: V2: Configuration Mgt.
• Service Validation & Testing: V2 Release Mgt extended
• Transition Planning & Support: Plan & coordinate resources to ensure that Service Strategy encoded into Service Design are realized in Service Operations. Identify, manage and control risks of failure and disruption across transition activities.
• Evaluation: Ensure that the service will be useful to the business. Set metrics & measurement to ensure continued relevance of services
• Release & Deployment Mgt: V2 Release Mgt extended
• Change Management: Essentially same as V2
• Knowledge Management: New process in V3, previously included to some extent in Problem Management

Service Operation
• Event Management: Part of Infrastructure Management in V2, has been extended as the trigger for Incident & Problem Mgt
• Incident Management: Essentially same as V2
• Problem Management: Essentially same as V2
• Request Fulfillment: New in V3. V2: Service Requests were treated by Incident Mgt.
• Access Management: New in V3. V2: Part of Security Mgt

Continual Service Improvement (CSI)
New in V3. V2: Treated partially in SLM
• Service Level Management
• Service Measurement & Reporting
• Continual Service Improvement
Figure A7.3 – ITIL Service Support diagram (© OGC) [42]
Figure A7.4 – ITIL Service Delivery diagram (© OGC) [42]
A8. TM Forum NGOSS - Frameworx
The Tele-Management Forum (TMF), initially the Network Management Forum (NMF), is a leading industry association of telecom operators and related industry actors focused on IT solutions for communications Service Providers. TMF published the first release of the “New Generation Operational Support Systems” (NGOSS) in 2001 and a “Telecom Operations Map” which evolved into eTOM in 2003. The Enhanced Telecom Operations Map (eTOM) is at present published by ITU-T as the M.3050 series of Recommendations.
NGOSS is a Solution Framework for the enhancement of a telecom provider’s business operations, support processes and systems. It applies specifically to telecom operators or similar Service Providers (internet, mobile services, etc.) but may be applied to any telecom service provisioning, such as that of EPUs. It delivers a framework for producing new generation OSS/BSS solutions, and a repository of documentation, models, and guidelines to support these developments. The goal is to facilitate the rapid and less costly development of flexible OSS/BSS solutions to meet the business needs of today’s competitive and rapidly evolving telecom environment.
The TM Forum Frameworx Integrated Business Architecture was introduced in 2010 and provides a master plan for the implementation of a Service Oriented Enterprise. Frameworx is designed using service-oriented principles and supports major software standards such as ITIL (refer to Appendix 7). It builds on the NGOSS Solution Framework releases with two additional major documents: the Frameworx Statement of Direction (TM Forum TR155) and the Frameworx Implementation Methodology (TM Forum GB945-M).
Frameworx Architecture
The Solution Frameworks provide communications-industry-specific process (eTOM), information (SID), and application (TAM) frameworks. These are united by an Integration Framework, which includes interfaces that support interoperability within and between distributed value chain participants, accompanied by a methodology describing how to use them. These frameworks are described in the following paragraphs. There are various entry points into the frameworks, depending on the focus and needs of the frameworks’ user.
TM Forum Solution Frameworks entry points
The frameworks are interrelated. The Integration Framework defines the interaction between processes and entities in more detail by describing the interaction in terms of details that characterize the entities, contained within Business Services (also known as NGOSS Contracts), interface specifications and their implementations. A business service is an element of functionality. The technical specifications within the Integration Framework define how the services are described using common models. The “Solution Framework” consists of four key components, which may be used standalone to solve particular problems or together as an integrated end-to-end framework:
• Business Process Framework (eTOM) provides a map of key processes, a common language and process flows. It can be used to survey and assess existing processes of a Service Provider, provide a framework for defining the scope of a software-based solution, or facilitate communications between a Service Provider and its system integrators and suppliers.
• Information Framework (SID) provides a “common language” for software providers and integrators to use in describing management information, which in turn allows easier and more effective integration across software applications provided by multiple vendors. The Information Framework provides the principles for a shared information model, as well as its elements and entities to provide a system view of the information.
• Integration Framework allows the integration of management applications provided by multiple software vendors. It defines architectural principles allowing developers to create components, and Business Services (through “APIs”) for interfacing those elements to each other across a “technology-neutral” architecture (it does not define how to implement the architecture, but the principles that must be applied to be Solution Frameworks compliant). In addition, the Integration Framework includes the TM Forum’s library of Interfaces for integration of applications, and as the basis for the Business Services Repository.
• Application Framework (TAM) defines the role and the functionality of the various applications that deliver management capability. It can be used by Service Providers to compare vendors’ solutions.
The Enhanced Telecom Operations Map (eTOM) developed by TM Forum has been published by ITU-T as Recommendations M.3050.x. eTOM is a business process model or framework for use by Service Providers and their suppliers and partners within the telecommunication industry. It describes all the enterprise processes required by a Service Provider and analyses them to different levels of detail according to their significance and priority for the business. It is used to analyze existing business processes, to identify redundancy or gaps in the current strategies, and to re-engineer processes, correcting deficiencies and adding automation. We shall refer to this component of the NGOSS solution framework in the ongoing analysis.
Frameworx Implementation Methodology
The Solution Frameworks Implementation Methodology provides a process and techniques to assist an enterprise in defining evolution phases in the architecture. There is no ‘standard’ platform architecture. Each enterprise will have its own platform architecture based on the business model under which it operates. A platform is a grouping of services, people and roles. The key thing about a platform is that it is a “real” implementable thing. Its definition reflects the focus of an enterprise and its top-level approach to delivering service within the constraints imposed by a specific business model and the value chains in which the enterprise operates. A platform is a device to manage the complexity of an organization’s processes and IT infrastructure. Platforms are the building blocks of the enterprise architecture.
1. Business Process Framework
TM Forum’s Business Process Framework (commonly known as eTOM) is the industry’s common process architecture for both business and functional processes. eTOM drives down operational costs by analyzing all facets of an organization’s processes, thereby eliminating duplication, identifying missing process steps, expediting new development, and simplifying procurement. The TM Forum Business Process Framework serves as the master plan for process direction and provides a neutral reference point for internal process reengineering needs, partnerships, alliances, and general working agreements with other providers. The Business Process Framework has been incorporated into ITU-T Recommendation M.3050 and has been approved and published by ITU-T as an international standard.
The Business Process Framework represents a Service Provider's enterprise environment in a hierarchy of process elements that capture process detail at various levels. At the conceptual level, the framework has three major process areas, reflecting major focuses within typical enterprises:
• Strategy, Infrastructure, and Product, covering planning and lifecycle management
• Operations, covering the core of operational management
• Enterprise Management, covering corporate or business support management.
The Business Process Framework has multiple groupings for the processes that it contains:
• Vertical process groupings: Focus on end-to-end activities (for example, Assurance). Each vertical group links together the customer, supporting services, resources, and supplier/partners. Taken together, these vertical groupings represent a “lifecycle” view moving left to right across the Framework from the initial strategy for the products and their components, through development and delivery, and on into operations and billing.
• Horizontal process groupings: Focus on functionally related areas, like Customer Relationship Management. These groupings can be visualized as a “layered” view of the enterprise’s processes, moving from top to bottom, with the customers and products supported by the underlying services, resources, and (where relevant) interaction with suppliers and partners.
• Where a vertical process grouping and a horizontal process grouping intersect across the map, further process detail can be applied in either that horizontal or vertical context, according to the user’s needs.
The process structure in the Framework uses hierarchical decomposition, so that the business processes of the enterprise are successively decomposed in a series of levels that expose increasing detail. As an example, below is the process decomposition for Customer Relationship Management. Detailed description of this example is beyond the scope of the present document.
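The idea of hierarchical decomposition can be sketched as a simple tree, in which each process element holds the more detailed elements it decomposes into. The class and the process names below are illustrative assumptions chosen for this sketch; they are not the official eTOM decomposition or its level numbering.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class ProcessElement:
    """One node in a hierarchical process decomposition (hypothetical model)."""
    name: str
    children: List["ProcessElement"] = field(default_factory=list)

    def add(self, name: str) -> "ProcessElement":
        """Attach a more detailed sub-process and return it."""
        child = ProcessElement(name)
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, str]]:
        """Yield (depth, name) pairs, each parent before its children."""
        yield depth, self.name
        for child in self.children:
            yield from child.walk(depth + 1)


# Illustrative names only; depth here is relative to the root of this sketch.
operations = ProcessElement("Operations")
crm = operations.add("Customer Relationship Management")
crm.add("Order Handling")
crm.add("Problem Handling")

# Print the decomposition with one indentation step per level of detail.
for depth, name in operations.walk():
    print("  " * depth + name)
```

Each additional call to `add` exposes one more level of detail, which mirrors the way the Framework successively decomposes a process area into finer-grained process elements.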
Service Providers are increasingly incorporating IT-based services and therefore need to bring their IT and Business aspects closer together. To address this need, TM Forum together with itSMF community (whose members develop ITIL, originally the IT Infrastructure Library) have analyzed and defined integration of the two frameworks that leverages the best of both. As a result, the Business Process Framework has embedded direct support for ITIL processes by integrating these as Best Practices processes within the TM Forum Business Process Framework. One of the ways to use the Framework is to build process flows. While TM Forum does not mandate how to build process flows it does provide recommendations. The figure below illustrates a process flow fragment demonstrating how the Business Process Framework integrates with ITIL Best Practices (in this case for ITIL Change Management applied to a resource-oriented change) and includes low-level Business Process Framework elements mapped against a background of ITIL process steps.
Using the Framework for drafting or evaluating process flows is not the only benefit. It can serve as a master plan for process direction and provides a neutral reference point for internal process reengineering, partnerships, alliances, and general working agreements with other enterprises. It can be used as a standard structure, terminology, and classification scheme for analyzing an organization’s existing processes and for developing new processes. TM Forum is developing a Conformance Certification program against the Business Process Framework (eTOM).
2. Information Framework
TM Forum's Information Framework (known as SID – Shared Information and Data) provides a common reference model and vocabulary for Enterprise information that is used to describe management information. Information Framework provides concepts and principles needed to define a shared information model, the entities of the model, as well as the business-oriented UML class models (Unified Modeling Language), design-oriented UML class models, and sequence diagrams that enable implementation of TM Forum Frameworx conformant service oriented solutions. The Information Framework scope covers all of the information required to implement business processes in a Service Provider’s operations based on the Business Process Framework (eTOM) processes. It focuses on what are called “business entity” definitions and associated attribute definitions. A business entity is a thing of interest to the business, such as customer, product, service, or network. Its attributes are facts that describe the entity. In short, the Information Framework provides the model that represents business concepts and their characteristics and relationships, described in an implementation independent manner.
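The notion of a business entity described by attributes and linked to other entities can be sketched as a small data structure. This is a minimal illustration only: the class, field names and example values are hypothetical and are not taken from the SID model itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BusinessEntity:
    """A thing of interest to the business (hypothetical, simplified model)."""
    kind: str                      # e.g. "customer", "product", "service"
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)   # facts about the entity
    related: List["BusinessEntity"] = field(default_factory=list)

    def relate(self, other: "BusinessEntity") -> None:
        """Record a relationship from this entity to another one."""
        self.related.append(other)

    def describe(self) -> str:
        """Render the entity and its attributes in a stable order."""
        facts = ", ".join(f"{k}={v}" for k, v in sorted(self.attributes.items()))
        return f"{self.kind} '{self.name}' ({facts})"


# Example values are invented for illustration.
customer = BusinessEntity("customer", "Grid Control Centre", {"segment": "internal"})
service = BusinessEntity("service", "SCADA circuit", {"availability_target": "99.99%"})
customer.relate(service)

print(customer.describe())
for entity in customer.related:
    print("  uses", entity.describe())
```

Keeping the entity definition free of any database or protocol detail reflects the implementation-independent character of the model: the same definitions can be shared by business staff and software developers.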
As shown in figure xx, the framework is designed as a layered model, which partitions the shared information and data into eight domains. At the top layer (Level 1), each of the eight information domains is aligned with the Business Process Framework (eTOM). It enables segmentation of the total business problem into manageable pieces and allows resources to be focused on a particular area of interest. In other words, for a particular business process that is to be automated it is possible to identify the information within the Framework that is needed to support that process. The Information Framework Product and Service domains have been adopted by the ITU and are included in the ITU-T recommendation M.3190. The Information Framework can be used as a standalone framework, or in combination with the Business Process Framework (eTOM), in which case it creates a bridge between the business and the Information Technology (IT) groups by providing definitions that are understandable by the business, but are also rigorous enough to be used for software development. TM Forum is developing a Conformance Certification program against the Information Framework (SID) as well.
3. Application Framework
TM Forum's Application Framework (known also as TAM – Telecom Application Map) provides a common language between Service Providers and their suppliers to describe systems and their functions, as well as a common way of grouping them. It provides the bridge between Frameworx components, such as the Business Process and Information Frameworks, and real applications by grouping process functions and information data into recognized OSS and BSS applications or services. In areas such as Fulfillment, Assurance, and Billing, the Application Framework breaks out a growing number of functional areas, including:
• Customer Management
• Service Management
• Resource Management
• Supplier/Partner Management
• Enterprise Management
The Application Framework is not intended to be prescriptive or mandatory; it does, however, provide a ‘lens’ to compare current implementations with an idealized approach. Wherever possible, the Application Framework builds on TM Forum Frameworx, particularly the Business Process Framework (eTOM) and the Information Framework (SID). The Application Framework uses the same layering concepts as the Business Process and Information Frameworks. It also recognizes managed resources, including network-based resources, content servers, Intelligent Network platforms, and related network control technologies (such as element management systems), as well as the management applications infrastructure fabric (e.g., bus technology or business process management engines). It provides a different perspective from the process view or information view in these other frameworks, and allows a company to advance its insight into the system design and implementation aspects of its management solutions.
The high level view of the Application Framework is shown in figure A8.xx. Besides seven horizontal layers (consistent with the Information Framework SID), it is also divided into four vertical columns (consistent with the Business Process Framework eTOM). Each box on the map represents a level 1 real application such as Customer Order Management or Bill Calculation. The Application Framework is further decomposed into “lower levels” of functionality, which are beyond the scope of the present document. The complete listing of all available decompositions can be found in the Application Framework document (Document Number: GB929). One of the prime benefits of using the framework is the ability to identify and document the key attributes associated with each application, which enables Service Providers to understand fully the functionality they already have within their organization. With this evaluation of applications, the Service Provider has the information to make an informed choice. The key uses of the Application Framework include:
• Procurement: Service Providers can use it across the entire procurement process, from initial request for information, through systems comparison, to guidance for implementation.
• Product Positioning: It helps Suppliers position the systems they supply (TM Forum maintains a web-based Product and Services Directory).
• Streamlining IT/Operation Systems: It provides a map to rationalize and combine application stovepipes across multiple technologies and services (e.g., mobile or fixed).
• Mergers and Acquisitions: It provides a common vocabulary and structure against which merging organizations can map their systems.
• Outsourcing: The Framework can be used to define precisely the boundaries between interfacing applications, allowing more effective outsourcing of key functions.
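As an illustration of the procurement and streamlining uses listed above, the following minimal Python sketch compares a deployed system inventory against a small excerpt of an application map. It is not taken from GB929: the cell placements, the `compare_to_map` helper, the "Service Problem Management" entry and the system names are all assumptions made for the example. Only Customer Order Management and Bill Calculation are named in the text above.

```python
from collections import defaultdict

# Illustrative excerpt of the map: (layer, column) -> level 1 applications.
# The placements are assumptions for this sketch, not copied from GB929.
APPLICATION_MAP = {
    ("Customer Management", "Fulfillment"): ["Customer Order Management"],
    ("Customer Management", "Billing"): ["Bill Calculation"],
    ("Service Management", "Assurance"): ["Service Problem Management"],
}


def compare_to_map(deployed_systems):
    """Compare (system_name, application) pairs against APPLICATION_MAP.

    Returns a tuple (gaps, overlaps):
      gaps     - list of (cell, application) with no deployed system
      overlaps - dict of application -> sorted systems, where more than
                 one system implements the same application (stovepipes)
    """
    systems_by_app = defaultdict(list)
    for system_name, application in deployed_systems:
        systems_by_app[application].append(system_name)

    gaps = []
    overlaps = {}
    for cell, applications in APPLICATION_MAP.items():
        for application in applications:
            systems = systems_by_app.get(application, [])
            if not systems:
                gaps.append((cell, application))
            elif len(systems) > 1:
                overlaps[application] = sorted(systems)
    return gaps, overlaps


# Hypothetical inventory: two order systems (fixed and mobile) and one
# billing engine; nothing covers service problem handling.
inventory = [
    ("fixed-orders", "Customer Order Management"),
    ("mobile-orders", "Customer Order Management"),
    ("billing-engine", "Bill Calculation"),
]
gaps, overlaps = compare_to_map(inventory)
```

With the sample inventory, the sketch reports one uncovered application and one application served by two parallel systems, which is the kind of stovepipe the map is meant to expose.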
4. Integration Framework
For today’s Communications Service Provider, software rather than the network is the enabling function. This is driving the rate of service innovation to new levels. To cope, Service Providers are turning to software technologies, such as Service Oriented Architecture, and industry standards to gain business agility and flexibility. These are provided by the TM Forum Frameworx Integrated Business Architecture.
For further reference, the interested reader can refer to the following list of TM Forum documents and packs:
° RN303 Release notes for the Frameworx Release 8.1
° TR155 Frameworx Statement of Direction Release 8.1
° GB945-M Frameworx Implementation Methodology Release 8.1
° RN311 Business Process Framework R8-1 Release Notes
° GB921 Business Process Framework R8-1 Getting Started Pack
° GB921 Business Process Framework R8-1 Domain Addenda Pack
° TR143 eTOM and ITIL building bridges
° RN310 Information Framework (SID) R8-1 Release Notes
° GB922 Information Framework (SID) R8-1 Getting Started Pack
° GB922 Information Framework (SID) R8-1 Domain Addenda Pack
° RN315 Release Notes for Application Map Release 3.2
° GB929, Telecom Application (TAM) Map, Release 3.2
° RN316 Release Note for Integration Framework Release 2.1
° GB942-CP Integration Framework Concepts and Principles
° GB942-MAP Frameworx Mappings
° GB942-U Integration Framework User Guidelines
A9. List of Acronyms
Ac        Accounting (& Billing) Process (Management)
ADSS      All Dielectric Self Supporting Cable
AGC       Automatic Generation Control
As        Assurance Process (Management)
ATM       Asynchronous Transfer Mode
BCP       Business Continuity Plan
BSS       Business Support System
CAPEX     Capital Expenditure
CBR       Constant Bit Rate
CCTA      Central Computer & Telecommunication Agency (UK)
CIP       Critical Infrastructure Protection (NERC)
CMDB      Configuration Management Data Base
CMMI      Capability Maturity Model Integration
COBIT     Control Objectives for Information (and related) Technologies
CRM       Customer Relation Management
CWDM      Coarse Wavelength Division Multiplexing
DCS       Digital (Substation) Control System
DECT      Digital Enhanced (previously European) Cordless Telephone
DHCP      Dynamic Host Configuration Protocol
DM        Degraded Minutes
DMS       Distribution Management System
DNS       Domain Name System
DR        Disaster Recovery
DRP       Disaster Recovery Plan
DSL       Digital Subscriber Loop
DWDM      Dense Wavelength Division Multiplexing
EIA       Electronic Industries Alliance (US)
EMC       Electromagnetic Compatibility
EMS       Energy Management System
EMS       Element Management System
EoPDH     Ethernet over PDH
EOS       see EoSDH
EoSDH     Ethernet over SDH
EPR       Earth Potential Rise
EPU       Electrical Power Utility
ERP       Enterprise Resource Planning
ES        Errored Seconds
eTOM      Enhanced Telecom Operations Map
EU        European Union
FCAPS     Fault, Configuration, Accounting, Performance and Security Management
Fu        Fulfilment Process (Management)
GIS       Geographical Information System
GPRS      General Packet Radio Service
GUI       Graphical User Interface
HMI       Human Machine Interface
IAAS      Infrastructure as a Service
ICCP      Inter-Control Centre Protocol
ICT       Information and Communication Technology
IEC       International Electrotechnical Commission
IP        Internetwork Protocol
ISP       Internet Service Provider
ISO       Independent System Operator
ITIL      Information Technology Infrastructure Library
itSMF     IT Service Management Forum
ITU       International Telecommunication Union
KPI       Key Performance Indicator
LAN       Local Area Network
LLA       Logical Layered (Management) Architecture
LSP       Label Switched Path
MARS      Multiple Address Radio System
MPLS      Multi-Protocol Label Switching
MPLS-TP   MPLS Transport Profile
MSP       Multiplex Section Protection
N-CMDB    Network Configuration Management Data Base
NE        Network Element
NERC      North American Electric Reliability Corporation
NGN       Next Generation Networks
NGOSS     Next Generation Operation Support System
NMF       Network Management Forum (now Tele-management Forum)
NMS       Network Management System
NOC       Network Operation Centre
NSO/RSO   National (or Regional) System Operator
OAM       Operation, Administration and Maintenance
OGC       Office of Government Commerce (UK)
OHL       Overhead Line
OLA       Operational Level Agreement
OPEX      Operation Expenditure
OPGW      Optical Ground Wire Cable
OSPF      Open Shortest Path First
OSS       Operation Support System
OTDR      Optical Time Domain Reflectometer
OTN       Optical Transport Network
PAAS      Platform as a Service
PBB-TE    Provider Backbone Bridge – Traffic Engineering
P/C       Providers / Contractors
PDH       Plesiochronous Digital Hierarchy
PLC       Power Line Carrier
PMU       Phasor Measurement Unit
PSTN      Public Switched Telephone Network
PTT       Push to Talk
QoS       Quality of Service
RAS       Remedial Action Scheme
RIP       Routing Information Protocol
ROADM     Reconfigurable Optical Add Drop Multiplexer
ROI       Return on Investment
RSTP      Rapid Spanning Tree Protocol
RTU       Remote Terminal Unit
SAAS      Software as a Service
SAS       Substation Automation System
SCADA     Supervisory Control and Data Acquisition
S-CMDB    Service Configuration Management Data Base
SDH       Synchronous Digital Hierarchy
SES       Severely Errored Seconds
SIPS      System Integrity Protection Scheme
SIR       Service Initialization Request
SLA       Service Level Agreement
SNCP      Sub-Network Connection Protection (SDH)
SNMP      Simple Network Management Protocol
SOA       Service Oriented Architecture
SOC       Security Operational Centre
SPD       Surge Protection Device
STP       Spanning Tree Protocol
Su        Support Process (Management)
TASE      Telecontrol Application Service Element
TCP       Transmission Control Protocol
TDM       Time Division Multiplex
TETRA     TErrestrial (previously Trans-European) Trunked Radio
TMF       TeleManagement Forum
TMN       Telecommunication Management Network
TSO       Transmission System Operator
UDP       User (or Universal) Datagram Protocol
UPS       Uninterruptible Power Supply
UTC       Utilities Telecom Council (US)
U-Telco   Utility Telecommunication Company
uTOM      Utility Telecom Operations Map
VIU       Vertically Integrated Utility
VLAN      Virtual Local Area Network
VoIP      Voice over IP
VPN       Virtual Private Network
VSAT      Very Small Aperture (Satellite) Terminal
WAMS      Wide Area Monitoring System
WAP&C     Wide Area Protection and Control
WDM       Wavelength Division Multiplexing