The electrical grid is one of our nation’s most important infrastructure assets. Every aspect of our economy and virtually every aspect of modern living depend on the reliable flow of electricity into our homes and businesses. A system failure due to a cyberattack, especially during severe weather or another disruptive event, can have devastating impacts at local and regional levels.
As utilities rush to restore service during an outage, they need confidence that the system can be restored to a “known good state,” meaning that a system or process starts from and operates in a verifiable and acceptable condition. This confidence depends in large part on a utility’s ability to identify intentional or unintentional changes to operational programs or equipment settings, which could cause additional damage or prolong outages if left undetected. Utilities must therefore maintain a high level of trust in their systems to ensure the return to a known good state.
Effective cybersecurity requires multiple layers of defense to protect the core of an operation from unauthorized intrusions and activities. Common defense-in-depth applications include behavioral policies, firewalls, intrusion detection systems, and patch management processes. Trust-based controls can enhance cybersecurity and improve overall network resilience.
Establishing Trust
Today’s electrical grid comprises many different assets that work together to control the flow and delivery of power. The utility relies upon each piece of equipment to perform a specific function, and often that equipment is remotely located. Although the operator implicitly trusts the equipment to continuously perform in the intended manner, the possibility exists that an individual, whether maliciously or inadvertently, might modify the equipment settings or operating program, resulting in damaged assets, extended outages, or compromised safety.
This raises several questions fundamental to the establishment of trust:
- Provenance: Who built the equipment? Who delivered it? Who installed it?
- Management: Who manages it? Who might have tampered with it or modified it?
- Status: Is the equipment patched? Is there a virus? Is there a rootkit?
These issues concern supply chain management and the operation of equipment installed in the grid. Utilities control their supply chains and allow only authorized, trained personnel to install and maintain the equipment. Utilities also conduct system performance tests to ensure that components and systems are operating correctly after installation and whenever systems are modified. However, these operational tests are often not enough to ensure the system is secure.
Secure operations require knowledge that the equipment is both configured correctly and operating correctly. If both conditions hold, the utility can have confidence in the trust level of the grid and will know that a specific level of security is in place to help defend against intrusions and unexpected events.
Determining Consistent Operation of Equipment
When a utility powers up complex equipment, the operator must have confidence that the asset will consistently perform in a known good state. To achieve this, modern equipment often includes a trusted platform module (TPM) to control the device’s boot sequence. A TPM is an integrated circuit that measures the software resident in the equipment when it starts.
At power-on, a TPM measures and validates the startup code by taking a “hash” of the file and comparing it to a known good hash before allowing program execution. A cryptographic hash function is an algorithm that takes a block of data, such as a file or program, and returns a fixed-length value that is, for practical purposes, unique to that data, similar to a long serial number. The hash value therefore acts as a fingerprint of the code on the device: any subsequent change to the data or program results in a different hash value. By comparing the measured hash value to the known value, the TPM can identify code modifications and alert the utility operator, who in turn will determine whether the equipment should be permitted to come online. TPM functionalities can ultimately improve a utility’s situational awareness and strengthen the resilience of its networks to a range of threats.
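The hash comparison described above can be illustrated with a minimal sketch. It is a software illustration only, not how a TPM is implemented: a real TPM performs its measurements in hardware. The choice of SHA-256, the function names, and the firmware contents are all assumptions made for this example.

```python
import hashlib
import hmac


def measure(image: bytes) -> str:
    """Return the SHA-256 digest of a code image as a hex string."""
    return hashlib.sha256(image).hexdigest()


def verify_startup_code(image: bytes, known_good_hash: str) -> bool:
    """Compare the measured hash of an image with the recorded known good hash."""
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(measure(image), known_good_hash)


# Hypothetical firmware contents, for illustration only.
approved_image = b"relay-firmware v1.0"
known_good = measure(approved_image)  # recorded when the image is approved

tampered_image = b"relay-firmware v1.0 + unauthorized change"

print(verify_startup_code(approved_image, known_good))  # True
print(verify_startup_code(tampered_image, known_good))  # False
```

Even a small change to the image yields a different digest, which is what allows the mismatch to be flagged to the operator before the equipment comes online.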
Configuration Control
Controlled equipment configuration consists of two activities: establishing operational parameters and updating embedded software. Equipment operators need to know if and when operational parameters change and fall out of tolerance limits. Utilities also need the capability to remotely and automatically update embedded software as security patches become available.
Currently, most utilities manually configure and update embedded software, either at their operations center or by sending a technician to the field. This is a slow and costly process that often results in multiple revision levels running across a utility’s equipment base. Historically, utilities have been slow to implement updates.
By deploying a modern two-way communications network, a utility can remotely configure a device’s operational parameters and continuously monitor that equipment for anomalies or changes to settings. If such events occur, the control center is automatically notified and an operator is assigned to determine a course of action.
In this scenario, the operations center maintains the configuration of all devices in a central secure database. As unauthorized change alerts are received, utility staff can determine the appropriate next steps, including remotely pushing the correct configuration settings back out to the device. If an authorized technician changes the device in the field, the utility staff in the control center can pull the configuration from the reprogrammed device and update the central repository with the new configuration. As with TPM functionalities, configuration control can improve utility response time to unexpected events and improve system resiliency.
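The push and pull workflow described above can be sketched as follows. This is a simplified illustration under stated assumptions: the `ConfigRepository` class, the device name, and the setting names are hypothetical, and a real system would also authenticate the device and the technician before acting.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigRepository:
    """Central store of the approved configuration for each device (hypothetical)."""
    approved: dict = field(default_factory=dict)

    def find_drift(self, device_id: str, reported: dict) -> dict:
        """Return {setting: (approved_value, reported_value)} for each mismatch."""
        expected = self.approved.get(device_id, {})
        keys = expected.keys() | reported.keys()
        return {
            k: (expected.get(k), reported.get(k))
            for k in sorted(keys)
            if expected.get(k) != reported.get(k)
        }

    def reconcile(self, device_id: str, reported: dict, authorized: bool):
        """Decide what to do with a reported configuration.

        Returns (action, settings the device should end up running).
        """
        if not self.find_drift(device_id, reported):
            return "in_sync", dict(reported)
        if authorized:
            # Authorized field change: pull it into the central repository.
            self.approved[device_id] = dict(reported)
            return "repository_updated", dict(reported)
        # Unauthorized change: push the approved settings back to the device.
        return "pushed_approved_config", dict(self.approved.get(device_id, {}))


repo = ConfigRepository({"recloser-17": {"trip_current_a": 400, "reclose_attempts": 3}})
reported = {"trip_current_a": 650, "reclose_attempts": 3}

print(repo.find_drift("recloser-17", reported))
action, settings = repo.reconcile("recloser-17", reported, authorized=False)
print(action, settings)
```

In this sketch the decision about whether a change was authorized is passed in as a flag; in practice that decision would rest with the operator assigned to the alert.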
Software Updates
Vendors periodically release updates to the software programs that run utility systems and equipment. As in the case of current configuration control practices, utilities typically dispatch a technician to the field to manually update the devices. This leads to a similar problem of multiple software versions running across a utility’s equipment base.
Using the same communications network described above, utilities can remotely update field devices from the control center. This process can be secured through a combination of vendor-specific private signing keys and embedded public key certificates. Specifically, a software vendor digitally signs a software update with its private key. Using the vendor’s public key, the utility can verify that the software update came from that vendor and was not altered in transit. The vendor also embeds the public key certificates of authorized users in the hardware prior to shipping. This use of certificates enables a device to verify a user’s signature before accepting a software update, thereby introducing an additional trust-based control to the utility’s operations.
To avoid the simultaneous operation of inconsistent software versions, the system that updates the embedded software must function in a fully transactional mode. This allows an operator to specify a group of devices to be updated with a single software package. At the end of the updating process, all equipment will be running the same revision level; however, if one device fails to update, all of the devices roll back to their previous version to ensure consistent and reliable operations.
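The all-or-nothing update behavior described above can be sketched as follows. This is a minimal illustration, not a description of any vendor’s update system: the `Device` class, the `healthy` flag used to simulate a failed installation, and the version strings are hypothetical, and signature verification of the package is omitted.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    version: str
    healthy: bool = True  # hypothetical flag: False simulates a failed install

    def install(self, new_version: str) -> None:
        if not self.healthy:
            raise RuntimeError(f"{self.name}: update failed")
        self.version = new_version


def update_group(devices: list, new_version: str) -> bool:
    """Update every device in the group, or leave all of them unchanged."""
    previous = {d.name: d.version for d in devices}  # snapshot for rollback
    try:
        for d in devices:
            d.install(new_version)
    except RuntimeError:
        # One device failed: restore every device to its previous version.
        for d in devices:
            d.version = previous[d.name]
        return False
    return True


fleet = [
    Device("relay-a", "1.0"),
    Device("relay-b", "1.0", healthy=False),
    Device("relay-c", "1.0"),
]
ok = update_group(fleet, "1.1")
print(ok, [d.version for d in fleet])  # False ['1.0', '1.0', '1.0']
```

Here the first device installs the new version before the second one fails, and the rollback returns it to its prior version so the group stays on a single revision level.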
Roadmap Recommendations
Utility operators should consider establishing trust in the equipment on their systems in order to improve grid resiliency. While trust-based controls are typically designed to defend against cyber-based threats, these same controls can substantially enhance a utility’s ability to detect and recover from equipment anomalies or system integrity problems, especially during weather-related events.
To establish an appropriate level of trust, utilities should focus their efforts on three activities:
- Ensure that equipment consistently starts in a known good state through the use of TPM and software verification techniques.
- Deploy an automated secure communications network to control and update equipment operational configurations.
- Utilize the secure communications network to conduct transaction-based software updates of field devices.