What is Data Center Commissioning?
Data center commissioning (and commissioning in general) is a process that ensures buildings are delivered according to the Owner’s Project Requirements (OPR). ASHRAE says the OPR “is a written document that details the functional requirements of a project and the expectations of how it will be used and operated.”
Data centers are complex, industrial-scale facilities that may require as much energy as a small town, and their functional requirements are more extensive than those of average commercial facilities. Society has come to rely on computing resources and internet service as a matter of course, to the point where any downtime at all is considered unacceptable. Some systems that depend on uninterrupted, 24/7 operation are tied to life or safety. For others, losing the computing resources inside a data center would at the very least have disastrous financial implications (for example, when Amazon servers go down, the company loses an estimated $1,104 in sales for every second of downtime).
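As a rough illustration of the scale involved, the per-second figure quoted above can be extended to longer outages. This is a minimal sketch that assumes the loss rate stays constant for the whole outage, which is a simplification. The function name and the outage durations are illustrative and do not come from any published method.

```python
# Estimated sales lost per second of downtime, as quoted in the text above.
LOSS_PER_SECOND_USD = 1_104


def estimated_loss_usd(outage_seconds: float,
                       loss_per_second: float = LOSS_PER_SECOND_USD) -> float:
    """Linear estimate: assumes the loss rate is constant over the outage."""
    if outage_seconds < 0:
        raise ValueError("outage_seconds must be non-negative")
    return outage_seconds * loss_per_second


# Hypothetical outage durations, chosen only for illustration.
for label, seconds in [("1 minute", 60), ("1 hour", 3_600)]:
    print(f"{label}: ${estimated_loss_usd(seconds):,.0f}")
```

Under that constant-rate assumption, a one-minute outage works out to $66,240 and a one-hour outage to just under $4 million.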
Data centers are the heart of these systems and the internet itself, so these facilities require design, construction, and commissioning processes that are more rigorous than what’s usually deployed in commercial construction.
This article focuses on the facilities themselves – the power, cooling, support systems, the physical structures that keep the environment for the computing infrastructure inside them running, and what it takes to implement a successful commissioning program for a data center.
Functional Requirements for Data Centers: Considerations
These are functional requirement considerations that are particularly relevant to the scope of data center commissioning.
Power is usually the largest cost in data center operations. The fault tolerance and resiliency needed to maintain uninterrupted power require a multi-faceted approach.
- Parallel electric utility sources.
- Backup power sources, such as generators that run on a consumable fuel source (diesel, fuel oil, etc.) that can be stored and delivered to the site.
- Uninterruptible power supplies (UPS) that use batteries, flywheels, etc. to carry the load of the building during a transition from one power source to another.
- System design as it relates to redundancy, fault tolerance, failover, and maintenance.
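One way to reason about the redundancy item above is to ask whether the remaining units can still carry the critical load after one or more units fail or are taken out for maintenance. The sketch below is a simplified capacity check that assumes identical units and ignores distribution paths. The function name, unit ratings, and load are hypothetical.

```python
def survives_unit_loss(unit_capacity_kw: float, unit_count: int,
                       critical_load_kw: float, units_lost: int = 1) -> bool:
    """Return True if the remaining units can still carry the critical load.

    Simplified model: identical units, capacities simply add up, and the
    distribution path to the load is assumed to stay intact.
    """
    if units_lost < 0 or units_lost > unit_count:
        raise ValueError("units_lost must be between 0 and unit_count")
    remaining_kw = unit_capacity_kw * (unit_count - units_lost)
    return remaining_kw >= critical_load_kw


# Hypothetical example: 1,200 kW critical load served by 500 kW generators.
print(survives_unit_loss(500, 3, 1200))  # three units installed, one lost
print(survives_unit_loss(500, 4, 1200))  # four units installed, one lost
```

With these assumed numbers, three units cannot tolerate the loss of one (1,000 kW remaining), while four units can (1,500 kW remaining). A real design review would also need to consider the distribution path, not just total capacity.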
The computing resources in a data center generate a lot of heat. Keeping equipment at optimal temperatures ensures it works properly and doesn’t fail. Reliable and redundant cooling systems are another part of maintaining uptime.
- Climate. The equipment and its configuration should be appropriate for the climate and take advantage of “free cooling” when available.
- The operating ranges of the computing resources that are planned to be used.
- The density (or lack thereof) of computing resources in server racks. This can determine whether a traditional row (hot/cold aisle) air distribution design, or a more direct rack distribution system may work best.
- Server rack density as it relates to a need for continuous cooling. In high-density server racks, some studies show server temperatures rising from 72 deg. F. to 90 deg. F. in 75 seconds without continuous cooling (the cooling-system equivalent of an uninterruptible power supply).
- System design as it relates to redundancy, fault tolerance, failover, and maintenance.
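The temperature figures quoted above imply an average rise rate, which gives a rough estimate of how long equipment has before it reaches a limit once cooling stops. This sketch assumes the rise is linear, which real rooms will not follow exactly. The 80 deg. F limit in the usage is a hypothetical threshold, not a value from the studies mentioned.

```python
def rise_rate_f_per_s(start_f: float, end_f: float, elapsed_s: float) -> float:
    """Average temperature rise rate in deg. F per second."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (end_f - start_f) / elapsed_s


def seconds_until_limit(current_f: float, limit_f: float,
                        rate_f_per_s: float) -> float:
    """Time to reach limit_f at a constant rise rate (linear assumption)."""
    if rate_f_per_s <= 0:
        raise ValueError("rate_f_per_s must be positive")
    return max(0.0, (limit_f - current_f) / rate_f_per_s)


# Figures quoted in the text: 72 deg. F to 90 deg. F in 75 seconds.
rate = rise_rate_f_per_s(72, 90, 75)
print(f"Average rise: {rate:.2f} deg. F per second")

# Hypothetical limit of 80 deg. F, chosen only for illustration.
print(f"Time to reach 80 deg. F: {seconds_until_limit(72, 80, rate):.1f} s")
```

The quoted figures correspond to an average rise of 0.24 deg. F per second. At that rate the hypothetical 80 deg. F limit would be reached in about 33 seconds, which is why continuous cooling matters at high rack densities.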
Not every region is suitable for data centers. The following considerations may be used when deciding where a data center will be built. Commissioning requirements related to location are less relevant than those for primary power and mechanical systems, but like any system, a data center will have unique characteristics and performance based on its environment.
- Extreme weather patterns and geological risks.
- Flight paths.
- Climate. Data centers require large amounts of cooling, making some climates better than others.
- Availability, reliability, and cost of regional electric utilities. Many data centers have incoming power from multiple utilities.
- Availability, reliability, and cost of the backhaul network connections. Without reliable network connections to the facility, the computing resources inside the data center would of course be of no value to the systems that rely on them.
- Relative location to other data centers. Many computing systems that data centers will support are critical enough to require geo-redundancy, where a failure in one data center will cause mirrored computing resources in another coordinated data center to take on the workload.
- Proximity to users of the computing resources. Some systems require extremely low latency and the distance network traffic has to travel to end users can be a consideration.
- Regulatory requirements. Some government agencies require data to be stored within certain geographical boundaries.
- Political stability of the region.
Automation and control systems are what bring equipment together, causing it to operate in unison and to respond to events and actions from other equipment and systems, environmental conditions, or operators.
The following items must be carefully coordinated among engineering disciplines and construction trades. Focusing on the following key components is a good start.
- Sequences of operation for all systems.
- Control system software design practices and process.
- Power monitoring system hardware and software.
- Cooling systems control hardware and software.
- Control system networking hardware and configuration. This can be a common shortcoming in both control system design and implementation by a control system installer, since networking fundamentals haven’t traditionally been a core competency of either of these types of entities (although this is slowly changing).
- Interconnectedness of control systems and components from different manufacturers. Open and common protocols don’t always guarantee proper interface – it’s still up to each manufacturer to implement protocol specifications correctly, and then up to the engineers and technicians from the installer to program and configure them properly.
- Electrical power for control systems. Control system and networking hardware must be fed from uninterruptible power sources.
- Availability and responsiveness of local support for control systems.
- Competency of data center staff for the installed control systems.
- A common interface for all control systems. This common interface doesn’t need to bring ALL data into one place, but it should have dashboards and key, critical operating data and alarms, with links and ways to quickly access each individual system when full reports and detailed data are required.
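The common-interface idea in the last item can be pictured as a thin summary layer: each system keeps its own detailed data, and the dashboard only surfaces the critical alarms per system. The sketch below is a hypothetical illustration of that shape. The `Alarm` fields, severity names, and system names are assumptions, not any vendor's data model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    system: str    # e.g. "power monitoring" or "cooling controls" (hypothetical)
    severity: str  # "critical", "warning", or "info" (hypothetical levels)
    message: str


def critical_counts(alarms: list[Alarm]) -> dict[str, int]:
    """Count critical alarms per system for a top-level dashboard view."""
    return dict(Counter(a.system for a in alarms if a.severity == "critical"))


# Hypothetical alarms collected from separate control systems.
alarms = [
    Alarm("power monitoring", "critical", "UPS on battery"),
    Alarm("cooling controls", "warning", "filter differential pressure high"),
    Alarm("cooling controls", "critical", "chilled water supply temperature high"),
    Alarm("power monitoring", "critical", "generator failed to start"),
]
print(critical_counts(alarms))
```

Warnings and informational alarms stay in the individual systems in this sketch. Only the critical counts are rolled up, which mirrors the point that the common interface does not need to hold all data.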
Access to a data center must be controlled and limited to qualified, authorized individuals. Depending on the systems running on the servers in a data center, these access restrictions may even be governed by laws and regulations set by governments or their agencies.
There are general considerations that will dictate ALL functional requirements.
- Budget (obviously). Each incremental improvement in redundancy and reliability comes with a higher initial investment and potentially higher operational expenses.
- Tier certification. The Uptime Institute publishes what’s considered the international standard for data center performance.
  - The four tiers of certification “match a particular business function and define criteria for maintenance, power, cooling and fault capabilities.”
  - Each tier will add reliability to the data center, but will also increase complexity and initial costs.
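To make the trade-off between reliability and cost more concrete, it can help to translate an availability target into hours of downtime per year. The sketch below does only that arithmetic. The availability percentages are hypothetical examples and are not the Uptime Institute’s tier definitions.

```python
HOURS_PER_YEAR = 8_760  # 365-day year


def annual_downtime_hours(availability_percent: float) -> float:
    """Hours of downtime per year implied by an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


# Hypothetical availability targets, chosen only for illustration.
for target in (99.9, 99.99):
    print(f"{target}% available: {annual_downtime_hours(target):.2f} hours/year")
```

Moving from 99.9% to 99.99% in this example cuts the implied downtime from about 8.76 hours to under one hour per year, which is the kind of improvement that has to be weighed against added complexity and cost.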
Commissioning for Data Centers
Commissioning for highly resilient facilities like data centers may vary in the level of rigor applied compared to a commissioning program for any other facility. There are five levels of commissioning usually implemented within data center commissioning programs (these levels are not limited to data center commissioning, but are more commonly used in this context).
Some differences in data center commissioning:
- Higher witness test rates, less sampling.
- Multiple tests and multiple iterations of these tests – every potential scenario or condition must be tested.
- Higher level of test complexity.
- Witness testing of all operation, failure and maintenance scenarios.
- Integrated Systems Test.
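The emphasis on testing every scenario rather than a sample can be sketched as building a full test matrix from the conditions that matter. This is a minimal illustration. The dimensions and their values are hypothetical, and a real program would take them from the OPR and the sequences of operation.

```python
from itertools import product

# Hypothetical test dimensions, for illustration only.
power_sources = ["utility A", "utility B", "generator"]
load_levels = ["25%", "50%", "100%"]
events = ["normal operation", "component failure", "maintenance bypass"]


def build_test_matrix(*dimensions):
    """Return every combination of the given dimensions (no sampling)."""
    return list(product(*dimensions))


matrix = build_test_matrix(power_sources, load_levels, events)
print(f"{len(matrix)} scenarios to witness")
print(matrix[0])
```

Three values in each of three dimensions already produce 27 scenarios, which shows how quickly full coverage grows compared with a sampled approach.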
The five levels cover the parts of a commissioning program that mainly occur during what’s considered the construction phase of commissioning. A complete commissioning program includes design phase activities as well. We have seen the five-level approach used with a sixth level, sometimes identified as level zero, that covers design reviews, OPR development, and other early-stage engagement. What it’s called isn’t important, but it is important that the commissioning program spans all design, construction, and post-occupancy phases.
Level One Commissioning
Level one commissioning covers submittal reviews and factory witness testing. Factory witness testing involves mockups of equipment or systems in controlled environments, often at the manufacturer or vendor’s factory – hence the name. Testing in this environment makes it easier and less expensive to find and fix design and implementation issues with equipment and software. It also allows asynchronous testing of many different systems while the data center itself is still under construction, aiding in overall timeline efficiency.
Level Two Commissioning
Level two commissioning is site acceptance inspection. Equipment is inspected to ensure compliance with all design criteria, is not damaged, and that there is a proper storage plan in place.
Level Three Commissioning
Level three commissioning covers installation inspections, sometimes referred to as pre-functional testing. Both the contractors and the commissioning agent verify that all equipment is installed properly and that the installation meets design and operational standards. Equipment is started for the first time to check proper, independent operation. Testing is repeated after corrections are made to any equipment that fails testing.
Level Four Commissioning
Functional performance testing happens during level four commissioning. Functional performance testing often applies to either individual components or equipment, or tightly coupled components and equipment (e.g. a chiller, its pumps, and its cooling tower(s)). During this phase, each control loop is checked, actual operation is compared to designed sequences of operation, and performance is observed. Setpoint adjustments may be made as necessary. Operational issues are uncovered during this phase.
This level of commissioning begins to involve more people and firms, disciplines, and entities working together than other phases, at least with respect to testing.
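The comparison of actual operation against design during functional performance testing can be sketched as a tolerance check on each control loop. The example below is a simplified illustration. The class name, loop names, setpoints, measured values, and tolerances are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LoopCheck:
    name: str
    setpoint: float   # design value from the sequence of operation
    measured: float   # value observed during the test
    tolerance: float  # allowed deviation, in the same units

    def passes(self) -> bool:
        """True if the measured value is within tolerance of the setpoint."""
        return abs(self.measured - self.setpoint) <= self.tolerance


def failed_loops(checks: list[LoopCheck]) -> list[str]:
    """Names of loops that need correction and a retest."""
    return [c.name for c in checks if not c.passes()]


# Hypothetical observations, for illustration only.
checks = [
    LoopCheck("chilled water supply temp (deg. F)", 44.0, 44.6, 1.0),
    LoopCheck("supply air temp (deg. F)", 65.0, 68.2, 2.0),
]
print(failed_loops(checks))
```

Here the supply air loop is 3.2 deg. F off its setpoint with a 2.0 deg. F tolerance, so it would be flagged for adjustment and retesting. A real test would also look at the loop's response over time, not just a single reading.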
Level Five Commissioning
This is the final phase of testing (aside from any seasonal testing that was not performed earlier), and usually focuses on building-wide response to major events, like suddenly disconnecting utility power.
It’s at this point that the response from all systems (power, cooling, general HVAC, etc.) must be proven to work together in unison and prevent any interruptions to the operation of the data center.
In summary, level five commissioning demonstrates the performance of the facility as a whole against all design criteria. Systems are operated at various loads and in various modes to demonstrate proper response to equipment failures and utility problems.
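For the utility-loss event mentioned above, one basic pass condition is that the UPS carries the load for at least as long as it takes the backup source to start and the load to transfer. The sketch below expresses that single condition. The function name and all timing values are hypothetical, and a real integrated systems test would measure these times rather than assume them.

```python
def load_stays_energized(ups_runtime_s: float, generator_start_s: float,
                         transfer_s: float, margin_s: float = 0.0) -> bool:
    """True if UPS runtime covers generator start plus transfer, with margin."""
    if min(ups_runtime_s, generator_start_s, transfer_s, margin_s) < 0:
        raise ValueError("all durations must be non-negative")
    bridge_needed_s = generator_start_s + transfer_s + margin_s
    return ups_runtime_s >= bridge_needed_s


# Hypothetical timings, for illustration only.
print(load_stays_energized(ups_runtime_s=300, generator_start_s=15,
                           transfer_s=5, margin_s=60))
print(load_stays_energized(ups_runtime_s=20, generator_start_s=15,
                           transfer_s=10))
```

The first case leaves ample runtime (300 seconds available against 80 needed), while the second falls short (20 seconds against 25). A shortfall like this is exactly what a building-wide test is meant to expose before the facility goes live.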
Streamline Data Center Commissioning Projects
BlueRithm provides software for managing data center commissioning.