Maintaining Yale IT’s commitment to Service Quality in response to fall outages

January 12, 2023

Sunday, January 8, marked the successful culmination of over three months of analysis and planning after outages impacted Yale’s wired and wireless computer network. On that day, Yale IT and over 90 partners spent seven and a half hours conducting resilience testing to reduce the potential for widespread future outages and maintain IT’s commitment to Service Quality. The Schwarzman Center and Klein Tower were tested first due to recent outages; all remaining network locations will follow in the year ahead.

Yale’s network standard was designed with redundancy and recoverability in mind: if one major distribution location fails, another is intended to back it up. In September, the design did not work as intended, and this exercise confirmed that actions taken since those incidents had resolved all remaining defects. Overall, our heightened standards and increased institutional dependency on IT services require not only design and execution, but regular testing. In the case of our core network, this requires validating the electrical supply, network, and infrastructure components to ensure that the expected results match the actual results.

On Sunday, a team ran six exercises in each major distribution room to validate that the design and implementation functioned correctly. Through this process, the team progressed through its dependency map, identifying potential risks, troubleshooting issues in the field, assigning levels of importance, and labeling each variable. All material issues observed were corrected in the field and retested. As a result of this process, IT addressed all known issues and significantly increased our confidence in the reliability of the design during any potential future event.

An example of One IT at its finest, Sunday’s resilience testing required the partnership of many people, including:

  • New Haven partners: the City of New Haven, New Haven Fire Department, and New Haven Police Department
  • Yale partners: Emergency Management, Facilities, Information Security, Professional Schools including The Yale School of Management, Public Safety, and Yale Health
  • Vendors: A/C, electrical, and Uninterruptible Power Supply (UPS) vendors
  • IT and distributed IT partners: NGN, YCRC, and teams including Disaster Recovery and Resiliency, Hosting and Technology Services, Information Security, Managed Services, the Network team, IT SLT members, and more

Following the day-long event, John Barden celebrated the teams’ progress, “especially given the coordination across so many groups.” Sunday’s testing demonstrated the power of partnership and One IT, confirmed the benefits of designing with resiliency in mind, and laid the groundwork for enhancements to IT processes, ultimately improving Service Quality.
