Pioneering AI projects and promoting data governance

December 14, 2023

With a growing number of publicly available Artificial Intelligence (AI) tools and a rapidly changing technology space, Yale quickly recognized the need to provide its community with guidelines. This led to the Data Governance Executive Council, Provost Scott Strobel, SVP for Operations Jack Callahan, and VP for Information Technology and Campus Services John Barden sharing guidance on generative AI in an email on September 20, 2023.

This communication signaled the beginning of a focus on AI at Yale. In fact, Artificial Intelligence is one of IT’s FY24 Balanced Scorecard (BSC) Initiatives, and the BSC AI team carefully selected its first AI project—a chatbot that uses data from Yale Dining. The chatbot, askYale, launched on November 9 and is the first public AI tool released by Yale.

Associate Director of YaleSites Franz Hartl, part of the development team, emphasized the university’s dedication to responsible AI development. He stated, “While we are deeply committed to fostering open, safe, and responsible AI, we also recognize the importance of developing a rigorous data governance approach. That approach will ensure our AI solutions are as trustworthy as they are innovative.”

Hosted on the new YaleSites platform and designed to help the Yale community navigate dining on campus, the chatbot uses a retrieval-augmented generation (RAG) application to generate answers to queries.

For example, “Where can I get pizza today?” generated this response: 

I’m glad you’re interested in pizza! Here are the places where you can get pizza today: 

ESM’s Garlic & Ricotta Pie is being served for Brunch/Lunch, Dinner (Pizza) on 11/30/2023 at the dining hall locations Ezra Stiles & Morse. [1]
Please note that the availability of pizza options may vary. It’s always a good idea to check the daily menu for the most up-to-date information. Enjoy your pizza!
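
For readers curious about what a RAG application does, here is a minimal Python sketch of the two steps that come before the answer is generated: retrieving relevant records and placing them in the prompt. It is an illustration only and does not describe how askYale is actually built. The `MenuItem` fields, the keyword-overlap scoring, the prompt wording, and the "Lentil Soup" record are all assumptions made for this example; only the pizza record mirrors the response quoted above. Real systems typically use vector search rather than word overlap, and the final step (sending the prompt to a language model) is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MenuItem:
    """One hypothetical menu record; the field names are assumptions."""
    name: str
    station: str
    location: str
    date: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def retrieve(query: str, items: list[MenuItem], top_k: int = 2) -> list[MenuItem]:
    """Retrieval step: rank records by the number of words shared with the query."""
    query_words = tokenize(query)
    scored = []
    for item in items:
        overlap = len(query_words & tokenize(f"{item.name} {item.station}"))
        if overlap > 0:
            scored.append((overlap, item))
    # Highest overlap first; the sort is stable, so ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:top_k]]


def build_prompt(query: str, retrieved: list[MenuItem]) -> str:
    """Augmentation step: put the retrieved records in front of the question."""
    lines = [
        f"[{i}] {item.name} ({item.station}) at {item.location} on {item.date}"
        for i, item in enumerate(retrieved, start=1)
    ]
    context = "\n".join(lines) if lines else "(no matching menu records)"
    return (
        "Answer using only the numbered menu records below and cite them.\n"
        f"Records:\n{context}\n"
        f"Question: {query}"
    )


# Sample data: the first record mirrors the quoted response; the second is made up.
MENU = [
    MenuItem("Garlic & Ricotta Pie", "Pizza", "Ezra Stiles & Morse", "11/30/2023"),
    MenuItem("Lentil Soup", "Soup", "Example Hall", "11/30/2023"),
]

question = "Where can I get pizza today?"
hits = retrieve(question, MENU)
prompt = build_prompt(question, hits)
print(prompt)
```

Numbering the retrieved records in the prompt is one way a model could produce a citation marker like the one at the end of the response above.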

This chatbot required various groups to work closely to achieve the desired result. Greg Locke designed the architecture, Franz Hartl created the application, and the data governance team built the process to vet and approve it.

What is data governance?

Data governance ensures good stewardship of Yale’s institutional data, coordinating organizational efforts to provide and use high-quality data consistently, meaningfully, and securely.  While readers may be familiar with ISO or the Infrastructure Services team, data governance may be less familiar.

Re-started after a hiatus in 2022, the Administrative Data Governance group handles questions that fall into roughly three categories:

  • Decision support. Data stewards can decide on data requests within their domain or refer complex and challenging questions to Data Governance. 
  • Project enablement. For complex projects involving multiple data domains or those seeking to use existing data in a new way, data stewards and other stakeholders work together to enable projects.
  • New processes and guidelines. Examples include the Guidelines for Student Use of Institutional Data and the AI guidelines.

Data Governance has a five-level structure composed of sponsors, the Data Governance Executive Council, data stewards, data managers, and data consumers, as follows:

  • Sponsors include John Barden, Joe Crosby, Arnim Dontes, and Chad Losee.
  • Data Governance Executive Council members constitute the primary working group for data governance and meet every other week. The Council provides broad representation across Yale and includes:
    • Heather Abbott, Registrar, Yale Law School
    • Kathleen Broderick, Director, Enterprise Data & Analytics
    • Susan Bouregy, Chief Privacy Officer
    • Angela Eisele, Assoc. Dir., Financial Reporting & Analytics
    • Kathleen Galo, Data Governance Lead
    • Alyssa Greenwald, Associate General Counsel
    • Shonna Marshall, University Registrar
    • Frank Mathew, Deputy CIO
    • Wies Rafi, Assoc. CIO, Health Sciences
    • Susan Riggs, AVP Human Resources Operations 
    • Lily Svensen, Director, Office of Institutional Research   
  • Data stewards (14 in all) represent unique data domains at Yale, approve and prioritize data requests, and sometimes refer cases to the Council for review and decisions. 
  • Data managers are subject matter experts within a domain, appointed by their respective data stewards.
  • Data consumers (members of the Yale community) have not yet been brought into the group’s efforts, but planning for this is underway.

One project that data governance has been actively working on is data retention. In the past, Yale relied on a comprehensive list from the Office of the General Counsel (OGC) that compiles legal minimums for retaining over 200 data types. Because the list set only minimums, data was often kept indefinitely. Going forward, legal minimums will be enforced as the limit for maintaining active data in systems. By shifting focus to archiving older data and systems, the team will reduce the risk of keeping data longer than necessary. As part of this change, a team completed a successful proof-of-concept (POC) archival of a copy of Resumix, a legacy job application system that was retired in 2006. The goal is for additional legacy systems to follow suit eventually.
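
To make the retention rule concrete, the short Python sketch below treats the legal minimum as the cutoff for keeping a record active. It is an illustration under stated assumptions, not Yale's implementation: the data type names and year counts in `RETENTION_YEARS` are invented placeholders (the OGC list is not reproduced here), and `add_years` and `should_archive` are hypothetical helper names.

```python
from datetime import date

# Hypothetical retention minimums in years. These values are placeholders,
# not figures from the OGC list.
RETENTION_YEARS = {
    "job_application": 3,
    "payroll_record": 7,
}


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later, mapping Feb 29 to Feb 28."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # Only Feb 29 can fail here, when the target year is not a leap year.
        return start.replace(year=start.year + years, day=28)


def should_archive(data_type: str, created: date, today: date) -> bool:
    """True once a record has been active for its full retention period."""
    limit = add_years(created, RETENTION_YEARS[data_type])
    return today >= limit


# A record from a system retired in 2006 is well past a 3-year period.
print(should_archive("job_application", date(2006, 1, 1), date(2023, 12, 14)))
# A 2020 payroll record is still inside a 7-year period.
print(should_archive("payroll_record", date(2020, 6, 1), date(2023, 12, 14)))
```

Under this rule, a record that reaches its limit becomes a candidate for archiving instead of remaining in an active system indefinitely.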

To submit a data request, please visit yaledata.yale.edu. There, you will find a request form and instructions on the process, which has three steps: 

  • A data governance review
  • A Security Planning Assessment (SPA) with the ISO team
  • A technical review by the Infrastructure Services team, which will also operationalize the project

Visit the project’s website to learn about this group’s mission, charter, and objectives.