TPC will be providing the following full- and half-day tutorials on Sunday, May 31 and Monday, June 1:
These tutorials have been refined over the past 18 months, including the introductory “AI for Science” tutorial that has been presented to hundreds of people at Supercomputing Asia in February 2024, the TPC European Kickoff Workshop in June 2024, the University of Michigan’s annual Conference on Foundation Models and AI Agents for Science, and numerous local training events.
Sponsors are welcome to provide appropriate half-day tutorials, on a subject of interest to TPC members, on the morning of Monday, June 1. Download the sponsorship prospectus here.
Tutorials are open to all conference attendees, for an additional fee.
This is a lessons-learned tutorial designed to equip researchers with practical insights and conceptual grounding for applying AI systems to scientific challenges. Twelve talks from TPC participants across national laboratories, universities, and industry present concrete experience building AI systems that reason, plan, execute, and refine across simulation, experiment, and literature. The program covers the full arc from autonomous experimentation to production deployment, organized in four sessions:
Session 1: Closed-Loop AI and Self-Driving Laboratories: Deployed agentic systems running real experimental campaigns.
Session 2: Agentic AI for Chemistry and Materials Discovery: Domain research assistants and foundation models for molecules and materials.
Session 3: Agentic Architectures, Frameworks, and Coordination Protocols: Lessons building multi-agent systems, including MCP-based orchestration.
Session 4: Production Systems: Industry, HPC Runtimes, and Safety: Industrial deployment, runtime infrastructure, and safety for agentic AI.
Talks are exemplars showing AI use in applications ranging from accelerator operations, catalysis, battery science, drug discovery, and computational fluid dynamics to semiconductor engineering. Each illustrates the use of AI frameworks and tools, including LangGraph, AG2, Claude Code, MCP, ChemGraph, Osprey, SpectraQuery, and Dragon, among others.
Participants will develop a practical understanding of building and deploying AI systems for scientific discovery, including:
Closed-Loop AI and Self-Driving Laboratories
Agentic AI for Chemistry and Materials Discovery
Agentic Architectures, Frameworks, and Coordination Protocols
Production Systems: Industry, HPC Runtimes, and Safety
This tutorial aims to provide researchers with an introduction to the latest reproducible Artificial Intelligence (AI) and Machine Learning (ML) workflows and tools available through the NSF-funded Tapis v3 platform, which provides both an Application Programming Interface (API) and User Interface (UI). Using Tapis, researchers can discover AI/ML models and tools and deploy them directly to compute resources within the NAIRR and ACCESS ecosystem, supporting the goal of providing computation, data, software, models, training, and educational materials to advance research, discovery, and innovation.
Through hands-on exercises, participants will gain experience in developing AI/ML workflows and deploying them on a variety of HPC and cloud resources such as Jetstream2, Chameleon, Vista, and Stampede3. We will emphasize the use of various Tapis core APIs, alongside specialized APIs such as Tapis Workflows, Tapis Pods, ML Hub, and FlexServ, all integrated within TapisUI. Using these production-grade services, we will demonstrate how to create trustworthy, reproducible scientific machine learning workflows. By the end of this tutorial, researchers will be able to develop, deploy, and maintain their own ML workflows.
Participants will learn to securely authenticate with Tapis to access its core and advanced APIs, enabling the creation, execution, and deployment of scientific AI/ML workflows. By constructing well-defined workflows for real-world use cases, attendees will gain a foundational understanding of how to leverage HPC resources for research and use production-quality APIs to build transparent, reproducible ML pipelines.
This tutorial will be conducted by Anagha Jamthe, Wei Zhang, and Christian Garcia of the Texas Advanced Computing Center, University of Texas at Austin.
Attendee Preparation: To participate in the hands-on portions of the tutorial, attendees should create and activate a TACC account in advance by visiting here. We recommend completing this process prior to the tutorial to avoid delays during setup.
Overview of NAIRR Infrastructure and TapisUI
Models, Large Models, and Third-Party Registries
Prompt Engineering: Computer Vision Models with Jupyter
Fine-Tuning and Analytics
This hands-on tutorial will equip computational scientists, engineers, developers, and students with practical skills for using AI models, tools, and agentic systems for maximum productivity, innovation, and discovery. The tutorial will begin by providing a foundation for understanding both predictive and generative AI methods, including how to minimize errors and improve the accuracy and usefulness of results. The tutorial will then cover the powerful capabilities of LLMs and multi-modal models, with demos and hands-on labs. Most of the tutorial will then show attendees how AI technologies can augment the phases of discovery and innovation from start to finish: deep literature research, ideation/hypothesis generation, research/development planning, application prototyping and development, code optimization, surrogate creation, and data analysis. The emphasis for every phase will be how AI technologies can assist and empower (not replace), and which tools are most useful for each task (and why).
Tutorial examples and labs will leverage the latest production and research/prototype Google technologies (Gemini, NotebookLM, Gemini CLI, Code Assist, Antigravity, AlphaEvolve, Co-Scientist, and others) that are available at the time of this workshop. However, the core principles and strategies are designed to be portable, enabling scientists to effectively use any comparable AI models and tools in their own endeavors (and even in most of the labs). Multiple AI science applications (WeatherNext, AlphaFold 3, AlphaGenome, etc.) developed by Google DeepMind will be used to show the capabilities of AI-powered scientific discovery.
Participants will learn how AI-powered tools can help in every phase of the computational research/application development process:
Deep Literature Research, Novel Hypothesis Generation, and Innovative Research Planning
AI-Supercharged Code Prototyping, Development, Execution, and Optimization
Understanding and Using AI Agents
Developing and Using AI Agents and Surrogates
This is a hands-on tutorial designed to equip researchers with practical skills and conceptual grounding in the application of Large Language Models (LLMs) to scientific challenges. LLMs are becoming capable of solving complex problems, which creates an opportunity to apply them in science. However, even the most sophisticated models can struggle with simple reasoning tasks and make mistakes.
This tutorial focuses on best practices for evaluating LLMs for science applications. It guides participants through methods and techniques for testing LLMs at basic and intermediate levels. It starts with the fundamentals of LLM design, development, application, and evaluation while focusing on scientific application. Participants will also learn various complementary methods to rigorously evaluate LLM responses in benchmarks and end-to-end scenario settings. The tutorial features a hands-on session where participants use LLMs to solve provided problems.
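As a rough illustration of benchmark-style evaluation, the sketch below scores a model by exact-match accuracy on a small question set. It is not taken from the tutorial materials: `toy_model`, the three benchmark items, and the normalization rule are all hypothetical stand-ins, and a real evaluation would call an actual LLM and typically combine several complementary metrics.

```python
from typing import Callable


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are not scored as errors."""
    return " ".join(text.lower().split())


def exact_match_accuracy(
    model: Callable[[str], str],
    benchmark: list[tuple[str, str]],
) -> float:
    """Return the fraction of benchmark questions the model answers exactly (after normalization)."""
    if not benchmark:
        raise ValueError("benchmark must contain at least one item")
    correct = sum(
        normalize(model(question)) == normalize(reference)
        for question, reference in benchmark
    )
    return correct / len(benchmark)


def toy_model(question: str) -> str:
    """Hypothetical stand-in for an LLM call; returns canned answers, one of them wrong."""
    canned = {
        "What is the chemical symbol for gold?": "Au",
        "How many protons does carbon have?": "7",  # deliberately incorrect
    }
    return canned.get(question, "unknown")


# Hypothetical three-item benchmark of (question, reference answer) pairs.
BENCHMARK = [
    ("What is the chemical symbol for gold?", "au"),
    ("How many protons does carbon have?", "6"),
    ("What is the SI unit of force?", "newton"),
]

if __name__ == "__main__":
    print(f"accuracy = {exact_match_accuracy(toy_model, BENCHMARK):.2f}")
```

Exact match is the simplest possible scorer; it is shown here only because it makes the structure of a benchmark run (questions, references, a scoring rule, an aggregate) easy to see.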
This tutorial will be conducted by Franck Cappello, R&D Lead, Senior Computer Scientist; Sandeep Madireddy, Computer Scientist and AI Researcher; Neil Getty, Assistant Computer Scientist; and Robert Underwood, Assistant Computer Scientist, at Argonne National Laboratory.
Use Cases and Basic Evaluation Techniques
Advanced Evaluation Techniques
Hands-On Work
Agentic systems, in which autonomous agents collaborate to solve complex problems, are emerging as a transformative methodology in AI. However, adapting agentic architectures to scientific cyberinfrastructure — spanning HPC systems, experimental facilities, and federated data repositories — introduces new technical challenges. In this half-day tutorial, we introduce participants to the design, deployment, and management of scalable agentic systems for scientific discovery. We will present Academy, a Python-based middleware platform built to support agentic workflows across heterogeneous research environments.
Participants will learn core agentic system concepts, including asynchronous execution models, stateful agent orchestration, and dynamic resource management. A guided hands-on session will help attendees build and launch their own agentic systems. This tutorial is designed for researchers, developers, and cyberinfrastructure professionals interested in advancing AI-driven science with next-generation autonomous systems.
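To make "asynchronous execution" and "stateful agent" concrete, here is a minimal sketch using only Python's standard `asyncio` module. It does not use or represent the Academy API; `CounterAgent`, its `add` method, and the `None` shutdown sentinel are hypothetical names chosen for illustration.

```python
import asyncio


class CounterAgent:
    """Hypothetical stateful agent: keeps a running total across the messages it receives."""

    def __init__(self) -> None:
        self.total = 0  # state that persists between requests
        self.inbox: asyncio.Queue = asyncio.Queue()

    async def run(self) -> None:
        """Process messages one at a time until the shutdown sentinel (None) arrives."""
        while True:
            message = await self.inbox.get()
            if message is None:
                break
            value, reply = message
            self.total += value
            reply.set_result(self.total)  # wake up the caller waiting on this request

    async def add(self, value: int) -> int:
        """Send a request to the agent and wait asynchronously for its reply."""
        reply = asyncio.get_running_loop().create_future()
        await self.inbox.put((value, reply))
        return await reply


async def main() -> int:
    agent = CounterAgent()
    runner = asyncio.create_task(agent.run())  # the agent runs concurrently with its callers
    await asyncio.gather(*(agent.add(v) for v in (1, 2, 3)))
    await agent.inbox.put(None)  # ask the agent to shut down
    await runner
    return agent.total


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Routing every request through a single inbox means the agent's state is only ever modified by its own loop, which is one common way to keep stateful agents safe under concurrent callers.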
This tutorial will be conducted by Ian Foster, Data Science and Learning Division Director at Argonne National Laboratory, and Yadu Babuji, University of Chicago.
Introduction to Agentic Systems and Academy
Hands-On Implementation of Agentic Systems
Based on demonstrations and access to AMD Developer Cloud, this hands-on tutorial is designed to equip scientists with the necessary tools to leverage AI in scientific workflows using AMD’s open-source ROCm stack (consisting of frameworks, compilers, libraries and tools).
The program will leverage open data (e.g., Wikipedia), open models (e.g., AMD Instella, GPT-OSS, OpenFold), and recipes inspired by real use cases to demonstrate AI model training from first principles, domain-specific fine-tuning, optimized model inference with distillation, interleaving modeling/simulation codes with AI (at full and mixed precision), and orchestrating agentic frameworks on AMD GPUs.
We will also show the use of AMD Primus, a flexible training framework for large-scale foundation model training, and Enterprise AI Suite, for model hosting and serving, both applied to scientific domains.
AMD AI Workflows, Ecosystem, Deployment, and Profiling Stack Overview
AI4Science Studio: Agent-Driven Workflows for Scientific AI Models
This hands-on tutorial equips researchers and computational scientists with practical skills for leveraging heterogeneous computing architectures and agentic AI workflows to accelerate scientific discovery on AWS. The tutorial is organized in two sessions.
The first session introduces classical HPC solutions and the quantum computing service, Amazon Braket, covering hybrid quantum-classical resources that support scientific research in academic and private industry settings. Participants will explore AWS HPC services and solutions — AWS Batch, AWS Parallel Computing Service, and AWS ParallelCluster — alongside the integration and deployment of hybrid quantum-classical workloads. The session features a recent implementation of the quantum-classical auxiliary field quantum Monte Carlo workflow and its application to modeling chemical reaction energies.
The second session demonstrates how agentic AI workflows offer a new paradigm for scientific computing: an AI agent receives a research question, reasons about the computational approach, retrieves benchmark datasets from a catalog, configures and launches simulations on cloud HPC, and analyzes the results — synthesizing findings, identifying anomalies, and recommending next steps without manual intervention. Participants will explore how agentic reasoning can accelerate the hypothesis-to-computation cycle, making production-scale scientific computing more accessible and repeatable.
Heterogeneous Quantum and Classical Computing on AWS
Agentic Workflows for AI-Driven Scientific Computing on AWS