Privacy-Preserving Federated Learning for Science

Advancing AI and ML for transformative scientific discovery

(Image Credit: DALL·E)

Project Objectives

Launched in October 2024, AI4S-PPFL is a multi-institutional project focused on developing next-generation, privacy-preserving federated learning frameworks for training large-scale foundation models. Funded by the U.S. Department of Energy Office of Science, our initiative unites leading experts from academia, national laboratories, and industry to address critical challenges in scientific computing.

Our research is organized around four key thrusts:

  • Efficiency: Optimizing communication, memory, and energy usage in large-scale federated networks.
  • Privacy & Fairness: Implementing robust security protocols to ensure data confidentiality and equitable participation.
  • Synthetic Data Integration: Generating high-quality synthetic datasets that preserve privacy without compromising model utility.
  • High-Performance Data Management: Developing advanced methods for managing and processing massive, distributed datasets to support scalable federated learning.

Additionally, we are proud to introduce APPFL – our open-source Python package that offers a comprehensive framework for building and deploying privacy-preserving federated learning systems.

Research Team

Principal Investigators

Kibaek Kim

Kibaek Kim (Lead PI), Computational Mathematician at Argonne, specializes in federated learning and large-scale optimization algorithms for HPC systems and GPUs, developing advanced numerical methods applied to electric grids and other DOE-critical scientific domains.

Thomas Flynn

Tom Flynn, Associate Computational Scientist at BNL, develops ML methods for secure inference, physics-informed imaging, training, and distributed optimization using statistical change detection and compression for HPC.

Olivera Kotevska

Olivera Kotevska, Research Scientist at ORNL, specializes in privacy algorithms and machine learning for scientific applications. She has authored over 50 publications and held numerous organizational roles.

Minseok Ryu

Minseok Ryu, Assistant Professor at ASU, focuses on operations research and machine learning, developing decentralized, stochastic optimization algorithms for decision-making under uncertainty in health and energy applications.

Farzad Yousefian

Farzad Yousefian, Assistant Professor at Rutgers University, leads MathOptRG, advancing optimization and game theory models to solve large-scale computational challenges in scientific machine learning and multi-agent systems.

Co-Principal Investigators

Ravi Madduri

Ravi Madduri, Senior Research Scientist at Argonne, develops innovative HPC/AI software for biomedicine, leading PALISADE-X for privacy-preserving federated learning and MVP-CHAMPION for large-scale genetic analysis, earning NIH and DOE awards.

Todd Munson

Todd Munson, Senior Computational Scientist at Argonne and Senior Scientist at UChicago, specializes in scalable numerical optimization algorithms and software development (PETSc/TAO, MINOTAUR) for PDE-constrained, equilibrium, and sparse system problems.

Krishnan Raghavan

Krishnan Raghavan, Assistant Computational Mathematician at Argonne, develops mathematical characterizations of ML models using systems theory, statistics, and optimization to enhance AI applications in nuclear physics, material science, HPC, and climate.

Rob Ross

Robert Ross, Senior Computer Scientist at Argonne and Director of the DOE SciDAC RAPIDS Institute, earned a Ph.D. from Clemson University in 2000, received a PECASE in 2004, and has won two R&D 100 Awards.

Matthieu Dorier

Matthieu Dorier, Research Engineer at Argonne, specializes in data management for high-performance computing, with expertise in HPC I/O, parallel and distributed storage, distributed algorithms, and networking.

Ai Kagawa

Ai Kagawa, Computational Scientist at BNL, specializes in reinforcement learning, boosting algorithms, and discrete optimization.

Byung-Jun Yoon

Byung-Jun Yoon, Professor at Texas A&M and Scientist at BNL, specializes in scientific AI/ML, optimal experimental design, and uncertainty quantification.

Christian Engelmann

Christian Engelmann, Senior Computer Scientist and group leader at ORNL, has more than 24 years of experience in HPC software R&D, specializing in resilience and system software.

Institutional Partners

Argonne National Laboratory
Brookhaven National Laboratory
Oak Ridge National Laboratory
Arizona State University
Rutgers University