Foundation models like GPT-4 have dramatically altered the modern work landscape for many industries reliant on language tasks, but no equivalent model exists yet for scientific applications. Incorporating foundation models into research workflows could enable unprecedented discoveries. However, mainstream foundation models trained on human-scale datasets will be insufficient for analyzing most scientific phenomena. A foundation model for science will need to account for the particular requirements of scientific datasets, especially those with wide dynamic ranges.
In this talk, I will introduce the Polymathic AI initiative, whose goal is to accelerate the development of versatile foundation models tailored for numerical datasets and scientific machine learning tasks. The challenge we are undertaking is to build AI models that leverage information from heterogeneous datasets and across different scientific fields, which, unlike domains such as natural language processing, do not share a unifying representation (i.e., text). Such models can then be used as strong baselines or be further fine-tuned by scientists for specific applications. I will present our initial papers and projects, including two large scientific datasets designed for large-scale training: "MultiModal Universe" and "The Well".
Biography:
Professor Ho joined the Physics Department as a Research Professor and became affiliated faculty at the Center for Data Science at NYU in 2021. Ho joined the Simons Foundation in 2018 as leader of the Cosmology X Data Science group at CCA, and in 2021 she assumed the role of CCA’s interim director. Her research interests have ranged from fundamental cosmological measurements to exoplanet statistics to using machine learning to estimate how much dark matter is in the universe. Ho has broad expertise in theory, observation and data science. Her recent interest has been in understanding and developing novel statistical and machine learning tools and applying them to astrophysical challenges. Her goal is to understand the universe’s beginning, evolution and ultimate fate. In her bid to understand our Universe, Ho plans, builds and analyzes data from a number of astronomical surveys, such as the Atacama Cosmology Telescope, Euclid, the Rubin Observatory, the Simons Observatory, the Sloan Digital Sky Survey and the Roman Space Telescope. Ho earned her Ph.D. in astrophysical sciences from Princeton University in 2008 and her bachelor’s degrees in computer science and physics from the University of California, Berkeley in 2004. She was a Chamberlain fellow and a Seaborg fellow at Lawrence Berkeley National Laboratory before joining Carnegie Mellon University in 2011 as an assistant professor. She became the Cooper Siegel Career Development Chair Professor and was appointed associate professor with tenure in 2016. She moved to Lawrence Berkeley Lab as a Senior Scientist in 2016. Since 2011, she has been a primary mentor to more than 35 postdoctoral fellows, 10 graduate students and 20 undergraduates in the fields of astrophysics, computer science and statistics. She has received several awards, including the NASA Group Achievement Award, the Macronix Prize and the Carnegie Science Award. She has also been elected a Fellow of the International Astrostatistics Association.