Summer Mentors by Year:
Michael Miller

Exploring Quantum Sound and Music
This project seeks to explore the state of the art in using quantum computing concepts to create sound and music. We will review information available from previous conferences and seek out new developments. We will then determine what is needed to create a framework/interface for users to experiment with, and look toward having a platform ready to deploy on quantum resources when they are acquired by NCSA.
Kevin Chang

LLM-Based Knowledge Agent
Large language models like ChatGPT have changed the landscape of artificial intelligence and promise to automate how we perform knowledge work. This project will explore LLMs as the “engines” for building agents for such automation, e.g., to help you find the knowledge you need, synthesize the knowledge on a specific topic, answer a technical question, or tutor a student with a personalized learning experience. Techniques: large language models, natural language processing, information retrieval, data mining, machine learning.
Rachel Adler

Generative AI mHealth Apps for Older Adults
The population of older adults is rising sharply across the world, especially in the United States, where it is expected to rise by 47%, from 58 million in 2022 to 82 million in 2050, growing from 17% of the total population now to nearly a quarter (23%) by then. Moreover, older adults, particularly those with disabilities, are more susceptible to physical inactivity. Research suggests that health-related outcomes and quality of life can be improved by including physical interventions. This research project is oriented towards creating custom health applications that work on both mobile and web interfaces and feature physical activity interventions. The applications will use a generative-AI-based conversational agent as their interface and will retrieve, process, and personalise knowledge from wearable fitness trackers. This conversational agent would be trained on specialised health knowledge and will personalise interactions. We are looking for students who are interested in contributing to the development of mobile health applications, training and fine-tuning generative AI models, and/or conducting usability testing.
T. Andrew Manning

Blast: A Web Application for Characterizing the Host Galaxies of Astrophysical Transients
Characterizing the host galaxies of astrophysical transients is important to many areas of astrophysics, including constraining the progenitor systems of core-collapse supernovae, correcting Type Ia supernova distances, and probabilistically classifying transients without photometric or spectroscopic data. Given the increasing transient discovery rate in the coming years, there is substantial utility in providing public, transparent, reproducible, and automatic characterization for large samples of transient host galaxies. We have developed a web application and workflow management system called Blast that ingests live streams of transient alerts, matches transients to their host galaxies, and performs photometry on coincident archival imaging data of the host galaxy.
We are looking for a student interested in learning how to develop and deploy research applications and cyberinfrastructure within cloud-native platforms like OpenStack and Kubernetes. Development activities will include improving and optimizing the Blast system in the LSST (Legacy Survey of Space & Time) era: implementing resource limits and requests on task workers, designing an elastic horizontal scaling system to handle dynamic data loads, optimizing workflow efficiency, and more.
The student must have some familiarity with Python (Django), Linux, Git, and containerization. Although knowledge of astrophysics is optional, enthusiasm to support scientists by building research software is required.
Sever Tipei

Music on High-Performance Computers
The project centers on DISSCO, software for composition, sound design and music notation/printing developed at UIUC Computer Music Project, NCSA and Argonne National Laboratory. A parallel version was also developed at the San Diego Supercomputer Center with support from XSEDE (Extreme Science and Engineering Discovery Environment). Written in C++, DISSCO presently runs on the NCSA Delta system with support from ACCESS and uses both CPU and CUDA platforms.
DISSCO has a directed graph structure and uses stochastic distributions, sieves (part of Number Theory), Markov chains and elements of Information Theory to produce musical compositions. Efforts are presently directed toward moving the existing GUI from gtkmm to Qt, adding new features, refining a system for the notation of music, and realizing an Evolving Entity, a composition whose aspects change when computed recursively over long periods of time, thus mirroring the way living organisms are transformed in time (Artificial Life).
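As a rough illustration of two of the techniques named above, the sketch below shows a sieve (a union of residue classes used to select values) and a first-order Markov chain. It is written in Python for brevity and is not DISSCO code; the function names `sieve` and `markov_walk`, the residue classes, and the transition probabilities are all hypothetical choices made for this example.

```python
import random


def sieve(values, residue_classes):
    """Keep the values that fall in at least one (modulus, residue) class."""
    return [v for v in values
            if any(v % m == r for m, r in residue_classes)]


def markov_walk(transitions, start, length, seed=0):
    """Generate `length` states from a first-order Markov chain.

    `transitions` maps each state to a dict of {next_state: probability}.
    A fixed seed keeps the walk repeatable.
    """
    rng = random.Random(seed)
    state, path = start, [start]
    for _ in range(length - 1):
        next_states, weights = zip(*transitions[state].items())
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path


# Hypothetical example: pick pitch classes with a sieve, durations with a chain.
pitches = sieve(range(12), [(3, 0), (4, 1)])
transitions = {
    "short": {"short": 0.5, "long": 0.5},
    "long": {"short": 0.9, "long": 0.1},
}
rhythm = markov_walk(transitions, "short", 8)
print(pitches)
print(rhythm)
```

Here the sieve keeps every value that is a multiple of 3 or that leaves remainder 1 when divided by 4, and the chain decides each duration from the previous one only.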
Because DISSCO is a “black box” that does not allow the user to intervene during computations, and because the computer makes decisions not controlled by the user, it shares features with AI-type projects. Further developments are being considered in this area.
Papers on DISSCO, co-authored with SPIN interns, have been presented at international conferences and compositions realized with DISSCO have been featured in concerts in the US, Europe, Asia and Australia.
Skills needed:
- proficiency in C++ programming
- familiarity with Linux Operating System
- familiarity with music notation preferred but not required.
More information: About DISSCO, About Notation, About Evolving Entity
Bin Peng

AI for Science: Advancing AI for Crop, Soil, and Environmental Prediction
We are hiring talented undergraduate or graduate interns to join the Water, Agriculture, and Conservation Innovation (WACI) Lab led by Prof. Bin Peng in the Department of Crop Sciences (CPSC), College of Agricultural, Consumer and Environmental Sciences (ACES). The project goal is to foster AI for Science with a focus on crop, soil, and environmental sciences. Selected applicants will work on the following projects: (1) Knowledge-guided machine learning (KGML) or physics-informed neural networks (PINN) for crop, soil, and environmental prediction. (2) LLMs for crop, soil, and environmental prediction. (3) Deep learning for large-scale crop, soil, and environmental monitoring with airborne and satellite remote sensing.
Angela Lyons, Aiman Soliman

Climate Change, Migration, and Socioeconomic Impacts of the Deforestation of the Amazon in Brazil
This interdisciplinary research project examines how internal migration in Northern Brazil and the Amazon—driven by environmental degradation and economic hardship—is reshaping land use, regional economies, and ecological sustainability. Rapid deforestation, fueled by land speculation, agricultural expansion, and poorly designed policies, has disrupted ecosystems, accelerated climate change, and created harsh living conditions that push vulnerable populations to migrate. These migration flows often move people to equally fragile areas, potentially perpetuating cycles of deforestation.
Using AI-based geospatial and machine learning methods, our team investigates:
- Where people are moving
- How these movements affect local economies and land use
- The role of policies, credit access, and environmental risk management in shaping migration dynamics
The ultimate goal is to inform sustainable development strategies that balance economic needs with conservation efforts in the Amazon and similar forested regions worldwide.
The student will work closely with NCSA researchers, faculty, and graduate students to:
- Preprocess geospatial and socioeconomic datasets
- Conduct data modeling, analysis, and predictions
- Create maps and other data visualizations from geospatial and socioeconomic data
- Develop, review, and document code in Python and/or R
- Assist with the development, training, validation, and testing of machine learning algorithms
- Create and maintain a GitHub repository for scripts and documentation
- Participate in regular mentor meetings and team discussions
Preferred Qualifications:
- Undergraduate student in data science, computer science, electrical and computer engineering, statistics, or related field
- Proficiency in Python and/or R
- Basic knowledge of machine learning techniques and/or geospatial analysis
- Experience with GIS tools (e.g., QGIS, ArcGIS) and/or remote sensing data is a plus
- Ability to create maps, dashboards, and visual analytics
- Familiarity with version control systems (e.g., Git/GitHub)
- Strong problem-solving skills and attention to detail
Haohan Wang

Agentic Transcriptome Atlas: Redefining Disease from Molecular Signals
A new view of disease will form where molecular signals, not broad clinical labels, define how conditions are understood. As transcriptomic patterns accumulate, categories will reorganize according to shared gene-expression structure rather than surface-level symptoms or diagnostic groupings. Conditions that appear unrelated at the bedside will reveal common biological programs, and rare diseases will connect to well-studied disorders through measurable molecular proximity. Disease organization will update as data expands, allowing the structure of medicine to follow underlying biology instead of fixed naming schemes.
The system will operate through coordinated LLM agents that plan analyses, verify statistical outputs, and re-run specific steps when results deviate from expected behavior. Core functions—dataset loading, normalization, and signature identification—will run as modular calls inside agent-controlled workflows. Prompts will encode state management and error-recovery rules, and evaluation will emphasize reproducibility and stability under perturbation. Students will focus on agent coordination, trace-level debugging, and long-context prompt engineering in the context of real transcriptomic data.
Jill Palmer Naiman

Quantifying the Effectiveness of Scientific Documentaries Using Natural Language Processing
This project aims to quantify the effectiveness of documentaries created by the Advanced Visualization Lab at NCSA through sentiment and impact analysis of reviews left on various platforms (e.g. Amazon, YouTube). Work on this project includes hand-annotation of sentences in our dataset, training of deep learning models on already annotated data (e.g., RoBERTa), and the use of LLM APIs (e.g., ChatGPT) for zero-shot and few-shot learning. Project requirements: patience (for annotations), some experience in Python expected, experience with machine learning/LLMs preferred, but not required.
Abhijeet Ghoshal

Recommendation Systems for Donation Platforms
Many organizations facing financial hardship (typically in covering resources required for general operations) use crowdfunding platforms, such as ‘Donors Choose’, to obtain the help they need. For example, a school that cannot afford supplies for its primary sections may open a project on the platform (Donors Choose), and donors can donate money to this project. Once the project meets its financial target, the platform purchases the supplies and sends them to the school. These platforms run projects under the provision-point policy: within a time period chosen by the project owner, the platform fulfills the demands of the project (purchases the supplies and sends them to the school) only if the financial target is met. If the project falls short of the financial target, the platform returns the money to the donors and the project gets nothing. Thus, the utility of the platform is measured in terms of the projects that meet their financial targets. In our research, we are designing and building a recommender system that the platform can deploy to increase the likelihood of donations. Projects will be selected for recommendation based on the preferences of the donors, the financial targets of the projects, and the time remaining to finish them. The primary objective of this recommendation system is to increase the number of finished projects (projects that meet their financial goals in time). Currently, close to 69% of projects on the platform finish. We expect that the recommendation system will significantly increase this number. In addition, we will incorporate a fairness constraint that ensures higher-priority projects (as decided by the platform) get higher preference, subject to the conditions already stated above. This constraint will further increase the utility of the crowdfunding platform for the donors and for the project owners.
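The provision-point rule described above can be sketched in a few lines of Python. This is only an illustration of the policy as stated in the description, not the platform's code or the planned recommender; the `Project` class, the `settle` and `completion_rate` helpers, and the sample amounts are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    target: float
    pledges: dict = field(default_factory=dict)  # donor -> total amount pledged

    def pledge(self, donor, amount):
        self.pledges[donor] = self.pledges.get(donor, 0.0) + amount

    @property
    def raised(self):
        return sum(self.pledges.values())


def settle(project):
    """Apply the provision-point rule once the project's deadline has passed."""
    if project.raised >= project.target:
        # Target met: the platform keeps the money and fulfills the project.
        return {"funded": True, "refunds": {}}
    # Target missed: every donor gets their money back, the project gets nothing.
    return {"funded": False, "refunds": dict(project.pledges)}


def completion_rate(projects):
    """Fraction of projects that met their target (the platform's utility measure)."""
    funded = sum(1 for p in projects if settle(p)["funded"])
    return funded / len(projects)


# Hypothetical data: one project reaches its target, one falls short.
supplies = Project("supplies", target=100.0)
supplies.pledge("donor_1", 60.0)
supplies.pledge("donor_2", 40.0)

books = Project("books", target=100.0)
books.pledge("donor_1", 30.0)

print(settle(supplies))
print(settle(books))
print(completion_rate([supplies, books]))
```

The completion rate computed at the end corresponds to the quantity the description reports as close to 69% and that the recommender is meant to raise.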
Yue Lin

Developing a Typology of U.S. Neighborhoods that Accounts for Racial-Ethnic Disparities
The goal of this study is to develop open, national-scale geodemographic clustering data of U.S. neighborhoods that ensures balanced representation across different racial and ethnic subgroups. To achieve this, the study will employ a novel machine learning framework called socially-fair geodemographic clustering (SFGC), which minimizes the maximum average cost across subgroups rather than the total cost. This study seeks to extend the impact of SFGC to the national scale in the U.S., addressing limitations of current nationwide geodemographic data products that still largely rely on classical clustering methods, such as k-means. To maximize accessibility and usability, an interactive web map will be created using Leaflet, allowing stakeholders, such as policymakers, urban planners, researchers, and community members, to access, explore, and analyze the clustering results in an intuitive, interactive format. The project will also provide clear documentation, tutorials, and open-source code to support transparency, reproducibility, and community engagement, ensuring the dataset can be leveraged for a wide range of social, economic, and urban research applications.
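The difference between the two objectives mentioned above can be made concrete with a small sketch. It assumes a k-means-style cost (squared Euclidean distance from each point to its nearest center), which the description does not specify; the function names and the toy data are hypothetical, and this is not the SFGC implementation.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_cost(point, centers):
    """Cost of a point: squared distance to its closest cluster center."""
    return min(sq_dist(point, c) for c in centers)


def total_cost(points, centers):
    """Classical objective: sum of costs over all points."""
    return sum(nearest_cost(p, centers) for p in points)


def max_group_average_cost(points, groups, centers):
    """Socially fair objective: the worst average cost over subgroups."""
    by_group = {}
    for p, g in zip(points, groups):
        by_group.setdefault(g, []).append(nearest_cost(p, centers))
    return max(sum(costs) / len(costs) for costs in by_group.values())


# Toy 1-D data: three points from group "a", one point from group "b".
points = [(0.0,), (0.0,), (0.0,), (10.0,)]
groups = ["a", "a", "a", "b"]

overall_mean = [(2.5,)]  # a single center placed at the mean of all points
midpoint = [(5.0,)]      # a single center placed halfway between the groups

for centers in (overall_mean, midpoint):
    print(centers, total_cost(points, centers),
          max_group_average_cost(points, groups, centers))
```

With a single center, the overall mean gives the lower total cost (75.0 versus 100.0), but the midpoint gives the lower worst-group average (25.0 versus 56.25), so the two objectives prefer different solutions when one subgroup is much smaller than the other.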
Deana McDonagh

Turn Code Into Compassion: Visualize Aging
Aging is one of the most fascinating and universal human experiences—and understanding it takes more than biology. It takes computational creativity. We’re looking for passionate computer science students to help us model and visualize the aging process in ways that are interactive, meaningful, and impactful.
Using data science, machine learning, and cutting-edge visualization, you’ll transform complex biological changes into dynamic, engaging representations that inform healthcare, design, and policy. This is your chance to apply your technical skills to a challenge with real human significance—creating tools that not only illustrate aging but inspire empathy and innovation for a better future.
Volodymyr Kindratenko

Enabling LLM-as-a-Service on High-Performance Computing Systems
This project focuses on advancing a framework that enables Large Language Models (LLMs) to run efficiently on high-performance computing (HPC) systems such as Delta and DeltaAI. The approach, known as LLM-as-a-Service, is being developed at the Center for AI Innovation at NCSA. The team has already implemented a batch-processing framework for LLMs, and the summer project will build on this foundation by improving performance and scalability on HPC systems, adding new functionality to enhance usability and flexibility, and conducting extensive testing to ensure reliability and robustness. Required skills include strong proficiency in Python programming; a basic understanding of LLMs and HPC concepts; familiarity with software engineering practices, including version control with GitHub; and curiosity and a willingness to learn and experiment with new tools and ideas. Students working on this project will gain hands-on experience in cutting-edge AI technologies, HPC environments, and collaborative software development practices. This is an excellent opportunity to contribute to real-world AI infrastructure projects and develop skills highly valued in both academia and industry.
Volodymyr Kindratenko

Building an Intelligent Assistant for High-Performance Computing Environments
Our team at the Center for AI Innovation at NCSA is developing an agentic AI system designed to serve as an intelligent assistant on NCSA’s flagship computing platforms, Delta and DeltaAI. This assistant can be accessed directly from the command line and provides support for a variety of tasks, including explaining the status of submitted jobs, troubleshooting issues such as module loading or compilation, and assisting with file management. We have already built a fully functional prototype based on OpenCode and custom MCP servers, and the summer project will focus on testing and evaluating the system, enhancing its capabilities, and creating user documentation. The ideal candidate will have strong Python programming skills, a basic understanding of large language models and agentic AI concepts, first-hand experience working on a high-performance computing system, such as Delta, and familiarity with software engineering practices such as version control with GitHub. Curiosity and a willingness to learn and experiment with new tools are essential. This project offers hands-on experience with cutting-edge AI technologies, HPC environments, and collaborative software development, providing an excellent opportunity to contribute to real-world AI infrastructure and build skills highly valued in both academia and industry.
Volodymyr Kindratenko

Building Intelligent Chatbots on the Illinois Chat Platform
The student will contribute to the development of the Illinois Chat platform, an initiative from the Center for AI Innovation at NCSA that provides the campus academic and educational community with tools for building and deploying chatbots. Working closely with the platform development team, the student will help enhance the platform by implementing improvements, adding new functionality, fixing bugs, creating documentation, and developing specific chatbot applications. The ideal candidate should have strong Python programming skills, a basic understanding of large language models and agentic AI concepts, and experience in developing both frontend and backend web services. Familiarity with software engineering practices, including version control with GitHub, is essential, along with curiosity and a willingness to learn and experiment with new tools. This project offers hands-on experience with cutting-edge AI technologies and collaborative software development, providing an excellent opportunity to contribute to a real-world AI platform and build skills highly valued in both academia and industry.
Mohamad Alipour

Drone-Based Remote Sensing and Autonomous Inspection for Infrastructure, Agriculture, and Hazards
This project focuses on drone-based remote sensing for a wide range of real-world applications, including building and bridge inspections, agricultural sensing, and natural hazard reconnaissance. The student will gain hands-on experience operating drones and collecting high-quality aerial data while engaging in research activities that integrate drone path planning, AI-based feature detection, and autonomous flight programming. The project emphasizes designing intelligent flight strategies to efficiently capture critical information, developing and applying machine learning methods to detect structural, environmental, and agricultural features from drone imagery and sensor data, and implementing autonomous flight workflows for repeatable and scalable data collection. Through this experience, the student will be exposed to the full pipeline of drone-enabled sensing—from mission planning and data acquisition to intelligent analysis—while contributing to research that advances the use of drones for infrastructure monitoring, precision agriculture, and rapid response to natural hazards.
Joshua Allen

Guinea Pig Simulator 2.0
The Genomics Group, as part of its outreach goals, has designed a simple guinea pig simulator to demonstrate genomics principles to an all-ages audience. We seek to improve on this simple Jupyter notebook with a more interactive demonstration. The student will help design and implement this simulator and will have direct input on the framework used and on many of the details. This will be a programming-heavy assignment that draws on game design, visualization, genomics, and Markov-based simulations. We will work on the design phase and a minimum viable product over the summer.