2023 Summer Mentors & Projects

Mohamad Alipour

Mixed Reality Digital Technology For Digital Twins

Engineering tasks such as design, monitoring, inspection, and education and training rely on our ability to create high-fidelity models describing the behavior of structures in the physical world (e.g., bridges, wind turbines, pipelines, and power transmission lines). Efficient execution of these engineering tasks in the information age relies on our ability to digitize physical structures and systems and to connect sensing data and physical models to a digital replica of the physical structure (a digital twin). This study aims to create such digital representations for engineering systems, visualize them using virtual and augmented reality (VR/AR), and study their suitability for performing engineering tasks.

To that end, 3D models of sample structures will be captured via laser scanners or developed in graphical software and connected to the physical structure using sensing data and existing mechanical models. These models will then be imported into a mixed reality environment and enriched with data such as sensor readings, condition descriptions, and other information of interest for the specific engineering task. Depending on the progress of the project, opportunities exist for participating in real tests of target structures instrumented with mechanical and optical sensors within the Newmark Structural Engineering Laboratory (NSEL) in the Department of Civil and Environmental Engineering. The end goal is to create a prototype digital twin of a structure that can be viewed and interacted with using AR/VR headsets.

Student Background and Research Activities

Successful applicants will have strong programming skills and experience with game engines or virtual reality development (e.g., Unity or similar platforms). No knowledge of civil, structural, or mechanical engineering is required. This project involves exciting research activities including mixed reality application development and programming, computer simulations, and potentially working with sensing systems. The student will also work with mixed reality headsets such as the Meta Oculus, HTC Vive, and Magic Leap. Depending on progress, this project may lead to a conference paper and/or a longer-term research position.

Greg Bauer

Science Gateways – Learn What Makes an HPC Gateway “Tick”

Science gateways improve the accessibility, usability, and community use of research software on high-performance computing (HPC) clusters. One program of the NCSA Delta Project is the development of such a gateway, the Delta Science Gateway, which provides access to the NVIDIA GPU-accelerated tools, applications, and codes available on the Delta HPC clusters for researchers who are not familiar or comfortable with traditional HPC interfaces such as command-line environments.

In this SPIN project, you will assist with the deployment and integration of existing computational software on the gateway. These applications come from research areas ranging from machine learning to molecular modeling and everywhere in between, with interfaces ranging from the command line to desktop-like graphical user interfaces and web-style interfaces like those used by Jupyter notebooks. Through this program you will learn about software compilation, building, and installation, as well as the integration of these applications into the web interface. You will gain an understanding of how researchers use these applications and of how powerful computational resources like Delta are essential for many types of workloads.

HPC and ML Benchmarking

The NCSA Workload and Benchmarking group is looking for an interested and motivated intern who would like to help design, implement, and evaluate high-performance computing (HPC) and machine learning (ML) benchmarks for use in system and software performance evaluation and regression, and in architecture evaluation. The intern would work with NCSA staff, helping select existing computational or I/O workloads, prepare selected workloads for internal and external execution, add them to git repositories, and collect and evaluate performance metrics. Preferred Skills: Familiarity with software build systems such as autoconf and cmake, and with git and Linux; some knowledge of computer architecture is helpful.
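The core loop of benchmarking work like this (run a workload repeatedly, discard warmup, summarize timings) can be sketched in a few lines. This is a minimal illustration, not the group's actual harness; the `matmul` workload is a made-up stand-in for a real computational kernel.

```python
import time
import statistics

def benchmark(fn, *args, repeats=5, warmup=1):
    """Time fn(*args) over several repeats and report summary statistics."""
    for _ in range(warmup):                      # warm caches before measuring
        fn(*args)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return {"min": min(samples),
            "median": statistics.median(samples),
            "max": max(samples)}

# Example workload: a small dense matrix multiply in pure Python.
def matmul(n):
    a = [[1.0] * n for _ in range(n)]
    b = [[2.0] * n for _ in range(n)]
    return [[sum(a[i][k] * b[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

stats = benchmark(matmul, 40)
print(sorted(stats))  # ['max', 'median', 'min']
```

Reporting the minimum alongside the median helps separate steady-state performance from system noise, a common convention in regression testing.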

Kevin Chang

Living Encyclopedia

Develop algorithms and models to:

  • Automatically discover keywords (concepts or entities, e.g., “data structure”, “neural network”) used in a professional domain (e.g., computer science).
  • Build an “encyclopedia” for these keywords.
  • Organize information in the domain by the keywords.

Techniques: machine learning, data mining, natural language processing.
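A very simple form of domain keyword discovery contrasts term frequencies in domain text against a general background corpus. This toy sketch (my own illustration, not the project's method) scores terms that are common in the domain but rare elsewhere:

```python
import re
from collections import Counter

def discover_keywords(domain_docs, background_docs, top_k=3):
    """Rank terms frequent in the domain but rare in general text."""
    tokenize = lambda text: re.findall(r"[a-z]+", text.lower())
    domain = Counter(t for doc in domain_docs for t in tokenize(doc))
    background = Counter(t for doc in background_docs for t in tokenize(doc))
    # Contrastive score: domain frequency / (1 + background frequency).
    scores = {t: c / (1 + background[t]) for t, c in domain.items()}
    return [t for t, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]]

cs_docs = ["a neural network is trained on data",
           "a data structure stores data efficiently",
           "the neural network learns structure"]
general = ["the cat is on the mat", "a dog is trained to sit"]
top = discover_keywords(cs_docs, general)
print(top)
```

Real systems replace the contrastive score with learned models and handle multi-word phrases, but the same signal (domain-specific frequency) underlies them.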

Roland Haas

Implementing an OpenPMD Reader Plugin for VisIt

This project will implement a reader plugin that enables the VisIt visualization tool to open data files stored in the openPMD format.

Volodymyr Kindratenko

Development and Application of AI Models

In this project, we are developing various AI models to help solve challenging science and engineering problems. The student will help with gathering data, evaluating existing deep learning models, developing, and evaluating new models and software.

Xin Liu

DeepDISC: Detection, Instance Segmentation, and Classification for Astronomical Surveys with Deep Learning

The next generation of massive astronomical surveys, such as the upcoming Legacy Survey of Space and Time (LSST) on the Rubin Observatory, will deliver unprecedented amounts of images through the 2020s and beyond. As both the sensitivity and depth increase, larger numbers of blended (overlapping) sources will occur. If left unaccounted for, blending would bias measurements of sources that are assumed to be isolated, contaminating key quantities such as photometry, photometric redshifts, morphology, and weak gravitational lensing used to probe the nature of dark matter and dark energy.

In the LSST era, efficient deblending techniques are a necessity and have thus been recognized as a high priority. However, an efficient and robust method to detect, deblend, and classify sources for upcoming massive surveys is still lacking. Leveraging the rapidly developing field of computer vision, this NCSA project will develop a deep learning framework, “DeepDISC”. DeepDISC will efficiently process images and accurately identify blended galaxies with the lowest possible latency to maximize the science returns of upcoming massive astronomical surveys. The approach is fundamentally different from traditional methods.

The project is interdisciplinary, combining state-of-the-art astronomy data with the latest deep learning tools from computer science. DeepDISC will efficiently and robustly detect, deblend, and classify sources in upcoming surveys at depths close to the confusion limit. It will also provide accurate estimates of the deblending uncertainty, which can be propagated downstream into the analysis of galaxy properties for cosmological inference. The project will have strong implications for a wide range of problems in astronomy, from efficiently detecting transients and solar system objects to probing the nature of dark matter and dark energy. DeepDISC will be directly applicable to LSST as well as to other upcoming massive surveys such as NASA’s Roman Space Telescope. The program will reinforce the Illinois brand in big data and survey science.
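To see why blending breaks traditional detection (and motivates a learned approach like DeepDISC), consider a toy threshold-and-label detector, a minimal sketch of the classical method, not the DeepDISC pipeline: two well-separated Gaussian sources are counted correctly, but two overlapping ones merge into a single detection.

```python
import numpy as np

def detect(image, threshold):
    """Label connected pixel groups above threshold (4-connectivity)."""
    mask = image > threshold
    labels = np.zeros(image.shape, dtype=int)
    current = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue
        current += 1
        stack = [start]
        while stack:
            y, x = stack.pop()
            if (0 <= y < mask.shape[0] and 0 <= x < mask.shape[1]
                    and mask[y, x] and not labels[y, x]):
                labels[y, x] = current
                stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return labels, current

def gaussian(shape, cy, cx, sigma=1.5):
    y, x = np.indices(shape)
    return np.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))

# Two well-separated sources are found individually...
far = gaussian((32, 32), 8, 8) + gaussian((32, 32), 24, 24)
_, n_far = detect(far, 0.1)
# ...but two overlapping ("blended") sources merge into one detection.
near = gaussian((32, 32), 14, 14) + gaussian((32, 32), 17, 17)
_, n_near = detect(near, 0.1)
print(n_far, n_near)  # 2 1
```

Instance-segmentation networks sidestep this failure mode by predicting a separate mask per object rather than relying on pixel connectivity.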

Preferred Skills:

  • Python
  • PyTorch
  • Deep Learning

Angela Lyons, Alejandro Montoya Castano

A Machine Learning and Geospatial Approach to Targeting Humanitarian Assistance Among Syrian Refugees in Lebanon

An estimated 84 million persons are forcibly displaced worldwide, and at least 70% of these are living in conditions of extreme poverty. More efficient targeting mechanisms are needed to better identify vulnerable families who are most in need of humanitarian assistance. Traditional targeting models rely on a proxy means testing (PMT) approach, where support programs target refugee families whose estimated consumption falls below a certain threshold.
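The PMT idea described above can be sketched numerically: fit a regression from observable proxy features to consumption, then target families whose predicted consumption falls below a cutoff. The features, weights, and 30th-percentile cutoff below are all hypothetical illustrations, not the study's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical proxy features per household: e.g. household size, rooms, assets.
n = 200
proxies = rng.normal(size=(n, 3))
true_weights = np.array([-0.5, 0.8, 1.2])
consumption = proxies @ true_weights + 5.0 + rng.normal(scale=0.1, size=n)

# Fit the PMT model: predict consumption from proxies via least squares.
X = np.column_stack([np.ones(n), proxies])       # intercept + proxies
coef, *_ = np.linalg.lstsq(X, consumption, rcond=None)
predicted = X @ coef

# Target assistance at households whose *predicted* consumption is below a cutoff.
threshold = np.percentile(predicted, 30)
targeted = predicted < threshold
print(int(targeted.sum()))
```

Machine-learning and geospatial extensions replace the linear predictor with richer models and features, but the targeting logic (rank by predicted welfare, apply a threshold) stays the same.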

Jill Naiman

Enhancing Optical Character Recognition (OCR) Capabilities for Historical Documents

This project is a subset of a NASA Astrophysics Data Analysis Program (ADAP) project aimed at creating several science-ready data products to help astronomers search the literature in new ways. This goal is being accomplished by extending the NASA Astrophysics Data System (ADS), known as an invaluable literature resource, into a series of data resources. One part of this project involves the “reading” of figure captions using Optical Character Recognition (OCR) from scanned article pages.

A large source of error in the OCR process comes from artifacts present on scanned pages: scan effects such as warping, lighting gradients, and dust can generate many misspellings. This project focuses on better understanding these effects using image processing and analysis, both to clean old images before OCR processing and to potentially generate artificial training data from “aged” images of newer, digitized documents.
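The "aging" idea can be sketched with two of the named artifacts. This is a toy illustration under my own assumptions (pages as float arrays, 1.0 = white), not the project's actual pipeline; it also shows why a naive global threshold is an imperfect cleanup, since dust specks survive it.

```python
import numpy as np

rng = np.random.default_rng(42)

def age_page(page, dust_fraction=0.01, gradient_strength=0.3):
    """Simulate scan artifacts: a lighting gradient plus random dust specks."""
    h, w = page.shape
    # Horizontal lighting gradient, darker toward one edge of the page.
    gradient = 1.0 - gradient_strength * np.linspace(0, 1, w)[None, :]
    aged = page * gradient
    # Dust: set a random fraction of pixels to dark values.
    n_dust = int(dust_fraction * page.size)
    ys, xs = rng.integers(0, h, n_dust), rng.integers(0, w, n_dust)
    aged[ys, xs] = rng.uniform(0.0, 0.2, n_dust)
    return aged

def clean_page(aged, threshold=0.5):
    """Naive cleanup: global threshold back to a binary page."""
    return (aged > threshold).astype(float)

clean = np.ones((64, 64))          # a blank white "page" (1.0 = white)
clean[20:44, 30:34] = 0.0          # a dark vertical stroke, e.g. the letter "l"
aged = age_page(clean)
restored = clean_page(aged)
print(aged.mean() < clean.mean())  # True: aging darkens the page
```

Pairing such synthetically aged pages with their clean originals yields labeled training data for learned restoration models.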

Quantifying the Effectiveness of Scientific Documentaries Using Natural Language Processing

NCSA’s Advanced Visualization Lab (AVL), in collaboration with the iSchool, is looking for an undergraduate research intern to help with a project that builds on the research of doctoral candidate Rezvaneh (Shadi) Rezapour and Professor Jana Diesner, which uses data mining and natural language processing techniques to study the effects of issue-focused documentary films on various audiences by analyzing reviews and comments on streaming media sites.

This new research will focus specifically on science-themed documentaries that use computational science research in their science explanations. Student researchers will work with mentors in the iSchool (Professor Jill Naiman) and AVL to collect data from streaming sites and analyze it, both by using existing purpose-built software and by developing new tools.
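As a flavor of the text-analysis side, here is a toy lexicon-based scorer for reviews. The word lists and scoring rule are my own minimal illustration; the actual project uses purpose-built NLP software rather than this kind of simple lexicon matching.

```python
import re
from collections import Counter

# Toy sentiment lexicon (hypothetical; real tools use far richer models).
POSITIVE = {"fascinating", "beautiful", "informative", "stunning", "great"}
NEGATIVE = {"boring", "confusing", "slow", "inaccurate", "bad"}

def score_review(text):
    """Net sentiment: positive word count minus negative word count."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return (sum(words[w] for w in POSITIVE)
            - sum(words[w] for w in NEGATIVE))

reviews = [
    "A fascinating and beautiful look at simulations of the early universe.",
    "Informative but slow in places.",
    "Boring and confusing throughout.",
]
scores = [score_review(r) for r in reviews]
print(scores)  # [2, 0, -2]
```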

No skills are required; students will be trained to classify textual documentary reviews. Preferred: a background in interdisciplinary research.

The Reading Time Machine: Transforming Astrophysical Literature into Actionable Data

This project is a subset of a NASA Astrophysics Data Analysis Program (ADAP) project aimed at creating several science-ready data products to help astronomers search the literature in new ways. This goal is being accomplished by extending the NASA Astrophysics Data System (ADS), known as an invaluable literature resource, into a series of data resources.

One part of this process will be classifying the figures that appear in journal articles by their “type” (for astronomical literature, classes will include things like “images of the sky,” “graphs,” “simulations,” etc.). For this summer research project, a student will help with this image classification, both by hand and by testing machine learning methods, in collaboration with Dr. Jill Naiman and/or a graduate student (School of Information Sciences and NCSA).

The main parts of the project will involve developing the codebook of image classifications, so that citizen scientists can complete more classifications at a large scale, and running the by-hand classification scripts. Options to extend this work by improving the UI for the classification scripts (in Python, and/or for the Zooniverse citizen science platform) and by working with machine learning methods for image classification are available for interested students.
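A toy version of figure-type classification can be built with hand-crafted features and nearest-centroid matching. The synthetic "sky image" and "graph" generators and the two features below are my own stand-ins for the codebook classes, not the project's actual data or method:

```python
import numpy as np

rng = np.random.default_rng(1)

def features(img):
    """Tiny hand-crafted feature vector: mean brightness and edge density."""
    gy, gx = np.gradient(img.astype(float))
    return np.array([img.mean(), np.abs(gy).mean() + np.abs(gx).mean()])

# Synthetic stand-ins for two figure types from the codebook:
# "sky" = a smooth noise field, "graph" = sparse dark lines on white.
def fake_sky():
    return rng.normal(0.4, 0.05, (32, 32))

def fake_graph():
    img = np.ones((32, 32))
    img[rng.integers(0, 32), :] = 0.0   # a horizontal line
    img[:, rng.integers(0, 32)] = 0.0   # a vertical line
    return img

train = {"sky": [features(fake_sky()) for _ in range(20)],
         "graph": [features(fake_graph()) for _ in range(20)]}
centroids = {label: np.mean(vecs, axis=0) for label, vecs in train.items()}

def classify(img):
    f = features(img)
    return min(centroids, key=lambda lbl: np.linalg.norm(f - centroids[lbl]))

print(classify(fake_sky()), classify(fake_graph()))
```

Modern approaches learn the features with convolutional networks instead, but the train-then-compare structure is the same.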

Skills required:

  • Patience – comfortable classifying images by hand
  • Attention to detail – to develop the codebook for different and tricky image classes
  • Curiosity about the machine learning image classification process

Preferred skills:

  • Experience with Python
  • Experience with machine learning (can be taught “on the job”)

WormFindr: Automatic Segmentation of Neurons in C. elegans

This project is part of the larger WormAtlas and C. elegans project, recently funded by the NIH, which aims to study and visualize the anatomy of C. elegans in order to better understand their neural connections. For the WormFindr SPIN project, our goal is to apply machine learning segmentation models and test how effectively they can automatically segment the neurons in images of C. elegans.
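Testing the effectiveness of segmentation models usually means comparing predicted masks against ground truth with an overlap metric. This sketch computes the Dice coefficient on made-up masks (the 16×16 "neuron" is a hypothetical example, not project data):

```python
import numpy as np

def dice_score(pred, truth):
    """Dice coefficient between two binary masks (1.0 = perfect overlap)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * intersection / denom if denom else 1.0

# Hypothetical ground-truth neuron mask and a prediction shifted by one pixel.
truth = np.zeros((16, 16), dtype=bool)
truth[4:10, 4:10] = True                  # a 36-pixel square "neuron"
pred = np.zeros((16, 16), dtype=bool)
pred[5:11, 5:11] = True
print(dice_score(pred, truth))            # 2*25/(36+36) ≈ 0.694
```

Averaging such scores across a labeled test set is the standard way to rank competing segmentation models.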

Preferred Skills: Experience with programming, preferably in Python. Knowledge or practice of machine learning methods is welcome, but not required.

Santiago Nunez-Corrales

quAPL: Implementing a High-Level Array Programming Language for Quantum Computing

This project seeks to more fully implement a set of quantum computing primitives in quAPL, an experimental high-level array programming language that aims to democratize access to quantum programming. Preferred Skills: Prior exposure to functional programming, APL, and Amazon Braket.
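The fit between array programming and quantum computing comes from the fact that quantum primitives are just array operations: a state over n qubits is a length-2^n vector, and gates are matrices applied by matrix multiplication. This Python/NumPy sketch (illustrative only; quAPL itself is APL-based) prepares a Bell state:

```python
import numpy as np

# Quantum states and gates as plain arrays.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

zero = np.array([1.0, 0.0])                    # |0>
state = np.kron(zero, zero)                    # two-qubit |00>

# Hadamard on qubit 0, identity on qubit 1, then CNOT -> Bell state.
state = np.kron(H, np.eye(2)) @ state
state = CNOT @ state
print(np.round(state, 3))                      # ~[0.707, 0, 0, 0.707]
```

The Kronecker product (`np.kron`) lifts single-qubit gates to the full register, which is exactly the kind of operation array languages express tersely.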

Andre Schleife

Building Blender Scenes for Visualizing Materials Science Research

Visualizing the outcomes of materials science research for a broad general public is important to convey the successes and insights of scientific research. For this project, we are looking for students with experience in Blender who can help build scenes for visualizing atomic/crystal structures, bonds, and phonons. The goal is to make these scenes look interesting and artistic, and a corresponding background would be helpful.

Aiman Soliman, Zeynep Madak-Erdogan

Spatial Analysis of Metastatic Breast Cancer Tumor Heterogeneity Using Machine Learning Techniques

Using spatial sequencing data and additional health records, we aim to identify genes critical for metastatic tumors. We will analyze clinical parameters such as blood glucose and liver enzymes and correlate these parameters with genes that show homogeneous or heterogeneous expression throughout a sample.
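The correlation step described above can be sketched with synthetic numbers: compute the Pearson correlation of each gene's expression with a clinical parameter and flag the strongest association. The data below are randomly generated stand-ins (gene 2 is constructed to correlate with the "glucose" values), not study data:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical data: expression of 5 genes across 50 tissue spots,
# plus one clinical parameter (e.g. blood glucose) per spot's sample.
n_spots, n_genes = 50, 5
expression = rng.normal(size=(n_spots, n_genes))
glucose = 2.0 * expression[:, 2] + rng.normal(scale=0.5, size=n_spots)

def correlate(expression, clinical):
    """Pearson correlation of each gene's expression with a clinical value."""
    return np.array([np.corrcoef(expression[:, g], clinical)[0, 1]
                     for g in range(expression.shape[1])])

r = correlate(expression, glucose)
print(int(np.argmax(np.abs(r))))   # gene 2 stands out
```

Real analyses add multiple-testing corrections and spatial statistics on top of this basic per-gene association test.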

Multi-Scale Spatial Analysis of Lung Cancer in Chicago

In this project, we aim to identify neighborhood factors that impact the gene expression profiles, and thus the clinical outcomes, of lung tumors. Multilevel integration of neighborhood data, patient data, and spatial gene expression results will enable the student to select genes indicative of a poor neighborhood's impact on tumor biology.

Sever Tipei

Music on High-Performance Computers

The project centers on DISSCO, software for composition, sound design, and music notation/printing developed at UIUC, NCSA, and Argonne National Laboratory. Written in C++, it includes a graphical user interface built with gtkmm.

A parallel version has been developed at the San Diego Supercomputer Center with support from XSEDE (Extreme Science and Engineering Discovery Environment). DISSCO has a directed graph structure and uses stochastic distributions, sieves (part of number theory), Markov chains, and elements of information theory to produce musical compositions. Presently, efforts are directed toward refining a system for the notation of music, as well as toward the realization of an Evolving Entity: a composition whose aspects change when computed recursively over long periods of time, thus mirroring the way living organisms are transformed over time (Artificial Life). Another possible direction of research is the sonification of complex scientific data, i.e., the aural rendition of computer-generated data, as a companion to visualization.
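One of the stochastic devices named above, the Markov chain, can be illustrated with a tiny pitch generator. This is a minimal sketch in Python (DISSCO itself is written in C++), with a made-up three-pitch transition table:

```python
import random

random.seed(3)

# First-order Markov chain over pitch classes: the transition probabilities
# define the style; walking the chain generates a melodic line.
TRANSITIONS = {
    "C": [("E", 0.5), ("G", 0.5)],
    "E": [("G", 0.6), ("C", 0.4)],
    "G": [("C", 0.7), ("E", 0.3)],
}

def next_pitch(current):
    choices, weights = zip(*TRANSITIONS[current])
    return random.choices(choices, weights=weights)[0]

def compose(start="C", length=16):
    line = [start]
    for _ in range(length - 1):
        line.append(next_pitch(line[-1]))
    return line

melody = compose()
print(len(melody), set(melody) <= {"C", "E", "G"})
```

Recomputing such a chain with slowly mutating transition weights is one way to picture the "Evolving Entity" idea of a composition that changes each time it is realized.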

More information about:

  • DISSCO
  • Notation
  • Evolving Entity

Skills required:

  • Proficiency in C++ programming
  • Familiarity with the Linux operating system

Preferred skills:

  • Familiarity with music notation

Antonios Tsokaros

High Performance Computing for Magnetized Neutron Stars

Neutron stars are extraordinary not only because they are the densest form of matter in the visible Universe, but also because they have magnetic fields that can reach levels capable of distorting the very nature of the quantum vacuum. In this project, with the help of supercomputers, we will study the combined gravitational and electromagnetic field of a neutron star in a self-consistent way in order to create a realistic model for the first time.

The successful applicant will use the Einstein Toolkit to perform astrophysical simulations of magnetized neutron stars that will help us better understand multimessenger events such as GW170817. Theoretical work in magnetohydrodynamics will also be possible. Students from computer science, physics, and astronomy are invited to apply.

Students Pushing Innovation (SPIN)
1205 W. Clark St.
Urbana, IL 61801
Email: kindrat2@illinois.edu