SPIN 2023 Summer Mentors

Mixed Reality Digital Technology for Digital Twins

Mohamad Alipour

Engineering tasks such as design, monitoring, inspection, and education and training rely on our ability to create high-fidelity models describing the behavior of structures in the physical world (e.g., bridges, wind turbines, pipelines, power transmission lines, etc.). Efficient execution of these engineering tasks in the information age relies on our ability to digitize physical structures and systems and to connect sensing data and physical models to a digital replica of the physical structure (a digital twin). This study aims to create such digital representations for engineering systems, visualize them using virtual and augmented reality (VR/AR), and study their suitability for performing engineering tasks.

To that end, 3D models of sample structures will be captured via laser scanners or developed in graphical software and connected to the physical structure using sensing data and existing mechanical models. These models will then be imported into a mixed reality environment and enriched with sensing data, condition descriptions, and other information of interest for the specific engineering task. Depending on the progress of the project, opportunities exist to participate in real tests of target structures instrumented with mechanical and optical sensors in the Newmark Structural Engineering Laboratory (NSEL) in the Department of Civil and Environmental Engineering. The end goal is to create a prototype digital twin of a structure that can be viewed and interacted with using AR/VR headsets.

 

Student Background and Research Activities

Successful applicants will have strong programming skills and experience with game engines and virtual reality development (e.g., Unity or similar platforms). No knowledge of civil, structural, or mechanical engineering is required. This project involves exciting research activities, including mixed reality application development and programming, computer simulations, and potentially working with sensing systems. The student will also work with mixed reality headsets such as the Meta Oculus, HTC Vive, and Magic Leap. Depending on progress, this project may lead to a conference paper and/or a longer-term research position.

 

Contact Mohamad Alipour

Science Gateways – Learn What Makes an HPC Gateway “Tick” 

Greg Bauer

Science gateways improve the accessibility, usability, and community use of research software on high-performance computing (HPC) clusters. One of the NCSA Delta Project's programs is the development of such a gateway, the Delta Science Gateway, which enables access to NVIDIA GPU-accelerated tools, applications, and codes available on the Delta HPC clusters for researchers who are not familiar or comfortable with traditional HPC interfaces such as command-line environments.

 

In this SPIN project you will assist with the deployment and integration of existing computational software on the gateway. These applications come from a range of research areas, from machine learning to molecular modeling and everywhere in between, with interfaces ranging from the command line to desktop-like graphical user interfaces to web-style interfaces like that used by Jupyter notebooks. Through this program you will learn about software compilation, building, and installation, as well as the integration of these applications so that they are available via the web interface. You will gain an understanding of how researchers use these applications and how powerful computational resources like Delta are essential for many types of workloads.

 

Contact Greg Bauer

HPC and ML Benchmarking 

Greg Bauer

The NCSA Workload and Benchmarking group is looking for a motivated intern to help design, implement, and evaluate high-performance computing (HPC) and machine learning (ML) benchmarks for use in system and software performance evaluation, regression testing, and architecture evaluation.

 

The intern will work with NCSA staff, helping to select existing computational or I/O workloads, prepare the selected workloads for internal and external execution, add them to git repositories, and collect and evaluate performance metrics.

Preferred Skills: Familiarity with software build systems such as autoconf and cmake; experience with git and Linux; some knowledge of computer architecture is helpful.
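For a flavor of the metric-collection side of the work, here is a minimal sketch of a micro-benchmark harness in Python, assuming only NumPy; real HPC/ML benchmarks wrap full applications and I/O workloads, but the timing-and-reporting pattern is similar.

```python
"""Minimal micro-benchmark sketch: time a dense matrix multiply
and report runtime and achieved GFLOP/s."""
import time
import numpy as np

def benchmark_matmul(n=2048, repeats=5):
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    flops = 2 * n**3  # multiplies plus adds in an n x n matmul
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        a @ b  # the workload being measured
        times.append(time.perf_counter() - t0)
    best = min(times)  # best-of-N reduces timer and OS noise
    print(f"n={n}: best {best:.3f} s, {flops / best / 1e9:.1f} GFLOP/s")

if __name__ == "__main__":
    benchmark_matmul()
```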

 

Contact Greg Bauer

Living Encyclopedia

Kevin Chang

Develop algorithms and models to:

  • Automatically discover keywords (concepts or entities, e.g., “data structure”, “neural network”) used in a professional domain (e.g., computer science)
  • Build an “encyclopedia” for these keywords
  • Organize information in the domain by the keywords

Techniques: machine learning, data mining, natural language processing.
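As an illustration of the keyword-discovery step, here is a minimal sketch using scikit-learn's TF-IDF scoring over a toy corpus (the documents are placeholders); a production system would add entity recognition, phrase mining, and domain filtering.

```python
"""Score candidate n-grams by aggregate TF-IDF as a crude
keyword-discovery baseline over a toy corpus."""
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "A neural network learns representations from data.",
    "A data structure organizes data for efficient access.",
    "Convolutional neural network models dominate computer vision.",
]

vec = TfidfVectorizer(ngram_range=(1, 2), stop_words="english")
tfidf = vec.fit_transform(docs)
scores = tfidf.sum(axis=0).A1  # aggregate score per candidate term
terms = vec.get_feature_names_out()
for term, score in sorted(zip(terms, scores), key=lambda p: -p[1])[:5]:
    print(f"{term}: {score:.2f}")
```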


Contact Kevin Chang

Implementing an OpenPMD Reader Plugin for Visit

Roland Haas



Modern scientific simulations have enabled us to study non-linear phenomena that are impossible to study otherwise. Among the most challenging problems is the study of Einstein's theory of relativity, which predicts the existence of gravitational waves, detected very recently by the LIGO collaboration. The Einstein Toolkit is a community-driven framework for astrophysical simulations. I am interested in recruiting a student interested in improving the quality of gravitational waveform templates describing colliding black holes produced with the Einstein Toolkit.

 

CarpetX, a new driver for the Einstein Toolkit based on AMReX, is now available for testing. A driver in the Einstein Toolkit is responsible for basic computational algorithms such as adaptive mesh refinement, parallelism, inter-process communication, and GPU offloading. Thorns can then implement the physics bits and discretization methods, relying on the driver to stitch everything together into a single application. CarpetX offers a range of new features for the Einstein Toolkit that are interesting for hydrodynamics or magnetic fields (staggered grids, refluxing), improved performance (multi-threading, GPUs, scalability, I/O), and additional safety features that prevent or catch programming errors (uninitialized grid points, inconsistent definitions). It uses the openPMD (https://www.openpmd.org) file format for 3D output during simulation and for checkpointing.
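For a feel of the openPMD data model the plugin would traverse, here is a minimal reading sketch assuming the openpmd-api Python bindings; a VisIt plugin would use the C++ API, but the series → iterations → meshes → components hierarchy is the same, and the file pattern below is a placeholder.

```python
"""Walk an openPMD series and print the shape of every mesh
component in every iteration."""
import openpmd_api as io

series = io.Series("data_%T.bp", io.Access.read_only)
for index, it in series.iterations.items():
    for name, mesh in it.meshes.items():
        for comp_name, comp in mesh.items():
            data = comp.load_chunk()  # schedules the read
            series.flush()            # performs the read
            print(index, name, comp_name, data.shape)
series.close()
```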

 

This project will be to implement a reader plugin for the 3D visualization package VisIt (https://wci.llnl.gov/simulation/computer-codes/visit) for use with CarpetX and openPMD. The successful applicant will be involved with the Gravity Group at NCSA and will be invited to participate in the weekly group meetings and discussions of their research projects.

 

Required and preferred skills:

Familiarity with Linux, git, make

Strong working knowledge of C++ in application development and using the STL

Working knowledge of Python and matplotlib

No experience with OpenGL or 3D rendering is required.

Pre-interview exercise: Before applying, you must complete the exercise at https://wiki.ncsa.illinois.edu/display/~rhaas/SPIN+2023+Exercise and send the results to rhaas@illinois.edu. I will *not* consider your application if I have not received the exercise material in advance.

 

Contact Roland Haas

Development and Application of AI Models

Volodymyr Kindratenko

 

In this project, we are developing various AI models to help solve challenging science and engineering problems. The student will help with gathering data, evaluating existing deep learning models, and developing and evaluating new models and software.

 

Contact Volodymyr Kindratenko

DeepDISC: Detection, Instance Segmentation, and Classification for Astronomical Surveys with Deep Learning  

Xin Liu


The next generation of massive astronomical surveys, such as the upcoming Legacy Survey of Space and Time (LSST) on the Rubin Observatory, will deliver unprecedented volumes of images through the 2020s and beyond. As both sensitivity and depth increase, larger numbers of blended (overlapping) sources will occur. If left unaccounted for, blending will bias measurements of sources that are assumed to be isolated, contaminating key inputs to cosmological inference such as photometry, photometric redshifts, morphology, and weak gravitational lensing, which are used to probe the nature of dark matter and dark energy.

In the LSST era, efficient deblending techniques are a necessity and have thus been recognized as a high priority. However, an efficient and robust method to detect, deblend, and classify sources in upcoming massive surveys is still lacking. Leveraging the rapidly developing field of computer vision, this NCSA project will develop a deep learning framework, “DeepDISC”. DeepDISC will efficiently process images and accurately identify blended galaxies with the lowest possible latency to maximize the science returns of upcoming massive astronomical surveys. The approach is fundamentally different from traditional methods. The project is interdisciplinary, combining state-of-the-art astronomy data with the latest deep learning tools from computer science. DeepDISC will efficiently and robustly detect, deblend, and classify sources in upcoming surveys at depths close to the confusion limit. It will also provide accurate estimates of the deblending uncertainty, which can be propagated further down the analysis of galaxy properties for cosmological inference. The project has strong implications for a wide range of problems in astronomy, ranging from efficiently detecting transients and solar system objects to probing the nature of dark matter and dark energy. DeepDISC will be directly applicable to LSST as well as other upcoming massive surveys such as NASA’s Roman Space Telescope. The program will reinforce the Illinois brand in big data and survey science.
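As a rough illustration of the kind of instance-segmentation inference involved, here is a minimal sketch using torchvision's pre-trained Mask R-CNN on a placeholder image; the actual DeepDISC framework, backbone, and astronomical training data differ.

```python
"""Run instance-segmentation inference: each detection carries a
bounding box, class label, confidence score, and pixel mask."""
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# Placeholder for a 3-band image tensor with values in [0, 1];
# a survey cutout would be normalized into this form.
image = torch.rand(3, 512, 512)

with torch.no_grad():
    (output,) = model([image])

keep = output["scores"] > 0.5  # discard low-confidence detections
print(f"{int(keep.sum())} sources detected")
print(output["boxes"][keep].shape, output["masks"][keep].shape)
```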

 

 Preferred Skills:

  • Python
  • PyTorch
  • Deep Learning

 

Contact Xin Liu

A Machine Learning and Geospatial Approach to Targeting Humanitarian Assistance Among Syrian Refugees in Lebanon

Angela Lyons, Alejandro Montoya Castano

 

An estimated 84 million persons are forcibly displaced worldwide, and at least 70% of these are living in conditions of extreme poverty. More efficient targeting mechanisms are needed to better identify vulnerable families who are most in need of humanitarian assistance. Traditional targeting models rely on a proxy means testing (PMT) approach, where support programs target refugee families whose estimated consumption falls below a certain threshold.

Despite the method's practicality, it provides limited insight, its predictions are often inaccurate, and this can undermine targeting effectiveness and fairness. Alternatively, multidimensional approaches to assessing poverty are now being applied in the refugee context, yet they require extensive information that is often unavailable or costly to collect. This project applies machine learning and geospatial methods to novel data collected from Syrian refugees in Lebanon to develop more effective and operationalizable targeting strategies that reliably complement current PMT and multidimensional methods. The insights from this project have important implications for humanitarian organizations seeking to improve current targeting mechanisms, especially given rising poverty and displacement and limited humanitarian funding.
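As a toy illustration of an ML-based targeting model of the sort described above, here is a minimal scikit-learn sketch; the features, labels, and data are synthetic placeholders, not the project's actual survey variables.

```python
"""Train a classifier to flag households in need, as an ML
complement to a single PMT consumption threshold."""
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 6))  # household indicators (placeholder)
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) < 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_tr, y_tr)

# Ranking households by predicted need, rather than applying one
# consumption cutoff, is one way ML can complement PMT.
print(classification_report(y_te, model.predict(X_te)))
```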

 

Preferred Skills: 

Background in computer science, data science, and statistical modeling

Programming languages: Python, R, and/or Stata

Basic knowledge and skills in machine learning and/or geospatial analysis

Expertise in creating maps and other data visualizations

Experience in programming and developing dashboards

 

Contact Angela Lyons

Enhancing Optical Character Recognition (OCR) Capabilities for Historical Documents  

Jill Naiman

This project is a subset of a NASA Astrophysics Data Analysis Program (ADAP) project aimed at creating several science-ready data products to help astronomers search the literature in new ways. This goal is being accomplished by extending the NASA Astrophysics Data System (ADS), an invaluable literature resource, into a series of data resources. One part of this project involves “reading” figure captions from scanned article pages using Optical Character Recognition (OCR).

 

A large source of error in the OCR process comes from artifacts present on scanned pages: scan effects such as warping, lighting gradients, and dust can generate many misspellings. This project focuses on better understanding these effects, using image processing and analysis to clean old images before OCR and, potentially, to generate artificial training data from “aged” images of newer, digitized documents.
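As a small illustration of this kind of pre-OCR cleanup, here is a minimal sketch assuming OpenCV and pytesseract are installed (with the Tesseract binary available); the file path is a placeholder, and real scans would need more tailored processing.

```python
"""Denoise and binarize a scanned page before handing it to OCR."""
import cv2
import pytesseract

img = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)

# Suppress dust and film grain, then threshold adaptively so
# lighting gradients across the page do not swallow faint glyphs.
denoised = cv2.fastNlMeansDenoising(img, h=10)
binary = cv2.adaptiveThreshold(
    denoised, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
    cv2.THRESH_BINARY, blockSize=31, C=10,
)

print(pytesseract.image_to_string(binary))
```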

 

Contact Jill Naiman

Quantifying the Effectiveness of Scientific Documentaries Using Natural Language Processing 

Jill Naiman

NCSA’s Advanced Visualization Lab (AVL), in collaboration with the iSchool, is looking for an undergraduate research intern to help with a project that builds on the research of doctoral candidate Rezvaneh (Shadi) Rezapour and Professor Jana Diesner, which uses data mining and natural language processing techniques to study the effects of issue-focused documentary films on various audiences by analyzing reviews and comments on streaming media sites.


This new research will focus specifically on science-themed documentaries that use computational science research in their science explanations. Student researchers will be responsible for working with mentors in the iSchool (Professor Jill Naiman) and AVL to collect data from streaming sites and to analyze the data using existing purpose-built software and newly developed tools.
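As a toy illustration of the text-classification component, here is a minimal scikit-learn sketch; the reviews and label scheme are invented placeholders, not project data.

```python
"""Classify documentary reviews with a TF-IDF + logistic
regression pipeline on a toy labeled set."""
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = [
    "The simulations made the science vivid and inspiring.",
    "Beautiful visuals, I finally understood black holes.",
    "Confusing and dull, the explanations lost me.",
    "Too long and the science was muddled.",
]
labels = [1, 1, 0, 0]  # 1 = engaged/positive, 0 = negative (placeholder)

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(reviews, labels)
print(clf.predict(["The visual effects taught me real astrophysics."]))
```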

No skills are required; students will be trained to classify the text of documentary reviews. Preferred: a background in interdisciplinary research.

 

Contact Jill Naiman

The Reading Time Machine: Transforming Astrophysical Literature into Actionable Data 

Jill Naiman

This project is a subset of a NASA Astrophysics Data Analysis Program (ADAP) project aimed at creating several science-ready data products to help astronomers search the literature in new ways. This goal is being accomplished by extending the NASA Astrophysics Data System (ADS), an invaluable literature resource, into a series of data resources.


One part of this process will be classifying the figures that appear in journal articles by their “type” (for astronomical literature, classes will include things like “images of the sky,” “graphs,” “simulations,” etc.). For this summer research project, a student will help with this image classification, both by hand and by testing machine learning methods, in collaboration with Dr. Jill Naiman and/or a graduate student (School of Information Sciences and NCSA).

The main parts of the project will involve developing the codebook of image classifications so that citizen scientists can complete more classifications at large scale, and running the by-hand classification scripts. Options to extend this work by improving the UI of the classification scripts (in Python and/or for the Zooniverse citizen science platform) and by working with the machine learning methods for image classification are available for interested students.
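As a sketch of what a by-hand classification script might look like, assuming figures sit in a local "figures/" directory (a placeholder) and labels are appended to a CSV; the project's actual scripts and codebook may differ.

```python
"""Prompt a labeler for a codebook class per figure and record
the choices in a CSV for later comparison with ML classifiers."""
import csv
from pathlib import Path

CODEBOOK = {"1": "image of the sky", "2": "graph", "3": "simulation"}

with open("labels.csv", "a", newline="") as f:
    writer = csv.writer(f)
    for path in sorted(Path("figures").glob("*.png")):
        print(f"\n{path.name}  --  classes: {CODEBOOK}")
        choice = input("class number (or 's' to skip): ").strip()
        if choice in CODEBOOK:
            writer.writerow([path.name, CODEBOOK[choice]])
```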

Skills required: 

Patience – ok with classifying images by hand 

Attention to detail – to develop the codebook for different and tricky image classes 

Curiosity about the machine learning image classification process

Preferred skills: 

Experience with Python

Experience with machine learning (can be taught “on the job”)

 

Contact Jill Naiman

WormFindr: Automatic Segmentation of Neurons in C. elegans

Jill Naiman, Kalina Borkiewicz

This project is part of the larger WormAtlas and C. elegans project, recently funded by the NIH, which aims to study and visualize the anatomy of C. elegans in order to better understand their neural connections. For the WormFindr SPIN project, our goal is to apply machine learning segmentation models and test how effectively they can automatically segment the neurons in images of C. elegans.

 

Preferred Skills:

Experience with programming, preferably in Python

Knowledge or practice of machine learning methods is welcome, but not required

 

Contact Jill Naiman

quAPL: Implementing a High-Level Array Programming Language for Quantum Computing

Santiago Nunez-Corrales

 

This project seeks to more fully implement a set of quantum computing primitives in quAPL, an experimental high-level array programming language that aims to democratize access to quantum programming.

Preferred Skills: Prior exposure to functional programming, APL, and Amazon Braket.

 

Contact Santiago Nunez-Corrales

Building Blender Scenes for Visualizing Materials Science Research

Andre Schleife

Visualizing the outcomes of materials science research for a broad general public is important for conveying the successes and insights of scientific research. For this project, we are looking for students with experience in Blender who can help build scenes for visualizing atomic/crystal structures, bonds, and phonons. The goal is to make these scenes look interesting and artistic, so a corresponding artistic background would be helpful.


Contact Andre Schleife

Spatial Analysis of Metastatic Breast Cancer Tumor Heterogeneity Using Machine Learning Techniques

Aiman Soliman, Zeynep Madak-Erdogan

Using spatial sequencing data and additional health records, we aim to identify genes critical for metastatic tumors. We will analyze clinical parameters such as blood glucose and liver enzymes and correlate these parameters with genes that show homogeneous or heterogeneous expression throughout a sample.
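As a minimal sketch of this correlation step, assuming pandas and SciPy, with randomly generated placeholders standing in for the expression matrix and clinical values rather than patient data:

```python
"""Correlate a clinical parameter with per-gene expression using
Spearman rank correlation."""
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
samples, genes = 30, 5
expr = pd.DataFrame(rng.normal(size=(samples, genes)),
                    columns=[f"gene_{i}" for i in range(genes)])
glucose = rng.normal(100, 15, size=samples)  # e.g., blood glucose (mg/dL)

for gene in expr.columns:
    rho, p = spearmanr(expr[gene], glucose)
    print(f"{gene}: rho={rho:+.2f}, p={p:.3f}")
```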

 

Contact Aiman Soliman

Multi-Scale Spatial Analysis of Lung Cancer in Chicago

Aiman Soliman, Zeynep Madak-Erdogan

In this project, we aim to identify neighborhood factors that impact gene expression profiles and thus the clinical outcomes of lung tumors. Multilevel integration of neighborhood data, patient data, and spatial gene expression results will enable the student to select genes indicative of the impact of poor neighborhoods on tumor biology.

 

Contact Aiman Soliman

Music on High-Performance Computers

Sever Tipei

The project centers on DISSCO, software for composition, sound design, and music notation/printing developed at UIUC, NCSA, and Argonne National Laboratory. Written in C++, it includes a graphical user interface using gtkmm.

A parallel version has been developed at the San Diego Supercomputer Center with support from XSEDE (Extreme Science and Engineering Discovery Environment). DISSCO has a directed-graph structure and uses stochastic distributions, sieves (from number theory), Markov chains, and elements of information theory to produce musical compositions. Presently, efforts are directed toward refining a system for music notation as well as toward the realization of an Evolving Entity: a composition whose aspects change when computed recursively over long periods of time, thus mirroring the way living organisms are transformed over time (artificial life). Another possible direction of research is sonification of complex scientific data, the aural rendition of computer-generated data, as a companion to visualization.
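As a toy illustration of the Markov-chain idea behind such stochastic composition (in Python rather than DISSCO's C++, with an invented transition table rather than DISSCO's actual model):

```python
"""Generate a short pitch sequence from a first-order Markov chain."""
import random

# Probability of moving from one pitch to the next.
transitions = {
    "C": [("D", 0.5), ("E", 0.3), ("G", 0.2)],
    "D": [("C", 0.4), ("E", 0.6)],
    "E": [("G", 0.7), ("C", 0.3)],
    "G": [("C", 0.8), ("D", 0.2)],
}

def generate(start="C", length=16):
    pitch, melody = start, [start]
    for _ in range(length - 1):
        choices, weights = zip(*transitions[pitch])
        pitch = random.choices(choices, weights=weights)[0]
        melody.append(pitch)
    return melody

print(" ".join(generate()))
```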

 

More information about: 

DISSCO 

Notation 

Evolving Entity 

 

Skills required: 

Proficiency in C++ programming 

Familiarity with Linux Operating System 

 

Preferred skills: Familiarity with music notation.

 

Contact Sever Tipei

High Performance Computing for Magnetized Neutron Stars 

Antonios Tsokaros

Neutron stars are extraordinary not only because they are the densest form of matter in the visible Universe but also because they have magnetic fields that can reach levels capable of distorting the very nature of the quantum vacuum. In this project, with the help of supercomputers, we will study the combined gravitational and electromagnetic field of a neutron star in a self-consistent way in order to create a realistic model for the first time.

The successful applicant will use the Einstein Toolkit to perform astrophysical simulations of magnetized neutron stars that will help us better understand multimessenger events like GW170817. Theoretical work in magnetohydrodynamics will also be possible. Students from computer science, physics, and astronomy are invited to apply.

 

Contact Antonios Tsokaros