SPIN – 2020 Summer Mentors

GPU Computing for Bionanotechnology 

Aleksei Aksimentiev

Atomic Resolution Brownian Dynamics (ARBD) is a GPU-accelerated code developed by the Aksimentiev lab at Illinois to perform coarse-grained molecular dynamics simulations of biomolecular systems. We are looking for students with an interest in, and some experience with, the development of scientific software to assist with ARBD development.

Possible research projects include implementing popular coarse-grained models in ARBD, redesigning the ARBD class structure, improving ARBD's parallel performance on multi-node GPU systems, and developing a graphical user interface. Qualified students should be familiar with the C++ language and with programming in a Linux environment. Ideally, they would also have some basic knowledge of CUDA programming. Previous experience with microscopic simulations, numerical algorithms, parallel programming, or GUI development using C/C++ and/or Tcl/Tk would be an asset.

Athletic Data Analysis

Loretta Auvil

Are you interested in football? Sports? Analytics? We are looking for a student willing to apply their data science skills to sports data. We have football data at the play level, and we are looking for a student to help find patterns in it. The data contains various features that describe each play and what happened on the field.

We are looking for a Python programmer with artificial intelligence or machine learning skills who can find patterns in the data. Data transformations will be necessary as well. We would also like the ability to validate the patterns against another dataset.
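To give a flavor of the kind of pattern mining involved, the sketch below uses only the Python standard library. The column names (down, distance, play_type, yards_gained) and the success criterion are hypothetical stand-ins for whatever the actual play-level data contains:

```python
from collections import Counter, defaultdict

# Hypothetical play-level records; the real data's features will differ.
plays = [
    {"down": 3, "distance": 8,  "play_type": "pass", "yards_gained": 12},
    {"down": 3, "distance": 8,  "play_type": "pass", "yards_gained": 3},
    {"down": 1, "distance": 10, "play_type": "run",  "yards_gained": 4},
    {"down": 1, "distance": 10, "play_type": "run",  "yards_gained": 2},
    {"down": 1, "distance": 10, "play_type": "pass", "yards_gained": 15},
]

# Count how often each (down, play_type) situation occurs, and how often
# it "succeeds" (here, arbitrarily: gains at least the distance needed).
counts = Counter()
successes = defaultdict(int)
for p in plays:
    key = (p["down"], p["play_type"])
    counts[key] += 1
    if p["yards_gained"] >= p["distance"]:
        successes[key] += 1

success_rate = {k: successes[k] / n for k, n in counts.items()}
print(success_rate)
```

A real analysis would replace the hand-coded success rule with learned models and validate the discovered patterns on a held-out set of plays, as described above.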

Champaign County 211 Reboot

Anita Say Chan

The Community Data Clinic (CDC) is seeking a student to lead the programming and web development for a new initiative with Cunningham Township (CT), United Way, and Illinois 211. The project will create a responsive web platform for community services, with the potential to generate a mobile application. The selected student will be able to determine their own schedule and plans for implementation with the support of CDC faculty and staff as needed.

For context, our preliminary plan (which is open and responsive to insight from the student) would use JavaScript to write to and read from a database (e.g., a Firebase Realtime Database) that we’ll populate with entries for local services (location, hours, contact info, and other parameters for institutions like food banks, housing assistance, etc.). The main project will be a user-facing web application that’s simple, intuitive, and integrates user feedback—thus it requires database management (processing feedback, updating entries, etc.) that could be provided within a service (like Firebase) or created independently, depending on project capacities.

 

Desired technical experience:

HTML, CSS, JavaScript (if you’d prefer to implement this in another language, let us know; we’re open to other options)

Interactive web design, mobile design

Experience with database-backed applications

 

The ideal candidate could also be interested in:

Community-centered research

Ethical approaches to tech developments

Collaborative and intersectional team work

Community, civil, or non-profit programming

Mobile app development

Multiscale Modeling of Fracture in Complex Materials

Ahmed Elbanna

In this project, we develop numerical schemes that bridge microscopic and continuum scales to predict deformation and damage in a wide range of material systems, including crustal faults, biological tissues, and engineering composites. The intern(s) will work on code optimization and parallel implementation to speed up the computation.

Skills desired: Matlab, C++, and some knowledge of parallel programming (MPI)

Visual Analytics UX Components

Lisa Gatzke

 

The Visual Analytics team creates software primarily for bio-medical use. We focus particular attention on creating excellent user interfaces and user experiences for the software that we produce. We are seeking student interns who have some experience/skill with standard design software such as Photoshop, Illustrator, and Sketch and are curious and willing to learn more about both design fundamentals and the tools we use to produce our work.

Student tasks could include: interface production assistance, learning new tools to produce unique proofs of concept components, website design and maintenance and presentation of work.

Neuromorphic Computing

Mattia Gazzola, Volodymyr Kindratenko

 

Newly emerging neuromorphic chip architectures from IBM and Intel enable a novel class of algorithms that describe behaviors more akin to those of neural cells than to the fixed logic of conventional computers. Spiking neural networks (SNNs) are at the core of these chips; they differ drastically from other current approaches to AI, such as those based on deep neural networks (DNNs), which require extensive offline training.

The objective of this project is to explore Loihi, an SNN research chip from Intel, in order to understand and characterize its potential for novel algorithms and applications for simulating nature-inspired processes. We are particularly interested in how to integrate real-time learning and adaptation, an essential feature of all living organisms.

 

The student will first study the theory of SNNs and explore simulation software, such as GENESIS, to model SNNs consisting of a small number of neurons. They will then study the architecture of Intel’s Loihi chip and its programming model, supported by Intel’s NxSDK, via remote access to Intel’s Neuromorphic Research Cloud (NRC) system; they will implement small SNNs on this chip and compare results between the simulation software and the chip. Eventually, the student should be able to set up and demonstrate to the research team how SNNs can be implemented and trained on Loihi, and describe their potential and limitations with regard to the neural systems that can be efficiently modeled on this architecture. This project is particularly suited for students interested in pursuing an advanced degree at the intersection of neuroscience and computer science.
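For orientation, the basic unit of an SNN can be illustrated with a minimal leaky integrate-and-fire neuron in plain Python. This is a conceptual sketch only: it does not use the NxSDK API, and the parameter values are arbitrary illustrations rather than Loihi's actual fixed-point neuron model:

```python
# Minimal leaky integrate-and-fire (LIF) neuron -- the basic unit of a
# spiking neural network. All parameter values here are arbitrary.
def simulate_lif(input_current, v_rest=0.0, v_thresh=1.0, leak=0.9, steps=50):
    """Return the list of time steps at which the neuron spikes."""
    v = v_rest
    spikes = []
    for t in range(steps):
        v = leak * v + input_current  # leak, then integrate the input
        if v >= v_thresh:             # threshold crossing -> spike
            spikes.append(t)
            v = v_rest                # reset the membrane potential
        # (membrane potential decays toward rest between inputs)
    return spikes

# A constant supra-threshold input produces a regular spike train;
# zero input produces no spikes at all.
print(simulate_lif(input_current=0.3))
print(simulate_lif(input_current=0.0))
```

Unlike a DNN layer, the "output" here is the timing of spikes rather than a continuous activation, which is what makes event-driven hardware like Loihi a natural substrate.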

Using Satellite Data for Large-Scale Crop Monitoring

Kaiyu Guan, Yizhi Huang

 

Dr. Kaiyu Guan’s lab is conducting research using novel data from NASA satellites to study environmental impacts on global and U.S. agricultural productivity, on the Blue Waters supercomputer, one of the most powerful systems dedicated to scientific research. We are looking for highly motivated, programming-savvy undergraduate students to join the lab for the SPIN program.

The chosen students will be closely mentored by Dr. Guan and will work on processing large satellite datasets, understanding and implementing remote sensing algorithms, and addressing questions related to global food production and food security.

Constructing a Library of Numerical Relativity Waveforms for Use by LIGO

Roland Haas – Gravity Group

 

This project is part of the ongoing effort in the NCSA Gravity Group to study gravitational waves produced by colliding compact objects like black holes and neutron stars. We use various software tools for this purpose. For simulations on supercomputers we use the Einstein Toolkit computational framework. We have developed a Python code, POWER, to postprocess waveforms produced by the Einstein Toolkit.

In this project, you will write Python libraries to extract information from numerical relativity simulations describing mergers of black holes and neutron stars, and to write standardized HDF5 files that LIGO’s software library can use. This involves reading the output of full numerical relativity simulations, computing metadata based on this output, and writing data files to disk. The project requires a working knowledge of Python, in particular the numpy and matplotlib modules, as well as the willingness to read, understand, and build on existing Python code. Prior experience using the version control system git, as well as using the command line, will be highly beneficial.
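To give a flavor of the output step, here is a minimal sketch using numpy and h5py. The dataset and attribute names below are illustrative placeholders, not the layout actually mandated by LIGO's numerical-relativity file format, and the waveform is a toy sinusoid rather than simulation output:

```python
import numpy as np
import h5py

# Illustrative only: "time", "strain", and the attribute names are
# placeholders, not the real LIGO numerical-relativity HDF5 layout.
def write_waveform(filename, times, strain, metadata):
    """Write a (time, strain) waveform plus metadata to an HDF5 file."""
    with h5py.File(filename, "w") as f:
        f.create_dataset("time", data=times)
        f.create_dataset("strain", data=strain)
        for key, value in metadata.items():
            f.attrs[key] = value  # metadata stored as HDF5 attributes

times = np.linspace(0.0, 1.0, 1000)
strain = 1e-21 * np.sin(2 * np.pi * 100 * times)  # toy waveform
write_waveform("waveform.h5", times, strain,
               {"mass-ratio": 1.0, "eccentricity": 0.0})
```

The actual project adds the hard parts: parsing Einstein Toolkit output, computing physically meaningful metadata, and conforming to the standardized format LIGO's library expects.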

 

Before applying for this project please work through an exercise, as this will be part of the interview with prospective candidates.

Deep Learning for Gravitational Wave Astrophysics

Eliu Huerta

 

The Center for Artificial Intelligence Innovation and the NCSA Gravity Group have an opening for a student interested in designing neural network models for the classification and regression of gravitational waves in complex and noisy data sets. These neural network models will be used to explore the ability of deep learning to search for and extract gravitational waves using the low-latency computational infrastructure of the LIGO Project. 

The selected student will become a member of the LIGO Scientific Collaboration, and will be working with a network of researchers at MIT and the University of Washington.

Virtual Micro-Reactor User Facility

Kathryn Huff

 

This student will contribute to literature review, software development, and virtual reality experience testing as part of a project focused on simulating a nuclear micro-reactor. The candidate will help our research group to identify existing modeling and simulation tools relevant to nuclear micro-reactors and use them to create an online reactor simulator game. This activity will be focused on identifying the key features of an open online educational tool for the public, targeted at young teens. 

The student will be expected to work well both independently and with mentorship. A strong interest in scientific computing combined with an interest in nuclear energy is required. The work will allow the student to explore and combine the most recent developments in reactor kinetics and multi-physics software, as well as virtual reality (VR), to develop a virtual micro-reactor simulator appropriate for teaching the public about micro-reactors. This tool, when mature, may be used to train students as well as operations and maintenance personnel to operate these reactors in an advanced VR environment. During and after development, this simulator will be available to interested users both digitally and physically.

The effort will emphasize coupling and reconfiguration of existing nuclear reactor kinetics, dynamics, multi-physics, and I&C simulator software to provide configurable, generic simulators of two micro-reactor classes: gas-cooled and heat-pipe. Additionally, this work will directly contribute to commercialization by providing a useful, flexible framework for vendors to leverage. That is, many vendors have well-developed simulation tools for their reactor concepts; the framework will allow plug-and-play coupling with those reactor physics modules. Components in this design/testing/training software framework will include: a modular API for coupling with existing reactor kinetics, dynamics, multi-physics, and vendor reactor models; 3D interactive VR visualization of instrumentation and controls; a configurable virtual monitoring and control panel for instrumentation and control design; and a physical implementation in the Virtual Education and Research Lab.

Hybrid Energy Systems Modeling for Micro-Reactors

Kathryn Huff

This student will contribute to literature review, software development, and data analysis as part of a project focused on energy systems analysis. The candidate will help our research group to devise a roadmap to simulating and analyzing the hybrid energy systems at universities using the Illinois campus as an archetype. 

The feasibility assessment of our Hybrid Energy System (HES) will be underpinned by the performance of possible future energy transition scenarios with regard to key technical requirements, including:

 

CO2 emissions avoided.

Applicability to electricity, steam, hydrogen production, and research applications.

Capability for new technologies to dynamically integrate with solar, geothermal, and other generation.

Grid resiliency in weather extrema and campus building use fluctuations (holidays, finals week).

To quantify performance with respect to these metrics, our team will comprehensively model the embedded electric, steam, and chilled water systems at Illinois incorporating data and systems currently used to monitor and control our energy system. We will then incorporate new technology models to distinguish options and simulate relevant future campus NRHES scenarios. The results will inform future technology choices, operations optimization, new technology siting, and other considerations underpinning a feasible, near-term, campus HES.

 

 Skills desired: Python, git, literature review skills, strong written communication

 

Community Resource Pool

Kaveh Karimi Asli, Diego Andres Calderon Rivera

 

The goal of this project is to provide community-based non-profit organizations a user-friendly web and mobile dashboard for inventory management. The major stakeholders and target audience of the dashboard are the following:

Non-profit organizations that accept and provide used household items from and to the community for free or at a low, affordable cost. Habitat for Humanity and Veteran Affairs Housing Assistance are examples of such organizations.

Community members and individuals who are looking for places to donate items they no longer use.

People who need and are interested in using such items.

The mission of this project is to use existing technologies developed at NCSA to upcycle and reuse items that would otherwise end up in landfills, and to facilitate dignity and access to resources for under-resourced community members. The dashboard would make it easier to close the life cycle of a product. It can contribute to the three pillars of sustainable development by empowering the community, reducing waste, and reducing expenses for all stakeholders. The project leverages open source software and technologies created at NCSA to automate and optimize the process of accepting and categorizing items, managing requests, and allocating items to the people who have requested them.

 

The main technology we have in mind for this purpose is Clowder, a customizable and scalable data management framework that supports any data format. More specifically, our Clowder instance extends its built-in automatic metadata extraction to categorize images of potential donated items uploaded from the client dashboard. The SPIN interns would work on the client dashboard, which would be based on the Geodashboard framework. This dashboard would talk to Clowder and use its categorization and management capabilities to save and retrieve items submitted and requested by people. The admin dashboard would let the partner organizations administer the process and provide logistics support. Additionally, the dashboard will support uploading images, which will be reviewed by administrators and, if approved, uploaded to Clowder for metadata extraction and management. For an overview of existing projects based on the Geodashboard framework, please take a look at gltg-dev.ncsa.illinois.edu and imlczo.ncsa.illinois.edu/geodashboard.

 

Required experience:

Preferably pursuing a Bachelor of Science in computer science or a related field. Alternative degrees will be considered if accompanied by relevant experience.

Course level experience in software development.

Good sense of design.

Ability to communicate results.

Ability to provide input for presentations and reports.

 

Preferred experience:

Experience with Javascript and Python.

Web development experience.

We do not have funds available to pay the intern, and we will participate in the Open House.

Meeting the LSST Data Challenge: Galaxy Detection and Segmentation with Deep Learning

Xin Liu

The Large Synoptic Survey Telescope (LSST) project, in which NCSA is a major partner, will produce terabytes of data per night and generate a “movie” of the night sky. Correctly detecting, identifying, and segmenting galaxies in LSST images efficiently is a top priority. LSST images will be so deep and detailed that galaxies will be crowded or “blended” together. Looking to the rapidly developing field of computer vision, our group has developed a proof-of-principle deep learning framework to process astronomical images and identify blended galaxies in them.

However, this work, like most others in the literature, has thus far been limited to simulated images and remains untested on a real, large image dataset.

 

To solve this problem, we propose a pilot SPIN research project to apply our novel image segmentation method to real images taken by the most powerful camera in the world, the Hyper Suprime-Cam (HSC), on one of the largest ground-based telescopes, the 8.2-meter Subaru telescope. As the closest match to LSST in terms of expected cutting-edge image data quality, HSC is the ideal publicly available dataset for evaluating and developing this interdisciplinary approach on real astronomical images. The SPIN student will develop and test cutting-edge deep learning architectures within our already proven framework. The student will leverage NCSA resources, such as the HAL GPU cluster, as well as collaborate with local experts. This is an opportunity to test and develop a competitive new approach combining Astronomy Big Data with Machine Learning and Computer Vision to meet the LSST data challenge.

Truth-Seekers and Liars: Sorting Things Out With Python and Logic

Bertram Ludascher

In the famous Knights & Knaves logic puzzles, some characters are truth-tellers and others are liars, and you don’t know who is who. Nowadays, not only factual information but also “fake news” and other forms of misinformation and disinformation can be produced in elaborate ways, using data as apparent evidence. In the Truth-Seekers Project you will learn about problems that can occur with different forms of aggregate data.

In Simpson’s Paradox, for example, conflicting trends from aggregate data can be (mis-)used to advance different (i.e., opposing) arguments. Similarly, gerrymandering employs voter-aggregation techniques to establish an (unfair) advantage of one party over another. More generally, algorithmic bias describes systematic errors in computer-based systems that create unfair outcomes.
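The reversal at the heart of Simpson’s Paradox takes only a few lines of Python to reproduce. The numbers below follow the structure of the classic kidney-stone treatment example and are chosen purely to exhibit the effect:

```python
# Simpson's paradox: treatment A wins inside *both* subgroups,
# yet loses to treatment B once the subgroups are pooled.
data = {
    "A": {"easy": (81, 87),   "hard": (192, 263)},  # (successes, trials)
    "B": {"easy": (234, 270), "hard": (55, 80)},
}

def rate(successes, trials):
    return successes / trials

def overall(arm):
    s = sum(pair[0] for pair in data[arm].values())
    n = sum(pair[1] for pair in data[arm].values())
    return s / n

# A is better within each subgroup ...
assert rate(*data["A"]["easy"]) > rate(*data["B"]["easy"])
assert rate(*data["A"]["hard"]) > rate(*data["B"]["hard"])
# ... but B is better on the aggregate, because A was applied
# disproportionately to the "hard" cases.
assert overall("B") > overall("A")
print("subgroup winner: A, aggregate winner: B")
```

The paradox dissolves once the confounding variable (which subgroup each trial came from) is made explicit, which is exactly the kind of analysis the project's notebooks are meant to surface.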

 

As part of this SPIN project you will be able to “sort things out” by developing a number of Jupyter notebooks (in Python and/or R) that reveal the different ways data can be used (or misused) to make an argument. To analyze alternative scenarios, you can employ existing tools (e.g., the Possible Worlds Explorer, which combines Python data analysis features with a logic programming approach) or develop new ones. The notebooks will be developed and shared via the Whole Tale project so that other truth-seekers can reproduce and explore your findings.

 

 Skills required: Programming experience in Python

 Skills desired: Database experience (SQL)

 

Machine Learning Approach to Computational Fluid Dynamics

Shirui Luo, Volodymyr Kindratenko

Machine learning (ML) has made transformative impacts on modelling many high-dimensional complex dynamical systems. Multiphase flow is one of the promising targets for using ML to improve both the fidelity and efficiency of computational fluid dynamics (CFD) simulations. We are examining the use of ML to fit CFD simulation data in order to develop closure relations for multiphase flow systems.

For example, deep neural networks (DNNs) can be trained on datasets of flows with different initial velocities and void fractions; the trained model is then used to predict flow evolutions under other initial conditions. More broadly, we are tackling problems at the interplay between learning and multiphase flow, such as: How can learning algorithms be constructed to include physical constraints such as the incompressibility of the fluid? Which dimensionality reduction techniques and coarsening strategies are most applicable for identifying hidden low-dimensional features? How can computational scientists, experimentalists, and theorists collaborate to produce a sufficient training database for multiphase flow simulation?

 

The student will use open source software packages such as TensorFlow and PyTorch to construct networks that improve predictive capabilities based on a high-fidelity DNS simulation database. The student will have access to HPC platforms at NCSA and learn to analyze CFD data at large scale. Beyond practicing typical ML skills, the student will also learn, at a more fundamental level, how neural networks can be designed to best incorporate physical constraints while avoiding overfitting to the imposed physics, since typical statistical learning methods can ignore underlying physical principles.
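As a toy stand-in for this workflow, the sketch below fits a one-hidden-layer network, written in plain NumPy rather than TensorFlow or PyTorch, to noisy samples of a known function. Real closure-relation training would use DNS data and the frameworks named above; everything here (the target function, network size, learning rate) is an arbitrary illustration:

```python
import numpy as np

# Toy surrogate-model fit: recover y = sin(x) from noisy "simulation"
# samples with a tiny tanh network trained by hand-written backprop.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(x) + 0.01 * rng.normal(size=x.shape)

# one hidden layer of 16 tanh units
W1 = rng.normal(scale=0.5, size=(1, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 1)); b2 = np.zeros(1)

lr = 0.05
for step in range(2000):
    h = np.tanh(x @ W1 + b1)           # forward pass
    pred = h @ W2 + b2
    err = pred - y                     # gradient of 0.5 * MSE
    # backpropagation by hand
    gW2 = h.T @ err / len(x); gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h ** 2)   # tanh' = 1 - tanh^2
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = float(np.mean((np.tanh(x @ W1 + b1) @ W2 + b2 - y) ** 2))
print(f"final MSE: {mse:.4f}")
```

The research questions above go beyond this sketch: enforcing constraints like incompressibility means building the physics into the network or its loss, rather than hoping the fit respects it.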

Resolving Racial Health Disparities by Using Advanced Statistics and Machine Learning On Complex Multidimensional Datasets

Zeynep Madak-Erdogan, Justina Zurauskiene

African American women have a 4-5 fold greater risk of death from breast cancer compared to Caucasian women, even after controlling for stage at diagnosis, treatment, and other known prognostic factors. Our initial cross-sectional studies suggest that the composition of serum from African American vs. Caucasian women differs and reflects biochemical changes due to socioeconomic status. Thus, we are now tackling a complex multidimensional dataset including proteomic, genomic, biometric, geographic, and socioeconomic measurements.

These dimensions need to be harmonized and the correct statistical approaches applied in order to determine the exact combination of factors that drives this racial health disparity. Additionally, we are planning to increase the size of our dataset, which will make the problem computationally challenging. We are also extending our analyses to other health disparity problems and other datasets. We invite a talented student to participate in this important and exciting project, and to get involved in optimizing our analysis pipelines and developing advanced statistical approaches and data analytics.

 

 Skills desired: Statistics, machine learning, computing, bioinformatics

The Rokwire Project

Kenton McHenry, Sandeep Puthanveetil Satheesan

We are looking for a number of students to work as part of a software team to develop a novel platform for the seamless and intuitive integration of an increasingly rich world of internet services. The work aims to address a broad spectrum of needs, from bringing together on-campus student services to addressing needs within the rapidly growing smart-communities initiatives currently underway around the world.

Qualified students should be skilled in at least one programming language (e.g. Python, Java) with the ability to learn more if needed. Experience with web services, container technologies (e.g. Docker) and familiarity with cross-platform mobile app development frameworks (e.g. Flutter, React Native) is preferred.

 

Some relevant, though not required, courses include:

CS 225 Data Structures

CS 427 Software Engineering I

CS 428/429 Software Engineering II

CS 473 Algorithms

CS 498 Mobile Interactive Design

CS 498 Art and Science of Web Programming

Speech-to-Text Auto Captioning 

Michael Miller – Event Services

This project researches frameworks and workflows for speech-to-text recognition in order to facilitate live auto captioning and the creation of standard caption files for use in live events and video editing. It utilizes and enhances speech-to-text HPC/cloud services and seeks to advance the state of the art in speech-to-text recognition. A successful candidate will need to have completed CS 125 (Intro to Computer Science) or have equivalent experience.

Implementation of Machine Learning Algorithms on FPGAS

Ashish Misra, Volodymyr Kindratenko

Many research domains, such as computer vision and language understanding, have been transformed by novel machine learning (ML) and deep learning (DL) methods and techniques. However, these methods are very compute-intensive and rely on state-of-the-art hardware and large datasets to achieve an acceptable level of performance. The research team at the Innovative Systems Lab (ISL) at NCSA has been investigating how the neural networks at the core of DL algorithms can be implemented on reconfigurable hardware, with the objective of speeding up execution and reducing power requirements for inference algorithms.

FPGAs are a good choice for implementing neural networks because they enable highly customized parallel hardware implementations and provide a great degree of flexibility with regard to numerical data types. Most recently, ISL started exploring a novel platform enabled by IBM’s CAPI 2.0 interface and SNAP API. This platform makes it possible to develop FPGA applications using a high-level synthesis (HLS) methodology rather than a traditional hardware design approach, and to integrate kernels accelerated on an FPGA with host-side applications running on IBM POWER9 servers.

 

The students working on this project will acquire the skillsets required to develop ML/DL algorithms in hardware using the HLS approach. The students will be involved with a) evaluating the performance of existing ML/DL implementations on reconfigurable hardware platforms and documenting the results, b) developing new ML/DL algorithms for implementation on reconfigurable hardware and preparing datasets for testing and evaluation, and c) helping ISL research staff with porting the algorithms to reconfigurable hardware. Required skills include completion of ECE 385 and ECE 408 or equivalent courses.

Multiscale Modeling of Cell Membrane-Associated Phenomena

Taras Pogorelov

The cell membrane environment is complex and challenging to model. The Pogorelov Lab at Illinois develops workflows that combine computational and experimental molecular data. We work in close collaboration with experimental labs. Modeling approaches include classical molecular dynamics, quantum electronic structure, and quantum nuclear dynamics.

These projects include the development of workflows for modeling and analyzing the interactions of lipids with proteins and ions that are vital to the life of the cell. The qualified student should have experience with R/Python programming, with the Linux environment, and with the NAMD molecular modeling software.

Virtual Reality and Ray Tracing for Materials Science Data Visualization

Andre Schleife

Computational materials science research produces large amounts of static and time-dependent data for atomic positions and electron densities that is rich in information. Determining underlying processes and mechanisms from this data, and visualizing it in a comprehensive way, constitutes an important scientific challenge. 

In this project we will continue development of our Unity app that is compatible with Windows Mixed Reality, Google Daydream, and iOS. We will implement new features, such as the display and interaction with time-dependent data, as well as novel modes of interaction with the data. In addition, we will use and develop physics-based ray-tracing and stereoscopic rendering techniques to visualize the atomic and electronic structure of existing and novel materials e.g. for solar-energy harvesting and optoelectronic applications. In a team, we will further develop codes based on the physics-based ray-tracer Blender/LuxRender and the yt framework to produce immersive images and movies.

 

Skills desired: Android app development, OpenGL/Unity/WebGL, VR code development, creativity and motivation

 

Detecting Landscape Changes Across the Arctic and Antarctic Using Deep Learning

Aiman Soliman, Volodymyr Kindratenko

Arctic and Polar scientists have been studying the changes of specific landscape features, fauna, and flora over fairly restricted spatial extents using field expeditions and very high-resolution remote sensing datasets. Over the past years, combined efforts in polar geospatial science and HPC have yielded novel high-resolution Digital Elevation Models (DEM), namely the ArcticDEM and the Reference Elevation of Antarctica (REMA).

These state-of-the-art archives capture the polar landscape surface at unprecedented spatial (meter) and temporal (2-3 week) scales, and represent a record of all the changes that have happened and are happening at the Earth’s poles. However, the size of these archives represents a real challenge for scientists trying to extract conclusive results. We are developing DL models that can be applied at scale to conduct an inventory of polar landscape features and quantify their lateral and vertical changes.

 

The students working on this project will acquire the skillsets that are required to develop DL models while applying them to monitor the current state of polar environments. The students will be involved with a) preparing model training sets from existing field survey data; b) evaluating the performance of different DL architectures that are suited to segment images, such as Convolutional Neural Networks, as well as architectures that are suited for detecting changes in image sequences, such as Recurrent and Siamese Neural Networks; and c) developing HPC workflows to manage and apply the developed DL models to existing elevation data archives leveraging the cyberinfrastructure at NCSA.

Music on High-Performance Computers 

Sever Tipei

The project centers on DISSCO, software for composition, sound design, and music notation/printing developed at Illinois and Argonne National Laboratory. Written in C++, it includes a graphical user interface built with gtkmm; a parallel version is being developed at the San Diego Supercomputer Center. DISSCO has a directed-graph structure and uses stochastic distributions, sieves (part of number theory), and elements of information theory to produce musical compositions.

Presently, efforts are directed toward refining a system for the notation of music, as well as toward the realization of an evolving entity: a composition whose aspects change when computed recursively over long periods of time, thus mirroring the way living organisms are transformed over time (artificial life).