2017 Summer Mentors

NCSA SPIN mentor Donna Cox

Are you a programmer, a filmmaker, a musician, an architect, a physicist, or a mathematician? Do you know GPU programming, MaxMSP, or Processing? Can you use Houdini, Maya, After Effects, or Unity? Have you built mobile apps, virtual or augmented reality scenes, or computer simulations? The AVL is looking for multi-disciplinary students who can build digital experiences for cutting-edge arts applications. Tell us what you are good at so we can see you in your best light!

NCSA SPIN mentors Mark Fredricksen and Wayne Hoyenga

In order to keep a log of system changes and to keep system admins working on the same machine from stepping on each other, a utility called rcsvi was implemented.

This routine does a number of things: 1) implements version control on edited files; 2) automatically populates the system change log starting with comments from the check-in process; and 3) since RCS will not check out an editable file, it gives some indication that someone else may also be working on that file. We are looking for a student to implement additional functionality and improvements to the local revision control software, such as

  • Rather than using local revision control master files, the routine would interact with a remote revision control repository (like git or something similar)
  • The system master change log would also be in some remote repository (database or whatever)
  • Methods for keeping different admins from making simultaneous edits would be re-examined
  • If possible, if an admin attempts to edit a file already being edited, some information about who is editing the file would be returned
  • There would need to be utilities to be able to examine and search system change logs
  • There would need to be utilities to be able to examine/search individual file change logs (and be able to manipulate those changes)
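One of the items above, reporting who is already editing a file, can be sketched as follows. This is a minimal illustration and not how rcsvi works: the `.lock` suffix, the function names, and the JSON record are all hypothetical, and a real replacement would more likely rely on the locking offered by the chosen remote repository.

```python
import json
import os
import time


def acquire_edit_lock(path, user):
    """Try to take an exclusive edit lock on `path` for `user`.

    Returns (True, info) if the caller now holds the lock, or
    (False, info) with the current holder's details if it does not.
    """
    lock_path = path + ".lock"  # hypothetical naming convention
    info = {"user": user, "pid": os.getpid(), "since": time.time()}
    try:
        # O_CREAT | O_EXCL fails atomically if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        try:
            with open(lock_path) as fh:
                return False, json.load(fh)
        except (OSError, ValueError):
            return False, {}  # holder details not readable (yet)
    with os.fdopen(fd, "w") as fh:
        json.dump(info, fh)
    return True, info


def release_edit_lock(path):
    """Remove the lock so that another admin can edit the file."""
    os.remove(path + ".lock")
```

A wrapper around the editor would call `acquire_edit_lock` before opening the file and print the returned holder information when the lock is refused. A lock file like this only helps when all admins see the same filesystem.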

Experience with Perl is preferred, but not required. Interested students should include in the application what their preferred programming/scripting language is and why.

NCSA SPIN mentors Mark Fredricksen and Daniel Lapine

We need to develop an easy way to gather the information needed for metrics related to our service level agreements and to verify functionality at all times. Right now a variety of tools are used to gather this information, which we would like collected into a single tool for reporting. We would also like to consolidate some of the tools where there is redundant capability. Skills required: programming or scripting, web development basics, and planning.

NCSA SPIN mentor Kiel Gilleade

Looking to have some fun with biosensors this summer? Come develop the next generation of silent disco technology, using sensors to capture and share clubgoers' biorhythms with the crowd. In this project you will be tasked with developing a prototype platform for measuring biosignals at a silent DJ event and visualising that same information in a meaningful manner to a crowd in real time. You will need a keen interest in biosensors, electronics prototyping (e.g. Arduino), 3D printing, data visualisation and literature research. Experience with electronics prototyping in Arduino and Android programming is desirable. One form the project output could take: a snap-on LED attachment for a silent DJ headphone system. This add-on will run off an Arduino or similar platform and will display information collected from an attached or paired sensor device.

NCSA SPIN mentor Kaiyu Guan

Reliable forecasting systems with useful lead times for crop type and crop yield are of critical value to farmer communities and government agencies for a variety of purposes. Field-level estimation of crop yield is particularly useful for understanding how crop productivity responds to various management requirements and environmental factors. With continued climate variability (e.g. the 2012 Midwest drought) and ongoing climate change, the farmer community and our government require better information to monitor crop growth and near-term prospects. As the most important staple food production area, the U.S. Corn Belt produces half of the global corn and soybean combined, and is of significant importance to the regional, national, and global economy and to food security. However, despite the great need for such forecasting information, we do not have a forecasting system for the U.S. Corn Belt for public use. In this project, we propose to develop scalable machine-learning methods that integrate data from satellite remote sensing and other auxiliary information to make accurate yet cost-effective predictions of crop type. First, we propose to generate a 30-meter (2000-present; 10-meter for the post-2014 period), daily, cloud-free data stack for the three major Corn Belt states (i.e. Illinois, Iowa, Indiana) by integrating three major satellite datasets (Landsat, MODIS, and the new Sentinel-2). We will build upon this data stack to develop a machine-learning forecasting system for crop type, with a lead time of up to 1 to 2 months before harvest and a particular focus on rainfed corn and soybean. We will fully leverage satellite information (including spectral, phenological, and field-level texture information) in our analytics to achieve field-level predictions with advanced deep machine learning approaches.

NCSA SPIN mentor Roland Haas

Modern scientific simulations have enabled us to study non-linear phenomena that are impossible to study otherwise. Among the most challenging problems is the study of Einstein's theory of relativity, which predicts the existence of gravitational waves, detected very recently by the LIGO collaboration. The Einstein Toolkit is a community-driven framework for astrophysical simulations. I am interested in recruiting a student interested in improving the scalability of the Einstein Toolkit and its use for large-scale simulation campaigns. Depending on student interest, projects range from Python scripting to C++ codes running on thousands of CPU cores on Blue Waters. The successful applicant will be involved with both the Relativity Group at NCSA and the Blue Waters project and will be invited to participate in the weekly group meetings and discussions of their research projects.

NCSA SPIN mentor Eliu Huerta

The Relativity Group at NCSA is developing new tools to detect gravitational waves emitted by the collision of black holes and neutron stars in dense stellar environments. We use the community software the Einstein Toolkit to numerically solve Einstein's equations to get a detailed description of these astrophysical events. We are assembling a large catalog of simulations from which we need to extract patterns that are persistent across the 5-dimensional parameter space we are exploring (2D for the masses of the compact objects, orbital eccentricity, and the z-component of the spin of the binary components). We would like to work with a student who is knowledgeable in Mathematica and Python, and who is interested in getting experience with high performance environments, in particular Blue Waters and the Campus Cluster. The successful candidate will become a member of NCSA's Relativity Group and NCSA's LIGO team.

NCSA SPIN mentor Dan Katz

Most scientific computational and data work can be thought of as a set of high-level steps, and these steps can often be expressed as a workflow. Software tools can help scientists define and execute these workflows, for example Swift, which is both a language and a runtime system. This project could have two parts, depending on the student's interests and experiences. One, focused on scientific applications, will examine how workflow tools like Swift can be used to help scientific communities that haven't considered generic workflow tools, specifically in astronomy, such as the Large Synoptic Survey Telescope (LSST) and the Square Kilometer Array (SKA). LSST is a new kind of telescope, currently under construction in Chile, designed to conduct a ten-year survey of the dynamic universe. LSST can map the entire visible sky in just a few nights, and images will be immediately analyzed to identify objects that have changed or moved: from exploding supernovae on the other side of the Universe to asteroids that might impact the Earth. SKA is a massive, international, multiple radio telescope project that will provide the highest resolution images in all of astronomy. Both projects represent challenging data acquisition and analysis problems, integrating workflows, scientific codes, and advanced data centers. The second, focused on software aspects, will examine how Swift might interact with other open source projects, such as the Apache stack. Interested students should have an interest in high-performance computing, big data computing, and/or distributed computing. They should be proficient in a Linux/Unix software development environment and skilled in the C language. Optional but desirable skills include Java, ANTLR/bison/yacc/lex, sockets, and/or MPI.
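The idea of expressing high-level steps as a workflow can be illustrated with a small sketch. This is plain Python, not Swift, and it does not reflect Swift's syntax or runtime; the step names and the `run_workflow` helper are hypothetical. Each step declares which earlier steps it depends on, and the runner executes the steps in dependency order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(steps, deps):
    """Run every step after the steps it depends on.

    steps: dict mapping a step name to a callable
    deps:  dict mapping a step name to the list of step names whose
           results are passed to it, in that order
    Returns the execution order and the result of each step.
    """
    order = []
    results = {}
    for name in TopologicalSorter(deps).static_order():
        inputs = [results[d] for d in deps.get(name, [])]
        results[name] = steps[name](*inputs)
        order.append(name)
    return order, results


# Hypothetical three-step pipeline: acquire raw values, calibrate, detect.
steps = {
    "acquire": lambda: [3, 1, 2],
    "calibrate": lambda raw: sorted(raw),
    "detect": lambda calibrated: calibrated[-1],
}
deps = {"acquire": [], "calibrate": ["acquire"], "detect": ["calibrate"]}

order, results = run_workflow(steps, deps)
print(order, results["detect"])
```

A real workflow system adds what this sketch leaves out, such as running independent steps in parallel, moving data between sites, and recovering from failed steps.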

Most scientific computational and data work can be thought of as a set of high-level steps, and these steps can often be expressed as a workflow. Software tools can help scientists define and execute these workflows, for example Swift, which is both a language and a runtime system. This project, focused on scientific applications, will examine how workflow tools like Swift can be used to help scientific communities that haven’t considered generic workflow tools, specifically in materials science. The particular challenge to be addressed here is to automate the computational and data transfer (including search, load, and publish) components needed to analyze, understand, and share the properties of a variety of materials. The data transfer elements will involve interacting with the national Materials Data Facility. This project represents challenging analysis and data problems, integrating workflows, scientific codes, and advanced data facilities. Interested students should have an interest in high-performance computing, big data computing, and/or distributed computing. They should be proficient in a Linux/Unix software development environment and skilled in the C language. Optional but desirable skills include Java, ANTLR/bison/yacc/lex, sockets, and/or MPI.

Scientific software is an essential enabler across computation, experiment and theory in all disciplines. Much of this software is open source, meaning that in many cases it is produced and shared on a voluntary basis, at least at universities. One of the reasons the system works as well as it does is reputation: those who write the code are recognized for having done so by their peers. However, the informal reputation system used in open source is not the same as the academic reputation system that is based on publication of peer-reviewed papers, citations (people discussing your paper in their papers), journal impact factors (papers that cite papers in a given journal), and metrics such as h-index (a single measure of an author's productivity and citations). To encourage open source software and shared software in academia, we want to map software metrics to existing paper metrics, starting with the idea of software citation. This project will investigate and test possible implementations of software citation, and will use some of those to better understand the impact and knowledge that could be gained by having a software citation system and culture in place. Interested students should have an interest in Internet computing, and sharing/cooperative work in the context of academia. They should be experienced with programming in a Linux/Unix software development environment and ideally with GitHub or other distributed software management systems.
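For readers unfamiliar with the h-index mentioned above: an author has index h if h of their papers have at least h citations each. A minimal sketch of the computation follows. Applying it to software would need per-package citation counts, which is exactly what a software citation system would have to supply; the numbers in the example are made up for illustration.

```python
def h_index(citations):
    """Largest h such that h items have at least h citations each."""
    h = 0
    ranked = sorted(citations, reverse=True)
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Made-up citation counts for five papers (or five software packages).
print(h_index([10, 8, 5, 4, 3]))  # prints 4
```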

NCSA SPIN mentor Vlad Kindratenko

The goal of this project is to deploy, maintain, and experiment with the latest release of OpenStack cloud operating system software on a cluster at the Innovative Systems Lab. The purpose of this experimental OpenStack deployment is to gain and maintain operational awareness of the new features and functionality ahead of the NCSA's production cloud, provide NCSA staff and affiliate faculty with a platform to experiment with the new OpenStack functionality, and to study and evaluate new projects within the OpenStack environment. This project is best suited for students interested in system administration, deployment and operation of complex cloud and HPC environments. Requirements: CS 425 or similar course.

This project will involve deployment and evaluation of existing deep learning frameworks on an HPC cluster and on a cloud. The goal is to gain hands-on experience with deep learning codes, frameworks, and methodologies and to support upcoming projects requiring deep learning.  The work may also require parallelizing codes to work on multiple nodes. This project is best suited for students interested in the development of machine learning techniques and their applications in science and technology fields. Requirements: CS 446 and CS 420, or similar courses.

Innovative Systems Lab operates an experimental 64-GPU cluster with theoretical peak performance of over 0.5 PFLOPS in SP.  The goal of this project is to investigate the suitability of AMD’s Radeon Open Compute (ROCm) platform for GPU computing for implementing scientific computing applications on this cluster.  The project will require writing codes in a C++ dialect using Heterogeneous Compute Compiler, measuring their performance, and scaling codes to work across multiple GPUs.  This work is best suited for students interested in parallel computing and willing to learn new GPU development tools based on C++ language extensions. Requirements: ECE 408/CS 483.

The goal of this project is to investigate the use of reconfigurable computing for accelerating scientific applications using design methodologies based on OpenCL or similar high-level languages. The project will require studying current high-level design methodologies developed by Xilinx and Altera, implementing computational kernels using one of these methodologies, and evaluating their performance with regard to speed and power. This work is best suited for students interested in hardware design and already familiar with HDL design methodologies. Requirements: ECE 385 or similar course.

NCSA SPIN mentor Matthew Krafczyk

An important measure of the value of a scientific finding is its ability to be independently reproduced by others skilled in the area. When efforts are made to reproduce such findings, years may have elapsed, and the reproduction may be unsuccessful. There are many factors which may prevent the replication of a study, and we seek to understand those related to the computational aspects of the work. We estimate that only about 10% of scientists doing computationally based work release their source code in any form. The successful applicant will join an effort to introduce more transparency to computationally based research. This will include locating source code which was used to create articles and using it to reproduce the article's results. During this process we will study scientific workflows and development habits which hinder or enable reproducibility, as well as develop tools to empower researchers to make their code available more easily. We will build a website to help elucidate these details of the scientific method to the public. Recommended Skills: R, C/C++, Python, web development including HTML, database engineering including SQL, workflow tools.

NCSA SPIN mentor JaeHyuk Kwack

The finite element method is a popular numerical technique in science and engineering for finding approximate solutions to boundary value problems for partial differential equations. It requires discretized domains (i.e., meshes) filled with finite elements (e.g., tetrahedrons and hexahedrons in 3D). p-refinement is a common way to improve the numerical accuracy of solutions without changing the number of elements in the finite element mesh. It refers to increasing the degree of the highest complete polynomial (p) within an element; as a result, each element represents the computed field data more accurately. In this project, the SPIN intern will develop a standalone program for p-refinement of 3D finite element meshes with boundary conditions. Since mesh generation for complicated geometry is usually done in a GUI environment ahead of numerical simulations, it is inefficient in practice to update mesh information such as p-refinement at the beginning of simulations. The developed program will provide an efficient way for users to improve mesh quality without manipulating complicated geometry in a GUI environment. In addition, the SPIN intern will provide an interface to a parallel finite element program for computational fluid dynamics and fluid-structure interactions; therefore, it will allow the developed program to be used as a building block for an adaptive mesh refinement scheme or a sub-grid mesh generation process for multi-scale analyses. Skills required: C, C++, or Fortran.
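To give a feel for what p-refinement costs, the sketch below counts the nodes of a single element as the polynomial degree p grows. It assumes standard Lagrange elements (complete polynomials of degree p on tetrahedra, tensor-product polynomials on hexahedra); other element families, such as serendipity elements, have different counts. The function names are illustrative only.

```python
def nodes_per_tetrahedron(p):
    """Nodes of a Lagrange tetrahedron of degree p: (p+1)(p+2)(p+3)/6."""
    return (p + 1) * (p + 2) * (p + 3) // 6


def nodes_per_hexahedron(p):
    """Nodes of a tensor-product Lagrange hexahedron of degree p: (p+1)^3."""
    return (p + 1) ** 3


for p in (1, 2, 3):
    print(p, nodes_per_tetrahedron(p), nodes_per_hexahedron(p))
```

Going from p = 1 to p = 2 raises a tetrahedron from 4 to 10 nodes and a hexahedron from 8 to 27, which is why a refinement tool has to generate new edge, face, and interior nodes and carry the boundary conditions over to them.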

NCSA SPIN mentor David LeBauer

The TERRA REF program will provide an unprecedented open-access source of data and an integrated phenotyping system for energy sorghum. The TERRA REF system includes field- and controlled-environment digital sensing of energy sorghum along with computational pipelines and open data for the research community. These will be used for crop selection and better understanding of the interactions among genes, traits, and the environment. This position will assist in the development of infrastructure for data processing and access required by the TERRA program.

The intern will work with researchers at NCSA, IGB, Crop Sciences, and Civil Engineering to develop processes and facilitate the cross-disciplinary exchange of data and information. Skills in image analysis, geospatial information systems, informatics, and high performance computing will be useful. Programming can be done in any open source scripted or compiled language such as R, Python, or C++.

NCSA SPIN mentor Bertram Ludaescher

Data provenance (or data lineage) describes the origin and processing history of data products from workflows or scripts and thus is important metadata in support of transparency and reproducibility in computational and data science. Provenance information often comes in the form of labeled, directed graphs, representing the conceptual or actual dataflow of the computation. Being able to effectively and efficiently query such graphs is an important research problem. As part of this internship, you will learn about different languages for querying graphs (e.g., regular path queries), interesting advanced queries (e.g., to compute the lowest common ancestor(s) in trees and DAGs), and ways to implement such queries using different approaches. The overall goal is to prototype one or more graph querying approaches and evaluate their efficiency on large provenance graphs. Desirable skills: programming experience and interest in algorithms and databases.
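As an illustration of one query mentioned above, the sketch below computes the lowest common ancestors of two nodes in a small provenance DAG. It is a naive reachability-based approach, shown only to make the definition concrete and not as an efficient method for large graphs; the graph and its node names are hypothetical. Unlike in a tree, a DAG can have more than one lowest common ancestor.

```python
def ancestors(parents, node):
    """All nodes that `node` was derived from, including `node` itself."""
    seen = {node}
    stack = [node]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def lowest_common_ancestors(parents, a, b):
    """Common ancestors of a and b that no other common ancestor derives from."""
    common = ancestors(parents, a) & ancestors(parents, b)
    not_lowest = set()
    for node in common:
        # Every proper ancestor of a common ancestor is higher up, so not lowest.
        not_lowest |= ancestors(parents, node) - {node}
    return common - not_lowest


# Hypothetical provenance graph: each data product lists what it was derived from.
parents = {
    "cleaned": ["raw"],
    "plot": ["cleaned", "params"],
    "table": ["cleaned", "params"],
}
print(lowest_common_ancestors(parents, "plot", "table"))
```

Here `plot` and `table` share the ancestors `raw`, `cleaned`, and `params`; since `cleaned` is derived from `raw`, only `cleaned` and `params` are lowest.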

NCSA SPIN mentor Charalampos Markakis

Numerical relativity is a rapidly developing field. The development of black-hole simulations has been revolutionary, and their predictions were recently confirmed with the detection of gravitational waves by LIGO. The next expected source is neutron-star binaries, but their simulation is more complicated, as one needs to model relativistic fluids in curved spacetime, and the behavior of matter under the extreme conditions found in neutron-star cores. In this project, you will use the methods you are already familiar with, from Lagrangian or Hamiltonian mechanics, to model fluids in an intuitive way. You will find that a seemingly complex hydrodynamic problem can be greatly simplified, and be reduced to just solving a non-linear scalar field equation. The successful applicant will be able to solve such wave equations numerically in their favorite programming or scripting language (C, Python, Mathematica, etc.). This powerful approach allows one to accurately model oscillating stars or radiating binaries, some of the most promising sources expected to be observed in the next LIGO science runs.

Student background: A background in classical mechanics and numerical methods is useful. Familiarity with fluid dynamics or scalar fields is a plus, but training will be provided.
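To indicate the kind of numerical work involved, here is a minimal sketch that solves the linear 1D wave equation u_tt = c^2 u_xx with a leapfrog finite-difference scheme. It is only a stand-in: the non-linear scalar field equation of the project is not specified here, and the function name, grid, and boundary conditions (fixed ends, zero initial velocity) are assumptions made for illustration.

```python
import math


def evolve_wave(u0, dx, dt, steps, c=1.0):
    """Advance u_tt = c^2 u_xx by `steps` time steps with a leapfrog scheme.

    Assumes fixed (zero) end points and zero initial velocity.
    The scheme is stable only if c * dt / dx <= 1.
    """
    if steps == 0:
        return list(u0)
    r2 = (c * dt / dx) ** 2  # squared Courant number
    n = len(u0)

    def second_difference(u, j):
        return u[j + 1] - 2.0 * u[j] + u[j - 1]

    prev = list(u0)
    # The first step uses a Taylor expansion, since no earlier time level exists.
    curr = [0.0] * n
    for j in range(1, n - 1):
        curr[j] = prev[j] + 0.5 * r2 * second_difference(prev, j)
    for _ in range(steps - 1):
        nxt = [0.0] * n
        for j in range(1, n - 1):
            nxt[j] = 2.0 * curr[j] - prev[j] + r2 * second_difference(curr, j)
        prev, curr = curr, nxt
    return curr


# Standing wave on [0, 1]: sin(pi x) oscillates with period 2 when c = 1.
cells = 100
dx = 1.0 / cells
u0 = [math.sin(math.pi * j * dx) for j in range(cells + 1)]
u = evolve_wave(u0, dx, dt=0.5 * dx, steps=400)  # 400 steps of 0.005 = one period
print(max(abs(a - b) for a, b in zip(u, u0)))  # small: the wave has returned
```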

NCSA SPIN mentor Michael Miller

This project researches frameworks and workflows for speech-to-text recognition in order to facilitate live auto-captioning and the creation of standard caption files for use in live events and video editing. It will utilize and enhance speech-to-text HPC/cloud services and seeks to advance the state of the art in speech-to-text recognition.

NCSA SPIN mentor Luc Paquette

The C-STEPS sketching tools, developed by Professor Emma Mercier and her team in the College of Education, allow students to work together to solve problems presented to them on tablet computers. Students interact with the tablets using a stylus or their fingers to write and draw on a digital worksheet as they collaborate to solve complex problems. Every input entered by a student on their own tablet is automatically synchronized to the tablets of the other members of the group, allowing students to work together to solve problems. As they interact with the tablet, C-STEPS collects a complete log of the students' actions in the software, thus providing a detailed trace of the students' behavior as they solve problems in C-STEPS.

In this project, the selected SPIN intern will apply machine learning approaches to analyze interaction logs collected from engineering undergraduate students who used C-STEPS as part of their regular curriculum. The goal of those analyses will be to discover common behavior patterns used by groups of students in C-STEPS and study how those patterns are related to good or bad collaborative learning practices. The results of those analyses will be used to provide in-the-moment actionable reports to Teaching Assistants in order for them to better support students during their learning activities.

NCSA SPIN mentor Andre Schleife

Computational materials science research produces large amounts of static and time-dependent data that is rich in information. Extracting relevant information from these data to determine underlying processes and mechanisms constitutes an important scientific challenge. It is the goal of this project to use and develop physics-based ray-tracing and stereoscopic rendering techniques to visualize the structure of existing and novel materials e.g. for solar-energy harvesting, optoelectronic applications, and focused-ion beam technology. This team will develop codes e.g. based on the open-source ray-tracer Blender/LuxRender and the open-source yt framework to produce image files and movies. Stereoscopic images will be visualized using virtual-reality viewers such as Google Cardboard, Oculus Rift, or HTC Vive. Preliminary implementations exist and within this project the team will develop GPU-based visualization codes to enable high-throughput rendering of large data sets.

NCSA SPIN mentor Jeff Terstriep

The CyberGIS Center empowers geospatial research by employing advanced cyberinfrastructure. Our team develops data-intensive applications leveraging massively-parallel computation, with an emphasis on service-oriented architectures and open source software. 

NCSA SPIN mentor Sever Tipei

The project involves algorithmic composition and digital sound synthesis using a software package developed at the University of Illinois Computer Music Project and Argonne National Laboratory. It is an ongoing project that uses stochastic distributions and elements of Graph Theory and Information Theory, and it requires C++ programming skills and possibly Graphical User Interface building.