As more and more revisions are made to the data and scientific analyses on government websites concerning environmental and climate protection, there is a growing need for researchers and coders to preserve environmental data and keep citizens informed of such changes. This position will assist the Environmental Data and Governance Initiative (EDGI), a network of scholars and researchers that archives federal environmental data to safeguard it against potential reductions in access by the current administration, develops online tools to support monitoring of changes to federal environmental websites, and tracks cuts in funding, research, and regulation at environmentally oriented agencies. These agencies and departments include, but are not limited to, the EPA (Environmental Protection Agency), NOAA (National Oceanic and Atmospheric Administration), NASA (National Aeronautics and Space Administration), USGS (United States Geological Survey), OSHA (Occupational Safety and Health Administration), DOE (Department of Energy), and BLM (Bureau of Land Management).
This position will support collaborations under EDGI's public data working group, including projects that index millions of government web pages on a weekly basis, track changes on them, and produce regular reports. Additional ongoing efforts include distributed protocol development for data storage, machine learning work that can isolate the most important website changes for enhanced tracking efforts, and security advancements to protect the privacy of EDGI volunteers and workshop participants engaged in data preservation and website monitoring. Potential project work could also extend developments made under EDGI's Google Summer of Code partnership. Recent collaborations there used machine learning algorithms to identify and monitor changes on government agency websites using data from multiple sources: Versionista, PageFreezer, and the Internet Archive. Another recent collaboration used D3 to develop DataRescue Maps, impactful, publicly meaningful models that allow users to easily visualize changes to government websites archived by EDGI. The data being archived is vital for environmental research and protection, but it can be meaningless or overwhelming to users without clear graphs or interactive models that provide context and a general overview of the data.