CTA Leader: Dr. Cary Butler

Covers the entire computational ecosystem (hardware, software, storage, and networks) required to conduct large-scale data analytics. This ecosystem includes how large data is managed, analyzed, and visualized. Capabilities include methods for conducting exploratory (What does the data look like?), descriptive (What happened?), diagnostic (Why did it happen?), predictive (What will happen?), and prescriptive (How can we make it happen?) analyses. This CTA will focus on how traditional methods in data analytics can be adapted to take advantage of the latest advancements in supercomputing architectures. The goal of the ecosystem is to streamline the process of analyzing data and using the results to make decisions. Leveraging supercomputing will help in two ways. First, traditional methods for conducting data analytics are severely limited in the overall volume of data that can be processed. In most cases, some of the data is simply ignored so that the rest fits within the analysis. Second, the complexity of the analysis produces extremely large computational spaces that go far beyond what a typical Department of Defense user can process. Hyper-dimensional analysis introduces complexity that traditional methods are unable to handle. This CTA will advise how data analytics can take advantage of powerful supercomputers to shorten the time between asking a question and seeing the answer. Recommending algorithms in machine learning, deep learning, and graph analytics for supercomputing platforms will be part of this CTA’s role. Capabilities resulting from this CTA are far-reaching and beneficial to existing as well as new classes of high-performance computing users. Furthermore, efforts in DDA are inherently crosscutting and thus complementary to other CTAs.