Humans are generating, sensing, and harvesting massive amounts of digital data, and many of these unprecedentedly large data sets will be archived in their entirety. The familiar notions of sequential or random-access files no longer apply in the cloud. Instead, developers will write code that mines this mass of unstructured data, extracts what is of interest, and inserts the resulting subset into a relational database or other structured data store, where it will be analyzed and visualized. In a data-intensive world where the sheer volume of data demands new approaches and techniques, the inclination is to move the computation to the data, a basic theme underlying this course. Called the "fourth paradigm" (after theory, experiment, and computation), data-intensive computing is poised to transform scientific research. Students will learn about the notion of "data at rest" and its impact on data movement and computation, the role of cloud infrastructure in data-intensive computing, and the need for semantic metadata, preservation, and curation of digital data. Participants will get hands-on programming experience with data-intensive programming models such as MapReduce.