University of Massachusetts Amherst

Certificate in Statistical and Computational Data Science

There are three pillars to Data Science: statistical skills, computer science, and domain expertise. This certificate is offered jointly through the Statistics and Computer Science departments. The program blends topics in statistical methods, statistical computing, machine learning, and algorithm development to train students to become effective data scientists in any domain. Students will also develop the ability to work with large databases, to manage and evaluate data sets, and to create meaningful output that supports effective decision making. Questions about this program should be directed to csinfo@cs.umass.edu.

Curriculum

The certificate requires a total of 15 credits and can be completed in one year. It must include at least two computer science courses and two statistics courses.

Certificate Courses Offered in Spring 2019


COMPSCI 514: Algorithms for Data Science

With the advent of social networks, ubiquitous sensors, and large-scale computational science, data scientists must deal with data that is massive in size, arrives at blinding speeds, and often must be processed within interactive or quasi-interactive time frames. This course studies the mathematical foundations of big data processing, developing algorithms and learning how to analyze them. We explore methods for sampling, sketching, and distributed processing of large-scale databases, graphs, and data streams for purposes of scalable statistical description, querying, pattern mining, and learning. Formerly COMPSCI 590D. Undergraduate prerequisites: COMPSCI 240 and COMPSCI 311. 3 credits.


COMPSCI 590V: Data Visualization and Exploration

In this course, students will learn the fundamental principles of exploring and presenting complex data, both algorithmically and visually. We will cover systems infrastructure for collating large data, basic visualization of summary statistics, algorithms for exploring patterns in the data (such as rule mining, graph analysis, clustering, topic models, and dimensionality reduction), and artistic and cognitive aspects of data presentation (including interactive visualization, human perception, and decision-making). Domains will include numeric data, relational data, geographic data, graphs, and text. Hands-on labs and projects will be carried out in Python and D3.