University of Massachusetts Amherst

Masters Concentration in Data Science

The Computer Science Masters with a Concentration in Data Science was created to help meet the need for expanded and enhanced training in the area of data science. It requires coursework in Theory for Data Science, Systems for Data Science, Data Analysis, and Statistics.

 

The Masters Concentration in Data Science teaches you to develop and apply methods to collect, curate, and analyze large-scale data, and to make discoveries and decisions using those analyses.

 

Requirements and Admissions

 

Who should apply?

Students require a bachelor’s degree and a solid undergraduate background in computer science.

 

 

Curriculum

The Masters Degree requires a total of 30 credits and is usually completed in two years. The Data Science concentration requirements include:

  • Four Data Science core courses (12 credits), including one each from the areas of Theory for Data Science, Systems for Data Science, and Data Analysis, and one additional core course from any area.
  • Two courses (6 credits) from the set of courses designated as satisfying the Data Science Elective requirement.
  • One course (3 credits) from the set of courses satisfying the Data Science Probability and Statistics requirement.

 

 

Useful Links

The full-time graduate program admission deadlines are:

  • October 1 for Spring enrollment (Master's Program only)
  • December 15 for Fall enrollment

Courses offered Spring 2019


COMPSCI 514: Algorithms for Data Science

With the advent of social networks, ubiquitous sensors, and large-scale computational science, data scientists must deal with data that is massive in size, arrives at blinding speeds, and often must be processed within interactive or quasi-interactive time frames. This course studies the mathematical foundations of big data processing, developing algorithms and learning how to analyze them. We explore methods for sampling, sketching, and distributed processing of large-scale databases, graphs, and data streams for purposes of scalable statistical description, querying, pattern mining, and learning. Formerly COMPSCI 590D. Undergraduate Prerequisites: COMPSCI 240 and COMPSCI 311. 3 credits.


COMPSCI 589: Machine Learning

This course will introduce core machine learning models and algorithms for classification, regression, clustering, and dimensionality reduction. On the theory side, the course will focus on understanding models and the relationships between them. On the applied side, the course will focus on effectively using machine learning methods to solve real-world problems with an emphasis on model selection, regularization, design of experiments, and presentation and interpretation of results. The course will also explore the use of machine learning methods across different computing contexts including desktop, cluster, and cloud computing. The course will include programming assignments, a midterm exam, and a final project. Python is the required programming language for the course.


COMPSCI 590V: Data Visualization and Exploration

In this course, students will learn the fundamental principles of exploring and presenting complex data, both algorithmically and visually. We will cover systems infrastructure for collating large data, basic visualization of summary statistics, algorithms for exploring patterns in the data (such as rule-mining, graph analysis, clustering, topic models, and dimensionality reduction), and artistic and cognitive aspects of data presentation (including interactive visualization, human perception, and decision-making). Domains will include numeric data, relational data, geographic data, graphs, and text. Hands-on labs and projects will be performed in Python and D3.


COMPSCI 611: Advanced Algorithms

Principles underlying the design and analysis of efficient algorithms. Topics to be covered include: divide-and-conquer algorithms, graph algorithms, matroids and greedy algorithms, randomized algorithms, NP-completeness, approximation algorithms, linear programming.


COMPSCI 677: Distributed and Operating Systems

This course provides an in-depth examination of the principles of distributed systems in general, and distributed operating systems in particular. Covered topics include processes and threads, concurrent programming, distributed interprocess communication, distributed process scheduling, virtualization, distributed file systems, security in distributed systems, distributed middleware and applications such as the web and peer-to-peer systems. Some coverage of operating system principles for multiprocessors will also be included. A brief overview of advanced topics such as cloud computing, green computing, and mobile computing will be provided, time permitting.

 


COMPSCI 683: Artificial Intelligence

In-depth introduction to Artificial Intelligence focusing on techniques that allow intelligent systems to reason effectively with uncertain information and cope with limited computational resources. Topics include: problem-solving using search, heuristic search techniques, constraint satisfaction, local search, abstraction and hierarchical search, resource-bounded search techniques, principles of knowledge representation and reasoning, logical inference, reasoning under uncertainty, belief networks, decision theoretic reasoning, planning under uncertainty using Markov decision processes, multi-agent planning, and computational models of bounded rationality.


COMPSCI 690D: Deep Learning for Natural Language Processing

This course offers an introduction to the models and principles behind state-of-the-art deep learning techniques applied to natural language processing problems. It is intended for graduate students in computer science and linguistics who are (1) interested in learning about cutting-edge research progress in NLP and (2) familiar with machine learning fundamentals. We will cover a variety of models, including vector-based word representations, basic neural network architectures (e.g., convolutional, recurrent), and more advanced variants of these networks that are especially useful for NLP (e.g., attention-based or memory-augmented). We will also see these models in action on a variety of NLP tasks, including text classification, question answering, and text generation. Coursework includes reading recent research papers, programming assignments, and a final project. 3 credits.


COMPSCI 701: Advanced Topics in Computer Science

This is a 6-credit reading course corresponding to the master's project. The official instructor is the GPD, although the student does the work with, and is evaluated by, the readers of his or her master's project. 6 credits.