The Advanced Data Science option aims to educate the next generation of thought leaders who will both build and apply new methods for data science. This option will help to educate and recognize PhD students whose thesis work focuses specifically on building and using advanced data science tools. The goal of this option is not to educate all students in the foundations of data science but rather to provide advanced education to the students who will push the state of the art in data science methods.

The Advanced Data Science option replaces the previous Big Data track introduced in 2014. This is an official UW degree option which will appear on your transcript.

Students enrolled in this option can expect to interact with students enrolled in similar Advanced Data Science PhD options in Genome Sciences, Statistics, Oceanography, Chemical Engineering and Astronomy.  Formally, the option is affiliated with an NSF IGERT training award in Big Data, and PhD students in the track are eligible for funding via that award. 

Description

The Advanced Data Science Option is an overlay on our regular quals requirements. Hence, students must make sure to satisfy the regular quals requirements in addition to the option requirements. The latter impose some extra constraints on course selection. These constraints apply to both pre-quals and post-quals courses, and are designed to help you organize your coursework towards research in Big Data.

Course requirements

1. Quals-level requirements

Successfully complete the department's PhD qualifying coursework requirements, and satisfactorily complete three out of four of the following core courses in Big Data (some of the courses listed below may also be counted towards the qualifying coursework requirements, if allowable in the standard requirements):

- Data Management: CSE 544 (satisfies the “Programming Systems” quals category).

- Machine Learning: CSE 546 (satisfies the “AI” quals category).

- Data Visualization: CSE 512 (satisfies the “Applications” quals category).

- Statistics: STAT 509 (Introduction to Mathematical Statistics).  Alternatively, for a more advanced sequence, you may choose to take STAT 512 (Statistical Inference), but, in this case, we strongly recommend that you also take STAT 513, the second course in this sequence.  

In general, quals course waivers may not be applied in lieu of one of these core Big Data courses. However, a student may petition to replace a requirement with *a more advanced course in that area, taken at UW*. Petitions should be sent to Magda Balazinska.

2. Post-quals requirements

Satisfactorily complete one additional course with explicit emphasis on advanced “Big Data” techniques:

  • A fourth core course from the list above
  • CSE 547 / STAT 548 - Machine Learning for Big Data
  • STAT 513 - Statistical Inference
  • A new Big Data Management course planned for the future (Magda Balazinska and Dan Suciu)
  • EE 578 - Convex Optimization
  • STAT 527 - Nonparametric Regression and Classification
  • STAT 538 - Advanced Statistical Learning
  • CSE 552 - Distributed and Parallel Systems
  • CSE 599C - Big Data Management Systems (Spring 2017 offering)

3. eScience Community Seminar

To further expand students’ education and create a campus-wide community, students will register for the weekly eScience Community Seminar for at least four quarters.

4. eScience Institute

Students interested in data science should also look into the other activities of the eScience Institute. The eScience Community Seminar is one of those activities. Other relevant activities include tool- and method-oriented workshops as well as speaker series. Visit http://data.washington.edu for more information.

In addition, the track is designed to complement the activities of the eScience Institute and to leverage ongoing activities associated with the Moore/Sloan Foundation Data Driven Discovery Initiative, involving the University of Washington, New York University and the University of California, Berkeley.

Admission

CSE Ph.D. students who choose to enroll in the Advanced Data Science Option must have the approval of their research advisor. Email this approval to the Graduate Program Advisor (Elise Dorough, elised@cs). There is no additional admission procedure.

If you have any questions about the Advanced Data Science Option, please email Magda Balazinska.