**What classes do I need to take to get a Certificate in Integrated Data Science?** The Northwestern Certificate in Integrated Data Science will be available beginning in the 2016-2017 academic year and requires five courses. Four are specified and one is an elective chosen from an extensive list.

**Required courses**

*DATA_SCI-401: Data-Driven Research in Physics, Geophysics, and Astronomy*

This course integrates the domain-focused projects in Physics & Astronomy (P&A) and Earth and Planetary Sciences (EPS) and will be team-taught by one professor from P&A and one from EPS. It will cover one quarter's worth of material but will be spread over two quarters (Fall and Winter every year) to allow alignment and further interdisciplinary integration with DATA_SCI-421 and DATA_SCI-422.

It will focus on principles and methods of data analysis, specifically the science motivation and goals that unite three distinct research projects: EarthScope, the Large Synoptic Survey Telescope (LSST), and the Laser Interferometer Gravitational-wave Observatory (LIGO). The course is intended for a broad audience of graduate students interested in data science methods. It will provide an overview of the science that motivated these projects and highlight data challenges and opportunities for all three, including their connections, similarities, and differences. The course will be guided by project-specific literature developed by the pertinent scientific communities as well as professional articles from astrophysical and geophysical journals. Students will work on projects requiring written reports, class presentations, and participation in discussions.

Prerequisite: None.

*DATA_SCI-421: Integrated Data Analytics I* (also PHYS 441: Statistical Methods for Physicists and Astronomers)

Data analysis in the modern age requires familiarity with many concepts and methods from statistics. This course provides an introduction to the basics as well as exposure to some of the most advanced techniques. The emphasis will be on practical problems from physics and astronomy, rather than on theory or on statistical methods from other fields. Prior knowledge of statistics is not required.

Prerequisite: None.

*DATA_SCI-422: Integrated Data Analytics II* (also EPS 329: Mathematical Inverse Methods in Earth and Environmental Sciences)

This course covers the theory and application of inverse methods to gravity, magnetotelluric, seismic waveform, and multilateration data, as well as to students' own data. In particular, students will learn about nonlinear, linearized, underdetermined, and mixed-determined problems and solution methods, such as regularized least-squares and neighborhood algorithms.

Prerequisite: MATH 230, STAT 232, or equivalent; MATH 240 or STAT 320-1, 320-2 recommended.

*DATA_SCI-423: Integrated Data Analytics III* (also EECS 495: Machine Learning: Foundations, Applications, and Algorithms)

From robotics, speech recognition, and analytics to finance and social network analysis, machine learning has become one of the most useful sets of scientific tools of our age. With this course we want to bring interested students and researchers from a wide array of disciplines up to speed on the power and wide applicability of machine learning. The ultimate aim of the course is to equip Trainees with all the modeling and optimization tools they'll need to formulate and solve problems of interest in a machine learning framework. We will build these skills with lectures and reading materials that introduce machine learning through its many applications and that describe, in a detailed but user-friendly manner, the modern nonlinear optimization techniques used to solve the resulting problems. In addition to a well-curated collection of reference materials, registered students will receive a draft of a forthcoming manuscript on machine learning, authored by the instructors, to use as class notes.

Prerequisite: Students should have a thorough understanding of vector calculus and linear algebra, and a basic understanding of the Python or MATLAB/Octave programming environments.

**Elective options**

*From the Department of Electrical Engineering and Computer Science (McC):*

- Data Management and Information Processing (EECS 317)
- Machine Learning (EECS 349)
- Digital Image Processing (EECS 420)
- Nonlinear Optimization (EECS 479)
- Probabilistic Graphical Models (EECS 395/495)
- Statistical Pattern Recognition (EECS 433)
- Social Media Mining (EECS 510)
- Geospatial Vision and Visualization (EECS 395/495)
- Data Science (EECS 395/495)

*From the Department of Engineering Sciences and Applied Mathematics:*

- Models in Applied Mathematics (ES_APPM 421-1)
- Numerical Methods for Random Processes (ES_APPM 448)

*From the Department of Statistics:*

- Time Series Analysis (STAT 454)
- Applied Bayesian Inference (STAT 457)
- Theory of Data Mining (STAT 461)

*From the Department of Industrial Engineering and Management Sciences:*

- Statistical Methods for Data Mining (IEMS 304)