Numerical Methods and Statistics

These are lecture notes and homework assignments for a course taught at the University of Rochester by Andrew White in the Chemical Engineering Department. The course is taught using Jupyter Notebooks.

View the course online: whitead.github.io/numerical_stats/

Course Description

This course provides an introduction to numerical methods and engineering statistics for chemical engineers. Students learn to use computer models and statistics to understand engineering systems. The focus of numerical methods is translating engineering problems into algorithms and implementing them in a spreadsheet or programming language. Topics covered include basic data structures, programming flow control, plotting, function minimization, integration and differential equations. The statistics portion teaches students basic probability theory, the central limit theorem, hypothesis testing, confidence intervals, regression, model fitting and basic error analysis.

Projects

See project folder.

Table of Contents

Unit 1 — Introduction

Lecture 1: Sample Spaces, Probability Algebra of Samples, Events

Unit 2 — Probability

Lecture 1: Combinations & Permutations, Multidimensional Sample Spaces, Random Variables, Continuous Probability Distributions

Lecture 2: Marginals, Joints, Independence of Random Variables, Table of Useful Equations

Lecture 3: Conditionals, Working with Joints/Marginals/Conditionals, Bayes’ Theorem, Math Definition Independence, Compound Conditionals, Conditional Independence, Table of Useful Equations

Unit 3 — Python Basics

Lecture 1: Python Variables, String Formatting, Representing Integers

Lecture 2: Floating Point Representation, Python Booleans, Default Booleans, Floating Point Booleans, Lists, Slicing

Unit 4 — Python Basics, Expected Value

Lecture 1: List Methods, Range, Numpy Arrays, Python Tutor, For loops, Python Data Types (dictionaries, tuples, ints, floats), Function Arguments, Basic Plotting, Jupyter Notebook Format

Lecture 2: Expected Values and Variance, Conditional Expectation

Unit 5 — Probability Distributions

Lecture 1: Bernoulli, Geometric, Binomial, Poisson, Exponential and Normal Distribution Equations

Lecture 2: Probability of a Sample or Interval, Prediction Interval

Unit 6 — Python Program Flow

Lecture 1: Plotting - Basics, LaTeX, Point Markers, Vertical/Horizontal Lines, Legends

Lecture 2: Break Statement, While Loops, Discrete Distribution Prediction Intervals, Scipy Stats, Working with Probability and Prediction Intervals of Normal Distribution

Lecture 3: Defining Functions, Named Arguments, Default Function Arguments, Documenting Functions

Unit 7 — Functions and Sample Statistics

Lecture 1: Sample Statistics for 1D data: median, mean, mode, quartiles and quantiles.

Lecture 2: Presenting Results and Precision, Calculating Sample Statistics, Visualizing 1D data with histograms, Calculating Sample Statistics with Categories, Visualizing Categorical 1D data with Boxplots and Violin Plots.

Lecture 3: Sample Statistics for 2D data: Sample Covariance, Sample Correlation.

Lecture 4: Plotting 2D data (scatter plot) and computing sample covariance/correlation

Unit 8 — Central Limit Theorem and Confidence Intervals

Lecture 1: Central Limit Theorem and Theory of Confidence Intervals

Lecture 2: Computing Confidence Intervals

Unit 9 — Linear Algebra in Python

Lecture 1: Python Tips & Tricks

Lecture 2: Matrix Algebra (linalg), Solving Systems of Equations, Eigenvector/Eigenvalue, Matrix Rank

Lecture 3: Numerical Differentiation, Numerical Integration via Trapezoidal Rule, Numerical Integration in Scipy, Anonymous Functions (lambda)

Unit 10 — Hypothesis Testing

Lecture 1: Introduction to Hypothesis Testing, the zM and Student’s t-Test

Lecture 2: Non-Parametric Statistics, Reading a CSV file in Pandas, Wilcoxon Sum of Ranks, Wilcoxon Signed Rank, Poisson Test, Binomial Test

Unit 11 — Optimization

Lecture 1: Common mistakes with functions, Scope, Root Finding in 1D, Minimization in 1D, Convexity

Lecture 2: Root finding in multiple dimensions, Minimization in multiple dimensions, Bounded Optimization, Non-convex Optimization

Unit 12 — Regression

Lecture 1: Shapiro-Wilk Normality Test, Ordinary Least-Squares Linear Regression in 1- (OLS-1D) and N dimensions (OLS-ND), Standard error, Uncertainty in OLS-1D, OLS-ND, Fit coefficient hypothesis tests, Fit coefficient confidence intervals, Overview of steps to justify and perform regression (bottom of lecture)

Lecture 2: Non-linear regression and error analysis. Deconvoluting spectrum example.

Lecture 3: Regressing categorical data with discrete domains

Lecture 4: Regressing with constant uncertainty/measurement error in independent and/or dependent variables

Unit 13 — Differential Equations & Uncertainty Propagation

Lecture 1: Standard form and categorizing differential equations, Solving ODEs

Lecture 2: Error propagation through numerical derivatives, statistical fallacies

Unit 14 — Applied Python - Working with Data

Lecture 1: Dealing with duplicate, missing, NaN, non-contiguous, out of order data, Joining datasets, Using Pandas, Using Seaborn, Computing Running Means

Lecture 2: Packaging and deploying Python modules

Unit 15 — What to do now

Lecture 1: Next steps to learn more about numerical methods, statistics, and programming

Unit 16 — MATLAB

Lecture 1: An overview of MATLAB, the Jupyter Hub server and Excel

Unit 17 — User Interfaces

Lecture 1: Creating and writing animations

Lecture 2: Introduction to HTML, CSS, JS and modifying notebook style

Unit 18 — Design of Experiments

Lecture 1: Tables of experiments, vocabulary of design of experiments, ANOVA, factorial design, fractional factorial design, nuisance factors, blocking