STAT-437-mhudelson-2025-09-18-11-13-18
Title of Course [High Dimensional Data Learning and Visualization]
Prefix and Number [STAT 437]
Semester and Year [tbd]
Number of Credit Hours [3 Credits (2 credits lecture, 1 credit studio)]
Prerequisites [STAT 435]
Course Details
Day and Time: [tbd]
Meeting Location: [tbd]
Instructor Contact Information
Instructor Name: [tbd]
Instructor Contact Information: [office location, phone, email] [tbd]
Instructor Office Hours: [tbd]
TA Name: [tbd]
TA Contact Information: [office location, phone, email] [tbd]
TA Office Hours: [tbd]
Course Description
[This course is the second part of a two-course sequence whose first part is STAT 435 “Statistical Modeling for Data Analytics”. STAT 435 focuses on supervised learning via regression models and their regularized versions. STAT 437 focuses on visualization, non-predictive modeling, and unsupervised learning.
It will cover the following topics: data visualization (via R packages such as ggplot2, gganimate, igraph, and plotly), metric-based clustering (such as hierarchical clustering and K-means), probabilistic and metric-based classification (such as nearest neighbors, mixture models, support vector machines, tree-based methods, and neural networks), algebraic and probabilistic dimension reduction (such as principal components, spectral methods, and latent variable models), and scalable and approximate inferential methods (such as variational inference and approximate Bayesian computation).
The methods covered in the course will be implemented in R.]
Course Materials
Required Books: [An Introduction to Statistical Learning: with Applications in R, Corrected 8th printing, 2017, G. James, D. Witten, T. Hastie, R. Tibshirani]
Supplementary Texts: [1. ggplot2: Elegant Graphics for Data Analysis, 2009, Hadley Wickham
2. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition, 2009, T. Hastie, R. Tibshirani and J. Friedman
3. R for Data Science, 2019, Hadley Wickham and Garrett Grolemund]
Other Materials: [None]
Fees: [None]
| Course Learning Outcomes (students will be able to:) | Activities Supporting the Learning Outcomes | Assessment of the Learning Outcomes |
|---|---|---|
| [Perform data visualization] | [There will be approximately 4 to 7 homework assignments. Each assignment usually contains both conceptual exercises and applied exercises, the latter of which require software implementation. There will be 2 projects, each requiring a submitted written report.] | Homework is graded on a per-problem basis. Answers must be well organized and submitted with the necessary supporting computer code. Written project reports are graded with a rubric and should contain at least the following sections: Introduction, Methods and Results, Discussion, and Appendix. |
| [Perform clustering, classification, and dimension reduction] | | |
| [Perform approximate and scalable statistical inference] | | |

| Dates | Lesson Topic | Assignment | Assessment |
|---|---|---|---|
| Week 1 | 1. Basics on R packages dplyr and ggplot2 2. Create scatter plot | Homework 1 assigned | N/A |
| Week 2 [dates] | 1. Elementary Visualizations (via ggplot2): density plot, histogram, boxplot, barplot, pie chart 2. Advanced Visualizations via ggplot2: faceting, annotation | Project 1 assigned | N/A |
| Week 3 [dates] | 1. Advanced Visualizations via ggplot2: adjusting scales, legends, fonts, orientation 2. Advanced Visualizations via ggplot2: math expressions, and other ggplot2 tricks | Homework 1 due; Homework 2 assigned | Homework 1 graded on a per-problem basis |
| Week 4 [dates] | 1. Visualizing spatial data, networks and graphs 2. Basics for interactive and dynamic visualization 3. K-means clustering: I | None | N/A |
| Week 5 [dates] | 1. K-means clustering: II 2. Hierarchical clustering: I | Homework 2 due; Homework 3 assigned | Homework 2 graded on a per-problem basis |
| Week 6 [dates] | 1. Hierarchical clustering: II | None | N/A |
| Week 7 [dates] | 1. Hierarchical clustering: III 2. Bayes classifier 3. Nearest-neighbor classifier: Part I | Homework 3 due; Homework 4 assigned | Homework 3 graded on a per-problem basis |
| Week 8 [dates] | 1. Nearest-neighbor classifier: Part II 2. Discriminant analysis for classification: Part I | Project 2 assigned | N/A |
| Week 9 [dates] | 1. Discriminant analysis for classification: Part II | None | N/A |
| Week 10 [dates] | 1. Discriminant analysis for classification: Part III | Project 1 due | Project graded using rubric |
| Week 11 [dates] | 1. Support vector machines for classification: Part I | Homework 4 due; Homework 5 assigned | Homework 4 graded on a per-problem basis |
| Week 12 [dates] | 1. Support vector machines for classification: Part II 2. Neural networks: I | None | N/A |
| Week 13 [dates] | 1. Neural networks: II 2. Principal component analysis for dimension reduction: Part I | None | Homework 5 graded on a per-problem basis |
| Week 14 [dates] | 1. Principal component analysis for dimension reduction: Part II | None | N/A |
| Week 15 [dates] | 1. Principal component analysis for dimension reduction: Part III 2. Multidimensional scaling (if time allows) | Project 2 due | Project 2 graded using the project rubric |

Expectations for Student Effort
For each hour of lecture equivalent and each hour in studio, students should expect to have a minimum of two hours of work outside of class.
Grading
| Type of Assignment (tests, papers, etc) | Number/Frequency | Percent of Overall Grade |
|---|---|---|
| In-class discussion | Weekly | 12% |
| Weekly class survey | Weekly | 3% |
| Assignments | 5 | 50% |
| Projects | 2 | 35% |

| Grade | Percent | Grade | Percent |
|---|---|---|---|
| A | 93-100 | C | 73-76.99 |
| A- | 90-92.99 | C- | 70-72.99 |
| B+ | 87-89.99 | D+ | 66-69.99 |
| B | 83-86.99 | D | 60-65.99 |
| B- | 80-82.99 | F | 0-59.99 |
| C+ | 77-79.99 | | |
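
As an illustration of how the grading weights combine, the overall percentage can be read as a weighted sum of the four component scores. The symbols D, S, A, and P (average percentage scores for in-class discussion, the weekly survey, assignments, and projects) and the numbers in the second line are hypothetical and are used only to show the arithmetic; they are not taken from the course.

```latex
% Weighted overall grade from the four graded components
\[
\text{Overall} = 0.12\,D + 0.03\,S + 0.50\,A + 0.35\,P
\]
% Hypothetical scores: D = 90, S = 100, A = 85, P = 88
\[
0.12(90) + 0.03(100) + 0.50(85) + 0.35(88) = 10.8 + 3.0 + 42.5 + 30.8 = 87.1
\]
```

Under these assumed scores, an overall percentage of 87.1 falls in the 87-89.99 range and would correspond to a B+.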
Attendance and Make-Up Policy
Students should make all reasonable efforts to attend all class meetings. However, in the event a student is unable to attend a class, it is the responsibility of the student to inform the instructor as soon as possible, explain the reason for the absence (and provide documentation, if appropriate), and make up class work missed within a reasonable amount of time, if allowed. Missing class meetings may result in a reduced overall grade in the class.
Late assignments are not accepted, except under extenuating circumstances. If there are extenuating circumstances, such as an extended illness, a student should contact the instructor before the homework is due and provide the necessary evidence. The instructor, at his or her discretion, may then extend the due date of a homework assignment. The maximum extension is usually 7 days. However, any such decision, in particular regarding the length of an extension, will be made by the instructor on a case-by-case basis.
Academic Integrity Statement
Academic integrity is the cornerstone of higher education. As such, all members of the university community share responsibility for maintaining and promoting the principles of integrity in all activities, including academic integrity and honest scholarship. Academic integrity will be strongly enforced in this course. Students who violate WSU’s Academic Integrity Policy (identified in Washington Administrative Code WAC 504-26-010(3) and WAC 504-26-404) will receive a score of zero on any graded coursework, which may result in failing the course, will not have the option to withdraw from the course pending an appeal, and will be reported to the Center for Community Standards.
Cheating includes, but is not limited to, plagiarism and unauthorized collaboration as defined in the Standards of Conduct for Students, WAC 504-26-010(3). You need to read and understand all of the definitions of cheating: http://app.leg.wa.gov/WAC/default.aspx?cite=504-26-010. If you have any questions about what is and is not allowed in this course, you should ask course instructors before proceeding. If you wish to appeal a faculty member’s decision relating to academic integrity, please use the form available at communitystandards.wsu.edu. Make sure you submit your appeal within 21 calendar days of the faculty member’s decision.