Title of Course [Machine Learning Applications in Education]
Prefix and Number [ED_PSYCH 581]
Semester and Year [Spring 2026]
Number of Credit Hours [3]
Prerequisites [ED_RES 565 Quantitative Research]
Recommended Preparation [ED_PSYCH 569 Seminar in Quantitative Techniques in Education]
Course Details
Day and Time: Thursday, 4:10 PM – 7:00 PM
Meeting Location: Cleveland Hall 63
Instructor Contact Information
Instructor Name: Shenghai Dai
Instructor Contact Information:
- Office: Cleveland Hall 354
- Phone: (509) 335-0958
- Email: s.dai@wsu.edu
Instructor Office Hours: Thursdays: 2:00 – 4:00 PM or by appointment
TA Name: [tbd]
TA Contact Information: [office location, phone, email]: [tbd]
TA Office Hours: [tbd]
Course Description
Ed Psych 581 introduces classical and novel machine learning methods and their applications in broad educational contexts. Major topics include statistical learning, prediction with linear regression, regularization methods including shrinkage (LASSO, ridge, and elastic net) and dimension reduction (principal components regression and partial least squares), classification with logistic regression and discriminant functions, cross-validation and bootstrapping, tree-based methods (decision trees, bagging, boosting, random forests), support vector machines, deep learning and neural networks, and unsupervised learning.
The course will primarily consist of lectures and practical exercises based on the assigned readings and examples. Each topic will be introduced with a lecture, followed by a guided real-data example and reflection on the topic discussed. During the semester, you are expected to complete readings prior to class, engage in practice and discussion during class periods, and complete several assignments. You are expected to select a topic of interest within your area of research and provide an in-depth analysis in both written and oral forms (see more detail below under Research Project). You are expected to participate actively, build on the knowledge previously acquired, and gain the technical foundations necessary to be consumers of and contributors to applied and methodological research using machine learning methods.
Course Materials
Books
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An introduction to statistical learning (2nd Ed.). New York: Springer. Free online access: https://www.statlearning.com/
- Khine, M. S. (2024). Machine Learning in Educational Sciences. Springer Nature Singapore. Free online access: https://link.springer.com/book/10.1007/978-981-99-9379-6
Other Materials:
Recommended Other Book
- American Psychological Association (2019). Publication manual of the American Psychological Association (7th ed.). Washington, D. C.: Author.
Suggested Readings (for assigned readings, see Tentative Class Schedule)
- Reviews of Machine Learning in Education
- Hilbert, S., Coors, S., Kraus, E., Bischl, B., Lindl, A., Frei, M., ... & Stachl, C. (2021). Machine learning for the educational sciences. Review of Education, 9(3), e3310.
- Albreiki, B., Zaki, N., & Alashwal, H. (2021). A systematic literature review of student performance prediction using machine learning techniques. Education Sciences, 11(9), 552.
- Technical Notes on Machine Learning Methods
- Murphy, K. P. (2012). Machine learning: A probabilistic perspective. MIT Press.
- Luan, J., Zhang, C., Xu, B., Xue, Y., & Ren, Y. (2020). The predictive performances of random forest models with limited sample size and different species traits. Fisheries Research, 227, 105534.
- Zhou, D.-X. (2013). On grouping effect of elastic net. Statistics & Probability Letters, 83(9), 2108–2112.
- Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320.
- Applications of Machine Learning in Education
- Regressions
- Immekus, J. C., Jeong, T. S., & Yoo, J. E. (2022). Machine learning procedures for predictor variable selection for schoolwork-related anxiety: evidence from PISA 2015 mathematics, reading, and science assessments. Large-scale Assessments in Education, 10(1), 1-21.
- Gorostiaga, A., & Rojo-Álvarez, J. L. (2016). On the use of conventional and statistical-learning techniques for the analysis of PISA results in Spain. Neurocomputing, 171, 625-637.
- Regularization Methods
- Dai, S., Hao, T., Ardasheva, Y., Ramazan, O., Danielson, R., & Austin, B. (2023). PISA reading achievement: Identifying predictors and examining model generalizability for multilingual students. Reading and Writing, 36, 2763–2795.
- Ramazan, O., Dai, S., Danielson, R., Ardasheva, Y., Hao, T., & Austin, B. (2023). Students’ 2018 PISA reading self-concept: Identifying predictors and examining model generalizability for emergent bilinguals. Journal of School Psychology, 101, 101254.
- Yoo, J. E. (2018). TIMSS 2011 Student and Teacher Predictors for Mathematics Achievement Explored and Identified via Elastic Net. Frontiers in Psychology, 9.
- Tree-Based Methods
- Chang, C. N., Lin, S., Kwok, O. M., & Saw, G. K. (2023). Predicting STEM Major Choice: a Machine Learning Classification and Regression Tree Approach. Journal for STEM Education Research, 1-17.
- Liu, X., & Ruiz, M. E. (2008). Using data mining to predict K–12 students' performance on large‐scale assessment items related to energy. Journal of Research in Science Teaching, 45(5), 554-573.
- Support Vector Machines
- Bernardo, A. B. I., Cordel, M. O., Lucas, R. I. G., Teves, J. M. M., Yap, S. A., & Chua, U. C. (2021). Using Machine Learning Approaches to Explore Non-Cognitive Variables Influencing Reading Proficiency in English among Filipino Learners. Education Sciences, 11(10), 628.
- Chen, J., Zhang, Y., Wei, Y., & Hu, J. (2021). Discrimination of the Contextual Features of Top Performers in Scientific Literacy Using a Machine Learning Approach. Research in Science Education, 51(1), 129–158.
- Dong, X., & Hu, J. (2019). An exploration of impact factors influencing students’ reading literacy in Singapore with machine learning approaches. International Journal of English Linguistics, 9(5), 52–65.
- Gorostiaga, A., & Rojo-Álvarez, J. L. (2016). On the use of conventional and statistical-learning techniques for the analysis of PISA results in Spain. Neurocomputing, 171, 625–637.
- Neural Networks
- Loesche, P. M. (2019). Estimating the true extent of gender differences in scholastic achievement: A neural network approach. Intelligence, 77, 101398.
- Others (Multiple Methods Used)
- Ho, I. M. K., Cheong, K. Y., & Weldon, A. (2021). Predicting student satisfaction of emergency remote learning in higher education during COVID-19 using machine learning techniques. PLOS ONE, 16(4), e0249423.
- Levin, N. A. (2021). Process mining combined with Expert Feature Engineering to predict efficient use of time on high-stakes assessments. Journal of Educational Data Mining, 13(2), 1-15.
Education Data Sources/Websites
- Mihaescu, M. C., & Popescu, P. S. (2021). Review on publicly available datasets for educational data mining. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 11(3), e1403.
- International Association for the Evaluation of Educational Achievement (IEA): https://www.iea.nl
- Integrated Postsecondary Education Data System (IPEDS): https://nces.ed.gov/ipeds
- National Assessment for Educational Progress (NAEP): https://nces.ed.gov/nationsreportcard
- National Center for Education Statistics (NCES): https://nces.ed.gov
- Programme for International Student Assessment (PISA): http://www.oecd.org/pisa
- Progress in International Reading Literacy Study (PIRLS): https://timssandpirls.bc.edu
- Teaching and Learning International Survey (TALIS): http://www.oecd.org/education/talis
- Trends in International Mathematics and Science Study (TIMSS): https://timssandpirls.bc.edu
Software
Computer lab work is a component of the course. This will give the student the opportunity to apply what is discussed in class. The R software (https://cran.r-project.org) is used as the major tool for the class. Other software (Jamovi, Python) will also be introduced as time allows or need arises. All the software tools are free of cost. To get familiar with R modeling, the following sources can be useful.
- Friedman, J., Hastie, T., Tibshirani, R., Narasimhan, B., Tay, K., & Simon, N. (2022). glmnet: Lasso and Elastic-Net Regularized Generalized Linear Models (R Package Version 4.1-4) [R package]. https://cran.r-project.org/package=glmnet
- Greenwell, B., Boehmke, B., Cunningham, J., & GBM Developers. (2020). Generalized Boosted Regression Models (R Package Version 2.1.8) [R package]. https://github.com/gbm-developers/gbm
- Kuhn, M. (2022). caret: Classification and Regression Training (R Package Version 6.0-92) [R package]. https://github.com/topepo/caret
- Liaw, A. (2022). randomForest: Breiman and Cutler’s Random Forests for Classification and Regression (R Package Version 4.7-11) [R package]. https://CRAN.R-project.org/package=randomForest
- Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., & Leisch, F. (2022). e1071: Misc Functions of the Department of Statistics, Probability Theory Group (R Package Version 1.7-11) [R package]. https://CRAN.R-project.org/package=e1071
- Ripley, B. (2022). tree: Classification and Regression Trees (R Package Version 1.0-42) [R package]. https://CRAN.R-project.org/package=tree
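For a rough feel of the resampling ideas covered in the course (k-fold cross-validation), the following is a minimal sketch in Python using only the standard library. It is not the course's lab code and does not use the R packages listed above; the function names (`k_fold_indices`, `fit_line`, `cv_mse`) and the toy data are assumptions made purely for illustration.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cv_mse(xs, ys, k=5, seed=0):
    """Average held-out mean squared error across k folds."""
    fold_errors = []
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        # Fit on the training folds only, then score on the held-out fold.
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in fold]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Made-up data: a noisy linear relation (not from any course dataset).
rng = random.Random(1)
hours = [float(h) for h in range(1, 21)]
scores = [50 + 2 * h + rng.gauss(0, 3) for h in hours]
print(round(cv_mse(hours, scores, k=5), 2))
```

Each observation is held out exactly once, so the averaged error estimates how the fitted line would perform on data it has not seen, which is the quantity that tuning and model comparison rely on.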
Fees: [NA]
Course Format
The class will include a variety of activities: individual work, class discussions, lab practices, small group work and presentations. Please come to class prepared to discuss reading and other assignments.
| Course Learning Outcomes (students will be able to:) | Activities Supporting the Learning Outcomes | Assessment of the Learning Outcomes |
|---|---|---|
| Articulate the similarities and differences between machine/statistical learning and classical statistical methods. | Weeks 1-2: Introduction to machine and statistical learning; model accuracy and interpretability; predictions with linear regressions | Lab practice; Homework #1 |
| Select the best regularization regressions based on tuning parameters, conduct the analysis, and interpret the results. | Weeks 3-4: Regularization methods – penalized regressions (LASSO, ridge, elastic net) | Lab practice; Homework #2 |
| Conduct the analysis using dimension reduction methods and interpret the results. | Week 5: Regularization methods – dimension reduction methods | Lab practice; Homework #3 |
| Select the appropriate classification method, conduct the analysis, and interpret the results. | Weeks 6-7: Classification methods (logistic regression and discriminant functions) | Lab practice; Homework #3 |
| Use k-fold cross-validation and/or bootstrapping to enhance the robustness of the methods. | Week 8: Cross-validation and resampling | Lab practice; Homework #4 |
| Investigate the issues of linearity and examine the potential to use splines and additive models to address non-linearity. | Week 9: Moving beyond linearity – splines and additive models | Lab practice; Homework #4 |
| Apply tree-based methods and interpret the results. | Weeks 10-11: Tree-based methods | Lab practice; Homework #4 |
| Conduct analysis using support vector machines and interpret the results. | Week 12: Support vector machines | Lab practice; Homework #5 |
| Conduct analysis using deep learning and neural networks, and interpret the results. | Week 13: Deep learning and neural networks | Lab practice; Homework #5 |
| Implement machine learning in your area of research; propose and complete a research project (proposal, presentation, and research paper). | Weeks 1-16 (Week 15: Project work and consultation; Week 16: Project presentation) | Project proposal; presentation; research project |
| Dates | Lesson Topic | Assignment | Assessment |
|---|---|---|---|
| Week 1 | Overview of the course; introduction to machine and statistical learning; model accuracy and interpretability | James et al. (2021, CH1) | HW#1 |
| Week 2 [01-22] | Predictions with linear regressions – simple & multiple regressions, non-linear transformations, regression vs. K-nearest neighbors (KNN) | James et al. (2021, CH3) | |
| Week 3 [01-29] | Regularization methods – penalized regressions | James et al. (2021, CH6) | HW#2 |
| Week 4 [02-05] | Lab practice – penalized regressions | Dai et al. (2021) | |
| Week 5 [02-12] | Regularization methods – dimension reduction methods | James et al. (2021, CH6) | HW#3 |
| Week 6 [02-19] | Classification methods | James et al. (2021, CH4) | |
| Week 7 [02-26] | Lab practice – classification methods | | |
| Week 8 [03-05] | Cross-validation and resampling | James et al. (2021, CH5) | HW#4 |
| Week 9 [03-19] | Moving beyond linearity | James et al. (2021, CH7) | |
| Week 10 [03-26] | Tree-based methods | James et al. (2021, CH8) | |
| Week 11 [04-02] | Tree-based methods | James et al. (2021, CH8) | |
| Week 12 [04-09] | Support vector machines | James et al. (2021, CH9) | HW#5 |
| Week 13 [04-16] | Deep learning and neural networks | James et al. (2021, CH10) | |
| Week 14 [04-23] | Project work and consultation | NA | Research Project |
| Week 15 [04-30] | Project Presentations | NA | Presentation |
Notes: HW = homework
- The instructor reserves the right to adjust the schedule as needed.
- Additional readings may be assigned.
Expectations for Student Effort
For each hour of lecture equivalent, students should expect to have a minimum of two hours of work outside of class. I expect all students to (a) attend class on time, (b) participate actively in class discussions, (c) read all assigned readings, and (d) turn in assignments on time. We will try to have lab time each class session to practice what we have discussed. If you are unable to attend class, please notify me in advance. You are responsible for the information missed during your absence. No late assignments are accepted for credit.
Grading
| Type of Assignment (tests, papers, etc) | Points | Percent of Overall Grade |
|---|---|---|
| Homework Assignment | 50 | 50% |
| Research Project Proposal | 10 | 10% |
| Research Project Presentation | 10 | 10% |
| Research Project Paper | 30 | 30% |
Grades will be based on (a) homework assignments (50%) and (b) research project (50%). You are encouraged to work together on the data analysis part of the assignment and assist each other with the course material. However, all the other parts of the assignments, including article critique, results write-up for the analysis, and responses to the questions, should be your own work. Academic honesty is expected. Late Policy: Assignments turned in after the due date will not be eligible for credit toward the final grade you earn. Late assignments will be worth 0 points.
The course grade a student earns is determined by the following combination of assessments of the objectives listed above. I also note that you should expect to spend a few hours on homework assignments and work throughout the semester on your project. A last-day effort on assignments is not a robust strategy for mastering the content.
Homework Assignments [50%]
There will be a total of five homework assignments throughout the course [10% each]. The assignments will consist of practices or problems in related topics discussed in class, but also may include other learning/practice experiences. The data for the assignments and further assignment information will be provided. No make-up assignments will be offered. Late assignments are not accepted for credit. Assignments are due at the beginning of class, with no exceptions.
Research Project & Presentation [50%]
The purpose of the project is to provide you with an opportunity to apply skills learned in this class by investigating a problem of your interest through simulated or real data analysis. The project involves conducting a machine learning based study on data that are of interest to you. The dataset can be obtained from one of your professors or colleagues, or it can be one that you have collected. A methodological study (i.e., a simulation study) of an aspect of methodology is also encouraged. If you have questions about a data source, please ask. I can also generate data for you but need sufficient time to do so (i.e., 3-4 weeks). Projects will be presented to the class at the end of the semester. The written report is due at 11:59 PM on the Friday of the final exam week.
There are three graded components of the project:
- Project proposal (10%)
The project proposal should focus mainly on the research question, a brief literature review, and the study design (1-3 pages);
- Project presentation (10%)
Each individual or group will present their project at the end of the semester. This will be in the conference presentation style. Each team will have 10-12 minutes to present the project and a chance to answer questions from their colleagues.
- Project paper (30%)
The project report must be typed (double-spaced) and follow APA format (7th edition). The APA style manual is available at the bookstore and in the reference section of the library. Font size should be no smaller than 10 point and no larger than 12 point. Page margins should be 1.0 inch. The paper should be written in a form suitable for publication or for submission as a conference paper in your area, with a limit of 800–2,000 words, excluding references, tables, and figures. The range of possible projects is very broad, and each paper should include: a) title page (title ≤ 15 words), b) abstract (≤ 250 words), c) introduction (theoretical rationale, literature review, purpose statement, & hypothesis), d) method and study design, e) analysis and results, f) discussion (implications, limitations, etc.), and g) references. Computer programs (e.g., R code) and sample output from the analysis must be provided with the paper. More details will be given in class. Please proofread your work carefully. Incorrect grammar, misspelled words, and failure to follow APA format are unacceptable. Projects given to me after the due date will not be eligible for credit toward your final grade.
| Grade | Percent | Grade | Percent |
|---|---|---|---|
| A | 100.00-93.00% | C | 76.99-73.00% |
| A- | 92.99-90.00% | C- | 72.99-70.00% |
| B+ | 89.99-87.00% | D+ | 69.99-67.00% |
| B | 86.99-83.00% | D | 66.99-60.00% |
| B- | 82.99-80.00% | F | 59.99% or below |
| C+ | 79.99-77.00% | | |
Note: I reserve the right to change the scale if doing so favors the student, and I round to the nearest whole number.
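As an illustration of how the weights and the grade scale combine, here is a minimal sketch in Python. The weights and cutoffs come from the tables above; the function names, the example scores, and the choice to round halves upward (the syllabus says only "nearest whole number") are assumptions made for illustration, not official grading code.

```python
import math

# Component weights taken from the grading table (points out of 100).
WEIGHTS = {
    "homework": 50,
    "proposal": 10,
    "presentation": 10,
    "paper": 30,
}

# Lower bound of each letter grade, taken from the grade scale.
CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (60, "D"),
]


def course_percent(scores):
    """Combine component scores (each a fraction in [0, 1]) into a percent."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percent):
    """Round to the nearest whole number, then look up the letter grade."""
    rounded = math.floor(percent + 0.5)  # assumption: halves round up
    for lower_bound, letter in CUTOFFS:
        if rounded >= lower_bound:
            return letter
    return "F"


# Hypothetical student: 90% on homework, full credit on the proposal,
# 85% on the presentation, and 88% on the paper.
example = {"homework": 0.90, "proposal": 1.00, "presentation": 0.85, "paper": 0.88}
percent = course_percent(example)
print(round(percent, 2), letter_grade(percent))
```

In this hypothetical case the weighted total is 89.9, which rounds to 90 and therefore falls in the A- band rather than the B+ band.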
Attendance and Make-Up Policy
Students should make all reasonable efforts to attend all class meetings. However, in the event a student is unable to attend a class, it is the responsibility of the student to inform the instructor as soon as possible, explain the reason for the absence (and provide documentation, if appropriate), and make up class work missed within a reasonable amount of time, if allowed. Missing class meetings may result in reducing the overall grade in the class.
Assigning Incompletes: University policy (Acad. Reg. #90) states that Incompletes may only be awarded if: "the student is unable to complete their work on time due to circumstances beyond their control".
Academic Integrity Statement
You are responsible for reading WSU's Academic Integrity Policy, which is based on Washington State law.
Academic integrity is the cornerstone of higher education. As such, all members of the university community share responsibility for maintaining and promoting the principles of integrity in all activities, including academic integrity and honest scholarship. Academic integrity will be strongly enforced in this course. Students who violate WSU’s Academic Integrity Policy (identified in Washington Administrative Code (WAC) 504-26-010(3) and -404) will fail the assignment and/or the course, depending on the nature of the offense and in accord with the policy; will not have the option to withdraw from the course pending an appeal; and will be reported to the Office of Student Conduct.
Cheating includes, but is not limited to, plagiarism and unauthorized collaboration as defined in the Standards of Conduct for Students, WAC 504-26-010(3). You need to read and understand all of the definitions of cheating: http://app.leg.wa.gov/WAC/default.aspx?cite=504-26-010. If you have any questions about what is and is not allowed in this course, you should ask course instructors before proceeding.
If you wish to appeal a faculty member's decision relating to academic integrity, please use the form available at conduct.wsu.edu.
Attention to this policy is particularly important in a course like ED_PSYCH 581, in which collaboration with other students is encouraged. Specifically, you can only make use of the following sources when working on your homework assignments and exams. Other sources such as asking or paying others to do the work or similar are not acceptable and will be treated as violations of the WSU academic code (WAC 504-26-010).
- Lecture and lab notes, course materials, or other publicly available text-based sources. These materials should not be copied and pasted, and should be credited appropriately (e.g., cited).
- Discussion with peers in class. If you work with other students during the planning, execution, or interpretation of your data analyses – a process that I support – you should make sure that the other students’ contributions are recognized explicitly in your written account.
- Asking me for help.
If you cheat in your work in this class you will:
- Fail the course
- Be reported to the Center for Community Standards
- Have the right to appeal my decision
- Not be able to drop the course or withdraw from the course until the appeals process is finished.
If you have any questions about what you can and cannot do in this course, ask me.
If you want to ask for a change in my decision about academic integrity, use the form at the Center for Community Standards website. You must submit this request within 21 calendar days of the decision.
University syllabus statement and link
Students are responsible for reading and understanding all university-wide policies and resources pertaining to all courses (for instance: accommodations, care resources, policies on discrimination or harassment), which can be found in the university syllabus.