Engagement Detection for DAiSEE and VRESEE Datasets Using Hybrid EfficientNetB7 Together With TCN, LSTM, and Bi-LSTM

Students Engagement Level Detection in Online e-Learning Using Hybrid EfficientNetB7 Together With TCN, LSTM, and Bi-LSTM

Tasneem Selim, Islam Elkabani, Mohamed A. Abdou



Abstract

Student engagement level detection in online e-learning has become a crucial problem due to the rapid advance of digitalization in education. In this paper, a novel Videos Recorded for Egyptian Students Engagement in E-learning (VRESEE) dataset is introduced for student engagement level detection in online e-learning. The dataset is based on an experiment conducted on a group of Egyptian college students who were video-recorded during online e-learning sessions. Each recorded video is labeled with a value from 0 to 3, representing the student's level of engagement during the online session. Furthermore, three new hybrid end-to-end deep learning models are proposed for detecting a student's engagement level in an online e-learning video. These models are evaluated on the VRESEE dataset and on the public Dataset for Affective States in E-Environments (DAiSEE). The first proposed hybrid model uses EfficientNet B7 together with a Temporal Convolutional Network (TCN) and achieves an accuracy of 64.67% on DAiSEE and 81.14% on VRESEE. The second model combines EfficientNet B7 with Long Short-Term Memory (LSTM) and reaches an accuracy of 67.48% on DAiSEE and 93.99% on VRESEE. Finally, the third hybrid model uses EfficientNet B7 with a Bidirectional LSTM and achieves an accuracy of 66.39% on DAiSEE and 94.47% on VRESEE. The first, second, and third proposed models outperform currently existing models by 1.08%, 3.89%, and 2.8%, respectively, in student engagement level detection.

FIGURE 1. The preprocessing of the VRESEE video files.


Datasets

This project uses two key datasets for engagement detection: DAiSEE and VRESEE. Both are curated for affective computing in educational environments and support multi-level engagement analysis.

DAiSEE (Dataset for Affective States in E-Environments)

  • Source: DAiSEE Dataset
  • 🎮 Number of videos: Over 9,000
  • ⏱️ Average duration: Approximately 10 seconds
  • 🎯 Labels: 4 affective states (Engagement, Boredom, Confusion, Frustration), each rated on a 0–3 scale (only the 0–3 engagement labels are used here)
  • 🔄 Preprocessing:
    • Frame extraction and resizing for EfficientNetB7 input

VRESEE (Videos Recorded for Egyptian Students Engagement in E-learning)

  • 🎮 Number of videos: Over 3,500
  • ⏱️ Average duration: Approximately 10 seconds
  • 🎯 Labels: 4-class (0–3 engagement levels)
  • 🔄 Preprocessing:
    • Frame extraction and resizing for EfficientNetB7 input (a sketch of this step follows the list)
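Both datasets share the same preprocessing step. Below is a minimal sketch of it, assuming OpenCV is used to uniformly sample frames and resize them to EfficientNetB7's default 600×600 input; the notebooks' actual sampling rate and file handling may differ:

```python
# Hypothetical preprocessing sketch; the repository's notebooks may sample
# frames and handle files differently.
import cv2
import numpy as np

def extract_frames(video_path, num_frames=16, size=(600, 600)):
    """Uniformly sample num_frames frames and resize them for EfficientNetB7."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, max(total - 1, 0), num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV decodes as BGR
        frames.append(cv2.resize(frame, size))
    cap.release()
    while frames and len(frames) < num_frames:
        frames.append(frames[-1])  # pad short clips by repeating the last frame
    return np.stack(frames)  # shape: (num_frames, 600, 600, 3)
```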

Proposed Hybrid Models

Three hybrid architectures were developed, combining EfficientNetB7 for spatial feature extraction with temporal sequence models:

1. EfficientNetB7 + TCN

  • Utilizes Temporal Convolutional Networks to capture sequential dependencies over time.

2. EfficientNetB7 + LSTM

  • Employs Long Short-Term Memory networks to model engagement progression.

3. EfficientNetB7 + Bi-LSTM

  • Incorporates a Bidirectional LSTM to exploit both past and future frame context for more accurate engagement prediction.

These models aim to capture both spatial cues (from frames) and temporal trends (over time), which are critical for understanding engagement dynamics.
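For concreteness, here is a minimal Keras sketch of the third hybrid (EfficientNetB7 + Bi-LSTM). The layer sizes, dropout rate, and frame count are illustrative assumptions, not the paper's exact configuration; the TCN variant would swap the recurrent layer for a temporal convolutional stack (e.g., the third-party keras-tcn package), and the LSTM variant would drop the Bidirectional wrapper.

```python
# Illustrative sketch of the EfficientNetB7 + Bi-LSTM hybrid; layer sizes and
# hyperparameters are assumptions, not the authors' exact configuration.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_FRAMES, FEATURE_DIM, NUM_CLASSES = 16, 2560, 4  # B7 pooled features are 2560-d

# Spatial stage: frozen EfficientNetB7 as a per-frame feature extractor
# (in this repository the features are extracted offline; see step 3 below).
backbone = tf.keras.applications.EfficientNetB7(
    include_top=False, pooling="avg", weights="imagenet")
backbone.trainable = False

# Temporal stage: a Bi-LSTM over the sequence of per-frame feature vectors.
temporal_model = models.Sequential([
    layers.Input(shape=(NUM_FRAMES, FEATURE_DIM)),
    layers.Bidirectional(layers.LSTM(128)),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
temporal_model.compile(optimizer="adam",
                       loss="sparse_categorical_crossentropy",
                       metrics=["accuracy"])
```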

FIGURE 2. The model architecture.


FIGURE 3. The EfficientNet B7 architecture.


Results

Model                      Dataset   Accuracy
-------------------------  --------  --------
EfficientNetB7 + TCN       DAiSEE    64.67%
EfficientNetB7 + TCN       VRESEE    81.14%
EfficientNetB7 + LSTM      DAiSEE    67.48%
EfficientNetB7 + LSTM      VRESEE    93.99%
EfficientNetB7 + Bi-LSTM   DAiSEE    66.39%
EfficientNetB7 + Bi-LSTM   VRESEE    94.47%

Note: The proposed models outperform previously existing student engagement level detection methods by 1.08% (TCN), 3.89% (LSTM), and 2.8% (Bi-LSTM), respectively.

Project Structure and Usage

This project was prepared to run on Google Colab.

The workflow consists of four steps:

1- Prepare the dataset

  • Use the "separate_data_into_4_classes.ipynb" notebook to preprocess the dataset
  • This notebook divides the dataset into four classes, allocating each class to a separate folder (a sketch of this step follows the list)
  • To adapt it to the same dataset in another location or to a different dataset (e.g., VRESEE), modify the following six variables within the notebook
  • The six variables: "csv_file", "existing_path_prefix", "new_path_prefix_0", "new_path_prefix_1", "new_path_prefix_2", and "new_path_prefix_3"
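A minimal sketch of what this notebook does. The CSV column names ("ClipID", "Engagement") are assumptions that may differ per dataset, and the four destination variables are collapsed into one dict here for brevity:

```python
# Hypothetical sketch of the class-separation step; CSV column names are
# assumptions and the dataset folder layout may differ.
import shutil
from pathlib import Path
import pandas as pd

csv_file = "Labels/TrainLabels.csv"      # adjust to your setup
existing_path_prefix = "DataSet/Train"   # where the videos currently live
new_path_prefix = {0: "Classes/0", 1: "Classes/1",
                   2: "Classes/2", 3: "Classes/3"}

labels = pd.read_csv(csv_file)
for _, row in labels.iterrows():
    level = int(row["Engagement"])                       # engagement level 0-3
    src = Path(existing_path_prefix) / str(row["ClipID"])
    dst = Path(new_path_prefix[level])
    dst.mkdir(parents=True, exist_ok=True)
    if src.exists():
        shutil.copy(src, dst / src.name)
```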

2- Augmentation

  • Use the "DAISEE-AugClass0&1.ipynb" notebook to apply augmentation techniques tailored to class 0 and class 1 (a sketch follows this list)
  • To adapt the notebook to a different dataset (e.g., VRESEE), only the paths in the fourth cell need to be modified
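Classes 0 and 1 are augmented separately, presumably because they are under-represented. A minimal sketch using simple frame-level transforms; the notebook's actual augmentation techniques may differ:

```python
# Hypothetical augmentation sketch; the notebook's actual transforms may differ.
import numpy as np

def augment_frames(frames):
    """Return simple augmented copies of a (T, H, W, 3) uint8 frame array."""
    flipped = frames[:, :, ::-1, :]                         # horizontal flip
    brighter = np.clip(frames.astype(np.int16) + 30,
                       0, 255).astype(np.uint8)             # brightness shift
    noisy = np.clip(frames.astype(np.int16) +
                    np.random.randint(-10, 10, frames.shape),
                    0, 255).astype(np.uint8)                # pixel noise
    return [flipped, brighter, noisy]
```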

3- Feature Extraction using EfficientNet B7

  • Adjust the paths to correspond with your specific directory structure.

a- For the DAiSEE dataset:

"DAISEETrain-FeatureExtractionUsingEfficientNetB7.ipynb" and "DAISEEValidate&Test-FeatureExtractionUsingEfficientNetB7.ipynb" files are utilized to extract features from the Train, Validate, and Test splits of the DAiSEE dataset

b- For the VRESEE dataset:

"EgyptianTrain-FeatureExtractionUsingEfficientNetB7.ipynb" and "EgyptianValidate&Test-FeatureExtractionUsingEfficientNetB7.ipynb" files are employed to extract features from the Train, Validate, and Test splits of the VRESEE dataset

4- Train, test, and tune the models

  • Update the paths in all of the following files to match your directory structure
  • Load the spatially extracted features and use them to train the models that capture temporal information (a training sketch follows the file listings below)

a- For the DAiSEE dataset

"DAISEEEfficientNetB7TCN.ipynb", "DAISEEEfficientNetB7lstm.ipynb", and "DAISEEEfficientNetB7BiLSTM.ipynb" files are designated for training, tuning, and testing TCN, LSTM, and Bi-LSTM models, respectively.

b- For the VRESEE dataset

"EgyptianEfficientNetB7TCN.ipynb", "EgyptianEfficientNetB7lstm.ipynb", and "EgyptianEfficientNetB7BiLSTM.ipynb" files are utilized for training, tuning, and testing TCN, LSTM, and Bi-LSTM models, respectively.

Cite

If any part of our paper or code is helpful to your work, please cite:

@article{selim2022students,
  title={Students engagement level detection in online e-learning using hybrid efficientnetb7 together with tcn, lstm, and bi-lstm},
  author={Selim, Tasneem and Elkabani, Islam and Abdou, Mohamed A},
  journal={IEEE Access},
  volume={10},
  pages={99573--99583},
  year={2022},
  publisher={IEEE}
}

Paper and References

Selim, T., Elkabani, I., & Abdou, M. A. (2022). Students engagement level detection in online e-learning using hybrid EfficientNetB7 together with TCN, LSTM, and Bi-LSTM. IEEE Access, 10, 99573–99583.

Contact

For questions or collaboration: Tasneem Selim 📧 tasneem.selim@email.com
