Engagement Detection for DAiSEE and VRESEE Datasets Using Hybrid EfficientNetB7 Together With TCN, LSTM, and Bi-LSTM

Students Engagement Level Detection in Online e-Learning Using Hybrid EfficientNetB7 Together With TCN, LSTM, and Bi-LSTM

Tasneem Selim, Islam Elkabani, Mohamed A. Abdou



Abstract

Student engagement level detection in online e-learning has become a crucial problem due to the rapid advance of digitalization in education. In this paper, a novel Videos Recorded for Egyptian Students Engagement in E-learning (VRESEE) dataset is introduced for student engagement level detection in online e-learning. The dataset is based on an experiment conducted on a group of Egyptian college students who were video-recorded during online e-learning sessions. Each recorded video is labeled with a value from 0 to 3, representing the student's level of engagement during the online session. Furthermore, three new hybrid end-to-end deep learning models are proposed for detecting a student's engagement level in an online e-learning video. These models are evaluated on the VRESEE dataset and on the public Dataset for Affective States in E-Environments (DAiSEE). The first proposed hybrid model uses EfficientNet B7 together with a Temporal Convolutional Network (TCN) and achieves an accuracy of 64.67% on DAiSEE and 81.14% on VRESEE. The second model combines EfficientNet B7 with Long Short-Term Memory (LSTM) and reaches an accuracy of 67.48% on DAiSEE and 93.99% on VRESEE. Finally, the third hybrid model uses EfficientNet B7 with a Bidirectional LSTM and achieves an accuracy of 66.39% on DAiSEE and 94.47% on VRESEE. The first, second, and third proposed models outperform currently existing models by 1.08%, 3.89%, and 2.8%, respectively, in student engagement level detection.

FIGURE 1. The preprocessing of the VRESEE video files.


Datasets

This project uses two key datasets for engagement detection: DAiSEE and VRESEE. Both are curated for affective computing in educational environments and support multi-level engagement analysis.

DAiSEE (Dataset for Affective States in E-Environments)

  • Source: DAiSEE Dataset
  • 🎮 Number of videos: Over 9,000
  • ⏱️ Average duration: Approximately 10 seconds
  • 🎯 Labels: 4 affective states (Engagement, Boredom, Confusion, Frustration), each rated on a 0–3 scale (only the 0–3 engagement labels are used here)
  • 🔄 Preprocessing:
    • Frame extraction and resizing for EfficientNetB7 input

VRESEE (Videos Recorded for Egyptian Students Engagement in E-learning)

  • 🎮 Number of videos: Over 3,500
  • ⏱️ Average duration: Approximately 10 seconds
  • 🎯 Labels: 4-class (0–3 engagement levels)
  • 🔄 Preprocessing:
    • Frame extraction and resizing for EfficientNetB7 input (a sketch of this step follows the list)
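Both datasets share the same preprocessing step. Below is a minimal sketch of it, assuming OpenCV is used to uniformly sample frames and resize them to EfficientNetB7's default 600×600 input; the notebooks' actual sampling rate and file handling may differ:

```python
# Hypothetical preprocessing sketch; the repository's notebooks may sample
# frames and handle files differently.
import cv2
import numpy as np

def extract_frames(video_path, num_frames=16, size=(600, 600)):
    """Uniformly sample num_frames frames and resize them for EfficientNetB7."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, max(total - 1, 0), num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV decodes as BGR
        frames.append(cv2.resize(frame, size))
    cap.release()
    while frames and len(frames) < num_frames:
        frames.append(frames[-1])  # pad short clips by repeating the last frame
    return np.stack(frames)  # shape: (num_frames, 600, 600, 3)
```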

Proposed Hybrid Models

Three hybrid architectures were developed, combining EfficientNetB7 for spatial feature extraction with temporal sequence models:

1. EfficientNetB7 + TCN

  • Utilizes Temporal Convolutional Networks to capture sequential dependencies over time.

2. EfficientNetB7 + LSTM

  • Employs Long Short-Term Memory networks to model engagement progression.

3. EfficientNetB7 + Bi-LSTM

  • Incorporates a Bidirectional LSTM to exploit both past and future frame context for more accurate engagement prediction.

These models aim to capture both spatial cues (from frames) and temporal trends (over time), which are critical for understanding engagement dynamics.
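For concreteness, here is a minimal Keras sketch of the third hybrid (EfficientNetB7 + Bi-LSTM). The layer sizes, dropout rate, and frame count are illustrative assumptions, not the paper's exact configuration; the TCN variant would swap the recurrent layer for a temporal convolutional stack (e.g., the third-party keras-tcn package), and the LSTM variant would drop the Bidirectional wrapper.

```python
# Illustrative sketch of the EfficientNetB7 + Bi-LSTM hybrid; layer sizes and
# hyperparameters are assumptions, not the authors' exact configuration.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_FRAMES, FEATURE_DIM, NUM_CLASSES = 16, 2560, 4  # B7 pooled features are 2560-d

# Spatial stage: frozen EfficientNetB7 as a per-frame feature extractor
# (in this repository the features are extracted offline; see step 3 below).
backbone = tf.keras.applications.EfficientNetB7(
    include_top=False, pooling="avg", weights="imagenet")
backbone.trainable = False

# Temporal stage: a Bi-LSTM over the sequence of per-frame feature vectors.
temporal_model = models.Sequential([
    layers.Input(shape=(NUM_FRAMES, FEATURE_DIM)),
    layers.Bidirectional(layers.LSTM(128)),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
temporal_model.compile(optimizer="adam",
                       loss="sparse_categorical_crossentropy",
                       metrics=["accuracy"])
```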

FIGURE 2. The model architecture.


FIGURE 3. The EfficientNet B7 architecture.


Results

Model                      Dataset   Accuracy
-------------------------  --------  --------
EfficientNetB7 + TCN       DAiSEE    64.67%
EfficientNetB7 + TCN       VRESEE    81.14%
EfficientNetB7 + LSTM      DAiSEE    67.48%
EfficientNetB7 + LSTM      VRESEE    93.99%
EfficientNetB7 + Bi-LSTM   DAiSEE    66.39%
EfficientNetB7 + Bi-LSTM   VRESEE    94.47%

Note: The proposed models outperform previously existing student engagement level detection methods by 1.08% (TCN), 3.89% (LSTM), and 2.8% (Bi-LSTM), respectively.

Project Structure and Usage

This project was prepared to run on Google Colab.

The workflow consists of four steps:

1- Prepare the dataset

  • Use the "separate_data_into_4_classes.ipynb" notebook to preprocess the dataset
  • This notebook divides the dataset into four classes, allocating each class to a separate folder (a sketch of this step follows the list)
  • To adapt it to the same dataset in another location or to a different dataset (e.g., VRESEE), modify the following six variables within the notebook
  • The six variables: "csv_file", "existing_path_prefix", "new_path_prefix_0", "new_path_prefix_1", "new_path_prefix_2", and "new_path_prefix_3"
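A minimal sketch of what this notebook does. The CSV column names ("ClipID", "Engagement") are assumptions that may differ per dataset, and the four destination variables are collapsed into one dict here for brevity:

```python
# Hypothetical sketch of the class-separation step; CSV column names are
# assumptions and the dataset folder layout may differ.
import shutil
from pathlib import Path
import pandas as pd

csv_file = "Labels/TrainLabels.csv"      # adjust to your setup
existing_path_prefix = "DataSet/Train"   # where the videos currently live
new_path_prefix = {0: "Classes/0", 1: "Classes/1",
                   2: "Classes/2", 3: "Classes/3"}

labels = pd.read_csv(csv_file)
for _, row in labels.iterrows():
    level = int(row["Engagement"])                       # engagement level 0-3
    src = Path(existing_path_prefix) / str(row["ClipID"])
    dst = Path(new_path_prefix[level])
    dst.mkdir(parents=True, exist_ok=True)
    if src.exists():
        shutil.copy(src, dst / src.name)
```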

2- Augmentation

  • Use the "DAISEE-AugClass0&1.ipynb" notebook to apply augmentation techniques tailored to class 0 and class 1 (a sketch follows this list)
  • To adapt the notebook to a different dataset (e.g., VRESEE), only the paths in the fourth cell need to be modified
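Classes 0 and 1 are augmented separately, presumably because they are under-represented. A minimal sketch using simple frame-level transforms; the notebook's actual augmentation techniques may differ:

```python
# Hypothetical augmentation sketch; the notebook's actual transforms may differ.
import numpy as np

def augment_frames(frames):
    """Return simple augmented copies of a (T, H, W, 3) uint8 frame array."""
    flipped = frames[:, :, ::-1, :]                         # horizontal flip
    brighter = np.clip(frames.astype(np.int16) + 30,
                       0, 255).astype(np.uint8)             # brightness shift
    noisy = np.clip(frames.astype(np.int16) +
                    np.random.randint(-10, 10, frames.shape),
                    0, 255).astype(np.uint8)                # pixel noise
    return [flipped, brighter, noisy]
```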

3- Feature Extraction using EfficientNet B7

  • Adjust the paths to correspond with your specific directory structure.

a- For the DAiSEE dataset:

"DAISEETrain-FeatureExtractionUsingEfficientNetB7.ipynb" and "DAISEEValidate&Test-FeatureExtractionUsingEfficientNetB7.ipynb" files are utilized to extract features from the Train, Validate, and Test splits of the DAiSEE dataset

b- For the VRESEE dataset:

"EgyptianTrain-FeatureExtractionUsingEfficientNetB7.ipynb" and "EgyptianValidate&Test-FeatureExtractionUsingEfficientNetB7.ipynb" files are employed to extract features from the Train, Validate, and Test splits of the VRESEE dataset

4- Train, test, and tune the models

  • Update the paths in all of the following files to match your directory structure
  • Load the spatially extracted features and use them to train the models that capture temporal information (a training sketch follows the file listings below)

a- For the DAiSEE dataset

"DAISEEEfficientNetB7TCN.ipynb", "DAISEEEfficientNetB7lstm.ipynb", and "DAISEEEfficientNetB7BiLSTM.ipynb" files are designated for training, tuning, and testing TCN, LSTM, and Bi-LSTM models, respectively.

b- For the VRESEE dataset

"EgyptianEfficientNetB7TCN.ipynb", "EgyptianEfficientNetB7lstm.ipynb", and "EgyptianEfficientNetB7BiLSTM.ipynb" files are utilized for training, tuning, and testing TCN, LSTM, and Bi-LSTM models, respectively.

Cite

If any part of our paper or code is helpful to your work, please cite:

@article{selim2022students,
  title={Students engagement level detection in online e-learning using hybrid efficientnetb7 together with tcn, lstm, and bi-lstm},
  author={Selim, Tasneem and Elkabani, Islam and Abdou, Mohamed A},
  journal={IEEE Access},
  volume={10},
  pages={99573--99583},
  year={2022},
  publisher={IEEE}
}

Paper and References

Selim, T., Elkabani, I., & Abdou, M. A. (2022). Students engagement level detection in online e-learning using hybrid EfficientNetB7 together with TCN, LSTM, and Bi-LSTM. IEEE Access, 10, 99573–99583.

Contact

For questions or collaboration: Tasneem Selim 📧 tasneem.selim@email.com
