
DUSC Instructor thoughts #24

@DimmestP


Thoughts after DUSC workshop 15/11/23:

  • Using a different dataset that has a more even spread (50:50) between the two classes would make the bias-variance tradeoff clearer
  • Using a non-medical dataset would also broaden the course's applicability
  • Generally needs more programming tasks. If this lesson follows on from the intro to ML course then you can:
  1. Set the data pre-processing as a task
  2. Get learners to report accuracy on both the training and test data from the outset
  3. Set a general task at the end allowing learners to train on more data, or on different data types
  • The course would really benefit from highlighting the advantages of random forests and gradient boosting. This can only be done by introducing more features sooner.
  • Reduce the amount of plotting. It is effective early on for visualising decision trees, but ineffective and time-consuming when comparing the later models.
  • Perhaps drop gradient boosting entirely. It is skimmed over so quickly that it conveys none of its benefits or differences relative to random forests.
  • To show the power of random forests, try running the model on highly correlated features
  • Ideally the code should not keep reassigning the mdl variable; instead, create a new variable for each model to make comparison easier
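The "one variable per model, with train/test accuracy from the outset" suggestions above could look something like the following minimal sketch. It assumes scikit-learn; the dataset is synthetic (and balanced 50:50, per the first bullet), and the names tree_mdl and forest_mdl are illustrative replacements for the reused mdl variable, not names from the course material.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic dataset with an even (50:50) class split, as suggested above
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.5, 0.5], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# One variable per model (hypothetical names), so both stay available
tree_mdl = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
forest_mdl = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Report training and test accuracy side by side from the start
for name, model in [("tree", tree_mdl), ("forest", forest_mdl)]:
    print(f"{name}: train={model.score(X_train, y_train):.3f} "
          f"test={model.score(X_test, y_test):.3f}")
```

Keeping both fitted models in scope makes the bias-variance point visible: the unpruned tree scores perfectly on training data while its test score lags, which is exactly the comparison learners are asked to make.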
