This is a cool notebook. Just a few thoughts for improvement that came up as I was going through it:
Clarify overall structure
Fewer L2 headers to really emphasise the core logical steps; relegate the rest to L3 headers. Also add a callout box near the start of the notebook which lays out the notebook structure. This gives the reader more context and a mental scaffold to keep in mind as they work through the detail.
- Fix duplicate section name 'data generation process'
Add adstock delays into the data generating process
This is already very good, but at the moment the transmission of impressions through 1 -> (2, 3) -> 4 is instantaneous. The data would be more realistic (and the case more strongly made) if you modelled the adstock delay between the impression measurements. If that's not worth doing, it would be good to explain why an instantaneous transmission approach is OK.
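For illustration, here's a minimal sketch of how the delay could be injected into the data generating process, assuming a simple geometric adstock (the coefficients, decay parameters, and noise scales below are made-up placeholders; if the notebook already defines an adstock transform, that should be reused instead):

```python
import numpy as np

def geometric_adstock(x, alpha, l_max=12):
    """Carry over a geometrically decaying fraction of past impressions for up to `l_max` lags."""
    weights = alpha ** np.arange(l_max)        # geometric decay weights
    return np.convolve(x, weights)[: len(x)]   # causal convolution, truncated to the original length

rng = np.random.default_rng(42)
n = 100
x1 = rng.gamma(2.0, 1.0, size=n)               # channel 1 impressions

# Channels 2 and 3 respond to a *delayed* version of channel 1, so the
# 1 -> (2, 3) transmission is no longer instantaneous...
x2 = 0.6 * geometric_adstock(x1, alpha=0.5) + rng.normal(0, 0.1, size=n)
x3 = 0.3 * geometric_adstock(x1, alpha=0.7) + rng.normal(0, 0.1, size=n)

# ...and channel 4 responds to delayed versions of channels 2 and 3.
x4 = (0.5 * geometric_adstock(x2, alpha=0.6)
      + 0.4 * geometric_adstock(x3, alpha=0.6)
      + rng.normal(0, 0.1, size=n))
```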
Add in a full SCM approach
The proposed modular solution is fine, but I think many people would ask if they could just model the full process and use that to estimate causal effects. Personally, I think it would be cool to model the full SCM in a PyMC model. Then you do parameter estimation, and then your g-computation: compare the actual Y (or the model-predicted Y) under the actual situation to the counterfactual situation where you use the do operator to set X1=0 for all time points. That would give you the total effect of X1 on Y. The benefit of this is that you still have your SCM of the world and can use it to work out the total effect of X2, or X4, etc. It might be that the model is complex and the parameter estimation is poor; if so, that lends weight to the existing proposed modular solution. Regardless of whether it works out well numerically, it would be very educational to show that you can take that approach.
(To clarify, I'm proposing to have that in addition to what's there already.)
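To make the g-computation part concrete, here's a very rough sketch of what I have in mind, using stand-in data and a deliberately simplified linear-Gaussian SCM over x1 -> (x2, x3) -> x4 -> y (all variable names, priors, and coefficients are placeholders, not a proposal for the actual model):

```python
import numpy as np
import pymc as pm

# Stand-in data with the same x1 -> (x2, x3) -> x4 -> y shape as the DAG;
# the "true" coefficients are made up purely so the snippet runs end to end.
rng = np.random.default_rng(0)
n = 200
x1 = rng.gamma(2.0, 1.0, size=n)
x2 = 0.6 * x1 + rng.normal(0, 0.1, n)
x3 = 0.3 * x1 + rng.normal(0, 0.1, n)
x4 = 0.5 * x2 + 0.4 * x3 + rng.normal(0, 0.1, n)
y = 1.0 + 2.0 * x4 + rng.normal(0, 0.2, n)

with pm.Model() as scm:
    # One structural equation (likelihood) per node, conditional on its parents.
    b12 = pm.Normal("b12", 0, 1)
    b13 = pm.Normal("b13", 0, 1)
    b24 = pm.Normal("b24", 0, 1)
    b34 = pm.Normal("b34", 0, 1)
    a_y = pm.Normal("a_y", 0, 1)
    b4y = pm.Normal("b4y", 0, 1)
    sigma = pm.HalfNormal("sigma", 1, shape=4)

    pm.Normal("x2_lik", mu=b12 * x1, sigma=sigma[0], observed=x2)
    pm.Normal("x3_lik", mu=b13 * x1, sigma=sigma[1], observed=x3)
    pm.Normal("x4_lik", mu=b24 * x2 + b34 * x3, sigma=sigma[2], observed=x4)
    pm.Normal("y_lik", mu=a_y + b4y * x4, sigma=sigma[3], observed=y)

    idata = pm.sample(random_seed=0)

# Manual g-computation: push the factual x1 and the intervened do(x1 = 0)
# through the posterior of the structural equations and compare the implied y.
def samples(name):
    # flatten (chain, draw) into one sample axis; keep a trailing axis for time
    return idata.posterior[name].values.reshape(-1, 1)

mu_x2 = samples("b12") * x1[None, :]
mu_x3 = samples("b13") * x1[None, :]
mu_x4 = samples("b24") * mu_x2 + samples("b34") * mu_x3
y_factual = samples("a_y") + samples("b4y") * mu_x4   # shape (n_samples, n_time)
y_do_zero = samples("a_y") + samples("b4y") * 0.0     # downstream means vanish when x1 = 0 (no intercepts in the x equations here)

total_effect_of_x1 = (y_factual - y_do_zero).sum(axis=1)  # cumulative effect over all time points
print(total_effect_of_x1.mean(), np.quantile(total_effect_of_x1, [0.03, 0.97]))
```

If I remember right, recent PyMC versions also expose a do model transform (pm.do) that could replace the manual substitution above, provided X1 gets an explicit prior in the model; either way, the point is that the fitted SCM supports the same counterfactual query for X2, X4, and so on.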
Tagging @cetagostini, who I assume is the author?