We're talking about doing basic reporting on the system. Remember, we're keeping out a control group. We were moving the control group process away from keeping out a separate control group for each campaign (which actually caused a lot of problems -- more in a later post).
Now, the dead obvious comparison is treatment vs. control. There are a couple of nuances we can add on. We can compare:
- Total treatment vs. total control
- The treatment and control customers who contact the company
- The treatment and control customers who have an opportunity to be marketed to
All happy, all treatment vs. control.
But then the senior DBA on the project says, "We shouldn't report on the control group that could be marketed to. That's a biased number."
"That number is biased by the fact that we're taking out the customers that didn't contact us and that we couldn't market to."
His plan was to compare 1) the treatment group that contacted us and that we could market to (because the others clearly weren't affected by the program) to 2) the total control group. This would create a huge, unfair effect favoring the treatment group, simply because customers who actively contact the company are much more likely to purchase new products, whether or not anyone markets to them. That may have been the DBA's hidden agenda: create reporting with a large built-in bias.
About that word bias: there's no such thing as a biased number. The number is what it is. Bias happens with unfair comparisons. We want treatment vs. control to be the only difference between the two groups we compare.
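To make the unfair comparison concrete, here is a minimal sketch with made-up numbers. Every count below is hypothetical, not taken from the actual project: each group has 10,000 customers, 20% of them contact the company, contactors buy at 10% and everyone else at 2%. The program is assumed to have no effect at all, so any difference a comparison reports is pure bias.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Customer and purchase counts for one slice of a group."""
    customers: int
    purchases: int

    @property
    def rate(self) -> float:
        return self.purchases / self.customers

    def __add__(self, other: "Segment") -> "Segment":
        return Segment(self.customers + other.customers,
                       self.purchases + other.purchases)


# Hypothetical counts. Treatment and control are identical on purpose:
# the assumed true effect of the program is zero.
treat_contacted = Segment(customers=2000, purchases=200)  # 10% buy
treat_other = Segment(customers=8000, purchases=160)      # 2% buy
ctrl_contacted = Segment(customers=2000, purchases=200)   # 10% buy
ctrl_other = Segment(customers=8000, purchases=160)       # 2% buy

treat_total = treat_contacted + treat_other
ctrl_total = ctrl_contacted + ctrl_other

# Fair comparisons: the same filter is applied to both groups.
fair_total = treat_total.rate - ctrl_total.rate
fair_contacted = treat_contacted.rate - ctrl_contacted.rate

# The proposed comparison: filtered treatment vs. unfiltered control.
mismatched = treat_contacted.rate - ctrl_total.rate

print(f"total vs. total:             {fair_total:+.3f}")      # +0.000
print(f"contacted vs. contacted:     {fair_contacted:+.3f}")  # +0.000
print(f"contacted vs. total control: {mismatched:+.3f}")      # +0.064
```

Both like-for-like comparisons report zero, which matches the assumption. The mismatched one reports a lift of 6.4 points for a program that does nothing, because the filter on contacting the company was applied to only one side.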