Statistics and DBA work really are two different disciplines, although from the outside we're both numbers people. I've learned the hard way that there's a lot that I don't know about how to set up a database. Likewise, I've had some database people push some very strange ideas about how to do analysis.
Take random samples. Unless I can actually see the code used to draw the sample, I'd rather do the random sampling myself. My favorite example of the problem: "we randomly gave you data from California."
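A minimal sketch of the kind of sampling code I'd want to be able to see (or write myself) -- the function name, seed value, and sample size are illustrative assumptions, not anything from an actual project:

```python
import random

def draw_sample(customer_ids, n, seed=20090531):
    """Draw a reproducible simple random sample of n customer IDs.

    An explicit seed means anyone can rerun the code and get exactly
    the same sample -- no "we randomly gave you data from California."
    """
    rng = random.Random(seed)      # isolated, seeded generator
    ids = sorted(customer_ids)     # sort so the draw doesn't depend on input order
    return rng.sample(ids, n)
```

A few lines, and the sampling becomes something you can inspect and rerun instead of something you have to take on faith.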
Time sensitivity is another issue. I was doing a customer attrition study for a cell phone company. We wanted to look at attrition over a year, so we needed customer data from the start of the year so we could see how it affected attrition over the following twelve months. What happened was that the database people, instead of following our instructions, gave us customer data from the end of the year instead of the start.
Why? "Don't you want the most current data possible?" It's the nature of reporting to pull the most current data possible for the report, and the idea that statistical analysis will often require data from the past is a little alien to that way of thinking.
Sunday, May 31, 2009
Bozoing Campaign Measurements - III
I've got another story from the customer cross-sell system I was talking about in Bozoing Measurements I.
We're talking about doing basic reporting on the system. Remember, we're holding out a control group. We were changing the control group process away from holding out an individual control group for each campaign (which actually caused a lot of problems -- more on that in a later post).
Now, the dead-obvious comparison is treatment vs. control. There are a couple of nuances we can add on. We can compare:
- Total treatment vs. total control
- The treatment and control that contact the company
- The treatment and control that have an opportunity to be marketed to
All happy: each of these is still treatment vs. control.
But then the senior DBA on the project says, "We shouldn't report on the control group that could be marketed to. That's a biased number."
Huh?
"That number is biased by the fact that we're taking out the customers that didn't contact us and that we couldn't market to."
His plan was to compare 1) the treatment group that contacted us and that we could market to (because the others clearly weren't affected by the program) to 2) the total control group. This would create a huge unfair effect favoring the treatment group, simply because customers who are actively contacting the company are much more likely to purchase new products. That may have been the DBA's hidden agenda: create reporting with a large built-in bias.
About that word bias: there's no such thing as a biased number. The number is what it is. Bias happens with unfair comparisons. We want treatment vs. control to be the only factor that differs in the comparison.
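For concreteness, here's a minimal sketch of what the fair versions of those comparisons look like, assuming a hypothetical customer-level file with group, contacted, marketable, and purchased columns (the names are mine, not the project's):

```python
import pandas as pd

# Hypothetical customer-level results: one row per customer, with the
# random assignment, the contact/marketable flags, and the outcome.
customers = pd.read_csv("crosssell_results.csv")

def purchase_rate(df):
    return df["purchased"].mean()

def compare(df, label):
    treat = df[df["group"] == "treatment"]
    ctrl = df[df["group"] == "control"]
    lift = purchase_rate(treat) - purchase_rate(ctrl)
    print(f"{label}: treatment {purchase_rate(treat):.3%}, "
          f"control {purchase_rate(ctrl):.3%}, lift {lift:.3%}")

# Whatever filter we apply, we apply it to BOTH groups, so treatment
# vs. control stays the only thing that differs.
compare(customers, "total")
compare(customers[customers["contacted"] == 1], "contacted the company")
compare(customers[customers["marketable"] == 1], "could be marketed to")
```

The DBA's version -- a filtered treatment group against the unfiltered control group -- breaks that symmetry, and the filter itself becomes part of the "effect" being reported.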
Labels: analysis, bias, control groups, measurements, reporting
Tuesday, May 19, 2009
Bozoing Campaign Measurements -- II
Our next contestant comes from the telecom world.
What the group was doing was evaluating marketing campaigns over the course of several years. Does an attrition-prevention campaign have any effect after three years? That's an absolutely wonderful question to ask, of course, but not the way they went about answering it.
The campaigns were a series of mailings that went out to customers who were about to go off contract, and the offer was a monetary reward for renewing their contract for a year. Each campaign had a carefully selected control group.
The dead-obvious thing to do is to compare the treatment group vs. the control group, but that's not what got done. What happened was that the analysis compared the whole control group to the customers in the treatment group who renewed their contract, because clearly "customers that didn't renew their contract weren't affected by the campaign."
Sound familiar?
Why doing the analysis this way is a bad idea: before the mailing on contract renewal, customers are going to have a certain basic affinity toward the company. Some are going to love it, some are going to hate it, and some are going to be on the fence. When the customers get the offer, the ones that already hate the company will toss it, the ones that love the company will take free money for staying with a company they like, and the ones on the fence may or may not take the offer and have their future behavior change. So, to a good extent a retention program like this isn't changing behavior; it's sorting customers into buckets based on how they already feel about the company.
Comparing "total control group" to "contract renewers" confounds two effects: one from the customers' predisposition toward the company and the other from having some customers renew their contracts for a reward. Moreover, this comparison doesn't actually answer the real question: does the program have a meaningful, measurable impact on churn? To answer the real question the right way, Keep Things Simple and Statistical and do a straight treatment vs. control comparison.
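Here's a hedged sketch of what that straight comparison can look like, using a standard two-proportion z-test; the churner counts below are placeholders, not numbers from the actual program:

```python
from math import sqrt, erfc

def two_proportion_ztest(churn_t, n_t, churn_c, n_c):
    """Compare churn rates for the full treatment and control groups."""
    p_t, p_c = churn_t / n_t, churn_c / n_c
    p_pool = (churn_t + churn_c) / (n_t + n_c)             # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_t + 1 / n_c))
    z = (p_t - p_c) / se
    p_value = erfc(abs(z) / sqrt(2))                       # two-sided normal p-value
    return p_t - p_c, z, p_value

# Placeholder counts: everyone mailed vs. everyone held out, churners in each.
diff, z, p = two_proportion_ztest(churn_t=1180, n_t=10000, churn_c=1285, n_c=10000)
print(f"churn difference {diff:.3%}, z = {z:.2f}, p = {p:.4f}")
```

Everyone who was mailed goes in the treatment column and everyone held out goes in the control column, whether or not they renewed -- that's the whole trick.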
Monday, May 18, 2009
How to Bozo Campaign Measurements
You know, at their heart statistical measurements are basically the easiest thing in the world to do, especially when it comes to direct marketing. Set up your test, randomly split the population, run the test, measure the results. It pretty much takes serious work to mess this up. It's amazing how many bright people leap at the chance to go the extra mile and find an inventive way to bozo a measurement.
The first exhibit is a database expert working on a customer contact project at a bank. A customer comes in and talks to the teller, and the system 1) randomly assigns the customer to the treatment or control group if this is the first time the customer has hit the system (otherwise it looks up the customer's existing status), and then 2) makes a suggestion for a product cross-sell. The teller may or may not use the suggestion, depending on how appropriate the teller thinks the offer is for the customer, how busy the branch is, and whether there's time to talk to the customer.
So now we've got the simplest test/control situation possible. What the DBA decided was to toss out all the customers where no offer was made, on the theory that if no offer was made then the program had no effect. So all the reporting was done on "total control group" vs. "treatment group that received the offer," creating a confounding effect. The teller's decision to make the offer or not was highly non-random. The kind of person who comes in at rush hour (when the teller's primary concern is handling customers and keeping wait times down) is going to be very different from the kind of person who comes in during the slow time in the middle of the afternoon.
The project team understood this confounding -- that their reporting was mixing up two different effects -- and talked for over two years about how to overcome it, when all they had to do was be lazier and report on the random split.
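As a sketch of how simple the random split itself is, here's one hedged way to do a stable assignment at first contact (the hash-based bucketing and the 10% hold-out rate are my assumptions; the real system presumably stored each customer's status in the database):

```python
import hashlib

CONTROL_FRACTION = 0.10   # hypothetical hold-out rate

def assign_group(customer_id: str) -> str:
    """Stable pseudo-random assignment at first contact.

    Hashing the customer ID gives the same answer every time the
    customer hits the system, so the sketch needs no separate status
    lookup; a production system would persist the assignment.
    """
    digest = hashlib.sha256(customer_id.encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF     # roughly uniform in [0, 1]
    return "control" if bucket < CONTROL_FRACTION else "treatment"
```

All the reporting then runs off that assignment -- treatment vs. control -- regardless of whether the teller ever got around to making the offer.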