Sunday, August 16, 2009

Bozoing Measurements VII

A while ago I saw a consultant give a presentation. He had been given 20 campaigns to analyze. He spent a lot of time discussing the one campaign that was significant at the 5% level.
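For scale (my arithmetic, not his): if all 20 campaigns truly did nothing, you'd still expect roughly one of them to clear the 5% bar by chance, and the odds of at least one false positive are around 64%. A quick back-of-the-envelope check:

```python
# If 20 truly null campaigns are each tested at the 5% level, how often
# does at least one come up "significant" by chance alone?
alpha = 0.05
n_campaigns = 20

expected_false_positives = alpha * n_campaigns       # about 1.0
p_at_least_one = 1 - (1 - alpha) ** n_campaigns      # about 0.64

print(f"Expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one 'significant' campaign): {p_at_least_one:.2f}")
```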

Sunday, July 12, 2009

Bozoing Campaign Measurements VI

And the hits keep coming.

This story involves a tracking database. The database was tracking long-running campaigns, where the process was that a customer 1) contacted the company via customer care and 2) at that point was randomized on a by-campaign basis. Once there was customer activity, that customer was tracked for three months.

Here's where it gets tricky. On the next customer contact, the treatment group was given the pitch again if they still qualified, whereas the control group was automatically not given the pitch. That means in the treatment group the next contact generates a campaign-relevant data point, whereas in the control group it doesn't. Remember the three-month tracking? After three months, control group customers are dropped out of the database, whereas treatment group customers that are still in contact with the company are still tracked. These are long-running campaigns. So the control group was composed of customers that had at most a three-month window to take the offer, whereas the treatment group had a potentially unlimited time to take the offer. What a clever way to make sure the results are excellent!

I was once reviewing an analysis of campaigns from this system. I was originally asked to make sure the T-Test formula was right, and poked around in the data a little. I saw a weird thing: the campaign results were a linear function of the control group size. The smaller the control group, the better the results. I commented that they really shouldn't publish results until they had figured out what the Weird Thing was. Looking back, I can see how the database anomaly above could account for the effect. As time goes on, customers are going to be dropped out of the control group. Also, the treatment group will be given longer and longer to take the offer. So as time goes on, the control group numbers will fall and treatment group takes will rise.
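To see how the tracking anomaly alone can manufacture a "lift", here's a minimal simulation I put together (all rates and counts invented): both groups take the offer at the same underlying monthly rate, but control takes are only counted in a three-month window while treatment takes keep accumulating for as long as the campaign has been running.

```python
import numpy as np

# Minimal sketch of the reporting anomaly. Both groups have the SAME
# monthly take rate -- there is no true campaign effect. Control takes are
# only counted in a 3-month window; treatment takes keep accumulating for
# as long as the campaign has been running.
rng = np.random.default_rng(0)

monthly_take_rate = 0.01
n_per_group = 10_000

def takes(months_tracked):
    """Count customers with at least one 'take' over the tracking window."""
    took = rng.random((n_per_group, months_tracked)) < monthly_take_rate
    return took.any(axis=1).sum()

for campaign_age_months in (3, 6, 12, 24):
    treatment_takes = takes(campaign_age_months)  # tracked for campaign life
    control_takes = takes(3)                      # dropped after 3 months
    print(f"{campaign_age_months:>2} months running: "
          f"treatment {treatment_takes / n_per_group:.3%} vs "
          f"control {control_takes / n_per_group:.3%}")
```

Even with identical behavior in both groups, the reported lift grows as the campaign ages, which is the same flavor of effect as the smaller-control, better-results pattern.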

So all the positive results that were being ascribed to the marketing system could have been due to the reporting anomalies.

Saturday, June 27, 2009

Customers are Weird

Really, really weird.

Imagine a company with 2 million customers. Reasonable-sized, not huge.

How many people do you know well? Maybe 100 people? Think about the absolute weirdest person you know. That company has customers that are literally 100 times weirder than the weirdest person you know. In fact, they've got 200 of them.
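The arithmetic behind that, as I'm reading it (a sketch with my numbers): the weirdest of the 100 people you know is roughly a 1-in-100 person, so "100 times weirder" is roughly a 1-in-10,000 person, and a 2 million customer base holds about 200 of those.

```python
# Back-of-the-envelope version of the "200 weird customers" claim.
people_you_know = 100
weirdness_multiplier = 100                        # "100 times weirder"
customers = 2_000_000

rarity = people_you_know * weirdness_multiplier   # a 1-in-10,000 person
print(customers / rarity)                         # about 200 of them
```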

It's a bad idea to think you know what customers are going to do without testing, measuring, and finding out.

Bozoing Campaign Measurements V

Here's a classic: toss all negative results.

Clearly, everything we do is positive, right?

Nope. Anything that can have an effect can have a negative effect.

(I've met a number of marketing people that really truly believe that people wait at home looking forward to their telemarketing calls. And that calling something 'viral' in a PowerPoint is enough to actually create a viral marketing campaign.)

There's another factor. Depressingly, a lot of marketing campaigns do absolutely nothing. Random noise takes over; half will be a little positive and half will be a little negative. Toss the negative results and you're left with a bunch of positive results. Add them up and suddenly you've got significant positive results from random noise. This is bad.
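A quick simulation makes the point (numbers made up): run a pile of campaigns that truly do nothing, keep only the ones that happened to come out positive, and the "total lift" looks impressive.

```python
import numpy as np

# Sketch of "toss all negative results" applied to campaigns that do nothing.
# Every campaign has zero true effect; the measured lift is pure noise.
rng = np.random.default_rng(1)

n_campaigns = 50
measured_lift = rng.normal(loc=0.0, scale=1.0, size=n_campaigns)

honest_total = measured_lift.sum()                               # hovers around zero
tossed_negatives_total = measured_lift[measured_lift > 0].sum()  # always positive

print(f"Total lift, all campaigns:        {honest_total:+.1f}")
print(f"Total lift, negatives tossed out: {tossed_negatives_total:+.1f}")
```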

I've seen an interesting variant on this technique from a very well-paid consultant. Said VWPC analyzed 20 different campaigns and reported extensively on the one campaign that had results that were significant at a 5% level.

Sunday, June 7, 2009

Bozoing Campaign Measurements - IV

Another installment in the "How to Bozo Simple Campaign Analysis" series. I've got a lot of them. It's amazing how inventive people get when it comes to messing up data.

Anyway, this is from a customer onboarding program. When the company got a new customer, they would give them a call in a month to see how things were going. There was a carefully held-out control group. The reporting, needless to say, wasn't test and control. It was "total control" vs. "the test group that listened to the whole onboarding message". The goal was to enhance customer retention.

The program directors were convinced that the "receive the call or not" decision was completely random; and given that it was completely random, the reporting should concentrate only on those that were affected by the program (that idea again -- it's amazing how often it comes up).

Clearly, the decision to respond to telemarketing is a non-random decision, and I have no idea what lonely neurons fired in the directors' brains to make them think that. To start with, someone who is at home to take a call during business hours is going to be from a very different population than people who go to work. More importantly, a person that thinks highly of a company is much more likely to listen to a call than someone who isn't that fond of the company.

Unsurprisingly, the original reporting showed a strong positive result. When I finally did the test/control analysis, the result showed that there was no real effect from the campaign.
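When I say test/control analysis, I mean the plain comparison: everyone assigned to get the call, whether or not they answered or listened to the whole message, against the full held-out control group. A minimal sketch of that kind of check (the counts here are invented):

```python
from math import sqrt
from scipy.stats import norm

# Intent-to-treat style comparison: full treatment group vs. full control
# group on retention, regardless of who actually took the call.
# The counts below are invented for illustration.
retained_treat, n_treat = 8_950, 10_000   # retained / assigned to get the call
retained_ctrl, n_ctrl = 4_470, 5_000      # retained / held-out control

p_treat = retained_treat / n_treat
p_ctrl = retained_ctrl / n_ctrl
p_pool = (retained_treat + retained_ctrl) / (n_treat + n_ctrl)

# Two-proportion z-test on the retention rates.
se = sqrt(p_pool * (1 - p_pool) * (1 / n_treat + 1 / n_ctrl))
z = (p_treat - p_ctrl) / se
p_value = 2 * norm.sf(abs(z))

print(f"Retention: treatment {p_treat:.1%}, control {p_ctrl:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```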

Sunday, May 31, 2009

Statistics and DBAs

Statistics and DBA work really are two different disciplines, although from the outside we're both numbers people. I've learned the hard way that there's a lot that I don't know about how to set up a database. Likewise, I've had some database people push some very strange ideas about how to do analysis.

Take random samples. Unless I can actually see the code used to make random samples, I'd rather do random sampling myself. My favorite example of the problem was "we randomly gave you data from California".
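What "do it myself" looks like in practice: get the full list of customer IDs (or at least the keys) and draw the sample with a seeded random number generator, so the draw is reproducible and everyone can see exactly how it was made. A sketch, assuming the IDs arrive one per line in a text file (the file names are hypothetical):

```python
import random

# Draw my own reproducible random sample from a full list of customer IDs,
# rather than trusting an unseen "random" extract from the database.
with open("customer_ids.txt") as f:
    customer_ids = [line.strip() for line in f if line.strip()]

rng = random.Random(20090531)            # fixed seed so the sample can be redrawn
sample = rng.sample(customer_ids, k=10_000)

with open("customer_sample.txt", "w") as f:
    f.write("\n".join(sample))
```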

Time sensitivity is another issue. I was doing a customer attrition study for a cell phone company. We wanted to look at attrition over a year, so we needed customer data from the start of the year to see how it affected attrition over the following twelve months. What happened was that the database people, instead of following our instructions, gave us customer data from the end of the year.

Why? "Don't you want the most current data possible?" It's the nature of reporting to get the most current data possible for the report, and understanding statistical analysis that will often require data from the past is a little alien to that way of thinking.

Bozoing Campaign Measurements - III

I've got another story from the customer cross-sell system I was talking about in Bozoing Measurements I.

We're talking about doing basic reporting on the system. Remember, we're keeping out a control group. We were changing the control group process away from keeping out individual control groups for each campaign (which actually caused a lot of problems -- more in a later post).

Now, the dead obvious comparison is treatment and control. There are a couple of nuances we can add on. We can compare

  • Total treatment vs. total control

  • The treatment and control that contact the company

  • The treatment and control that have an opportunity to be marketed to


All happy, all treatment vs. control.

But then the senior DBA on the project says "We shouldn't report on the control group that could be marketed to. That's a biased number".

Huh?

"That number is biased by the fact that we're taking out the customers that didn't contact us and that we couldn't market to."

His plan was to compare 1) the treatment group that contacted us and that we could market to (because the others clearly weren't affected by the program) to 2) the total control group. This would create a huge unfair effect favoring the treatment group, simply because customers that are actively contacting the company are much more likely to purchase new products. That may have been the hidden agenda the DBA had: create reporting with a large built-in bias.

About that word bias: there's no such thing as a biased number. The number is what it is. Bias happens with unfair comparisons. We want treatment vs. control to be the only factor that differs in the comparison.
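To put a number on how unfair that comparison is, here's a toy simulation (all rates invented, and there is no campaign effect anywhere in it): customers who contact the company simply buy more. Compare just the contacting slice of the treatment group to the entire control group and a big "lift" appears out of nothing; compare like with like and it disappears.

```python
import numpy as np

# Toy model of the DBA's proposed comparison. There is NO campaign effect:
# treatment and control behave identically. Customers who contact the
# company just buy at a higher rate than customers who don't.
rng = np.random.default_rng(2)

n = 100_000
contact_rate = 0.30
buy_rate_if_contact = 0.10
buy_rate_if_no_contact = 0.02

def simulate_group():
    contacted = rng.random(n) < contact_rate
    rate = np.where(contacted, buy_rate_if_contact, buy_rate_if_no_contact)
    bought = rng.random(n) < rate
    return contacted, bought

t_contacted, t_bought = simulate_group()
c_contacted, c_bought = simulate_group()

# The DBA's plan: treatment customers who contacted us vs. the TOTAL control.
biased_treat = t_bought[t_contacted].mean()
biased_ctrl = c_bought.mean()

# The fair versions: total vs. total, or contacted vs. contacted.
fair_total = (t_bought.mean(), c_bought.mean())
fair_contacted = (t_bought[t_contacted].mean(), c_bought[c_contacted].mean())

print(f"Biased comparison: treatment {biased_treat:.1%} vs control {biased_ctrl:.1%}")
print(f"Fair (total):      treatment {fair_total[0]:.1%} vs control {fair_total[1]:.1%}")
print(f"Fair (contacted):  treatment {fair_contacted[0]:.1%} vs control {fair_contacted[1]:.1%}")
```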