Sunday, August 16, 2009

Bozoing Measurements VII

A while ago I saw a consultant give a presentation. He had been given 20 campaigns to analyze. He spent a lot of time discussing the one campaign that was significant at the 5% level.
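To see why one hit out of 20 is nothing to get excited about, here is a minimal sketch of the arithmetic. It assumes the 20 campaigns are independent and that none of them has any real effect; both assumptions are for illustration and are not something the presentation stated.

```python
# Chance that at least one of n tests comes up "significant" at level alpha
# when no campaign has any real effect. Assumes the tests are independent.
def prob_at_least_one_false_positive(n_tests: int, alpha: float) -> float:
    return 1.0 - (1.0 - alpha) ** n_tests

p = prob_at_least_one_false_positive(20, 0.05)
expected_hits = 20 * 0.05  # expected number of false positives
print(f"P(at least one significant result) = {p:.2f}")
print(f"Expected false positives = {expected_hits:.1f}")
```

Under those assumptions the chance of at least one "significant" campaign is roughly 64%, and one false positive is exactly what you would expect to find.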

Sunday, July 12, 2009

Bozoing Campaign Measurements VI

And the hits keep coming.

This story involves a tracking database. The database was tracking long-running campaigns. The process was that a customer 1) contacted the company via customer care and 2) at that point, was randomized on a by-campaign basis. Once there was customer activity, that customer was tracked for three months.

Here's where it gets tricky. On the next customer contact, the treatment group was given the pitch again if they still qualified, whereas the control group was automatically not given the pitch. That means in the treatment group the next contact generates a campaign-relevant data point, whereas in the control group it doesn't. Remember the three-month tracking? After three months any control group customers are dropped out of the database, whereas treatment group customers that are still in contact with the company are still tracked. These are long-running campaigns. So the control group was composed of customers that had at most a three-month window to take the offer, whereas the treatment group had a potentially unlimited time to take the offer. What a clever way to make sure the results are excellent!

I was once reviewing analysis of campaigns from this system. I was originally asked to make sure the t-test formula was right, and poked around in the data a little. I saw a weird thing: the campaign results were a linear function of the control group size. The smaller the control group, the better the results. I commented that they really shouldn't publish results until they had figured out what the Weird Thing was. Looking back, I can see how the database anomaly above could account for the effect. As time goes on, customers are going to be dropped out of the control group. Also, the treatment group will be given longer and longer to take the offer. So as time goes on, the control group numbers will fall and treatment group takes will rise.

So all the positive results that were being ascribed to the marketing system could have been due to the reporting anomalies.
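A toy model makes the window problem concrete. This is only a sketch: the monthly take rate is a made-up number, the campaign is assumed to do nothing at all, and only the unequal tracking windows are modeled (not the shrinking control group).

```python
# Hypothetical model: every customer, treated or not, has the same small
# chance of taking the product in any given month (the campaign does nothing).
def take_rate(monthly_rate: float, months_tracked: int) -> float:
    """Probability of at least one take within the tracking window."""
    return 1.0 - (1.0 - monthly_rate) ** months_tracked

MONTHLY_RATE = 0.01   # assumed, for illustration only
CONTROL_WINDOW = 3    # control customers are dropped after three months

for campaign_age in (3, 6, 12, 24):
    control = take_rate(MONTHLY_RATE, CONTROL_WINDOW)
    treatment = take_rate(MONTHLY_RATE, campaign_age)
    print(f"{campaign_age:>2} months: control {control:.3f}, "
          f"treatment {treatment:.3f}, apparent lift {treatment - control:+.3f}")
```

Even with identical customer behavior in both groups, the apparent lift grows with the age of the campaign, purely because the treatment group gets more months in which to take the offer.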

Saturday, June 27, 2009

Customers are Weird

Really, really weird.

Imagine a company with 2mm customers. Reasonable-sized, not huge.

How many people do you know well? Maybe 100 people? Think about the absolute weirdest person you know. That company has customers that are literally 100 times weirder than the weirdest person you know. In fact, they've got 200 of them.
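The arithmetic behind that, as a quick sketch. It reads "100 times weirder" as "100 times rarer", which is an interpretation rather than anything rigorous.

```python
customers = 2_000_000
people_you_know = 100   # so your weirdest acquaintance is about 1 in 100
times_weirder = 100     # "100 times weirder" read as 100 times rarer

one_in = people_you_know * times_weirder   # 1 in 10,000
expected_count = customers // one_in
print(f"1 in {one_in:,} -> about {expected_count} customers")
```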

It's a bad idea to think you know what customers are going to do without testing, measuring, and finding out.

Bozoing Campaign Measurements V

Here's a classic: toss all negative results.

Clearly, everything we do is positive, right?

Nope. Anything that can have an effect can have a negative effect.

(I've met a number of marketing people who really truly believe that people wait at home looking forward to their telemarketing calls. And that calling something 'viral' in a PowerPoint is enough to actually create a viral marketing campaign.)

There's another factor. Depressingly, a lot of marketing campaigns do absolutely nothing. Random noise takes over; half will be a little positive and half will be a little negative. Toss the negative results and you're left with a bunch of positive results. Add them up and suddenly you've got significant positive results from random noise. This is bad.
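Here is a small simulation of that effect. Everything in it is hypothetical: 40 campaigns, each with a true effect of exactly zero, and a measured lift that is pure Gaussian noise in arbitrary units.

```python
import random
import statistics

rng = random.Random(2009)  # fixed seed so the run is repeatable

# Hypothetical setup: 40 campaigns with zero true effect; each measured
# lift is pure noise around 0 (units are arbitrary).
measured_lifts = [rng.gauss(0.0, 1.0) for _ in range(40)]

# "Toss all negative results"
kept = [lift for lift in measured_lifts if lift > 0]

print(f"All campaigns:  n={len(measured_lifts)}, "
      f"mean lift={statistics.mean(measured_lifts):+.2f}")
print(f"Positives only: n={len(kept)}, "
      f"mean lift={statistics.mean(kept):+.2f}, total={sum(kept):+.2f}")
```

The mean over all campaigns hovers near zero, while the positives-only mean and total are positive by construction, no matter what the campaigns actually did.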

I've seen an interesting variant on this technique from a very well-paid consultant. Said VWPC analyzed 20 different campaigns and reported extensively on the one campaign that had results that were significant at a 5% level.

Sunday, June 7, 2009

Bozoing Campaign Measurements - IV

Another installment in the "How to Bozo Simple Campaign Analysis" series. I've got a lot of them. It's amazing how inventive people get when it comes to messing up data.

Anyway, this is from a customer onboarding program. When the company got a new customer, they would give them a call in a month to see how things were going. There was a carefully held out control group. The reporting, needless to say, wasn't test and control. It was "total control" vs. "the test group that listened to the whole onboarding message". The goal was to enhance customer retention.

The program directors were convinced that the "receive the call or not" decision was completely random, and that given it was completely random, the reporting should be concentrated on only those that were affected by the program (that again -- it's amazing how often the idea comes up).

Clearly, the decision to respond to telemarketing is a non-random decision, and I have no idea what lonely neurons fired in the directors' brains to make them think that. To start with, someone who is at home to take a call during business hours is going to come from a very different population than people who go to work. More importantly, a person who thinks highly of a company is much more likely to listen to a call than someone who isn't that fond of the company.

Unsurprisingly, the original reporting showed a strong positive result. When I finally did the test/control analysis, the result showed that there was no real effect from the campaign.
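The mechanism is easy to reproduce in a simulation. This is a sketch under invented assumptions: the call has no effect whatsoever, and a single made-up "affinity" score drives both the chance of listening to the whole call and the chance of staying a customer.

```python
import random

rng = random.Random(42)  # fixed seed for repeatability
N = 100_000              # customers per group (hypothetical)

def simulate_customer(in_test_group: bool):
    """Return (listened_to_call, retained) for one customer.

    Assumption: the call itself changes nothing. Affinity toward the
    company drives both willingness to listen and retention.
    """
    affinity = rng.random()  # 0 = dislikes the company, 1 = loves it
    listened = in_test_group and rng.random() < affinity
    retained = rng.random() < 0.5 + 0.4 * affinity
    return listened, retained

test = [simulate_customer(True) for _ in range(N)]
control = [simulate_customer(False) for _ in range(N)]

def retention(group):
    return sum(retained for _, retained in group) / len(group)

listeners = [customer for customer in test if customer[0]]

biased_lift = retention(listeners) - retention(control)
honest_lift = retention(test) - retention(control)
print(f"Listeners vs. total control:  {biased_lift:+.3f}")
print(f"Total test vs. total control: {honest_lift:+.3f}")
```

The listeners-only comparison shows a healthy positive lift from a program that does nothing, while the straight test vs. control comparison stays close to zero.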

Sunday, May 31, 2009

Statistics and DBAs

Statistics and DBA work really are two different disciplines, although from the outside we're both numbers people. I've learned the hard way that there's a lot that I don't know about how to set up a database. Likewise, I've had some database people push some very strange ideas about how to do analysis.

Take random samples. Unless I can actually see the code used to make random samples, I'd rather do random sampling myself. My favorite example of the problem was "we randomly gave you data from California".
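For what it's worth, the kind of code I would want to see is short. A minimal sketch, assuming the full list of customer IDs is available (the IDs, sample size, and seed below are placeholders):

```python
import random

def draw_sample(customer_ids, sample_size, seed):
    """Simple random sample without replacement, repeatable via the seed."""
    rng = random.Random(seed)
    return rng.sample(list(customer_ids), sample_size)

# Hypothetical customer IDs; in practice these would come from the full
# customer table, not from a pre-filtered extract.
all_ids = range(1, 10_001)
sample = draw_sample(all_ids, 500, seed=20090531)
print(len(sample), len(set(sample)))
```

Recording the seed means anyone can rerun the draw and check that the sample really came from the whole population.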

Time sensitivity is another issue. I was making a customer attrition study for a cell phone company. We wanted to look at attrition over a year, so we needed customer data from the start of the year to see how it affects attrition. What happened was that the database people, instead of following our instructions, gave us customer data from the end of the year instead of the start.

Why? "Don't you want the most current data possible?" It's the nature of reporting to get the most current data possible for the report, and the idea that statistical analysis will often require data from the past is a little alien to that way of thinking.

Bozoing Campaign Measurements - III

I've got another story from the customer cross-sell system I was talking about in Bozoing Measurements I.

We're talking about doing basic reporting on the system. Remember, we're keeping out a control group. We were moving the control group process away from keeping out individual control groups for each campaign (which caused a lot of problems actually -- more in a later post).

Now, the dead obvious comparison is treatment and control. There are a couple of nuances we can add on. We can compare

  • Total treatment vs. total control

  • The treatment and control that contact the company

  • The treatment and control that have an opportunity to be marketed to

All happy, all treatment vs. control.

But then the senior DBA in the project says "We shouldn't report on the control group that could be marketed to. That's a biased number".


"That number is biased by the fact that we're taking out the customers that didn't contact us and that we couldn't market to."

His plan was to compare 1) the treatment group that contacted us and that we could market to (because the others clearly weren't affected by the program) to 2) the total control group. This would create a huge unfair effect favoring the treatment group, simply because the customers that are actively contacting the company are much more likely to purchase new products. That may have been the hidden agenda that the DBA had: create reporting that would have a large built-in bias.

About that word bias: there's no such thing as a biased number. The number is what it is. Bias happens with unfair comparisons. We want the treatment and control factor to be the only factor in the comparison.

Tuesday, May 19, 2009

Bozoing Campaign Measurements -- II

Our next contestant comes from the telecom world.

What the group was doing was evaluating marketing campaigns over the course of several years. Does an attrition-prevention campaign have any effect after three years? This is an absolutely wonderful thing to do, of course, but not the way they went about it.

The campaigns were in a series of mailings that went out to customers that were about to go off contract, and the offer was a monetary reward to renew their contract for a year. Each campaign had a carefully selected control group.

The dead-obvious thing to do is to compare the treatment group vs. the control group, but that's not what got done. What happened was the analysis compared the whole control group to the customers in the treatment group that renewed their contract, because clearly "customers that didn't renew their contract weren't affected by the campaign".

Sound familiar?

Here's why doing the analysis this way is a bad idea: before the mailing on contract renewal, customers are going to have a certain basic affinity towards the company. Some are going to love it, some are going to hate it, some are going to be on the fence. When the customers get the offer, the ones that already hate the company will toss the offer, the ones that love the company will take free money for staying with a company they like, and the ones on the fence may or may not take the offer and have their future behavior change. So, to a large extent, a retention program like this isn't changing behavior but instead is sorting the customers into buckets based on how they already feel about the company. Comparing "total control group" to "contract renewers" confounds two effects: the customers' predisposition to the company, and the effect of having some customers renew their contracts for a reward. Moreover, this comparison doesn't actually answer the real question: does the program have a meaningful, measurable impact on churn? To answer the real question in the right way, Keep Things Simple and Statistical and do a straight treatment vs. control.
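A straight treatment vs. control comparison can be as small as the sketch below. It uses a pooled two-proportion z-test with a normal approximation, which is one reasonable choice for a yes/no outcome like "still a customer" and not necessarily what the group should have used; the counts are made up.

```python
import math

def two_proportion_z_test(success_t, n_t, success_c, n_c):
    """Compare full treatment vs. full control.

    Returns (lift, z, two-sided p-value) using a pooled standard error.
    """
    p_t, p_c = success_t / n_t, success_c / n_c
    pooled = (success_t + success_c) / (n_t + n_c)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_c))
    z = (p_t - p_c) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return p_t - p_c, z, p_value

# Made-up counts: customers still active at the end of the period,
# out of everyone randomized into each group.
lift, z, p_value = two_proportion_z_test(8_150, 10_000, 8_000, 10_000)
print(f"lift={lift:+.4f}, z={z:.2f}, p={p_value:.4f}")
```

The important part is the denominators: everyone randomized into treatment and everyone randomized into control, with nobody filtered out by what they did afterwards.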

Monday, May 18, 2009

How to Bozo Campaign Measurements

You know, at their heart statistical measurements are basically the easiest thing in the world to do, especially when it comes to direct marketing. Set up your test, randomly split the population, run the test, measure the results. It pretty much takes serious work to mess this up. It's amazing how many bright people leap at the chance to go the extra mile and find an inventive way to bozo a measurement.

The first exhibit is a database expert working for a customer contact project at a bank. A customer comes in, talks to the teller, and the system 1) randomly assigns the customer to the control group or not if this is the first time the customer has hit the system, otherwise it looks up the customer's status and then 2) makes a suggestion for a product cross-sell. The teller may or may not use the suggestion, depending on how appropriate the teller thinks the offer is for the customer and/or how busy the branch is and if there is time available to talk to the customer.

So now, we've got the simplest test/control situation possible. What the DBA decided was to toss out all the customers where no offer was made, on the theory that if no offer was made then the program had no effect. So, all the reporting was done on "total control group" vs. "treatment group that received the offer", creating a confounding effect. The teller decision to make the offer or not was highly non-random. The kind of person that comes in at rush hour (where the primary concern of the teller is handling customers and keeping wait times down) is going to be very different from the kind of person that comes during the slow time in the middle of the afternoon.

The project team understood this confounding, that in their reporting they were mixing up two different effects, and talked for over two years about how to overcome this confounding when all they had to do was be lazier and report on the random split.

Friday, February 13, 2009

The Data Daemon

Apropos of "Murphy's Laws of Data", I find it useful to imagine that data is created by a little daemon and his job is to make me look like a durn fool.