How to Conduct a Proper Mystery Shop
Yesterday’s piece on the pitfalls of mystery shopping brought, I hope, some important perspective on the dangerously deceptive data that a poorly designed mystery shop can bring in.
So, how does one conduct a proper mystery shop?
Step 1: Develop a management problem statement. Mystery shops, like any other sort of research, should never be done just for the sake of doing them. They should have a purpose behind them that is clearly defined by management. I recommend that this be turned into a formal statement called a management problem. The more specific in scope and application, the better.
BAD PROBLEM STATEMENT
“The management of ABZ Retailer would like to ensure that front-line sales staff members are giving customers a superior customer experience.”
GOOD PROBLEM STATEMENT
“The management of ABZ Retailer would like to improve the ABZ employee training program by conducting mystery shops within a random sample of stores and determining how well employees are adhering to the guidelines set for them during the training process.”
Note that the bad statement is likely to be the original reason for the study, whereas the good statement takes the concept and shapes it into something that can actually be researched and acted upon. It is extremely important that the ultimate result of the study be actionable from a management perspective. Without a clear idea of what you want to do with the data, it can be counter-productive (and even a negative influence on morale!) to conduct a mystery shop study.
Step 2: Develop research objectives. Research objectives help you to narrow down what you really want to learn from the research. There are two ways of developing these: by asking general research questions and determining the specific objectives you’ll need to answer those questions, or by stating general research objectives and determining which questions you need to ask to meet them. Either method is valid, and both will create a hierarchy that will look something like this:
RO1: Determine how well employees are adhering to the dress code.
- RQ1.1: Are employees wearing name tags?
- RQ1.2: Are employees wearing proper uniforms?
- RQ1.3: Are employees wearing proper shoes?
- RQ1.4: Are employees smiling?
RO2: Determine how customers are being greeted when they walk in the store.
- RQ2.1: Are customers being greeted within 60 seconds by an employee?
- RQ2.2: Are employees greeting customers from behind the counter or walking out onto the floor?
- RQ2.3: Are employees using a proper greeting or asking a discouraged phrase like, “May I help you?”
- RQ2.4: If any employee is on the phone when the customer walks in, is the employee interrupting the call to briefly acknowledge the customer?
- RQ2.5: Are employees engaging in any behavior when the customer enters the store that appears to be improper or unprofessional?
And so on. The advantage of creating this hierarchy is that it gives you a very nice outline for your survey instrument later on. It also helps you to avoid asking those “everything but the kitchen sink” questions that are not relevant to the study.
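If you keep your planning notes in a script rather than a document, the hierarchy above can be sketched as a simple mapping from objectives to questions. This is only an illustration: the `OBJECTIVES` dictionary and the `survey_outline` function are names I made up, and only the first objective and a shortened second one are filled in.

```python
# Hypothetical structure: each research objective maps to its research questions.
OBJECTIVES = {
    "Determine how well employees are adhering to the dress code.": [
        "Are employees wearing name tags?",
        "Are employees wearing proper uniforms?",
        "Are employees wearing proper shoes?",
        "Are employees smiling?",
    ],
    "Determine how customers are being greeted when they walk in the store.": [
        "Are customers being greeted within 60 seconds by an employee?",
    ],
}


def survey_outline(objectives):
    """Number the objectives and questions in the RO/RQ style used above."""
    lines = []
    for ro_number, (objective, questions) in enumerate(objectives.items(), start=1):
        lines.append(f"RO{ro_number}: {objective}")
        for rq_number, question in enumerate(questions, start=1):
            lines.append(f"  RQ{ro_number}.{rq_number}: {question}")
    return lines


outline = survey_outline(OBJECTIVES)
for line in outline:
    print(line)
```

Because every question hangs off an objective, a question that does not fit under any objective has nowhere to go, which is a handy check against kitchen-sink questions.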
Step 3: Determine your action standards. Ultimately, the goal of marketing research should be to develop action standards that are a direct result of the research. It’s tempting to just gather information and respond to it once you’ve got it, but it’s far better to have a clear idea of what you want to do with the information before it’s collected. This will keep you from overreacting to negative data and help you to keep sight of the bigger picture.
Good action standards are based on research findings, and should be tied to the major themes of the study. At the same time, you can establish standards for correcting specific behaviors if they fall below a certain threshold.
BAD ACTION STANDARD
AS1: If employees are found to fail to live up to the training standards, they will automatically be enrolled in a remedial training program.
GOOD ACTION STANDARDS
AS1: If it is determined that an aspect of the uniform and dress code policy is being adhered to by fewer than 75% of employees, management will require that all front-line employees review the dress code policy and sign a statement indicating that they agree to adhere to it.
AS2: If it is determined that more than 25% of front-line employees are using discouraged phrases such as, “May I help you?” to open a sale, all front-line employees will be required to undergo a one-on-one sales training coaching session with a store manager or district manager.
AS3: If it is determined during two or more shops of a single store that employees are engaging in unprofessional or inappropriate behavior when a customer walks into the store, the district manager will be asked to investigate the situation and enforce the appropriate disciplinary protocols.
These action standards are clear, they’re specific, and they lay out the protocols for dealing with problems in a manner that is preemptive rather than reactionary. They also give the management involved in the design of the study time to determine what the acceptable thresholds are for negative marks on a mystery shop so that there is less of an opportunity to overreact to them after the data has come in.
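One way to keep yourself honest about thresholds is to write them down in a form that can be checked mechanically once the data arrives. Here is a minimal sketch of that idea for AS1 and AS2. The `ActionStandard` class, the metric names, and the example figures are all hypothetical, and the results are assumed to be shares of employees between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class ActionStandard:
    label: str
    metric: str        # key in the results dictionary (hypothetical names)
    threshold: float   # share of employees, between 0 and 1
    trigger_when: str  # "below" or "above" the threshold
    action: str


def triggered_actions(results, standards):
    """Return the standards whose thresholds were crossed by the results."""
    fired = []
    for standard in standards:
        value = results[standard.metric]
        if standard.trigger_when == "below" and value < standard.threshold:
            fired.append(standard)
        elif standard.trigger_when == "above" and value > standard.threshold:
            fired.append(standard)
    return fired


STANDARDS = [
    ActionStandard("AS1", "dress_code_compliance", 0.75, "below",
                   "All front-line employees review and sign the dress code policy."),
    ActionStandard("AS2", "discouraged_greeting_rate", 0.25, "above",
                   "All front-line employees attend a one-on-one sales coaching session."),
]

# Made-up aggregate results, for illustration only.
example_results = {"dress_code_compliance": 0.82, "discouraged_greeting_rate": 0.31}

for standard in triggered_actions(example_results, STANDARDS):
    print(standard.label, "->", standard.action)
```

With these made-up numbers only AS2 fires, because dress code compliance is above its 75% floor while discouraged greetings exceed the 25% ceiling. The point is that the decision was made before anyone saw the data.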
Step 4: Generate a sample. Depending upon the size of your business and the number of outlets, it may be in your best interests to generate a random sample for each phase of the study rather than trying to take a census of all stores. The advantages of taking a random sample are that you can complete the study much more quickly and at a lower cost. A census should only be used if you have a small number of locations. It is probably not necessary for a business with more than 100 stores.
As a general rule, you will need to shop at least 30 stores from your random sample to have any sort of statistical projectability to your data. All of your statistics will have error associated with them, so the more stores you can shop, the less error these statistics will have.
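To make the point about error concrete, here is a rough sketch that draws a simple random sample of stores and estimates the margin of error on a yes/no result. It assumes a normal approximation at roughly 95% confidence and treats each store as a single observation, which is a simplification. The store IDs and function names are hypothetical.

```python
import math
import random


def draw_store_sample(store_ids, sample_size, seed=None):
    """Draw a simple random sample of stores without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(store_ids), sample_size)


def margin_of_error(p, n, population=None, z=1.96):
    """Approximate margin of error for a proportion p observed in n stores.

    If the total number of stores is given, a finite population correction
    is applied, which shrinks the error when the sample is a large share
    of the chain.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    if population is not None and population > 1:
        if n > population:
            raise ValueError("sample cannot be larger than the population")
        standard_error *= math.sqrt((population - n) / (population - 1))
    return z * standard_error


# Hypothetical chain of 200 stores.
all_stores = [f"store-{number:03d}" for number in range(1, 201)]
sample = draw_store_sample(all_stores, 30, seed=42)

for n in (30, 60, 100):
    print(n, round(margin_of_error(0.5, n), 3))
```

Under these assumptions, a 50% result from 30 stores carries a margin of roughly 18 percentage points either way before any population correction, and the margin narrows as the number of stores grows.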
Step 5: Determine the number of shops. I recommend doing three shops per location for each phase of the study. This will help you to mitigate outlier behavior (such as an employee who forgot his or her name tag one day or a busy assistant manager who is trying to juggle multiple tasks while running the sales floor because a part-timer called in sick) and also help you to determine if improper behavior is a pattern at a particular store. One shop is simply not enough to tell you whether or not a store has a need for better training.
I recommend that you use different shoppers for each of these shops to decrease the effects of personal bias. If three different shoppers all have the same negative impression, it’s much more likely to be true than if one shopper has a bad experience the first time and goes in expecting that experience the next two times.
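If you are scheduling shops yourself, the "three shops, three different shoppers" rule is easy to enforce in a few lines. This sketch assumes you have a list of store IDs and a pool of shopper IDs, both hypothetical. It does not try to balance workload or travel distance.

```python
import random


def assign_shoppers(store_ids, shopper_ids, shops_per_store=3, seed=None):
    """Give every store a set of distinct shoppers, one per shop."""
    pool = list(dict.fromkeys(shopper_ids))  # drop duplicate shopper IDs
    if len(pool) < shops_per_store:
        raise ValueError("not enough distinct shoppers for the shops per store")
    rng = random.Random(seed)
    # random.sample draws without replacement, so no shopper repeats at a store.
    return {store: rng.sample(pool, shops_per_store) for store in store_ids}


stores = ["store-014", "store-077", "store-131"]
shoppers = ["shopper-A", "shopper-B", "shopper-C", "shopper-D", "shopper-E"]
schedule = assign_shoppers(stores, shoppers, seed=7)

for store, assigned in schedule.items():
    print(store, assigned)
```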
Step 6: Develop a survey instrument. This should be a one- or two-page document that shoppers can fill out after their experience. It should follow your research objectives closely and only ask the questions that are important for the research. For example, if you are not concerned about the relative cleanliness of a store, it should not be a question on the survey.
It is easier to evaluate mystery shops if shoppers are required to rate individual elements on a scale and then provide comments to clarify their rating. This scale does not need to be very sensitive; I recommend three points of distinction for subjective questions (“How enthusiastic was the greeting you received when you walked in the door?” “Not enthusiastic,” “Somewhat enthusiastic,” “Very enthusiastic”) and two points of distinction for compliance issues (“Was the employee wearing a nametag?” “No” “Yes.”). If you feel the subjective scale needs to be more sensitive, I would not take it beyond five points, simply because scores may vary due to individual shopper bias.
Step 7: Train your shoppers. Don’t outsource mystery shopping to a national firm and wash your hands of it — make sure you train all of your shoppers to do their job properly, even if that means you need to train them over the phone or on Skype. This can be time-consuming, but it will ensure that you explain the entire process to them and help them to understand your expectations for what they are looking for.
Don’t tell them too much about the aims of the study, though; just focus on the survey instrument. The less a shopper knows about the big picture, the better: a shopper who doesn’t know what you are really hoping to find will record what actually happened rather than what he or she thinks you want to see.
Step 8: Validate shoppers. If you want to ensure that your shoppers actually make a purchase, I suggest giving them a gift card or some other form of currency that’s not cash in advance for their shopping experience. Mystery shopping studies that make shoppers submit receipts or expenses for reimbursement encourage shoppers to pull all sorts of shenanigans, whereas gift cards can be electronically validated to ensure that the purchase matches up with the time that the shopper was allegedly in the store. This will give them more of an incentive to actually do the shop — there’s no cash to pocket, and the gift card can be canceled if they don’t live up to their end of the bargain. It will be easier for them to just do the shop.
Step 9: Keep the data blind. Once the shops start coming in, identify each store by an arbitrary ID number and enter the data accordingly. Present this data to management in aggregate, with no identifying information, so that managers will focus on the action standards and not on retribution against individual stores. If one store is shown to have a consistent pattern of poor behavior, then and only then should the store be singled out and dealt with, and even then by the previously established action standards and not by some hotheaded decision made after the fact.
It is extremely unwise to send the individual reports back to stores. Rather, share the aggregate data with them as well as their average scores (if they were part of the sample) for each question so they can see if they were above or below the mark. Do not present any form of ranking, and do not allow district managers to share the results over conference calls. District managers should independently review each report with each store to help them understand how they can improve their scores for the future through better training. These mystery shops should in no way be tied to the manager’s annual performance review or bonus.
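The blinding and averaging described in this step can be sketched in a few lines. This is a minimal illustration, assuming each shop is recorded as a (store, question, score) entry. The store names, the `S001`-style IDs, and the function names are all hypothetical. The key that maps real stores to blind IDs should be kept away from the people reading the aggregate report.

```python
import random
from collections import defaultdict
from statistics import mean


def blind_ids(store_names, seed=None):
    """Map each real store name to an arbitrary ID in shuffled order."""
    numbers = list(range(1, len(store_names) + 1))
    random.Random(seed).shuffle(numbers)
    return {name: f"S{number:03d}" for name, number in zip(store_names, numbers)}


def average_scores(records):
    """Average scores per question, overall and for each blinded store.

    records is an iterable of (blind_id, question, score) tuples.
    """
    overall = defaultdict(list)
    by_store = defaultdict(lambda: defaultdict(list))
    for blind_id, question, score in records:
        overall[question].append(score)
        by_store[blind_id][question].append(score)
    overall_avg = {q: mean(scores) for q, scores in overall.items()}
    store_avg = {
        store: {q: mean(scores) for q, scores in questions.items()}
        for store, questions in by_store.items()
    }
    return overall_avg, store_avg


key = blind_ids(["Main Street", "Mall Kiosk"], seed=3)

# Made-up shop results: 1 = name tag worn, 0 = not worn.
raw = [
    ("Main Street", "RQ1.1", 1), ("Main Street", "RQ1.1", 0), ("Main Street", "RQ1.1", 1),
    ("Mall Kiosk", "RQ1.1", 1), ("Mall Kiosk", "RQ1.1", 1), ("Mall Kiosk", "RQ1.1", 1),
]
records = [(key[store], question, score) for store, question, score in raw]
overall_avg, store_avg = average_scores(records)
print(overall_avg)
```

Management sees `overall_avg`. Each store in the sample can be shown its own entry from `store_avg` next to the overall figure, with no ranking and no other store's numbers attached.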
Step 10: Repeat regularly. If you want to see the long-term effects of changes you are making in training or company communication, repeat the mystery shops once every quarter or, at a minimum, once every year. Use the same process, but feel free to tweak the objectives and action standards as needed to continue improving your organization.
Make sure your repetitions are at regular intervals, but don’t do them too frequently. I would avoid the temptation to use mystery shops as a monthly diagnostic, since it is very expensive to do this properly and probably won’t show you the effects of changes that you’ve made until two or three months into the process.
With all that said… if you have any questions, post them below, and I’ll be happy to answer them!

