QUALTRICS TOOLCHEST – How To Use Embedded Data to Speed Up Open-Ended Data Analysis

Embedded data is a powerful tool in Qualtrics that can allow you to program your survey to do all sorts of useful things that speed up your data analysis. Whether you need to set up profile groups based on data sourced from a contact list or URL, need to ensure that you’re filtering participants for a correct open-ended entry such as a ZIP code or need to set up randomized branches for a survey, embedded data can help you.

But one of the areas where it’s very difficult to be speedy is in dealing with open-ended data from a form or text entry box. Qualtrics has an analytic tool for Text in the Data & Analysis tab, but it’s very slow and is best suited to analysis conducted after a survey is done rather than real-time analysis of a cross-sectional survey.

Embedded data, on the other hand, can be programmed ahead of time and can save you a lot of time and trouble if you think through how to use it. If you’re asking a question where you have a good idea of the range of responses and you’re just trying to quantify what you hear, today’s trick will give you a significant speed boost in your next survey that utilizes open-ended responses.

Let’s imagine for a moment that you’re conducting a survey where, among other things, you’re asking for unaided mentions of brands. I’ll use the following example for simplicity’s sake:

Q1 What's your favorite brand of soda?
__________________________________________________

We know this question is going to provide a predictable range of responses because there are a finite number of brands and several which are very popular. But we also know that because this question is unaided and open-ended, we’re going to get a lot of variations on common brand names. For Coca-Cola, for example, we may get “Cocacola,” “Coke,” “Coca Cola” and other similar variations. For Dr Pepper (which does not have a period in its official brand name!), we will probably get variations such as “Dr. Pepper” or “DrPepper.” For Pepsi, we’ll occasionally see a “Pepsi Cola” or “Pepsicola.” We want to be careful to anticipate the correct terms, because many brand names do overlap (for example, Diet Coke, Coke Zero and Cherry Coke could easily be identified by a simple keyword search for “Coke”), and so the order in which we set our embedded data to populate is important. We also want to determine keywords that are distinctive and eliminate keywords that might get things mislabeled, like “Cola.”

It’s a good idea to make a list of the acceptable keywords for each brand name in a document or spreadsheet and plan your strategy out before you begin the next step.
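One way to keep that plan honest is to write the keyword list down as data and check it for risky entries before touching the survey flow. The Python sketch below is only a planning aid that runs outside Qualtrics. The dictionary layout, the AMBIGUOUS set and the find_risky_keywords helper are assumptions made for illustration; the keywords themselves are the ones discussed above.

```python
# Hypothetical keyword plan: brand label -> keywords that should map to it.
# All keywords are lowercase because matching will ignore case.
KEYWORDS = {
    "Coca-Cola": ["coke", "coca-cola", "coca cola", "cocacola"],
    "Pepsi": ["pepsi"],  # also catches "Pepsi Cola" and "Pepsicola"
    "Dr Pepper": ["dr pepper", "dr. pepper", "drpepper"],
}

# Keywords judged too generic to be safe on their own.
AMBIGUOUS = {"cola"}


def find_risky_keywords(keywords, ambiguous):
    """Return (brand, keyword) pairs that are ambiguous or shared across brands."""
    first_owner = {}
    risky = []
    for brand, words in keywords.items():
        for word in words:
            w = word.lower()
            if w in ambiguous:
                risky.append((brand, w))
            elif w in first_owner and first_owner[w] != brand:
                risky.append((brand, w))
            first_owner.setdefault(w, brand)
    return risky


risky = find_risky_keywords(KEYWORDS, AMBIGUOUS)
print(risky or "No risky keywords found")
```

If the check flags anything, it’s usually easier to drop or tighten that keyword now than to untangle mislabeled responses later.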

Assuming we have our keywords in place, we need to go into our survey flow and create an embedded data variable to go at the very beginning of our survey (before the first question block):

Create New Field or Choose From Dropdown…: Brand Mentioned
Set Value Now: Other
We will set the brand mentioned to “Other” by default so that if our respondent enters something like “Mountain Dew” or “Sprite” and we’re not looking for them, they fall within “Other.”

IMPORTANT NOTE: Click on “Options” in the green box and set “Brand Mentioned” to “Text Set.” This will allow you to utilize these data in the reporting tab instead of the Text analytic tool. It’s good practice to do this every time your survey flow is going to update the variable, as Qualtrics likes to flip it back to “Text” by default, limiting your options.

(And if you forget to do this, don’t worry! You can fix this even after you’ve collected data. Make sure that everywhere your survey flow changes the variable, you’ve set it to “Text Set.” Then take the survey once through preview, and the variable will update!)

In your survey flow, you now need to go below your questions, click “Add Below” and start a new branch. We need to set a condition that says:

IF what the participant entered falls within the keywords I'm expecting for Coca-Cola or other products in its family...

THEN assign "Coca-Cola" to the embedded data variable "Brand Mentioned."

The easiest way to do this is to utilize logic that says “IF the answer to Q1 CONTAINS a keyword I’m looking for… THEN SET EMBEDDED DATA to change ‘Brand Mentioned’ to ‘Coca-Cola.’” But we may want to program in several variations of our keywords for this particular brand, since we’re likely to get “Coke,” “Coca-Cola,” “Coca Cola” and so forth.

(I went ahead and programmed in a few other variations, for reasons that will become clear in a moment.)

IMPORTANT NOTE: For each of your keywords, be sure to click “ignore case” so Qualtrics isn’t looking for capital letters.
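To make the logic of this first branch concrete, here is a minimal Python sketch of the same idea: a default of “Other,” a case-insensitive “contains” check and a single reassignment. It mirrors the branch offline and is not Qualtrics code; the function names and the exact keyword list are assumptions for illustration.

```python
# Assumed keyword list for the Coca-Cola branch (lowercase on purpose).
COKE_KEYWORDS = ["coke", "coca-cola", "coca cola", "cocacola"]


def contains_any(answer, keywords):
    """True if any keyword appears anywhere in the answer, ignoring case."""
    text = answer.casefold()
    return any(keyword.casefold() in text for keyword in keywords)


def label_brand(answer):
    """Mimic the flow: start at the default, then let the branch overwrite it."""
    brand_mentioned = "Other"  # default set at the top of the survey flow
    if contains_any(answer, COKE_KEYWORDS):
        brand_mentioned = "Coca-Cola"
    return brand_mentioned


for answer in ["COKE", "Coca Cola", "Mountain Dew", "Diet Coke"]:
    print(answer, "->", label_brand(answer))
```

Note that “Diet Coke” comes out as plain “Coca-Cola” here, because a substring match on “coke” can’t tell the variants apart.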

But we have a problem now: What if someone inputs “Diet Coke”? We still need to set up our logic to assign that brand, and we also need to ensure that we don’t assign “Diet Coke” to someone who enters “Diet Cherry Coke.”

We need to remember two things. First, Qualtrics assigns embedded data in a linear process, which means that we can evaluate our data to first find mentions of “Coke” and then reassign them to “Diet Coke” and then “Diet Cherry Coke” if appropriate. Second, because every response starts out classified as “Other,” we can use that default to filter our data so Qualtrics is only evaluating what’s appropriate.

So, let’s add a branch below what we just did to tell Qualtrics to make a change IF the respondent’s answer CONTAINS the word “diet” so that we change “Brand Mentioned” from “Coca-Cola” to “Diet Coke.” Easy enough!

You might have noticed I told the original logic to include “dietcoke” in the keywords for “Coca-Cola.” This is just me thinking ahead, since I know that some respondents don’t like to use spaces and they may input the brand this way. I still want to ensure that the data go through my Coca-Cola branch so they are correctly labeled.

I can also apply the same process to other variants such as Cherry Coke, Coke Zero, Diet Cherry Coke and Cherry Coke Zero:
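The chain of reassignments for the Coke family can be sketched in Python as a series of steps that run top to bottom, each one free to overwrite the label set by the step before it. This is an offline illustration under assumptions, not an export of the survey flow: the step order and the keywords “diet,” “cherry” and “zero” are my guesses at reasonable conditions, and label_coke_family is a made-up name.

```python
def label_coke_family(answer):
    """Apply the Coke-family steps in order; later steps overwrite earlier ones."""
    text = answer.casefold()
    brand_mentioned = "Other"

    # Step 1: broad match for anything in the Coca-Cola family.
    if any(k in text for k in ("coke", "coca-cola", "coca cola", "cocacola")):
        brand_mentioned = "Coca-Cola"

    # Later steps only run for answers already inside the family,
    # so "diet sprite" is never relabeled as a Coke variant.
    if brand_mentioned != "Other":
        has_diet = "diet" in text
        has_cherry = "cherry" in text
        has_zero = "zero" in text

        # Single-word variants first...
        if has_diet:
            brand_mentioned = "Diet Coke"
        if has_zero:
            brand_mentioned = "Coke Zero"
        if has_cherry:
            brand_mentioned = "Cherry Coke"
        # ...then the more specific combinations, which win because they run last.
        if has_cherry and has_diet:
            brand_mentioned = "Diet Cherry Coke"
        if has_cherry and has_zero:
            brand_mentioned = "Cherry Coke Zero"

    return brand_mentioned


for answer in ["Coke", "coke diet", "dietcherry coke", "Cherry Coke Zero", "diet sprite"]:
    print(answer, "->", label_coke_family(answer))
```

The most specific labels go last for the same reason the branches are ordered in the survey flow: whichever assignment runs last is the one that sticks.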

When we get to our next brand, Pepsi, we’ll follow the same process, but we need to add one additional criterion for our branch: we need to exclude any soda that’s already been labeled as some form of Coke. This will help us avoid conflicts when we get to the diet variants. We’ll do that by duplicating our Coca-Cola branch logic, making the relevant changes and then adding in an extra standalone line of AND branch logic to exclude any “Brand Mentioned” that isn’t “Other”:
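The effect of that extra AND condition is easier to see in a small Python sketch. Each branch below only fires when a keyword matches and the current label is the one the branch expects. The apply_branch helper, its parameters and the “Diet Pepsi” step are hypothetical, included only to show why the guard matters once diet variants are involved.

```python
def apply_branch(brand_mentioned, answer, keywords, new_label, required_current):
    """Relabel only if a keyword matches AND the current label is the expected one."""
    text = answer.casefold()
    matched = any(keyword in text for keyword in keywords)  # keywords are lowercase
    if matched and brand_mentioned == required_current:
        return new_label
    return brand_mentioned


def label_answer(answer):
    brand = "Other"
    coke_words = ["coke", "coca-cola", "coca cola", "cocacola"]
    # Coca-Cola family first.
    brand = apply_branch(brand, answer, coke_words, "Coca-Cola", "Other")
    brand = apply_branch(brand, answer, ["diet"], "Diet Coke", "Coca-Cola")
    # Pepsi branch: the guard requires that nothing has claimed the answer yet.
    brand = apply_branch(brand, answer, ["pepsi"], "Pepsi", "Other")
    # Assumed diet variant: only relabels answers already marked as Pepsi.
    brand = apply_branch(brand, answer, ["diet"], "Diet Pepsi", "Pepsi")
    return brand


for answer in ["Pepsicola", "Diet Pepsi", "Diet Coke", "Coke, not Pepsi"]:
    print(answer, "->", label_answer(answer))
```

Without the guards, the second “diet” step would turn every “Diet Coke” into “Diet Pepsi,” since both rely on the same keyword.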

Now we’re getting somewhere! We can do the same with Dr Pepper:

Now that we have this in place, let’s test it out! Unfortunately, we’ll have to do so manually, so I did this by entering a bunch of brand names through the survey preview. How well did it work? You be the judge:

One of the cool things you’ll notice is that our system was able to deal with weird variations like “dietcherry coke” or “coke diet” with ease. “Mtn dew” was appropriately labeled as “Other.” And everything is labeled correctly, which is a bonus!
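The survey itself still has to be tested by hand through the preview, but if you keep an offline copy of your labeling rules, you can check them against a list of tricky inputs before you start clicking. The Python sketch below shows one way to do that. The check_labels helper and the deliberately tiny toy_classify stand-in are assumptions for illustration, and the test cases are examples rather than real survey data.

```python
def check_labels(classify, cases):
    """Return (answer, expected, actual) for every case the classifier gets wrong."""
    failures = []
    for answer, expected in cases:
        actual = classify(answer)
        if actual != expected:
            failures.append((answer, expected, actual))
    return failures


def toy_classify(answer):
    """Stand-in rules: only knows Coca-Cola and Diet Coke, everything else is Other."""
    text = answer.casefold()
    if "coke" in text:
        return "Diet Coke" if "diet" in text else "Coca-Cola"
    return "Other"


CASES = [
    ("COKE", "Coca-Cola"),
    ("coke diet", "Diet Coke"),
    ("Mtn dew", "Other"),
]

failures = check_labels(toy_classify, CASES)
print("All cases passed" if not failures else failures)
```

Any input that lands in the failure list is a candidate for a new keyword or a reordered branch.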

Now you can visit the “Reports” tab and see your data. If you’ve got the “Brand Mentioned” embedded variable set appropriately to “Text Set,” you should be able to set it up to show a table or a visualization chart like this:

Handy! That’d be great for having available for a quick rundown of brand mentions in a dynamic report designed to give real-time data while the survey’s running.
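If you export the data and want the same rundown outside of Qualtrics, the tally is simple to reproduce. The Python sketch below counts labels and computes each brand’s share of mentions. The tally_mentions function is a hypothetical helper, and the list of labels is made up purely for illustration.

```python
from collections import Counter


def tally_mentions(labels):
    """Return (brand, count, percent) rows, most-mentioned brand first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [
        (brand, n, round(100 * n / total, 1))
        for brand, n in counts.most_common()
    ]


# Made-up "Brand Mentioned" values standing in for exported survey data.
labels = ["Coca-Cola", "Diet Coke", "Coca-Cola", "Other"]

for brand, n, pct in tally_mentions(labels):
    print(f"{brand}: {n} ({pct}%)")
```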

Of course, you might notice that there’s going to be some opportunity to refine our process to get more precise embedded data. There may also be mistakes you have to fix later. Fortunately, you can always make corrections as the survey’s running, and you can also easily manually edit cells through the “Data & Analysis” tab to ensure your data are always up to date and reliable. (This used to be a lot harder to do in Qualtrics, so it’s nice that the “Edit” function is now available!)

But the upshot of all of this is that if you put in a few hours’ worth of prep time and really think through your keywords and test your design, chances are good you’ll save yourself many more hours’ worth of fiddling with analytics after the survey’s done.

I hope you enjoyed this article. Here’s a link to the Survey Preview and to the QSF file if you’d like to refer to either. If you’d like to discover more Qualtrics tips and tricks, be sure to check out other articles in our series, and if we can help you or someone you know with a Qualtrics question or problem, please contact us!