Three Ways to Avoid the Dreaded “Garbage Out” Result
By Ken Donaven and Chelsea May
Information keeps getting easier to get, but more difficult to truly understand.
The arrival and evolution of artificial intelligence and machine learning have made the once laborious process of data gathering dramatically easier and more efficient. But greater access to voluminous data brings commensurate risk and margin for error: separating the proverbial wheat from the chaff is now the top priority.
A critical component of modern quantitative research is ensuring that the survey sample accurately reflects the opinions and perceptions of the intended respondent population. Data quality control and data integrity practices are paramount, especially in the modern media landscape. While “bots” can comb the universe looking for the information you need, they can also mimic human respondents, jeopardizing the quality and accuracy of the information being collected.
When surveys shifted from phone-based to online platforms, the need for quality control practices increased significantly. It became clear that some respondents were simply looking to earn incentives and were largely disregarding the questions being asked. Now bots are infiltrating surveys at scale, using the very artificial intelligence that is being celebrated across many industries. As a result, survey sponsors and practitioners face a greater challenge than ever in ensuring the integrity of survey data.
Baking in Confidence and Integrity Before, During and After a Quantitative Research Study
Of course, the purpose of doing research in the first place is to glean actionable insights. But taking the wrong actions based on compromised “insights” can be a recipe for disaster. The good news is that there are measures one can take along the entire spectrum of the survey’s timeline — before, during and after the data collection process.
Pre-Survey Actions to Take
We recommend that data integrity practices begin before a survey is even administered. Guardrails preventing poor quality of inputs can be established through sound survey design, such as:
- Purposefully integrate quality-control questions that “smoke out” bots and humans simply going through the motions of completing a survey. For example, every five questions or so, you may include a directive that prompts the participant to select a specific answer for quality control purposes. This makes the process of reviewing responses for quality control easier, as only someone who is truly participating and paying attention will get those questions correct every time.
- Ensure the survey is addressing the appropriate audience — those opinions that matter most to the brand or company conducting the research. Irrelevant opinions from non-stakeholders can negatively impact the purity of the data sample. We work with our clients to develop appropriate and robust screening questions to ensure only relevant respondents are included. These questions help us target decision makers who are familiar with the product or service. We may also include some “dummy” responses in these screening questions to further ensure a high-quality sample.
- Utilize panel companies with long and proven track records of recruiting highly targeted and engaged survey respondents. It’s important that you don’t just target a high response rate, but rather insist on accurate responses from highly targeted opinion holders.
- Control against subconscious bias, as well as prejudicial or leading questions or phrases that might influence the response. For example, rather than prompting the participant to “Please explain what you like about brand X,” you might ask, “Please explain what you like or dislike about brand X.”
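The attention-check idea above can be sketched in code. This is a minimal illustration, not a production screening pipeline: the question IDs, required answers, and response format are all hypothetical assumptions for the example.

```python
# Minimal sketch: drop respondents who fail embedded attention checks.
# Each response is a dict of question_id -> answer. The check questions
# and their required answers below are hypothetical examples.

ATTENTION_CHECKS = {
    "q05_check": "Strongly agree",  # e.g. "For quality control, select 'Strongly agree'."
    "q10_check": "Blue",            # e.g. "For quality control, select 'Blue'."
}

def passes_attention_checks(response: dict) -> bool:
    """Only a respondent who is paying attention answers every
    directed question exactly as instructed."""
    return all(response.get(q) == required
               for q, required in ATTENTION_CHECKS.items())

responses = [
    {"q05_check": "Strongly agree", "q10_check": "Blue"},  # attentive
    {"q05_check": "Strongly agree", "q10_check": "Red"},   # failed one check
]

clean = [r for r in responses if passes_attention_checks(r)]
```

Requiring a correct answer on every check, rather than most of them, mirrors the point above: only a genuinely engaged participant gets those questions right every time.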
Steps to Follow During a Survey
While AI tools show promise and can be exciting in terms of what the future may hold, we believe that the best intelligence is found when artificial intelligence combines with human intelligence to manifest “augmented intelligence.” Human oversight is still critical — probably more so today than ever before, as counterfeit survey respondents proliferate.
We recommend that surveys be monitored in real time as responses come in, so that bad data can be filtered out immediately. A considerable percentage of responses must be filtered out and discarded (often to a greater degree in B2C research studies), so that no “garbage” goes in and only pure intelligence comes out.
There are four types of responses that human oversight is adept at weeding out:
- Straightliners: Those who simply choose C for every multiple-choice question, or rate everything a 7 on a 1-to-10 scale
- Speeders: Respondents who race through a survey just to reach the end and redeem the incentive. We time all response sessions, and eliminate those that fall outside the completion-time benchmark established during the survey design phase.
- Bad Open-End Responses: Human eyeballs can detect (even better than AI can) when an open-ended response is either insufficient, irrelevant or nonsensical. Strategically constructing open-ended questions in the survey design can make monitoring easier and more effective during the response timeline. If you ask a respondent to provide a favorite brand of pet food, and the answer comes back as “Walmart,” you can be sure that is either the response of a bot or disengaged human participant, and that response should be discarded from the survey sample before it inaccurately weights response results.
- Illogical Responses: It takes careful human oversight to detect abnormalities or irregularities that suggest the respondent is either a bot or simply not answering accurately or carefully enough. For example, if a 25-year-old somehow indicates that they have 25 years of experience in their career field, we know this can’t be true and that the entire survey response is unusable. AI might miss this error, but humans should not.
By way of emphasis, human eyeballs are still much better at detecting illogical disconnects than large language models, no matter how sophisticated those models seem today or will become in the future.
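Several of the screens above can be automated as a first pass before human review. The sketch below shows one way to flag straightliners, speeders, and one kind of illogical response; the field names, scales, and thresholds are illustrative assumptions, and flagged responses would still go to a human reviewer rather than being deleted automatically.

```python
# First-pass automated flags for suspect survey responses.
# Thresholds and fields are hypothetical; humans make the final call.

from statistics import pstdev

MIN_SECONDS = 180  # assumed minimum plausible completion time


def is_straightliner(ratings: list[int]) -> bool:
    # Identical ratings on every scale question means zero variance.
    return len(ratings) > 1 and pstdev(ratings) == 0


def is_speeder(duration_seconds: float) -> bool:
    return duration_seconds < MIN_SECONDS


def is_illogical(age: int, years_experience: int) -> bool:
    # e.g. a 25-year-old claiming 25 years in their field;
    # assumes a career starts no earlier than age 16.
    return years_experience > age - 16


flags = {
    "straightliner": is_straightliner([7, 7, 7, 7, 7]),
    "speeder": is_speeder(95),
    "illogical": is_illogical(25, 25),
}
```

Note what such a pass cannot do: judging whether an open-ended answer like “Walmart” is a plausible pet-food brand still requires human eyeballs, which is exactly the augmented-intelligence point above.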
Post-Survey Actions to Take
Perhaps the most important step when conducting quantitative research happens once the responses have been collected, filtered and controlled for quality and integrity — analysis and recommendations. We often refer to this colloquially as:
- So What? Raw data, even when monitored for integrity, often is next to meaningless until it is deciphered and decoded to surface the underlying intelligence in the numbers and words. Again, we maintain that humans have a unique ability to distill meaning and convey the implications behind the data in the survey results. Given the data collected, what does it say that we didn’t completely understand before, or how does it validate presumptions that now give us more confidence about the market we are trying to serve?
- Now What? Companies will embark on research studies looking for “actionable insights.” This means that the data should reveal what actions a company can and should take to better appeal to or serve a given market. Look for research partners that embrace the role of consultant, as much as they do that of a data scientist.
The age-old expression “Garbage in, garbage out” remains as true today as it ever did. The only difference is that there is much more garbage collection going on, thanks to technology and the proliferation of those looking to “game the system” for incentives. If researchers don’t work diligently to take proactive measures to protect the integrity of the data sample, what comes out of the process will only be as good, reliable, informative and actionable as what went in.
The Best Quant Studies Begin and End with Qual
We are big believers that the best way to ensure the highest level of quality, integrity and accuracy of a quantitative research project is to book-end the quant survey with two specific types of qualitative research. But that’s a topic for another day and another article. In the meantime, if you’d like to hear more about such an approach, please contact one or both of us using the contact form below.
Ken Donaven serves as Senior Director with Martec, and Chelsea May serves as Project Manager.