Ad Testing
Inputs
- Can multiple campaign variants be tested within one test?
Each variant (for example, each video) must be tested separately. Every variant is evaluated on its own sample of respondents, and the results are then A/B tested against the others. This method ensures reliable data and makes it easy to compare campaigns. As a result, each variant functions as its own independent test.
- Can I add one or two questions to the questionnaire?
We do not recommend adding your own questions, as the questionnaire is automatically prepared for optimal results.
- Can we include our old campaign for comparison?
We can certainly re-evaluate your old campaign. However, we can only measure metrics related to branding, needs, and emotions, which we’ll analyze in detail to show exactly what worked and what didn’t. The challenge lies with the impact test—after a longer period, the ad’s impact can no longer be accurately captured. A campaign’s effect would likely be much lower, or even negligible, compared to its performance a week after it ended. This could lead to misleading results, which we aim to avoid.
- When does it make sense to test concepts?
- You need help deciding which concept to choose.
- You’re unsure if your idea will resonate with people.
- You want specific insights to improve the concept.
- You’re uncertain whether people understand the intended message.
- You want to ensure the concept won’t offend your target audience.
- Should we include an (illustrative) image with our concept?
Your concept should include an image you plan to use in subsequent communication or one that effectively conveys the intended idea. It’s important for people to clearly understand the message the concept communicates. If the images are too unrealistic, there’s a risk that people will focus on evaluating the image itself rather than what the concept is offering. For more details on concept tests, see here.
- Can you test a website?
The Behavio platform is not designed for UX testing. However, we can evaluate first impressions and determine whether the website is understandable overall. This can be done using a video walkthrough of the website, with results presented as curves, or by analyzing the website’s visuals.
Methodology
- Why don’t you just let people pick one of the concepts instead of doing an A/B test?
Allowing people to choose between concepts leads to analytical thinking and less accurate results. A/B testing captures quick, intuitive responses that better reflect real consumer behavior. When faced with multiple options, people often make random choices, skewing the metrics and obscuring how elements like key messages or visuals perform. In everyday life, such as on the street, people typically see only one visual at a time.
- Why is the statistical error 4% sometimes and 6% at other times when the sample size is the same?
Even with the same sample size, the statistical error varies depending on the measured values. Generally, the statistical error is smaller at very low or very high values, and largest around 50%. For instance, with a sample of 500 respondents, the statistical error is up to 2% for values between 1% and 5%, while for values near 50%, the error exceeds 4%.
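The relationship between the measured value and the statistical error can be illustrated with the standard normal-approximation formula for a proportion. This is a minimal sketch, not a description of the exact calculation used on the platform: the 95% confidence level (z = 1.96) and the function name `margin_of_error` are assumptions made for illustration.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval
    for a measured proportion p in a sample of n respondents.

    z = 1.96 corresponds to a 95% confidence level (an assumption;
    the confidence level is not stated in the text above).
    """
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    n = 500
    # The error is smallest near 0% or 100% and largest at 50%.
    for p in (0.01, 0.05, 0.50, 0.95):
        print(f"measured value {p:.0%}: +/- {margin_of_error(p, n):.1%}")
```

Under these assumptions, a sample of 500 gives an error of roughly 1.9 percentage points at a measured value of 5% and roughly 4.4 percentage points at 50%, which is in line with the figures quoted above.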
- How do you predict the attention with a heatmap?
For predicting attention—whether in videos or static visuals—we use the UNISAL model, developed by researchers at the University of Oxford. Built on Google’s MobileNetV2 visual model, it excels at analyzing both static formats and videos. The model was trained on datasets derived from eye-tracking data across 20,000 experiments, where participants viewed videos and images, all aimed at understanding what truly captures attention.
Results interpretation
- How do you calculate the Overall score?
When measuring an ad on the platform, we analyze three KPIs: branding, need, and emotion. These are compared against benchmarks from other tests, which vary depending on the ad’s development stage, from the initial idea to the final spot. The overall rating is the average of these indicators, reflecting how your ad or concept performs relative to others. For more details on the Overall score, see here.
- What does the curve height indicate, and how is it calculated?
- The height of the emotion curve indicates how many people responded positively to the ad at any given moment using a “smiling” emoticon.
- The height of the brand curve indicates how many people spontaneously recognized the brand at any given moment.
- The height of the need curve indicates how many people recalled the product category at any given moment.
- How do you handle typos in brand recall?
Usually, when coding brand recall, we also account for typos. If the term is similar to the brand name, we include it in the brand recall, though each case is evaluated individually. However, if the client requires the brand name to be written perfectly, we do not count typos, and an answer counts toward brand recall only if it matches the brand’s name exactly.
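The two coding modes can be sketched with a simple string-similarity check. This is only an illustration: actual coding is evaluated case by case, as noted above, and the function name `matches_brand`, the 0.8 similarity threshold, the case-insensitive comparison, and the brand "Acme Cola" are all hypothetical.

```python
from difflib import SequenceMatcher


def matches_brand(answer: str, brand: str,
                  threshold: float = 0.8, exact: bool = False) -> bool:
    """Decide whether a free-text answer counts as recall of `brand`.

    exact=True  -> only a full match counts (strict mode).
    exact=False -> near matches count if their similarity ratio
                   reaches `threshold` (a hypothetical cut-off).
    Answers are trimmed and lower-cased first (an assumption).
    """
    a = answer.strip().lower()
    b = brand.strip().lower()
    if exact:
        return a == b
    return SequenceMatcher(None, a, b).ratio() >= threshold


if __name__ == "__main__":
    brand = "Acme Cola"  # hypothetical brand name
    for answer in ("Acme Cola", "Acme Coal", "Other Drink"):
        print(answer,
              "| tolerant:", matches_brand(answer, brand),
              "| strict:", matches_brand(answer, brand, exact=True))
```

A fixed threshold like this would only pre-sort answers; borderline cases would still need the individual review described above.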
Benchmarking
- How do your benchmarks work?
Our benchmarks compare dozens of campaigns measured over the past year, with separate benchmarks for each ad development stage. For example, we compare only storyboards or only final video campaigns. To ensure relevance, we use benchmarks no older than two years, allowing us to compare concepts with current campaigns that customers encounter (known as marketing noise). For more details on our benchmarks, see here.
- How accurate is your ‘reach’ metric compared to post-buy data?
Our reach metric is rigorous: it models real-life conditions to assess the true impact of the campaign, that is, whether a random 5-second segment of the ad sticks in viewers’ minds. In contrast, post-buy data from media agencies only indicates that, on platforms like TV, the ad was shown to the target group. It does not show whether they were engaged or paying attention (e.g., they may have been looking at their phone instead).
Brand Tracking
General
- Is brand tracking usable for B2B?
Our brand tracking works well for B2B if we can reach your target group in our panel. If your clients are entrepreneurs, we can do this easily. However, if they are senior executives, our classic brand tracking may not be the best solution.
- Is brand tracking usable for employer branding?
It depends on the target group. If you want to view your brand through the lens of all employees, that’s no problem. We can also target more specific groups, like IT specialists from certain regions in the US. However, IT specialists earning over $200,000 per year won’t be captured by the panel.
- Can you measure a brand that is strong in specific regions but unknown in others?
This depends on the brand’s overall penetration. For example, if a renowned restaurant is known by just five percent of the population, brand tracking makes sense as the restaurant aims to attract people from all over the state.
Data collection
- Does Behavio work in markets other than the Czech Republic?
Yes. We operate primarily in Europe (excluding Russia) and the USA, but we can also handle measurements in Asia and the Middle East, so we’re ready to measure audiences almost anywhere.
- What socio-demographic data do you provide in your report?
Currently, we provide basic socio-demographic data on selected groups as part of brand tracking. If desired, you can integrate this data with the Czech Atlas tool (for Czech clients only) to access 900 additional characteristics about your clients and competitors. However, this is outside the scope of the standard packages and costs approximately CZK 10,000 per selected group.
- Are you able to guarantee that my respondents will not repeat themselves, even if I want to measure 500 respondents per month?
Yes, we guarantee it. The exact timing depends on the survey size and number of waves, but we follow the industry standards, which recommend a minimum six-month gap between surveys of the same respondents. We go a step further by re-surveying the same respondent only after at least one year.
Inputs
- How do I determine the importance of each attribute/association (e.g., gluten-free, great taste) being tracked?
Ideally, start with Market Tracking to determine the importance of each attribute/association, then incorporate it into brand tracking for long-term monitoring. For a deeper dive into your needs, we recommend our strategic research service.
Methodology
- What is implicit association testing? What are its advantages?
Implicit association testing leverages the brain’s natural associative process, making it a highly accurate behavioral method. Comparing two items side by side and observing reaction times is simpler and more precise than traditional scaling. This method focuses on responses within the 0.5-2.5 second interval, filtering out mechanical responders and those who rely on rational reasoning.
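The reaction-time filter described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Response` record, its field names, the inclusive bounds, and the way the association share is computed are hypothetical and are not taken from the platform.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_RT_S = 0.5  # faster answers are treated as mechanical clicking
MAX_RT_S = 2.5  # slower answers are treated as rational deliberation


@dataclass
class Response:
    respondent_id: str       # hypothetical field names
    reaction_time_s: float   # time taken to answer, in seconds
    associated: bool         # True if the attribute was linked to the brand


def keep_valid(responses: List[Response]) -> List[Response]:
    """Keep only responses inside the 0.5-2.5 second window
    (bounds treated as inclusive, which is an assumption)."""
    return [r for r in responses
            if MIN_RT_S <= r.reaction_time_s <= MAX_RT_S]


def association_share(responses: List[Response]) -> Optional[float]:
    """Share of valid responses that linked the attribute to the brand.
    Returns None when no response falls inside the window."""
    valid = keep_valid(responses)
    if not valid:
        return None
    return sum(r.associated for r in valid) / len(valid)


if __name__ == "__main__":
    sample = [
        Response("r1", 0.2, True),   # too fast, dropped
        Response("r2", 1.0, True),
        Response("r3", 1.5, False),
        Response("r4", 3.0, True),   # too slow, dropped
    ]
    print(len(keep_valid(sample)), association_share(sample))
```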
- Why not use NPS?
NPS is a surface-level metric that can complement research but shouldn’t be the primary KPI. It lacks context and doesn’t reliably predict future behavior or provide specific guidance on improving the customer experience. Extreme responses can skew results, and neutral ratings are excluded. Therefore, NPS should be supplemented with other research methods for a fuller understanding of the customer experience.
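To see why the score carries little context, it helps to look at the standard NPS calculation: on a 0 to 10 scale, the share of detractors (0 to 6) is subtracted from the share of promoters (9 to 10), and passives (7 to 8) do not affect the numerator. The sketch below uses that standard definition; the function name and the sample ratings are made up for illustration.

```python
from typing import Iterable


def net_promoter_score(ratings: Iterable[int]) -> float:
    """Standard NPS: % promoters (9-10) minus % detractors (0-6).
    Passives (7-8) count toward the total but toward neither group."""
    ratings = list(ratings)
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


if __name__ == "__main__":
    polarized = [10, 10, 0, 0]   # half delighted, half very unhappy
    lukewarm = [7, 8, 7, 8]      # everyone mildly satisfied
    # Both samples produce the same score despite very different customers.
    print(net_promoter_score(polarized), net_promoter_score(lukewarm))
```

Two very different customer bases can end up with an identical score, which is one reason to pair NPS with other research methods.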
- What is consideration? Why don't we measure it?
At Behavio, we don’t typically measure brand consideration because it often doesn’t reflect actual purchasing behavior and isn’t applicable to all industries. People tend to select all brands they know and don’t have negative associations with, which doesn’t accurately represent real consideration or purchasing intent.
- How do you identify barriers? Are they specific to each brand, or just general barriers to purchase?
You can set specific barriers as associations. For example, if you want to measure how much your fast food brand is associated with junk food, you can track that through associations. Spontaneous barriers will also emerge in brand tracking.