The image of angsty teenagers taking long drags from cigarettes held deftly with two fingers has persisted since the early 1900s. Today, tobacco use remains a significant public health challenge worldwide, particularly among teenagers. As of 2019, 155 million teenagers and young adults globally were estimated to be current smokers. However, slow innovation to date has hindered researchers’ ability to accurately measure the scope of the tobacco pandemic. Complementing traditional surveys with alternative data collection methods, such as biomarker data and mobile app surveys, can improve efforts to monitor tobacco use globally.
Challenges Collecting Data
Traditional methods of measuring important behavioral risk factors, such as tobacco use, include self-reported data from in-person surveys and questionnaires. Self-reported surveys have long been lauded as the gold standard for determining smoking prevalence because they ask directly about current behaviors and allow for additional questions on past behaviors, such as when the respondent started smoking. These surveys allow researchers to systematically gather data from groups of individuals who reflect the demographics of entire countries or populations, which can provide insight into patterns of tobacco use at a large scale.
Surveys regarding social behaviors (such as tobacco and alcohol consumption, diet and food intake, and sedentary behavior) are all susceptible to bias stemming from people’s desire to provide socially acceptable answers. Further, when adolescents are under the age of consent, surveys need to be administered either with a parent or guardian present in the room while the teen answers, or by proxy, with a parent or guardian answering on the teen’s behalf. It seems highly unlikely that a teenager would respond accurately to a question about risky behavior when a parental figure is in the room, particularly because the impact of social pressures can be amplified for at-risk groups. In fact, one study of Aboriginal Australian and Torres Strait Islander people found that youth smoking was underestimated or inaccurately reported when proxies were used or when a parent was present.
Sure enough, considerable evidence shows that risky health behaviors are systematically underreported in self-reported data compared with biomarker data. Our analysis suggests that self-reported data failed to identify upward of 22 percent of smokers in different age-sex groups. In addition, adolescent respondents may not be aware that they have nicotine in their system, which would explain the disconnect with their self-reported answers. These lapses may be related to the common misconception that vapes do not contain nicotine, when they often do.
What Biomarker Data Reveals
One way to measure tobacco use objectively is with biomarker data: biological measures of substance use typically taken from saliva, hair, blood, or urine samples. These types of data can provide an unbiased measure of tobacco use, and their analysis can help validate or adjust self-reported data. In other words, if someone states that they do not smoke, and their sample similarly contains no cotinine (the biomarker for nicotine consumption), researchers can confirm that the self-reported data is accurate.
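As a rough illustration of that comparison, the sketch below counts how many biomarker-positive respondents described themselves as non-smokers. It is not the authors' method: the `Respondent` fields, the `missed_smoker_share` helper, the cutoff value, and the toy records are all assumptions made up for the example, and real studies set cotinine thresholds per sample type and population.

```python
from dataclasses import dataclass

# Hypothetical cutoff, for illustration only. Real studies choose a
# threshold based on the sample type (saliva, blood, urine) and population.
COTININE_CUTOFF_NG_ML = 10.0


@dataclass
class Respondent:
    self_reported_smoker: bool
    cotinine_ng_ml: float


def missed_smoker_share(respondents, cutoff=COTININE_CUTOFF_NG_ML):
    """Share of biomarker-positive respondents who reported not smoking."""
    positives = [r for r in respondents if r.cotinine_ng_ml >= cutoff]
    if not positives:
        return 0.0
    missed = [r for r in positives if not r.self_reported_smoker]
    return len(missed) / len(positives)


# Made-up toy records, not real survey data.
toy_sample = [
    Respondent(True, 120.0),
    Respondent(False, 85.0),  # biomarker-positive, but reported not smoking
    Respondent(False, 0.4),
    Respondent(True, 210.0),
    Respondent(False, 1.2),
]

share = missed_smoker_share(toy_sample)
print(f"Share of biomarker-positive respondents missed by self-report: {share:.0%}")
```

In practice the same comparison would be run separately for each age-sex group, since the gap between self-report and biomarker data differs across groups.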
Biomarker data has revealed significant underreporting of tobacco use among women around the world. For example, in South Korea, it revealed that women were smoking much more than they reported in surveys, a discrepancy likely driven by negative social perceptions of female smoking. Biomarker data for men, by contrast, suggests that men are more likely than women to report their smoking habits truthfully. An unpublished survey of three African cities in 2017 and 2018 found that female smokers younger than twenty-nine are much less likely to self-report tobacco use than male smokers in the same age groups.
Although useful for gauging the accuracy of self-reported data, biomarkers have one major limitation: they cannot differentiate between forms of nicotine use, such as smoking versus vaping, a distinction that self-reported data often captures. Biomarkers can also miss occasional smokers or users of smokeless tobacco. Additional research to develop more accurate biomarkers for these different components of tobacco use is crucial for further uptake of this type of data. Further, the disconnect found in adolescents between self-reported and biomarker data is not universal. In other settings and populations, the two data sources align closely, which further validates the use of self-reported data as the gold standard for assessing smoking status.
Pragmatically, biomarker collection methods also typically require high levels of engagement from participants and are extremely expensive. These restrictions limit the locations and number of people that can be included in biomarker studies. Despite these limitations, biomarker data is nonetheless highly effective at detecting bias in self-reported data. In our pilot analysis at the Institute for Health Metrics and Evaluation, we found that combining biomarker data with self-reported survey data more accurately captured the full scope of smoking prevalence than either approach alone.
Large-scale survey series used to measure noncommunicable disease risk factors are just now starting to leverage biomarker data relating to infectious diseases and sexually transmitted infections, diabetes, micronutrient deficiencies, and exposure to environmental toxins. The introduction of biomarker modules in these traditional surveys presents an exciting opportunity to move toward integrated multimodal data collection for large groups of people. These modules pave the way for other surveys to embrace multimodal data collection for a larger range of topics, including tobacco use.
Mobile App Data on the Go
With the rise of mobile phone use around the world, researchers can also now leverage app-based surveys to reach more people at lower cost and with less burden than traditional survey approaches. The main advantage of mobile app surveys is that they reach larger groups of people more quickly, without requiring survey staff to visit individual homes. Collecting survey data through mobile apps also simplifies many of the logistics associated with in-person surveys. Surveys can be translated into many languages and adapted to cultural contexts with the help of local collaborators, and field workers' time, training, and travel are no longer necessary. Human error is also reduced because online survey platforms can check data quality in real time. For instance, if someone is eighteen years old, an automatic limit can be set so that they can’t accidentally answer that they’ve been using tobacco for thirty years.
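A real-time consistency check of this kind can be sketched in a few lines. The function name, field names, and error messages below are hypothetical and not taken from any particular survey platform; the sketch only shows the logic of rejecting a tobacco-use duration that exceeds the respondent's age.

```python
def validate_tobacco_years(age_years, years_using_tobacco):
    """Return a list of validation messages; an empty list means the answers are consistent."""
    errors = []
    if age_years < 0:
        errors.append("Age cannot be negative.")
    if years_using_tobacco < 0:
        errors.append("Years of tobacco use cannot be negative.")
    if years_using_tobacco > age_years:
        errors.append("Years of tobacco use cannot exceed the respondent's age.")
    return errors


# The example from the text: an eighteen-year-old entering thirty years of use.
for message in validate_tobacco_years(age_years=18, years_using_tobacco=30):
    print(message)
```

A survey app would typically run such a check as soon as the answer is entered and prompt the respondent to correct it, rather than flagging the record during later data cleaning.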
Populations that might have difficulty participating in in-person surveys because of geographical, social, or logistical barriers, such as teenagers and rural communities, no longer have to face those obstacles. Instead, they can answer questions flexibly in the relative privacy of their own phone. Mobile app surveys also bolster researchers’ ability to monitor and collect data in real time by seeking input from respondents on the move. Additionally, many apps have geolocation, time, and demographic features built in, reducing the number of questions that need to be answered directly.
Mobile app surveys, however, are not without limitations. They systematically exclude populations without phones or access to the internet. Although the use of mobile phones is on the rise globally, important segments of the population still do not have access to a personal device. These individuals are likely the most marginalized within today’s technologically dense world. Whereas traditional surveys can be designed to capture a sample of individuals that reflects a country’s population as a whole, mobile app surveys typically do not have the same targeted sampling capability, which can result in skewed samples.
Ways Forward
Embracing innovative approaches to measuring tobacco use can reduce bias and improve our understanding of the scope of health challenges. Traditionally implemented surveys, mobile app surveys, and biomarkers have complementary strengths. For example, traditional self-reported surveys allow for high-quality representative data, mobile app surveys allow for timely, large-scale data collection, and biomarkers can quantify the scope of bias in self-reported data. The high-quality, gold standard data of traditional self-reported surveys cannot be replaced, but alternative methods, such as mobile apps and biomarkers, can be used in conjunction with them to provide a more comprehensive and accurate measure of tobacco use.
Moreover, these tools serve as avenues of innovation in the realm of public health. As an example, improved biomarkers that result from future research could enhance the accuracy of estimates for different tobacco products, including vapes and cigarettes. Knowing how many people are using different types of tobacco is crucial to designing effective public health interventions. Better incorporating biomarkers into future data collection could provide valuable insights and help develop targeted interventions for those at highest risk; meanwhile, leveraging mobile app surveys could provide insight into real-time behaviors and better contextualize intervention targets.
AUTHORS’ NOTE: In the absence of quality data that allow disaggregation by gender, we use data that disaggregates by sex, with the understanding that outcomes for people outside the gender binary are often less equitable than they are for cis women or men.
ACKNOWLEDGMENTS: The authors thank Jack Cagney for providing feedback on the references used in this article.
EDITOR'S NOTE: The authors are employed by the University of Washington's Institute for Health Metrics and Evaluation (IHME) and receive funding from Bloomberg Philanthropies and the Bill & Melinda Gates Foundation. IHME collaborates with the Council on Foreign Relations on Think Global Health. All statements and views expressed in this article are solely those of the individual author and are not necessarily shared by their institution.