I checked 7 public opinion journals on Friday, June 5, 2026, using the Crossref API. For the period May 29 to June 4, I found 4 new papers in 3 journals.

Journal of Official Statistics

Prediction-Powered Estimation: Unbiased Model-Assisted Estimation
Nicholas Denis, Mohammed Haddou
Full text
National statistical agencies increasingly face budget constraints and shrinking sample sizes, while simultaneously gaining access to rich auxiliary data and powerful pre-trained machine learning (ML) and artificial intelligence (AI) models, including Large Language Models (LLMs). Traditional model-assisted estimation techniques, which fit models using survey sample data, are limited by small sample sizes, struggle to leverage complex non-linear relationships in auxiliary data, and cannot accommodate frontier pre-trained models. This work re-examines the use of pre-trained black-box models, fit independently of the survey sample, for design-based parameter estimation. Inspired by the Prediction-Powered Inference (PPI) framework, we introduce the Prediction-Powered Estimator (PPE), an unbiased estimator with an unbiased variance estimator for the survey design setting. We also formalize the use of pre-trained models with the classic difference estimator, which we term the Prediction-Powered Difference (PPD) estimator, and with the Generalized Regression Estimator via predicted values as covariates ($\mathrm{GREG}_{\hat{y}}$). Through LLM-based use-cases leveraging unstructured auxiliary data (images and text) and experiments with real-world survey data from Statistics Canada, complemented by simulation studies in the Supplemental Material, we demonstrate that these approaches consistently outperform standard baseline estimators across bias, mean absolute error, mean squared error, coverage, and confidence interval width. The results suggest that pre-trained models can yield more accurate and efficient estimates while potentially reducing survey sample sizes and respondent burden, and motivate expanding the survey methodologist’s toolbox to include pre-trained models and novel auxiliary data sources.
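For readers unfamiliar with the classic difference estimator mentioned in this abstract, here is a minimal sketch of the textbook form: the population sum of model predictions plus a Horvitz-Thompson estimate of the prediction errors. It is not taken from the paper. The toy population, the `predict` function standing in for a pre-trained model, and the simple random sampling design are all assumptions made for illustration.

```python
import random


def difference_estimator(pop_predictions, sample_idx, sample_y, incl_prob):
    """Textbook difference estimator of a population total.

    pop_predictions: prediction f(x_k) for every population unit
    sample_idx:      indices of the sampled units
    sample_y:        observed y_k for the sampled units, in the same order
    incl_prob:       inclusion probability pi_k for every population unit
    """
    # Total of the predictions over the whole population (auxiliary data only).
    predicted_total = sum(pop_predictions)
    # Horvitz-Thompson estimate of the total prediction error, from the sample.
    correction = sum(
        (yk - pop_predictions[k]) / incl_prob[k]
        for k, yk in zip(sample_idx, sample_y)
    )
    return predicted_total + correction


def horvitz_thompson(sample_idx, sample_y, incl_prob):
    """Baseline estimator that ignores the model predictions."""
    return sum(yk / incl_prob[k] for k, yk in zip(sample_idx, sample_y))


# Hypothetical toy population (not data from the paper).
rng = random.Random(0)
N, n = 1000, 50
x = [rng.uniform(0, 10) for _ in range(N)]
y = [3.0 * xi + rng.gauss(0, 1) for xi in x]


def predict(xi):
    # Stand-in for a pre-trained model fit independently of the sample;
    # deliberately imperfect.
    return 2.8 * xi + 0.5


preds = [predict(xi) for xi in x]
pi = [n / N] * N  # assumption: simple random sampling without replacement
true_total = sum(y)

# Small Monte Carlo comparison of the two estimators.
reps = 200
sq_err_diff, sq_err_ht = 0.0, 0.0
for _ in range(reps):
    s = rng.sample(range(N), n)
    ys = [y[k] for k in s]
    sq_err_diff += (difference_estimator(preds, s, ys, pi) - true_total) ** 2
    sq_err_ht += (horvitz_thompson(s, ys, pi) - true_total) ** 2

mse_diff = sq_err_diff / reps
mse_ht = sq_err_ht / reps
print(f"MSE, difference estimator: {mse_diff:.1f}")
print(f"MSE, Horvitz-Thompson:     {mse_ht:.1f}")
```

Because the predictions do not depend on the sample, the correction term is unbiased for the total prediction error, so the estimator stays design-unbiased even when the model is poor. A better model only shrinks the variance.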
A Note on the Additive Decomposition of GEKS Indexes
Steve Martin
Full text
It is often useful to decompose an index number into the contribution of each product toward the total index, and consequently there are several well-known decompositions for bilateral indexes. In this note, I extend these decompositions to cases where bilateral indexes are made into multilateral GEKS indexes. Although the result is primarily of theoretical interest, it shows how decompositions based on a bilateral index can be extended to a multilateral index, and highlights the challenge of decomposing GEKS indexes.
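As background for this abstract, the sketch below shows one common way a multilateral GEKS index is built from bilateral Fisher indexes: the index between two periods is the geometric mean, over every link period in the window, of the chained bilateral comparisons. It does not reproduce the decomposition proposed in the note. The price and quantity figures are invented for illustration.

```python
import math


def fisher(p0, q0, p1, q1):
    """Bilateral Fisher price index from period 0 to period 1."""
    laspeyres = sum(a * b for a, b in zip(p1, q0)) / sum(a * b for a, b in zip(p0, q0))
    paasche = sum(a * b for a, b in zip(p1, q1)) / sum(a * b for a, b in zip(p0, q1))
    return math.sqrt(laspeyres * paasche)


def geks(prices, quantities, base, current):
    """GEKS index between two periods, using every period as a link."""
    n_periods = len(prices)
    log_index = 0.0
    for link in range(n_periods):
        # Compare base -> link, then link -> current, on the log scale.
        log_index += math.log(
            fisher(prices[base], quantities[base], prices[link], quantities[link])
        )
        log_index += math.log(
            fisher(prices[link], quantities[link], prices[current], quantities[current])
        )
    # Geometric mean over all link periods.
    return math.exp(log_index / n_periods)


# Hypothetical data: 3 periods (rows) by 3 products (columns).
prices = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [1.3, 2.0, 3.2]]
quantities = [[10, 5, 2], [9, 6, 2], [8, 6, 3]]

direct = geks(prices, quantities, 0, 2)
chained = geks(prices, quantities, 0, 1) * geks(prices, quantities, 1, 2)
print(f"GEKS 0->2 direct:  {direct:.6f}")
print(f"GEKS 0->2 chained: {chained:.6f}")
```

The direct and chained figures agree because GEKS indexes are transitive. Each bilateral term is a ratio of sums over products, and the geometric averaging across link periods is what makes a product-by-product additive breakdown less straightforward than in the bilateral case.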

Journal of Survey Statistics and Methodology

Exploring the Potential of Novel Paradata in Respondent-Driven Sampling
Sunghee Lee, Leng Seong Che
Full text
This study extends the scope of the paradata discussion to respondent-driven sampling (RDS). Unlike traditional sampling, RDS relies on existing social networks within a target population. This unique process provides opportunities to produce novel paradata. Specifically, this study examined two types of paradata in RDS: one based on interviewer observations and the other based on recruitment behaviors ascertained from tracking recruitment coupons. We implemented these paradata features in two independent RDS surveys. In an in-person RDS survey of persons who inject drugs in Southeast Michigan, we implemented an interviewer observation questionnaire. This included questions about interviewers’ assessments of respondents’ understanding of coupon distribution instructions, as well as their expectations regarding respondents’ chances to recruit others and to return for a follow-up interview. These observations predicted recruitment success. In a Web-RDS study of Korean Americans, physical distance between linked respondents (such as a respondent and their recruiter) was determined by tracking recruitment coupons and geocoding respondent addresses. Greater geographic distance was associated with a higher likelihood of serious psychological distress. The results demonstrate that the unique features of RDS offer new avenues for utilizing paradata in both methodological and substantive research. These findings warrant further exploration and development of paradata specific to RDS.

Social Science Computer Review

Whose Centre Holds? White Normativity in Race Dimensions Across Word Embeddings
Nnaemeka Ohamadike, Kevin Durrheim, Mpho Primus
Full text
Bias in word embeddings is often measured using bipolar dimensions, constructed as the difference between two anchor centroids. This technique assumes both poles are symmetrical and equally informative. However, normativity literature shows that one category may function as the unmarked norm, with others framed as marked deviations. In race, whiteness typically holds the normative position, and embedding-based race dimensions may inherit the skew. We test this possibility using dimensions constructed from validated African–European name anchors, probed with neutral and valence words. In three embedding models (Wiki-News, South African news, Google News), we assess whether race dimensions favour whiteness as a normative anchor, whether this skew is stronger in culturally specific models (SA, Google), and whether bipolar offsets amplify one pole, given unipolar evidence. Results show that neutral and valence terms cluster nearer to the white pole (most strongly in the Wiki-News model), indicating whiteness as the semantic default. Overshoot favoured Black in Google and Wiki-News, while White overshoot only occurred in the South African model. We argue that this captures racialised variance where the pole with more spread tends to exert greater leverage on the bipolar axis. The study provides quantitative evidence of white-normative anchoring and diagnostics for asymmetric amplification in embedding-based bias measures.
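To illustrate the measurement setup this abstract describes, here is a minimal sketch of a bipolar dimension built as the difference between two anchor centroids, alongside the unipolar similarities to each centroid. The three-dimensional vectors and the generic pole labels are made up; real analyses would use vectors from a trained embedding model, and the paper's exact scoring may differ.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical anchor vectors for the two poles (toy values, not real embeddings).
pole_a_anchors = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]]
pole_b_anchors = [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]

centroid_a = centroid(pole_a_anchors)
centroid_b = centroid(pole_b_anchors)

# Bipolar dimension: difference between the two anchor centroids.
axis = [a - b for a, b in zip(centroid_a, centroid_b)]

# Hypothetical probe word vector.
probe = [0.6, 0.4, 0.5]

# Bipolar score: positive leans toward pole A, negative toward pole B.
bipolar_score = cosine(probe, axis)

# Unipolar evidence: similarity to each pole taken separately.
sim_a = cosine(probe, centroid_a)
sim_b = cosine(probe, centroid_b)

print(f"bipolar score:        {bipolar_score:.3f}")
print(f"similarity to pole A: {sim_a:.3f}")
print(f"similarity to pole B: {sim_b:.3f}")
```

Comparing the bipolar score with the two unipolar similarities is one simple way to see whether the difference axis exaggerates a lean that the separate similarities only weakly support.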