I checked 15 psychology journals on Saturday, February 28, 2026, using the Crossref API. For the period February 21 to February 27, I found 17 new papers in 6 journals.

Advances in Methods and Practices in Psychological Science

Beyond Statistical Myopia: Replying to a Misguided Critique of Mind–Body Research
Peter J. Aungle, Daniel L. Chen, Nicholas P. Holmes
Full text
In response to Gelman and Brown’s recent critique of Aungle and Langer, we argue that their article illustrates how narrow statistical reasoning and selective literature review can misrepresent and undermine credible scientific findings. Using their discussion of perceived time and physical healing as a case study, we identify three general problems: (a) a failure to accurately characterize the methods and results of the study they critiqued, (b) misinterpretations and omissions in their review of the relevant literature, and (c) a tendency to generalize from isolated statistical issues to sweeping claims about the invalidity of mind–body research. We adopt Gelman and Brown’s recommended model and find that the main effect remains robust. We also document errors in their interpretations of other cited studies and demonstrate that they ignore decades of rigorous, well-replicated research on placebo effects and health mindsets. By examining their critique in detail, we highlight how methodological skepticism, when untethered from accurate reading and balanced appraisal, can mislead rather than clarify.
Large Language Models as Psychological Simulators: A Methodological Guide
Zhicheng Lin
Full text
Large language models (LLMs) offer emerging opportunities for psychological and behavioral research, but methodological guidance is lacking. In this article, I develop a framework for using LLMs as psychological simulators across two primary applications: simulating roles and personas to explore diverse contexts, and serving as computational models to investigate cognitive processes. For simulation, the framework includes (a) an implementation-confound checklist distinguishing essential from context-dependent methodological checks, (b) methods for developing psychologically grounded personas that move beyond demographic categories, and (c) a three-tier validation framework (direct, indirect, and generative) tailored to data availability. A diagnostic decision framework guides researchers through establishing performance validity, identifying implementation artifacts, and interpreting LLM-human discrepancies. For cognitive modeling, I synthesize (a) emerging approaches for probing internal representations, (b) methodological advances in causal interventions, and (c) strategies for relating model behavior to human cognition. The framework addresses overarching challenges, including prompt sensitivity, temporal limitations from training-data cutoffs, and ethical considerations that extend beyond traditional human-subjects review. Open-weight models are the default for reproducibility. Together, this framework integrates emerging empirical evidence about LLM performance—including systematic biases, cultural limitations, and prompt brittleness—to help researchers wrangle these challenges and leverage the unique capabilities of LLMs in psychological research.

Behavior Research Methods

A multi-strategy cognitive diagnosis model based on response times and fixation counts
Junhuan Wei, Chun Wang, Yan Cai, Peida Zhan, Dongbo Tu
Full text
How plausible is my model? Assessing model plausibility of structural equation models using Bayesian posterior probabilities (BPP)
Ivan Jacob Agaloos Pesigan, Shu Fai Cheung, Huiping Wu, Florbela Chang, Shing On Leung
Full text
In structural equation modeling (SEM), one method to select the most plausible model from several candidates, or to compare one or more hypothesized models with similar alternatives on plausibility, is to compare the models using Bayesian posterior probability (BPP). BPP can be computed from Bayesian information criterion (BIC) scores (Wu et al., Multivariate Behavioral Research, 55(1), 1–16, 2020). This approach complements conventional goodness-of-fit indices such as the Comparative Fit Index (CFI), the root mean square error of approximation (RMSEA), and the standardized root mean square residual (SRMR) by giving a concise BPP for assessing uncertainty among all models considered. It can also reveal evidence against a model that is otherwise hidden by these indices. However, Wu et al. (2020) did not provide guidelines on deciding which models should be considered. To facilitate the use of BPP, we propose a novel method, called neighboring models, for selecting this set of models and helping researchers decide on the initial set. This method integrates seamlessly into the typical workflow for SEM analysis: researchers can fit a model as usual and then use the method to assess whether it is the most plausible model compared with its neighboring models. We believe the proposed method will make it easier for researchers to make better-informed decisions when evaluating their models. We also developed a user-friendly R package to automate all the steps: generating the set of neighboring models, fitting them, and computing the BPPs, all in a single function.
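As a minimal sketch of the computation the abstract refers to (not the authors' R package): under the standard approximation, each model's posterior probability is proportional to exp(−BIC/2), normalized over the candidate set, assuming equal prior probabilities. The BIC values below are hypothetical.

```python
import math

def bpp_from_bic(bic_scores):
    """Approximate Bayesian posterior probabilities (BPP) from BIC scores.

    Uses the standard approximation p(M_i | data) ∝ exp(-BIC_i / 2),
    normalized over the candidate model set (equal priors assumed).
    """
    best = min(bic_scores)  # subtract the minimum for numerical stability
    weights = [math.exp(-(b - best) / 2.0) for b in bic_scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical BIC values for three candidate SEMs
probs = bpp_from_bic([1010.0, 1012.3, 1020.5])
```

Only differences in BIC matter here, which is why subtracting the minimum changes nothing except avoiding underflow for large BIC values.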
A tutorial for software options to aid in assessing functional relations in single-case experimental designs
Rumen Manolov
Full text
Single-case experimental designs (SCEDs) can be used for identifying effective interventions via the intensive study of one or a few individuals in different conditions, actively manipulated by the researcher. The assessment of SCED data entails both judging whether there is sufficient evidence of a functional relation (i.e., a causal effect of the intervention on the target behavior) and quantifying the magnitude of the effect. In the current text, the focus is on assessing the presence of a functional relation, considering all the attempts to demonstrate an effect that SCEDs include. Specifically, the aim is to review several freely available websites, which require no additional software to be installed, and offer graphical representations of the data, visual aids, and quantifications. Several data analytical steps are outlined for performing the assessment, both dealing with each basic effect separately and evaluating the consistency of effects. Software that is useful for carrying out these steps is reviewed, including the way in which the data files should be specified and the few clicks required by applied researchers to achieve the desired output. The interpretations of the output are illustrated with real data.
Generalized least squares transformation for single-case experimental design: Introducing the R package lmeSCED
Chendong Li, Eunkyeng Baek, Wen Luo
Full text
Comparing effect latencies in the visual world paradigm: Monte Carlo simulations to assess resampling-based procedures
Serge Minor
Full text
In a series of Monte Carlo simulation studies, we evaluated the power and Type I error rates of resampling-based procedures for comparing effect latencies between groups in the visual world paradigm (VWP). Resampling-based methods, while versatile, are known to fail in certain cases. Therefore, validation of such methods through simulation is crucial. We compared permutation- and bootstrapping-based tests combined with different methods for measuring effect latency while manipulating sample size and true effect size. Alongside previously used latency measures, we tested new measures involving the application of an effect size threshold. Simulations were based on existing VWP datasets representing different effect types (preferential looks triggered by lexical vs. grammatical cues, cohort competitor effects in word recognition) and data collection methods (infrared- vs. webcam-based eye tracking). A total of 156,000 simulations were conducted across five studies, involving 548 million resampled datasets. The main findings are as follows: (1) With sufficient sample sizes, tests were effective in detecting latency differences of 200–300 ms in sentence processing tasks, and as small as 100 ms in word recognition. (2) The permutation test and bootstrapped percentile CIs exhibited the highest overall power without inflation of Type I error rates. (3) Applying an effect size threshold in latency estimation led to consistent increases in statistical power. (4) Resampling by participant was robust to increases in cross-subject variability; in contrast, bootstrapping within participants and time bins led to elevated Type I error rates. Based on these results, we offer recommendations for using non-parametric resampling-based procedures to compare group latencies in VWP experiments.
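To illustrate the permutation approach the abstract evaluates (a generic sketch with hypothetical data, not the authors' implementation or latency measure): the observed between-group difference in per-participant effect latencies is compared against a null distribution obtained by repeatedly shuffling group labels.

```python
import random

def permutation_test(latencies_a, latencies_b, n_perm=5000, seed=0):
    """Two-sided permutation test on the difference in mean effect latency.

    Shuffles group labels to build a null distribution of absolute mean
    differences, and returns the proportion of permuted differences at
    least as extreme as the observed one (an approximate p-value).
    """
    rng = random.Random(seed)
    observed = abs(sum(latencies_a) / len(latencies_a)
                   - sum(latencies_b) / len(latencies_b))
    pooled = list(latencies_a) + list(latencies_b)
    n_a = len(latencies_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of group labels
        diff = abs(sum(pooled[:n_a]) / n_a
                   - sum(pooled[n_a:]) / (len(pooled) - n_a))
        if diff >= observed:
            extreme += 1
    return extreme / n_perm

# Hypothetical per-participant effect latencies (ms) for two groups
p = permutation_test([420, 450, 390, 470, 430], [620, 580, 640, 600, 610])
```

Because the test conditions only on the pooled data, it makes no distributional assumptions; its validity rests on exchangeability of participants across groups under the null, which is why the simulation studies above matter.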
Collection of body–object interaction ratings for 5,637 Japanese words
Masaya Mochizuki, Naoto Ota
Full text

Computers in Human Behavior

Engaging with cybercriminals: phases and influence strategies in ransomware negotiations.
Michail Georgiou, Ellen Giebels, Miriam S.D. Oostinga, Remco Spithoven
Full text
Virtual peers reduce gambling symptoms and related problems of moderate-risk gamblers: A randomized controlled trial
Kenji Yokotani, Yosuke Seki, Nobuhito Abe, Masahiro Takamura, Tetsuya Yamamoto, Hideyuki Takahashi
Full text
Interdisciplinary perspectives and current findings on the role of trust as a psychological mediator in human interaction with artificial intelligence: Editorial overview
Irene Valori, Johannes Kraus, Merle T. Fairhurst
Full text

Journal of Experimental Social Psychology

Principles of nostalgia: Meta-analytic tests
Evan Weingarten, Ziwei Wei, Tim Wildschut, Constantine Sedikides
Full text

Journal of Personality and Social Psychology

Are the metatraits fact or artifact? Ruling out alternative explanations for the higher-order factors of the Big Five.
Colin G. DeYoung, Ming Him Tai, Edward Chou, Boris Mlačić
Full text
Confront in public, validate in private: Effective male allyship responses to sexist remarks.
Hsuan-Che (Brad) Huang, Jonathan B. Evans
Full text
Transcending embarrassment: On the reputational benefits of laughing at yourself.
Selin Goksel, Ovul Sezer, Jonathan Z. Berman
Full text
Femininity culture: Theory and workplace implications.
Andrea C. Vial, Marta Beneda
Full text

Personality and Social Psychology Bulletin

The Gendered Benefits of Communication Strategies: Women Leaders Are Less Effective but More Liked When They Use Prevention-Focused Language
M. Asher Lawson, Sandra C. Matz, Friedrich M. Götz, Ashley E. Martin
Full text
Research has identified a double-bind for female leaders: When acting in line with gender stereotypes, they are viewed as more likeable but less competent. Here, we test the impact of using gender stereotypical language—characterized by more prevention-focused language (e.g., avoiding risks) and less promotion-focused language (e.g., seeking gains)—on U.S. governors’ approval ratings during COVID-19 and their ability to promote effective social distancing behaviors. With a final dataset of 3,759 documents capturing governors’ communication, a 13-week panel of Google mobility data containing 6,534 observations (Study 1), U.S. nationally representative survey data from 57,532 participants (Study 2), and 24,247 tweets (Study 3), we find that female governors who use less prevention-focused, stereotypical language in their communications are more effective at increasing compliance with social distancing measures but receive lower approval ratings. As such, women leaders’ necessary approaches in crisis situations may undermine their sustainability in positions of power.