Dear Editor,

We thank the correspondents for their thoughtful and constructive comments regarding our recently published article titled “Frequency of and factors associated with social problems among working-age patients with cancer in Japan.” We appreciate the opportunity to clarify several methodological aspects of our analyses. Our clarifications primarily concern the specification of logistic regression models, handling of missing data, considerations related to events per variable, multiple testing, and the interpretation of associations in the context of a cross-sectional design. Each of these points is addressed below.

1. Logistic regression analysis

We thank the correspondents for highlighting the apparent inconsistency in our description of the logistic regression analysis. In our study, 18 covariates were specified a priori as candidate variables based on the findings of previous studies and their clinical or sociodemographic relevance to psychosocial distress and social problems among patients with cancer. These variables were entered into the model specification as potential predictors. The models were estimated using a likelihood ratio–based stepwise procedure as implemented in SPSS. Although all candidate variables were initially considered, the final odds ratios and confidence intervals reported in the article corresponded to the variables that were retained in the final step of the stepwise selection process, rather than to a forced-entry model including all 18 variables.

We acknowledge that the wording in the Methods section was imprecise and may have given the impression that all variables were forcibly retained in the final models. We appreciate the opportunity to clarify this point.

The objective of the multivariable analyses was exploratory. We aimed to identify factors associated with the severity of specific social problems rather than to construct a parsimonious prediction model.
As noted in the Discussion section, the limitations of stepwise procedures include the potential instability of estimates and inflation of type I error, particularly in the context of multiple outcomes and substantial missing data. These limitations should be considered when interpreting individual associations.

2. Missing data and listwise deletion

We thank the correspondents for their insightful remarks concerning the management of missing data and the implementation of listwise deletion. As described in the Methods section, all questionnaire items required a response before participants could progress to the next item. However, response options for several variables, such as “do not want to answer,” “do not know,” and “not applicable,” were intentionally provided to allow participants to opt out of questions that they considered sensitive or not relevant to their personal circumstances. These responses were treated as missing for the purposes of multivariable logistic regression.

We acknowledge that this approach resulted in substantial proportions of missing data for some outcome-specific models, particularly for items applicable only to specific subgroups (e.g., sexual life, fertility-related concerns, and partner relationships). As this meant that the effective sample size varied across models, listwise deletion may have reduced the statistical power and introduced bias if the data were not missing completely at random.

At the time of study design and analysis, listwise deletion was chosen to ensure transparency and interpretability because of the large number of outcome-specific models and the complex structure of skip patterns inherent in the questionnaire. The application of multiple imputation across a series of more than 20 dichotomized outcomes, several of which were conditional on marital or parental status, was considered methodologically challenging.
Such an approach entailed the risk of introducing additional assumptions that might not have been verifiable. Therefore, the regression results were interpreted as exploratory and hypothesis-generating, rather than as definitive estimates of effect size. This limitation is explicitly acknowledged in the Discussion section, and readers are advised to interpret the associations with caution, particularly for outcomes with high proportions of missing data. Detailed information on the number of participants included in each outcome-specific model and the corresponding number of missing cases is provided in the publicly available summary data set referenced in the Data Sharing Statement.

3. Events per variable (EPV)

We thank the correspondents for highlighting the important issue of EPV in relation to our multivariable logistic regression analyses. In the planning phase of our study, sample size considerations were guided by commonly cited rules of thumb for multivariable regression, including recommendations suggesting approximately 10 cases per independent variable. We acknowledge that EPV is a more appropriate metric in logistic regression than total sample size, and that low EPV can lead to unstable estimates and wide confidence intervals.

As noted by the correspondents, the prevalence of some severe social problems was relatively low, and the effective sample size was further reduced in certain models because of outcome-specific missing data. As a result, EPV varied substantially across models and could have fallen below conventional benchmarks for some outcomes. This may partly explain the large odds ratios and wide confidence intervals observed for certain predictors.

We therefore emphasize that the multivariable analyses were intended to be exploratory and descriptive, with the aim of identifying patterns of co-occurring vulnerabilities rather than providing precise or definitive effect estimates.
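
For readers who wish to check EPV for a given outcome-specific model, the calculation can be sketched as follows. This is a minimal illustration and not the procedure used in our analyses: the function name and the event counts are hypothetical, and only the figure of 18 candidate covariates is taken from the description above. The sketch assumes that, under stepwise selection, all candidate predictors are counted in the denominator rather than only those retained, and that each predictor contributes a single parameter (a categorical variable with several levels would contribute more).

```python
def events_per_variable(n_events: int, n_nonevents: int, n_candidate_predictors: int) -> float:
    """Return events per variable (EPV) for a binary-outcome model.

    The numerator is the size of the smaller outcome group; the
    denominator is the number of candidate predictors considered
    (an assumption appropriate to stepwise selection).
    """
    if n_candidate_predictors <= 0:
        raise ValueError("n_candidate_predictors must be positive")
    if n_events < 0 or n_nonevents < 0:
        raise ValueError("counts must be non-negative")
    return min(n_events, n_nonevents) / n_candidate_predictors


# Hypothetical counts for one outcome after listwise deletion
# (illustrative only; not taken from the study data).
epv = events_per_variable(n_events=90, n_nonevents=410, n_candidate_predictors=18)
print(f"EPV = {epv:.1f}")  # EPV = 5.0, below the conventional benchmark of 10
```

Applied to the complete-case counts of each model, a check of this kind would show which outcomes fall below the benchmark once missing responses are excluded.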
Our interpretation focused on the consistency and plausibility of associations across related social problem domains, rather than on the magnitude of individual odds ratios.

4. Multiple testing

We thank the correspondents for their comment on the issue of multiple testing that arose from the large number of outcome-specific logistic regression analyses. Our analytic strategy involved examining the associations between a common set of candidate predictors and multiple, distinct social problem outcomes. Each outcome represented a different domain of patients' social experiences. Although these outcomes were conceptually related, they were not interchangeable. The analyses were conducted to characterize patterns of association across domains rather than to formally test a single global hypothesis.

We acknowledge that conducting multiple multivariable analyses increased the likelihood of chance findings, especially in an exploratory setting and when combined with stepwise procedures. For this reason, we did not interpret isolated statistically significant associations as definitive evidence. Instead, we focused on the consistency, directionality, and clinical plausibility of the observed associations across related outcomes.

We did not apply formal statistical adjustment for multiplicity in the multivariable analyses because our primary aim was descriptive and hypothesis-generating rather than confirmatory. However, we agree that readers should interpret individual p-values with caution, and that further confirmatory studies designed around a smaller number of prespecified outcomes and hypotheses would benefit from more formal multiplicity control.

5. Conceptual overlap and causal interpretation

We thank the correspondents for their thoughtful comments regarding the potential for conceptual overlap between certain predictors and outcomes, and for noting the risk of causal overinterpretation given our study's cross-sectional design.
We agree that some variables included as predictors, such as work status or changes in work status, were conceptually similar to specific social problem outcomes related to employment. However, we included these variables to capture different aspects of patients' lived experiences at a single point in time, not to imply a causal direction. Because our study used a cross-sectional design, the temporal ordering of psychological, clinical, and social factors could not be determined. Therefore, we did not intend to infer causality from the observed associations. Throughout the article, we framed our findings in terms of associations and co-occurring vulnerabilities rather than causal effects.

6. Conclusion

We appreciate the correspondents' careful methodological considerations and agree that transparent reporting and rigorous analytical strategies are essential to advancing research on the social challenges faced by patients with cancer. We view our exploratory study as a contribution that highlights the breadth and complexity of the social problems experienced by patients of working age in Japan. Our study also underscores the need for more refined methodological approaches in subsequent work. Further studies using longitudinal designs, prospectively defined hypotheses, larger event counts for specific outcomes, and advanced methods for handling missing data will be well positioned to clarify temporal relationships and causal pathways.

We hope this dialog stimulates further methodologically robust research aimed at improving psychosocial care and social support for working-age patients with cancer.

Yours sincerely,
Kazuho Hisamura, MSW, PhD (on behalf of the authors)