4.1 Demographic and Clinical Characteristics of the Study Cohort
A total of 400 women from two geographically distinct regions of Bangladesh participated in the present validation study. The cohort represented a diverse community-based population encompassing women from both northern and southern rural settings, thereby allowing the assessment model to be evaluated within varying sociocultural and healthcare-access environments. The mean age of the participants was 35.9 years, placing the cohort within an age range generally considered relevant for early cervical cancer risk surveillance and reproductive health monitoring (Table 9).
Age distribution analysis demonstrated that most participants were between 30 and 39 years of age (55.0%), followed by women aged 40–49 years (30.0%). Comparatively fewer participants were younger than 30 years (12.5%), while only a small proportion were aged 50 years or older (2.5%) (Table 9). This age pattern is noteworthy because cervical cancer risk has repeatedly been shown to increase with age, particularly among women with prolonged exposure to persistent high-risk HPV infection and cumulative reproductive risk factors (Arbyn et al., 2020).
With respect to marital and sexual activity status, the overwhelming majority of participants were categorized as sexually active (86.25%), whereas only 13.75% reported being sexually inactive (Table 9). Since HPV transmission remains closely associated with sexual exposure, this finding reflects a population potentially vulnerable to cervical cancer-associated risk accumulation over time.
A particularly striking observation emerged in relation to age at sexual debut. Approximately two-thirds of participants (67.5%) reported initiating sexual activity before the age of 15 years, while only 32.5% reported sexual debut at or beyond 15 years of age (Table 9). Early sexual debut has long been recognized as an important epidemiological risk factor because the immature cervical epithelium may be more susceptible to persistent HPV
Table 5. Family Risk Factor (FRF) Weighting Scheme by Familial Relationship. This table presents the differential familial weights assigned to first- and second-degree relatives within the FRF domain of the CerviCheck model. First-degree relatives (mother, sister, daughter) are assigned a weight of 2, reflecting closer genetic proximity, while second-degree relatives (maternal and paternal grandmothers and aunts) receive a weight of 1. These weights are multiplied by the presence or absence of a cervical cancer history in each relative to derive the cumulative family risk score (CFRF). Note. FRF = Family Risk Factor; CFRF = Cumulative Family Risk Factor Score. First-degree weighting reflects established patterns of hereditary and shared environmental risk.
| Family Members | Weights |
|---|---|
| Mother | 2 |
| Sister | 2 |
| Daughter | 2 |
| Grandmother (Maternal) | 1 |
| Aunt (Maternal) | 1 |
| Grandmother (Paternal) | 1 |
| Aunt (Paternal) | 1 |
Table 6. Risk Category Classification Based on Cumulative FRF Score. This table defines the three-category classification applied to cumulative FRF scores. A score of 0 indicates Low familial risk (L); scores of 1–2 indicate Moderate risk (M); and scores exceeding 2 indicate High familial risk (H). These thresholds were derived from the weighted family history scoring architecture and are intended to capture meaningful gradations in hereditary predisposition to cervical cancer.
| Score | FRF Category |
|---|---|
| 0 | Low Risk (L) |
| 1–2 | Moderate Risk (M) |
| > 2 | High Risk (H) |
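As an illustrative sketch only, and not the CerviCheck implementation itself, the weighting in Table 5 and the thresholds in Table 6 can be expressed in Python as follows. The dictionary keys and function names are hypothetical, and the sketch assumes that each relationship type is counted once as present or absent.

```python
# Weights from Table 5: first-degree relatives count 2, second-degree count 1.
# The key names are hypothetical labels chosen for this sketch.
FRF_WEIGHTS = {
    "mother": 2,
    "sister": 2,
    "daughter": 2,
    "maternal_grandmother": 1,
    "maternal_aunt": 1,
    "paternal_grandmother": 1,
    "paternal_aunt": 1,
}


def cumulative_frf_score(affected_relatives):
    """Sum the Table 5 weights over relatives with a cervical cancer history.

    Assumption: each relationship type is scored once (present or absent).
    """
    affected = set(affected_relatives)
    unknown = affected.difference(FRF_WEIGHTS)
    if unknown:
        raise ValueError(f"Unknown relationship(s): {sorted(unknown)}")
    return sum(FRF_WEIGHTS[relative] for relative in affected)


def frf_category(score):
    """Map a cumulative FRF score to the Table 6 category (L, M, or H)."""
    if score < 0:
        raise ValueError("FRF score cannot be negative")
    if score == 0:
        return "L"
    if score <= 2:
        return "M"
    return "H"


# Example: an affected mother alone scores 2 (Moderate); adding an affected
# maternal aunt raises the score to 3 (High).
print(frf_category(cumulative_frf_score(["mother"])))                   # M
print(frf_category(cumulative_frf_score(["mother", "maternal_aunt"])))  # H
```

Under this reading, a single affected first-degree relative is sufficient for the Moderate category, whereas the High category requires at least two affected relatives.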
Table 7. Cervical Cancer Symptom Assessment (CCSA) Scoring Criteria and Item Descriptions. This table presents the symptom-based assessment domain of the CerviCheck model, including specific gynecological symptoms evaluated for their clinical relevance to cervical cancer. Reported symptoms—including inter-menstrual bleeding, post-coital bleeding, post-menopausal bleeding, persistent malodorous vaginal discharge, and dyspareunia—are each assigned a score of 1, while asymptomatic status is scored as 0. HPV vaccination history is additionally incorporated, with unvaccinated status assigned a score of 1 and confirmed vaccination status assigned 0, reflecting the known prophylactic benefit of HPV immunization against high-risk cervical neoplasia. Note. CCSA = Cervical Cancer Symptom Assessment; HPV = Human Papillomavirus. Any participant reporting one or more symptomatic items is automatically escalated to the High-Risk overall category, irrespective of PRF or FRF scores.
| Attributes | Variable | CCSA Score |
|---|---|---|
| Which of the following symptoms do you currently have? | SYMPTOMS | |
| Inter-menstrual bleeding | | 1 |
| Persistent smelly vaginal discharge | | 1 |
| Discomfort or pain during sexual intercourse | | 1 |
| Post-coital bleeding | | 1 |
| Post-menopausal bleeding | | 1 |
| Lack of symptoms from genital areas | | 0 |
| Have you ever been vaccinated against HPV? | VACCINE | |
| Yes | | 0 |
| No | | 1 |
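The item scoring in Table 7 can be sketched as shown below. This is a minimal illustration under stated assumptions: the symptom identifiers and function names are hypothetical, and the vaccination item is kept separate from the symptom count because the table does not state whether the two are summed.

```python
# The five symptom items from Table 7, each scored 1 when reported.
# Identifier names are hypothetical labels chosen for this sketch.
SYMPTOM_ITEMS = (
    "inter_menstrual_bleeding",
    "persistent_malodorous_discharge",
    "dyspareunia",
    "post_coital_bleeding",
    "post_menopausal_bleeding",
)


def symptom_score(reported_symptoms):
    """Count the reported Table 7 symptoms (0 means asymptomatic)."""
    reported = set(reported_symptoms)
    unknown = reported.difference(SYMPTOM_ITEMS)
    if unknown:
        raise ValueError(f"Unknown symptom(s): {sorted(unknown)}")
    return len(reported)


def vaccine_score(vaccinated):
    """Score HPV vaccination status: 0 if vaccinated, 1 if not."""
    return 0 if vaccinated else 1


def is_symptomatic(reported_symptoms):
    """True when one or more symptomatic items are reported."""
    return symptom_score(reported_symptoms) >= 1


print(symptom_score(["post_coital_bleeding", "dyspareunia"]))  # 2
print(is_symptomatic([]))                                      # False
print(vaccine_score(False))                                    # 1
```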
Table 8. Final Integrated Risk Assessment Framework Combining PRF, FRF, and CCSA Domains. This table presents the composite risk stratification algorithm governing overall cervical cancer risk classification within the CerviCheck model. Overall risk category—Low, Moderate, or High—is derived from the combinatorial interaction of CCSA, PRF, and FRF scores. Participants reporting any cervical cancer symptom (CCSA ≥ 1) are automatically classified as High Risk and recommended for intensive clinical screening. Among asymptomatic participants, overall risk is determined by PRF and FRF category combinations, with High or Moderate–High PRF-FRF profiles triggering escalated screening recommendations. All risk categories are paired with structured clinical recommendations to guide appropriate referral and follow-up action. Note. PRF = Personal Risk Factor; FRF = Family Risk Factor; CCSA = Cervical Cancer Symptom Assessment; L = Low; M = Moderate; H = High. Intensive screening referral is indicated for all High-Risk classifications regardless of symptom status.
| CCSA Category | PRF Category | FRF Category | Overall Risk Category | Recommendation |
|---|---|---|---|---|
| 0 | 0–4 | L | Low | Regular screening |
| 0 | 5–9 | H/M | Moderate | Regular screening |
| 0 | ≥10 | H/M | High | Intensive screening |
| 1+ | – | – | High | Intensive screening |
| 0 | – | – | Low | Regular screening |
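One possible reading of the decision logic in Table 8 is sketched below. It is an interpretation rather than the deployed algorithm: the function name and argument names are hypothetical, the symptom rule is evaluated first (as the caption describes), and the final table row is treated as a catch-all for asymptomatic combinations not listed explicitly.

```python
def overall_risk(symptom_count, prf_score, frf_cat):
    """Return (overall risk category, recommendation) following Table 8.

    symptom_count: number of reported symptom items (0 if asymptomatic)
    prf_score:     numeric Personal Risk Factor score
    frf_cat:       Family Risk Factor category, one of "L", "M", "H"
    """
    if frf_cat not in ("L", "M", "H"):
        raise ValueError("frf_cat must be 'L', 'M', or 'H'")

    # Any reported symptom escalates directly to High risk.
    if symptom_count >= 1:
        return ("High", "Intensive screening")

    elevated_family_risk = frf_cat in ("M", "H")

    # Asymptomatic, PRF >= 10 with Moderate or High FRF.
    if prf_score >= 10 and elevated_family_risk:
        return ("High", "Intensive screening")

    # Asymptomatic, PRF 5-9 with Moderate or High FRF.
    if 5 <= prf_score <= 9 and elevated_family_risk:
        return ("Moderate", "Regular screening")

    # Assumption: every remaining asymptomatic combination falls through
    # to the last table row (Low risk).
    return ("Low", "Regular screening")


print(overall_risk(0, 3, "L"))   # ('Low', 'Regular screening')
print(overall_risk(0, 7, "M"))   # ('Moderate', 'Regular screening')
print(overall_risk(0, 12, "H"))  # ('High', 'Intensive screening')
print(overall_risk(2, 0, "L"))   # ('High', 'Intensive screening')
```

Ordering the checks from most to least urgent keeps the symptom override from being masked by a low PRF or FRF profile.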
Table 9. Baseline Sociodemographic and Reproductive Characteristics of the Study Cohort (N = 400). This table summarizes the distribution of key sociodemographic, behavioral, and reproductive characteristics among the 400 women enrolled in the community-based cross-sectional validation study. Participants were recruited from two geographically and socioeconomically distinct regions of Bangladesh (Kurigram and Shyamnagar) in collaboration with Friendship NGO. Characteristics include age distribution, sexual activity based on marital status, age at sexual debut, parity, total number of marriages, history of abdominal or pelvic surgery, contraceptive use pattern, and tobacco or smoking exposure. Frequencies and proportions are reported for all categorical variables. The mean participant age was 35.9 years. Note. N = total study sample; % = percentage of total participants. MRSTA = Marital Status-Based Sexual Activity; SXDBT = Age at Sexual Debut; PARITY = number of full-term deliveries; TLMAR = Total Number of Marriages; isPLV = History of Abdominal or Pelvic Surgery; CNTRCEP = Contraceptive Use Status; SMOKE = Smoking or Tobacco Use History.
| Variables | No. (%) |
|---|---|
| Total Participants | 400 |
| Mean Age (years) | 35.9 |
| AGE | |
| < 30 years | 50 (12.5%) |
| 30–39 years | 220 (55.0%) |
| 40–49 years | 120 (30.0%) |
| ≥ 50 years | 10 (2.5%) |
| MRSTA (Sexual Activity Based on Marital Status) | |
| Sexually inactive | 55 (13.75%) |
| Sexually active | 345 (86.25%) |
| SXDBT (Age at Sexual Debut) | |
| ≥ 15 years | 130 (32.5%) |
| < 15 years | 270 (67.5%) |
| PARITY | |
| 0 | 80 (20.0%) |
| 1–3 | 153 (38.25%) |
| > 4 | 167 (41.75%) |
| TLMAR (Total Number of Marriages) | |
| 0 | 20 (5.0%) |
| 1 | 153 (38.25%) |
| > 1 | 227 (56.75%) |
| isPLV (History of Abdominal/Pelvic Surgery) | |
| No | 395 (98.75%) |
| Yes | 5 (1.25%) |
| CNTRCEP (Contraceptive Use) | |
| Never used | 68 (17.0%) |
| Currently using | 232 (58.0%) |
| Previously used | 100 (25.0%) |
| SMOKE (Smoking/Tobacco Use) | |
| Never used | 180 (45.0%) |
| Currently using | 133 (33.25%) |
| Previously used | 87 (21.75%) |
infection and subsequent neoplastic transformation (Arbyn et al., 2020). The relatively high proportion observed in this cohort may reflect sociocultural and early-marriage dynamics frequently encountered in resource-limited settings.
Reproductive history analysis further demonstrated a substantial prevalence of high parity among participants. While 20.0% of women reported no childbirth history, 38.25% had experienced one to three childbirths, and 41.75% reported more than four childbirths (Table 9). This trend is particularly relevant because multiple full-term pregnancies have previously been associated with increased cervical cancer susceptibility, potentially due to hormonal influences, prolonged cervical transformation zone exposure, and cumulative reproductive stressors.
Marriage history within the cohort also revealed notable patterns. Only 5.0% of participants had never been married, whereas 38.25% reported one marriage and 56.75% reported multiple marriages (Table 9). Although marital status alone is not directly causative, repeated marital transitions may indirectly reflect cumulative sexual exposure and increased opportunities for HPV transmission.
Regarding prior abdominal or pelvic surgical history, almost all participants (98.75%) reported no history of pelvic or abdominal surgery, while only 1.25% reported previous procedures (Table 9). Although prior surgery was uncommon in the present cohort, the variable was retained in the model because previous pelvic interventions may occasionally influence gynecological health status or symptom interpretation.
Patterns related to contraceptive use demonstrated substantial variability. Current contraceptive use was reported by 58.0% of participants, while 25.0% indicated previous use and 17.0% reported never using contraceptives (Table 9). Long-term hormonal contraceptive exposure has previously been discussed within cervical cancer epidemiological literature, although findings remain somewhat heterogeneous across different populations and healthcare contexts (Ghebre et al., 2017).
Smoking and tobacco exposure, another recognized behavioral risk factor for cervical cancer, was also common in the study population. More than half of participants reported current (33.25%) or former (21.75%) tobacco use, whereas 45.0% reported never using tobacco products (Table 9). Tobacco-related carcinogenic exposure may contribute to cervical epithelial damage and impaired local immune responses, thereby potentially facilitating HPV persistence and disease progression.
Overall, the cohort characteristics revealed a population exhibiting multiple overlapping behavioral, reproductive, and demographic risk factors traditionally associated with cervical cancer susceptibility. Importantly, these findings underscore the practical need for accessible community-level risk assessment systems capable of identifying vulnerable individuals before progression toward advanced disease states.
4.2 Performance of the Questionnaire-Based Risk Assessment Model
The primary objective of this study was to evaluate whether a culturally tailored questionnaire-driven mHealth model could reasonably identify women at elevated cervical cancer risk within a resource-constrained population. To assess this, the performance of the model was compared against physician-led clinical risk evaluation conducted during the validation phase.
Among the 400 participants included in the analysis, 22 women were clinically categorized as having elevated cervical cancer risk based on physician assessment. In contrast, the questionnaire-driven model identified 69 participants as belonging to moderate- or high-risk categories. Although the model generated more positive classifications than physician evaluation, this discrepancy was partly by design, because the framework emphasizes sensitivity over specificity.
From a public health perspective, minimizing false-negative outcomes was considered particularly important. Missing potentially high-risk individuals in low-resource settings could delay referral and intervention, ultimately contributing to poorer clinical outcomes. Consequently, the model was intentionally calibrated to function conservatively, favoring broader identification of potentially vulnerable individuals over overly restrictive classification thresholds.
Performance evaluation demonstrated that the proposed model achieved an overall accuracy of 88.25%. Sensitivity reached 100%, indicating that all clinically identified at-risk cases were successfully captured by the questionnaire-based system. Specificity was estimated at approximately 87.5%, reflecting a relatively strong ability to correctly identify low-risk individuals while still maintaining conservative screening behavior.
The positive predictive value (PPV) of the model was calculated at 31.9%, whereas the negative predictive value (NPV) reached 100%. The exceptionally high NPV is particularly noteworthy because it suggests that participants categorized as low risk by the model were highly unlikely to belong to clinically elevated-risk groups. In practical community-health contexts, this characteristic may hold substantial value by helping prioritize limited clinical resources toward women requiring further evaluation.
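The reported metrics can be reproduced from the counts given above, under the assumption that physician assessment serves as the reference standard and that all 22 clinically identified cases were among the 69 participants flagged by the model (as implied by the 100% sensitivity). Under that assumption the confusion matrix contains 22 true positives, 47 false positives, 0 false negatives, and 331 true negatives. The function name below is illustrative.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute standard screening metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts derived from the text: 400 participants, 22 clinically at risk,
# 69 flagged by the model, and no at-risk case missed (assumed from the
# reported 100% sensitivity).
total, at_risk, flagged = 400, 22, 69
tp = at_risk                    # 22
fn = 0
fp = flagged - tp               # 47
tn = total - at_risk - fp       # 331

metrics = screening_metrics(tp, fp, tn, fn)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
# accuracy: 0.8825
# sensitivity: 1.0000
# specificity: 0.8757
# ppv: 0.3188
# npv: 1.0000
```

The low PPV alongside the high NPV follows directly from the low proportion of clinically elevated-risk cases (22 of 400): even a modest false-positive rate produces more false positives than true positives.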
4.3 Contribution of PRF, FRF, and Symptom Assessment Variables
To further explore the influence of individual risk domains on final model prediction, the effects of Personal Risk Factors (PRF), Family Risk Factors (FRF), and Cervical Cancer Symptom Assessment (CCSA) variables were examined separately.
The analysis suggested that PRF and symptom-related variables contributed substantially to overall risk classification, whereas FRF appeared to exert comparatively weaker influence within the present cohort. This observation may partly reflect the relatively limited number of participants reporting confirmed familial cervical cancer history. It is also possible that awareness regarding family cancer history remained incomplete among some participants, particularly within rural communities where diagnostic confirmation and medical record accessibility may historically have been limited.
Symptom-related variables appeared especially influential during risk prediction. Participants reporting abnormal bleeding patterns, persistent vaginal discharge, or post-coital symptoms were considerably more likely to fall into moderate- or high-risk categories. Clinically, this finding aligns with previous evidence indicating that symptom recognition remains a critical component of cervical cancer triage and early detection efforts (Ghebre et al., 2017).
4.4 Logistic Regression and Predictive Analysis
To better understand the predictive behavior of the model, multivariate logistic regression analysis incorporating 17 variables was subsequently performed. The regression analysis further reinforced the importance of symptom-related variables within the overall predictive structure.
The logistic regression model demonstrated an overall accuracy of 85.0%, accompanied by a sensitivity of 95.92% and specificity of 99.64%. These findings suggest that symptom-driven and behavioral variables retained strong discriminatory capacity even when analyzed collectively within a multivariable framework.
Receiver operating characteristic (ROC) curve analysis additionally demonstrated robust predictive performance. The area under the ROC curve (AUC) reached 0.987, indicating excellent discrimination between clinically elevated-risk and low-risk individuals (Figure 4). An AUC value approaching 1.0 generally reflects highly effective classification performance, suggesting that the integrated questionnaire framework possessed considerable ability to differentiate between varying levels of cervical cancer risk within the study population.
The ROC findings also support the broader conceptual rationale underlying the study—that structured symptom assessment combined with demographic and behavioral risk profiling may provide meaningful preliminary screening support in settings where conventional diagnostic infrastructure remains limited.
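For readers less familiar with the AUC, it can be interpreted as the probability that a randomly chosen elevated-risk participant receives a higher model score than a randomly chosen low-risk participant. The sketch below computes this quantity by direct pairwise comparison. The scores and labels are toy values invented purely for illustration and are not study data; the function name is likewise hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("Both classes must be present")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (not study data): three elevated-risk and three low-risk cases.
toy_scores = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
print(round(roc_auc(toy_scores, toy_labels), 3))  # 0.889
```

In the toy example, eight of the nine positive-negative pairs are ordered correctly, giving an AUC of 8/9.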
4.5 Internal Validation and Model Stability
Internal cross-validation was conducted to examine the stability and consistency of model performance across the dataset. The analysis yielded an overall accuracy of 95.98%, suggesting that the framework maintained relatively stable predictive behavior across repeated validation iterations.
Nevertheless, these findings should be interpreted cautiously. Although the internal validation results were encouraging, the current cohort size remained relatively modest, and the number of clinically elevated-risk cases was comparatively limited. Therefore, while the present results suggest promising screening utility, larger multicenter studies involving more demographically diverse populations will likely be necessary before broader generalization can confidently be established.
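Because elevated-risk cases were few, the way folds are constructed can materially affect cross-validation estimates. The sketch below shows one common approach, stratified fold assignment, which spreads the rare class evenly across folds. It is offered only as an illustration: the study does not report the number of folds or the splitting scheme, so the choice of five folds, the random seed, and the function name are all assumptions.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign indices to k folds so each class is spread evenly across folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Class balance matching the cohort: 22 elevated-risk cases among 400 women.
labels = [1] * 22 + [0] * 378
folds = stratified_folds(labels, k=5)
for fold in folds:
    n_pos = sum(labels[i] for i in fold)
    print(len(fold), n_pos)
```

With 22 positive cases and five folds, each fold receives four or five positive cases, so no fold is left without any elevated-risk participant.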
4.6 Performance of the CerviCheck mHealth Platform
Beyond numerical model performance, the practical deployment characteristics of the CerviCheck application were also evaluated during field implementation (Figure 2; Figure 3). Overall, participants demonstrated favorable engagement with the platform, particularly when
Figure 1. Algorithmic Flowchart Illustrating the Integrated Risk Stratification and Clinical Recommendation Pathway of the CerviCheck Model. This figure presents the stepwise decision logic governing the CerviCheck cervical cancer risk assessment framework. The algorithm begins with sequential evaluation of the three principal risk domains: Cervical Cancer Symptom Assessment (CCSA), Personal Risk Factor (PRF) scoring, and Family Risk Factor (FRF) scoring. Participants reporting any cervical cancer-related symptom are immediately classified as High Risk and directed toward intensive clinical screening. For asymptomatic participants, final risk categorization is determined through the combinatorial integration of PRF and FRF scores, resulting in Low, Moderate, or High overall risk classifications paired with tailored screening recommendations. The flowchart delineates all possible decision branches, threshold-based score transitions, and recommended clinical actions, thereby illustrating the operational architecture of the model as implemented within the CerviCheck mHealth application. Note. CCSA = Cervical Cancer Symptom Assessment; PRF = Personal Risk Factor; FRF = Family Risk Factor. Regular Screening = routine cervical cancer screening per national guidelines; Intensive Screening = expedited clinical referral and follow-up evaluation.

Figure 2. Overview of the CerviCheck mHealth Application Developed for Culturally Tailored Cervical Cancer Early Detection in Bangladeshi Women. This figure provides a representative overview of CerviCheck, an Android-compatible mobile health application designed to deliver the integrated cervical cancer risk assessment model in a user-accessible digital format. The application incorporates multilingual support (Bengali and English), simplified user navigation, structured questionnaire delivery, automated risk score computation, and integrated educational content pertaining to cervical cancer awareness and prevention. The application architecture was designed to maintain lightweight deployment and operational functionality within low-bandwidth environments characteristic of rural and peri-urban settings in Bangladesh. Data encryption and role-based access controls are embedded within the application to ensure participant confidentiality and data security throughout the assessment process. Note. mHealth = Mobile Health. CerviCheck was developed and deployed as an Android-compatible application to maximize accessibility across diverse socioeconomic and geographical contexts within Bangladesh.

Figure 3. Representative User Interface Screens of the CerviCheck mHealth Application Demonstrating the Questionnaire Workflow, Risk Output Display, and Educational Content Modules. This figure illustrates key interface components of the CerviCheck application as encountered by end users during the risk assessment process. Displayed screens include representative questionnaire input panels corresponding to the PRF, FRF, and CCSA domains, automated risk category output screens conveying individualized Low, Moderate, or High risk classifications, and integrated health education modules promoting cervical cancer awareness and preventive practices. The interface design prioritizes visual clarity, minimal literacy requirements, and culturally appropriate iconography to facilitate usability across diverse participant groups, including women with limited formal education in rural community settings. Both Bengali and English language versions are accessible within the application. Note. PRF = Personal Risk Factor; FRF = Family Risk Factor; CCSA = Cervical Cancer Symptom Assessment. Interface screens depicted are representative examples from the validated application version used during the community-based study.

Figure 4. Receiver Operating Characteristic (ROC) Curve (AUC = 0.987).
questionnaires were administered in Bengali and supported by trained female interviewers.
The bilingual interface and simplified navigation design appeared to improve accessibility among women with varying educational backgrounds. Importantly, the mobile-based structure allowed assessments to be conducted outside traditional clinical settings, potentially reducing some of the sociocultural discomfort often associated with invasive gynecological screening procedures.
Taken together, the findings suggest that culturally adapted digital risk assessment platforms may hold meaningful potential as supportive tools for community-level cervical cancer awareness, triage, and early referral initiatives in underserved populations.