Exploratory factor analysis on the development and validation of the understanding, attitude, practice and health literacy questionnaire on COVID-19 in Malay language

Dalawi, Izzaty; Isa, Mohamad Rodi; Aimran, Nazim

doi:10.1038/s41598-025-04517-z

Article
Open access
Published: 04 June 2025

Izzaty Dalawi¹,
Mohamad Rodi Isa² &
Nazim Aimran³

Scientific Reports volume 15, Article number: 19654 (2025)

Abstract

Coronavirus Disease 2019 (COVID-19) is an emerging respiratory illness caused by a novel coronavirus known as Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). To effectively assess understanding, attitudes, practices, and health literacy regarding COVID-19, a valid and reliable questionnaire needs to be developed. Therefore, this study aims to develop and validate the Understanding, Attitude, Practice and Health Literacy Questionnaire on COVID-19 (MUAPHQ C-19) in the Malay language. A cross-sectional study was conducted across three hospitals in the Klang Valley from June 20, 2022, to June 30, 2022. The target population consisted of Malaysian citizens aged 18 years and older who attended the three hospitals. A systematic random sampling method was employed to select participants. The original MUAPHQ C-19 questionnaire consists of four domains: understanding (12 items), attitude (15 items), practice (11 items), and health literacy (12 items). The validity of the questionnaire was assessed using exploratory factor analysis (EFA), and reliability was assessed using Cronbach’s alpha and the intraclass correlation coefficient (ICC). A factor loading threshold of 0.50 was set. A Cronbach’s alpha value greater than 0.6 was considered acceptable, and an ICC value greater than 0.75 indicated good reliability. A total of 100 respondents participated in the study. The Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy ranged from 0.691 to 0.899, and Bartlett’s Test of Sphericity was significant across all domains, indicating that the sample was adequate for the EFA. Using the criterion of an eigenvalue greater than 1.0, two components emerged in the understanding domain (41.308–68.250%), three components in the attitude domain (22.802–58.973%), four components in the practice domain (19.852–73.300%), and a single component in the health literacy domain (64.194%). In the reliability assessment, Cronbach’s alpha ranged from 0.677 to 0.914, while the ICC ranged from 0.562 to 0.759 across all domains, demonstrating a good and reliable scale. Finally, eight items were deleted, and 10 sub-domains emerged across the four domains. This study successfully developed a valid and reliable tool, the 42-item MUAPHQ C-19, for measuring understanding, attitudes, practice and health literacy related to COVID-19 among the general public in Malaysia. Future studies utilizing the MUAPHQ C-19 can be conducted nationwide to monitor public literacy and behaviours regarding COVID-19 and to assess the effectiveness of the prevention and control programs implemented.

Introduction

The coronavirus disease 2019 (COVID-19) is an emerging respiratory illness caused by a novel coronavirus known as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)¹. Since its initial detection, it has spread rapidly to more than 200 countries, and the World Health Organisation (WHO) declared it a Public Health Emergency of International Concern (PHEIC) on 30th January 2020².

A review of the literature reveals that numerous studies have examined the knowledge, attitudes, and practices (KAP) related to COVID-19 across various populations^3,4, including two studies from Malaysia^5,6. While these studies employed various questionnaires to assess KAP, none of them explored health literacy concerning COVID-19. This is one of the most important gaps in the published literature.

Knowledge, attitude and practice (KAP) survey data are vital for planning, implementing and evaluating prevention and control work. Apart from identifying knowledge gaps, cultural beliefs and behavioural patterns, KAP surveys can identify the factors influencing behaviour, whether or not these are known to most people, the reasons for people’s attitudes, and how and why they practise certain health behaviours. They can also deepen the understanding of commonly known information, attitudes, and factors that influence human behaviour⁷. In addition, they enable stakeholders and policymakers to set priorities and make strategic decisions based on the gaps and essential information identified⁸. KAP research has also been the leading educational intervention strategy for respiratory disease control across the globe⁸. Various researchers have reported that an individual’s level of KAP is linked to the intelligent control and prevention of illness, response to medical treatment and advancement of one’s health^9,10,11.

A questionnaire is the most commonly used tool for collecting survey data for research, surveillance or general observation. However, to ensure the consistency and accuracy of the collected data, the questionnaire should be carefully constructed, grounded in theory, and shown to be valid and reliable for use in the intended population. Validation is a crucial part of establishing a standardized questionnaire. The benefit of a standardized questionnaire is that it can be used to measure and compare the studied domains across different groups of people or settings. It also needs to be properly validated to ensure that it is psychometrically sound¹².

There is a critical gap in how Malaysians understand and apply health information, particularly in health communication and health literacy¹³. A study by Okan et al.¹⁴ found that more than half of the German adult population had “inadequate” or “problematic” levels of health literacy related to COVID-19. Therefore, there is a need for targeted public information campaigns and the promotion of population-based health literacy to help individuals navigate the infodemic, identify disinformation, and make decisions based on reliable and trustworthy information. Enhancing health literacy (HL) and digital health literacy (DHL) is crucial to improving students’ attitudes and behaviour during the COVID-19 pandemic¹⁵.

During the COVID-19 pandemic in Malaysia, most information regarding COVID-19 was delivered primarily in Malay. Therefore, to match the target population’s language preference, the questionnaire needed to be developed in Malay for use in the general Malaysian population. Most COVID-19 questionnaires in Malaysia focused only on knowledge, attitude and practice regarding the disease. However, a questionnaire should also incorporate health literacy, which covers the ability to find trustworthy information and critically appraise misinformation about COVID-19. According to a report by the Department of Statistics Malaysia¹⁶, the majority of the Malaysian population can speak and understand Malay. Therefore, this study aims to develop and validate the Malay-language Understanding, Attitude, Practice and Health Literacy towards COVID-19 Questionnaire (MUAPHQ C-19) using exploratory factor analysis (EFA).

Materials and methods

Study design

A cross-sectional study was conducted to validate the questionnaire (MUAPHQ C-19), including its unidimensionality, reliability and validity. The methodology adhered to the Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) questionnaire validation guideline¹⁷ and the questionnaire translation and cultural adaptation guidelines¹⁸.

Study setting

The study was conducted in three hospitals, i.e., Pusat Perubatan Universiti Teknologi MARA, Sungai Buloh (PPUiTM Sungai Buloh), Hospital Al-Sultan Abdullah, Puncak Alam (Hospital UiTM Puncak Alam) and Hospital Kuala Lumpur.

Target population

The target population for this study was the general public, defined as Malaysian citizens aged 18 years and above, male or female, attending tertiary hospitals in the Klang Valley, who were not healthcare workers. Participants had to be able to read and understand the Malay language; people who were illiterate were excluded.

Sample size determination

Exploratory factor analysis (EFA)

For exploratory factor analysis, several rules of thumb on the minimum sample size have been discussed and summarised by Kyriazos¹⁹, i.e., 100 to 250^20,21, 300²² or 500 or more²³. The sample size for EFA can also be calculated from the N:p ratio, i.e., the ratio of the number of participants (N) to the number of variables (p), which is traditionally set at 5:1. However, studies suggest that the strength of item loadings, the uniformity of the communalities and the number of items per factor²⁴ are vital for the stability, reliability, and replicability of a factor solution²⁵.

Therefore, the minimum sample size for obtaining a reliable EFA result was taken as 100, based on one of the rules of thumb^19,21. A total of 100 participants were recruited and analysed for the exploratory factor analysis. This sample size reflects a combination of methodological recommendations and practical constraints.

Test-retest reliability

According to the central limit theorem, a sample of at least 30 yields a sampling distribution that approximates the normal distribution^26,27. Seventy participants agreed to participate in the test-retest analysis. However, only 30 participants responded to the questionnaire when approached two to four weeks after their first response. The data of these 30 participants were analysed for test-retest reliability (PPUiTM Sg Buloh, n = 7 respondents; HUiTM Puncak Alam, n = 11 respondents; HKL, n = 12 respondents).

Sampling method

Participants were selected using a systematic random sampling technique to ensure a representative and unbiased sample of hospital attendees. The steps were as follows: eligible participants who fulfilled the inclusion and exclusion criteria were listed, a sampling frame was developed, and the sampling intervals were determined. Then, a random starting point was selected from numbers 1 to 9 before systematic selection was conducted until the desired sample size of 100 participants was reached.
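The selection steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the seed and the frame of 900 eligible attendees are hypothetical (the study does not report the size of its sampling frame), chosen so that the sampling interval is 9 and the random start falls between 1 and 9.

```python
import random

def systematic_sample(frame, sample_size, rng=None):
    """Pick every k-th unit from a sampling frame after a random start."""
    rng = rng or random.Random()
    if sample_size <= 0 or sample_size > len(frame):
        raise ValueError("sample_size must be between 1 and len(frame)")
    interval = len(frame) // sample_size      # sampling interval k
    start = rng.randint(1, interval)          # random start in 1..k (1-based)
    # Take positions start, start + k, start + 2k, ... (converted to 0-based).
    return [frame[start - 1 + i * interval] for i in range(sample_size)]

# Hypothetical sampling frame; the identifiers are placeholders, not study data.
frame = [f"attendee_{i:03d}" for i in range(1, 901)]
sample = systematic_sample(frame, 100, random.Random(2022))
print(len(sample), sample[:3])
```

With a frame of 900 and a target of 100, every ninth listed attendee is selected, so the first selected position alone determines the whole sample.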

Instrument

The original MUAPHQ C-19 questionnaire consists of four domains: understanding, attitude, practice, and health literacy (Appendix A). The result of content validity and face validity (FVI) have been published in Dalawi et al.²⁸.

Understanding

Initially, this domain comprises 12 items, categorized into seven subdomains: source of information (1 item), causative agent (1 item), route of transmission (2 items), symptom (1 item), risk factor (1 item), complication (1 item) and preventive measures (4 items). Each item is measured on an interval scale ranging from 1 (lowest understanding) to 10 (highest understanding).

Attitude

This domain begins with 15 items, categorized into four subdomains: perceived susceptibility (4 items), perceived severity (5 items), perceived benefits (2 items) and perceived barrier (5 items). Items in this domain are measured using an interval scale from 1 (strongly disagree) to 10 (strongly agree).

Practice

This domain comprises 11 items, divided into two subdomains: do’s (7 items) and don’t (4 items). Items are measured on an interval scale from 1 (very rare) to 10 (very frequent).

Health literacy

Originally, this domain includes 12 items, organized into four subdomains: access to health information (3 items), understand health information (4 items), appraising health information (4 items) and applying health information (2 items). Items in this domain are measured on an interval scale ranging from 1 (strongly disagree) to 10 (strongly agree).

Conduct of the study

Exploratory factor analysis

A cross-sectional study was conducted for two weeks, i.e., from 20th June 2022 to 30th June 2022. A total of 100 participants who fulfilled the inclusion and exclusion criteria were recruited from the outpatient clinics of PPUiTM Sungai Buloh, Hospital UiTM Puncak Alam and Hospital Kuala Lumpur (HKL) and asked to answer the MUAPHQ C-19. A study information sheet explaining the study’s background, purpose, benefits, process, participation and confidentiality, together with the researcher’s contact numbers, was given to the participants. Only respondents who were eligible and agreed to participate in the study were recruited.

Test-retest reliability

Participants were also invited to complete the MUAPHQ C-19 again after two to four weeks to assess its test-retest reliability. This invitation was made at the time of their participation in the study. Participants who were keen to take part in the test-retest were asked to give their contact numbers. They were re-contacted and asked to return the questionnaire to the researcher within two to four weeks.

Statistical analysis

Data was analyzed using IBM SPSS (Version 29.0)²⁹. The factor structure of every domain in the MUAPHQ C-19 was explored statistically. The results are displayed in three outcomes: descriptive statistics, exploratory factor analysis and reliability analysis.

Descriptive statistics

The socio-demographic characteristics are presented using the mean and standard deviation for normally distributed continuous data, or the median and interquartile range for non-normally distributed data. Categorical variables are presented as absolute numbers and percentages.

Exploratory factor analysis (EFA)

All items with negative statements from the attitude and practice domains were reverse-coded in the SPSS software before further analysis was conducted. Before proceeding with EFA, several essential steps must be completed to check the EFA assumptions³⁰. These include checking for missing data and outliers, ensuring sample size adequacy, assessing multivariate normality, verifying that item correlations are sufficient, and addressing multicollinearity.
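Reverse-coding on the questionnaire’s 1 to 10 response scale amounts to mirroring each score around the scale midpoint. The study performed this step in SPSS; the Python below is only an illustrative equivalent, and the function name and the example responses are hypothetical.

```python
def reverse_code(score, low=1, high=10):
    """Mirror a score on a low..high scale so that low becomes high and vice versa."""
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside the {low}-{high} scale")
    return low + high - score

# Hypothetical responses to one negatively worded item (not study data).
responses = [1, 4, 10, 7]
recoded = [reverse_code(r) for r in responses]
print(recoded)
```

On this scale a response of 1 becomes 10 and a response of 4 becomes 7, so that higher scores carry the same direction of meaning across all items in a domain.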

For factor exploration, data suitability and sample size adequacy were tested using the Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy and Bartlett’s Test of Sphericity. The KMO index must be more than 0.5 and Bartlett’s Test of Sphericity must be significant (p < 0.05) to proceed with factor analysis^31,32,33.
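For readers who want to see what the two statistics measure, the following is a minimal sketch using their standard textbook definitions, not the SPSS implementation used in the study. The overall KMO compares squared correlations with squared partial correlations, and Bartlett’s statistic is computed from the determinant of the correlation matrix. The 3 × 3 correlation matrix and the function names are hypothetical; turning the chi-square value into a p-value would additionally require a chi-square distribution table or a statistics library.

```python
import math

def invert_and_det(matrix):
    """Gauss-Jordan inversion with partial pivoting; returns (inverse, determinant)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det                      # a row swap flips the sign
        scale = aug[col][col]
        det *= scale
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug], det

def kmo_and_bartlett(corr, n_obs):
    """Overall KMO, Bartlett's chi-square and its degrees of freedom."""
    p = len(corr)
    inv, det = invert_and_det(corr)
    r2 = q2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                # Partial correlation of items i and j given all other items.
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                q2 += partial ** 2
    kmo = r2 / (r2 + q2)
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    df = p * (p - 1) // 2
    return kmo, chi_square, df

# Hypothetical correlation matrix for three items, with n = 100 respondents.
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
kmo, chi_square, df = kmo_and_bartlett(corr, n_obs=100)
print(round(kmo, 3), round(chi_square, 2), df)
```

A KMO close to 1 means the partial correlations are small relative to the raw correlations, which is what makes the items suitable for factoring.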

The data were checked for multivariate normality using the chi-square versus Mahalanobis distance plot. The maximum Mahalanobis distance was obtained and compared with the critical value from the chi-square distribution table for the corresponding degrees of freedom at the 0.05 significance level. If the maximum value exceeds the critical value (p < 0.05), the data do not show multivariate normality. If multivariate normality is violated, the unweighted least squares (ULS) technique can be employed in the factor analysis.

Additional measures are available to ascertain whether the items are sufficiently correlated. The anti-image denotes the proportion of variance in an item that is unrelated to the other items in the analysis^34,35. The diagonals of the anti-image correlation matrix should be greater than 0.5, which is associated with smaller off-diagonal partial pairwise correlations^35,36. In addition, the interrelationships in the data matrix should be sufficient: if the correlation coefficients are less than 0.33, factor analysis cannot be performed^30,36. Conversely, when a pair of items shows a very high correlation (> 0.80) in the correlation matrix, one item of the pair needs to be eliminated.

Principal component analysis (PCA) with varimax rotation was applied to examine the dimensionality and construct validity of the MUAPHQ C-19. The factor loading for every item must be 0.4 or greater for the item to be retained and to form the factor structure^32,33. Cross-loading items are recommended for elimination. According to Çokluk³⁷, a cross-loading item is one that has a high factor loading on more than one component, with a difference between those loadings of less than 0.10. Samuels³⁵ has suggested that the item elimination process should be repeated until no cross-loading items remain.
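The item-screening rules in this paragraph can be expressed as a small decision function. The sketch below is an interpretation rather than the authors’ procedure: it assumes that a “high” loading means one at or above the 0.4 retention threshold, and the function name and example loadings are hypothetical.

```python
def screen_item(loadings, min_loading=0.4, min_gap=0.10):
    """Classify one item from its rotated loadings (one value per component)."""
    ranked = sorted((abs(v) for v in loadings), reverse=True)
    if ranked[0] < min_loading:
        return "low loading"
    # Cross-loading: a second loading is also high and close to the largest one.
    if len(ranked) > 1 and ranked[1] >= min_loading and ranked[0] - ranked[1] < min_gap:
        return "cross-loading"
    return "retain"

# Hypothetical rotated loadings on three components (not values from Table 2).
examples = {
    "item_a": [0.72, 0.15, 0.08],
    "item_b": [0.55, 0.50, 0.10],
    "item_c": [0.35, 0.20, 0.12],
}
for name, loadings in examples.items():
    print(name, screen_item(loadings))
```

Because removing an item changes the loadings of the remaining ones, the screen has to be re-run on a fresh rotated solution after each removal, in line with the repeated elimination suggested by Samuels.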

The number of factors was determined in three ways^32,38,39: by an eigenvalue > 1.0, by a minimum cumulative variance of 60%, or by the scree plot (the data points above the point of inflection). The assignment of items to components was assessed using the rotated component matrix^38,39. Communality values were also taken into consideration. The communality measures the percentage of variance in each variable that is explained by all the factors jointly. A communality of 0.3 was taken as the cut-off, with items below it considered for removal. However, a communality greater than 0.2 is still considered acceptable⁴⁰.

Reliability analysis

There are two types of reliability analysis: internal consistency reliability and test-retest reliability. Test-retest reliability reflects the agreement between scores obtained from the same subjects, who are therefore not regarded as random samples.

Internal consistency for all domains was determined using Cronbach’s alpha coefficient. The literature recommends that a Cronbach’s alpha of more than 0.7 be considered acceptable^39,41. Other sources support a Cronbach’s alpha above 0.6 as acceptable and highly reliable^42,43.
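Cronbach’s alpha compares the sum of the individual item variances with the variance of the total score. The following is a minimal sketch of the standard formula, not the SPSS routine used in the study; the function name and the five respondents’ scores are hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all of equal length."""
    k = len(rows[0])                                   # number of items
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])   # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical scores of five respondents on three items (1-10 scale).
scores = [
    [8, 9, 7],
    [6, 7, 6],
    [9, 9, 8],
    [5, 6, 6],
    [7, 8, 7],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

Alpha approaches 1 when the items rise and fall together across respondents, and it falls towards 0 when the items vary independently of one another.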

Test-retest reliability was analysed using the intraclass correlation coefficient (ICC). ICC values greater than 0.75 indicate good reliability^44,45. A two-way mixed-effects model with absolute agreement was applied, based on the average measures of the baseline data and the data collected two to four weeks later.
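The ICC variant described here (two-way model, absolute agreement, average measures) can be computed from the mean squares of a two-way ANOVA without replication. The sketch below uses the commonly cited formula (MSR − MSE) / (MSR + (MSC − MSE) / n), where MSR, MSC and MSE are the mean squares for participants, occasions and error and n is the number of participants. It is an illustration, not the SPSS output of the study, and the function name and scores are hypothetical.

```python
def icc_absolute_average(scores):
    """scores: one [test, retest, ...] list of domain scores per participant."""
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(r) / k for r in scores]
    col_means = [sum(r[j] for r in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)    # between participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)    # between occasions
    ss_total = sum((x - grand) ** 2 for r in scores for x in r)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (ms_cols - ms_err) / n)

# Hypothetical test and retest scores for three participants.
identical = [[5, 5], [7, 7], [9, 9]]
shifted = [[5, 6], [7, 8], [9, 10]]      # every retest score is one point higher
print(icc_absolute_average(identical), icc_absolute_average(shifted))
```

Under absolute agreement, a systematic shift between the two occasions lowers the coefficient even when the ranking of participants is unchanged, which is why the shifted example scores below the identical one.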

Ethical consideration

The ethical approval was obtained from the Universiti Teknologi MARA (UiTM) Research Ethics Committee (REC/01/2021 (MR/01)) and Medical Research and Ethics Committee, MREC (NMRR-21-319-58243 (IIR)) before the study commencement. The research complied with the ethical principles outlined in the Declaration of Helsinki and the Malaysian Good Clinical Practice Guideline. There were no identifiers collected in the data collection form.

Informed consent process

Informed consent was obtained from interested and eligible participants prior to their involvement in the study. Participants were approached and asked about their willingness to participate in the research. They were fully informed of the study’s background, purpose, benefits and process, their participation, and confidentiality, and the researcher’s contact numbers were given as a hard copy to those who agreed to participate. This information was provided both verbally and in writing through the informed consent form in the Participant Information Sheet (PIS), and participation was voluntary. Participants had the right to withdraw from the study at any time, including during the follow-up stage, without any penalty.

Results

Sociodemographic characteristics

A total of 100 participants that fulfilled the inclusion and exclusion criteria of the study and completed the questionnaire were taken for the exploratory factor analysis (PPUiTM Sungai Buloh, n = 24 respondents; HUiTM Puncak Alam, n = 38 respondents; HKL, n = 38 respondents). The drop-out rate was 1.9%. The demographic characteristics of the respondents are presented in Table 1.

Table 1 The sociodemographic of the respondents (N = 100).

Exploratory factor analysis (EFA)

The mean and standard deviation, communalities extraction and rotated component matrix for the understanding, attitude, practice and health literacy domains are displayed in Table 2.

Table 2 The mean and standard deviation, communalities extraction and rotated component matrix for the understanding, attitude, practice and health literacy domains.

Understanding domain

The mean scores of the items in the understanding domain ranged between 8.28 and 9.06. Because of the high correlation between items U4 and U3 (0.827) and the cross-loading of item U12, the EFA for the understanding domain was re-run after removing items U4 and U12.

Attitude domain

The mean scores of the items in the attitude domain ranged between 4.38 and 8.91. Item A4 was deleted because its anti-image correlation was lower than the threshold (0.500). The data were then re-analysed without item A4. Following this revised analysis, a cross-loading issue was identified with item A10, as it loaded onto factors 1 and 3 with a difference between the loadings of less than 0.1. Therefore, item A10 was also removed, and the data were re-analysed again, excluding both items A4 and A10.

Practice domain

The mean scores of the items in the practice domain ranged between 5.05 and 8.97. No item was deleted in this domain.

Health literacy

The range of mean scores of the items in the health literacy domain was between 7.04 and 9.72.

Items HL1, HL4 and HL6 were deleted because they were highly correlated with other items, and item HL7 was deleted because its anti-image correlation was lower than the threshold (0.500).

Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy and Bartlett’s test

The results of the KMO Measure of Sampling Adequacy and Bartlett’s Test for each domain are shown in Table 3.

Table 3 Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and Bartlett’s test.

Since the KMO is more than 0.6 and Bartlett’s test of sphericity is significant (p < 0.005) in all domains, it indicates that the sample data is adequate to proceed with the reduction procedure in EFA.

Total variance explained

The total variance explained for all domains is shown in Table 4.

Table 4 The total variance explained for all domains.

Understanding

After analyzing with 10 items (after removing U4 and U12), two components emerged from the EFA with eigenvalues greater than 1.0. The eigenvalues ranged between 2.694 and 4.131. Component 1 explained 41.308% of the variance and Component 2 explained 26.942%. The total variance explained for this domain is 68.250%, well above the cut-off value of 60% and improved after the item removal procedure.

Attitude

After analyzing with 13 items (after removing items A4 and A10), the EFA revealed three components based on eigenvalues greater than 1.0, with the eigenvalues ranging from 1.740 to 2.964. The variance explained was 22.802% for Component 1, 22.787% for Component 2, and 13.385% for Component 3. In total, these components explain 58.973% of the variance, which is considered an acceptable level for measuring this construct.

Practice

After analyzing with 11 items (without removing any item), the EFA revealed four components based on eigenvalues greater than 1.0. The eigenvalues ranged between 1.586 and 2.184. The variance explained was 19.852%, 19.543%, 19.490% and 14.414% for Components 1, 2, 3 and 4, respectively. The total variance explained for this construct is 73.300%, exceeding the cut-off value of 60%.

Health literacy

After analyzing with eight items (after removing items HL1, HL4, HL6 and HL7 from the original 12), the EFA revealed only one component based on an eigenvalue greater than 1.0. The eigenvalue is 5.135. The total variance explained is 64.194% for Component 1, which is acceptable.
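As a quick consistency check on the figures above: in a principal component analysis of standardized items, each component’s share of the variance equals its eigenvalue divided by the number of items analysed. The sketch below applies that identity to the eigenvalues reported in this section. It assumes that the largest reported eigenvalue in a domain belongs to Component 1 and the smallest to the last component, and that eight health literacy items were analysed (the original 12 minus HL1, HL4, HL6 and HL7). The function name is illustrative.

```python
def percent_variance(eigenvalue, n_items):
    """Share of total variance carried by one principal component, in percent."""
    return 100 * eigenvalue / n_items

# (items analysed, reported eigenvalue, reported % of variance)
reported = {
    "understanding, component 1": (10, 4.131, 41.308),
    "understanding, component 2": (10, 2.694, 26.942),
    "attitude, component 1": (13, 2.964, 22.802),
    "attitude, component 3": (13, 1.740, 13.385),
    "health literacy, component 1": (8, 5.135, 64.194),
}
for name, (n_items, eigenvalue, pct) in reported.items():
    computed = percent_variance(eigenvalue, n_items)
    print(f"{name}: computed {computed:.3f}%, reported {pct:.3f}%")
```

The computed and reported percentages agree to within rounding of the three-decimal eigenvalues.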

Reliability analysis

The reliability of the MUAPHQ C-19 was assessed using two methods: Cronbach’s alpha coefficient and the intraclass correlation coefficient (ICC). Cronbach’s alpha was computed to assess the internal consistency reliability of the domains in the MUAPHQ C-19, whereas the ICC was computed to assess the test-retest reliability of the domains over time.

Cronbach’s alpha

Based on Table 5, Cronbach’s Alpha of all domains in MUAPHQ C-19 (Version 3.0) was good to excellent. This indicates that the MUAPHQ C-19 is a good and reliable scale.

Table 5 Cronbach’s alpha for each domain and the total Cronbach’s alpha of the MUAPHQ C-19 before and after the item removal (N = 100).

Intraclass correlation (ICC)

Table 6 shows the intraclass correlation coefficient (ICC) for the test-retest of every domain in the MUAPHQ C-19 (Version 3.0). Thirty of the 100 participants completed the test-retest (30.0%). Overall, the ICC across domains ranged from 0.562 to 0.759. The ICCs for the understanding, attitude, practice and health literacy domains were 0.715, 0.698, 0.759 and 0.562, respectively. The practice domain had the highest ICC value, indicating good reliability, while the understanding, attitude and health literacy domains showed fair reliability. In addition, the difference in mean scores between the test and retest was not significant for any domain. This indicates that the domains are stable and reliable over time.

Table 6 Intraclass coefficient (test-retest) for all domains (N = 30).

Summary

Table 7 presents a summary of the items for all domains in the MUAPHQ C-19 before and after the EFA analysis. Initially, there were 50 items in the MUAPHQ C-19 questionnaire. However, after conducting the EFA, eight items were removed for various reasons, resulting in a total of 42 items remaining at the conclusion of the analysis.

Table 7 The summary of the items for all domains in MUAPHQ C-19 before and after EFA analyses.

Discussion

In this study, 100 responses were obtained and used for the exploratory factor analysis (EFA) as preliminary data to explore the factor structure of the set of measured variables in the MUAPHQ C-19. Hair et al.³² have mentioned that an EFA can be conducted as a pilot study to test the justification, straightforwardness and suitability of the gathered information, after which the usefulness of each item and its dimensionality in measuring the respective latent construct can be assessed by the EFA⁴⁶.

EFA is useful in determining the constructs under a given data set and the extent to which these constructs represent the original variables. Besides, EFA also can investigate the correlations between the observed variables. EFA also has the ability to combine the common variables in the dataset into descriptive categories and thus reducing the number of factors. This will lead to shrinking a relatively large set of variables to a smaller and more manageable number while preserving the original variance as much as possible⁴⁷.

The dimensionality of items may change when the current study differs from the previous study in terms of industry, culture and socioeconomic status of the two populations, and the lapse in time between the studies. In other words, the dimensions obtained by previous studies might not hold, especially when the current study is conducted in a different environment and a different industry. Many researchers have suggested that the EFA procedure is necessary for every construct before proceeding to Confirmatory Factor Analysis (CFA), to determine whether the dimensionality of items has changed from the previous study in which the dimensions were developed^33,38,39,48.

Regarding the sample size for EFA, there is debate over the rule of thumb or the best sample size estimation. There are many ways of determining the sample size needed for an EFA, including the minimum number of cases, the sample-to-variable ratio (N/p) and the factor loading requirement. The suggested sample size ranges from fewer than 50 to more than 400⁴⁷. Ideally, EFA functions better and gives a more stable solution with a larger sample size, which reduces the margin of error. In this study, 100 samples were used for EFA, which still falls within the range of suggested sample sizes⁴⁹. Furthermore, 100 samples can be considered adequate for the EFA, as evidenced by the Kaiser-Meyer-Olkin result above 0.6 and the significant Bartlett’s Test of Sphericity.

The development of MUAPHQ C-19 was continued with several processes of adjusting and deleting items following the EFA results for every domain in the questionnaire. The emerging components from each domain were renamed accordingly, following the meaningful characteristics of its items. The deletion of the items was done following several issues on the very high or low correlation between items and the presence of an item with cross-loading, indicating the redundancies between the subdomains. Redundancies are essentially worth deletion as they ask the same things differently^12,47. When developing a questionnaire, decisions to remove items during Exploratory Factor Analysis (EFA) should be guided by both statistical evidence and conceptual considerations⁵⁰. Items with low factor loadings, low communalities, or significant cross-loadings should be considered for removal. From a conceptual standpoint, it is crucial to assess the relevance of each item to the underlying construct before deciding on its removal. Therefore, a balance between statistical evidence and conceptual considerations must be maintained⁵¹.

According to established guidelines, Cronbach’s alpha values above 0.7 generally indicate acceptable reliability for psychometric tools⁵². Therefore, it can be concluded that the questionnaire demonstrates an adequate reliability, and the items appear to measure the constructs consistently.

The MUAPHQ C-19 was developed to examine the association between health literacy and knowledge, attitudes and practices regarding COVID-19. The MUAPHQ C-19 targets health literacy as a central element, while instruments such as that of Rahman et al.⁵³ primarily focus on general KAP without a strong emphasis on health literacy. The tool is important for real-world application: it is designed to assess understanding, attitudes, practices and health literacy related to COVID-19, and it can be used by various stakeholders, such as policymakers and healthcare providers, and integrated into the community⁵⁴.

In addition, the items and factor structure had to be rearranged, so the questionnaire deviated from the conceptual framework originally proposed in the study. For example, four elements of the Health Belief Model were adapted as the subdomains of the attitude domain: Perceived Susceptibility, Perceived Severity, Perceived Benefits and Perceived Barriers. The EFA grouped all the items under the Perceived Barriers subdomain into one component. However, the items developed under the other three subdomains were regrouped into two components, which the researchers renamed accordingly. This could be because some items that the research team had placed in different groups in fact shared the same meaning for the respondents.

The health literacy domain was initially developed by following the European Health Literacy Survey Framework, which used four elements: access, understand, appraise and apply the information related to health. However, EFA has suggested regrouping all the items into one group only. This may imply that EFA has defined the variable and factor better with more suitable items under the suggested factor. During the COVID-19 pandemic, health literacy has emerged as a critical factor influencing how individuals interpret public health messages¹⁴. Individuals with higher health literacy were better equipped to navigate the overwhelming influx of information, discern credible sources, and make informed decisions⁵⁵.

In summary, after considering data adequacy and suitability for analysis, item correlations, factor extraction and retention, item retention and deletion, and item factor loadings, the EFA led to the deletion of eight items and the retention of 42 items across the four domains. The understanding domain retained two factors with ten items, the attitude domain three factors with 13 items, the practice domain four factors with 11 items, and the health literacy domain one factor with eight items. The 42-item Malay version of the Understanding, Attitude, Practice and Health Literacy Questionnaire on COVID-19 explored in the EFA then proceeded to confirmatory factor analysis. Ongoing evaluation and updating of the questionnaire are needed to ensure its continued relevance and accuracy in line with the most current information and public health recommendations.
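The item tally can be checked with a few lines of Python. The counts below are those reported in this article (original items per domain from the abstract, retained items and factors from the summary above); the variable and function names are only illustrative.

```python
# Item counts per domain as reported in the article.
original_items = {"understanding": 12, "attitude": 15, "practice": 11, "health_literacy": 12}
retained_items = {"understanding": 10, "attitude": 13, "practice": 11, "health_literacy": 8}
retained_factors = {"understanding": 2, "attitude": 3, "practice": 4, "health_literacy": 1}


def tally(original, retained):
    """Return total original items, total retained items, and deletions per domain."""
    deleted = {domain: original[domain] - retained[domain] for domain in original}
    return sum(original.values()), sum(retained.values()), deleted


total_original, total_retained, deleted_per_domain = tally(original_items, retained_items)
print(total_original, total_retained, sum(deleted_per_domain.values()))
print(deleted_per_domain)
```

The totals agree with the text: 50 original items, eight deleted and 42 retained, with the practice domain keeping all 11 of its items.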

Strength and limitation of the study

This study has a few limitations. Firstly, sampling was limited to people who visited three leading hospitals in the Kuala Lumpur and Selangor region. Thus, the results do not represent a nationwide outcome. However, the researchers employed systematic random sampling to minimise selection bias and improve the generalizability of the study. Because the sample was drawn from hospital attendees, it may not be fully representative of the general Malaysian population, which limits generalizability. Future research should replicate this study in more diverse settings to capture a broader population. In addition, the cross-sectional design relied on self-reported data, which are subject to response bias and further limit the representativeness and generalizability of the results.

The validated MUAPHQ C-19 was developed in a single language, Malay. It therefore disadvantages Malaysian citizens with low Malay-language literacy. Future studies could translate it into English, Tamil and Mandarin to cater for broader study populations. Next, the utility of the MUAPHQ C-19 could become constrained as time passes after control measures are lifted. Nonetheless, because the preventive and control measures for COVID-19 are relevant to a substantial portion of infections transmitted by the airborne and droplet routes, its applicability extends to potential respiratory outbreak scenarios in the near future.

This study did not examine whether reliability or factor structure differs across demographic groups. Future research should conduct such subgroup analyses to better understand how the questionnaire performs across demographic groups. This would help establish the consistency and applicability of the tool in broader public health efforts related to COVID-19.
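One way such a subgroup analysis could begin is by computing Cronbach's alpha separately for each demographic group. The Python sketch below uses the standard formula α = k / (k − 1) × (1 − Σ item variances / variance of total scores); the age groups and item scores are hypothetical and serve only to show the calculation, not to represent the study's respondents.

```python
from collections import defaultdict
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_by_group(groups, scores):
    """Compute alpha separately for each group label."""
    grouped = defaultdict(list)
    for label, row in zip(groups, scores):
        grouped[label].append(row)
    return {label: cronbach_alpha(rows) for label, rows in grouped.items()}


# Hypothetical data: two age groups, four respondents each, three items.
groups = ["18-39"] * 4 + ["40+"] * 4
scores = [
    [1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4],  # perfectly consistent answers
    [1, 2, 1], [2, 1, 3], [4, 3, 4], [3, 4, 2],  # less consistent answers
]

alphas = alpha_by_group(groups, scores)
for label, alpha in alphas.items():
    print(label, round(alpha, 3))
```

Comparing each group's alpha with the 0.6 acceptability threshold used in this study would give a first indication of whether internal consistency holds across groups; a formal comparison of factor structure would need additional methods that are outside the scope of this sketch.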

Conclusion

This study developed a new, valid and reliable 42-item MUAPHQ C-19 questionnaire to measure understanding, attitude, practice and health literacy towards COVID-19 among the general public in Malaysia. The tool demonstrates preliminary evidence of validity and reliability, based on the results of EFA and internal consistency analysis. Further validation studies, including confirmatory factor analysis (CFA), are needed to strengthen the evidence for the psychometric properties of the instrument. Additionally, future nationwide studies using the questionnaire can be conducted periodically to monitor public literacy and behaviours towards COVID-19 and to assess the effectiveness of the implemented prevention and control programmes.

Data availability

All data generated or analyzed during this study are available from the corresponding author on reasonable request.

References

World Health Organization. Coronavirus disease 2019 (COVID-19). Situation report – 51. (2020).
Eurosurveillance Editorial Team. Note from the editors: World Health Organization declares novel coronavirus (2019-nCoV) sixth public health emergency of international concern. Euro. Surveill. 25 https://doi.org/10.2807/1560-7917.ES.2020.25.5.200131e (2020).
Abou-Abbas, L. et al. Knowledge and practice of physicians during COVID-19 pandemic: a cross-sectional study in Lebanon. BMC Public. Health. 20, 1474. https://doi.org/10.1186/s12889-020-09585-6 (2020).
Çolakoğlu, M. K., Mehmet Özgün, Y., Pişkin, E., Birol Bostancı, E. & Özmen, M. M. The attitude of Turkish general surgeons during the COVID-19 pandemic: results of general surgery COVID-19 pandemic attitude survey. Turk. J. Surg. 36, 137–146. https://doi.org/10.5578/turkjsurg.4809 (2020).
Azlan, A. A., Hamzah, M. R., Sern, T. J., Ayub, S. H. & Mohamad, E. Public knowledge, attitudes and practices towards COVID-19: A cross-sectional study in Malaysia. PLOS ONE. 15, e0233668. https://doi.org/10.1371/journal.pone.0233668 (2020).
Alabed, A. A. A., Elengoe, A., Anandan, E. S. & Almahdi, A. Y. Recent perspectives and awareness on transmission, clinical manifestation, quarantine measures, prevention and treatment of COVID-19 among people living in Malaysia in 2020. Z. Gesundh Wiss. 1–10. https://doi.org/10.1007/s10389-020-01395-9 (2020).
World Health Organization. Knowledge, attitudes, and practices (KAP) surveys during cholera vaccination campaigns: guidance for oral cholera vaccine stockpile campaigns. 1–41 (2014).
World Health Organization. Advocacy, Communication and Social Mobilization for TB Control: a Guide To Developing Knowledge, Attitude and Practice Surveys Vol. 46 (World Health Organization, 2008).
Goni, M. D. et al. Health education intervention as an effective means for prevention of respiratory infections among Hajj pilgrims: A review. Front. Public. Health. 8, 449. https://doi.org/10.3389/fpubh.2020.00449 (2020).
Mujibur Rahaman, M. et al. Knowledge, attitude, and practice of a local community towards the prevention and control of rabies in gaibandha, Bangladesh. J. Adv. Vet. Anim. Res. 7, 414–420. https://doi.org/10.5455/javar.2020.g436 (2020).
Soltanizadeh, N. et al. Knowledge, attitude, and practice among staff associated with human papillomavirus vaccine of young children in Iran. Med. J. Malaysia. 75, 543–547 (2020).
Tsang, S., Royse, C. R. & Terkawi, A. Guidelines for developing, translating, and validating a questionnaire in perioperative and pain medicine. Saudi J. Anesth. 11, 80–89. https://doi.org/10.4103/sja.SJA_203_17 (2017).
Abdullah, A., May, L. S., Salim, H. S., Ng, C. J. & Chinna, K. Health literacy research in malaysia: A scoping review. Sains Malaysiana. 49, 1021–1036. https://doi.org/10.17576/jsm-2020-4905-07 (2020).
Okan, O. et al. Coronavirus-Related health literacy: A Cross-Sectional study in adults during the COVID-19 infodemic in Germany. Int. J. Environ. Res. Public Health. 17, 5503. https://doi.org/10.3390/ijerph17155503 (2020).
Patil, U. et al. Health literacy, digital health literacy, and COVID-19 pandemic attitudes and behaviors in U.S. College students: implications for interventions. Int. J. Environ. Res. Public. Health. 18, 3301. https://doi.org/10.3390/ijerph18063301 (2021).
Department of Statistics Malaysia. Key Findings Population and Housing Census of Malaysia 2020: Administrative District. (2022). https://www.dosm.gov.my/portal-main/release-content/key-findings-population-and-housing-census-of-malaysia-2020-administrative
Mokkink, L. B., Prinsen, C. A. C., Bouter, L. M., Vet, H. C. W. & Terwee, C. B. The COnsensus-based standards for the selection of health measurement instruments (COSMIN) and how to select an outcome measurement instrument. Braz. J. Phys. Ther. 20, 105–113. https://doi.org/10.1590/bjpt-rbf.2014.0143 (2016).
Guillemin, F., Bombardier, C. & Beaton, D. Cross-cultural adaptation of health-related quality of life measures: literature review and proposed guidelines. J. Clin. Epidemiol. 46, 1417–1432. https://doi.org/10.1016/0895-4356(93)90142-n (1993).
Kyriazos, T. A. Applied psychometrics: sample size and sample power considerations in factor analysis (EFA, CFA) and SEM in general. Psychology 2207–2230. https://doi.org/10.4236/psych.2018.98126 (2018).
Cattell, R. B. The Scientific Use of Factor Analysis in Behavioral and Life Sciences (Plenum, 1978).
Gorsuch, R. Factor Analysis (L. Erlbaum Associates, 1983).
Tabachnick, B. G. & Fidell, L. Using Multivariate Statistics (Pearson Education Inc., 2013).
Comrey, A. L. & Lee, H. B. A First Course in Factor Analysis (Lawrence Erlbaum Associates, 1992).
Guadagnoli, E. & Velicer, W. F. Relation of Sample Size to the Stability of Component Patterns. Vol. 103, pp. 265–275 (1988).
Wang, L. L., Watts, A. S., Anderson, R. A. & Little, T. D. Common Fallacies in Quantitative Research Methodology 718–758 (Oxford University Press, 2013).
Ghasemi, A. & Zahediasl, S. Normality tests for statistical analysis: A guide for non-statisticians. Int. J. Endocrinol. Metabolism. 10, 486–489. https://doi.org/10.5812/ijem.3505 (2012).
Läärä, E. Statistics: reasoning on uncertainty, and the insignificance of testing null. Ann. Zool. Fennici. 46, 138–157 (2009).
Dalawi, I., Isa, M. R., Chen, X. W., Azhar, Z. I. & Aimran, N. Development of the Malay Language of understanding, attitude, practice and health literacy questionnaire on COVID-19 (MUAPHQ C-19): content validity & face validity analysis. BMC Public. Health. 23, 1131. https://doi.org/10.1186/s12889-023-16044-5 (2023).
IBM Corp. (IBM Corp, Armonk, NY, 2020).
Acar Güvendir, M. & Özer Özkan, Y. Item removal strategies conducted in exploratory factor analysis: A comparative study. Int. J. Assess. Tools Educ. 9, 165–180. https://doi.org/ (2022).
Kaiser, H. F. An index of factorial simplicity. Psychometrika 39, 31–36. https://doi.org/10.1007/BF02291575 (1974).
Hair, J. F., Black, W. C., Babin, B. J. & Anderson, R. E. Multivariate Data Analysis 7th edn (Pearson Education, Upper Saddle River, 2014).
Mohd Yusoff, S. & Tengku Ariffin, T. F. Development and validation of contextual leadership instrument for principals in Malaysian school context (MyCLIPS). Leadersh. Policy Schools. 1–16. https://doi.org/10.1080/15700763.2021.1971259 (2021).
Sarstedt, M. & Mooi, E. in A Concise Guide to Market Research: The Process, Data, and Methods Using IBM SPSS Statistics (eds Marko Sarstedt & Erik Mooi) 235–272 (Springer Berlin Heidelberg, 2014).
Samuels, P. Advice on Exploratory Factor Analysis. Technical Report (Centre for Academic Success, Birmingham City University, 2017).
Can, A. SPSS ile bilimsel araştırma sürecinde nicel veri analizi [Quantitative data analysis in the process of scientific research with SPSS]. (2016).
Çokluk, Ö., Şekercioğlu, G. & Büyüköztürk, Ş. Sosyal bilimler için çok değişkenli istatistik [Multivariate statistics for social sciences]. (2010).
Awang, Z. Research methodology for business & social science (Edition 2010) (University Publication Centre (UPENA), UiTM, 2010).
Awang, Z. Research Methodology and Data Analysis (Universiti Teknologi MARA, 2012).
Cattell, R. B. The scree test for the number of factors. Multivar. Behav. Res. 1, 245–276. https://doi.org/10.1207/s15327906mbr0102_10 (1966).
Tavakol, M. & Dennick, R. Making sense of Cronbach’s alpha. Int. J. Med. Educ. 2, 53–55. https://doi.org/10.5116/ijme.4dfb.8dfd (2011).
Nunnally, J. C. & Bernstein, I. R. Psychometric Theory 3rd edn (McGraw-Hill, 1994).
Pallant, J. SPSS Survival Manual - a Step by Step Guide To Data Analysis Using SPSS for Windows (version 10) (Buckingham Open University, 2001).
Koo, T. K. & Li, M. Y. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J. Chiropr. Med. 15, 155–163. https://doi.org/10.1016/j.jcm.2016.02.012 (2016).
Portney, L. & Watkins, M. P. Foundation of clinical research. Application to practice. Upper Saddle River, 61–77 (2009).
Khirfan, R., Aziz, A. A. & Awang, Z. Development and validation of instrument measuring perception of healthcare provider on cost containment and related practices in hospital setting. Int. J. Acad. Res. Bus. Social Sci. 12, 588–607. https://doi.org/10.6007/IJARBSS/v12-i4/13094 (2022).
Sürücü, L., Yikilmaz, İ. & Maslakci, A. Exploratory Factor Analysis (EFA) in Quantitative Researches and Practical Considerations. (2022).
Mahmudul Hoque, A. S. M., Awang, Z., Jusoff, K., Salleh, F. & Muda, H. Social business efficiency: instrument development and validation procedure using structural equation modeling. Int. Bus. Manage. 11, 222–231. https://doi.org/10.36478/ibm.2017.222.231 (2017).
Suhr, D. D. Exploratory or confirmatory factor analysis? Paper 200–231 (2006).
Hickman, R. L. Jr, Pinto, M. D., Lee, E. & Daly, B. J. Exploratory and confirmatory factor analysis of the decision regret scale in recipients of internal cardioverter defibrillators. J. Nurs. Meas. 20, 21–34. https://doi.org/10.1891/1061-3749.20.1.21 (2012).
Knekta, E., Tunyon, C. & Eddy, S. One size doesn’t fit all: using factor analysis to gather validity evidence when using surveys in your research. Life Sci. Educ. 18, 1–17. https://doi.org/10.1187/cbe.18-04-0064 (2019).
Nunnally, J. C. Psychometric Methods (McGraw-Hill, 1978).
Rahman, M. M. et al. Knowledge, attitude and practices toward coronavirus disease (COVID-19) in Southeast and South asia: A mixed study design approach. Front. Public. Health. 10, 875727. https://doi.org/10.3389/fpubh.2022.875727 (2022).
Zhong, B. L. et al. Knowledge, attitudes, and practices towards COVID-19 among Chinese residents during the rapid rise period of the COVID-19 outbreak: a quick online cross-sectional survey. Int. J. Biol. Sci. 16, 1745–1752. https://doi.org/10.7150/ijbs.45221 (2020).
de Gani, S. M., Berger, F. M. P., Guggiari, E. & Jaks, R. Relation of corona-specifc health literacy to use of and trust in information sources during the COVID-19 pandemic. BMC Public. Health. 22, 42. https://doi.org/10.1186/s12889-021-12271 (2022).

Acknowledgements

We want to acknowledge Universiti Teknologi MARA (UiTM) and the participants involved in this study for their participation, contribution and dedication. Finally, we want to acknowledge the Director General of Health Malaysia for permission to publish this manuscript.

Funding

This work was self-funded.

Author information

Authors and Affiliations

National Public Health Laboratory, Ministry of Health, Malaysia, Lot 1853, Kampung Melayu, Sungai Buloh, 47000, Selangor, Malaysia
Izzaty Dalawi
Department of Public Health Medicine, Faculty of Medicine, Universiti Teknologi MARA, Jalan Hospital, Sungai Buloh, 47000, Selangor, Malaysia
Mohamad Rodi Isa
School of Mathematical Sciences, College of Computing, Informatics and Media, Universiti Teknologi MARA, Shah Alam, 40450, Selangor, Malaysia
Nazim Aimran

Contributions

ID and MRI designed the project. ID managed the project and collected all the interviews. ID, MRI and NA analyzed the data. ID and MRI edited and revised the manuscript. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Mohamad Rodi Isa.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Dalawi, I., Isa, M.R. & Aimran, N. Exploratory factor analysis on the development and validation of the understanding, attitude, practice and health literacy questionnaire on COVID-19 in Malay language. Sci Rep 15, 19654 (2025). https://doi.org/10.1038/s41598-025-04517-z

Received: 09 February 2025
Accepted: 27 May 2025
Published: 04 June 2025
Version of record: 04 June 2025
DOI: https://doi.org/10.1038/s41598-025-04517-z
