Main

Noncommunicable diseases (NCDs) are major contributors to morbidity and mortality worldwide, particularly in low-income and middle-income countries. India, home to nearly 20% of the global population, has undergone rapid nutrition and epidemiological transitions over the past two decades, resulting in an alarming increase in NCD prevalence. The recent Indian Council of Medical Research–India Diabetes (ICMR–INDIAB) study1, a national cross-sectional population-based survey conducted from November 2008 to December 2020, reported weighted (for region, age and sex) prevalence of type 2 diabetes (T2D) and prediabetes of 11.4% and 15.3%, respectively. The prevalence of generalized obesity and abdominal obesity was also high at 28.6% and 39.5%, respectively. NCDs account for 6.3 million (68%) of all deaths in India2. By 2060, the economic costs of overweight and obesity are projected to reach $839 billion (2.47% of the nation’s GDP)3. Given the enormous public health burden of NCDs and their associated economic costs, it is crucial to identify cost-effective, practical strategies for reducing NCD risk.

Unhealthy diets contribute substantially to the NCD burden. Improving diet and physical activity can prevent nearly 50% of incident T2D4. Global per capita calorie availability has increased from the 1980s to 2013, with carbohydrates contributing the most (70.5%) and protein the least (10.5%)5. In India, data from epidemiological studies6 and national nutrition surveys, such as those by the National Nutrition Monitoring Bureau (NNMB)7 and National Sample Survey Organization8, indicate that Indian adults currently consume ~65% to 75% of calories (%E) from carbohydrates (one of the highest in the world), 9–11%E from proteins and 14–23%E from fats. Previous ICMR–INDIAB analyses reported macronutrient intakes stratified by glycemic status9. However, examining macronutrients by their quality (refined carbohydrates versus whole grains, animal versus plant protein and saturated fat versus unsaturated fat) is essential for effective national dietary recommendations, as they are differentially associated with NCDs10,11,12. Due to India’s dietary diversity, understanding inter-regional and intraregional differences in macronutrient intake can inform targeted policy recommendations. While evidence suggests that reducing low-quality carbohydrates and increasing intake of plant protein and healthy fats improves cardiometabolic health13,14, there is limited data on the effect of replacing carbohydrates with other macronutrients among Asian Indians.

The ICMR–INDIAB survey is a nationally representative cross-sectional survey of NCDs in Indian adults from urban and rural areas of 30 states, union territories and the National Capital Territory (NCT) of Delhi1. Dietary data from every fifth participant were used to characterize India’s dietary profile across regions and examine inter-regional and intraregional differences in macronutrient intakes. We also investigated the associations between dietary factors and metabolic risk, and evaluated the effects of isocaloric substitution of macronutrients on glycaemia and adiposity. Study findings and policy implications are summarized in Table 1.

Table 1 Policy implications of Indian diets on metabolic risk

Results

Study characteristics

The demographic and metabolic risk factors overall and by region are given in Table 2. The median age of participants was 40 years and 49% were male. Tobacco use (in both smoke and smokeless forms) was reported by 15% of the overall population, ranging from 8% in the West to 24% in the Northeast. Nationally, current alcohol use that included toddy (fermented Asian palm sap), country liquor and other alcoholic beverages was 15% with the highest prevalence in the Northeastern region (23%). A total of 61% were physically inactive, defined as a physical activity level (PAL) between 1.40 and 1.69 (ref. 15). The median body mass index (BMI) was 22.2 kg m−2, 43% were overweight (BMI ≥ 23 to <25 kg m−2), 26% were obese (BMI ≥ 25.0 kg m−2) and 36% had abdominal obesity (waist circumference ≥ 90 cm for men and ≥80 cm for women). The Northern region reported the highest rates of overweight (54%), obesity (37%) and abdominal obesity (48%). Conversely, Eastern region states had the lowest prevalence of overweight (31%), obesity (16%) and abdominal obesity (29%). Nearly one-third of the population had hypertension (27%), with little inter-regional difference. The prevalence of newly diagnosed T2D was 9% overall, with the highest prevalence in the West (12%) and lowest in the East and Northeast (6%). Newly diagnosed prediabetes was prevalent in 41%, with the highest prevalence in the Central region (49%). Dyslipidaemia, defined using the National Cholesterol Education Program Adult Treatment Panel III guidelines16, was prevalent in half of the population. Overall, 83% of participants had at least one metabolic risk factor, defined as the presence of newly diagnosed T2D, prediabetes, dyslipidaemia, general obesity, abdominal obesity or hypertension.

Table 2 Demographic characteristics and metabolic risk factors of the study population stratified by regions (n = 18,090)

Compared to rural participants, urban participants were less likely to use tobacco (14% versus 16%) and alcohol (14% versus 15%), but were more likely to be physically inactive (70% versus 57%), and had higher weighted median BMI (23.5 kg m−2 versus 21.6 kg m−2), waist circumference (83.5 cm versus 78.5 cm), fasting capillary blood glucose (96 mg dl−1 versus 95 mg dl−1), serum triglycerides (115 mg dl−1 versus 108 mg dl−1) and low-density lipoprotein cholesterol (100 mg dl−1 versus 90 mg dl−1). They also had a higher weighted prevalence of overweight (56% versus 37%), generalized obesity (37% versus 22%), abdominal obesity (48% versus 31%), hypertension (29% versus 27%) and metabolic risk (87% versus 82%). Compared to men, women were younger (39 years versus 41 years) and were less likely to use tobacco (2% versus 28%) or alcohol (3% versus 27%), but more likely to be physically inactive (65% versus 56%), overweight (43% versus 40%), have general obesity (30% versus 23%) and abdominal obesity (47% versus 24%); however, they were less likely to have hypertension (24% versus 31%) or dyslipidaemia (42% versus 60%; Supplementary Table 1).

Sensitivity analyses showed participants in the diet sub-study were more likely to be male, current tobacco or alcohol users, and less likely to have a family history of T2D (Supplementary Table 2) with no significant differences in BMI, blood pressure or fasting capillary glucose concentrations compared to those not included in this study.

National nutrition profile

Cereal staple choices, the primary constituents of Indian diets, varied widely by region (Fig. 1 and Extended Data Table 1). Refined cereal (white rice) was the primary choice in the Northeast (99%), South (87%) and East (78%), and whole wheat (milled as flour) in the North (90%) and Central (70%) regions. Intraregional variations included a lower preference for white rice in Karnataka (Southern region, 47.6%) and Bihar (Eastern region, 53%). Millets were consumed alongside rice and wheat in 16 states, while whole millet flour was consumed only in Karnataka (47.7%) and Gujarat and Maharashtra (both 35%). Among the millet choices, sorghum and finger millet were commonly consumed in Karnataka, while pearl millet and maize consumption were greater in Gujarat, and pearl millet and sorghum were also commonly consumed in Maharashtra.

Fig. 1: Top three cereal grain staple choices reported by the study population by states and union territories in India (n = 18,090).

Map represents the top three cereal staples reported by the weighted percentage of the study population in each state.

Macronutrient profiles expressed as a %E for the nation overall and by geographic regions are shown in Fig. 2a–c and Extended Data Table 2. Nationwide, carbohydrates contributed 62.3% of total daily energy intake, mainly from refined cereals (28.5%E) and milled whole grains (16.2%E). Total fat contributed 25.2%E, while protein intake was low at 12%E (Extended Data Table 2), below the National Dietary Guideline recommendation of 15%E17.
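As a reading aid for the %E values reported throughout, the following minimal sketch shows how gram intakes can be converted to shares of daily energy. It assumes the general Atwater factors (4 kcal per gram for carbohydrate and protein, 9 kcal per gram for fat); the survey's own food-composition database and software are not described in this section, and the function name `percent_energy` and the example intake are hypothetical.

```python
from typing import Dict, Optional

# General Atwater factors (kcal per gram). This is an assumption made for
# illustration; the study's own energy conversion is not given here.
KCAL_PER_GRAM = {"carbohydrate": 4.0, "protein": 4.0, "fat": 9.0}


def percent_energy(
    grams_per_day: Dict[str, float], total_kcal: Optional[float] = None
) -> Dict[str, float]:
    """Return the share of daily energy (%E) contributed by each macronutrient.

    If total_kcal is omitted, the denominator is the energy supplied by the
    macronutrients passed in.
    """
    kcal = {name: grams * KCAL_PER_GRAM[name] for name, grams in grams_per_day.items()}
    denominator = total_kcal if total_kcal is not None else sum(kcal.values())
    if denominator <= 0:
        raise ValueError("total energy must be positive")
    return {name: 100.0 * value / denominator for name, value in kcal.items()}


# Hypothetical daily intake in grams, chosen only to illustrate the arithmetic.
example = percent_energy({"carbohydrate": 310.0, "protein": 60.0, "fat": 55.0})
```

Passing a measured `total_kcal` instead of relying on the default would let energy from alcohol or other sources count in the denominator.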

Fig. 2: Inter- and intraregional intakes of macronutrients among the study population (n = 18,090).

Data are presented as weighted median. Median and IQR values are provided in Extended Data Tables 2–5. a, Intakes of total carbohydrates and their selected food sources (%E). Refined cereals include white rice, refined wheat flour (maida), refined millet grains/flour (minor millets like little millet and foxtail millet) and wheat semolina (rava). Milled whole grains include whole wheat, whole millets and whole grains milled into flour. Added sugar includes white sugar, honey, jaggery and palm sugar added during cooking and on the table. b, Intakes of total fats and their subtypes. c, Intakes of total protein and their subtypes. Plant protein includes cereals, pulses, legumes, fruits, vegetables (including roots and tubers), nuts and oilseeds. Dairy protein includes fermented dairy protein (Indian style yogurt (curd), buttermilk, lassi) and nonfermented dairy protein (all types of milk (cow/buffalo/yak/goat), milk added to coffee and tea, paneer (curdled milk solids)). Animal protein includes red meat, poultry, fish and egg, and does not include dairy protein. For refined cereals and milled whole grains, the analysis only included participants who consumed ≥50 g. For added sugar, the analysis only included participants who consumed ≥1 teaspoon (≥5 g d−1). The connected lines for each region do not imply a trend.

Carbohydrate intake patterns

Carbohydrate intake was high across regions, with little inter-regional variability, ranging from 59.6%E in the West to 64.8%E in the East. However, sources of carbohydrate varied, with refined cereal intake highest in the Northeast (51.7%E), followed by the South (36%E) and East (31.5%E), largely from white rice (95% of refined cereals; Fig. 2a and Extended Data Table 2). The Central and Northern regions reported the highest intakes of milled whole grains (30.9%E and 27.8%E, respectively, versus 16.2%E nationally), mainly from wheat flour (15%E of 16.2%E). Millet flour contributed minimally (1.4%E). There was also substantial inter-regional variability in added sugar intake, with median intakes ranging from 2.1%E in the Northeast to 7.9%E in the North.

Substantial variability in refined cereal intake was seen among states within regions (Fig. 2a and Extended Data Table 3). For example, within the Southern region, the consumption of refined cereals in Tamil Nadu was 51.5%E, of which white rice contributed to 47.3%E, as compared to the neighboring state of Karnataka, where the intake was only 25.8%E (23.8%E from white rice). Karnataka had the highest whole grain consumption in the region (25.1%E) with millets contributing 10%E. In the North, refined cereal consumption varied from 9.9%E (7.4%E from white rice) in NCT of Delhi to 18.4%E (14.6%E from white rice) in Himachal Pradesh. The least variability in refined cereal intake was in the Northeastern states, ranging from 49.8%E in Meghalaya to 56.5%E in Assam. Milled whole grain intake varied widely in the West, ranging from 9.3%E in Goa to 27.1%E in Gujarat. In the North, milled whole grain intakes ranged from 19.3%E in Delhi to 38.5%E in Punjab. Karnataka in the South and Gujarat in the West derived an equal proportion of energy from refined cereals and milled whole grains.

Consumption of added sugars (which only included sugar added during cooking, at the table and in processed foods that label for added sugar, such as aerated drinks) was above the recommended 5%E17 in 19 states, the NCT of Delhi and the union territory of Puducherry (Extended Data Table 3). Haryana and NCT of Delhi in the North, Karnataka in the South, Odisha in the East and Madhya Pradesh in the Central region had the highest added sugar intakes (>10%E). Added sugar consumption varied widely in the South (4.8%E in Tamil Nadu to 10.4%E in Karnataka) but showed little intraregional variability (~1%E) in the West. In the Northeast, added sugar intake was within recommended limits (<5%E) except in Meghalaya (5.8%E; Extended Data Table 3).

Total fat and fat subtype patterns

Total fat intake ranged from 21.6%E in the Northeast to 27.9%E in the West (Fig. 2b and Extended Data Table 2). Saturated fats were the most consumed nationally (8.9%E), with the highest intake in the North (11.2%E) and lowest in the Northeast (7.4%E). Monounsaturated fatty acid (MUFA) intake was low overall (6.1%E), ranging from 5%E in the East to 6.7%E in the North and Central regions. Polyunsaturated fatty acid (PUFA) intake showed more inter-regional variability, from 5.3%E in the Northeast to 9.1%E in the Central region. The Northeastern states had the lowest intakes of n6 polyunsaturated fat (4.6%E) but the highest intakes of n3 polyunsaturated fat (0.4%E) compared to intake nationally and in other regions.

Intraregional variation in total fat intake was minimal except in the Northeast, where it ranged from 15.6%E in Arunachal Pradesh to 28.9%E in Sikkim (Extended Data Table 4). All Northern states (except Himachal Pradesh), Andhra Pradesh (undivided) in the South, Goa and Gujarat in the West and Mizoram and Sikkim in the Northeast exceeded the general recommended saturated fat intake of <10%E18. Except for Jharkhand in the East, Chhattisgarh in the Central region and Arunachal Pradesh and Manipur in the Northeast, no state met the recommended saturated fat intake of <7%E19 for populations at risk of cardiovascular disease. MUFA and PUFA intakes also varied widely within regions. Within the Northeast, MUFA and PUFA intakes ranged from 4.2%E each in Arunachal Pradesh to 8%E and 9%E, respectively, in Sikkim. Manipur and Nagaland (0.9%E each) had the highest intakes of n3 polyunsaturated fat in the nation, followed by Mizoram (0.5%E; Extended Data Table 4).

Total protein and protein subtype patterns

Overall, total protein intake in India was low (12%E), with the Northeast having the highest intake (13.6%E; Fig. 2c and Extended Data Table 2). Most protein came from plant sources (8.9%E), with little inter-regional difference. The North had the highest dairy protein intake (3.6%E), while the East (1.2%E) and Northeast (1.1%E) had the lowest. Fermented dairy contributed 0.2%E nationally, with intakes ranging from 0.02%E in the Northeast to 0.5%E in the North. Nonfermented dairy intake was 1.7%E nationwide, ranging from 0.9%E in the Northeast to 3%E in the North. Animal protein intake varied widely from a low of 0.3%E in the North and Central regions to 2.9%E in the Northeast, with poultry as the main contributor (0.3%E).

Intraregional differences in protein intake were minimal except in the Northeast, where it ranged from 12.2%E in Arunachal Pradesh to 17.6%E in Nagaland (also the nation’s highest). With regard to protein subtypes, Himachal Pradesh (North) and Jharkhand (East) had the highest plant protein intakes (10.2%E each), while Kerala in the South (5.7%E) and Goa in the West (5.9%E) had the lowest (Extended Data Table 5). The NCT of Delhi (4.1%E) and Haryana (3.7%E) in the North and Gujarat (3.6%E) in the West had the highest dairy protein intakes; Manipur and Meghalaya in the Northeast and Chhattisgarh in the Central region had the lowest intakes at 0.8%E. Haryana and Rajasthan in the North and Gujarat in the West reported the highest fermented dairy protein intakes at 0.7%E each. Rajasthan had the highest median intake of nonfermented dairy protein (4.7%E) in the nation, while Arunachal Pradesh in the Northeast reported the lowest at 0.4%E (Extended Data Table 5).

The highest animal protein intake was seen in the Northeastern states of Meghalaya (4.9%E), Mizoram (6.9%E) and Nagaland (7.4%E). They also had the highest protein intakes from red meat—Nagaland (4.2%E), Mizoram (2.3%E) and Meghalaya (2.0%E). Mizoram also reported a high intake of poultry protein (2.3%E). Egg protein intake was low and showed little variation nationwide. States in the Northeastern region like Manipur (1.5%E), Meghalaya (1.1%E) and Nagaland (1.0%E) reported the highest intakes of fish protein, along with Goa in the West (1.2%E; Extended Data Table 5).

Nutrient profile by place of residence and sex

Compared to urban participants, rural participants consumed more energy from total carbohydrates and refined cereals, but less from added sugars and all major fat subtypes, including saturated fat, MUFA and PUFA. They had higher intakes of plant and animal protein but lower intakes of dairy protein. Compared to men, women consumed fewer calories and less protein (%E), but more added sugar and saturated fat (%E; Supplementary Table 3).

Carbohydrate intake and metabolic risk

Higher calorie intake from carbohydrates was associated with a 14% higher likelihood of newly diagnosed T2D (odds ratio (OR) = 1.14, 95% confidence interval (CI) = 1.10–1.18). Pooled meta-analyses of the six major regions in India showed that refined cereals, milled whole grains and added sugars were associated with higher odds of newly diagnosed T2D by 13% (95% CI = 1.08–1.18), 9% (95% CI = 1.05–1.13) and 14% (95% CI = 1.09–1.19), respectively (Fig. 3a). There was no evidence of inter-regional heterogeneity in risk estimates. Higher intakes of total carbohydrates (OR = 1.15, 95% CI = 1.11–1.20), refined cereals (OR = 1.15, 95% CI = 1.09–1.21) and added sugars (OR = 1.15, 95% CI = 1.10–1.20) were each associated with a 15% higher likelihood of newly diagnosed prediabetes. Although the risk estimates for milled whole grains were attenuated, their consumption was still associated with higher odds of newly diagnosed prediabetes (OR = 1.12, 95% CI = 1.07–1.17; Fig. 3b).

Fig. 3: Pooled odds ratios (95% CI) for the association of total carbohydrates (%E) with likelihood of newly diagnosed T2D and newly diagnosed prediabetes in the ICMR–INDIAB study (n = 18,090).

a, Newly diagnosed T2D. b, Newly diagnosed prediabetes. A fixed-effects model was used to synthesize data from the regions. Two-sided P values were used in calculating ORs and 95% CIs through survey logistic regression. All dietary exposure variables (refined cereals (g d−1), milled whole grains (g d−1), added sugar (g d−1)) were adjusted for total energy using the residual model. Models were adjusted for PSU, state code, population weight, age (in years), sex (male/female), family history of T2D (yes/no), tobacco use (never/current/past), alcohol use (never/current/past), PAL (sedentary/moderate/vigorous), time period the study was conducted, education qualification (primary school/high school/college), BMI (kg m−2), place of residence: urban versus rural, intakes of total energy (kcal d−1), fat (%E), fruits (g d−1), vegetables (g d−1), tubers (g d−1), milk and milk products (g d−1), alcohol (g d−1), pulses and legumes (g d−1), added salt (g d−1) and edible oils and fats (g d−1), hypertension, dyslipidaemia and abdominal obesity. Models with food sources as the main exposure were adjusted for the remaining food sources while models with total carbohydrates (%E) were adjusted for total fat (%E). Newly diagnosed T2D was defined as HbA1c ≥6.5% and/or if fasting CBG ≥126 mg dl−1 (≥7 mmol l−1) or 2-h postglucose load CBG ≥220 mg dl−1 (≥12.2 mmol l−1). Newly diagnosed prediabetes was defined as HbA1c 5.7–6.4% or if fasting CBG was ≥100–125 mg dl−1 (≥5.6–6.9 mmol l−1) and/or if 2-h postload CBG value was ≥160 and <220 mg dl−1 (≥8.9 to <12.2 mmol l−1). CBG, capillary blood glucose.
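The fixed-effects synthesis mentioned in the legend can be illustrated with a minimal sketch of inverse-variance pooling on the log-odds scale. This is a generic textbook formulation, not the authors' code: the function name `pool_fixed_effect` is hypothetical, the standard errors are recovered from reported 95% CIs under a normal approximation, and the regional inputs are invented for illustration.

```python
import math
from typing import List, Tuple

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def pool_fixed_effect(
    estimates: List[Tuple[float, float, float]]
) -> Tuple[float, float, float]:
    """Pool (OR, CI lower, CI upper) triples with inverse-variance weights."""
    weighted_sum = 0.0
    total_weight = 0.0
    for odds_ratio, lower, upper in estimates:
        log_or = math.log(odds_ratio)
        # Recover the standard error of log(OR) from the width of the 95% CI.
        se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_or
        total_weight += weight
    pooled_log_or = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log_or),
        math.exp(pooled_log_or - Z_95 * pooled_se),
        math.exp(pooled_log_or + Z_95 * pooled_se),
    )


# Hypothetical regional estimates, not values from the study.
regional = [(1.10, 1.00, 1.21), (1.18, 1.05, 1.33), (1.12, 0.98, 1.28)]
pooled_or, pooled_lower, pooled_upper = pool_fixed_effect(regional)
```

Because each region is weighted by the inverse of its variance, more precise regional estimates pull the pooled OR toward themselves, and the pooled CI is narrower than any single regional CI.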

Higher intakes of total carbohydrate calories (OR = 1.15, 95% CI = 1.10–1.20) and refined cereals (OR = 1.15, 95% CI = 1.10–1.21) were each associated with 15% higher odds of general obesity (Extended Data Fig. 1a). Milled whole grains (OR = 1.11, 95% CI = 1.06–1.16) and added sugars (OR = 1.12, 95% CI = 1.07–1.17) showed weaker, but still significant, associations with higher odds of general obesity. Likewise, higher intakes of total carbohydrates (OR = 1.18, 95% CI = 1.11–1.25), refined cereals (OR = 1.08, 95% CI = 1.03–1.14), milled whole grains (OR = 1.14, 95% CI = 1.08–1.20) and added sugars (OR = 1.10, 95% CI = 1.06–1.15) were all associated with higher odds of abdominal obesity (Extended Data Fig. 1b).

Compared to the lowest carbohydrate intakes, those with the highest intakes (69%E (66.7, 72.7)) had a 15–30% higher likelihood of newly diagnosed T2D, prediabetes, general obesity and abdominal obesity (Extended Data Fig. 2a). When examining carbohydrate subtypes, those with the highest consumption of refined cereals (median (interquartile range (IQR)) = 256.5 (218.7, 316.9) g d−1) had an 18–25% higher likelihood of being diagnosed with T2D, prediabetes, general obesity and abdominal obesity compared to the lowest intake group (median (IQR) = 93.2 (75.8, 109.8) g d−1; Extended Data Fig. 2b). Those who consumed the highest quantities of white rice (median (IQR) = 198.4 (169.8–252.4) g d−1) had a 19–26% higher likelihood of being newly diagnosed with T2D, prediabetes, general obesity and abdominal obesity compared to those who consumed the least (median (IQR) = 81.5 (66–88.2) g d−1; Extended Data Fig. 2c).

Compared to those with the lowest consumption of milled whole grains (median (IQR) = 80.1 (64.9–93.6) g d−1), those with the highest intakes (median (IQR) = 262.2 (219.8–324.1) g d−1) had 11–29% higher risks of T2D, prediabetes, general obesity and abdominal obesity (Extended Data Fig. 2d). Similarly, a higher risk of 14–31% for T2D, prediabetes and general obesity was observed for the highest tertiles of whole wheat (median (IQR) = 249.9 (211.1–312.7) g d−1) and 21–29% for milled whole millets (median (IQR) = 172.6 (144.3–218.1) g d−1) consumption (Extended Data Fig. 2e,f). Those with the highest intakes of added sugar (median (IQR) = 46.5 (37.3–62.7) g d−1) had a 14–26% higher likelihood of newly diagnosed T2D, prediabetes, general obesity and abdominal obesity compared to those with the least intakes (median (IQR) = 10 (7.2–12.6) g d−1; Extended Data Fig. 2g).

Carbohydrate substitution and metabolic risk

We mathematically modeled the effect of an isocaloric substitution of 5%E from carbohydrates with 5%E from other macronutrients on glycemic risk (Fig. 4). Replacing 5%E from carbohydrates with 5%E from protein, specifically from plants, pulses and legumes or dairy (both fermented and nonfermented), was associated with a lower likelihood of newly diagnosed T2D and prediabetes (Fig. 4). On the other hand, the replacement of carbohydrate with animal protein was not associated with a lower risk of newly diagnosed T2D or prediabetes except when the source of animal protein was fish or egg, where the risk was lower by 6–10%. Replacing 5%E from carbohydrates with 5%E from total fat or fat subtypes, including saturated fat, monounsaturated or n3 polyunsaturated fat did not alter the risk of newly diagnosed T2D or prediabetes (Fig. 4).

Fig. 4: Odds ratios (95% CI) for glycemic risk in isocaloric substitution analysis for the replacement of 5% energy from carbohydrates with 5% energy from other macronutrients in the study population (n = 18,090).

Two-sided P values were used in calculating ORs and 95% CIs through survey logistic regression. Models were adjusted for PSU, state code, population weight, age (in years), sex (male/female), family history of T2D (yes/no), tobacco use (never/current/past), alcohol use (never/current/past), PAL (sedentary/moderate/vigorous), time period the study was conducted, education qualification (primary school/high school/college), BMI (kg m−2), place of residence: urban versus rural, intakes of total energy (kcal d−1). It was further adjusted for hypertension, dyslipidaemia and abdominal obesity. Models were additionally adjusted for all nutrients except the nutrient being replaced. Plant protein includes cereals, pulses, legumes, fruits, vegetables (including roots and tubers), nuts and oilseeds. Animal protein (%E) includes red meat, poultry, egg and fish, and does not include dairy protein. Dairy protein includes fermented dairy protein (Indian style yogurt (curd), buttermilk, lassi) and nonfermented dairy protein (all types of milk (cow/buffalo/yak/goat), milk added to coffee and tea, paneer (curdled milk solids)). Newly diagnosed T2D was defined as HbA1c ≥6.5% and/or if fasting CBG ≥126 mg dl−1 (≥7 mmol l−1) or 2-h postglucose load CBG ≥220 mg dl−1 (≥12.2 mmol l−1). Newly diagnosed prediabetes was defined as HbA1c 5.7–6.4% or if fasting CBG was ≥100–125 mg dl−1 (≥5.6–6.9 mmol l−1) and/or if 2-h postload CBG value was ≥160 and <220 mg dl−1 (≥8.9 to <12.2 mmol l−1).
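The adjustment for "all nutrients except the nutrient being replaced" corresponds to what is commonly called a leave-one-out substitution model. The sketch below shows how a coefficient from such a model translates into an OR for a 5%E swap. It assumes a logistic model with total energy and the other macronutrients (in %E) held constant and carbohydrates omitted, so that a 1%E increase in the modeled nutrient implies a 1%E decrease in carbohydrates. The function name `substitution_odds_ratio` and the coefficient values are hypothetical.

```python
import math
from typing import Tuple

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def substitution_odds_ratio(
    beta_per_1pct_e: float, se: float, swap_pct_e: float = 5.0
) -> Tuple[float, float, float]:
    """Return the OR and 95% CI for swapping `swap_pct_e` %E of the omitted nutrient.

    beta_per_1pct_e is the log-odds coefficient (per 1 %E) of the replacing
    nutrient in a model that leaves out the nutrient being replaced.
    """
    log_or = swap_pct_e * beta_per_1pct_e
    half_width = swap_pct_e * Z_95 * se
    return (
        math.exp(log_or),
        math.exp(log_or - half_width),
        math.exp(log_or + half_width),
    )


# Hypothetical coefficient for plant protein (per 1 %E) from a model that
# omits carbohydrates; illustrative numbers only, not study results.
or_5, lower_5, upper_5 = substitution_odds_ratio(beta_per_1pct_e=-0.02, se=0.008)
```

An OR below 1 whose upper confidence limit is also below 1 would indicate a lower likelihood of the outcome under the modeled swap; as with any such model, this describes a statistical contrast between diets rather than an observed dietary change.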

The replacement of carbohydrates with total protein or protein from various food sources was not associated with risk of general or abdominal obesity (Extended Data Table 6). However, replacing 5%E from carbohydrates with 5%E from total fat was associated with a 5% higher likelihood of general obesity (OR = 1.05, 95% CI = 1.02–1.09; Extended Data Table 6). At the same time, replacing 50 g of refined cereals with 50 g of milled whole grains or their subtypes (wheat and millet flour) was not associated with dysglycaemia or obesity (Extended Data Table 7). However, as the population’s intake of intact whole grains (brown rice, whole wheat and millets) was limited, they could not be evaluated in this substitution model.

Discussion

Using nationally representative data, we report high intakes (62.3%E) of low-quality carbohydrates (refined cereals like white rice, milled whole grains and added sugar) and saturated fat, and low protein intake. Millet consumption was high only in three states. Saturated fat intake exceeded national dietary recommendations (<10%E)18 in ten states; when the stricter threshold for populations at metabolic risk (<7%E)19 was applied, all but four states exceeded the recommended limit. Total protein intake was low nationwide with plant protein contributing 8.9%E, dairy 2%E and animal protein 1%E. Carbohydrate intake (>56%E), regardless of grain type, was associated with a 14–30% higher metabolic risk. Replacing refined cereals with whole wheat or millet flour was not associated with lower risk. However, replacing 5%E from carbohydrates with 5%E from plant, dairy, egg or fish protein was associated with lower risk of newly diagnosed T2D and prediabetes, while replacement with fat of any subtype did not change risk.

Data from the current study along with previous surveys7,20 confirm that Asian Indians consume high amounts of carbohydrates. Our results show that white rice is the most popular primary cereal staple (61% of population) followed by whole wheat flour (34%), with only a small percentage using millet flour. Because of the Green Revolution (introduction of high-yielding varieties of wheat and rice in developing countries, leading to major increases in the production of these foodgrains) in the 1960s, there has been a decrease in millet consumption, and an increase in rice and wheat consumption21. Consistent with findings from the National Expenditure Survey 2011–2012 (ref. 8), we report that rural residents derive a greater proportion of total calories from cereals than urban residents. While refined cereals are the major source of carbohydrate calories (28.5%E) followed by milled whole grains (16.2%E), actual intakes may be higher, as processed foods, which often contain substantial amounts of refined cereals, were not included in our database.

India accounts for 15% of the global sugar consumption22. Our survey found that added sugar consumption was higher than the recommended 5%E in most states and exceeded levels reported in the NNMB survey (≈15 g d−1)23. The definition of added sugars in this survey included only sugar added at the table, during cooking and from products with added sugar on the product label. Therefore, actual intakes are likely much higher. Despite Food Safety and Standards Authority of India regulations24 to label added sugars in the Nutrition Facts panel, many processed foods in the unregulated food sector (such as fast-food eateries, street vendors, local bakeries, fruit juices and traditional Indian sweets or mithais) do not label for added sugars.

Total fat intake aligned with the NNMB survey17 and remained within the recommended ~30%E19. However, most states exceeded the saturated fat recommendation (<7%E)18, highlighting an urgent need for targeted public health interventions to lower intakes. While total PUFA intake was comparable with global levels25,26, MUFA intake was much lower than in the United States and Europe26,27, likely due to higher consumption of nuts and red meat in Western diets. Total protein intake was low nationwide when compared to the United States25. Plant protein intake was higher in India than in the West25, while animal protein intake was low due to low meat28 and fish intake29. Conversely, dairy protein intake met National ‘My plate for the day’ recommendations17 of ≈2%E. These low intakes of animal protein align with the recent National Family Health Survey-5 (ref. 30) data, which show that half of adults consumed pulses/beans and milk/curd daily, but fewer consumed animal foods daily.

The Global Burden of Disease Study demonstrated that, for South Asia, ~23% of T2D burden was due to excess intake of white rice31. Although refined and whole grains differ in their associations with NCD risk32, higher intakes of milled whole grains (wheat and millets) were still associated with adverse metabolic profiles, showing no benefit for milled whole grains. In India, most whole grains are consumed as milled flour for flatbreads such as chapati and roti. Milling lowers the particle size of whole wheat and increases its glycemic index to the extent that the glycemic response becomes similar to that of refined wheat products and white rice32,33. In several randomized crossover trials34,35 that examined the effects of whole grain processing, specifically milling, on glycemic control, finely milled whole wheat flour increased postprandial glycemic response and increased body weight. Given minimal intake of intact whole grains (brown rice, whole wheat and whole millets), public health recommendations should focus on lowering total carbohydrate intake from both refined and milled whole grains, which are the country’s primary cereal staples.

Given the higher risk of T2D associated with refined cereal consumption6,36, identifying optimal macronutrient replacements is imperative. Replacing carbohydrates with protein from plant sources, dairy, eggs or fish was associated with lower T2D risk, unlike western studies where animal protein was associated with a higher T2D risk37,38. This likely reflects the predominance of plant protein in Indian diets. However, the quality of plant protein is equally important39 because, in our study, protein from pulses and legumes, but not cereals, was associated with a lower risk of T2D. Replacing rapidly digested carbohydrates (refined grains) with legume protein may improve glycemic control and lower the overall glycemic index of the diet40, while also addressing protein deficiencies39. Although literature surrounding dairy and egg consumption and metabolic risk has been inconsistent41,42,43, replacing low-quality carbohydrates with dairy, egg or fish can mitigate metabolic risk in the Asian Indian population due to high-carbohydrate intakes and low protein intakes. In fact, low yogurt intake alone accounts for 13% of South Asia’s T2D burden31. The International Lipid Expert Panel recommends an increase in protein intake from poultry and fish for cardiometabolic risk reduction44. Another multi-country study has shown benefits with two weekly servings of fish (175 g per week) for individuals with cardiometabolic risk45. While red meat has been previously associated with higher NCD risk46, we found that red meat protein was not associated with likelihood of T2D or prediabetes, possibly due to low consumption.

While isocaloric substitutions are useful for understanding the effects of replacing one nutrient with another while keeping total energy intake constant, it is important to recognize that this approach involves statistical modeling rather than actual replacements by participants. Diet is a complex matrix of various foods and nutrients that interact synergistically or antagonistically, and simple isocaloric substitutions may not capture these interactions. We were unable to evaluate the effect of replacing carbohydrates with foods other than plant protein, such as intact whole grains, fruits and vegetables, which are also high in fiber and have been shown to decrease the risk of NCDs47. Further studies should investigate this in the Asian Indian population to inform more comprehensive nutritional policy recommendations.

The current nationwide high-carbohydrate intake represents a major risk factor for NCDs and should prompt policy action to lower consumption. Widespread reliance on subsidized rice and wheat likely drives high cereal intake, highlighting the need to prioritize pulses and legumes in subsidy programs over refined cereals48. Excess added sugar further endangers public health, and current tax rates on carbonated drinks have not reduced consumption49, suggesting the need for alternative strategies. Saturated fat intake also exceeds recommendations in many states, likely due to high intakes of ghee (clarified butter) in the North and the use of subsidized palm oil and coconut oil in the South. In addition, palmolein oil, due to its lower cost compared to other vegetable oils, is widely used in the food industry and in processed foods50. Policies should shift toward provision of healthier oils through the public distribution system (PDS), and raising consumer awareness to support healthier choices.

Our study has several strengths. This is a nationally representative study that highlights similarities and differences in the usual intakes of macronutrients between and within various regions of India. We used a validated food frequency questionnaire (FFQ)51 to assess diet across regions. We also explored a strategy to reduce metabolic risk by modeling isocaloric substitution of protein and its subtypes for carbohydrates. The findings from this study can be extrapolated to other countries in South and Southeast Asia with similar dietary patterns and where high-carbohydrate diets from cereal staples are prevalent.

Our findings should be interpreted in light of several limitations. First, this was a cross-sectional study, and therefore, reverse causation is a possibility. However, because our primary outcomes were newly diagnosed T2D and prediabetes, participants were unaware of their status when dietary information was collected. Second, a single measure of glycaemia (glycated hemoglobin (HbA1c) or blood glucose) was used to define T2D and prediabetes, although two measurements are recommended for making a clinical diagnosis. However, for epidemiological studies, a single measure is considered acceptable and often the only feasible option. The use of the American Diabetes Association’s52 less stringent cutoff for fasting blood glucose (100 mg dl−1) for prediabetes could have selected more individuals prone to reversion to normoglycemia compared to the higher cut-point advocated by the WHO (110 mg dl−1)53. Third, we were unable to capture intakes of processed foods and trans-fat content of commercial products due to lack of a trans-fat database and labeling regulations. Fourth, although we adjusted for potential confounders, residual confounding is possible due to the observational design. Fifth, we did not include medication use in defining dyslipidaemia and hypertension; therefore, actual prevalence of these risk factors might be higher than reported. Sixth, because our analyses were prespecified, findings were not adjusted for multiple comparisons, potentially increasing the likelihood of a type I error. However, given the mechanistic links among our outcomes, a type I error is unlikely. Seventh, although the FFQ used in INDIAB was validated, measurement errors are likely. Because these errors are nondifferential with respect to the outcome, the associations are likely to be attenuated. Eighth, as the dietary data were collected from every fifth participant, the included population may not fully represent the larger population. However, demographic differences between included and excluded participants were minimal and not clinically relevant, suggesting good external validity.

In conclusion, based on our findings, national guidelines and policy changes should emphasize reduced consumption of refined carbohydrates and saturated fat, and recommend increased consumption of protein, primarily from plant and dairy sources. Public health messaging should highlight practical strategies such as gradually increasing the ratio of pulses, legumes and dairy in the diet to achieve a desirable macronutrient composition for NCD prevention. Such dietary shifts can reverse the nutrition transition, address protein inadequacies and improve overall diet quality. Subsidies through the PDS should emphasize healthier protein sources (pulses and legumes) and healthier edible oils (low in saturated fat) over refined grains like white rice. Minimum support prices for cereals, if extended to pulses, legumes and dairy can improve both the quality and the quantity of protein consumed. Because healthcare is a state government responsibility in India, the study’s findings are vital for helping states to enhance their PDS plans and to promote subsidies that prevent NCDs. A multisectoral approach involving healthcare, agriculture and socioeconomic policy is crucial for fostering healthier diets and lowering NCD risk in India.

Methods

Sampling and study population

The ICMR–INDIAB national study is a cross-sectional door-to-door population-based survey that includes a representative sample of individuals aged ≥20 years drawn from 30 states and union territories, and the NCT of Delhi. To be nationally representative, the survey was conducted in multiple phases using a stratified multistage sampling design with three-level stratification based on geography, population size and socioeconomic status of each state. In phase 1 (2008–2010), individuals from Tamil Nadu, Chandigarh, Jharkhand and Maharashtra were recruited. Phase 2 was conducted from 2012 to 2013, and included undivided Andhra Pradesh, Bihar, Gujarat, Karnataka and Punjab. Phase 3 (2017–2018) included Delhi, Madhya Pradesh, Rajasthan and Uttar Pradesh, while phase 4 (2018–2019) included Kerala, Goa, Puducherry, Haryana and Chhattisgarh. The Northeast phase was conducted between 2011 and 2017 and included the states of Assam, Arunachal Pradesh, Manipur, Meghalaya, Mizoram, Nagaland, Sikkim and Tripura. The final phase (phase 5) was conducted between 2019 and 2020 and covered Himachal Pradesh, Uttarakhand, Odisha and West Bengal. Additional methodological details of the ICMR–INDIAB survey have been published earlier1. The primary sampling units (PSUs) were villages in rural areas and census enumeration blocks in urban areas. Using a systematic sampling method described elsewhere1, from each PSU, 24 households from urban areas and 56 households from rural areas were selected. To avoid selection bias with respect to age and sex, one individual was selected from each household using the WHO Kish method54.

The total sample size, including all 30 states, union territories and the NCT of Delhi, was 113,043 adults, with 79,506 from rural areas and 33,537 from urban areas. Of the 113,043 individuals, every fifth individual (n = 24,800) and all those with known and newly diagnosed diabetes were invited to provide information on diet. A total of 20,860 individuals provided information on diet (response rate 91.8%). The final dataset for this study comprised 18,090 adults (49% male, 27% urban), for whom all biochemical and dietary assessments were available for further analysis.

Ethics

The Institutional Ethics Committee at Madras Diabetes Research Foundation (MDRF) granted approval for the study, and written informed consent was obtained from all study participants. The study was registered with the Clinical Trials Registry of India (CTRI/2019/03/018095). The study was conducted in accordance with the principles outlined in the Declaration of Helsinki for medical research.

Study procedures

Information pertaining to demographic and socioeconomic characteristics of all participants was collected using a standardized, structured questionnaire. Weight, height and waist circumference were measured using standardized procedures. BMI was calculated as weight in kilograms divided by height in meters squared. Blood pressure was recorded in the sitting position twice with an interval of 5 min. Measurements were taken in the right arm to the nearest 1 mm Hg, using digital sphygmomanometers (Omron HEM-7101, Omron). The average of the two readings was used as the final reading. For quality assurance, equipment with the same specifications was used throughout the study. Interobserver and intra-observer coefficients of variation between the field technicians were documented to be less than 5% in all regions.

Biochemical assessments

Biochemical assessments were performed using standardized methods by the same team of laboratory technicians throughout the study. In individuals without self-reported T2D, we conducted an oral glucose tolerance test using capillary blood glucose (One Touch Ultra, LifeScan, Johnson & Johnson). Fasting venous blood samples were drawn for assessment of glycosylated hemoglobin (HbA1c) in every fifth participant. Venous samples were centrifuged within 1 h at the survey site and serum was transferred to labeled vials and temporarily stored in −20 °C freezers before being sent to the central laboratory. HbA1c was measured with high-pressure liquid chromatography using the Variant II Turbo machine (Bio-Rad, Hercules), which is certified by the National Glycohemoglobin Standardization Program as having documented traceability to the Diabetes Control and Complications Trial reference method3. Serum cholesterol (cholesterol esterase oxidase–peroxidase–amidopyrine method), serum triglycerides (glycerol phosphate oxidase–peroxidase–amidopyrine method) and high-density lipoprotein cholesterol (direct method, immune inhibition) were measured using the Olympus 2700/480 automated biochemistry analyzer (Fullerton) from 2008 to 2015, and the Beckman Coulter AU 680 clinical chemistry analyzer (Fullerton) from 2016 to 2021. LDL-C was calculated using the Friedewald equation55. The intra-assay and interassay coefficients of variation for biochemical assays conducted at the central laboratory ranged between 3.1% and 7.6%.

Assessment of diet

Diet was assessed using an interviewer-administered questionnaire, the MDRF–FFQ51. The MDRF–FFQ is a meal-based, quantitative FFQ that contains a list of 222 food items normally consumed by both urban and rural populations. Participants were instructed to indicate their usual frequency of consumption for each food item (number of times per day, week, month, year or never) and their usual serving size, reflecting their dietary patterns over the past year, using one of the following five categories: ‘never’, ‘daily’, ‘weekly’, ‘monthly’ and ‘yearly’. To help participants estimate portion sizes precisely, they were provided with a visual atlas of Indian foods. In addition, common household measures, such as household cups, bowls, ladles, spoons, wedges and circles of different diameters, were shown.

The MDRF–FFQ has been previously validated for adults in urban and rural areas across ten states in five regions of India51. The FFQ was administered to adults twice, 1 year apart (at 0 and 12 months). Reproducibility, assessed using intraclass correlation coefficients, showed moderate to good correlations for nutrients and food groups (for example, 0.50–0.77 for saturated fat and energy in urban areas, and 0.61–0.72 for protein and saturated fat in rural areas). Intraclass correlation coefficients for foods such as whole grains (urban, 0.53; rural, 0.61) and fruits and vegetables (>0.7) were also high, indicating good reproducibility. We also assessed the construct validity of carbohydrate and fat intake using serum lipids in regression analyses. The de-attenuated Pearson correlation for the energy-adjusted nutrients between FFQ and 24-h recalls ranged from 0.73 for carbohydrates to 0.35 for calcium.

The average daily intake of calories, macronutrients, dietary fiber, fatty acids and food groups (excluding nutrient supplements like protein powder) was computed using the ‘EpiNu’ Software (version 2.0). EpiNu consists of a wide range of recipes gathered from various places (ranging from households to commercial eating places such as fast-food outlets and popular restaurants) to be more realistic as to the actual eating habits of the population. The food groups were categorized based on the food items reported by the participants in the FFQ.

Calculation of %E for macronutrients

The %E for each individual was calculated as follows: %E from carbohydrates = (carbohydrates in g × 4 calories)/total daily calories × 100; %E from protein = (protein in g × 4 calories)/total daily calories × 100; and %E from fat = (fat in g × 9 calories)/total daily calories × 100.
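As an illustration only, the %E calculation can be sketched in Python. The function names and the example participant are hypothetical and are not taken from the EpiNu software; only the conversion factors (4, 4 and 9 calories per gram) come from the text.

```python
# Energy conversion factors (calories per gram) as stated in the text.
KCAL_PER_G = {"carbohydrate": 4.0, "protein": 4.0, "fat": 9.0}


def percent_energy(grams: float, kcal_per_gram: float, total_kcal: float) -> float:
    """Return the percentage of total daily energy (%E) supplied by a nutrient."""
    if total_kcal <= 0:
        raise ValueError("total_kcal must be positive")
    return grams * kcal_per_gram / total_kcal * 100.0


def macronutrient_percent_energy(carb_g: float, protein_g: float,
                                 fat_g: float, total_kcal: float) -> dict:
    """Compute %E for carbohydrate, protein and fat for one individual."""
    intakes = {"carbohydrate": carb_g, "protein": protein_g, "fat": fat_g}
    return {name: percent_energy(grams, KCAL_PER_G[name], total_kcal)
            for name, grams in intakes.items()}


# Hypothetical participant: 325 g carbohydrate, 50 g protein, 40 g fat
# on a reported total intake of 2,000 kcal per day.
example = macronutrient_percent_energy(325, 50, 40, 2000)
```

Because total daily calories are taken from the reported intake rather than summed from the three macronutrients, the three percentages need not add up to exactly 100.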

Calculation of %E food groups

  • For the purpose of this analysis, %E from refined cereals included meals/snacks prepared with white rice, refined wheat flour (white flour/maida), refined millet grains/flour, wheat semolina (rava) and vermicelli.

  • The %E from milled whole grains included calories from whole wheat and whole millets used in flour or grit form.

  • The %E from added sugar included white sugar, honey, jaggery and palm sugar added during cooking or processing and on the table. Added sugar from processed foods was not included in this definition due to lack of adequate information.

  • The %E from plant protein included protein calories derived from cereals (whole and refined), pulses, legumes, nuts and oilseeds.

  • The %E from dairy protein included protein calories derived from milk and its derivatives.

  • The %E from fermented dairy included Indian style yogurt (curd), buttermilk and lassi.

  • The %E from nonfermented dairy included all types of milk (cow/buffalo/yak/goat), milk added to coffee and tea and paneer (curdled milk solids).

  • The %E from animal protein included protein calories derived from meat, poultry, eggs, fish and seafoods.

Physical activity assessment

The MDRF Physical Activity Questionnaire captures data from multiple activity domains over the period of a year from males and females of varying ages and has been validated in both urban and rural settings, covering ten states in different regions56. This interviewer-administered questionnaire documents the frequency and duration of various activities, encompassing regular, obligatory and discretionary tasks across all domains. These domains include occupation, general activities (like sleep, personal care and household chores), transportation and leisure activities. The calculation of physical activity for an ‘average’ day was derived from a sum of the activities in various domains for a 24-h period. Total energy expenditure (TEE) was calculated from the time spent on various activities in the multiple domains and the energy cost of these activities. Energy cost was reported as a multiple of basal metabolic rate (BMR) and called the physical activity ratio. Total time spent on habitual activities was multiplied by the physical activity ratio to derive the TEE of 24 h. The physical activity level (PAL) was then calculated as TEE/BMR for 24 h (ref. 15). Participants were categorized as physically active or inactive. Moderate and vigorous PAL were combined for the ‘active’ category based on PAL cutoffs for moderate (1.70–1.99) and vigorous (2.0–2.4) activities, while physical inactivity was defined as a sedentary PAL of 1.40–1.69 (ref. 15).
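The PAL derivation described above can be sketched as a time-weighted average of physical activity ratios over 24 h. This is a minimal illustration, not the questionnaire's scoring code: the function names and the activity ratios in the example are hypothetical, and treating every PAL below 1.70 as inactive (including values under 1.40, which the text does not address) is an assumption.

```python
def physical_activity_level(activities):
    """Compute PAL from (hours_per_day, physical_activity_ratio) pairs.

    TEE over 24 h, expressed in multiples of hourly BMR, is the sum of
    hours x ratio; dividing by 24 gives PAL = TEE / BMR for the day.
    """
    total_hours = sum(hours for hours, _ in activities)
    if abs(total_hours - 24.0) > 1e-6:
        raise ValueError("activities must cover exactly 24 hours")
    return sum(hours * ratio for hours, ratio in activities) / 24.0


def classify_pal(pal: float) -> str:
    """Assumed binary split: moderate or vigorous (PAL >= 1.70) is 'active'."""
    return "active" if pal >= 1.70 else "inactive"


# Hypothetical day; the ratios are illustrative values only.
day = [
    (8.0, 1.0),  # sleep
    (8.0, 1.5),  # seated occupational work
    (6.0, 2.0),  # household chores
    (2.0, 3.0),  # walking for transport
]
pal = physical_activity_level(day)
category = classify_pal(pal)
```

For this hypothetical day, the weighted sum is 38 BMR-hours, giving a PAL of about 1.58, which falls in the sedentary range.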

Assessment of metabolic risk

Newly diagnosed T2D was defined as HbA1c ≥6.5% and/or fasting (capillary) blood glucose ≥126 mg dl−1 (≥7 mmol l−1) or 2-h postglucose load (capillary) blood glucose ≥220 mg dl−1 (≥12.2 mmol l−1)1,52,53,57. Newly diagnosed prediabetes was defined as an HbA1c 5.7–6.4% or if fasting glucose was ≥100–125 mg dl−1 (≥5.6–6.9 mmol l−1) and/or if 2-h postload glucose (capillary) value was above 160 mg dl−1 but <220 mg dl−1 (≥8.9 to <12.2 mmol l−1)52,53,57. Generalized obesity was defined using Asia-specific cutoffs of BMI of 25 kg m−2 or higher. Abdominal obesity was defined based on the WHO Asia Pacific guidelines, with a waist circumference of 90 cm or greater for men and 80 cm or greater for women58.
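The glycemic definitions above can be expressed as a small classifier. This is a sketch under stated assumptions: the function name and labels are hypothetical, glucose values are in mg dl−1, and the lower 2-h bound for prediabetes is taken as ≥160 mg dl−1 (matching the ≥8.9 mmol l−1 bound), although the text also words it as "above 160".

```python
def glycemic_status(hba1c=None, fasting=None, two_hour=None) -> str:
    """Classify a participant without self-reported T2D.

    hba1c is in %, fasting and two_hour are capillary glucose in mg/dl.
    Any measurement may be None if it was not taken.
    """
    if hba1c is None and fasting is None and two_hour is None:
        raise ValueError("at least one glycemic measurement is required")

    def at_least(value, cutoff):
        return value is not None and value >= cutoff

    # T2D: any one criterion at or above its diagnostic cutoff.
    if at_least(hba1c, 6.5) or at_least(fasting, 126) or at_least(two_hour, 220):
        return "newly diagnosed T2D"
    # Prediabetes: any criterion in the intermediate range
    # (the upper bounds are already excluded by the check above).
    if at_least(hba1c, 5.7) or at_least(fasting, 100) or at_least(two_hour, 160):
        return "newly diagnosed prediabetes"
    return "normoglycemia"


status_a = glycemic_status(hba1c=6.7, fasting=110, two_hour=180)
status_b = glycemic_status(hba1c=5.5, fasting=110, two_hour=150)
status_c = glycemic_status(hba1c=5.2, fasting=90, two_hour=120)
```

Checking the T2D criteria first means the prediabetes ranges do not need explicit upper bounds.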

Hypertension was defined as a systolic blood pressure of ≥140 mm Hg, or a diastolic blood pressure of ≥90 mm Hg (ref. 59). Dyslipidaemia was defined using the National Cholesterol Education Program Adult Treatment Panel III guidelines as follows: serum cholesterol concentrations of ≥200 mg dl−1 (≥5.2 mmol l−1), serum triglyceride concentrations of ≥150 mg dl−1 (≥1.7 mmol l−1), low high-density lipoprotein cholesterol (male, <40 mg dl−1 (1.04 mmol l−1); female, <50 mg dl−1 (1.3 mmol l−1)) and high low-density lipoprotein cholesterol concentrations of ≥130 mg dl−1 (3.4 mmol l−1)16. Metabolic risk was defined as the presence of any one of newly diagnosed T2D, prediabetes, dyslipidaemia, generalized obesity, abdominal obesity or hypertension.
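A minimal sketch of the composite metabolic risk definition follows. The `Participant` class and its field names are hypothetical, lipid values are in mg dl−1, and treating dyslipidaemia as present when any one lipid criterion is met is an assumption, since the text lists the criteria without stating how they are combined.

```python
from dataclasses import dataclass, replace


@dataclass
class Participant:
    sex: str              # "male" or "female"
    bmi: float            # kg/m^2
    waist_cm: float
    sbp: float            # systolic blood pressure, mm Hg
    dbp: float            # diastolic blood pressure, mm Hg
    total_chol: float     # mg/dl
    triglycerides: float  # mg/dl
    hdl: float            # mg/dl
    ldl: float            # mg/dl
    dysglycemia: bool     # newly diagnosed T2D or prediabetes


def hypertension(p: Participant) -> bool:
    return p.sbp >= 140 or p.dbp >= 90


def dyslipidaemia(p: Participant) -> bool:
    # Assumption: any single abnormal lipid value counts.
    low_hdl = p.hdl < (40 if p.sex == "male" else 50)
    return (p.total_chol >= 200 or p.triglycerides >= 150
            or low_hdl or p.ldl >= 130)


def generalized_obesity(p: Participant) -> bool:
    return p.bmi >= 25


def abdominal_obesity(p: Participant) -> bool:
    return p.waist_cm >= (90 if p.sex == "male" else 80)


def metabolic_risk(p: Participant) -> bool:
    """True if any one of the component conditions is present."""
    return (p.dysglycemia or dyslipidaemia(p) or generalized_obesity(p)
            or abdominal_obesity(p) or hypertension(p))


# Hypothetical participant with no risk factors, then with a larger waist.
healthy = Participant("female", 23.0, 78, 120, 80, 180, 120, 55, 100, False)
larger_waist = replace(healthy, waist_cm=82)
```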

Statistical analysis

For the purpose of this analysis, the Indian states were divided into the following six geographical zones: North (Chandigarh, Delhi, Haryana, Himachal Pradesh, Punjab and Rajasthan), South (Andhra Pradesh (undivided-subsequently bifurcated into Andhra Pradesh and Telangana), Karnataka, Kerala, Puducherry, and Tamil Nadu), East (Bihar, Jharkhand, Odisha and West Bengal), West (Goa, Gujarat and Maharashtra), Central (Chhattisgarh, Madhya Pradesh, Uttar Pradesh and Uttarakhand) and Northeast (Arunachal Pradesh, Assam, Manipur, Meghalaya, Mizoram, Nagaland, Sikkim and Tripura). Sensitivity analysis on the dietary profile of undivided Andhra Pradesh did not differ from the separate states of Telangana and Andhra Pradesh (Supplementary Table 4), and hence in this study, undivided Andhra Pradesh (matching with the actual data collection period of 2012–2013) was considered. To account for the multistage complex survey design of the study, to ensure that our results are representative of the larger population and to compensate for the unequal probability of selection, our analyses adjusted for sampling weights using the PROC SURVEY (frequency/mean) procedure, where all key survey elements were used in statistical analyses. The PSU was accounted for as the cluster, the normalized weight was accounted for as the final study weight and the state was accounted for as the stratum to estimate population means, medians, variance and proportions.

We excluded participants with self-reported T2D (n = 1,404), HbA1c of <4% (n = 26) or implausible dietary data, defined as a reported energy intake of <500 or >4,200 kcal d−1 (n = 1,340)60. After these exclusions, the final analytic sample consisted of 18,090 participants (Extended Data Fig. 3). Complete data were available for 90% of the study participants. Missing data were considered to be missing completely at random. To impute missing data, we used information on age, sex and BMI to identify the nearest neighbor and applied the k-nearest neighbor imputation method61 in Python (version 3.11.7). Given the pervasive presence of added sugar in foods and beverages, for analyses involving added sugar as an exposure, we excluded participants who consumed ≤5 g d−1 of added sugar. For analyses on saturated fat, we excluded implausible values (participants with >30%E coming from saturated fat, n = 80).
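To make the imputation step concrete, here is a minimal pure-Python sketch of k-nearest neighbor imputation using age, sex and BMI as matching variables. It is not the study's code: the function name, the standardized Euclidean distance, the value of k, the 0/1 coding of sex and the imputed variable (`hdl`) are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def knn_impute(records, target, features=("age", "sex", "bmi"), k=5):
    """Fill missing `target` values with the mean of the k nearest donors.

    records: list of dicts; a missing target is stored as None.
    Distance is Euclidean on z-scored features (an assumed choice).
    Returns new dicts and leaves the input records unchanged.
    """
    donors = [r for r in records if r[target] is not None]
    if not donors:
        raise ValueError("no complete records available as donors")

    # Standardize each feature so no single scale dominates the distance.
    scale = {}
    for f in features:
        values = [r[f] for r in records]
        sd = pstdev(values)
        scale[f] = (mean(values), sd if sd > 0 else 1.0)

    def z(record, f):
        centre, sd = scale[f]
        return (record[f] - centre) / sd

    def distance(a, b):
        return math.sqrt(sum((z(a, f) - z(b, f)) ** 2 for f in features))

    imputed = []
    for r in records:
        filled = dict(r)
        if r[target] is None:
            nearest = sorted(donors, key=lambda d: distance(r, d))[:k]
            filled[target] = mean(d[target] for d in nearest)
        imputed.append(filled)
    return imputed


# Hypothetical records; sex is coded 0/1 and HDL is the variable to impute.
records = [
    {"age": 30, "sex": 0, "bmi": 22.0, "hdl": 50.0},
    {"age": 31, "sex": 0, "bmi": 22.5, "hdl": 52.0},
    {"age": 60, "sex": 1, "bmi": 30.0, "hdl": 38.0},
    {"age": 62, "sex": 1, "bmi": 31.0, "hdl": 40.0},
    {"age": 61, "sex": 1, "bmi": 30.5, "hdl": None},
]
completed = knn_impute(records, "hdl", k=2)
```

In this toy example the incomplete record borrows from the two donors closest in age, sex and BMI. Library implementations may use different distance and weighting choices.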

Intakes of major macronutrients from different food sources were expressed as a percentage of total energy using the residual method60. We evaluated differences in intakes of carbohydrate and its major food sources (refined cereals including white rice, refined wheat flour (maida), refined millet grains/flour (minor millets like little millet and foxtail millet) and wheat semolina (rava); milled whole grains including whole wheat, whole millets and whole grains milled into flour; and added sugar including white sugar, honey, jaggery and palm sugar added during cooking and on the table), total protein and its subtypes (plant protein including cereals, pulses, legumes, fruits, vegetables (including roots and tubers), nuts and oilseeds; dairy protein including fermented dairy protein (Indian style yogurt (curd), buttermilk, lassi) and nonfermented dairy protein (all types of milk (cow/buffalo/yak/goat), milk added to coffee and tea, paneer (curdled milk solids)); animal protein including red meat, poultry, fish, egg but not dairy protein) and total fat and its subtypes (saturated fat, monounsaturated fat, polyunsaturated fat, n6 polyunsaturated fat, n3 polyunsaturated fat) by place of residence (urban versus rural), sex, geographical region (North, South, West, East, Central and Northeast) and within geographical regions. Because dietary data were skewed, we presented weighted medians and IQRs when describing intakes. All weighted descriptive statistics were tested for significance by regions, urban versus rural, male versus female using median and quantile test for continuous variables and chi-square test for categorical weighted variables. The map (Fig. 1) represents the top three cereal grains reported by the weighted percentage of adults in each state.
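For readers unfamiliar with weighted medians and IQRs, the sketch below shows one simple definition: the smallest observed value at which the cumulative share of sampling weight reaches the requested quantile. This is an assumption for illustration; survey software may interpolate or handle ties differently, and the function name and data are hypothetical.

```python
def weighted_quantile(values, weights, q):
    """Return the smallest value whose cumulative weight share is >= q."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equal length")
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]


# Hypothetical carbohydrate intakes (%E) with sampling weights; the last
# participant represents four times as many people as each of the others.
carb_pct = [55, 60, 65, 70, 75]
sample_weights = [1, 1, 1, 1, 4]

median = weighted_quantile(carb_pct, sample_weights, 0.50)
q1 = weighted_quantile(carb_pct, sample_weights, 0.25)
q3 = weighted_quantile(carb_pct, sample_weights, 0.75)
```

Here the weighted median is 70, whereas the unweighted median of the same five values would be 65, which shows how sampling weights shift the summary toward heavily weighted respondents.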

We used survey-weighted logistic regression (PROC SURVEYLOGISTIC) to evaluate the association between carbohydrate (%E) and metabolic risk. We also examined associations between refined cereals (g d−1), milled whole grains (g d−1), added sugar (%E) and metabolic risk. For this analysis, we excluded nonconsumers and those with intakes ≤50 g of refined cereals, white rice, milled whole grains, milled whole wheat and milled whole millets, and participants with intakes ≤5 g d−1 of added sugar. Models were adjusted for PSU, state code, population weight, age (in years), sex (male/female), family history of T2D (yes/no), tobacco use (never/current/past), alcohol use (never/current/past), PAL (sedentary/moderate/vigorous), time period the study was conducted, education qualification (primary school/high school/college), BMI (kg m−2; except for generalized obesity), place of residence (urban versus rural), intakes of total energy (kcal d−1), fat (%E), fruits (g d−1), vegetables (g d−1), tubers (g d−1), milk and milk products (g d−1), alcohol (g d−1), pulses and legumes (g d−1), added salt (g d−1) and edible oils and fats (g d−1). Models were further adjusted for newly diagnosed T2D (except in models with newly diagnosed T2D and newly diagnosed prediabetes as outcomes), newly diagnosed prediabetes (except in models with newly diagnosed T2D and newly diagnosed prediabetes as outcomes), hypertension, dyslipidaemia and abdominal obesity (except in models with abdominal and generalized obesity as outcomes). Models were additionally adjusted for all food sources and nutrients (carbohydrates, %E) except those that were the outcome. All analyses were conducted separately in each of the six regions. Heterogeneity in the association of the dietary exposure with metabolic risk was assessed by using the I2 statistic. I2 values of ~25%, 50% and 75% were considered to indicate low, moderate and high heterogeneity, respectively.
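The heterogeneity assessment can be illustrated with the standard I2 formula based on Cochran's Q, computed from region-specific log odds ratios and their standard errors. The text does not state how I2 was computed, so the inverse-variance weighting here is an assumption, and the function name and regional estimates are hypothetical.

```python
import math


def i_squared(log_odds_ratios, standard_errors):
    """Return I^2 (%) = max(0, (Q - df) / Q) x 100 using Cochran's Q."""
    if len(log_odds_ratios) != len(standard_errors) or len(log_odds_ratios) < 2:
        raise ValueError("need at least two estimates with matching SEs")
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = (sum(w * b for w, b in zip(weights, log_odds_ratios))
              / sum(weights))
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_odds_ratios))
    df = len(log_odds_ratios) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical odds ratios for six regions with their standard errors
# on the log scale.
regional_or = [1.10, 1.25, 1.05, 1.30, 1.15, 1.20]
regional_se = [0.05, 0.06, 0.04, 0.07, 0.05, 0.06]
heterogeneity = i_squared([math.log(o) for o in regional_or], regional_se)
```

The result is bounded between 0% and 100% and would then be read against the ~25%, 50% and 75% benchmarks given in the text.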

We statistically modeled the isocaloric substitution of 5% of energy from carbohydrates with an equal amount from protein (from actual foods and not supplements like protein powders) and fat by simultaneously including %E derived from carbohydrates with %E from the replacement nutrient as continuous variables along with other covariates. The difference in the beta coefficients of the two macronutrients being compared, along with their variances and covariance, was used to estimate the OR and 95% confidence interval for the substitution effect62. Additionally, we statistically modeled the replacement of 50 g of refined cereals with 50 g of milled whole grains, whole wheat flour, or whole millet flour. The base model was adjusted for PSU, state code, population weight, age (in years), sex (male/female), family history of T2D (yes/no), tobacco use (never/current/past), alcohol use (never/current/past), PAL (sedentary/moderate/vigorous), time period the study was conducted, education qualification (primary school/high school/college), BMI (kg m−2; except for generalized obesity), place of residence (urban versus rural) and intakes of total energy (kcal d−1). It was further adjusted for newly diagnosed T2D (except in models with newly diagnosed T2D and newly diagnosed prediabetes as outcomes), newly diagnosed prediabetes (except in models with newly diagnosed T2D and newly diagnosed prediabetes as outcomes), hypertension, dyslipidaemia and abdominal obesity (except in models with abdominal obesity as an outcome). Models were additionally adjusted for all nutrients except the nutrient being replaced. In sensitivity analyses, we tested for potential effect modification of the association between nutrients and metabolic risk by place of residence (urban versus rural) and sex (male versus female). Since these tests were not significant (Pinteraction > 0.05), we present results without stratification. All statistical analyses were two sided and performed with R software (version 4.3.3) and SAS (version 9.4; SAS Institute).
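The substitution estimate described above can be sketched as follows. With both nutrients entered per 1%E, the log odds ratio for replacing 5%E of carbohydrate is five times the difference in coefficients, and its variance combines both variances and the covariance. The function name and all coefficient values are hypothetical; a Wald-type 95% interval with z = 1.96 is assumed.

```python
import math


def substitution_odds_ratio(beta_new, beta_old, var_new, var_old, cov,
                            units=5.0, z=1.96):
    """OR and 95% CI for replacing `units` %E of one nutrient with another.

    beta_* are logistic regression coefficients per 1 %E, var_* their
    variances and cov their covariance, all from the same fitted model.
    """
    diff_variance = var_new + var_old - 2.0 * cov
    if diff_variance < 0:
        raise ValueError("inconsistent variance and covariance inputs")
    log_or = units * (beta_new - beta_old)
    se = units * math.sqrt(diff_variance)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))


# Hypothetical coefficients: plant protein (replacement nutrient) versus
# carbohydrate (nutrient being replaced).
odds_ratio, ci_low, ci_high = substitution_odds_ratio(
    beta_new=-0.02, beta_old=0.01,
    var_new=0.0001, var_old=0.000025, cov=0.00001,
)
```

With these illustrative inputs the odds ratio is exp(−0.15), about 0.86, with an upper confidence limit below 1. The numbers are not results from the study.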

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.