Introduction

Cellulose, chitin, and amylose are naturally occurring crystalline polysaccharides. Owing to their abundance in nature, they are favorable organic resources for transformation into highly functional materials. Cellulose has been industrially transformed into microcrystalline materials, regenerated fibers, and biochemical materials using dissolution, hydrolysis, and derivatization of wood pulp [1, 2]. Recently, cellulose and chitin nanofibers, which are highly crystalline fibers, have attracted attention as environmentally friendly, highly functional materials [3,4,5]. The potential use of chitin and chitosan as materials for medical applications is currently under active investigation [6]. Amylose has been developed as a functional material through the inclusion of low-molecular-weight compounds and polymers in composites [7]. Cellulose and amylose derivatives are used as chiral separation materials worldwide [8, 9]. Considerable efforts have been made to develop materials with controlled polysaccharide sequences through precise synthesis, such as derivatization and enzyme-catalyzed polymerization [10].

The crystal structures of polysaccharides have been studied for many years, and highly refined crystallographic data have been obtained through X-ray and neutron diffraction using synchrotron radiation sources. However, basic knowledge of the interfacial interactions and of the mechanisms of self-assembly and fiber deformation of crystalline polysaccharides remains incomplete. Although the processes used to form higher-order structures, including dissolution and regeneration, are important from an industrial point of view, their molecular mechanisms remain unclear. As a result, material design through simulations and informatics has become a growing research area. This focus review covers theoretical and computational studies, including atomistic simulations, performed by our research group on the crystallographic properties and novel nanostructures of cellulose, the crystal structure of amylose analog polysaccharides, and the dissolution mechanism of cellulose and chitin crystalline fibers.

Twisting deformation and crystal transformation of cellulose fibers

Cellulose is produced as a highly crystalline component in plant cell walls, and its native form exists as crystalline polymorphs. The crystal structure of cellulose nanofibers is generally inherited from that of native cellulose, which exists in two different forms: Iα and Iβ [11,12,13]; thus, basic knowledge of their structural properties and fiber morphology is important for the development of cellulose materials. Regenerated and mercerized cellulose fibers, comprising cellulose II with low crystallinity, constitute one of the forms used in industry [2, 14]. Native cellulose is converted into cellulose IIII by swelling in liquid ammonia or treating with amines and reverts to cellulose I through a crystal transition under hydrothermal conditions [15]. Interestingly, cellulose IIII is highly digestible by cellulase enzymes [16,17,18].

The crystal structure of cellulose is characterized by planar, twofold helical molecular chains, with polar groups at the equatorial positions of the glucose residues arranged along the molecular chain axis; this arrangement enables the formation of the O–H3∙∙∙O5 intramolecular hydrogen bonds and hydrophobic pyranose rings along the axial direction. In contrast, the crystal structures of the various polymorphs are characterized by cellulose molecular chains that are connected by intermolecular hydrogen bonds between the polar groups to form molecular chain sheets.

In the crystal structure of native cellulose (cellulose Iα and Iβ), the molecular chain sheets with O–H6∙∙∙O3 intermolecular hydrogen bonds are nearly planar [19, 20]. However, in the crystal structure of cellulose II and IIII, molecular chain sheets are corrugated with O–H6∙∙∙O2 intermolecular hydrogen bonds along two directions [21,22,23]: (010) and (110) in cellulose II and (100) and (\(1\bar{1}0\)) in cellulose IIII. Therefore, the molecular chain sheets can be broadly classified as native or non-native with respect to hydrogen bonding and sheet shape. In terms of chain arrangement, parallel molecular chains are accommodated in the crystal structure of cellulose I and IIII, whereas antiparallel chains are found in the crystal structure of cellulose II [24,25,26].

Molecular dynamics (MD) studies have been performed on cellulose crystalline fibers in aqueous solutions by various research groups. In the early years of this research field, Yui et al. used MD calculations to show that the crystal models of native cellulose were deformed through a spontaneous right-handed twist [27,28,29]. Indeed, actual microfibrils of native cellulose also underwent this deformation, as observed using transmission electron microscopy and atomic force microscopy [30, 31]. In contrast, we found that the crystal models of cellulose IIII retained their initial fiber shape at room temperature, whereas the crystal models of cellulose II were moderately deformed in solvated MD calculations [32].

Wada performed an in situ experimental study of the crystal transition of cellulose IIII to cellulose Iβ and proposed that this process was facilitated by the conversion of IIII (\(1\bar{1}0\)) to Iα (110) and Iβ (200) lattice plane sheets [15]. We partially reproduced a similar crystal transition through MD simulations of the crystal models of cellulose IIII in hot water [33, 34]. The results revealed the disappearance of intermolecular hydrogen bonds in IIII (100) sheets with the conversion of the intermolecular hydrogen bonds from IIII-like O–H6∙∙∙O2 to I-like O–H6∙∙∙O3 in IIII (\(1\bar{1}0\)) chain sheets, accompanied by irreversible conversion of the conformation of hydroxymethyl groups from gauche-trans (gt) to trans-gauche (tg) (Fig. 1).

Fig. 1

Conversion of the molecular chain sheets with the exchange of intermolecular hydrogen bonds during the crystal transition from cellulose IIII to cellulose I under hydrothermal conditions. The schematic diagram below illustrates the conversion scheme of the molecular chain arrangement; here, the boxes colored in orange refer to cellulose molecular chains, and the solid blue lines indicate intermolecular hydrogen bonds
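The gt and tg labels used above denote staggered rotamers of the hydroxymethyl group. As a minimal sketch, assuming the common convention in which the rotamer is assigned from the O5–C5–C6–O6 torsion angle ω (gg near −60°, gt near +60°, tg near 180°), a single residue in a trajectory frame could be classified as follows. The function names and the 120°-wide bins are illustrative assumptions and are not the analysis code used in [33, 34].

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p1, p2, p3, p4):
    """Signed torsion angle p1-p2-p3-p4 in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def classify_hydroxymethyl(omega_deg):
    """Assign gg / gt / tg from omega = O5-C5-C6-O6 (assumed convention)."""
    w = (omega_deg + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
    if 0.0 <= w < 120.0:
        return "gt"
    if -120.0 <= w < 0.0:
        return "gg"
    return "tg"

def rotamer_from_atoms(o5, c5, c6, o6):
    """Classify one residue from the Cartesian coordinates of its four atoms."""
    return classify_hydroxymethyl(dihedral_deg(o5, c5, c6, o6))
```

Counting the fraction of residues in each class per frame would then show the gt-to-tg conversion as a time series.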

Structural stability of the molecular chain sheet models of the cellulose crystal polymorphs

The molecular chain sheets of cellulose I are stacked as a result of van der Waals and hydrophobic interactions [35]. Similarly, according to the crystal transition mechanism, the crystal structure of cellulose IIII features stacked (\(1\bar{1}0\)) sheets, which are conserved during crystal transitions. Against this background, we attempted a different method for characterizing cellulose crystal structures. To investigate whether the molecular chain sheets of the crystal structure of cellulose were stable (structurally conserved), we performed density functional theory (DFT) calculations of molecular chain sheet models isolated from the crystal structures of four cellulose polymorphs of Iα, Iβ, II, and IIII [32, 36]. The molecular chain sheets in the crystal structure remained at the potential energy minimum under the crystalline packing forces, whereas the molecular chain sheets in the isolated state moved away from the minimum point on the potential surface; thus, subsequent optimization led to structural changes, as shown in Fig. 2.

Fig. 2

Evaluation of the structural stability of the molecular chain sheets of the crystal structures of cellulose. a Potential energy surfaces of the molecular chain sheets in various environments and b superimposed chain sheets of the initial (blue) and optimized (red) models. Reproduced under the terms of the Creative Commons CC BY license from [36]

Figure 3 depicts the stability of the molecular chain sheets of cellulose crystal polymorphs in terms of deformation behavior. During DFT optimization, the (110) chain sheet of cellulose Iα and the (200) chain sheet of cellulose Iβ undergo a right-handed twist with a similar amount of twisting. In the crystal structures of cellulose Iα and Iβ, the molecular chain sheets are structured similarly but stacked differently. In the two-way stacked sheet models of native cellulose, the Iα (100) and Iβ (\(1\bar{1}0\)) sheets retain their initial structures, whereas the Iα (010) and Iβ (110) sheets undergo complete collapse of the stacked structure of the molecular chains. Consequently, the stacked sheets derived from the Iα (100) and Iβ (\(1\bar{1}0\)) planes are the major structural units in the crystal structures of native cellulose. Moreover, the twisting deformation of cellulose fibers originates from the structural stability of the Iα (110) and Iβ (200) planes; these results confirm that deformation stress is inherent in the crystal structure. These analyses reveal the molecular factors of deformation observed in the cellulose fibers.

Fig. 3

Projections of the ab base planes of the cellulose crystal polymorphs and the structural stability of the molecular chain sheets. Stable and unstable planes are shown in the red and blue boxes, respectively. The red rotating arrows indicate the chirality of twisting deformation

The molecular chains in adjacent (010) and (020) planes of cellulose II are oppositely oriented to each other. Accordingly, molecular chains in the (110) plane of cellulose II in the diagonal direction are arranged in alternating orientations to produce antiparallel structures. During DFT optimization, the II (010) and II (020) molecular chain sheets undergo twisting in opposite directions, with right- and left-handed chirality, respectively, whereas the II (110) molecular chain sheet experiences significant collapse of its initial sheet structure. Consistent with the opposite chirality deformation of the II (010) and II (020) planes, moderate deformation occurs during solvated MD calculations of the crystal models of cellulose II.

In the case of the two chain sheets constituting the crystal structure of cellulose IIII, DFT-optimized IIII (\(1\bar{1}0\)) chain sheets retain their original structure, whereas the DFT-optimized IIII (100) chain sheet forms a tube with a collapsed sheet structure. The structural features of the isolated sheet models reflect those of the parent crystal models observed in the solvated MD simulations.
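The handedness and amount of twist discussed in this section can be quantified from the rotation of a cross-sectional reference vector along the fiber axis. The sketch below shows one possible measure, which is not necessarily the one used in [32, 36]. It assumes two reference vectors taken at two positions along the chain axis (for example, vectors joining two neighboring chains in the same sheet) and reports the signed rotation per unit length, with positive values corresponding to a right-handed twist. All names are hypothetical.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _project_out(v, unit_axis):
    """Remove the component of v that is parallel to unit_axis."""
    k = _dot(v, unit_axis)
    return tuple(v[i] - k * unit_axis[i] for i in range(3))

def twist_rate_deg(u, v, axis, separation):
    """Signed twist in degrees per unit length.

    u: reference vector at the first cross section
    v: the same reference vector at a cross section `separation`
       further along `axis`
    Positive result: right-handed twist; negative: left-handed.
    """
    if separation <= 0:
        raise ValueError("separation must be positive")
    norm = math.sqrt(_dot(axis, axis))
    a = tuple(x / norm for x in axis)
    up, vp = _project_out(u, a), _project_out(v, a)
    angle = math.atan2(_dot(_cross(up, vp), a), _dot(up, vp))
    return math.degrees(angle) / separation
```

Applied to sheets that twist in opposite directions, such as the II (010) and II (020) sheets described above, this measure would return values of opposite sign.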

Prediction of the cellulose nanotubes derived from the molecular chain sheets

During DFT optimization, the molecular chain sheet in the IIII (100) plane spontaneously transforms into a tube. The predicted nanostructure of the cellulose nanotube (CelNT) consists of cellulose molecular chains oriented in a right-handed fourfold helix with one-quarter chain staggering, as shown in Fig. 4a [37, 38]. According to the construction principle, the tube structure can be infinitely extended with respect to the degree of polymerization and the number of molecular chains. Furthermore, CelNTs are characterized by hydrophobicity owing to the closed structure of their chain sheet edges resulting from the intermolecular hydrogen bonds.

Fig. 4

a Formation of cellulose nanotubes (CelNTs) from cellulose IIII (100) sheets, as predicted by the DFT calculations. b Average structure of the CelNT model in benzene evaluated by MD calculations. Partially modified with permission from [37], Copyright 2018 Elsevier

Using the construction principle, we constructed CelNT models with 16 cellulose chains, each with a degree of polymerization of 80 glucose residues, and evaluated their structural stability in various solvents [37,38,39]. MD simulations revealed that the tube structures of the CelNT models collapsed into stacked structures of the cellulose molecular chains in polar solvents, such as water and ethyl acetate; however, their tube structures were preserved through the continuous exchange of the intermolecular hydrogen bonds in non-polar solvents, such as chloroform, benzene, and cyclohexane (Fig. 4b). Moreover, benzene and cyclohexane were adsorbed on the surface of the CelNT models, which stabilized the tube structure. Because benzene and cycloalkanes have a wide range of derivatives, we propose that they are good solvents for CelNT fabrication.

The DFT-optimized structure in Fig. 4a is explained by a slightly left-handed cellulose strand due to end effects caused by short-chain oligomers (decamers) and stronger hydrogen bonds in vacuum. On the other hand, the MD structure in Fig. 4b shows the dynamic and structural stability with the exchange of hydrogen bonds in benzene.

If the method established to fabricate CelNTs is dependent on the selected solvent and controlled arrangement of the cellulose molecular chains, then novel nanofibers with variable diameters can be designed by adjusting the relevant preparation conditions, enabling the introduction of arbitrary guest molecules. DFT and MD calculations indicate that a larger CelNT diameter correlates with more stable intermolecular hydrogen bonds [37, 38]. Similar molecular arrangements can be attained as multiwall structures or inclusion complexes with hydrophobic guest molecules, which could potentially enable the formation of a larger diameter tubular structure. The spontaneous formation of tubular assemblies has remained speculative due to the limited spatiotemporal scale of the atomistic simulations. Large-scale extended ensemble methods in MD simulations can be applied to search for suitable solvents and supplementary ingredients, such as templates and guest molecules, in future studies.

Evaluation of the artificial crystal structures of amylose analog polysaccharides

Higher-order structures from natural macromolecules and their analogs formed through molecular interactions may exhibit new properties and broaden the scope of applications. Kadokawa reported that thermostable α-glucan phosphorylase (GP) isolated from thermotolerant bacteria (Aquifex aeolicus VF5) recognized analog substrates of α-D-glucose 1-phosphate (Glc-1-P) owing to its weak substrate specificity and, furthermore, catalyzed the glycosyl transfer reaction during polymerization to produce amylose analog polysaccharides [10, 40,41,42,43]. For example, GP catalyzed the enzymatic polymerization of 2-deoxy-α-D-glucose 1-phosphate (dGlc-1-P); dGlc-1-P was produced in situ from 1,2-dideoxy-D-glucose (D-glucal) in the presence of inorganic phosphate, with a maltotetraose primer to produce α(1→4)-linked 2-deoxyglucose chains (2-deoxyamylose). Powder X-ray diffraction analysis of 2-deoxyamylose revealed the formation of an artificial crystal structure that was completely different [43] from the well-known crystal structures of the native amylose polymorphs; these native polymorphs are characterized by 6-fold left-handed parallel double helices that intertwine to form O2∙∙∙O3 intramolecular and O2∙∙∙O6 intermolecular hydrogen bonds [44, 45].

Efficient sampling across a wide range of three-dimensional structures is needed for polymer modeling. Thus, we evaluated the artificial crystal structure of 2-deoxyamylose using MD simulations and extended ensemble methods. Temperature replica-exchange MD (T-REMD) simulations represent an extended ensemble method in which different temperature conditions are exchanged among the multiple systems (replicas) to improve sampling on the time scale of the simulation, as demonstrated for protein folding [46]. Moreover, crystallization from the chain dispersed state of 2-deoxyamylose has been successfully simulated via T-REMD calculations [43].
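The exchange step at the core of T-REMD can be illustrated with a short sketch. It assumes the standard Metropolis criterion for swapping two replicas, P = min(1, exp[(β_i − β_j)(E_i − E_j)]) with β = 1/(k_B T), and a geometrically spaced temperature ladder, which is a common but not universal choice. The function names, the unit choice (kcal/mol), and any temperatures or energies passed in are illustrative assumptions; the settings used in [43] are not reproduced here.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def temperature_ladder(t_min, t_max, n_replicas):
    """Geometrically spaced temperatures from t_min to t_max (assumed spacing)."""
    if n_replicas < 2:
        raise ValueError("need at least two replicas")
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]

def exchange_probability(e_i, e_j, t_i, t_j):
    """Metropolis acceptance probability for swapping replicas i and j.

    e_i, e_j: potential energies of the configurations currently
              held at temperatures t_i and t_j.
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

def attempt_exchange(e_i, e_j, t_i, t_j, rng=random):
    """Return True if the swap is accepted in this attempt."""
    return rng.random() < exchange_probability(e_i, e_j, t_i, t_j)
```

A swap is always accepted when the colder replica currently holds the higher-energy configuration; otherwise it is accepted with a probability that decays with the energy and temperature gaps, which is why neighboring temperatures are usually kept close.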

T-REMD simulations revealed that the isolated chains of 2-deoxyamylose spontaneously assemble into double helices with antiparallel chain polarity (Fig. 5a). In addition, pyranose rings are stacked in the double helix structure; these results indicate increased van der Waals and hydrophobic interactions without O2∙∙∙O3 intramolecular and O2∙∙∙O6 intermolecular hydrogen bonds owing to the absence of hydroxy groups at the C-2 position. Furthermore, the packing of the double helices into hexagonal structures accompanies the stacking of pyranose rings as building blocks, and the blocks are separated into hydrophilic and hydrophobic regions (Fig. 5b).

Fig. 5

a Anti-parallel double helices of 2-deoxyamylose and b hexagonal packing of the helices. Cyan and yellow colors are used in the individual molecular chains in the double helix. Reprinted with permission from [43], Copyright 2020 Elsevier

Understanding the dissolution of cellulose and chitin crystalline fibers in ionic liquids

Structural polysaccharides, such as cellulose and chitin, exhibit low processability and poor solubility owing to their high crystallinity. Because 1-butyl-3-methylimidazolium chloride (BMIMCl) dissolves cellulose [47], ionic liquids have attracted attention as solvents to dissolve polysaccharides. Although ionic liquids are widely used to dissolve cellulose [48, 49], few ionic liquids dissolve chitin; one example is 1-allyl-3-methylimidazolium bromide (AMIMBr) [50]. However, few studies have investigated the detailed dissolution mechanism of structural polysaccharides. Therefore, we adopted an MD approach to study the dissolution of cellulose and chitin crystalline fibers in imidazolium-based ionic liquids [51,52,53].

MD simulations of the dissolution of cellulose and chitin in ionic liquids require highly precise force field parameters. Thus, we adjusted the charge scale to reproduce the density and transport properties of ionic liquids [51, 54,55,56] and to improve the dihedral angles describing the cellulose chain conformation at the pyranose ring [57].
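One widely used way to adjust the charge scale of a nonpolarizable ionic liquid force field is to multiply all partial charges of the ions by a uniform factor below 1. The sketch below illustrates only that general idea; whether the adjustment in [51, 54,55,56] was uniform is not stated here, and the function names and the bounds check are hypothetical.

```python
def scale_charges(charges, factor):
    """Return a new mapping with every partial charge multiplied by `factor`.

    charges: dict of atom name -> partial charge (in units of e)
    factor:  uniform scaling factor, assumed to lie in (0, 1]
    """
    if not 0.0 < factor <= 1.0:
        raise ValueError("factor is assumed to lie in (0, 1]")
    return {name: q * factor for name, q in charges.items()}

def net_charge(charges):
    """Sum of partial charges; equals +/- factor for a scaled monovalent ion."""
    return sum(charges.values())
```

Because the factor is applied to cations and anions alike, the system stays electrically neutral, while the reduced ion charges weaken ion pairing; in practice the factor would be tuned until the simulated density and transport properties match experiment.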

MD simulations revealed that ionic liquids penetrate between the molecular chains to cleave the hydrogen bonds during cellulose dissolution. Furthermore, ionic liquids demonstrating excellent solvation, such as 1-allyl-3-methylimidazolium chloride (AMIMCl) and 1-ethyl-3-methylimidazolium chloride (EMIMCl), not only cleaved hydrogen bonds but also peeled the molecular chains from the crystal phase, thereby dispersing the chains in solution (Fig. 6). Although cellulose molecular chains were also peeled off by BMIMCl and imidazolium acetates (AMIMOAc, BMIMOAc, and EMIMOAc), the cleavage of the intermolecular hydrogen bonds was insufficient owing to the slow uptake of the cations into the chains. The MD simulations of the crystal models revealed that the structures of the cellulose crystalline fibers did not change in imidazolium bromides (AMIMBr, BMIMBr, and EMIMBr), which could not dissolve cellulose.

Fig. 6

MD structure of the cellulose crystalline fibers in 1-allyl-3-methylimidazolium chloride (AMIMCl) and the dissolution of cellulose crystalline fibers in ionic liquids. Partially modified with permission from [51], Copyright American Chemical Society 2018
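The cleavage of intermolecular hydrogen bonds described above is typically tracked by counting hydrogen bonds between different chains in each trajectory frame. The following minimal sketch assumes a purely geometric criterion, a donor–acceptor distance of at most 3.5 Å and a donor–H–acceptor angle of at least 150°. These thresholds are common defaults chosen for illustration and are not necessarily those used in [51,52,53]; the data layout and function names are likewise hypothetical.

```python
import math

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, measured at vertex b."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_theta))

def is_hydrogen_bond(donor, hydrogen, acceptor, max_dist=3.5, min_angle=150.0):
    """Geometric test: short donor-acceptor distance and near-linear D-H...A."""
    if math.dist(donor, acceptor) > max_dist:
        return False
    return angle_deg(donor, hydrogen, acceptor) >= min_angle

def count_interchain_hbonds(donors, acceptors, max_dist=3.5, min_angle=150.0):
    """Count hydrogen bonds whose donor and acceptor sit on different chains.

    donors:    iterable of (chain_id, donor_xyz, hydrogen_xyz)
    acceptors: iterable of (chain_id, acceptor_xyz)
    """
    count = 0
    for d_chain, d_xyz, h_xyz in donors:
        for a_chain, a_xyz in acceptors:
            if d_chain == a_chain:
                continue  # skip intramolecular pairs
            if is_hydrogen_bond(d_xyz, h_xyz, a_xyz, max_dist, min_angle):
                count += 1
    return count
```

Averaging this count over the production part of a trajectory gives a single number per solvent that can be compared across ionic liquids. The double loop is quadratic in the number of atoms, so a neighbor search would be needed for full fiber models.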

MD simulations of the dissolution of chitin revealed that chitin molecular chains peeled from the crystal surface, accompanied by the cleavage of intermolecular hydrogen bonds and the adsorption of ionic liquids on the crystal surface (Fig. 7). The experimental dissolution of chitin by AMIMBr was remarkable, and the MD results indicated that Br− contributed to the cleavage of the NH∙∙∙O=C intermolecular hydrogen bonds involving the acetamide groups by forming anion bridges (NH∙∙∙Br−∙∙∙HO), whereas AMIM+ prevented the peeled chains from returning to the crystalline phase. Although the molecular chains of chitin were peeled off by imidazolium acetates (AMIMOAc, BMIMOAc, and EMIMOAc), the peeled chains occasionally returned to the crystalline phase. The chitin in imidazolium acetates formed a hydrogen bond network (NH∙∙∙O=C–O−∙∙∙HO) with the acetate ions (AcO−). In addition, the MD simulations of the chitin crystal model in ionic liquids with different counterions, such as imidazolium chlorides (AMIMCl, BMIMCl, and EMIMCl), which could not dissolve chitin, revealed that the molecular chains were not peeled off.

Fig. 7

Dissolution of the chitin crystalline fibers in 1-allyl-3-methylimidazolium bromide (AMIMBr). Partially modified with permission from [52], Copyright Royal Society of Chemistry 2018

MD trajectory analyses revealed that the experimentally determined solubility and the predicted number of intermolecular hydrogen bonds were strongly correlated (Fig. 8) [51, 52]. In ongoing work, we have integrated high-throughput MD simulation and machine learning to screen for solvents with high cellulose solvation power. The resulting prediction model was used to generate 3000 chemical analogs of imidazolium-based ionic liquids exhibiting improved solvation (manuscript in preparation).

Fig. 8

Relationships between the experimentally determined solubilities and predicted numbers of intermolecular hydrogen bonds between the crystalline fibers of cellulose (red circles) and chitin (blue diamonds) in imidazolium-based ionic liquids
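The strength of a relationship such as the one in Fig. 8 can be expressed with a correlation coefficient. As a minimal sketch, the function below computes the Pearson coefficient between two equally long series, for example the MD-predicted number of intermolecular hydrogen bonds and the measured solubility for a set of ionic liquids. The choice of the Pearson coefficient is an assumption made for illustration; the statistical treatment in [51, 52] is not reproduced here, and no values from those studies are included.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered sums of products and squares
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

In use, `xs` would hold one hydrogen bond count per ionic liquid and `ys` the corresponding experimental solubility, with cellulose and chitin treated as separate data sets.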

Conclusions and future perspective

Atomistic simulations of the crystal models of cellulose have provided insights into molecular chain sheet stability, fiber deformation, and crystal transition, as well as information for deriving novel nanostructures. In particular, we applied DFT calculations based on high-precision quantum mechanics to account for the large deformation of molecular chain sheets; moreover, T-REMD calculations of the formation and self-assembly of the double helices of amylose analog polysaccharides were performed to overcome the inadequate sampling on the time scale of conventional MD calculations. Cellulose and chitin dissolution simulations require highly accurate force field parameters that adequately describe the ionic liquids and polysaccharide molecules. Consequently, to analyze complicated phenomena involving polysaccharide materials with high resolution, we refined the selected models and improved the accuracy of the calculations, thereby ensuring that the simulations mimicked realistic systems.

In the future, we aim to obtain guidelines for material design from atomistic insights into the interfacial interactions of polysaccharides. Furthermore, we aim to pioneer a technological platform to accelerate material discovery and development by solving spatiotemporal-scale problems in atomistic simulations and integrating data-driven science (materials informatics). A systematic development of force field parameters for polysaccharides and solvents (including ionic liquids) is currently underway in our research group. In addition, as mentioned earlier, we are in the process of building a machine learning system for the cellulose solvent search and will work on a similar strategy to predict the structure–property relationships of polysaccharides and their derivatives.