Progress in Chemistry

Abbreviation (ISO4): Prog Chem      Editor in chief: Jincai ZHAO

Review

Machine Learning Helps Probe Sodium Ion Motion Behavior in Carbon-Based Anodes

  • Zihao Yang 1, 2
  • Zhendong Liu 3
  • Quanbing Liu 1, 2, *
  • 1 Guangzhou Key Laboratory of Clean Transportation Energy Chemistry, Guangdong Provincial Key Laboratory of Plant Resources Biorefinery, School of Chemical Engineering and Light Industry, Guangdong University of Technology, Guangzhou 510006, China
  • 2 Jieyang Branch of Chemistry and Chemical Engineering Guangdong Laboratory, Jieyang 515200, China
  • 3 State Key Laboratory of Pulp and Paper Engineering, School of Light Industry and Engineering, South China University of Technology, Guangzhou 510640, China

Received date: 2025-06-20

Revised date: 2025-09-20

Online published: 2025-12-10

Supported by

National Natural Science Foundation of China (22408054)

National Natural Science Foundation of China (22378074)

National Natural Science Foundation of China (22179025)

Guangdong Basic and Applied Basic Research Foundation (2025A1515011939)

Guangdong University Innovation Team Project (2023KCXTD035)

Abstract

The complexity of sodium-storage mechanisms has become a key bottleneck limiting the deployment of high-performance carbon-based anodes in commercial sodium-ion batteries. In hard-carbon anodes, Na-storage involves multiscale, coupled processes that are challenging to characterize. Machine learning (ML) can bridge the experiment-characterization-simulation divide, rapidly uncover nonlinear multivariate relationships and key structure-property descriptors, complement theoretical calculations by mitigating limitations in time/length scales and data scarcity, and enable predictions of capacity plateaus, diffusion kinetics, and cycling stability. Building on a critical synthesis of Na-storage mechanisms in hard carbon, this review distills core ML strategies and representative applications to support interpretable, data-driven design of high-capacity, long-life carbon anodes, highlighting ML-centered approaches for probing alkali-ion behavior. The aim is to provide theoretical guidance and practical design rules for the future design and optimization of carbon-based anode materials.

Contents

1 Introduction

2 The principal challenges facing carbon-based anodes

2.1 Bonding behaviour of alkali metal atoms in various carbon material systems

2.2 Sodium storage behaviour in hard carbon

3 Machine learning in investigating ion transport behaviour in carbon-based anodes

3.1 Common machine learning algorithms

3.2 Data-driven machine learning approaches

3.3 Machine learning reveals intercalation behaviour in carbon materials

4 Conclusion and outlook

Cite this article

Zihao Yang, Zhendong Liu, Quanbing Liu. Machine Learning Helps Probe Sodium Ion Motion Behavior in Carbon-Based Anodes[J]. Progress in Chemistry, 2025, 37(12): 1836-1845. DOI: 10.7536/PC20250613

1 Introduction

New energy technologies, exemplified by high-performance secondary batteries, have garnered widespread attention and importance due to their potential in alleviating energy pressure[1-5]. Among the various technologies, rechargeable sodium-ion batteries (SIBs) are regarded as a promising candidate for large-scale energy storage, owing to their low cost and the abundance of sodium resources in the Earth's crust[6]. Compared with lithium, sodium has a larger ionic radius and a higher ionization potential, making it difficult for sodium ions to be effectively intercalated between graphene layers. As a result, graphite-based anode materials traditionally used in lithium-ion batteries are not suitable for SIBs[7]. Hard carbon (HC) is a carbon material composed of disordered, stacked graphite crystallites. With its high specific capacity, electrochemical stability, and cost advantages, HC is considered a highly promising anode material for SIBs[8]. The curved graphite layer structure and the nanoscale pores within the material endow HC with exceptional sodium-ion storage performance, making it an ideal anode material for achieving high reversible capacity and good cycle life in SIBs. However, key questions remain regarding HC’s pore characteristics, the manner in which sodium ions fill these pores, and how pore-filling behavior is influenced by potential and the microstructure of HC. A deeper understanding of these complex sodium-storage mechanisms is crucial for designing HC microstructures with higher capacity and superior reversibility.
In quantum chemical research, first-principles methods and atom- and physics-based electrochemical modeling have made significant progress. However, there has been a persistent lack of a standardized predictive model that can systematically link the material properties and characteristic mechanisms (including spatial and temporal scales) underlying battery behavior with macroscopic battery performance, which has become a key bottleneck restricting further development[9]. With the rapid advancement of numerical algorithms and data acquisition technologies, machine learning (ML) has become more versatile and efficient, enabling researchers to effectively address the numerous parameter tuning and data processing challenges in battery design[10].
Data-driven approaches to battery research are increasingly being adopted as an essential complement to address the inherent complexity of these systems and accelerate the translation from academic research to industrial applications. This article first analyzes the fundamental reasons why traditional graphite anode materials used for lithium storage are not suitable for sodium/potassium-ion batteries. It then systematically reviews the key factors influencing the microstructure of hard carbons and the long-debated mechanisms of sodium/potassium-ion storage. Finally, it summarizes the basic principles of mainstream machine learning algorithms and their specific applications in elucidating sodium storage mechanisms.

2 Main Challenges Facing Carbon-Based Anodes

In response to the growing scarcity of lithium resources, research into ion batteries that use other alkali metals as charge carriers has emerged as a potential solution. Graphite accommodates various intercalation ions and offers high specific capacity, low working potential, and stable discharge characteristics, so it quickly established itself as one of the most promising anode materials and gained widespread practical application (Fig. 1a). As shown in Fig. 1b, graphite exhibits a typical layered structure, with the most common stacking sequences being ABA or rhombohedral ABC. The covalent bonds within the graphene layers are strong, while the interlayer bonding is weak, with an interlayer spacing of 0.34 nm, which facilitates ion insertion and extraction. However, although graphite demonstrates a relatively high storage capacity for various alkali metals (AM), its sodium storage capacity is significantly lower (Fig. 1c). Early studies indicated that the intercalation of Na+ ions expands the graphite interlayer spacing more than Li+ does, thereby increasing the strain energy in graphite and making the formation of Na–graphite compounds energetically unfavorable[11]. However, Liu et al.[12], based on density functional theory (DFT), found that the formation energy of Na–graphite compounds is the highest among the series of alkali metal–graphite compounds, indicating that Na–graphite compounds are the least thermodynamically stable of the series. Specifically, the formation energy ($E_{\mathrm{f}}$) of alkali metal–graphite compounds follows the trend Na > Li > K > Rb > Cs. Because this trend is not monotonic in ionic radius, strain alone cannot account for the thermodynamic instability of Na–graphite compounds. In the formation process of AM–graphite compounds (Fig. 1d), the alkali metal initially exists as isolated atoms, then forms an intercalation-like structure with graphite, and finally becomes embedded between the strained graphite layers. As shown in Fig. 1e, this process arises from the interplay between trends in ionization energy and the ion–substrate coupling effect: first, the AM undergoes ionization, transferring electrons to the graphite substrate; subsequently, cations are formed and couple with the substrate, thereby reducing the overall energy of the system.

Fig.1 (a) Key timeline for AM intercalation in graphite. (b) Three-dimensional schematic of graphite structures with two distinct stacking sequences. (c) Typical charge-discharge curve of a graphite electrode[13]. (d) Formation process of AM-graphite compounds. (e) Schematic of AM bonding with the matrix[14]

2.1 Binding Behavior of Alkali Metal Atoms in Various Carbon Material Systems

The thermodynamics of alkali metal ion intercalation into hard carbon is explained using free energy decomposition:
$E \approx -\frac{\Delta G_{\mathrm{ins}}}{F} = -\frac{\Delta G_{\mathrm{desolv}} + \Delta G_{\mathrm{bind}} + \Delta G_{\mathrm{strain}}}{F}$
Here, $F$ is the Faraday constant and $\Delta G_{\mathrm{ins}}$ is the total free energy change of ion insertion. $\Delta G_{\mathrm{desolv}}$ is the free energy required for an ion to escape from its solvation shell, and its variation primarily governs the reaction rate and initial energy barrier of alkali metal ion intercalation. $\Delta G_{\mathrm{bind}}$ is the free energy associated with ion binding to the carbon matrix (interlayer sites or defects on pore walls), which determines the magnitude and stability of the plateau voltage; alkali metal ions achieve a low-potential plateau in hard carbon only when the host structure matches the ion. $\Delta G_{\mathrm{strain}}$ reflects the free energy associated with interlayer spacing expansion and structural reorganization, which is related to ionic radius and determines whether the material structure can withstand ion insertion, directly affecting cycling stability and plateau capacity. The intercalation behavior of Li+, Na+, and K+ in hard carbon differs due to variations in ionic radius, solvation energy, and ionization energy. The plateau voltage and intercalation stability are the result of a trade-off among $\Delta G_{\mathrm{desolv}}$, $\Delta G_{\mathrm{bind}}$, and $\Delta G_{\mathrm{strain}}$. Li+ has the smallest radius (≈0.76 Å) and can stably intercalate between graphite layers, resulting in a low-potential plateau. Na+ has a larger radius (≈1.02 Å), making its intercalation thermodynamically unfavorable in graphite, but in hard carbon it benefits from the larger (002) interlayer spacing ($d_{002}$ ≈ 0.37–0.39 nm) and the synergistic effect of closed micropores, enabling a stable low-potential plateau (≈0.08–0.12 V vs. Na/Na+). K+ has an even larger ionic radius (≈1.38 Å), so its storage mechanism in hard carbon differs from that of sodium ions. In hard carbon, K+ typically exhibits a higher intercalation potential (≈0.2–0.3 V vs. K/K+), which is closely related to its larger ionic radius and lower $\Delta G_{\mathrm{desolv}}$. Its larger size may also cause greater deformation of the hard carbon structure, thereby affecting cycling stability. Such analyses provide a solid theoretical foundation for understanding the binding behavior of alkali metal atoms in various substrate systems.
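The free energy decomposition above can be turned into a quick numerical check. The following is a minimal sketch, assuming one electron per inserted ion and free energies given in kJ/mol; the function name `insertion_potential` and all input values are hypothetical illustrations, not data from the cited studies.

```python
FARADAY = 96485.33212  # Faraday constant, C/mol

def insertion_potential(dg_desolv, dg_bind, dg_strain):
    """Return E in volts from E ≈ -ΔG_ins / F, with each ΔG term in kJ/mol.

    Assumes a one-electron process (monovalent alkali ion).
    """
    dg_ins_joule = (dg_desolv + dg_bind + dg_strain) * 1000.0  # kJ/mol -> J/mol
    return -dg_ins_joule / FARADAY

# Hypothetical inputs in kJ/mol, chosen only to land in a low-potential range.
e_example = insertion_potential(dg_desolv=50.0, dg_bind=-70.0, dg_strain=10.0)

# A larger (hypothetical) strain penalty lowers the insertion potential.
e_strained = insertion_potential(dg_desolv=50.0, dg_bind=-70.0, dg_strain=15.0)

print(f"E = {e_example:.3f} V, with extra strain: {e_strained:.3f} V")
```

In this toy setting a net insertion free energy of about -10 kJ/mol corresponds to roughly 0.1 V, which is the order of magnitude of the low-potential plateau discussed in the text.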

2.2 Sodium Storage Behavior of HC

Fortunately, the poor sodium storage capability of graphite is effectively mitigated in non-graphitizable HC materials. In electrochemical energy storage research, hard carbon materials are primarily synthesized from organic compounds or biomass-derived precursors through thermal treatment or chemical conversion processes (Figure 2). During carbonization, various reactions occur concurrently, including dehydrogenation and isomerization[15]. Moreover, because the macromolecular structure of non-graphitizable carbon precursors does not undergo fluidization during thermal treatment, certain structural features are preserved. As a result, some derived hard carbon materials retain the microstructural and morphological characteristics of the original precursors, although their overall packing density remains relatively low. HC is a disordered carbon material composed of randomly oriented graphite layers, with an interlayer spacing of approximately 0.36–0.40 nm. The curved graphite layers and abundant nanopores in its structure together confer advantages such as high specific capacity, good electrochemical stability, and low cost, making it a highly promising anode material for SIBs.

Fig.2 (a) Schematic representation of hard carbon formation as a function of temperature and (b) atomic structural model of hard carbon[16]

Overall, the sodium storage behavior in hard carbon materials mainly comprises: (1) adsorption on surfaces, defect sites, and functional groups; (2) micropore filling; and (3) intercalation into graphitized carbon layers. In addition, the electrochemical charge–discharge curves of conventional hard carbons are generally divided into two regions: a plateau region below 0.1 V and a sloping region above 0.1 V. However, the microstructure of HC is inherently complex and highly diverse, the relationship between interlayer spacing and nanoscale pores remains unclear, and the coupling between capacity magnitude and sodium storage mechanisms is not well understood. As a result, significant uncertainty persists regarding the intercalation mechanism, which hinders further optimization and design of carbon-based anode materials[17-18]. The “insertion–adsorption” model initially proposed for sodium ion storage[19] describes a structure in which aromatic fragments are randomly stacked like a house of cards, forming parallel graphite-like nanodomains and nanoscale pore regions. As research has progressed, this model has evolved into an “adsorption–insertion” mechanism: when HC materials discharge from 0.1 V to 0.001 V, the interlayer spacing expands from 3.96 Å to 4.16 Å, indicating that the plateau region corresponds to an intercalation mechanism[20]. For example, Qiu et al.[21] conducted systematic electrochemical analyses of undoped, heteroatom-free nanostructured materials, combined with in situ/ex situ experimental techniques, and found that in the early stage of sodiation, Na+ first adsorbs onto defect sites in hard carbon, resulting in a sloping voltage curve. Subsequently, Na+ is inserted into the interlayers of graphite crystallites with appropriate spacings, forming $\mathrm{NaC}_x$ compounds, analogous to lithium ion insertion in graphite, thereby exhibiting a typical low-voltage plateau feature. 
When the interlayer spacing exceeds 0.4 nm, such ultra-large interlayer-spaced graphitized carbon layers, together with conventional “defects” (such as pores, edges, and heteroatoms), can contribute to the capacity in the sloping region above 0.1 V. At an interlayer spacing of 0.36–0.40 nm, pseudo-graphitized carbon stores sodium via an “interlayer insertion” mechanism, and the resulting NaC8 can provide a theoretical plateau capacity of 279 mAh·g⁻¹. Carbon materials with an interlayer spacing of less than 0.36 nm, resembling graphitized carbon, are unable to store sodium. Subsequently, studies have proposed a three-stage synergistic sodium storage mechanism of “adsorption–insertion–filling”[22]: throughout the sodiation process, no insertion of Na+ between carbon layers occurs, and the sloping capacity can be attributed to the adsorption of Na+ on defect sites (>1 V) and on disordered, isolated graphene sheets (1–0.1 V), while the plateau capacity arises from mesopore filling (<0.1 V). Even when the battery discharges to 0 V, the $d_{002}$ peak does not shift. The continuous evolution of these mechanisms provides a new theoretical basis and research perspective for the structural regulation and design of carbon-based materials (Figure 3). However, carbon materials from different sources generally exhibit structural heterogeneity and widespread defect distributions, and experimental conditions and the resolution limits of characterization techniques constrain what can be observed, so certain sodium storage mechanisms remain insufficiently revealed and confirmed. At the same time, traditional theoretical computational methods are limited by temporal and spatial scales, making it difficult to perform comprehensive and effective modeling and validation of the aforementioned mechanisms. The rise of ML methods, however, offers new research pathways and tools for overcoming the limitations of traditional computational approaches and exploring complex sodium storage behaviors.

Fig.3 The evolving energy storage mechanism
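The 279 mAh·g⁻¹ figure quoted for NaC8 and the interlayer-spacing regimes of the adsorption–insertion picture can be reproduced with a few lines. This is a minimal sketch: the capacity uses the standard relation Q = nF/(3.6·M) normalized to the mass of the carbon host, and the function names and the exact handling of the 0.36 nm and 0.40 nm boundaries are assumptions made for illustration.

```python
FARADAY = 96485.33212  # Faraday constant, C/mol
M_CARBON = 12.011      # molar mass of carbon, g/mol

def theoretical_capacity(carbons_per_na, n_electrons=1):
    """Specific capacity in mAh/g of carbon host for NaC_x: Q = nF / (3.6 * M)."""
    host_molar_mass = carbons_per_na * M_CARBON  # g of carbon per mol of Na
    return n_electrons * FARADAY / (3.6 * host_molar_mass)

def storage_regime(d002_nm):
    """Map an interlayer spacing (nm) to the regime described in the text."""
    if d002_nm < 0.36:
        return "no sodium storage (graphite-like)"
    if d002_nm <= 0.40:
        return "interlayer insertion (plateau region)"
    return "adsorption-like storage (sloping region)"

nac8_capacity = theoretical_capacity(carbons_per_na=8)
print(f"NaC8: {nac8_capacity:.1f} mAh/g")
for spacing in (0.335, 0.38, 0.42):
    print(spacing, "nm ->", storage_regime(spacing))
```

The factor 3.6 converts coulombs per gram to mAh per gram, so the result depends only on the assumed stoichiometry.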

3 Application of Machine Learning in Studying Ion Transport Behavior in Carbon-Based Anodes

A battery is a highly complex materials system whose operation relies on the coordinated interplay of charge and ion transport across multiphase interfaces, reversible and irreversible chemical reactions, and numerous intrinsic material properties. Battery performance is influenced by a wide range of factors, including electrode materials, electrolytes, interfacial properties, microstructure, current collectors, separators, binders, battery or battery-pack design, environmental conditions, and operating parameters[23-24]. Currently, DFT calculations and molecular dynamics (MD) simulations are widely used in battery research, providing researchers with energetic, mechanical, and other relevant information at relatively low cost. However, these methods are typically limited to specific systems and struggle to meet the demands for comprehensive, multiscale, multiphysics property descriptions, thereby limiting the depth and breadth of materials research to some extent. With the rapid advancement of artificial intelligence technologies, materials research is gradually entering a new paradigm of data-driven scientific computing[25]. ML is an algorithmic framework that enables AI systems to learn from experience. Its core idea lies in combining theoretical knowledge from traditional computational methods with data-driven models, thereby efficiently uncovering the complex relationships and underlying mechanisms embedded in materials data[26-28]. As shown in Figure 4, ML is applied in battery research primarily through three pathways: experimental approaches, theoretical models, and data tools[29]. This new research paradigm helps to re-examine the vast amounts of "redundant" data generated in traditional computational processes, identifying potentially useful information and hidden patterns that are difficult to observe directly. 
For example, with the support of DFT calculations, ML can efficiently screen potential candidate materials[30], rapidly predict their energy performance based on existing data[31], and further apply these insights to predict key performance metrics such as battery cycle life[32]. At present, ML technology is being applied across different temporal and spatial scales in battery research, helping researchers develop more advanced theoretical frameworks and experimental toolsets.

Fig.4 Three key tools and their respective scales for ML applications in the battery field[29]

3.1 Commonly Used ML Algorithms

In materials science, the key to applying ML lies in constructing an appropriate model, which relies on a dataset capable of accurately characterizing the relationship between features (such as binding energy or physicochemical properties) and material performance. An appropriate ML algorithm is then selected and trained on this dataset, and the model’s predictive capability is typically evaluated by comparing its output with the true values of data not used in training. Traditional materials computation methods usually focus solely on obtaining numerical values, while neglecting the potentially important information and underlying relationships hidden within vast but underexplored “redundant data,” which researchers find difficult to uncover based on experience alone. By contrast, ML algorithms offer a more efficient cognitive pathway, enabling optimized solutions to complex problems through the construction of suitable models. As shown in Figure 5, common ML algorithms can be broadly categorized into three types: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning can be further subdivided into regression and classification. Regression is used to analyze the relationship between continuous variables in a dataset and target properties, while classification assigns samples to different categories based on relationships among the data. The hallmark of unsupervised learning is that no labeled output variables are predefined in the dataset; its primary task is to identify latent structures, key variables, or data clustering patterns within the dataset. As traditional computational methods continue to advance, the scale of generated data has grown exponentially, often accompanied by issues such as complex structures and information clutter, which has increasingly driven the widespread application of unsupervised ML algorithms in materials data processing. Unlike the two aforementioned approaches, reinforcement learning continuously acquires environmental feedback to dynamically optimize model parameters, exhibiting strong adaptive capabilities and gradually emerging as a key research focus in intelligent materials manufacturing and complex systems modeling.

Fig.5 Commonly used machine learning algorithms in the field of battery materials
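The supervised-learning workflow described in this section (train on one part of the data, then compare predictions with true values not used in training) can be sketched with a one-descriptor regression. Everything here is illustrative: the data are synthetic, the descriptor and property are hypothetical, and ordinary least squares stands in for whichever regression algorithm a real study would choose.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

rng = random.Random(0)

# Synthetic data: a hypothetical descriptor x and a property y = 2x + 1 + noise.
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]

# Hold out 10 of the 50 samples; they are never seen during fitting.
indices = list(range(len(xs)))
rng.shuffle(indices)
train_idx, test_idx = indices[:40], indices[40:]

slope, intercept = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
predictions = [slope * xs[i] + intercept for i in test_idx]
test_rmse = rmse([ys[i] for i in test_idx], predictions)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, held-out RMSE={test_rmse:.3f}")
```

The held-out error, not the training error, is the quantity that indicates how well the model generalizes to unseen materials.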

3.2 Data-Driven ML Methods

Due to the complex microstructure of carbon-based anode materials, current research still lacks a systematic analysis and elucidation of the relationship between hard carbon structure and its sodium storage performance. Data-driven ML methods are gradually emerging as crucial tools for analyzing experimental and theoretical data, and for uncovering key relationships between the structure and performance of battery materials. As machine learning applications in electrode material research deepen, predictions are increasingly reported with confidence intervals and calibration curves, which assess the consistency between model outputs and the true distribution. Predicted values from neural network, random forest, binary logistic regression, and other models are employed as key metrics. Meanwhile, by integrating methods such as Monte Carlo sampling to account for experimental characterization errors (e.g., deviations in interlayer spacing measurements, uncertainties in pore volume fractions), uncertainty propagation analysis can be performed, helping to understand how input perturbations are transmitted to predictions of capacity and cycling stability. Through SHapley Additive exPlanations (SHAP) analysis and permutation importance (a feature importance method), it is possible to quantitatively identify the structural parameters that contribute most significantly to model outputs. For example, Liu et al.[33] used ML methods to jointly analyze key structural parameters with thermodynamic and kinetic properties, systematically summarized and compared the structure and performance of different types of carbon materials, and evaluated the key factors influencing sodium storage performance (Figure 6). They also constructed performance prediction models based on key structural parameters to explore the pathways through which structural regulation affects sodium storage performance. The results indicate that the sodium storage mechanism of the target carbon material is dominated by intercalation behavior, supplemented by a certain degree of adsorption and pore-filling behavior. A moderate degree of crystallinity helps achieve high specific capacity and a low voltage plateau; meanwhile, a medium level of defect content facilitates rapid (de)intercalation of sodium ions and maintains the structural stability of the material during cycling. For model validation, k-fold cross-validation (for example, repeated stratified k-fold cross-validation with k = 10) is employed, and validation is conducted using independent external datasets covering different precursors and carbonization process conditions. In addition, the F1-score (the harmonic mean of precision and recall) is introduced to provide a comprehensive evaluation of classifier performance (with a range from 0 to 1, where values closer to 1 indicate better performance). The Receiver Operating Characteristic–Area Under Curve (ROC-AUC) summarizes the trade-off between sensitivity and specificity across all decision thresholds applied to a continuous model score; the AUC value ranges from 0 to 1, where 0.5 corresponds to random guessing and higher values indicate better model performance. By applying the two types of classification models separately to distinguish high-capacity from low-capacity samples, the model’s performance can be assessed more comprehensively, avoiding evaluation bias arising from class imbalance. Experimental results indicate that regulating structural parameters such as carbon layer size, interlayer spacing, and defect distribution is a promising way to optimize and enhance the sodium storage performance of carbon materials. This study demonstrates the great potential of ML in battery material research, providing a practical and viable research approach for the development of high-performance carbon materials, and also indicates that the combination of experimental databases and ML methods will play an important role in advancing scientific research and industrializing new energy materials.
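Monte Carlo uncertainty propagation and permutation importance, both mentioned above, can be illustrated on a deliberately simple surrogate. This is a sketch under stated assumptions: `toy_capacity_model` is a hypothetical linear function invented for the example (not a fitted model from any cited work), the measurement errors are assumed Gaussian and independent, and SHAP is not shown because it requires an external library.

```python
import random
import statistics

def toy_capacity_model(d002_nm, pore_fraction):
    """Hypothetical surrogate (NOT a fitted model): capacity in mAh/g."""
    return 100.0 + 2000.0 * (d002_nm - 0.34) + 300.0 * pore_fraction

def monte_carlo_propagate(model, means, sigmas, n_samples=5000, seed=0):
    """Sample inputs from independent Gaussians and return (mean, std) of the output."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        sample = [rng.gauss(m, s) for m, s in zip(means, sigmas)]
        outputs.append(model(*sample))
    return statistics.fmean(outputs), statistics.stdev(outputs)

def permutation_importance(model, rows, targets, column, seed=0):
    """Increase in mean squared error after shuffling one feature column."""
    rng = random.Random(seed)

    def mse(data):
        return statistics.fmean([(model(*r) - t) ** 2 for r, t in zip(data, targets)])

    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:column] + (v,) + r[column + 1:] for r, v in zip(rows, shuffled)]
    return mse(permuted) - mse(rows)

# Hypothetical measurement: d002 = 0.380 +/- 0.005 nm, pore fraction = 0.20 +/- 0.02.
mean_cap, std_cap = monte_carlo_propagate(
    toy_capacity_model, means=(0.38, 0.20), sigmas=(0.005, 0.02)
)
print(f"capacity = {mean_cap:.1f} +/- {std_cap:.1f} mAh/g")

# Synthetic dataset whose targets come from the same toy model.
data_rng = random.Random(1)
rows = [(data_rng.uniform(0.36, 0.40), data_rng.uniform(0.10, 0.30)) for _ in range(200)]
targets = [toy_capacity_model(*r) for r in rows]
imp_d002 = permutation_importance(toy_capacity_model, rows, targets, column=0)
imp_pore = permutation_importance(toy_capacity_model, rows, targets, column=1)
print(f"importance: d002={imp_d002:.0f}, pore fraction={imp_pore:.0f}")
```

For a linear surrogate the propagated standard deviation can be checked analytically, which makes this toy case a convenient sanity test before applying the same procedure to a trained nonlinear model.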

Fig.6 Prediction of structural and performance characteristics for different carbon materials using ML algorithms: (a, b) Summary and comparison of structural and performance characteristics for different carbon materials; (c~e) machine learning performance predictions for first-cycle coulombic efficiency, capacity, and rate factor; (f~h) final prediction performance based on 20 000 sets of structural data[33]
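The validation tools named in this section (stratified k-fold splitting, F1-score, and ROC-AUC) are small enough to write out directly. The sketch below is a plain-Python illustration with made-up labels and scores; it performs a single stratified split rather than the repeated k = 10 scheme mentioned in the text, and the function names are chosen for this example only.

```python
import random

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs that preserve the class ratio in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % k].append(index)  # deal each class round-robin
    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train_idx, test_idx

# Made-up labels (1 = high capacity, 0 = low capacity), predictions, and scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"F1 = {f1:.3f}, ROC-AUC = {auc:.3f}")

# Imbalanced toy label set: 10 positives, 20 negatives, split into 5 folds.
labels = [1] * 10 + [0] * 20
splits = list(stratified_k_fold(labels, k=5))
```

Stratification keeps the share of high-capacity samples the same in every fold, which is what protects the reported metrics against the class-imbalance bias discussed above.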

3.3 ML Reveals Intercalation Behavior in Carbon Materials

Due to the complex spatial distribution and diverse composition of the internal structure of graphite anodes, the insertion process of AM ions within them is also diverse and complex. To reveal the chemical nature of sodium ion aggregation, Li et al.[34] conducted a systematic study on the aggregation behavior of Na+ in graphite anodes. First, they used DFT simulations to model the entire process of Na+ gradually inserting into graphite and forming dynamic sodium clusters. Subsequently, they applied ML methods to perform an in-depth analysis of the energy data generated during this formation process (Fig. 7a–c). The study found that the clustering behavior of sodium ions is closely related to the different interactions they exhibit in carbon materials, and the analytical results are highly consistent with previous experimental observations. The combined use of ML and DFT not only enables in-depth modeling of the intercalation behavior of AM atoms in carbon materials but also establishes a direct correspondence with experimental results. The synergy of these two approaches provides new insights for DFT-based research, effectively advancing the understanding of Na+ storage mechanisms and demonstrating the potential to become a general research strategy. By leveraging data-driven methods, this approach reveals the intrinsic relationship between sodium storage performance and material structural parameters, providing theoretical support for constructing a universal structure–performance mapping for carbon materials. This, in turn, facilitates the design optimization and performance enhancement of disordered carbon anode materials in battery technologies.

Fig.7 ML combined with DFT to jointly investigate sodium storage mechanisms: (a) Computed discharge potential curve composed of discrete potential points, each corresponding to a specific number of sodium atoms within the model carbon structure (Na44C200 shown as an example). The model comprises 200 atoms per unit cell with a mass density of 1.153 g·cm⁻³. (b) Kernel density estimates of Löwdin charges (calculated via LOBSTER), plotted as smoothed histograms, as the number of sodium atoms increases from Na4C200 to Na44C200. (c) Analysis of the local environment of stored sodium in SC using the smooth overlap of atomic positions (SOAP) kernel, originally employed for structural similarity calculations in Gaussian approximation potential fitting. The distribution map was obtained via multidimensional scaling based on structural distances (representing similarity/dissimilarity), with the most similar points grouped under similar colours[34]
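A discharge potential curve built from discrete points, as in Fig. 7a, is commonly obtained from DFT total energies with the standard average-voltage expression V = -[E(Na_x2 C) - E(Na_x1 C) - (x2 - x1)·E(Na)] / (x2 - x1), with energies in eV. The source does not state which expression was used, so the sketch below is an assumption, and every energy value in it is a hypothetical placeholder rather than a result from the cited study.

```python
def average_voltage(e_start, e_end, n_na_start, n_na_end, e_na_bulk):
    """Average potential (V vs. Na/Na+) between two sodiation states.

    Energies are total energies in eV; one electron is transferred per Na atom,
    so the energy change per added Na in eV equals the voltage in volts.
    """
    delta_n = n_na_end - n_na_start
    if delta_n <= 0:
        raise ValueError("n_na_end must exceed n_na_start")
    return -(e_end - e_start - delta_n * e_na_bulk) / delta_n

# Hypothetical total energies (eV) for a C200 model cell holding n Na atoms.
energies = {4: -1850.00, 8: -1857.60, 12: -1863.90}
E_NA_BULK = -1.30  # hypothetical energy per atom of bulk Na metal, eV

counts = sorted(energies)
voltage_steps = [
    average_voltage(energies[a], energies[b], a, b, E_NA_BULK)
    for a, b in zip(counts, counts[1:])
]
for (a, b), v in zip(zip(counts, counts[1:]), voltage_steps):
    print(f"Na{a}C200 -> Na{b}C200: {v:.3f} V")
```

Each pair of neighboring compositions gives one plateau segment, which is why the computed curve is a staircase of discrete potential points rather than a smooth line.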

Chen et al.[35] constructed a hard carbon structure–performance database comprising 503 data sets, based on previously reported experimental data. The study employed various machine learning models, including random forests, gradient boosting decision trees (GBDT), and extreme gradient boosting (XGB), combined with cross-validation and data augmentation techniques, to predict and analyze the discharge performance of hard carbon materials. By comparing and fitting multiple data sets, the aim was to more accurately quantify the cycling stability of hard carbon materials. Through feature importance analysis, key structural features influencing discharge performance were effectively identified, allowing for adjustments to hard carbon anode parameters. Violin plots and scatter plots were used to visualize the datasets (Figure 8a), and, in combination with the ML models, the conditions under which hard carbon materials achieve optimal rate performance in SIBs during discharge were identified. The optimal structural parameters for SIBs to attain the best electrochemical performance during discharge, such as capacity and cycle retention, are derived from the CatBoost model (Figures 8b and c). The CatBoost model showed that the structural parameters associated with high rate performance partially overlap with those associated with high capacity, suggesting that high reversible capacity can be achieved together with high rate performance; however, the underlying theoretical connections between these parameters still require further exploration. Chen et al.[36] used small-angle X-ray scattering (SAXS) to reveal the fractal dimension (D), which was identified as a descriptor of disorder in hard carbon structures under data-driven approaches and was correlated with electrochemical and structural characteristics. As D increases, the pseudo-graphite domains decrease, and the carbon layers become more curved, leading to an increase in closed pores. With the increase in D, the diffusion coefficient decreases, resulting in a lower slope capacity percentage in SIBs; the descriptor thus narrows the gap between structural characterization and actual performance. It was demonstrated that, above 0.1 V vs. Na+/Na, sodium ion transport is primarily controlled by surface-driven processes, such as the adsorption of Na+ at edge sites and structural defects. These sites provide fast and stable channels for ion migration, thereby enhancing rate performance in the sloping voltage region. Conversely, in the low-voltage plateau region, reversible capacity and charge-transfer kinetics are influenced by pore filling and interlayer insertion, both of which are diffusion-controlled processes.

Fig. 8 (a) Input dataset (violin and dot plots); CatBoost model: effect of two-factor interaction pairs on (b) rate performance and (c) capacity. Actual values correspond to the two horizontal axes, while interaction magnitude corresponds to the vertical axis [35]
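
The general workflow described for the hard carbon database (tree-ensemble regression, cross-validation, and feature importance ranking) can be sketched as follows. This is an illustrative outline only, not the pipeline of ref. [35]: it uses scikit-learn's random forest and gradient boosting regressors as stand-ins, omits XGBoost, CatBoost, and data augmentation, and runs on synthetic data. The descriptor names (`d002_nm`, `ID_IG`, `SSA_m2_g`, `pyrolysis_T_C`), their ranges, and the target relation are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)

# Hypothetical structural descriptors; names and ranges are illustrative only.
feature_names = ["d002_nm", "ID_IG", "SSA_m2_g", "pyrolysis_T_C"]
n_samples = 200
X = np.column_stack([
    rng.uniform(0.36, 0.41, n_samples),     # interlayer spacing
    rng.uniform(0.8, 2.0, n_samples),       # Raman ID/IG ratio
    rng.uniform(1.0, 500.0, n_samples),     # specific surface area
    rng.uniform(800.0, 1600.0, n_samples),  # carbonization temperature
])
# Synthetic target: an invented relation plus noise, NOT real capacity data.
y = 250 + 1500 * (X[:, 0] - 0.36) - 0.1 * X[:, 2] + rng.normal(0, 5, n_samples)

models = {
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "gbdt": GradientBoostingRegressor(random_state=0),
}
cv = KFold(n_splits=5, shuffle=True, random_state=0)

cv_r2 = {}
importances = {}
for name, model in models.items():
    # Mean R2 over five held-out folds estimates out-of-sample accuracy.
    scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
    cv_r2[name] = scores.mean()
    # Refit on all data to read the impurity-based feature importances.
    model.fit(X, y)
    importances[name] = dict(zip(feature_names, model.feature_importances_))

for name in models:
    ranked = sorted(importances[name].items(), key=lambda kv: kv[1], reverse=True)
    print(f"{name}: mean CV R2 = {cv_r2[name]:.3f}; top feature = {ranked[0][0]}")
```

Impurity-based importances are used here because they come directly from the fitted ensembles; on a real database with correlated descriptors, a permutation-based or SHAP-style analysis may give a more reliable ranking.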

4 Conclusion and Outlook

The diverse microstructures of hard carbon materials give rise to complex AM storage mechanisms. However, traditional experimental and theoretical approaches have limitations in elucidating these storage mechanisms, which in turn constrain the structural design and precise synthesis of carbon-based materials. The rise of data-driven materials science is profoundly transforming the conventional research paradigm. Thanks to the flexibility and scalability of ML methods, their integration with traditional experimental and computational simulation techniques has given rise to more innovative research concepts. The data-driven battery research paradigm effectively addresses researchers’ ongoing needs for material innovation, performance prediction, and structural optimization. By using ML methods to screen key structural parameters that influence energy storage performance, this approach can guide the design and development of novel carbon-based anode materials, thereby achieving superior sodium storage performance. It is gradually becoming an important bridge and a powerful complementary tool that connects fundamental laboratory research with practical industrial applications.
[1]
Dunn B, Kamath H, Tarascon J M. Science, 2011, 334(6058): 928.

[2]
Larcher D, Tarascon J M. Nat. Chem., 2015, 7(1): 19.

[3]
Choi C, Ashby D S, Butts D M, DeBlock R H, Wei Q L, Lau J, Dunn B. Nat. Rev. Mater., 2020, 5(1): 5.

[4]
Sheng H, Zhou J, Li B, He Y, Zhang X, Liang J, Zhou J, Su Q, Xie E, Lan W, Wang K, Yu C. Sci. Adv., 2021, 7(2): eabe3097.

[5]
Gür T M. Energy Environ. Sci., 2018, 11(10): 2696.

[6]
Pan H L, Hu Y S, Chen L Q. Energy Environ. Sci., 2013, 6(8): 2338.

[7]
Saurel D, Segalini J, Jauregui M, Pendashteh A, Daffos B, Simon P, Casas-Cabanas M. Energy Storage Mater., 2019, 21: 162.

[8]
Xiao B W, Rojo T, Li X L. ChemSusChem, 2019, 12(1): 133.

[9]
Ramadesigan V, Northrop P W C, De S, Santhanagopalan S, Braatz R D, Subramanian V R. J. Electrochem. Soc., 2012, 159(3): R31.

[10]
Mistry A, Franco A A, Cooper S J, Roberts S A, Viswanathan V. ACS Energy Lett., 2021: 1422.

[11]
Nobuhara K, Nakayama H, Nose M, Nakanishi S, Iba H. J. Power Sources, 2013, 243: 585.

[12]
Alvin S, Cahyadi H S, Hwang J, Chang W, Kwak S K, Kim J. Adv. Energy Mater., 2020, 10(20): 2000283.

[13]
Li Y Q, Lu Y X, Adelhelm P, Titirici M M, Hu Y S. Chem. Soc. Rev., 2019, 48(17): 4655.

[14]
Liu Y Y, Merinov B V, Goddard W A III. Proc. Natl. Acad. Sci. U. S. A., 2016, 113(14): 3735.

[15]
Cao Y L, Xiao L F, Sushko M L, Wang W, Schwenzer B, Xiao J, Nie Z M, Saraf L V, Yang Z G, Liu J. Nano Lett., 2012, 12(7): 3783.

[16]
Dou X W, Hasa I, Saurel D, Vaalma C, Wu L M, Buchholz D, Bresser D, Komaba S, Passerini S. Mater. Today, 2019, 23: 87.

[17]
Sun N, Guan Z, Liu Y W, Cao Y L, Zhu Q Z, Liu H, Wang Z X, Zhang P, Xu B. Adv. Energy Mater., 2019, 9(32): 1970125.

[18]
Novoselov K S, Fal’ko V I, Colombo L, Gellert P R, Schwab M G, Kim K. Nature, 2012, 490(7419): 192.

[19]
Stevens D A, Dahn J R. J. Electrochem. Soc., 2000, 147(4): 1271.

[20]
Bommier C, Surta T W, Dolgos M, Ji X L. Nano Lett., 2015, 15(9): 5888.

[21]
Qiu S, Xiao L F, Sushko M L, Han K S, Shao Y Y, Yan M Y, Liang X M, Mai L Q, Feng J W, Cao Y L, Ai X P, Yang H X, Liu J. Adv. Energy Mater., 2017, 7(17): 1700403.

[22]
Lu Y, Liang J N, Hu Y Z, Liu Y, Chen K, Deng S F, Wang D L. Adv. Energy Mater., 2020, 10(7): 1903312.

[23]
Liu K L, Ashwin T R, Hu X S, Lucu M, Widanage W D. Renew. Sustain. Energy Rev., 2020, 131: 110017.

[24]
Wang T, Pan R T, Martins M L, Cui J L, Huang Z N, Thapaliya B P, Do-Thanh C L, Zhou M S, Fan J T, Yang Z Z, Chi M F, Kobayashi T, Wu J Z, Mamontov E, Dai S. Nat. Commun., 2023, 14: 4607.

[25]
Lombardo T, Duquesnoy M, El-Bouysidy H, Årén F, Gallo-Bueno A, Jørgensen P B, Bhowmik A, Demortière A, Ayerbe E, Alcaide F, Reynaud M, Carrasco J, Grimaud A, Zhang C, Vegge T, Johansson P, Franco A A. Chem. Rev., 2022, 122(12): 10899.

[26]
Juan Y F, Dai Y B, Yang Y, Zhang J. J. Mater. Sci. Technol., 2021, 79: 178.

[27]
Yao N, Chen X, Fu Z H, Zhang Q. Chem. Rev., 2022, 122(12): 10970.

[28]
Oral B, Tekin B, Eroglu D, Yildirim R. J. Power Sources, 2022, 549: 232126.

[29]
Chen X, Liu X Y, Shen X, Zhang Q. Angew. Chem. Int. Ed., 2021, 60(46): 24354.

[30]
Zhang H K, Wang Z L, Cai J F, Wu S C, Li J J. ACS Appl. Mater. Interfaces, 2021, 13(45): 53388.

[31]
Zhang H K, Wang Z L, Ren J H, Liu J Y, Li J J. Energy Storage Mater., 2021, 35: 88.

[32]
Severson K A, Attia P M, Jin N, Perkins N, Jiang B B, Yang Z, Chen M H, Aykol M, Herring P K, Fraggedakis D, Bazant M Z, Harris S J, Chueh W C, Braatz R D. Nat. Energy, 2019, 4(5): 383.

[33]
Liu X X, Wang T, Ji T Y, Wang H, Liu H, Li J Q, Chao D L. J. Mater. Chem. A, 2022, 10(14): 8031.

[34]
Li Q, Liu X, Tao Y, Huang J, Zhang J, Yang C, Zhang Y, Zhang S, Jia Y, Lin Q, Xiang Y, Cheng J, Lv W, Kang F, Yang Y, Yang Q H. Nat. Sci. Rev., 2022, 9(8): nwac084.

[35]
Qi T S, Zhang X, Xiong K, Yang H P, Zhang S H, Chen H P. J. Mater. Chem. A, 2025, 13(23): 17748.

[36]
Hou W Y, Yi Z L, Yu H T, Jia W R, Dai L Q, Yang J J, Chen J P, Xie L J, Su F Y, Chen C M. Chin. Chem. Lett., 2025, 111124.
