Can Artificial Intelligence Boost Developing Electrocatalysts for Efficient Water Splitting to Produce Green Hydrogen?

Jaehyun Kim; Ho Won Jang

doi:10.3740/MRSK.2023.33.5.175

Review

Korean Journal of Materials Research. 27 May 2023. 175-188
https://doi.org/10.3740/MRSK.2023.33.5.175

Jaehyun Kim¹

Ho Won Jang¹²^†

¹Department of Materials Science and Engineering, Research Institute of Advanced Materials, Seoul National University, Seoul 08826, Republic of Korea

²Advanced Institute of Convergence Technology, Seoul National University, Suwon 16229, Republic of Korea

^†Corresponding author E-Mail : hwjang@snu.ac.kr (H. W. Jang, Seoul Nat’l Univ.)

License (open-access, http://creativecommons.org/licenses/by-nc/3.0/):

This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Water electrolysis holds great potential as a method for producing renewable hydrogen fuel at large scale and for replacing the fossil fuels responsible for greenhouse gas emissions and global climate change. To reduce the cost of hydrogen and make it competitive against fossil fuels, the efficiency of green hydrogen production should be maximized. This requires superior electrocatalysts to reduce the reaction energy barriers. The development of catalytic materials has mostly relied on empirical, trial-and-error methods because of the complicated, multidimensional, and dynamic nature of catalysis, requiring significant time and effort to find optimized multicomponent catalysts under a variety of reaction conditions. The ultimate goal for all researchers in the materials science and engineering field is the rational and efficient design of materials with desired performance. Discovering and understanding new catalysts with desired properties is at the heart of materials science research. This process can benefit from machine learning (ML), given the complex nature of catalytic reactions and the vast range of candidate materials. This review summarizes recent achievements in catalyst discovery for the hydrogen evolution reaction (HER) and oxygen evolution reaction (OER). The basic concepts of ML algorithms and practical guides for materials scientists are also presented. The challenges and strategies of applying ML are discussed, which should be collaboratively addressed by the materials science and ML communities. The ultimate integration of ML in catalyst development is expected to accelerate the design, discovery, optimization, and interpretation of superior electrocatalysts, to realize a carbon-free ecosystem based on green hydrogen.

Keywords

green hydrogen

machine learning

electrocatalysts

water splitting

water electrolysis

MAIN

1. Introduction
2. Basic ML Algorithms
3. Electrocatalysts Discovery with ML
3.1. HER and OER
3.2. HER catalysts discovery with ML
3.3. OER catalysts discovery with ML
4. Opportunities and Prospect

1. Introduction

Fossil fuel consumption and the resulting greenhouse gas emissions, driven by rising energy demand around the world, have been endangering the Earth’s ecosystem.¹⁾ To satisfy rising energy needs while achieving carbon neutrality, much attention has been focused on developing sustainable energy technology. The production of renewable energy by electrochemically catalyzed processes, which transform everyday substances such as water, carbon dioxide, and nitrogen into energy carriers, is a promising technique.²⁾ Among these carriers, hydrogen has received great attention as an alternative to fossil fuels because of its high energy content compared to that of gasoline. In addition, the combustion of hydrogen emits water vapor instead of greenhouse gases, which is expected to mitigate environmental problems and help address global climate change.

However, more than 90 % of hydrogen is currently produced by reforming fossil fuels. This so-called grey hydrogen releases a large amount of carbon dioxide during its production. Although burning grey hydrogen produces no greenhouse gases, its production emissions make it unsuitable for overall carbon neutrality. As a sustainable alternative, water electrolysis technology has been developed to produce hydrogen without emitting greenhouse gases; the product is called green hydrogen.³⁾ Electrolysis uses electricity to separate water into hydrogen and oxygen, and an ideal carbon-free energy ecosystem can be realized if this electricity comes from nuclear or renewable sources.

Electrochemical reactions can be divided into two half reactions occurring at the cathode and anode, respectively. When producing green hydrogen through water electrolysis, the hydrogen evolution reaction (HER) and the oxygen evolution reaction (OER) are the two half reactions.⁴⁾ When enough electrical power is supplied to an electrolyzer, a device with two electrodes, water molecules split into hydrogen and oxygen gas. Designing excellent electrocatalysts with minimal overpotential is essential for effective and environmentally friendly electrochemical hydrogen production, and it maximizes the conversion efficiency of renewable energy sources.

The development of catalytic materials has relied mostly on empirical, trial-and-error methods because of the complicated, multidimensional, and dynamic nature of catalysis, so finding optimized multicomponent catalysts under a variety of reaction conditions requires significant time and effort. The rational and efficient design of materials with desired performance is the ultimate goal for all researchers in the materials science and engineering field. The classic trial-and-error method, still the most common tactic, inevitably suffers from high time and financial costs. To remain cost effective, the searched materials space has to be constrained on the basis of the researchers’ knowledge, which has restricted thorough exploration of undiscovered candidate materials.⁵⁾

Recently, ML has emerged as a game-changer that could shift the paradigm from the conventional trial-and-error Edisonian approach to data-based predictions.⁶⁾ ML is commonly used to predict target properties of a material based on its chemical composition, crystal structure, and basic materials properties [Fig. 1(a)]. For catalyst design, the target properties could be the binding energy of an adsorbate at an intermediate reaction step, or the conductivity.

Collaborations between machine learning (ML) and catalyst development are rapidly increasing and expanding in scope. Historically, the design of materials was based on intuition and experiment, which led to long commercialization timelines. This sluggish progress was the main obstacle to early-stage research investment and delayed solutions to important energy and environmental challenges. As computational chemistry has flourished, comprehensive materials databases can be generated in a shorter time frame than with experimental approaches, and these databases have become the workhorse for applying ML in materials science. For this reason, computationally calculated materials databases are steadily being developed with efficient data structures that help researchers transform the data into actionable insights [Fig. 1(b)].

This review first presents the fundamental ML algorithms and then briefly describes the two half reactions, HER and OER, to highlight the importance of active electrocatalysts. Recent achievements using ML models to discover electrocatalysts for HER and OER by estimating the binding energy of adsorbates for theoretical overpotential prediction are introduced in detail. Challenges in applying ML to catalyst discovery, such as data scarcity and data uncertainty, and current strategies to overcome these challenges are also covered. We believe that this review will provide readers with comprehensive and timely information, and that they will benefit greatly from applying ML techniques to their own catalyst discovery.

https://cdn.apub.kr/journalsite/sites/mrsk/2023-033-05/N0340330501/images/mrsk_2023_335_175_F1.jpg

Fig. 1

(a) Schematic of materials science with ML, in which useful knowledge such as materials properties is predicted from a given database and ML model. (b) Recent trend of the materials development paradigm in materials science research with ML. Reproduced with permission from Ref. (6). Copyright 2018 American Chemical Society.

2. Basic ML Algorithms

The majority of materials development in the past relied on trial and error. As time went on, scientific theories were methodically established in numerous disciplines, making it possible to design materials by theoretical calculation. However, the calculations were very difficult and time-consuming. Later, thanks to the invention of computers, the reactions and properties of materials could be predicted without direct experimentation. In recent years, advances in information and communication technology (ICT) have led to the emergence of big data. In light of this, a new paradigm of data-based research has been proposed.

It is crucial to extract and process useful information from material data in order to apply big data to actual research and create novel materials.⁶⁾ ML is one of the most effective techniques for this purpose. In contrast to conventional programming, ML can create a program from input values and output values (supervised learning). When little or no output information is available, a model can be built from the input values alone (unsupervised learning). Additionally, there is semi-supervised learning, which combines the two approaches. Each learning method has different learning algorithms, and efficiency can be increased further by choosing the right algorithm for the job. Supervised learning is driven by a labeled dataset with known output values. From the input data and their labels, supervised learning builds a model that can interpret new data. To obtain good learning results, the training dataset must be large and must be representative enough for the model to generalize.

The support vector machine (SVM) is one of the representative algorithms of supervised learning. It can perform classification or regression by selecting a hyperplane with the maximum margin [Fig. 2(a)]. It is particularly suited to classification and is widely used for this purpose. The SVM is based on linear classification and does not consider the dependence between properties. The decision tree (DT) is another method, which classifies or predicts by splitting the entire dataset into several subgroups, with the decision rules represented in a tree-like structure [Fig. 2(b)]. DTs are divided into classification trees and regression trees according to the type of variable handled. In a random forest, an ensemble of DTs acts as a set of voters, which dramatically increases the model accuracy and robustness [Fig. 2(c)].
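The voting idea behind a random forest can be sketched in a few lines of plain Python. This is a minimal illustration, not a full random forest: three hand-written one-rule trees (decision stumps) stand in for trained trees, and the feature names, thresholds, and sample values are all hypothetical.

```python
from collections import Counter

# Each "tree" is a one-rule decision stump. The feature names and
# thresholds are hypothetical placeholders, not values from any study.
def stump_electronegativity(sample):
    return "active" if sample["electronegativity"] > 2.0 else "inactive"

def stump_coordination(sample):
    return "active" if sample["coordination_number"] <= 6 else "inactive"

def stump_d_electrons(sample):
    return "active" if sample["d_electrons"] >= 8 else "inactive"

STUMPS = [stump_electronegativity, stump_coordination, stump_d_electrons]

def majority_vote(sample, trees=STUMPS):
    """Return the class label predicted by most of the trees."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]

# Two of the three stumps vote "active" for this hypothetical sample.
sample = {"electronegativity": 2.2, "coordination_number": 9, "d_electrons": 9}
label = majority_vote(sample)
print(label)
```

In an actual random forest, each tree would be trained on a bootstrap sample of the data with a random subset of features, so that the individual voters make partly independent errors.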

To successfully extract non-linear input-output relationships, the artificial neural network (ANN) was created as a learning algorithm that imitates the learning method of the brain, inspired by the human nervous system [Fig. 2(d)]. Neurons, which are human nerve cells, are composed of dendrites, axons, somas, and synapses. Artificial neurons are organized into several layers, and the weights of the synapses are continuously adjusted so that the output of the neural network approaches the expected value. This process is called training, and the purpose of training is to find the optimal values of the weights.
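The weight adjustment described above can be illustrated with a single artificial neuron. The following is a minimal sketch, assuming a sigmoid activation, a squared-error loss, and plain gradient descent; the toy dataset (logical OR), the learning rate, and the number of epochs are arbitrary choices for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Output of one sigmoid neuron for input vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_neuron(samples, targets, lr=0.5, epochs=2000):
    """Fit one sigmoid neuron by gradient descent on the squared error."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            y = predict(weights, bias, x)
            # Gradient of 0.5 * (y - t)**2 with respect to the pre-activation.
            delta = (y - t) * y * (1.0 - y)
            weights = [w - lr * delta * xi for w, xi in zip(weights, x)]
            bias -= lr * delta
    return weights, bias

# Toy data (logical OR), chosen only because it is linearly separable.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
T = [0, 1, 1, 1]
w, b = train_neuron(X, T)
outputs = [predict(w, b, x) for x in X]
print([round(o, 2) for o in outputs])
```

A multilayer network repeats the same idea, with the error signal propagated backward through the layers to update every weight.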

Unlike supervised learning, unsupervised learning uses data without labels. It may seem that learning is impossible in this case, but it is not. Although the correct answer is not given, the algorithm can learn by clustering data of similar classes. As an example, given a collection of dogs and chickens, the data can be divided into two groups, with the four-legged dogs in one group and the two-legged chickens in the other. As such, unsupervised learning is very useful for grouping data and has many advantages when performing dimension reduction.
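The dog and chicken example can be written as a small clustering sketch. The code below is a minimal one-dimensional k-means in plain Python, assuming two clusters and hand-picked starting centers; k-means is used here only as one common clustering algorithm, and the leg counts are made-up data.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain 1D k-means: assign each value to its nearest center, then update."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster happens to be empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Number of legs of six unlabeled animals (hypothetical data).
legs = [4, 2, 4, 2, 2, 4]
centers, clusters = kmeans_1d(legs, centers=[1.0, 5.0])
print(centers, clusters)
```

No label such as "dog" or "chicken" is ever supplied; the two groups emerge from the similarity of the data alone.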

Principal component analysis (PCA) is one of the representative dimension reduction algorithms [Fig. 2(e)]. First, the principal components, each represented by a linear combination of variables, are found by using the variance and covariance relationships between several quantitative variables. PCA is thus a multivariate analysis method that explains most of the total variation with the k most important components. It is usually performed for the purpose of data reduction and interpretation, and it is essential to minimize the loss of information.
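As a rough illustration of the procedure, the sketch below computes the covariance matrix of a small dataset and extracts the first principal component. It assumes power iteration in place of a full eigendecomposition to stay within the standard library, and the two-descriptor dataset is hypothetical.

```python
import math

def covariance_matrix(data):
    """Sample covariance matrix and mean-centered rows of `data`."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered

def leading_component(cov, iterations=200):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return eigenvalue, v

# Hypothetical two-descriptor data in which both columns vary together.
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
cov, centered = covariance_matrix(data)
eigenvalue, pc1 = leading_component(cov)

# Fraction of the total variance captured by the first component.
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
# One-dimensional representation of each sample along that component.
scores = [sum(r[j] * pc1[j] for j in range(len(pc1))) for r in centered]
print(round(explained, 4))
```

Because the two columns are almost perfectly correlated, a single component captures nearly all of the variance, so the two descriptors can be replaced by one score with little loss of information.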

Many algorithms are discriminative networks that determine, with a certain probability, what a given piece of data is. The generative adversarial network (GAN), however, is a generative model rather than a discriminative one. After training on data with a specific probability distribution, it can generate data with a similar distribution. A GAN is composed of a generator and a discriminator, and learning progresses through the adversarial relationship between these two agents [Fig. 2(f)].

ML and deep learning are still very active research topics, and researchers worldwide are committed to developing and deploying advanced algorithms with user-friendly interactive tools. Scientists and engineers studying catalysts should pay attention to cutting-edge ML algorithms such as the graph neural network (GNN), which has recently drawn attention for representing the neighboring atomic environment with nodes and edges. Understanding such state-of-the-art ML techniques is critical for applying them to one’s own research.

https://cdn.apub.kr/journalsite/sites/mrsk/2023-033-05/N0340330501/images/mrsk_2023_335_175_F2.jpg

Fig. 2

Graphical representation of supervised (a-d) and unsupervised (e, f) learning algorithms: (a) support vector machine (SVM), (b) decision tree (DT), (c) random forest, (d) artificial neural network (ANN), (e) principal component analysis (PCA), and (f) generative adversarial network (GAN). Adapted with permission from Ref. (6). Copyright 2018 American Chemical Society.

3. Electrocatalysts Discovery with ML

3.1. HER and OER

Electrochemical reactions can be divided into two half reactions occurring at the anode and cathode, respectively.³⁾ In water electrolysis for green hydrogen production, HER is the cathodic half reaction and OER is the anodic half reaction. Under working conditions, the voltage required to drive HER and OER exceeds the theoretically ideal potentials of 0 V and 1.23 V vs. RHE, respectively, and the difference between the onset potential and the theoretically ideal potential is defined as the overpotential. To facilitate water electrolysis efficiently and decrease the overall electricity consumption of the cell, both the HER overpotential and the OER overpotential must be lowered with active catalysts.
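The definition of overpotential can be made concrete with a short calculation. The sketch below assumes that overpotentials are reported as positive magnitudes and that ohmic and transport losses are ignored; the onset potentials are hypothetical numbers, and only the ideal potentials of 0 V and 1.23 V vs. RHE come from the text.

```python
# Ideal (equilibrium) potentials vs. RHE, as stated in the text.
E_EQ_HER = 0.0    # V
E_EQ_OER = 1.23   # V

def her_overpotential(onset_potential):
    """HER is cathodic, so its onset lies below 0 V vs. RHE."""
    return E_EQ_HER - onset_potential

def oer_overpotential(onset_potential):
    """OER is anodic, so its onset lies above 1.23 V vs. RHE."""
    return onset_potential - E_EQ_OER

def cell_voltage(her_onset, oer_onset):
    """Cell voltage implied by the two onset potentials alone."""
    return oer_onset - her_onset

# Hypothetical onset potentials (V vs. RHE), for illustration only.
eta_her = her_overpotential(-0.05)
eta_oer = oer_overpotential(1.53)
v_cell = cell_voltage(-0.05, 1.53)
print(round(eta_her, 2), round(eta_oer, 2), round(v_cell, 2))
```

In this made-up case the cell needs 1.58 V instead of the ideal 1.23 V, and the excess is the sum of the two overpotentials, which is why both must be lowered.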

The HER is a typical example of a two-electron transfer reaction, and the hydrogen adsorption free energy ΔG_H is key for determining activity. Active electrocatalysts for HER should present a ΔG_H close to zero, so that neither the adsorption of the hydrogen intermediate nor the desorption of the H₂ product gas is hindered. Platinum is the best performing electrocatalyst for the HER because of its nearly thermoneutral ΔG_H.

On the other hand, the OER involves a four-electron transfer to oxidize water to oxygen. Starting from pristine active sites on the catalyst surface, the intermediates OH*, O*, and OOH* are formed in successive steps. The largest free energy difference between adjacent steps is the energy barrier for the reaction, and the corresponding step becomes the rate-determining step. This energy barrier is related to the theoretical overpotential, and an active catalyst should lower it by binding the reaction intermediates with moderate strength. For OER, this trend can be represented by a volcano relationship that uses ΔG_O − ΔG_OH as a descriptor. Both the theoretical and the experimental overpotential or onset potential can be used as target values.
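The link between the intermediate energies and the theoretical overpotential can be sketched as follows. The code assumes the commonly used four-step adsorbate mechanism and the bookkeeping in which the four step energies sum to 4 × 1.23 eV; the function names and the adsorption free energies are hypothetical and are not taken from the cited works.

```python
E_EQ_OER = 1.23                     # V vs. RHE, from the text
TOTAL_FREE_ENERGY = 4 * E_EQ_OER    # eV for 2 H2O -> O2 + 4 (H+ + e-)

def oer_step_energies(dg_oh, dg_o, dg_ooh):
    """Free-energy change (eV) of each of the four one-electron steps."""
    return [
        dg_oh,                       # * + H2O -> OH*
        dg_o - dg_oh,                # OH* -> O*
        dg_ooh - dg_o,               # O* + H2O -> OOH*
        TOTAL_FREE_ENERGY - dg_ooh,  # OOH* -> * + O2
    ]

def theoretical_overpotential(dg_oh, dg_o, dg_ooh):
    """Largest step energy per electron minus 1.23 V, and the limiting step."""
    steps = oer_step_energies(dg_oh, dg_o, dg_ooh)
    limiting = max(range(4), key=lambda i: steps[i])
    return steps[limiting] - E_EQ_OER, limiting + 1

# Hypothetical adsorption free energies (eV), for illustration only.
eta, step = theoretical_overpotential(dg_oh=1.0, dg_o=2.7, dg_ooh=4.1)
print(round(eta, 2), step)
```

For these made-up values the second step (OH* to O*) is limiting, and its size is exactly the descriptor ΔG_O − ΔG_OH. An ideal catalyst would have all four steps equal to 1.23 eV and therefore zero theoretical overpotential.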

3.2. HER catalysts discovery with ML

In recent years, ML techniques have revolutionized the development and discovery of electrocatalytic and photocatalytic materials across various material classes, from metal oxides to single-atom catalysts. The key idea is to assess whether a catalyst is highly active by predicting the binding energy of adsorbates during the intermediate steps of the catalytic reaction. For HER, predicting the ΔG_H of various materials with pre-trained ML models and screening for promising candidates with near-zero ΔG_H might accelerate the catalyst development process and circumvent tedious trial and error.
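The screening step itself is simple once a model has produced ΔG_H values. The following minimal sketch assumes a tolerance of 0.1 eV around zero, which is an illustrative choice; the candidate names and predicted values are placeholders rather than model outputs from any study.

```python
def screen_her_candidates(predicted_dg_h, tolerance=0.1):
    """Keep materials whose predicted |dG_H| (eV) is within `tolerance` of
    zero, ordered from the most nearly thermoneutral outward."""
    hits = {name: dg for name, dg in predicted_dg_h.items()
            if abs(dg) <= tolerance}
    return sorted(hits, key=lambda name: abs(hits[name]))

# Hypothetical model outputs (eV); the names are placeholders.
predictions = {"cand_A": -0.42, "cand_B": 0.03, "cand_C": -0.08, "cand_D": 0.31}
shortlist = screen_her_candidates(predictions)
print(shortlist)
```

Only the shortlisted candidates would then be passed on to more expensive DFT calculations or experiments.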

Despite the high intrinsic HER activity of Pt, the scarcity and high cost of this precious element limit its commercial use. Researchers have developed many catalysts that minimize the usage of precious metals or even replace them with low-cost transition metal elements while maintaining HER activity comparable to that of Pt. Compared to conventional bulk electrocatalysts, nanomaterials can exhibit even better performance, benefiting from the rapid development of nanoscience and nanotechnology. Since the binding energy of the HER reaction intermediate is a key descriptor of the expected HER activity, researchers could efficiently explore a vast chemical space of candidate HER catalysts to replace Pt if machine learning models successfully predict the hydrogen binding energy at the catalyst surface.

To surpass the HER activity of pure Pt metal catalysts, various metal alloys have been investigated to reduce the Pt content or even exclude Pt. However, the number of possible combinations of elements with different compositions is almost infinite, which makes the candidate search difficult. Tran et al. accelerated this search by creating a framework that combines ML and surrogate-based optimization to guide DFT calculations, which produced 42,785 adsorption-energy calculations and identified 258 candidate surfaces across 102 metal alloys for HER.⁷⁾ Each element type was described by the atomic number, the Pauling electronegativity, the number of atoms coordinated with the adsorbate, and the median adsorption energy, and these values were tabulated as the catalyst descriptors for the machine learning algorithm. The adsorption energies on different surface planes of the metal alloys were visualized with the t-SNE algorithm, which projects each data point onto a 2D plane and forms clusters of similar neighboring data points. Some Pt-based alloys such as PtGa and AlPt, and Pt-free alloys such as SiV, PdGa, PdSi, and AlNi, were predicted to present optimal hydrogen adsorption energy, making them good candidates for further investigation and experiments.

The metal alloys can be modified by decreasing their size from bulk to nanoparticle, which increases the surface-to-volume ratio and may enhance the surface catalytic activity. Recently, Mao et al. reported a DFT-based high-throughput screening method combined with machine learning models to identify a type of alloy nanocluster as an electrocatalyst for HER.⁸⁾ Cu-based alloy clusters of Cu_{55-n}M_n (M = Co, Ni, Ru, and Rh, and n ≤ 22) were optimized, and a preference for core-shell structures with the dopant metal in the core and Cu in the shell was confirmed. The excess energies (E_exc) of the Cu_{55-n}M_n alloy clusters are plotted in Fig. 3(a) as a function of the average radial distance of the dopant metal atoms from the center atom. A lower excess energy indicates a more stable cluster, which was used to identify the optimal atomic ratio of the Cu-Ni cluster. At a fixed Cu-Ni atomic ratio, the core-shell structure was more stable than the segregated cluster structure [Fig. 3(b)]. A neural network model was trained to predict the hydrogen adsorption energy at specific hydrogen adsorption sites. To represent the charge distribution around the active site, the features consisted of the mean Bader charge of the first-neighbor and second-neighbor atoms of the active site on the outer atomic shell and the coordination number of the active site atom. The researchers used 3,388 adsorption free energies to predict the H adsorption free energy on nanoclusters with a mean absolute error (MAE) of 0.07 eV and a root-mean-square error (RMSE) of 0.11 eV on the test set, as shown in the parity plots [Fig. 3(c)]. The researchers showed that the HER performance of the Cu-based nanoclusters can be significantly improved by doping with transition metals, which can shift the hydrogen adsorption energy toward its optimal value. Among the metal alloy nanocluster candidates, core-shell CuNi alloy clusters were suggested as the best electrocatalysts owing to their superior structural stability and electrochemical activity.
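The two error metrics quoted above are straightforward to compute from paired reference and predicted values. The sketch below uses plain Python; the DFT and model values are hypothetical numbers chosen for illustration, not data from Ref. (8).

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root-mean-square error; penalizes large deviations more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
                     / len(y_true))

# Hypothetical DFT-calculated and ML-predicted dG_H values (eV).
dft_values = [0.10, -0.20, 0.05, 0.30]
ml_values = [0.15, -0.25, 0.05, 0.10]

test_mae = mae(dft_values, ml_values)
test_rmse = rmse(dft_values, ml_values)
print(round(test_mae, 3), round(test_rmse, 3))
```

The RMSE is never smaller than the MAE, and a large gap between the two indicates that a few points deviate strongly from the parity line.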

https://cdn.apub.kr/journalsite/sites/mrsk/2023-033-05/N0340330501/images/mrsk_2023_335_175_F3.jpg

Fig. 3

(a) The excess energies (E_exc) for Cu_{55-n}Ni_n alloy clusters with n = 7, 13, and 16 as a function of the average radial distance (D_{radial, M}) between the dopant metal atoms and the center atom. The insets show the lowest-energy configurations. (b) E_exc for Cu₄₂Ni₁₃ alloy clusters as a function of D_{radial, M} between the dopant metal atoms and the central atom. The core-shell structure and the segregated structure are illustrated. The insets show the optimal adsorption sites for H on the surface of the optimized Cu-centered Cu-Ni core-shell clusters with |ΔG_H| < 0.1 eV. (c) Parity plots between predicted and DFT-calculated ΔG_H values. Reproduced with permission from Ref. (8). Copyright 2021 Springer Nature.

As with nanoparticles, reducing bulk materials to two-dimensional (2D) materials yields a large surface-area-to-volume ratio and unique physical properties, and such materials have been widely researched. This interest led both to a new wave of research on known 2D materials, such as metal dichalcogenides and boron nitride, and to the discovery of many new 2D materials. For example, Ge et al. predicted the hydrogen adsorption energy of two single-layer metal dichalcogenides MX₂ (M = Mo, W; X = S, Se, Te) in a tilted heterojunction structure by training ML models with the rotational angle, the bond length, the ratio of the bandgaps of the two MX₂ layers, and the distance between the two layers.⁹⁾ Using the trained ML model, the MoTe₂/WTe₂ heterojunction with a rotation of 300 degrees was predicted to show optimal hydrogen adsorption behavior for efficient HER and was suggested as a promising HER catalyst.

Another widely investigated class of 2D materials is the MXenes, which are 2D flakes of transition metal carbides and nitrides. The general formula of MXenes is M_{n+1}X_nT_x (n = 1, 2, 3), where M is an early transition metal, X is C and/or N, and T_x is the surface functional group, such as -O, -OH, and -F. 2D MXene materials have attracted researchers because of their adjustable chemical composition, tunable layer thickness, and facile functionalization, together with outstanding physicochemical properties, which could be beneficial for HER catalyst design. However, tuning the thermal stability and activating the in-plane activity still remain a challenge. To solve this problem, Zheng et al. built a machine learning framework combined with screening based on DFT calculations to predict the hydrogen adsorption energy on MXenes with various hydrogen coverages.¹⁰⁾ Various bare MXenes without hydrogen coverage were modelled to obtain the hydrogen adsorption free energy values by a series of DFT calculations [Fig. 4(a)]. Different numerical intervals of ΔG_H* are represented by differently colored circles.

To tune the hydrogen adsorption, the S functional group was introduced onto the bare MXenes. Twenty primary descriptors were suggested to represent the hydrogen adsorption properties on MXenes, and the correlation map of those features was used to narrow down the number of descriptors [Fig. 4(b)]. The ten most important features were selected, and the electronegativity and atomic mass of the transition metal element turned out to be important for accurately predicting the hydrogen adsorption energy of the bare MXenes [Fig. 4(c)]. In addition, descriptors based on the charge and the atomic radius of the transition metal elements, which correspond to the electronic and geometric structures, could be applied as key descriptors for predicting the hydrogen adsorption energy on MXenes over a wide range of hydrogen coverages. The random forest regression model showed the lowest test set error compared to the support vector and kernel ridge regression models, and this best-performing model was selected for prediction. It predicted that Os₂B- and S-terminated Sc_{n+1}N_n (n = 1, 2, 3) MXenes exhibit a hydrogen adsorption free energy ΔG_H near zero under a wide range of hydrogen coverages, which leads to superior HER activity [Fig. 4(d)]. DFT calculations at different hydrogen coverages were conducted on the promising candidates to verify the predictions of the machine learning model, and the authors claimed that the S functional groups play a crucial role in regulating the HER performance because of antibonding states that are full of electrons.
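Narrowing down descriptors with a correlation map can be sketched as a simple pruning loop. The code below is an illustration under stated assumptions, not the procedure of Ref. (10): it uses the Pearson correlation coefficient, a greedy keep-first rule, and a threshold of 0.95, and the descriptor names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def prune_correlated(features, threshold=0.95):
    """Greedily keep a feature only if its |r| with every feature kept
    so far is below `threshold`."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold
               for k in kept):
            kept.append(name)
    return kept

# Hypothetical descriptor columns for five materials.
features = {
    "electronegativity": [1.3, 1.5, 1.6, 2.2, 2.4],
    "atomic_mass": [45.0, 48.0, 51.0, 190.0, 92.0],
    # Exactly twice the first column, so it carries no new information.
    "electronegativity_scaled": [2.6, 3.0, 3.2, 4.4, 4.8],
}
kept = prune_correlated(features)
print(kept)
```

Removing strongly correlated descriptors before training keeps the feature importance ranking easier to interpret, since redundant features would otherwise share importance between them.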

https://cdn.apub.kr/journalsite/sites/mrsk/2023-033-05/N0340330501/images/mrsk_2023_335_175_F4.jpg

Fig. 4

(a) Color block map of ΔG_H* for the bare MXenes. Gray, yellow, orange, and wine-red circles represent ΔG_H* values of < -1.5, -1.5~-1.0, -1.0~-0.5, and -0.5~0.2 eV, respectively. (b) Correlation map of 20 features. Red indicates a strong positive correlation, blue indicates a strong negative correlation, and white means no direct correlation between two descriptors. (c) Importance of the top 10 descriptive features suggested by the random forest machine learning models. (d) ΔG_H* of Hf₄C₃S₂ and Sc₂NS₂ with different adsorbed hydrogen coverages from 1/9 to 4/9. Reproduced with permission from Ref. (10). Copyright 2020 American Chemical Society.

Furthermore, Wang et al. expanded the chemical space of O-terminated MXenes to ordered binary alloy (OBA) based MXenes with different combinations of two 3d, 4d, and 5d transition metal elements, which leads to the M₂M′X₂O₂ and M₂M′₂X₃O₂ systems (X = C or N).¹¹⁾ Out of 2,520 candidates, 110 experimentally unexplored 2D MXene OBAs were predicted to show outstanding thermostability and HER activity. Geometrical and electrical descriptors were selected to link the improved HER activity with the alloying effect through an ensemble learning model. The hydrogen adsorption energy on hydrogen-covered OBA-based MXenes was predicted with the machine learning model, and Ti₂M′₂C₃O₂ (M′ for 3d = Ti, V, Cr; 4d = Zr, Nb, Mo; 5d = Hf, Ta) were expected to show exceptional HER activity because of their optimal hydrogen adsorption strength. Sun et al. also expanded the candidate group to MBenes, which can be represented as M_mB_nT_x (m, n = 1, 2), and trained machine learning models that successfully predicted the hydrogen adsorption energy. As a result, Co₂B₂ and Mn/Co₂B₂ were highlighted as excellent HER catalysts with |ΔG_H*| < 0.15 eV over a wide range of hydrogen coverages.

2D materials can be applied as efficient catalysts not only for electrocatalytic water splitting but also for photocatalytic water splitting. Photocatalytic water splitting is another promising route for generating hydrogen without using fossil fuels. Photocatalysts based on bulk oxide materials such as TiO₂ have suffered from large band gaps, low harvesting of visible light, and a high tendency for charge recombination.¹²⁾ To circumvent these problems, various classes of 2D materials have been developed, which offer several advantages over their bulk counterparts, such as more active sites per surface area and enhanced charge separation and transport. Kumar et al. devised an interpretable machine learning model to find efficient 2D water-splitting photocatalysts, based on 3,099 2D materials belonging to the octahedral symmetry group (O_h) or 1T phase, in which six hydroxyl ligands are attached to a metal atom in the octahedral geometry.¹³⁾ The total computational time required to perform the relevant DFT calculations amounted to about three years or 450,000 CPU core hours, justifying the need for an ML-based study to quickly screen new stable 2D materials. Hence, highly accurate ML methods, including mean feature ranking and Bayesian hyperparameter optimization, were used to predict formation energies and convex hull distances. The best three hyperparameters for the formation energy regression, namely the learning rate, the number of leaves in the tree-based model, and the lambda factors, are represented in contour plots [Fig. 5(a)]. After the ML prediction of phase stability, the funnel-like screening procedure in Fig. 5(b) was conducted. The first tier screened the 3,099 octahedral 2D (2DO) materials in the database based on their overall stability descriptors. The second tier selected semiconductors with a specific symmetrical structure and PBE band gaps between 0.5 and 2.0 eV. The selected 37 2DO materials were finally screened based on suitable GW band gaps and band alignment.
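A funnel-like screening of this kind can be sketched as a chain of filters. In the example below, only the PBE band gap window of 0.5 to 2.0 eV comes from the text; the `Material` class, the hull-distance cutoff, the GW gap window, and all material records are hypothetical and serve only to show the structure of a tiered screen.

```python
from dataclasses import dataclass

@dataclass
class Material:
    name: str
    hull_distance: float    # eV/atom, e.g. from an ML stability model
    pbe_gap: float          # eV
    gw_gap: float           # eV
    straddles_redox: bool   # band edges straddle the water redox potentials

def funnel(materials, max_hull=0.1, pbe_window=(0.5, 2.0), gw_window=(1.23, 3.0)):
    """Apply three successive tiers; each tier only sees survivors of the last."""
    tier1 = [m for m in materials if m.hull_distance <= max_hull]
    tier2 = [m for m in tier1 if pbe_window[0] <= m.pbe_gap <= pbe_window[1]]
    tier3 = [m for m in tier2
             if gw_window[0] <= m.gw_gap <= gw_window[1] and m.straddles_redox]
    return tier1, tier2, tier3

# Hypothetical records; names and numbers are placeholders.
database = [
    Material("mat_1", 0.02, 1.1, 2.3, True),
    Material("mat_2", 0.30, 1.0, 2.1, True),   # fails tier 1 (unstable)
    Material("mat_3", 0.05, 0.2, 1.0, True),   # fails tier 2 (PBE gap too small)
    Material("mat_4", 0.01, 1.8, 3.6, True),   # fails tier 3 (GW gap too large)
    Material("mat_5", 0.08, 0.9, 2.0, False),  # fails tier 3 (band alignment)
]
tier1, tier2, tier3 = funnel(database)
print(len(tier1), len(tier2), [m.name for m in tier3])
```

Ordering the tiers from the cheapest criterion to the most expensive one means that costly calculations such as GW are only needed for the few materials that survive the earlier filters.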

https://cdn.apub.kr/journalsite/sites/mrsk/2023-033-05/N0340330501/images/mrsk_2023_335_175_F5.jpg

Fig. 5

(a) Contour plot of the best three hyperparameters utilized in the LightGBM algorithm-based ML model using selected features for the formation energy regression. (b) The high-throughput scheme utilized for screening stable and catalytically active 2DO photocatalysts with (thermo)dynamic stability, suitable PBE band gaps, and GW band gaps with band alignments. Reproduced with permission from Ref. (13). Copyright 2021 Springer Nature.

The most stable 2D materials were further screened based on suitable band gaps within the visible region and band alignments with respect to the standard redox potentials using the GW method, resulting in 21 potential candidates. HfSe₂ and ZrSe₂ were found to have high solar-to-hydrogen efficiencies reaching their theoretical limits. These suggested 2D materials are also expected to reduce the likelihood of charge carrier recombination by shortening the distance that photogenerated electrons and holes must travel to reach the active sites. The researchers also performed further calculations on the target material systems to evaluate the band edge positions and to relate the localization of the CBM and VBM to the photocatalytic advantages.

Noble-metal nanoparticle (NP) electrocatalysts have been exploited for diverse electrochemical reactions, including HER, to facilitate water electrolysis and realize an eco-friendly hydrogen economy. For further improvement in activity and cost effectiveness, single atoms have recently been exploited to maximize the active surface area with minimal metal usage and to tune the catalytic activity by coordinating the single atoms in defect sites of N-doped graphene.¹⁴⁾ To explore the optimal single-atom electrocatalysts, Ha et al. screened single-atom catalysts with 3d-5d transition metal elements using DFT along with ML-based descriptors.¹⁵⁾ The stability and activity of M-N-doped graphene were explored from the viewpoints of structure/coordination, formation energy, structural/electrochemical stability, electronic properties, electrical conductivity, and reaction mechanism. Among various –N_nC_m moieties, the –N₂C₂ moieties show higher electrochemical catalytic performance and longer durability (without aggregation/dissolution) than the widely studied pure –C₄/C₃ and –N₄/N₃ moieties, and the formation of –N₂C₂ moieties is also energetically more favorable than that of other moieties [Fig. 6(a)]. The ML-based descriptors were carefully selected to predict the hydrogen adsorption energy with low error, in order to discover better catalysts than the benchmark noble metal catalysts. In the N₂C₂ templates, Ni/Ru/Rh/Pt single-atom active sites were predicted to show low HER overpotentials because of their optimal hydrogen binding energy near zero.

The catalytic activity was investigated in terms of the H-adsorption binding/free energies of the intermediates during the HER. The HER overpotential (η^HER) was calculated from ΔG_H*, which is used to identify outstanding HER catalysts, and visualized with a representative color map [Fig. 6(b)]. Mo/Rh/La/Os/Pt in C₄-graphene, Ti/Zr/Ag/Cd in N₁C₃-N-doped graphene, Co/Ni/Ag in N₂C₂^b-N-doped graphene, Cr/Cu/Cd in N₃C₁-N-doped graphene, Rh/Cd in N₄-N-doped graphene, and Ni/Ru/Rh/Pt in –N₂C₂-N-doped graphene templates show -η^HER < 0.1 V and are likely to be high-performing HER catalysts, given that the Pt(111) surface shows -η^HER = 0.2 V at the fcc-hollow site. Indeed, Pt-N₂C₂, Ru-N₂C₂, and Rh-N₂C₂ have been experimentally synthesized and reported as remarkably active HER catalysts. This research strategy, which combines high-throughput computing with machine learning, dramatically accelerates the catalyst search and provides a robust means of evaluating unexplored types of materials as targeted electrocatalysts.
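The screening criterion described above can be sketched in a few lines. This is a minimal illustration, not the workflow of Ref. (15): it assumes the common computational hydrogen electrode picture in which η^HER = -|ΔG_H*|/e, and the function names and ΔG_H* values are hypothetical placeholders.

```python
def her_overpotential(delta_g_h):
    """Theoretical HER overpotential (V) from the H adsorption free energy (eV).

    Assumption: eta_HER = -|dG_H*| / e, i.e. the limiting potential is set
    purely by how far dG_H* lies from the thermoneutral value of 0 eV.
    """
    return -abs(delta_g_h)


def screen(candidates, threshold=0.1):
    """Return the names whose -eta_HER is below `threshold` (V), best first."""
    scored = [(name, -her_overpotential(dg)) for name, dg in candidates.items()]
    passed = [(name, eta) for name, eta in scored if eta < threshold]
    return [name for name, _ in sorted(passed, key=lambda item: item[1])]


# Placeholder dG_H* values in eV, invented purely for illustration.
example = {"site_A": 0.03, "site_B": -0.25, "site_C": -0.08}
print(screen(example))
```

Both overly strong (negative ΔG_H*) and overly weak (positive ΔG_H*) hydrogen binding are penalized, which is why the absolute value is taken before comparing with the 0.1 V threshold.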

https://cdn.apub.kr/journalsite/sites/mrsk/2023-033-05/N0340330501/images/mrsk_2023_335_175_F6.jpg

Fig. 6

(a) DFT-calculated vacancy formation energies for –N_nC_mH_h moieties of N-doped graphene in the presence of H but in the absence of metal atoms. (b) HER overpotentials (η^HER) of metal-N_nC_m on N-doped graphene. Color-coded map of η^HER (upper triangle) and the most stable H-adsorption sites (H*) (lower triangle), which are denoted as metal site (white blank), metal-C bridge site (hat symbol “^”), C site (“C”), or N site (“N”). Reproduced with permission from Ref. (15). Copyright 2021 Royal Society of Chemistry.

3.3. OER catalysts discovery with ML

Compared to the two-electron transfer process of the HER, the OER involves four intermediate steps with a total of four electrons transferred, which leads to unavoidably sluggish kinetics that greatly reduce the energy efficiency of the overall reaction. Significant efforts have been made over the past decades to develop highly active electrocatalysts that improve the OER rate. Currently, the state-of-the-art OER catalysts are oxides of precious metals, such as RuO₂ and IrO₂. For the OER, predicting ΔG_O and ΔG_OH on the catalyst surface with an ML model to locate the top of the volcano plot leads to promising candidate materials with the lowest overpotential. To accelerate catalyst screening and rationalize catalyst design, it is key to reveal the intrinsic factors that affect the OER activity of multicomponent transition metal oxides.
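To make the link between the adsorption free energies and the overpotential concrete, the sketch below evaluates the four-step OER picture. It is an illustrative assumption rather than a method taken from the works reviewed here: the steps are taken as * → *OH → *O → *OOH → O₂, the overpotential is the largest step free energy minus 1.23 V, and when ΔG_OOH is not supplied it is estimated from the commonly used scaling relation ΔG_OOH ≈ ΔG_OH + 3.2 eV. All names and the sample inputs are hypothetical.

```python
EQUILIBRIUM_POTENTIAL = 1.23  # V, O2/H2O equilibrium potential
TOTAL_FREE_ENERGY = 4 * EQUILIBRIUM_POTENTIAL  # eV for 2 H2O -> O2 + 4 (H+ + e-)


def oer_overpotential(dg_oh, dg_o, dg_ooh=None, scaling_offset=3.2):
    """Theoretical OER overpotential (V) from adsorption free energies (eV).

    Assumption: if dg_ooh is not given, it is estimated from the scaling
    relation dG_OOH = dG_OH + scaling_offset.
    """
    if dg_ooh is None:
        dg_ooh = dg_oh + scaling_offset
    steps = [
        dg_oh,                       # * + H2O -> *OH
        dg_o - dg_oh,                # *OH -> *O
        dg_ooh - dg_o,               # *O + H2O -> *OOH
        TOTAL_FREE_ENERGY - dg_ooh,  # *OOH -> * + O2
    ]
    # The potential-determining step is the one with the largest free energy.
    return max(steps) - EQUILIBRIUM_POTENTIAL


# Placeholder inputs in eV, invented for illustration.
print(round(oer_overpotential(dg_oh=1.0, dg_o=2.6), 2))
```

Under the scaling assumption, only ΔG_O and ΔG_OH are needed per surface, which is why these two quantities are natural regression targets for an ML model that screens along the volcano plot.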

To fully commercialize highly efficient proton exchange membrane (PEM) electrolyzers, which produce hydrogen with higher purity, economical, active, and acid-stable electrocatalysts are required for the OER occurring at the anode. Discovering such catalysts is critical because the OER is a bottleneck in many electrochemical energy conversion systems, including water electrolysis. Current systems use extremely expensive iridium oxide catalysts. Identifying Ir-free or low-Ir catalysts has been suggested as the goal, but no systematic strategy to discover such catalysts had been reported. Back et al. performed first-principles-based high-throughput catalyst screening to discover OER-active and acid-stable catalysts, focusing on equimolar bimetallic oxides with space groups derived from those of IrO_x.¹⁶⁾ The researchers developed an approach to evaluate acid stability under the reaction conditions by utilizing the Materials Project database¹⁷⁾ and DFT calculations. The acid-stable materials were then investigated for their OER catalytic activities, which identified promising OER catalysts that satisfy all the desired properties: Co-Ir, Fe-Ir, and Mo-Ir bimetallic oxides.

Moreover, machine-learning-based surrogate models have the potential to accelerate the search for polymorphs that target specific applications. Recently, a generalizable active-learning-accelerated algorithm was reported for the identification of electrochemically stable iridium oxide polymorphs of IrO₂ and IrO₃ (Fig. 7).¹⁸⁾ The search was combined with a subsequent evaluation of the structures’ electrochemical stability for the acidic oxygen evolution reaction. By finding all 956 structurally distinct AB₂ and AB₃ prototypes in current materials databases, more than 38,000 structural candidates were generated. Using the active learning methodology, the researchers discovered 196 IrO₂ polymorphs within the thermodynamic amorphous synthesizability limit and confirmed the overall stability of the rutile structure. A random search of the candidate space was performed to benchmark the algorithm’s performance, and at least a 2-fold increase in the rate of discovery was confirmed. Additionally, the active learning approach could acquire the most stable polymorphs of IrO₂ and IrO₃ with fewer than 30 density functional theory optimizations [Fig. 7(a)]. Analysis of the structural characteristics of the discovered polymorphs suggests that almost all low-energy structures prefer octahedral local coordination environments [Fig. 7(b)]. A subsequent Pourbaix diagram of the Ir-H₂O system revealed that rutile IrO₂ is less stable than α-IrO₃ under acidic OER conditions [Fig. 7(c)]. In addition, DFT calculations of the theoretical OER surface activities showed a favorably weaker binding of the OER intermediates on α-IrO₃ than on any other iridium oxide considered. The proposed active learning algorithm could be easily generalized to search for any binary metal oxide structure with a defined stoichiometry.
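The general shape of such an active learning loop can be sketched as follows. This is a toy illustration only and not the algorithm of Ref. (18): a one-dimensional nearest-neighbour surrogate with a distance-based uncertainty stands in for the surrogate model, `toy_dft_energy` is a hypothetical placeholder for an expensive DFT relaxation, and the acquisition rule (predicted energy minus `kappa` times uncertainty) is one simple choice among many.

```python
import random


def toy_dft_energy(x):
    """Hypothetical stand-in for an expensive DFT relaxation."""
    return (x - 0.7) ** 2


def surrogate(x, labeled):
    """Predict the energy at x from the nearest evaluated candidate.

    Returns (predicted energy, uncertainty), where the uncertainty is simply
    the distance to that nearest evaluated candidate.
    """
    nearest_x = min(labeled, key=lambda known: abs(known - x))
    return labeled[nearest_x], abs(nearest_x - x)


def active_learning(pool, budget, kappa=1.0, seed=0):
    """Evaluate at most `budget` candidates, chosen one at a time."""
    rng = random.Random(seed)
    # Start from two randomly chosen candidates.
    labeled = {x: toy_dft_energy(x) for x in rng.sample(pool, 2)}
    while len(labeled) < min(budget, len(pool)):
        unlabeled = [x for x in pool if x not in labeled]

        def acquisition(x):
            mean, uncertainty = surrogate(x, labeled)
            # Low predicted energy and high uncertainty are both attractive.
            return mean - kappa * uncertainty

        chosen = min(unlabeled, key=acquisition)
        labeled[chosen] = toy_dft_energy(chosen)  # the "expensive" step
    return labeled


pool = [i / 100 for i in range(101)]
labeled = active_learning(pool, budget=15)
best = min(labeled, key=labeled.get)
print(f"best x = {best:.2f} after {len(labeled)} evaluations")
```

The key design point is that each expensive evaluation is spent where the surrogate either predicts a low energy or is least certain, rather than on candidates drawn at random from the pool.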

To expand the candidate catalysts to bimetallic perovskite structures, Li et al. developed an adaptive machine learning strategy to search for high-performance double perovskites (AA′B₂O₆), derived from ABO₃-type cubic perovskites, for catalyzing the OER [Fig. 8(a)].¹⁹⁾ An exploratory analysis comparing the overpotential distributions of candidates with different B-site elements was conducted to gain compositional insights into the OER activity [Fig. 8(b)]. A set of multifidelity features, such as composition and electronic structure, was optimized, and probabilistic models based on Gaussian processes were trained with DFT-calculated *O and *OH adsorption energies as catalytic activity descriptors. With the criterion that the theoretical overpotential be less than 0.5 V, candidates were iteratively refined and investigated using ML models with an RMSE of less than 0.5 eV. This ML model could rapidly navigate through a chemical subspace of ~4,000 AA′B₂O₆ compositions and single out stable structures with promising OER activity [Fig. 8(c)]. Various known perovskites with improved catalytic performance over the benchmark LaCoO₃ were successfully identified [Fig. 8(d)], and additional unexplored candidates were suggested for further investigation.

https://cdn.apub.kr/journalsite/sites/mrsk/2023-033-05/N0340330501/images/mrsk_2023_335_175_F7.jpg

Fig. 7

(a) Five different generations of the active learning process for searching electrochemically stable iridium oxide polymorphs. (b) Structural models with one monolayer of adsorbed O atoms on the different surfaces of IrO_x polymorphs. (c) Pourbaix diagram of the Ir-H₂O system as a function of the applied potential (U_SHE) and the pH of the electrolyte. Reproduced with permission from Ref. (18). Copyright 2020 American Chemical Society.

https://cdn.apub.kr/journalsite/sites/mrsk/2023-033-05/N0340330501/images/mrsk_2023_335_175_F8.jpg

Fig. 8

(a) Configuration of the AA′B₂O₆ cubic double perovskite. (b) Box and swarm plots of the overpotential distributions for the candidate data set with different B-site metals. (c) Heat map visualization of the OER activity of double perovskites as a function of A-site/B-site cations in terms of the OER overpotentials and cubic phase probability. The red/blue color bar denotes the overpotentials, and the purple bar represents the tolerance factor. (d) Parity plot of DFT-calculated versus Gaussian process model predictions of the descriptor adsorption free energies on candidate perovskite structures. Reproduced with permission from Ref. (19). Copyright 2020 American Chemical Society.

As another mathematically grounded approach to descriptor development, symbolic regression (SR) is a promising interpretable machine learning method for building mathematical formulas that best fit a given dataset. Recently, Weng et al. used SR to guide the design of new oxide perovskite catalysts by developing a simple descriptor, μ/t, where μ and t are the octahedral and tolerance factors, respectively.²⁰⁾ This simple descriptor accelerated the discovery of a series of new oxide perovskite catalysts with various elemental ratios, which expanded the candidate chemical space. The researchers then synthesized five new oxide perovskites and characterized their OER activities. Four of them, Cs_0.4La_0.6Mn_0.25Co_0.75O₃, Cs_0.3La_0.7NiO₃, SrNi_0.75Co_0.25O₃, and Sr_0.25Ba_0.75NiO₃, were among the oxide perovskite catalysts with the highest intrinsic activities, which highlights the predictive power of SR in OER catalyst screening.
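Because μ/t depends only on ionic radii, it can be evaluated before any synthesis or DFT calculation. The sketch below assumes the standard definitions μ = r_B/r_O and t = (r_A + r_O)/[√2(r_B + r_O)] for an ABO₃ perovskite. SrTiO₃ with Shannon radii is used purely as a familiar illustration and is not one of the catalysts in Ref. (20). How radii are averaged for mixed A or B sites is not specified here and would be a further assumption.

```python
import math


def octahedral_factor(r_b, r_x):
    """mu = r_B / r_X for the BX6 octahedron."""
    return r_b / r_x


def tolerance_factor(r_a, r_b, r_x):
    """Goldschmidt tolerance factor t = (r_A + r_X) / (sqrt(2) * (r_B + r_X))."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))


def mu_over_t(r_a, r_b, r_x):
    """The mu/t descriptor built from the two geometric factors."""
    return octahedral_factor(r_b, r_x) / tolerance_factor(r_a, r_b, r_x)


# Shannon ionic radii in angstroms, for illustration only:
# Sr2+ (12-coordinate), Ti4+ (6-coordinate), O2- (6-coordinate).
R_SR, R_TI, R_O = 1.44, 0.605, 1.40

print(f"mu   = {octahedral_factor(R_TI, R_O):.3f}")
print(f"t    = {tolerance_factor(R_SR, R_TI, R_O):.3f}")
print(f"mu/t = {mu_over_t(R_SR, R_TI, R_O):.3f}")
```

A descriptor of this kind costs essentially nothing to compute, so large compositional spaces can be ranked first and only the top candidates passed on to DFT or experiment.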

To maximize atomic utilization, atomically dispersed single-atom catalysts (SACs) have recently gained attention from researchers. Wu et al. designed a topological-information-based ML model to map the OER overpotentials to the atomic properties of the corresponding SACs.²¹⁾ A topology-based, ML-accelerated prediction of the OER overpotentials of all transition metals was reported based on DFT calculations of 15 species of SACs. The trained ML model enabled a 130,000-fold reduction in prediction time compared with traditional DFT calculations. It also yielded high prediction precision with a low relative error of 6.49 %, providing both prediction accuracy and time efficiency.

For solar-driven photoelectrochemical (PEC) water splitting, doping is an effective strategy for tuning metal-oxide-based semiconductors, since a proper bandgap leads to optimal light absorbance. Despite years of intensive research, choosing the right dopant is still primarily based on trial and error. Because it can identify correlations in the ostensibly unclear relationships between a wide range of dopant characteristics and the PEC performance of doped photoelectrodes, machine learning is promising for delivering predictive insights into dopant selection for high-performing PEC systems. Recently, Wang et al. built an ML model to predict the doping effect of 17 metal dopants in the hematite structure of Fe₂O₃, a prototype photoelectrode material.²²⁾ The critical parameters were extracted from 10 intrinsic features of each dopant and validated experimentally through consistent predictions of the behaviors of Y and La dopants. From the ML model, the chemical state was identified as the most significant selection criterion, and dopants with a higher metal-oxygen bond formation enthalpy and a larger ionic radius were favored for improving charge separation and transfer in Fe₂O₃ photoanodes. The generality of these ML-guided selection criteria was further extended to CuO-based photoelectrodes doped with alkaline metal ions. This ML-assisted dopant search could be readily transferred to the design of other types of catalysts, such as transition-metal-based layered double hydroxides.²³⁾

4. Opportunities and Prospect

ML is undoubtedly revolutionizing the paradigm in almost every field, including science and engineering. The recent surge of interest in ML could lead to the mistaken belief that ML is an absolute solution for every case. In fact, ML is not needed in many cases and even performs worse on some tasks. Researchers should therefore ask themselves whether or not ML should be used for a given task. ML can be a meaningful strategy for problems that humans often struggle with, such as when the data and their interactions are too complex to be interpreted.

There are several challenges in applying ML to materials science and engineering. The most frequent difficulty for researchers who want to apply ML to the materials field is the small amount of data. Since the absolute amount of data related to ongoing research is tiny, it is often difficult to proceed with ML based on this data. However, some state-of-the-art machine learning techniques, such as few-shot learning or transfer learning, have been extensively applied to overcome the data deficiency and expand the domain of ML-assisted advances in materials science and engineering fields.

When the training data and search space cover widely differing crystal structures and chemical compositions, it is challenging to directly use elemental and structural representations as descriptors. In particular, for energetic indicators, such as the adsorption energies of O or H species on HER and OER electrocatalysts and the binding energies between electrocatalysts and their supports, the results of DFT calculations have been stored in databases for the prediction and classification of target physical properties with ML, thanks to advanced techniques and computational power.

There are now a few extensive databases that provide DFT calculation data for computational catalysis research. For instance, Catalysis-Hub is a web platform that allows comprehensive and user-friendly access to heterogeneous catalysis concepts, and its surface reactions database contains thousands of reaction energies and barriers from DFT calculations on surface systems.²⁴⁾ The Open Catalyst Project is another collaborative research effort to use AI to model and discover new catalysts, and it contains 1.2 million molecular relaxations with results from over 250 million DFT calculations.²⁵⁾ With the help of ML and big data, the discovery of complex multicomponent catalysts for optimal reactions might be accelerated, for example, transition-metal-based alloys,²⁶⁾ atomically dispersed dual-atom catalysts,²⁷⁾ or 2D materials.²⁸⁾ The ML-assisted catalyst development paradigm might be expanded to various electrochemical reactions, such as water oxidation under neutral conditions,²⁹⁾ CO₂ reduction,³⁰⁾ ammonia oxidation,³¹⁾ and photoelectrochemical reactions.³²⁾ We believe that ML will lift the limitations on the chemical space for catalyst searches, so that superior catalysts for efficient water electrolysis can be discovered and synthesized,³³⁾ which might accelerate the development of a green hydrogen ecosystem with large-scale hydrogen production.³⁴⁾

Acknowledgements

This study was supported by the KRISS (Korea Research Institute of Standards and Science) MPI Lab. Program and the National Research Foundation of Korea (NRF) grant funded by the Korea Government MSIT (2021R1C1C2006142). The Inter-university Semiconductor Research Center and Institute of Engineering Research at Seoul National University provided research facilities for this study.

Author Information

Jaehyun Kim

Graduate Student, Department of Materials Science and Engineering, Research Institute of Advanced Materials, Seoul National University

Ho Won Jang

Full Professor, Department of Materials Science and Engineering, Research Institute of Advanced Materials, Seoul National University

Full Professor, Advanced Institute of Convergence Technology, Seoul National University

References

X. Li, L. Zhao, J. Yu, X. Liu, X. Zhang, H. Liu and W. Zhou, Nano-Micro Lett., 12, 131 (2020). 10.1007/s40820-020-00469-3; PMID 34138146; PMCID PMC7770753

M. Yu, E. Budiyanto and H. Tüysüz, Angew. Chem., Int. Ed., 61, e202103824 (2022). 10.1002/anie.202103824

A. J. Medford, M. R. Kunz, S. M. Ewing, T. Borders and R. Fushimi, ACS Catal., 8, 7403 (2018). 10.1021/acscatal.8b01708

Z. W. Seh, J. Kibsgaard, C. F. Dickens, I. Chorkendorff, J. K. Nørskov and T. F. Jaramillo, Science, 355, 4998 (2017). 10.1126/science.aad4998; PMID 28082532

D. H. Mok, W. Lee, J. Kim, H. D. Jung, H. Y. Jang, S. Moon, C. Lee and S. Back, Ceramist, 25, 126 (2022). 10.31613/ceramist.2022.25.2.08

J. Kim, D. Kang, S. Kim and H. W. Jang, ACS Mater. Lett., 3, 1151 (2021). 10.1021/acsmaterialslett.1c00204

K. Tran and Z. W. Ulissi, Nat. Catal., 1, 696 (2018). 10.1038/s41929-018-0142-1

X. Mao, L. Wang, Y. Xu, P. Wang, Y. Li and J. Zhao, npj Comput. Mater., 7, 46 (2021). 10.1038/s41524-021-00514-8

L. Ge, H. Yuan, Y. Min, L. Li, S. Chen, L. Xu and W. A. Goddard, J. Phys. Chem. Lett., 11, 869 (2020). 10.1021/acs.jpclett.9b03875; PMID 31927930

J. Zheng, X. Sun, C. Qiu, Y. Yan, Z. Yao, S. Deng, X. Zhong, G. Zhuang, Z. Wei and J. Wang, J. Phys. Chem. C, 124, 13695 (2020). 10.1021/acs.jpcc.0c02265

X. Wang, C. Wang, S. Ci, Y. Ma, T. Liu, L. Gao, P. Qian, C. Ji and Y. Su, J. Mater. Chem. A, 8, 23488 (2020). 10.1039/D0TA06583H

Y. Zhang and X. Xu, ACS Omega, 5, 15344 (2020). 10.1021/acsomega.0c01438; PMID 32637808; PMCID PMC7331044

R. Kumar and A. K. Singh, npj Comput. Mater., 7, 197 (2021). 10.1038/s41524-021-00669-4

J. Kim, S. Choi, J. Cho, S. Y. Kim and H. W. Jang, ACS Mater. Au, 2, 1 (2022). 10.1021/acsmaterialsau.1c00041; PMID 36855696; PMCID PMC9888646

M. Ha, D. Y. Kim, M. Umer, V. Gladkikh, C. W. Myung and K. S. Kim, Energy Environ. Sci., 14, 3455 (2021). 10.1039/D1EE00154J

S. Back, K. Tran and Z. W. Ulissi, ACS Appl. Mater. Interfaces, 12, 38256 (2020). 10.1021/acsami.0c11821; PMID 32799519

A. Jain, S. P. Ong, G. Hautier, W. Chen, W. D. Richards, S. Dacek, S. Cholia, D. Gunter, D. Skinner, G. Ceder and K. A. Persson, APL Mater., 1, 011002 (2013). 10.1063/1.4812323

R. A. Flores, C. Paolucci, K. T. Winther, A. Jain, J. A. G. Torres, M. Aykol, J. Montoya, J. K. Nørskov, M. Bajdich and T. Bligaard, Chem. Mater., 32, 5854 (2020). 10.1021/acs.chemmater.0c01894

Z. Li, L. E. K. Achenie and H. Xin, ACS Catal., 10, 4377 (2020). 10.1021/acscatal.9b05248

B. Weng, Z. Song, R. Zhu, Q. Yan, Q. Sun, C. G. Grice, Y. Yan and W.-J. Yin, Nat. Commun., 11, 3513 (2020). 10.1038/s41467-020-17263-9; PMID 32665539; PMCID PMC7360597

L. Wu, T. Guo and T. Li, iScience, 24, 102398 (2021). 10.1016/j.isci.2021.102398; PMID 33997683; PMCID PMC8099497

Z. Wang, Y. Gu, L. Zheng, J. Hou, H. Zheng, S. Sun and L. Wang, Adv. Mater., 34, 2106776 (2022). 10.1002/adma.202106776; PMID 34964178

X. Jiang, Y. Wang, B. Jia, X. Qu and M. Qin, ACS Appl. Mater. Interfaces, 14, 41141 (2022). 10.1021/acsami.2c13435; PMID 36044226

K. T. Winther, M. J. Hoffmann, J. R. Boes, O. Mamun, M. Bajdich and T. Bligaard, Sci. Data, 6, 75 (2019). 10.1038/s41597-019-0081-y; PMID 31138816; PMCID PMC6538711

L. Chanussot, A. Das, S. Goyal, T. Lavril, M. Shuaibi, M. Riviere, K. Tran, J. Heras-Domingo, C. Ho, W. Hu, A. Palizhati, A. Sriram, B. Wood, J. Yoon, D. Parikh, C. L. Zitnick and Z. Ulissi, ACS Catal., 11, 6059 (2021). 10.1021/acscatal.0c04525

H. R. Kwon, H. Park, S. E. Jun, S. Choi and H. W. Jang, Chem. Commun., 58, 7874 (2022). 10.1039/D2CC02423C; PMID 35766059

A. Kumar, V. Q. Bui, J. Lee, L. Wang, A. R. Jadhav, X. Liu, X. Shao, Y. Liu, J. Yu, Y. Hwang, H. T. D. Bui, S. Ajmal, M. G. Kim, S.-G. Kim, G.-S. Park, Y. Kawazoe and H. Lee, Nat. Commun., 12, 6766 (2021). 10.1038/s41467-021-27145-3; PMID 34799571; PMCID PMC8604929

S. E. Jun, J. K. Lee and H. W. Jang, Energy Adv., 2, 34 (2023). 10.1039/D2YA00231K

H. Seo, K. H. Cho, H. Ha, S. Park, J. S. Hong, K. Jin and K. T. Nam, J. Korean Ceram. Soc., 54, 1 (2017). 10.4191/kcers.2017.54.1.12

S. M. Lee, W. S. Cheon, M. G. Lee and H. W. Jang, Small Struct., 2022, 2200236 (2022). 10.1002/sstr.202200236

S. A. Lee, M. G. Lee and H. W. Jang, Sci. China Mater., 65, 3334 (2022). 10.1007/s40843-022-2111-2

B. R. Lee, S. Choi, W. S. Cheon, J. W. Yang, M. G. Lee, S. H. Park and H. W. Jang, Electron. Mater. Lett., 18, 391 (2022). 10.1007/s13391-022-00346-8

S. A. Lee, J. W. Yang, S. Choi and H. W. Jang, Exploration, 1, 20210012 (2021). 10.1002/EXP.20210012

S. A. Lee, J. Kim, K. C. Kwon, S. H. Park and H. W. Jang, Carbon Neutralization, 1, 26 (2022). 10.1002/cnl2.9

Korean Journal of Materials Research ISSN:1225-0562(Print) 2287-7258(Online) 한국재료학회지
