Efficient and generalizable nested Fourier-DeepONet for three-dimensional geological carbon sequestration

Jonathan E. Lee Department of Chemical and Environmental Engineering, Yale University, New Haven, CT 06511, USA Min Zhu Department of Statistics and Data Science, Yale University, New Haven, CT 06511, USA Ziqiao Xi Department of Computer Science and Engineering, University of California, San Diego, CA 92093, USA Kun Wang ExxonMobil Technology and Engineering Company, Annandale, NJ 08801, USA Yanhua O. Yuan ExxonMobil Technology and Engineering Company, Annandale, NJ 08801, USA Corresponding author. Email: yanhua.yuan@exxonmobil.com, lu.lu@yale.edu Lu Lu Department of Statistics and Data Science, Yale University, New Haven, CT 06511, USA Corresponding author. Email: yanhua.yuan@exxonmobil.com, lu.lu@yale.edu
Abstract

Geological carbon sequestration (GCS) involves injecting CO2 into subsurface geological formations for permanent storage. Numerical simulations could guide decisions in GCS projects by predicting CO2 migration pathways and the pressure distribution in storage formation. However, these simulations are often computationally expensive due to highly coupled physics and large spatial-temporal simulation domains. Surrogate modeling with data-driven machine learning has become a promising alternative to accelerate physics-based simulations. Among these, the Fourier neural operator (FNO) has been applied to three-dimensional synthetic subsurface models. Here, to further improve performance, we have developed a nested Fourier-DeepONet by combining the expressiveness of the FNO with the modularity of a deep operator network (DeepONet). This new framework is twice as efficient as a nested FNO for training and has at least 80% lower GPU memory requirement due to its flexibility to treat temporal coordinates separately. These performance improvements are achieved without compromising prediction accuracy. In addition, the generalization and extrapolation ability of nested Fourier-DeepONet beyond the training range has been thoroughly evaluated. Nested Fourier-DeepONet outperformed the nested FNO for extrapolation in time with more than 50% reduced error. It also exhibited good extrapolation accuracy beyond the training range in terms of reservoir properties, number of wells, and injection rate.

Keywords:

geological carbon sequestration; 3D multiphase flow in porous media; deep neural operator; Fourier-DeepONet; computational cost; extrapolation

1 Introduction

Geological carbon sequestration (GCS) is an essential technology to reduce CO2 emissions at scale. High-fidelity simulations of plume migration pathways and pressure distribution during and after CO2 injection play an important role in decision making in carbon storage projects. The problem can be modeled by solving a multiphase flow of CO2 and water, governed by Darcy’s law, which is expressed by nonlinear partial differential equations (PDEs) [31, 2]. Numerical simulations, such as finite difference, finite element, or finite volume methods, provide insights to guide informed decisions about subsurface scenarios. These simulations allow the prediction of how much CO2 will migrate through sedimentary rocks for effective site selection and injection designs. However, this approach is computationally demanding for performing optimization tasks or decision-making processes, due to not only the multiphysics, multiscale, and multiphase nature of this problem but also the need for high grid resolution [31, 2, 30, 10, 34, 21]. As a result, the trade-off between accuracy and computational cost is inherent in such simulation methods. One technique to address this issue involves subdividing the grid into multiple resolutions, thereby reducing overall computational costs. Local grid refinement (LGR) technique is typically employed to maintain solver accuracy near the CO2 injection location [36]. This method discretizes the region around injection wells with finer grids and gradually coarsens regions further away from them [3, 11, 12, 19]. This approach enables the simulation model to capture the complex physics of CO2 plume migration around the wells with minimal loss of accuracy. However, it is still computationally expensive to use such techniques to model real-world scenarios.

Machine learning (ML) methods have emerged as a potential alternative to traditional numerical approaches due to their capability to learn PDEs and make real-time predictions [5, 40, 8, 45, 29, 33, 35, 17]. Recent advances in ML have shown significant progress in solving subsurface flow problems [34, 36, 45, 29, 33, 35, 37]. ML techniques offer rapid inference, making them particularly useful for tasks such as probabilistic assessment, which require multiple forward simulations. Notably, physics-informed neural networks (PINNs) and their extensions [20, 15, 41, 27] have gained popularity from their ability to make informed predictions in various scientific and engineering applications and already have shown success in many applications [5, 40, 8, 13, 7, 39, 14, 38]. Despite their successes, PINNs struggle with computational cost in real-world scenarios due to the necessity for separate training if DE conditions vary. For example, in GCS, various reservoir conditions and varying injection schemes change the CO2 migration behavior, which makes PINNs inefficient for this application.

Recently, deep neural operators have gained significant attention because of their generalizability in learning the PDE family rather than a single PDE with fixed parameters. They do not require separate training for new conditions [23, 22, 24, 44, 43, 16, 28, 9, 26, 42, 6, 18]. Furthermore, in the field of GCS, a nested Fourier neural operator (FNO) framework has been used to simulate pressure buildup and carbon saturation in three-dimensional (3D) spatial and temporal domains, mirroring real-world scenarios [36]. Nested FNO adopts a similar idea as LGR technique using a separate model per each level of resolution. Despite using this technique, due to the large temporal and 3D spatial domain it still requires a couple of days of training for each level of resolution, and a GPU with large memory (>>>30 GiB). Furthermore, because FNO uses different channels to predict solutions at different times, it performs poorly in predicting unseen times [16].

In our study, we aim to address the challenges arising from computational cost and poor generalizability. We develop a nested Fourier-DeepONet by employing a deep neural operator called Fourier-DeepONet [43, 16], which integrates the cutting-edge FNO [22] with the deep operator network (DeepONet) [23]. Nested Fourier-DeepONet offers significant improvements in GPU memory usage and training time compared to the state-of-the-art nested FNO [36] owing to its design of incorporating separate networks for handling temporal coordinates. This nificantly increases computational efficiency with at least 50% reduction in GPU memory consumption and 50% decrease in training time. Additionally, a nested Fourier-DeepONet demonstrates promising results in extrapolation in terms of sampling parameters that include various injection schemes, subsurface conditions, and temporal coordinates.

The paper is organized as follows. In Section 2, we introduce the problem setup, including the governing equations of multiphase flow and the dataset generated by numerical simulation. In Section 3, after providing a brief overview of DeepONet and FNO, we propose a nested Fourier-DeepONet. In Section 4, we highlight the superior computational efficiency of the nested Fourier-DeepONet compared to the nested FNO and demonstrate its generalization capabilities. We summarize the significance of this work in Section 5.

2 Problem setup

In this work, we investigate carbon migration in varying scenarios of injection schemes and subsurface conditions for GCS. We model 3D GCS as a multiphase flow of CO2 and water in a porous media to predict pressure buildup and gas saturation when injecting CO2 underground over 30 years. To alleviate the computational cost arising from the large spatial and temporal domain, the local refinement technique is employed for data generation, and a nested ML framework is constructed accordingly to make predictions at varying discretizations.

2.1 Governing equations

The problem of GCS is governed by mass balance equations for CO2 and water, where the flux of each phase is determined by an extension of Darcy’s law, which describes fluid flow through a porous medium [31] as follows:

(ϕpSpρpXpCO2)t=[𝐅CO2|adv+𝐅CO2|dif]+qCO2,italic-ϕsubscript𝑝subscript𝑆𝑝subscript𝜌𝑝subscriptsuperscript𝑋𝐶subscript𝑂2𝑝𝑡delimited-[]evaluated-atsuperscript𝐅𝐶subscript𝑂2𝑎𝑑𝑣evaluated-atsuperscript𝐅𝐶subscript𝑂2𝑑𝑖𝑓superscript𝑞𝐶subscript𝑂2\frac{\partial\left(\phi\sum_{p}S_{p}\rho_{p}X^{CO_{2}}_{p}\right)}{\partial t% }=-\nabla\cdot\left[\mathbf{F}^{CO_{2}}|_{adv}+\mathbf{F}^{CO_{2}}|_{dif}% \right]+q^{CO_{2}},divide start_ARG ∂ ( italic_ϕ ∑ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_S start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_ρ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_X start_POSTSUPERSCRIPT italic_C italic_O start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) end_ARG start_ARG ∂ italic_t end_ARG = - ∇ ⋅ [ bold_F start_POSTSUPERSCRIPT italic_C italic_O start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT | start_POSTSUBSCRIPT italic_a italic_d italic_v end_POSTSUBSCRIPT + bold_F start_POSTSUPERSCRIPT italic_C italic_O start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT | start_POSTSUBSCRIPT italic_d italic_i italic_f end_POSTSUBSCRIPT ] + italic_q start_POSTSUPERSCRIPT italic_C italic_O start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT , (1)
(ϕpSpρpXpwater)t=[𝐅water|adv+𝐅water|dif],italic-ϕsubscript𝑝subscript𝑆𝑝subscript𝜌𝑝subscriptsuperscript𝑋𝑤𝑎𝑡𝑒𝑟𝑝𝑡delimited-[]evaluated-atsuperscript𝐅𝑤𝑎𝑡𝑒𝑟𝑎𝑑𝑣evaluated-atsuperscript𝐅𝑤𝑎𝑡𝑒𝑟𝑑𝑖𝑓\frac{\partial\left(\phi\sum_{p}S_{p}\rho_{p}X^{water}_{p}\right)}{\partial t}% =-\nabla\cdot\left[\mathbf{F}^{water}|_{adv}+\mathbf{F}^{water}|_{dif}\right],divide start_ARG ∂ ( italic_ϕ ∑ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_S start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_ρ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_X start_POSTSUPERSCRIPT italic_w italic_a italic_t italic_e italic_r end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) end_ARG start_ARG ∂ italic_t end_ARG = - ∇ ⋅ [ bold_F start_POSTSUPERSCRIPT italic_w italic_a italic_t italic_e italic_r end_POSTSUPERSCRIPT | start_POSTSUBSCRIPT italic_a italic_d italic_v end_POSTSUBSCRIPT + bold_F start_POSTSUPERSCRIPT italic_w italic_a italic_t italic_e italic_r end_POSTSUPERSCRIPT | start_POSTSUBSCRIPT italic_d italic_i italic_f end_POSTSUBSCRIPT ] , (2)
𝐅η|adv=pXηFp=pXη(kkr,pρpμp(Ppρp𝐠)),evaluated-atsuperscript𝐅𝜂𝑎𝑑𝑣subscript𝑝superscript𝑋𝜂subscript𝐹𝑝subscript𝑝superscript𝑋𝜂𝑘subscript𝑘𝑟𝑝subscript𝜌𝑝subscript𝜇𝑝subscript𝑃𝑝subscript𝜌𝑝𝐠\mathbf{F}^{\eta}|_{adv}=\sum_{p}X^{\eta}F_{p}=\sum_{p}X^{\eta}\left(-k\frac{k% _{r,p}\rho_{p}}{\mu_{p}}(\nabla P_{p}-\rho_{p}\mathbf{g})\right),bold_F start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT | start_POSTSUBSCRIPT italic_a italic_d italic_v end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_X start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT italic_F start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_X start_POSTSUPERSCRIPT italic_η end_POSTSUPERSCRIPT ( - italic_k divide start_ARG italic_k start_POSTSUBSCRIPT italic_r , italic_p end_POSTSUBSCRIPT italic_ρ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT end_ARG start_ARG italic_μ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT end_ARG ( ∇ italic_P start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT - italic_ρ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT bold_g ) ) ,
Pn=Pw+Pc.subscript𝑃𝑛subscript𝑃𝑤subscript𝑃𝑐P_{n}=P_{w}+P_{c}.italic_P start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT = italic_P start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT + italic_P start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT .

Eqs. (1) and (2) represent the law of conservation of mass for the two species within the system: CO2 and water. On the left side of the equations, ϕitalic-ϕ\phiitalic_ϕ denotes the porosity of the media, S𝑆Sitalic_S represents the saturation, ρ𝜌\rhoitalic_ρ stands for density, and X𝑋Xitalic_X signifies the mass fraction of each species, with the subscript p𝑝pitalic_p indicating the phase. Lastly, qCO2superscript𝑞𝐶subscript𝑂2q^{CO_{2}}italic_q start_POSTSUPERSCRIPT italic_C italic_O start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT denotes the injection rate of CO2.

Although molecular diffusion and hydrodynamic dispersion, FdifsubscriptF𝑑𝑖𝑓\textbf{F}_{dif}F start_POSTSUBSCRIPT italic_d italic_i italic_f end_POSTSUBSCRIPT, are not explicitly simulated, they are implicitly considered as part of numerical diffusion and dispersion resulting from finite difference gradient approximation [36]. The advective mass flux, denoted as FadvsubscriptF𝑎𝑑𝑣\textbf{F}_{adv}F start_POSTSUBSCRIPT italic_a italic_d italic_v end_POSTSUBSCRIPT, for component η𝜂\etaitalic_η, is governed by Darcy’s law for each phase. Here, k𝑘kitalic_k denotes absolute permeability, P𝑃Pitalic_P stands for fluid pressure, g represents gravitational acceleration, krsubscript𝑘𝑟k_{r}italic_k start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT signifies relative permeability, and μ𝜇\muitalic_μ indicates viscosity. The fluid pressure for the non-wetting phase, Pnsubscript𝑃𝑛P_{n}italic_P start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT, is the sum of the wetting phase pressure, Pwsubscript𝑃𝑤P_{w}italic_P start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT, and the capillary pressure, Pcsubscript𝑃𝑐P_{c}italic_P start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT.

2.2 Nested grid refinement framework

In this work, we aim to predict pressure buildup and gas saturation over a 30-year period based on reservoir conditions (i.e., permeability field, temperature, and initial pressure) and the injection scheme (i.e., injection rate and location) (Fig. 1).

Refer to caption
Figure 1: Nested machine learning pipeline for predicting the pressure buildup and gas saturation for five levels with gradually increasing resolution. There are two groups of inputs which include reservoir condition (3D permeability field, temperature, 3D initial pressure) and injection scheme (injection rate, and injection location). Each level starting from level 1 uses outputs from the previous level (grey arrows) on top of the reservoir condition and injection scheme (black arrows).

Grid resolution significantly impacts the accuracy of simulations, creating a trade-off between accuracy and computational cost. As discussed in the Introduction, local refinement technique is commonly used to address this challenge. This refinement technique uses increasing grid resolution with decreasing distance from the injection location. In this study, we employ the grid refinement used in Ref. [36]. In total, we have five levels of resolutions where level 0 indicates a global level, and four remaining levels (1–4) represent increased resolution near the well (Fig. 2). A higher level indicates a finer resolution and closer distance to the injection location. The details of the discretization are presented in Table S1 of Wen et al. [36].

Refer to caption
Figure 2: 2-Dimensional sample visualizations of a global grid and local refinements in the dataset. (A) x𝑥xitalic_x-y𝑦yitalic_y view. (B) y𝑦yitalic_y-z𝑧zitalic_z view. The boundaries of different computational domains are drawn in black, red, orange, blue, and green solid lines for levels 0, 1, 2, 3, and 4, respectively. zsuperscript𝑧z^{\prime}italic_z start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT-coordinate is the depth relative to the top of the simulation domain.

As we have five levels of resolutions for predicting pressure buildup and gas saturation, we train one ML model per level (Fig. 1). The output from each level’s model is used as an input for the subsequent level’s model, in conjunction with reservoir conditions and the injection scheme. Among the five resolution levels, only a model at level 0 does not use predictions from a previous level’s, as it is the initial stage of the nested framework. Consequently, this results in a total of five models for predicting pressure buildup. In contrast, there are four models for gas saturation prediction, as no model is trained for gas saturation at level 0.

2.3 Dataset

We utilize an open-source dataset provided by Wen et al. [36], obtained from the full physics numerical simulator ECLIPSE (e300) [1]. The spatial domain of reservoir sampling includes depths ranging from 800 to 4,500 m, with the temperature calculated based on depth and geothermal gradient. For each reservoir, the simulation domain spans a 100-meter depth relative to the top surface, where the upper bound is 800 m. To prevent interactions between different wells, each well is located at least 5,000 m apart, with injection rates ranging from 0.5 to 2.0 MT/y and perforation thicknesses varying from 20 to 100 m. The dataset also considers the variation of the reservoir dip angles from 0 to 2°. Stanford Geostatistical Modeling Software (SGeMS) [32] was used to produce heterogeneous permeability fields. Table S2 from Wen et al. summarizes the parameters used in generating the dataset, encompassing a wide range of inputs that cover potential scenarios for CO2 injection into saline formations [36].

The dataset comprises a total of 3,009 reservoir simulations, each featuring between 1 to 4 wells, resulting in a sum of 7,455 local-level simulations across all wells. 80% of the dataset is used for training, which includes 24 time snapshots, with time intervals exponentially increasing from 1 day to 30 years.

3 Methods

While scientific machine learning has seen numerous attempts to apply deep neural operators to two-dimensional (2D) problems, there has been a lack of efforts to model the 3D spatial domain. Wen et al. [36] initially employed a nested FNO architecture, demonstrating good accuracy for 3D GCS. However, this architecture suffers from high computational costs and discontinuity of the solution in terms of temporal coordinates [36, 16]. In response, we propose a nested Fourier-DeepONet, which offers improved performance compared to FNO in terms of prediction accuracy while significantly reducing computational cost.

3.1 Fourier-DeepONet

We develop a 3D Fourier-DeepONet based on the 2D Fourier-DeepONet proposed in Refs. [43, 16], which is a hybrid architecture combining a deep operator network (DeepONet) [23] and a Fourier neural operator (FNO) [22]. DeepONet extends the theorem proposed by Chen and Chen [4], which asserts that nonlinear operator mappings between input and output functions can be learned from data using neural networks. On the other hand, the FNO parameterizes the integral kernel in Fourier space [22]. FNO computes convolutions in Fourier space rather than physical space to map the input function and the output function discretized on the same equispaced grid.

Refer to caption
Figure 3: Fourier-DeepONet architecture. The branch and trunk nets receive two groups of inputs, which are encoded and then decoded by the Fourier layers.

In Fourier-DeepONet (Fig. 3), we use DeepONet to encode input through a branch net: permeability map, reservoir temperature, initial pressure, and injection rates. At the same time, the trunk net takes the temporal coordinates t𝑡titalic_t as input. The outputs of branch net, 𝐛𝐛\mathbf{b}bold_b, and trunk net, 𝐜𝐜\mathbf{c}bold_c, are merged through pointwise multiplication:

𝐳0=𝐛𝐜,subscript𝐳0direct-product𝐛𝐜\mathbf{z}_{0}=\mathbf{b}\odot\mathbf{c},bold_z start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = bold_b ⊙ bold_c ,

where 𝐳0subscript𝐳0\mathbf{z}_{0}bold_z start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT represents the output of the branch-trunk merge operation, serving as the input to the first Fourier layer.

Fourier-DeepONet consists of L𝐿Litalic_L Fourier layers which are then applied iteratively starting from the transformed input 𝐳0subscript𝐳0\mathbf{z}_{0}bold_z start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT on the same mesh as the outputs (pressure buildup, gas saturation). Each Fourier layer performs the following:

1(l(𝐳l)),superscript1subscript𝑙subscript𝐳𝑙\mathcal{F}^{-1}(\mathcal{R}_{l}\cdot\mathcal{F}(\mathbf{z}_{l})),caligraphic_F start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( caligraphic_R start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT ⋅ caligraphic_F ( bold_z start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT ) ) ,

where \mathcal{F}caligraphic_F uses 3D Fast Fourier Transform (FFT) and 1superscript1\mathcal{F}^{-1}caligraphic_F start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT uses the inverse 3D FFT. And, 𝐳lsubscript𝐳𝑙\mathbf{z}_{l}bold_z start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT denotes the l𝑙litalic_lth Fourier layer output. Rlsubscript𝑅𝑙R_{l}italic_R start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT is the weight matrix applied in the Fourier space. Another weight matrix Wlsubscript𝑊𝑙W_{l}italic_W start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT is used for a residual connection to compute the output of the (j𝑗jitalic_j+1)th Fourier layer output, 𝐳l+1subscript𝐳𝑙1\mathbf{z}_{l+1}bold_z start_POSTSUBSCRIPT italic_l + 1 end_POSTSUBSCRIPT, using 𝐳lsubscript𝐳𝑙\mathbf{z}_{l}bold_z start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT as input. Therefore, output of each Fourier layer can be defined as follows:

𝐳l+1=σ(1(l(𝐳l)+Wl𝐳l+bl),\mathbf{z}_{l+1}=\sigma(\mathcal{F}^{-1}(\mathcal{R}_{l}\cdot\mathcal{F}(% \mathbf{z}_{l})+W_{l}\cdot\mathbf{z}_{l}+b_{l}),bold_z start_POSTSUBSCRIPT italic_l + 1 end_POSTSUBSCRIPT = italic_σ ( caligraphic_F start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( caligraphic_R start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT ⋅ caligraphic_F ( bold_z start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT ) + italic_W start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT ⋅ bold_z start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT + italic_b start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT ) ,

where blsubscript𝑏𝑙b_{l}italic_b start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT is the bias at l𝑙litalic_lth Fourier layer and σ𝜎\sigmaitalic_σ is the nonlinear activation function. Final output of the L𝐿Litalic_L Fourier layer is then locally transformed back to the output function dimension parametrized by a fully-connected neural network, Q𝑄Qitalic_Q, as

u(x,y,z)=Q(𝐳L(x,y,z)).ux,y,z𝑄subscript𝐳𝐿x,y,z\textit{{u}}(\textit{x,y,z})=Q(\mathbf{z}_{L}(\textit{x,y,z})).u ( x,y,z ) = italic_Q ( bold_z start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ( x,y,z ) ) .

In this study, 4 Fourier layers are utilized to decode the output.

Fourier-DeepONet is computationally more efficient than FNO because Fourier-DeepONet uses 3D FFT while FNO uses 4D FFT. This reduction occurs because the output of the trunk net undergoes pointwise multiplication with the branch net output, and thus the final output of Fourier-DeepONet does not include the time dimension. Additionally, Fourier-DeepONet can adopt a batch size for both branch and trunk nets, while FNO is incapable of adjusting the batch size for the time coordinates for training. This further reduces the computational cost compared with using a full time batch size.

3.2 Nested Fourier-DeepONet

As discussed in Section 2.2, to predict pressure buildup, we have five levels of grid resolution that result in five different Fourier-DeepONets. For gas saturation prediction, we use four different Fourier-DeepONets for the four resolution levels. We train each Fourier-DeepONet independently at each level for each output and make a nested prediction starting from level 0 up to level 4.

3.2.1 Networks and resolutions

Fourier-DeepONet used for each level is denoted as 𝒩iPsubscriptsuperscript𝒩𝑃𝑖\mathcal{N}^{P}_{i}caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, where i𝑖iitalic_i in the subscript denotes the level of resolution in Fig. 1 for pressure buildup prediction. Similarly, for the prediction of gas saturation, we denote the neural operator as 𝒩iSsubscriptsuperscript𝒩𝑆𝑖\mathcal{N}^{S}_{i}caligraphic_N start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and we only train from level 1 to 4. The spatial domains of Fourier-DeepONets at five different levels are Ω0,Ω1,Ω2,Ω3,Ω4subscriptΩ0subscriptΩ1subscriptΩ2subscriptΩ3subscriptΩ4\Omega_{0},\Omega_{1},\Omega_{2},\Omega_{3},\Omega_{4}roman_Ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , roman_Ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , roman_Ω start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , roman_Ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT , roman_Ω start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT (Fig. 2). The entire spatial and temporal domain of each Fourier-DeepONet is D=Ωi×T𝐷subscriptΩ𝑖𝑇D=\Omega_{i}\times Titalic_D = roman_Ω start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT × italic_T, and T𝑇Titalic_T denotes the temporal domain up to 30 years. The architecture of each Fourier-DeepONet in nested Fourier-DeepONet, including the output shape of each operation for levels 0 to 4, is shown in Tables 34, and 5 in the Appendix A.

3.2.2 Training

For training 𝒩0Psubscriptsuperscript𝒩𝑃0\mathcal{N}^{P}_{0}caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, we use the reservoir condition and injection scheme as input to the neural operator. For training subsequent levels, we use the true output from the previous level as input in addition to the reservoir condition and injection scheme. Specifically, 𝒩1Psubscriptsuperscript𝒩𝑃1\mathcal{N}^{P}_{1}caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and 𝒩1Ssubscriptsuperscript𝒩𝑆1\mathcal{N}^{S}_{1}caligraphic_N start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT use pressure buildup in Ω0subscriptΩ0\Omega_{0}roman_Ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT surrounding each well in the x𝑥xitalic_x and y𝑦yitalic_y directions with a width and length of 48 km. For the remaining models, 𝒩iPsubscriptsuperscript𝒩𝑃𝑖\mathcal{N}^{P}_{i}caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and 𝒩iSsubscriptsuperscript𝒩𝑆𝑖\mathcal{N}^{S}_{i}caligraphic_N start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT for i𝑖iitalic_i = 2–4, the entire ground truth output of the previous level is used for training to predict pressure buildup or gas saturation.

3.2.3 Inference

For a reservoir with n𝑛nitalic_n wells, the final prediction is combined from (4×n)+14𝑛1(4\times n)+1( 4 × italic_n ) + 1 network predictions for either pressure buildup or gas saturation prediction. This is because there are four levels of refinement around each injection location, which results in four network predictions per well in addition to the global level prediction. The prediction of pressure buildup at the global level (level 0) around each well will be used as input for the next level (level 1) for pressure buildup and gas saturation per well. Then, each subsequent level i𝑖iitalic_i, where i𝑖iitalic_i = 2–4, will be provided with entire outputs from the previous level i1𝑖1i-1italic_i - 1. When different levels of resolution are combined to construct the entire spatial domain, the subsequent level replaces the overlapping region.

Although we use ground truth output from the previous level as an input for training purposes, it is not possible to obtain ground truth from the previous level in inference scenarios, which necessitates a sequential prediction as described above. Because the nested architecture is designed in such a way that subsequent levels depend on predictions from the previous level, it is susceptible to accumulating errors. To address this, a fine-tuning step is performed to enhance the generalizability.

3.3 Fine-tuning procedure

We adopt a fine-tuning procedure by adding a random perturbation to the network inputs as introduced in Ref. [36]. Specifically for training the subsequent level’s model, we perturb the ground truth output from the previous level by adding the randomly sampled model error. For instance, after the training procedure described in Section 3.2.2, to fine-tune the pressure buildup model at level i𝑖iitalic_i, the model is further trained using the noised input computed as follows:

Pi1=Pi1+ϵi1.subscriptsuperscript𝑃𝑖1subscript𝑃𝑖1subscriptitalic-ϵ𝑖1P^{\prime}_{i-1}=P_{i-1}+\epsilon_{i-1}.italic_P start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT = italic_P start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT + italic_ϵ start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT .

Pi1subscriptsuperscript𝑃𝑖1P^{\prime}_{i-1}italic_P start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT corresponds to noised ground truth pressure buildup (Pi1subscript𝑃𝑖1P_{i-1}italic_P start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT). To generate the noise, ϵi1subscriptitalic-ϵ𝑖1\epsilon_{i-1}italic_ϵ start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT, we apply the trained model at level i1𝑖1i-1italic_i - 1 to the training dataset and compute the error between the ground truth and the network predictions which results in a set of errors. Then, we randomly sample from the set of errors to obtain the error, ϵi1subscriptitalic-ϵ𝑖1\epsilon_{i-1}italic_ϵ start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT. The same procedure can be applied to fine-tuning gas saturation models. We follow Ref. [36] and apply this fine-tuning procedure to 𝒩1Psubscriptsuperscript𝒩𝑃1\mathcal{N}^{P}_{1}caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, 𝒩4Psubscriptsuperscript𝒩𝑃4\mathcal{N}^{P}_{4}caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT, 𝒩1Ssubscriptsuperscript𝒩𝑆1\mathcal{N}^{S}_{1}caligraphic_N start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, and 𝒩2Ssubscriptsuperscript𝒩𝑆2\mathcal{N}^{S}_{2}caligraphic_N start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT.

3.4 Details of training and evaluation

As Fourier-DeepONet separates temporal coordinates through a trunk network, we have an additional option to select their batch size for training. We observe that GPU memory consumption grows linearly (Fig. 4A) and training time decreases with respect to the time batch size (Fig. 4B) at the global level’s pressure buildup model as an example. In this work, we used a batch size of 6 among the 24-time snapshots, as it significantly reduces GPU memory usage without sacrificing model accuracy (Fig. 4C) and training time efficiency.

Refer to caption
Figure 4: Effect of time batch size for Fourier-DeepONet. (A) GPU memory usage. (B) Training time per epoch. (C) Error, δΩjPsubscriptsuperscript𝛿𝑃subscriptΩ𝑗\delta^{P}_{\Omega_{j}}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT, for pressure buildup model at level 0 (𝒩0Psubscriptsuperscript𝒩𝑃0\mathcal{N}^{P}_{0}caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT). Networks are trained for 20 epochs.

For the branch net, we use a batch size of 1, identical to that of Wen et al.’s FNO [36]. We utilized the L2superscript𝐿2L^{2}italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT relative error for training the nested Fourier-DeepONet. The learning rate is set to 0.001, and the Adam optimizer is employed with a decay rate of 0.9 every two epochs. To evaluate error between the model prediction (P^^𝑃\hat{P}over^ start_ARG italic_P end_ARG and S^^𝑆\hat{S}over^ start_ARG italic_S end_ARG) and ground truth (P𝑃Pitalic_P and S𝑆Sitalic_S), we adopt the same metric as used by Wen et al. [36]:

Error of 𝒩jP at level j=0,,4:δΩjP=1nTnΩjtTiΩj|Pt,iP^t,i|Pt,max,\text{Error of $\mathcal{N}^{P}_{j}$ at level }j=0,\dots,4:\quad\delta^{P}_{% \Omega_{j}}=\frac{1}{n_{T}n_{\Omega_{j}}}\sum_{t\in T}\sum_{i\in\Omega_{j}}% \frac{|P_{t,i}-\hat{P}_{t,i}|}{P_{t,max}},Error of caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT at level italic_j = 0 , … , 4 : italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT = divide start_ARG 1 end_ARG start_ARG italic_n start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT italic_n start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_ARG ∑ start_POSTSUBSCRIPT italic_t ∈ italic_T end_POSTSUBSCRIPT ∑ start_POSTSUBSCRIPT italic_i ∈ roman_Ω start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT divide start_ARG | italic_P start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT - over^ start_ARG italic_P end_ARG start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT | end_ARG start_ARG italic_P start_POSTSUBSCRIPT italic_t , italic_m italic_a italic_x end_POSTSUBSCRIPT end_ARG , (3)
Error of 𝒩jS at level j=1,,4:δΩjS=1It,itTiΩjIt,i|St,iS^t,i|.\begin{gathered}\text{Error of $\mathcal{N}^{S}_{j}$ at level }j=1,\dots,4:% \quad\delta^{S}_{\Omega_{j}}=\frac{1}{\sum I_{t,i}}\sum_{t\in T}\sum_{i\in% \Omega_{j}}I_{t,i}|S_{t,i}-\hat{S}_{t,i}|.\end{gathered}start_ROW start_CELL Error of caligraphic_N start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT at level italic_j = 1 , … , 4 : italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT = divide start_ARG 1 end_ARG start_ARG ∑ italic_I start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT end_ARG ∑ start_POSTSUBSCRIPT italic_t ∈ italic_T end_POSTSUBSCRIPT ∑ start_POSTSUBSCRIPT italic_i ∈ roman_Ω start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT | italic_S start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT - over^ start_ARG italic_S end_ARG start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT | . end_CELL end_ROW (4)

where It,i=1subscript𝐼𝑡𝑖1I_{t,i}=1italic_I start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT = 1 if St,i>0.01subscript𝑆𝑡𝑖0.01S_{t,i}>0.01italic_S start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT > 0.01 or |S^t,i|>0.01subscript^𝑆𝑡𝑖0.01|\hat{S}_{t,i}|>0.01| over^ start_ARG italic_S end_ARG start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT | > 0.01 which is used to prevent overestimation and focus on the error within the plume. Here, T𝑇Titalic_T denotes the nTsubscript𝑛𝑇n_{T}italic_n start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ( = 24) time snapshots over 30 years, given by T𝑇Titalic_T = {10d, 20d, 30d, 50d, 80d, 110d, 150d, 210d, 280d, 1.0y, 1.3y, 1.7y, 2.2y, 2.8y, 3.6y, 4.6y, 5.9y, 7.5y, 9.4y, 11.9y, 15.0y, 19.0y, 23.9y, 30y}. nΩjsubscriptsubscript𝑛Ω𝑗{n_{\Omega}}_{j}italic_n start_POSTSUBSCRIPT roman_Ω end_POSTSUBSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT represents the number of cells in the mesh of ΩjsubscriptΩ𝑗\Omega_{j}roman_Ω start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT. For pressure buildup error, δΩjPsubscriptsuperscript𝛿𝑃subscriptΩ𝑗\delta^{P}_{\Omega_{j}}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT, the absolute difference between the prediction, P^^𝑃\hat{P}over^ start_ARG italic_P end_ARG, and the ground truth, P𝑃Pitalic_P, at each level is normalized by the maximum pressure of the reservoir at each time step, Pt,maxsubscript𝑃𝑡𝑚𝑎𝑥P_{t,max}italic_P start_POSTSUBSCRIPT italic_t , italic_m italic_a italic_x end_POSTSUBSCRIPT. To evaluate the total error per reservoir, δΩSsubscriptsuperscript𝛿𝑆Ω\delta^{S}_{\Omega}italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω end_POSTSUBSCRIPT or δΩSsubscriptsuperscript𝛿𝑆Ω\delta^{S}_{\Omega}italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω end_POSTSUBSCRIPT, we use the following metric:

Total error of 𝒩0P,𝒩1P,,𝒩4P:δΩP=1nTnΩtTiΩ|Pt,iP^t,i|Pt,max,\text{Total error of }\mathcal{N}^{P}_{0},\mathcal{N}^{P}_{1},\dots,\mathcal{N% }^{P}_{4}:\quad\delta^{P}_{\Omega}=\frac{1}{n_{T}n_{\Omega}}\sum_{t\in T}\sum_% {i\in\Omega}\frac{|P_{t,i}-\hat{P}_{t,i}|}{P_{t,max}},Total error of caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT : italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω end_POSTSUBSCRIPT = divide start_ARG 1 end_ARG start_ARG italic_n start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT italic_n start_POSTSUBSCRIPT roman_Ω end_POSTSUBSCRIPT end_ARG ∑ start_POSTSUBSCRIPT italic_t ∈ italic_T end_POSTSUBSCRIPT ∑ start_POSTSUBSCRIPT italic_i ∈ roman_Ω end_POSTSUBSCRIPT divide start_ARG | italic_P start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT - over^ start_ARG italic_P end_ARG start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT | end_ARG start_ARG italic_P start_POSTSUBSCRIPT italic_t , italic_m italic_a italic_x end_POSTSUBSCRIPT end_ARG , (5)
Total error of 𝒩1S,,𝒩4S:δΩS=1It,itTiΩIt,i|St,iS^t,i|.\begin{gathered}\text{Total error of }\mathcal{N}^{S}_{1},\dots,\mathcal{N}^{S% }_{4}:\quad\delta^{S}_{\Omega}=\frac{1}{\sum I_{t,i}}\sum_{t\in T}\sum_{i\in% \Omega}I_{t,i}|S_{t,i}-\hat{S}_{t,i}|.\end{gathered}start_ROW start_CELL Total error of caligraphic_N start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , caligraphic_N start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT : italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω end_POSTSUBSCRIPT = divide start_ARG 1 end_ARG start_ARG ∑ italic_I start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT end_ARG ∑ start_POSTSUBSCRIPT italic_t ∈ italic_T end_POSTSUBSCRIPT ∑ start_POSTSUBSCRIPT italic_i ∈ roman_Ω end_POSTSUBSCRIPT italic_I start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT | italic_S start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT - over^ start_ARG italic_S end_ARG start_POSTSUBSCRIPT italic_t , italic_i end_POSTSUBSCRIPT | . end_CELL end_ROW (6)

4 Results

In this section, we highlight the superior computational efficiency of Fourier-DeepONet in terms of both GPU memory consumption and training time. Additionally, we demonstrate the generalization capabilities of the nested Fourier-DeepONet through various experiments, including temporal extrapolation and extrapolation on subsurface conditions and injection schemes. All training experiments were conducted using the NVIDIA A100 80GB GPU. All codes, implemented using the Python library DeepXDE [25], and the dataset will be provided in the following GitHub repository: https://github.com/lu-group/nested-fourier-deeponet-gcs-3d.

4.1 Computational efficiency

As discussed in Section 3.1, the reduced dimension of FFT and flexible batch size for time coordinates result in twice faster training and significant (at least 80%) reduction in GPU memory consumption (Table 1), enabling the training of Fourier-DeepONets even on GPUs with limited memory. Additionally, they lead to a substantial reduction of more than 80% in the number of trainable parameters, both at global and local resolutions, compared to FNO in Ref. [36].

Table 1: Computational cost comparison between training nested FNO and nested Fourier-DeepONet. Fourier-DeepONet uses a time batch size of 6, while FNO is required to use all 24 time batches.
Model Level No. of parameters GPU memory (GiB) Training time (hour)
FNO [36] Global 80.3 M 33.1 37.6
Local 1 150.5 M 18.5 41.7
Local 2–4 150.5 M 25.4 61.3
Fourier-DeepONet Global 13.1 M 4.9 14.9
Local 1 20.8 M 3.3 20.5
Local 2–4 20.8 M 5.0 28.3

4.2 Pressure buildup prediction

In addition to the computational efficiency, nested Fourier-DeepONet demonstrates similar or better accuracy for predicting pressure buildup. We compare the accuracy of Fourier-DeepONets and the state-of-the-art FNOs [36] at all levels in Table 2. As the code for computing the evaluation metrics from Ref. [36] was not public, we implemented our own code for Eqs. (3)–(6). For the accuracy of FNO, we include the accuracy from Ref. [36] and we also report the accuracy from our own FNOs trained using their public code but evaluated using our evaluation metric code.

Regardless of whether the models are fine-tuned or not, nested Fourier-DeepONet showcases superior performances across all levels compared with nested FNO we trained. Because errors in Table 2 accumulate at each subsequent level, to demonstrate the accuracy of each individual network, we present the test errors using ground truth input in Table 6 in Appendix B. For individual networks, Fourier-DeepONet has better accuracy for almost all levels except level 3 for the prediction of pressure buildup.

Table 2: Accuracy comparison for nested Fourier-DeepONet and nested FNO. For nested FNOs, we include the errors from Ref. [36] and the errors from our own FNOs trained using their public code. Bold font indicates the two smallest errors per row.
Nested Fourier-DeepONet Nested FNO Nested FNO [36]
Without fine-tuning With fine-tuning Without fine-tuning With fine-tuning Without fine-tuning With fine-tuning
Pressure δΩPsubscriptsuperscript𝛿𝑃Ω\delta^{P}_{\Omega}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω end_POSTSUBSCRIPT 0.668% 0.565% 0.833% 0.625% 0.47%
δΩ0Psubscriptsuperscript𝛿𝑃subscriptΩ0\delta^{P}_{\Omega_{0}}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 0.020% 0.020% 0.025% 0.025% 0.02% 0.02%
δΩ1Psubscriptsuperscript𝛿𝑃subscriptΩ1\delta^{P}_{\Omega_{1}}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 0.174% 0.173% 0.222% 0.233% 0.24% 0.16%
δΩ2Psubscriptsuperscript𝛿𝑃subscriptΩ2\delta^{P}_{\Omega_{2}}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 0.329% 0.318% 0.400% 0.374% 0.41% 0.30%
δΩ3Psubscriptsuperscript𝛿𝑃subscriptΩ3\delta^{P}_{\Omega_{3}}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 0.557% 0.550% 0.654% 0.576% 0.68% 0.51%
δΩ4Psubscriptsuperscript𝛿𝑃subscriptΩ4\delta^{P}_{\Omega_{4}}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 1.433% 1.110% 1.801% 1.205% 1.88% 0.82%
Saturation δΩSsubscriptsuperscript𝛿𝑆Ω\delta^{S}_{\Omega}italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω end_POSTSUBSCRIPT 1.518% 1.461% 1.663% 1.622% 1.79%
δΩ1Ssubscriptsuperscript𝛿𝑆subscriptΩ1\delta^{S}_{\Omega_{1}}italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 1.331% 1.290% 1.650% 1.419% 1.55% 1.39%
δΩ2Ssubscriptsuperscript𝛿𝑆subscriptΩ2\delta^{S}_{\Omega_{2}}italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 1.675% 1.569% 1.936% 1.802% 1.93% 1.91%
δΩ3Ssubscriptsuperscript𝛿𝑆subscriptΩ3\delta^{S}_{\Omega_{3}}italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 1.649% 1.583% 1.888% 1.737% 2.00% 1.77%
δΩ4Ssubscriptsuperscript𝛿𝑆subscriptΩ4\delta^{S}_{\Omega_{4}}italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 1.439% 1.380% 1.563% 1.545% 1.95% 1.82%
Refer to caption
Figure 5: Examples of nested Fourier-DeepONet pressure buildup prediction for three different reservoirs on the top surface of the reservoir. (A) 2 injection wells. (B) 3 injection wells. (C) 4 injection wells.

We show pressure buildup predictions and errors (xy𝑥𝑦xyitalic_x italic_y-view at the top layer) of three randomly selected reservoirs with 2–4 wells at 6 different time snapshots, T𝑇Titalic_T = {50d, 1.0y, 7.5y, 19.0y, 23.9y, and 30y} in Fig. 5. At early times where there is no significant pressure gradient, the error is the lowest. The error increases with time as the injection of CO2 results in the buildup of pressure. For the pressure buildup of the first example (Fig. 5A), both cross-sectional xy𝑥𝑦xyitalic_x italic_y-view at the top layer and xz𝑥𝑧xzitalic_x italic_z-view around the injection point (level 1–4) are shown in Fig. 6A. By employing locally refined models, the errors are significantly reduced through predictions at refined resolutions.

Refer to caption
Figure 6: Nested Fourier-DeepONet prediction at local refinements corresponding to the first example of Fig. 5A at 30 years. (A) Pressure buildup. (B) Gas saturation.

4.3 Gas saturation prediction

For gas saturation network models, the accuracy of nested Fourier-DeepONet and nested FNO is also shown in Table 2. As mentioned in Section 3.4, we do not train the gas saturation model at the global level. As we were able to achieve better accuracy at the global level (δΩ0Psubscriptsuperscript𝛿𝑃subscriptΩ0\delta^{P}_{\Omega_{0}}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT) and level 1 (δΩ1Ssubscriptsuperscript𝛿𝑆subscriptΩ1\delta^{S}_{\Omega_{1}}italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT) for individual prediction (Table 6), the cumulative error arising from earlier levels resulted in lower accuracy with Fourier-DeepONets for both fine-tuned models and those without tuning. This highlights the importance of achieving robust accuracy at earlier levels when using a nested framework.

As shown in Fig. 6B, the CO2 plume footprint is contained within the domain of level 1 even at 30 years of CO2 injection. The error tends to be the largest at the boundary of the footprint in addition to the injection location.

4.4 Generalization and extrapolation capability of nested Fourier-DeepONet

To further investigate the generalizability of nested Fourier-DeepONet on unseen datasets, we conducted experiments to assess its performance under different extrapolation scenarios in terms of reservoir conditions (Sections 4.4.1 and 4.4.2), injection schemes (Section 4.4.3), and temporal coordinates (Section 4.4.4). For temporal extrapolation, we train models based on the first 21 time snapshots out of 24 while using all training samples.

Refer to caption
Figure 7: Interpolation and extrapolation errors for different reservoir conditions, injection schemes, and temporal coordinates. (A, B, and C) Interpolation and extrapolation errors of pressure buildup for different conditions. (A) Number of wells. (B) Permeability field magnitude. (C) Maximum injection rate. (D and E) Interpolation and extrapolation errors of gas saturation for different conditions. (D) Permeability field magnitude. (E) Injection rate. (F and G) Comparison between nested Fourier-DeepONet and nested FNO for temporal extrapolation. (F) Errors of pressure buildup. and (G) Errors of gas saturation.

4.4.1 Number of wells

Our dataset contains reservoirs with 1–4 injection wells. Here, we investigate the extrapolation capabilities of Fourier-DeepONet on the number of wells. Specifically, reservoirs containing 1–3 wells (1806 samples) were used to train Fourier-DeepONet at level 0, 𝒩0Psubscriptsuperscript𝒩𝑃0\mathcal{N}^{P}_{0}caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. Then, we evaluate their prediction accuracy on reservoirs containing 4 wells for pressure buildup prediction, which we refer to as extrapolation. We did not separately train local level (1–4) models for this scenario because the local level models are not subject to the number of wells.

We observe comparable errors for predictions on reservoirs with 1–3 wells (i.e. interpolation) between the baseline model and the model trained with 1–3 wells (Fig. 7A). The baseline model refers to the nested Fourier-DeepONet without fine-tuning in Table 2, which has been trained on the entire dataset. For reservoirs with 4 wells, the third quartile of error in predicting pressure buildup using the extrapolation model is below 2%. This result suggests that the nested Fourier-DeepONet demonstrates strong generalization capabilities across different well configurations.

4.4.2 Permeability field

We assess the extrapolation capability of Fourier-DeepONet on the permeability field by dividing the dataset into five clusters based on the average value of the 3D permeability field in the reservoirs. In the dataset, the permeability fields are well clustered into five groups when projected into a 2D space of mean and variance values (Fig. 7B). The cutoff values separating the clusters, from Group 1 to Group 5, are 1.02, 1.1, 1.25, and 1.377 ln mD, as shown in different colors in Fig. 7B. For this experiment, we select samples from the first four groups for training, which represent 55% (1637 samples) of the original training dataset 𝒩0Psubscriptsuperscript𝒩𝑃0\mathcal{N}^{P}_{0}caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT.

For both pressure buildup and gas saturation prediction, the errors between the baseline model and the extrapolation model are comparable in the interpolation regime (Group 1 to Group 4). Despite the observed error increase in the extrapolation regime (Group 5), the median error for pressure buildup prediction remains below 2%, and the error for gas saturation prediction stays below 5% (Fig. 7B). Because samples with high permeability field mean have larger variances, and having a highly permeable field leads to easier migration of CO2 through the media, it can become more difficult for the model to make accurate predictions under such subsurface conditions. This implies that it is more efficient to allocate computing resources to obtaining datasets of permeability fields of a high magnitude. This analysis offers further insight into the data generation process, indicating that training the models with a broader range of permeability fields, particularly at higher permeabilities, would enhance overall accuracy with minimal tradeoffs regarding lower prediction errors at a lower permeability regime.

4.4.3 Injection rate

Next, we conduct an extrapolation study based on injection rates. We differentiate the samples by their maximum injection rate for pressure buildup models, but we look at each injection rate to differentiate the samples for gas saturation models. Samples with multiple wells inherently have a higher probability of having a larger maximum injection rate, so we use less number of samples for training pressure buildup models (1155 for 𝒩0Psubscriptsuperscript𝒩𝑃0\mathcal{N}^{P}_{0}caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and 2425 for 𝒩14Psubscriptsuperscript𝒩𝑃14\mathcal{N}^{P}_{1-4}caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 - 4 end_POSTSUBSCRIPT) than gas saturation models (4325 for 𝒩14ssubscriptsuperscript𝒩𝑠14\mathcal{N}^{s}_{1-4}caligraphic_N start_POSTSUPERSCRIPT italic_s end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 - 4 end_POSTSUBSCRIPT). The pressure buildup extrapolation models are trained with samples with maximum injection rates up to 1.6 MT/year at the global level and the corresponding samples are selected for training at local levels. On the other hand, we only consider the injection rate for each well for training gas saturation models.

As shown in Fig. 7C, we observe a moderate increase in error when predicting pressure buildup for maximum injection rates above 1.6 MT/yr per reservoir. This increase is not only due to extrapolation errors but also the reduced number of training samples (<<< 40% of the total samples) available at injection rates below 1.6 MT/yr. Notably, the error increase observed when using a model trained on injection rates up to 1.6 MT/yr for predicting samples in the 1.4–1.6 MT/yr range (interpolation regime) is comparable to that seen in the extrapolation regime relative to a baseline model. This suggests that the contribution of extrapolation errors may be minimal, as a similar error increase is observed within the interpolation regime. Similarly, the error in gas saturation predictions increases to a similar extent for samples with injection rates between 1.4 and 1.6 MT/yr and those above 1.6 MT/yr. This trend indicates that the extrapolation error in the gas saturation model trained on injection rates below 1.6 MT/yr is also likely to be small.

4.4.4 Extrapolation in time

The final experiment involves a temporal extrapolation. Instead of training a model with a reduced number of reservoir samples as in Sections 4.4.1 to 4.4.3, we train models with a limited number of time snapshots (21 out of 24) up to 15 years to assess their generalization capabilities up to 30 years. We train both a nested Fourier-DeepONet and a nested FNO to demonstrate the superior performance of a nested Fourier-DeepONet in temporal extrapolation.

Nested Fourier-DeepONet, trained up to 15 years, exhibits a similar error in the interpolation regime as the baseline model, but only slightly higher errors in the extrapolation regime (Fig. 7D). The rise in error using a nested Fourier-DeepONet is relatively moderate, especially compared to FNO. This is because the use of a trunk net from Fourier-DeepONet allows the solution to be continuous over time, rather than using a separate channel for each time step like FNO. For example, nested Fourier-DeepONet that has been trained up to 15 years extrapolated on 30 years with less than 1% error increase for pressure buildup prediction compared with a baseline model, which is a significant improvement over FNO. Although we observe a moderate error increase in gas saturation prediction (Fig. 7E), Fourier-DeepONet still exhibits roughly a 10% decrease in error compared to FNO. This experiment enables us to assess the model’s performance in the temporal extrapolation regime.

5 Conclusions

In this work, we have developed a nested Fourier-DeepONet framework for simulating a 3D GCS over 30 year period. We also have demonstrated its superior performance over nested FNO not only in accuracy but also in computational efficiency. Fourier-DeepONet is faster in training compared to FNO, thanks to the reduced number of trainable parameters and lower consumption of GPU memory achieved by selecting a smaller time batch size.

Previous 2D GCS experiments by Jiang et al. [16] in temporal interpolation have shown better performance compared to FNO, as the trunk net guarantees continuous prediction with respect to time. Building upon this, our extrapolation experiments have further revealed that the nested Fourier-DeepONet exhibits good extrapolation capability across various scenarios (i.e., number of wells, permeability field, injection rate, time coordinates). This extrapolation analysis provides valuable insight into the maximization of model performance from a limited dataset.

Currently, there are still some limitations to the application of deep neural operators to real-world problems. First, since the current method relies solely on data, the size of the dataset significantly influences the predictive capability of the neural operators. Although deep neural operators demonstrate good generalizability, subsurface conditions can vary widely between sites. To address this challenge, one common approach is to incorporate physics-informed loss functions during neural network training. Another limitation of the current nested framework is the need to train a model for each level, which can be time-consuming. Future advancements could involve developing architectures capable of handling non-equispaced meshes with varying grid points and resolutions. For example, a deep neural operator should be able to train and predict in reservoirs with varying well configurations, involving different numbers of cells.

Acknowledgments

This work was supported by ExxonMobil Technology and Engineering Company and U.S. Department of Energy Office of Advanced Scientific Computing Research under Grants No. DE-SC0025592 and No. DE-SC0025593. We thank Gege Wen et al. for making the dataset public and Alex GK Lee for providing valuable feedback.

Disclosure statement

No potential conflict of interest was reported by the authors.

Author contributions statement

Jonathan E. Lee: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Data curation. Min Zhu: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Data curation. Ziqiao Xi: Visualization, Investigation. Kun Wang: Writing – review & editing, Supervision, Funding acquisition. Yanhua O. Yuan: Writing – review & editing, Supervision, Funding acquisition. Lu Lu: Writing – review & editing, Writing – original draft, Supervision, Resources, Project administration, Methodology, Funding acquisition, Formal analysis, Conceptualization.

Data availability statement

Training data and the code supporting the findings of this study will be available in GitHub at https://github.com/lu-group/nested-fourier-deeponet-gcs-3d. These data were derived from the following resources available in the public domain: Ref. [36].

Appendix A Fourier-DeepONet architecture

The network architectures of Fourier-DeepONets for all levels mentioned in Section 3.2 are shown in Tables 34, and 5.

Table 3: Global Fourier-DeepONet architecture.
Operation Output shape
Branch net Padding(8), Linear (C, 116, 116, 21, 32)
Trunk net Linear (T, 32)
Merge operation Point-wise multiplication (C ×\times× T, 116, 116, 21, 32)
Merge net Fourier 1 Fourier3d/Conv1d/Add/GELU (C ×\times× T, 116, 116, 21, 32)
Fourier 2 Fourier3d/Conv1d/Add/GELU (C ×\times× T, 116, 116, 21, 32)
Fourier 3 Fourier3d/Conv1d/Add/GELU (C ×\times× T, 116, 116, 21, 32)
Fourier 4 Fourier3d/Conv1d/Add/GELU (C ×\times× T, 116, 116, 21, 32)
De-padding De-padding(8) (C ×\times× T, 100, 100, 5, 32)
Projection 1 Linear (C ×\times× T, 100, 100, 5, 128)
Projection 2 Linear (C ×\times× T, 100, 100, 5, 1)
Reshape - (C, T, 100, 100, 5)
Table 4: LGR1 Fourier-DeepONet architecture.
Operation Output shape
Branch net Padding(8), Linear (C, 56, 56, 41, 36)
Trunk net Linear (T, 36)
Merge operation Point-wise multiplication (C ×\times× T, 56, 56, 41, 36)
Merge net Fourier 1 Fourier3d/Conv1d/Add/GELU (C ×\times× T, 56, 56, 41, 36)
Fourier 2 Fourier3d/Conv1d/Add/GELU (C ×\times× T, 56, 56, 41, 36)
Fourier 3 Fourier3d/Conv1d/Add/GELU (C ×\times× T, 56, 56, 41, 36)
Fourier 4 Fourier3d/Conv1d/Add/GELU (C ×\times× T, 56, 56, 41, 36)
De-padding De-padding(8) (C ×\times× T, 40, 40, 25, 36)
Projection 1 Linear (C ×\times× T, 40, 40, 25, 144)
Projection 2 Linear (C ×\times× T, 40, 40, 25, 1)
Reshape - (C, T, 40, 40, 25)
Table 5: LGR2–4 Fourier-DeepONet architecture.
Operation Output shape
Branch net Padding(8), Linear (C, 56, 56, 66, 36)
Trunk net Linear (T, 28)
Merge operation Point-wise multiplication (C ×\times× T, 56, 56, 66, 36)
Merge net Fourier 1 Fourier3d/Conv1d/Add/GELU (C ×\times× T, 56, 56, 66, 36)
Fourier 2 Fourier3d/Conv1d/Add/GELU (C ×\times× T, 56, 56, 66, 36)
Fourier 3 Fourier3d/Conv1d/Add/GELU (C ×\times× T, 56, 56, 66, 36)
Fourier 4 Fourier3d/Conv1d/Add/GELU (C ×\times× T, 56, 56, 66, 36)
De-padding De-padding(8) (C ×\times× T, 40, 40, 50, 36)
Projection 1 Linear (C ×\times× T, 40, 40, 50, 144)
Projection 2 Linear (C ×\times× T, 40, 40, 50, 1)
Reshape - (C, T, 40, 40, 50)

Appendix B Prediction accuracy for individual networks

Table 6 shows the test accuracy of the nested Fourier-DeepONet and nested FNO models, where ground truth outputs from the previous level are used for 𝒩iPsubscriptsuperscript𝒩𝑃𝑖\mathcal{N}^{P}_{i}caligraphic_N start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and 𝒩iSsubscriptsuperscript𝒩𝑆𝑖\mathcal{N}^{S}_{i}caligraphic_N start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, with i𝑖iitalic_i = 1–4. Although this approach is not feasible in practice, it allows us to assess how well the model was trained at each individual level. Table 2 corresponds to the “sequential” prediction accuracy, while Table 6 represents the “separate” prediction accuracy, as presented in Table 7 of Ref. [36].

Table 6: Accuracy comparison for nested FNO and nested Fourier-DeepONet. The errors are computed using the test dataset both in the entire domain and each subdomain. Separate refers to the model prediction using the ground truth from the previous level’s output as the input.
Domain Nested Fourier-DeepONet Nested FNO Nested FNO [36]
Pressure δΩPsubscriptsuperscript𝛿𝑃Ω\delta^{P}_{\Omega}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω end_POSTSUBSCRIPT 0.069% 0.080% -
δΩ0Psubscriptsuperscript𝛿𝑃subscriptΩ0\delta^{P}_{\Omega_{0}}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 0.020% 0.025% 0.02%
δΩ1Psubscriptsuperscript𝛿𝑃subscriptΩ1\delta^{P}_{\Omega_{1}}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 0.039% 0.061% 0.10%
δΩ2Psubscriptsuperscript𝛿𝑃subscriptΩ2\delta^{P}_{\Omega_{2}}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 0.044% 0.046% 0.16%
δΩ3Psubscriptsuperscript𝛿𝑃subscriptΩ3\delta^{P}_{\Omega_{3}}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 0.059% 0.058% 0.14%
δΩ4Psubscriptsuperscript𝛿𝑃subscriptΩ4\delta^{P}_{\Omega_{4}}italic_δ start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 0.128% 0.148% 0.45%
Saturation δΩSsubscriptsuperscript𝛿𝑆Ω\delta^{S}_{\Omega}italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω end_POSTSUBSCRIPT 0.486% 0.461% -
δΩ1Ssubscriptsuperscript𝛿𝑆subscriptΩ1\delta^{S}_{\Omega_{1}}italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 1.073% 1.271% 1.27%
δΩ2Ssubscriptsuperscript𝛿𝑆subscriptΩ2\delta^{S}_{\Omega_{2}}italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 0.733% 0.667% 1.00%
δΩ3Ssubscriptsuperscript𝛿𝑆subscriptΩ3\delta^{S}_{\Omega_{3}}italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 0.562% 0.480% 0.61%
δΩ4Ssubscriptsuperscript𝛿𝑆subscriptΩ4\delta^{S}_{\Omega_{4}}italic_δ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 0.478% 0.455% 0.74%

References

  • [1] Eclipse reservoir simulation software reference manual, 2014.
  • [2] Blunt, M. J. Multiphase Flow in Permeable Media: A Pore-Scale Perspective. Cambridge University Press, Oct. 2016.
  • [3] Bramble, J. H., Ewing, R. E., Pasciak, J. E., and Schatz, A. H. A preconditioning technique for the efficient solution of problems with local grid refinement. Computer Methods in Applied Mechanics and Engineering 67, 2 (Mar. 1988), 149–159.
  • [4] Chen, T., and Chen, H. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Transactions on Neural Networks 6, 4 (July 1995), 911–917.
  • [5] Chen, Y., Lu, L., Karniadakis, G. E., and Dal Negro, L. Physics-informed neural networks for inverse problems in nano-optics and metamaterials. Optics Express 28, 8 (Apr. 2020), 11618.
  • [6] Clark Di Leoni, P., Lu, L., Meneveau, C., Karniadakis, G. E., and Zaki, T. A. Neural operator prediction of linear instability waves in high-speed boundary layers. Journal of Computational Physics 474 (Feb. 2023), 111793.
  • [7] Daneker, M., Cai, S., Qian, Y., Myzelev, E., Kumbhat, A., Li, H., and Lu, L. Transfer learning on physics-informed neural networks for tracking the hemodynamics in the evolving false lumen of dissected aorta. Nexus 1, 2 (June 2024), 100016.
  • [8] Daneker, M., Zhang, Z., Karniadakis, G. E., and Lu, L. Systems Biology: Identifiability Analysis and Parameter Identification via Systems-Biology-Informed Neural Networks. Springer US, 2023, p. 87–105.
  • [9] Deng, B., Shin, Y., Lu, L., Zhang, Z., and Karniadakis, G. E. Approximation rates of DeepONets for learning operators arising from advection–diffusion equations. Neural Networks 153 (Sept. 2022), 411–426.
  • [10] Doughty, C. Investigation of CO2 plume behavior for a large-scale pilot test of geologic carbon storage in a saline formation. Transport in Porous Media 82, 1 (May 2009), 49–76.
  • [11] Eigestad, G. T., Dahle, H. K., Hellevang, B., Riis, F., Johansen, W. T., and Øian, E. Geological modeling and simulation of CO2 injection in the johansen formation. Computational Geosciences 13, 4 (Aug. 2009), 435–450.
  • [12] Faigle, B., Helmig, R., Aavatsmark, I., and Flemisch, B. Efficient multiphysics modelling with adaptive grid refinement using a mpfa method. Computational Geosciences 18, 5 (Mar. 2014), 625–636.
  • [13] Fan, B., Qiao, E., Jiao, A., Gu, Z., Li, W., and Lu, L. Deep learning for solving and estimating dynamic macro-finance models. Computational Economics (Aug. 2024).
  • [14] Hayford, J., Goldman-Wetzler, J., Wang, E., and Lu, L. Speeding up and reducing memory usage for scientific machine learning via mixed precision. Computer Methods in Applied Mechanics and Engineering 428 (Aug. 2024), 117093.
  • [15] Jagtap, A. D., and Karniadakis, G. E. Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. Communications in Computational Physics 28, 5 (2020).
  • [16] Jiang, Z., Zhu, M., and Lu, L. Fourier-MIONet: Fourier-enhanced multiple-input neural operators for multiphase modeling of geological carbon sequestration. Reliability Engineering & System Safety 251 (2024), 110392.
  • [17] Jiao, A., He, H., Ranade, R., Pathak, J., and Lu, L. One-shot learning for solution operators of partial differential equations. arXiv preprint arXiv:2104.05512 (2021).
  • [18] Jin, P., Meng, S., and Lu, L. MIONet: Learning multiple-input operators via tensor product. SIAM Journal on Scientific Computing 44, 6 (Nov. 2022), A3490–A3514.
  • [19] Kamashev, A., and Amanbek, Y. Reservoir simulation of CO2 storage using compositional flow model for geological formations in frio field and precaspian basin. Energies 14, 23 (Dec. 2021), 8023.
  • [20] Karniadakis, G. E., Kevrekidis, I. G., Lu, L., Perdikaris, P., Wang, S., and Yang, L. Physics-informed machine learning. Nature Reviews Physics 3, 6 (May 2021), 422–440.
  • [21] Khebzegga, O., Iranshahr, A., and Tchelepi, H. Continuous relative permeability model for compositional simulation. Transport in Porous Media 134, 1 (July 2020), 139–172.
  • [22] Li, Z., Kovachki, N., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A., and Anandkumar, A. Fourier neural operator for parametric partial differential equations, 2020.
  • [23] Lu, L., Jin, P., Pang, G., Zhang, Z., and Karniadakis, G. E. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence 3, 3 (Mar. 2021), 218–229.
  • [24] Lu, L., Meng, X., Cai, S., Mao, Z., Goswami, S., Zhang, Z., and Karniadakis, G. E. A comprehensive and fair comparison of two neural operators (with practical extensions) based on FAIR data. Computer Methods in Applied Mechanics and Engineering 393 (Apr. 2022), 114778.
  • [25] Lu, L., Meng, X., Mao, Z., and Karniadakis, G. E. Deepxde: A deep learning library for solving differential equations. SIAM Review 63, 1 (Jan. 2021), 208–228.
  • [26] Lu, L., Pestourie, R., Johnson, S. G., and Romano, G. Multifidelity deep neural operators for efficient learning of partial differential equations with application to fast inverse design of nanoscale heat transport. Physical Review Research 4, 2 (June 2022).
  • [27] Lu, L., Pestourie, R., Yao, W., Wang, Z., Verdugo, F., and Johnson, S. G. Physics-informed neural networks with hard constraints for inverse design. SIAM Journal on Scientific Computing 43, 6 (Jan. 2021), B1105–B1132.
  • [28] Mao, S., Dong, R., Lu, L., Yi, K. M., Wang, S., and Perdikaris, P. PPDONet: Deep operator networks for fast prediction of steady-state solutions in disk–planet systems. The Astrophysical Journal Letters 950, 2 (June 2023), L12.
  • [29] Mo, S., Zhu, Y., Zabaras, N., Shi, X., and Wu, J. Deep convolutional encoder‐decoder networks for uncertainty quantification of dynamic multiphase flow in heterogeneous media. Water Resources Research 55, 1 (Jan. 2019), 703–728.
  • [30] Orr, F. M. Theory of gas injection processes. Tie-Line Publications, 2007.
  • [31] Pruess, K., and García, J. Multiphase flow dynamics during CO2 disposal into saline aquifers. Environmental Geology 42, 2–3 (Mar. 2002), 282–295.
  • [32] Remy, N., Boucher, A., and Wu, J. Applied Geostatistics with SGeMS: A User’s Guide. Cambridge University Press, Jan. 2009.
  • [33] Tang, M., Liu, Y., and Durlofsky, L. J. A deep-learning-based surrogate model for data assimilation in dynamic subsurface flow problems. Journal of Computational Physics 413 (July 2020), 109456.
  • [34] Wen, G., and Benson, S. M. CO2 plume migration and dissolution in layered reservoirs. International Journal of Greenhouse Gas Control 87 (Aug. 2019), 66–79.
  • [35] Wen, G., Hay, C., and Benson, S. M. CCSNet: a deep learning modeling suite for CO2 storage. Advances in Water Resources 155 (2021), 104009.
  • [36] Wen, G., Li, Z., Long, Q., Azizzadenesheli, K., Anandkumar, A., and Benson, S. M. Real-time high-resolution CO2 geological storage prediction using nested fourier neural operators. Energy & Environmental Science 16, 4 (2023), 1732–1741.
  • [37] Wen, G., Tang, M., and Benson, S. M. Towards a predictor for CO2 plume migration using deep neural networks. International Journal of Greenhouse Gas Control 105 (Feb. 2021), 103223.
  • [38] Wu, C., Zhu, M., Tan, Q., Kartha, Y., and Lu, L. A comprehensive study of non-adaptive and residual-based adaptive sampling for physics-informed neural networks. Computer Methods in Applied Mechanics and Engineering 403 (Jan. 2023), 115671.
  • [39] Wu, W., Daneker, M., Jolley, M. A., Turner, K. T., and Lu, L. Effective data sampling strategies and boundary condition constraints of physics-informed neural networks for identifying material properties in solid mechanics. Applied Mathematics and Mechanics 44, 7 (July 2023), 1039–1068.
  • [40] Yazdani, A., Lu, L., Raissi, M., and Karniadakis, G. E. Systems biology informed deep learning for inferring parameters and hidden dynamics. PLOS Computational Biology 16, 11 (Nov. 2020), e1007575.
  • [41] Yu, J., Lu, L., Meng, X., and Karniadakis, G. E. Gradient-enhanced physics-informed neural networks for forward and inverse pde problems. Computer Methods in Applied Mechanics and Engineering 393 (Apr. 2022), 114823.
  • [42] Zhang, Z., Moya, C., Lu, L., Lin, G., and Schaeffer, H. D2NO: Efficient handling of heterogeneous input function spaces with distributed deep neural operators. Computer Methods in Applied Mechanics and Engineering 428 (Aug. 2024), 117084.
  • [43] Zhu, M., Feng, S., Lin, Y., and Lu, L. Fourier-DeepONet: Fourier-enhanced deep operator networks for full waveform inversion with improved accuracy, generalizability, and robustness. Computer Methods in Applied Mechanics and Engineering 416 (2023), 116300.
  • [44] Zhu, M., Zhang, H., Jiao, A., Karniadakis, G. E., and Lu, L. Reliable extrapolation of deep neural operators informed by physics or sparse observations. Computer Methods in Applied Mechanics and Engineering 412 (2023), 116064.
  • [45] Zhu, Y., and Zabaras, N. Bayesian deep convolutional encoder–decoder networks for surrogate modeling and uncertainty quantification. Journal of Computational Physics 366 (Aug. 2018), 415–447.