DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is made final.
Claims 1-13 and 15-30 are pending. Claims 1, 20 and 28 are independent claims.
Response to Arguments
Applicant’s arguments dated 11/6/2025 and 12/10/2025 regarding the 35 U.S.C. 103 rejections of the previous Office action have been fully considered. Because the claim amendments have changed the scope of the claims, new grounds of rejection are applied – see the updated rejections below.
In the arguments dated 11/6/2025 and 12/10/2025, applicant argues that McClenny is significantly different from the elements described in now-amended claims 1, 20 and 28 because McClenny “defines learnable parameters to adjust the relevance of loss terms associated with boundary conditions, initial conditions and interior points within the neural network”. Applicant then describes a self-attention mechanism, which is not claimed, and argues that McClenny uses the term “soft attention mechanism” merely for analogy or marketing purposes. The examiner respectfully disagrees. The physics-informed neural network (PINN) with an attention mechanism disclosed by McClenny meets the claimed limitations because it focuses the PINN on difficult regions of the solution, i.e., shocks (McClenny, pg. 1, Abstract, it has been observed that the original PINN algorithm can produce inaccuracies around sharp transitions in the solution, as well as display instability during training… Self-adaptive PINNs are based on trainable weights that can automatically force the neural network to focus on difficult regions of the solution), by weighting PDE inputs on a spatio-temporal scale (McClenny, pg. 6, Fig. 2 depicts learned weights – and – pg. 6, Conclusion, a similar conceptual framework as soft self-attention mechanisms used in Computer Vision, in that the network identifies which inputs are most important to its own training, in real time, with no additional hyperparameters. These weights are updated with respect to the loss function of the PINN, therefore the PINN training is capable of identifying a unique mask for any initial value problem).
Applicant further argues that Asao, previously relied upon to teach limitations of now-amended claim 20, operates in a different domain and does not teach the limitations of the amended claim, reciting many elements that purportedly differentiate claim 20 from Asao, some of which are not included within the claim. In response to applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies (i.e., accurately computing differential terms via numerical schemes such as finite differences) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). For the newly added limitations, see the updated rejection below.
Applicant also argues that Shahkarami does not teach various elements of the claims. In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). Shahkarami was only used to teach the limitation of creating a surrogate model to generate a simulation output. In response to applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies (i.e., grid-consistent pressure saturation solutions, sequence-to-field operator architectures) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Due to the claim amendments, the 35 U.S.C. 112 rejections have been withdrawn.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 2, 3, 6, 7 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Vos et al. (US 20220342115 A1), herein Vos, in view of McClenny et al. ("Self-Adaptive Physics-Informed Neural Networks using a Soft Attention Mechanism", INCLUDED IN IDS), herein McClenny, and Bihorac et al. (US 20220044809 A1), herein Bihorac.
Regarding claim 1, Vos teaches: A computer system implementing a physics-informed attention-based neural network (PIANN) comprising: at least one processor; and a memory storing computer instructions, the instructions implementing a PIANN, wherein, when the at least one processor executes the instructions (¶26, The networked computer environment 100 may include a computer 102 with a processor 104 and a data storage device 106), the PIANN is trained to learn a solution or model for at least one nonlinear partial differential equation (PDE) respecting one or more physical constraints (¶14, More specifically, a non-linear reduced order model or autoencoder is paired with a neural network in the latent space and with physics-informed partial differential equation (PDE) constraints), wherein the PIANN includes a physics-informed neural network (PINN) implementing a deep neural network (¶29, The physics-informed neural network may include a deep learning model).
Vos fails to teach: and a transition zone detector, the PINN including an attention mechanism configured to automatically detect shocks in the solution of the at least one nonlinear PDE, and wherein the PIANN… determines which parts of input parameters of the at least one nonlinear PDE are relevant.
However, in the same field of endeavor, McClenny teaches: and a transition zone detector (pg. 3, Section 3, ¶3, The trainable weights can efficiently force the network to focus on the initial, boundary, or residual points located in difficult or important regions of the solution), the PINN including an attention mechanism configured to automatically detect shocks in the solution of the at least one nonlinear PDE (pg. 1, Abstract, it has been observed that the original PINN algorithm can produce inaccuracies around sharp transitions in the solution, as well as display instability during training… Self-adaptive PINNs are based on trainable weights that can automatically force the neural network to focus on difficult regions of the solution – sharp transitions, i.e., shocks), and wherein the PIANN… determines which parts of input parameters of the at least one nonlinear PDE are relevant (pg. 6, Conclusion, a similar conceptual framework as soft self-attention mechanisms used in Computer Vision, in that the network identifies which inputs are most important to its own training, in real time, with no additional hyperparameters. These weights are updated with respect to the loss function of the PINN, therefore the PINN training is capable of identifying a unique mask for any initial value problem).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to introduce the transition zone detector acting as an attention mechanism, as disclosed by McClenny, into the system disclosed by Vos to obtain more accurate solutions at reduced computational cost (pg. 6, Section 4.4, more accurate solutions of PDEs with smaller computational cost).
Vos in view of McClenny fails to teach: includes a gated recurrent unit (GRU) architecture that…
However, in the same field of endeavor, Bihorac teaches: wherein the network includes a gated recurrent unit (GRU) architecture that determines relevant input parameters (¶63, Rather than relying on the most recent hidden state of the GRU for making a prediction, various embodiments of the deep learning model instead provide a weighted average of all prior hidden states to the final classification layer – and – ¶64, falls under the umbrella category of an attention mechanism, named as such in reference to the act of a model “focusing” on particular pieces of an input rather than its entirety).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include a GRU architecture for attention as disclosed by Bihorac in the system disclosed by Vos in view of McClenny to measure the importance of input parameters in an efficient manner (¶53, The key attribute of an RNN which makes it especially useful for modeling temporal data is its notion of internal memory – and – ¶59, with the advantage that the GRU networks include fewer required parameters than LSTM networks – and – ¶63, the sequence in effect becomes more distilled, by dampening inconsequential time steps and amplifying important ones relative to the final outcome… insight into the internal reasoning of the GRU's predictions, since larger time step weights directly influences the averaged hidden state used for prediction and can thus be interpreted as denoting important time steps relative to the outcome of interest).
Regarding claim 2, Vos further teaches: The system of claim 1, wherein the PIANN includes an encoder and a decoder (¶14, More specifically, a non-linear reduced order model or autoencoder is paired with a neural network in the latent space and with physics-informed partial differential equation (PDE) constraints).
Regarding claim 3, Vos further teaches: The system of claim 2, wherein the encoder is used to map an encoder input to an encoder output in an embedding space, and wherein the decoder is used to map a decoder input in the embedding space to a decoder output (¶31, The encoder (ϕ) may receive a state x(t) produced by the coarse resolution climate model 202 and may produce a latent state z(t)… The decoder (ψ) may receive the latent state z(t) and produce a reconstructed state).
Regarding claim 6, Vos in view of McClenny fails to teach: The system of claim 1, wherein the PIANN includes a plurality of RNN units.
However, in the same field of endeavor, Bihorac teaches: wherein the PIANN includes a plurality of RNN units (¶7, deep learning model comprises a modified recurrent neural network (RNN) with gated recurrent units (GRUs)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use RNN units as disclosed by Bihorac in the system disclosed by Vos in view of McClenny to improve NN performance (¶53, The key attribute of an RNN which makes it especially useful for modeling temporal data is its notion of internal memory).
Regarding claim 7, Vos in view of McClenny fails to teach: The system of claim 6, wherein the plurality of RNN units include at least one of: the GRU, or at least one long short-term memory (LSTM).
However, in the same field of endeavor, Bihorac teaches: wherein the plurality of RNN units include at least one of: the GRU, or at least one long short-term memory (LSTM) (¶7, deep learning model comprises a modified recurrent neural network (RNN) with gated recurrent units (GRUs) – and – ¶59, An example embodiment of the deep learning model uses an LSTM network).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use a GRU or LSTM included in the RNN units as disclosed by Bihorac in the system disclosed by Vos in view of McClenny to increase performance or reduce network size (¶59, long short-term memory (LSTM), which augments the traditional RNN with additional weight matrices and gating functions to improve long-range sequential processing – and – ¶59, with the advantage that the GRU networks include fewer required parameters than LSTM network).
Regarding claim 8, Vos in view of McClenny fails to teach: The system of claim 6, wherein the plurality of RNN units include a plurality of GRUs.
However, in the same field of endeavor, Bihorac teaches: wherein the plurality of RNN units include a plurality of GRUs (¶7, deep learning model comprises a modified recurrent neural network (RNN) with gated recurrent units (GRUs)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use a plurality of GRUs in the RNN units as disclosed by Bihorac in the system disclosed by Vos in view of McClenny to increase performance or reduce network size (¶59, with the advantage that the GRU networks include fewer required parameters than LSTM network).
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Vos in view of McClenny and Bihorac as applied to claim 3 above, and further in view of Joze et al. (US 10931976 B1), herein Joze.
Regarding claim 4, Vos in view of McClenny and Bihorac fails to teach: The system of claim 3, wherein the encoder input size and decoder input size are equal.
However, in the same field of endeavor, Joze teaches: wherein the encoder input size and decoder input size are equal (Col. 5, Line 36, An autoencoder, such as the video autoencoder and/or the audio autoencoder described above, is a neural network with equal input and output sizes).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use equal sizes for the encoder input and the decoder input as disclosed by Joze in the system disclosed by Vos in view of McClenny and Bihorac in order to reduce error (Col. 5, Line 54, In order to minimize the error between the input data and the reconstructed output data).
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Vos in view of McClenny and Bihorac as applied to claim 3 above, and further in view of O’Shea (US 20180322388 A1).
Regarding claim 5, Vos in view of McClenny and Bihorac fails to teach: The system of claim 3, wherein a linear transition layer is introduced in the embedding space between the encoder and the decoder.
However, in the same field of endeavor, O’Shea teaches: wherein a linear transition layer is introduced in the embedding space between the encoder and the decoder (¶54, In some implementations, a linear regression layer may be implemented on the output of the encoder 202).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to add a linear layer between the encoder and decoder as disclosed by O’Shea in the PIANN system disclosed by Vos in view of McClenny and Bihorac to better handle problem complexity (¶26, By implementing machine-learning networks that may be trained to learn suitable encoding and decoding techniques for different types of communication media, techniques disclosed herein offer various advantages, such as improved power, resiliency, and complexity advantages).
Claims 9-13 are rejected under 35 U.S.C. 103 as being unpatentable over Vos in view of McClenny and Bihorac as applied to claim 6 above, and further in view of Asao et al. (US 20210034817 A1), herein Asao.
Regarding claim 9, Vos in view of McClenny and Bihorac fails to teach: The system of claim 6, wherein the plurality of RNN units include a first plurality of GRUs and a second plurality of GRUs, and wherein the first plurality of GRUs are used as a part of an encoder of the PIANN and the second plurality of GRUs are used as a part of a decoder of the PIANN.
However, in the same field of endeavor, Asao teaches: wherein the plurality of RNN units include a first plurality of GRUs and a second plurality of GRUs, and wherein the first plurality of GRUs are used as a part of an encoder of the PIANN and the second plurality of GRUs are used as a part of a decoder of the PIANN (Fig. 2, encoder 144 and decoder 154, both of which are composed of GRUs).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use GRUs in the encoder and decoder as disclosed by Asao in the system disclosed by Vos in view of McClenny and Bihorac to increase model accuracy (¶91, the probability that question-answering device 122 outputs a right answer becomes higher).
Regarding claim 10, Vos in view of McClenny and Bihorac fails to teach: The system of claim 9, wherein the transition zone detector includes the attention mechanism implemented as an attention layer introduced between the first plurality of GRUs and the second plurality of GRUs.
However, in the same field of endeavor, Asao teaches: wherein the transition zone detector includes the attention mechanism implemented as an attention layer introduced between the first plurality of GRUs and the second plurality of GRUs (Fig. 2, attention layer 160 between encoder 144 and decoder 154, both of which are composed of GRUs).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use an attention mechanism between the encoder and decoder as disclosed by Asao in the system disclosed by Vos in view of McClenny and Bihorac to increase model accuracy (¶91, the probability that question-answering device 122 outputs a right answer becomes higher).
Regarding claim 11, Vos in view of McClenny and Bihorac fails to teach: The system of claim 10, wherein the attention layer is used to calculate a context vector based on encoder hidden states corresponding to the first plurality of GRUs.
However, in the same field of endeavor, Asao teaches: wherein the attention layer is used to calculate a context vector based on encoder hidden states corresponding to the first plurality of GRUs (¶50, an attention layer 160 for calculating a context vector used by the decoder 154 in calculating each word of word sequence 156, from values referred to as attention and hidden states of each GRU in encoder 144 and applying it to decoder 154).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use the hidden states to calculate the context vector as disclosed by Asao in the system disclosed by Vos in view of McClenny and Bihorac to increase model accuracy (¶91, the probability that question-answering device 122 outputs a right answer becomes higher).
Regarding claim 12, Vos in view of McClenny and Bihorac fails to teach: The system of claim 11, wherein the context vector is calculated based on attention weights that are determined based on the encoder hidden states corresponding to the first plurality of GRUs.
However, in the same field of endeavor, Asao teaches: wherein the context vector is calculated based on attention weights that are determined based on the encoder hidden states corresponding to the first plurality of GRUs (¶56, a context vector generating unit 164 for calculating a context vector 168 as a weighted average of hidden states of respective GRUs).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use the hidden states to calculate the context vector as disclosed by Asao in the system disclosed by Vos in view of McClenny and Bihorac to increase model accuracy (¶91, the probability that question-answering device 122 outputs a right answer becomes higher).
Regarding claim 13, Vos in view of McClenny and Bihorac fails to teach: The system of claim 12, wherein the context vector is used as input into at least one of the second plurality of GRUs.
However, in the same field of endeavor, Asao teaches: wherein the context vector is used as input into at least one of the second plurality of GRUs (Fig 2, inputting context vector 168 into decoder 154).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use the context vector as input to the decoder GRUs as disclosed by Asao in the system disclosed by Vos in view of McClenny and Bihorac to increase model accuracy (¶91, the probability that question-answering device 122 outputs a right answer becomes higher).
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Vos in view of McClenny and Bihorac as applied to claim 1 above, and further in view of Fraces et al. ("Physics Informed Deep Learning for Transport in Porous Media. Buckley Leverett Problem", 2020), herein Fraces.
Regarding claim 15, Vos in view of McClenny and Bihorac fails to teach: The system of claim 1, wherein the non-linear PDE is a hyperbolic PDE.
However, in the same field of endeavor, Fraces teaches: wherein the non-linear PDE is a hyperbolic PDE (Section 5.3.1, in section 5.4 where we are interested in the problem of gravity segregation where the added physics acts as another closure condition for the hyperbolic problem).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to solve a hyperbolic PDE as disclosed by Fraces using the system disclosed by Vos in view of McClenny and Bihorac to accurately model physics-based problems (Section 5.4, This method is a new way to produce high fidelity surrogates that honor governing laws).
Claims 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Vos in view of McClenny and Bihorac as applied to claim 1 above, and further in view of Harvey et al. (US 20210382445 A1), herein Harvey.
Regarding claim 16, Vos in view of McClenny and Bihorac fails to teach: The system of claim 1, wherein the PIANN includes an automatic differentiator for producing a differentiation output, and wherein the differentiation output is used for training the PIANN in order to update one or more parameters or weights of the PIANN.
However, in the same field of endeavor, Harvey teaches: wherein the PIANN includes an automatic differentiator for producing a differentiation output (¶94, This backpropagator may use an automatic differentiator 940 to determine gradients of the various variables (weights, values, etc.) with the thermodynamic model), and wherein the differentiation output is used for training the PIANN in order to update one or more parameters or weights of the PIANN (¶94, then an optimizer 945 can use the negative gradients to incrementally change the time series control sequence).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use an automatic differentiator in the training process as disclosed by Harvey in the PIANN system disclosed by Vos in view of McClenny and Bihorac in order to achieve more accurate results (¶94, such that the thermodynamic model output more closely approaches the target demand curve).
Regarding claim 17, Vos in view of Bihorac and Harvey fails to teach: The system of claim 16, wherein the one or more parameters or weights of the PIANN include one or more transition zone detector weights or parameters weights of the transition zone detector.
However, in the same field of endeavor, McClenny teaches: wherein the one or more parameters or weights of the PIANN include one or more transition zone detector weights or parameters weights of the transition zone detector (pg. 3, Section 3, The trainable weights can efficiently force the network to focus on the initial, boundary, or residual points located in difficult or important regions of the solution – also see pg. 6, Fig. 2).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have weights in the transition zone detector as disclosed by McClenny in the system disclosed by Vos in view of Bihorac and Harvey to improve solver performance (pg. 1, Section 1, forcing the approximation to improve on those points).
Regarding claim 18, Vos in view of Bihorac and Harvey fails to teach: The system of claim 17, wherein the transition zone detector is an attention mechanism and the transition zone detector weights or parameters are attention weights of the attention mechanism.
However, in the same field of endeavor, McClenny teaches: wherein the transition zone detector is an attention mechanism and the transition zone detector weights or parameters are attention weights of the attention mechanism (pg. 3, Section 3, Unlike the previous approaches, the weights in the loss function are updated by backpropagation together with the network weights. In effect, the weights behave as a multiplicative soft attention mask, in a way that is reminiscent of attention mechanisms used in computer vision [14, 15] – also see Fig. 2 and its caption on pg. 6).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have a trainable attention mechanism as disclosed by McClenny in the system disclosed by Vos in view of Bihorac and Harvey to improve solver performance (pg. 1, Section 1, forcing the approximation to improve on those points).
Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Vos in view of McClenny and Bihorac as applied to claim 1 above, and further in view of Chhaya et al. (US 20210034705 A1), herein Chhaya.
Regarding claim 19, Vos in view of McClenny and Bihorac fails to teach: The PIANN system of claim 1, wherein the PIANN is structured as a seq-to-seq RNN.
However, in the same field of endeavor, Chhaya teaches: wherein the PIANN is structured as a seq-to-seq RNN (¶19, In one such embodiment, the neural network is a bidirectional seq2seq neural network with attention).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use a seq-to-seq structure as disclosed by Chhaya in the system disclosed by Vos in view of McClenny and Bihorac in order to handle natural language (¶17, deep learning can be used to assess tone, which may bring a greater degree of consistency and predictability to the tone assessment).
Claims 20, 22 and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Vos in view of McClenny, Bihorac, Asao, Shahkarami et al. (US 20200242497 A1), herein Shahkarami, and Fraces.
Regarding claim 20, Vos teaches: A computer system implementing a PI…NN (¶29, physics-informed neural network), the computer system comprising: at least one processor; and a memory storing computer instructions, wherein, when the at least one processor executes the computer instructions (¶26, The networked computer environment 100 may include a computer 102 with a processor 104 and a data storage device 106)…
Vos fails to teach: PIANN… wherein the PIANN includes a physics-informed neural network (PINN) with an attention mechanism… the attention mechanism configured to automatically detect shocks in the solution of at least one… PDE, and wherein the PIANN… determines which parts of input parameters of the at least one… PDE are relevant.
However, in the same field of endeavor, McClenny teaches: PIANN… wherein the PIANN includes a physics-informed neural network (PINN) with an attention mechanism… the attention mechanism configured to automatically detect shocks in the solution of at least one… PDE (pg. 1, Abstract, it has been observed that the original PINN algorithm can produce inaccuracies around sharp transitions in the solution, as well as display instability during training… Self-adaptive PINNs are based on trainable weights that can automatically force the neural network to focus on difficult regions of the solution), and wherein the PIANN… determines which parts of input parameters of the at least one… PDE are relevant (pg. 6, Conclusion, a similar conceptual framework as soft self-attention mechanisms used in Computer Vision, in that the network identifies which inputs are most important to its own training, in real time, with no additional hyperparameters. These weights are updated with respect to the loss function of the PINN, therefore the PINN training is capable of identifying a unique mask for any initial value problem).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to introduce the attention mechanism disclosed by McClenny into the PINN system of Vos to reduce computational cost (pg. 6, Section 4.4, more accurate solutions of PDEs with smaller computational cost).
Vos in view of McClenny fails to teach: wherein the PIANN includes a gated recurrent unit (GRU) architecture that…
However, in the same field of endeavor, Bihorac teaches: wherein the PIANN includes a gated recurrent unit (GRU) architecture that (¶63, Rather than relying on the most recent hidden state of the GRU for making a prediction, various embodiments of the deep learning model instead provide a weighted average of all prior hidden states to the final classification layer – and – ¶64, falls under the umbrella category of an attention mechanism, named as such in reference to the act of a model “focusing” on particular pieces of an input rather than its entirety)…
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include a GRU architecture for attention as disclosed by Bihorac in the system disclosed by Vos in view of McClenny to measure the importance of input parameters in an efficient manner (¶53, The key attribute of an RNN which makes it especially useful for modeling temporal data is its notion of internal memory – and – ¶59, with the advantage that the GRU networks include fewer required parameters than LSTM networks – and – ¶63, the sequence in effect becomes more distilled, by dampening inconsequential time steps and amplifying important ones relative to the final outcome. Second, these scalar time step weights can be used for providing clinician insight into the internal reasoning of the GRU's predictions, since larger time step weights directly influences the averaged hidden state used for prediction and can thus be interpreted as denoting important time steps relative to the outcome of interest).
Vos in view of McClenny and Bihorac fails to teach: introduced into an embedding space of the PINN between an encoder and a decoder of the PINN.
However, in the same field of endeavor, Asao teaches: introduced into an embedding space of the PINN between an encoder and a decoder of the PINN (Fig. 2, attention layer 160 between encoder 144 and decoder 154).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use an attention mechanism between the encoder and decoder as disclosed by Asao in the system disclosed by Vos in view of McClenny and Bihorac to increase model accuracy (¶91, the probability that question-answering device 122 outputs a right answer becomes higher).
Vos in view of McClenny, Bihorac and Asao fails to teach: the PIANN is used to generate a surrogate model for use in generating a simulation output…
However, in the same field of endeavor, Shahkarami teaches: the PIANN is used to generate a surrogate model for use in generating a simulation output (¶16, The surrogate model is a simplified mathematical model of the reservoir that provides the same output or close to the same output obtained from the reservoir simulations)…
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to generate a surrogate model for simulation as disclosed by Shahkarami in the system disclosed by Vos in view of McClenny, Bihorac and Asao to reduce computation time (Section 5.4, The computation times are order of magnitudes smaller for the neural network approach).
Vos in view of McClenny, Bihorac, Asao and Shahkarami fails to teach: hyperbolic PDE… hyperbolic PDE.
However, in the same field of endeavor, Fraces teaches solving: hyperbolic PDE… hyperbolic PDE (Section 5.3.1, in section 5.4 where we are interested in the problem of gravity segregation where the added physics acts as another closure condition for the hyperbolic problem).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to solve a hyperbolic PDE as disclosed by Fraces in the system disclosed by Vos in view of McClenny, Bihorac, Asao and Shahkarami to accurately model physical problems (Section 5.4, to produce high fidelity surrogates that honor governing laws).
Regarding claim 22, Vos in view of McClenny, Bihorac, Shahkarami and Fraces fails to teach: The system of claim 20, wherein the PIANN includes an encoder-decoder recurrent neural network (RNN) configuration having a plurality of RNN units, and wherein the plurality of RNN units are used to generate an encoder output in the embedding space and/or to generate a decoder input in the embedding space.
However, in the same field of endeavor, Asao teaches: wherein the PIANN includes an encoder-decoder recurrent neural network (RNN) configuration having a plurality of RNN units, and wherein the plurality of RNN units are used to generate an encoder output in the embedding space and/or to generate a decoder input in the embedding space (Fig. 2, attention layer 160 between encoder 144 and decoder 154, both of which are composed of GRUs).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize RNN units in an encoder-decoder configuration as disclosed by Asao in the system disclosed by Vos in view of McClenny, Bihorac, Shahkarami and Fraces to improve accuracy (¶91, the probability that question-answering device 122 outputs a right answer becomes higher).
Regarding claim 26, Vos in view of McClenny, Bihorac, Asao and Fraces fails to teach: The system of claim 20, wherein the surrogate model is an oil and gas reservoir model.
However, in the same field of endeavor, Shahkarami teaches: wherein the surrogate model is an oil and gas reservoir model (¶16, The surrogate model is a simplified mathematical model of the reservoir that provides the same output or close to the same output obtained from the reservoir simulations – the reservoir contents are clarified as possibly being oil and gas in: ¶19, a reservoir of a hydrocarbon (liquid and/or gas or a combination)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to create a surrogate model for an oil and gas reservoir model as disclosed by Shahkarami with the system disclosed by Vos in view of McClenny, Bihorac, Asao and Fraces to reduce computation time (Section 5.4, The computation times are order of magnitudes smaller for the neural network approach).
Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Vos in view of McClenny, Bihorac, Asao, Shahkarami, and Fraces as applied to claim 20 above, and further in view of Kothari et al. (US 20210232873 A1), herein Kothari.
Regarding claim 21, Vos in view of McClenny, Bihorac, Asao, Shahkarami and Fraces fails to teach: The system of claim 20, wherein the attention mechanism is one of: a soft attention mechanism, or a hard attention mechanism.
However, in the same field of endeavor, Kothari teaches: wherein the attention mechanism is one of: a soft attention mechanism, or a hard attention mechanism (¶53, In at least one embodiment, hard attention models or soft attention models can be used).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use a hard or soft attention mechanism as disclosed by Kothari in the system disclosed by Vos in view of McClenny, Bihorac, Asao, Shahkarami and Fraces to improve performance (¶53, can consider all parts of a video which can be beneficial in determining context and thus which regions of a frame are important, which can result in more accurate logical step determinations overall).
Claims 23-25 and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Vos in view of McClenny, Bihorac, Asao, Shahkarami, and Fraces as applied to claim 20 above, and further in view of Harvey.
Regarding claim 23, Vos further teaches: The system of claim 20… and wherein the differentiation output is used by a physics-informed learning unit for physics-informed learning or training of the PINN (¶47, The back propagation or backwards propagation may be a training method for a neural network to evaluate the error function from the cost function).
Vos in view of McClenny, Bihorac, Asao, Shahkarami, and Fraces fails to teach: wherein the PIANN includes an automatic differentiator for producing a differentiation output…
However, in the same field of endeavor, Harvey teaches: wherein the PIANN includes an automatic differentiator for producing a differentiation output (¶94, This backpropagator may use an automatic differentiator 940 to determine gradients of the various variables (weights, values, etc.) with the thermodynamic model)…
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use an automatic differentiator in the training process as disclosed by Harvey in the PINN specifically, as disclosed by Vos in view of McClenny, Bihorac, Asao, Shahkarami and Fraces, in order to achieve more accurate results (¶94, such that the thermodynamic model output more closely approaches the target demand curve).
Regarding claim 24, Vos further teaches: The system of claim 23, wherein the PIANN includes a physics-informed learning unit that uses a physical loss function (¶33, including physics informed constraints in the loss function).
Regarding claim 25, Vos in view of McClenny, Bihorac, Asao, Shahkarami and Harvey fails to teach: The system of claim 24, wherein the physical loss function is formulated based on a Buckley-Leverett (BL) equation.
However, in the same field of endeavor, Fraces teaches: wherein the physical loss function is formulated based on a Buckley-Leverett (BL) equation (Section 4.2.1, equation 10 is a loss function formulated based on the Buckley-Leverett).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use the Buckley-Leverett equation in the loss function as disclosed by Fraces in the system disclosed by Vos in view of McClenny, Bihorac, Asao, Shahkarami and Harvey to accurately model specific physical problems (Section 5.4, to produce high fidelity surrogates that honor governing laws).
Regarding claim 27, Vos further teaches: The system of claim 24, wherein the physics-informed learning unit enforces one or more initial conditions and one or more boundary conditions (¶51, Short-term deterministic predictions may be configured with appropriate initial conditions and boundary conditions).
Claims 28-30 are rejected under 35 U.S.C. 103 as being unpatentable over Vos in view of Asao, McClenny and Fraces.
Regarding claim 28, Vos teaches: A computer system implementing a deep neural network (DNN) (¶29, The physics-informed neural network may include a deep learning model) comprising: at least one processor; and a memory storing computer instructions, wherein, when the at least one processor executes the computer instructions (¶26, The networked computer environment 100 may include a computer 102 with a processor 104 and a data storage device 106), the DNN architecture is used to generate a DNN output that respects one or more predetermined physical constraints (¶15, The physics-informed artificial intelligence-based regional climate model may be trained using high fidelity weather and climate simulations. The physical constraints may be used to train the model and may ensure the model is retaining awareness of the physical system beyond interpolating between data points); wherein the DNN includes an encoder and a decoder coupled together in an embedding space (¶14, More specifically, a non-linear reduced order model or autoencoder is paired with a neural network in the latent space and with physics-informed partial differential equation (PDE) constraints).
Vos fails to teach: an encoder and a decoder coupled in an embedding space through an attention layer including an attention mechanism.
However, in the same field of endeavor, Asao teaches: an encoder and a decoder coupled in an embedding space through an attention layer including an attention mechanism (Fig. 2, attention layer 160 between encoder 144 and decoder 154).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize an attention layer between encoder and decoder as disclosed by Asao in the model disclosed by Vos to get more accurate results (¶91, the probability that question-answering device 122 outputs a right answer becomes higher).
Vos in view of Asao does not explicitly teach: and wherein the attention layer is configured to: use a full sequence of input data to determine which parts of input parameters of a… PDE are relevant, based on full sequences over a spatial/temporal grid; and automatically identify shock locations in the solution of at least one… PDE.
However, in the same field of endeavor, McClenny teaches: and wherein the attention layer is configured to: use a full sequence of input data to determine which parts of input parameters of a… PDE are relevant, based on full sequences over a spatial/temporal grid (pg. 6, Fig. 2, attention weights learned for the PDE – see Fig. 2, caption on pg. 7, Learned weights across the spatio-temporal domain); and automatically identify shock locations in the solution of at least one… PDE (pg. 1, Abstract, it has been observed that the original PINN algorithm can produce inaccuracies around sharp transitions in the solution, as well as display instability during training… Self-adaptive PINNs are based on trainable weights that can automatically force the neural network to focus on difficult regions of the solution).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use an attention layer that detects shocks as disclosed by McClenny in the system disclosed by Vos in view of Asao to reduce computational cost (pg. 6, Section 4.4, more accurate solutions of PDEs with smaller computational cost).
Vos in view of Asao and McClenny fails to teach: hyperbolic PDE… hyperbolic PDE…
However, in the same field of endeavor, Fraces teaches: hyperbolic PDE… hyperbolic PDE (Section 5.3.1, in section 5.4 where we are interested in the problem of gravity segregation where the added physics acts as another closure condition for the hyperbolic problem)…
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to solve a hyperbolic PDE as disclosed by Fraces using the system disclosed by Vos in view of Asao and McClenny to accurately model particular physics-based problems (Section 5.4, This method is a new way to produce high fidelity surrogates that honor governing laws).
Regarding claim 29, Vos in view of McClenny and Fraces fails to teach: The system of claim 28, wherein the attention layer is coupled to a plurality of recurrent neural network (RNN) units of the encoder and coupled to a plurality of RNN units of the decoder.
However, in the same field of endeavor, Asao teaches: wherein the attention layer is coupled to a plurality of recurrent neural network (RNN) units of the encoder and coupled to a plurality of RNN units of the decoder (Fig. 2, attention layer 160 between encoder 144 and decoder 154, both of which are composed of GRUs).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to utilize an attention layer connected to RNN units of an encoder and decoder as disclosed by Asao in the model disclosed by Vos in view of McClenny and Fraces to get more accurate results (¶91, the probability that question-answering device 122 outputs a right answer becomes higher).
Regarding claim 30, Vos further teaches: The system of claim 29, wherein one or more weights or parameters of the encoder, the decoder, and/or the attention layer are trained using a physics-informed learning unit that uses a physical loss function (¶33, The neural network in the PINN-based ROM 204 is also improved by including physics informed constraints in the loss function of the full order state) and that respects the one or more predetermined physical constraints (¶15, The physics-informed artificial intelligence-based regional climate model may be trained using high fidelity weather and climate simulations. The physical constraints may be used to train the model and may ensure the model is retaining awareness of the physical system beyond interpolating between data points).
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HARRISON CHAN YOUNG KIM whose telephone number is (571)272-0713. The examiner can normally be reached Monday - Thursday 9:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Cesar Paula can be reached on (571) 272-4128. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/HARRISON C KIM/Examiner, Art Unit 2145
/CESAR B PAULA/Supervisory Patent Examiner, Art Unit 2145