DETAILED ACTION
Claims 1-20 are presented for examination.
This office action is in response to the amendment filed 20-OCTOBER-2025.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The amendment filed 20-OCTOBER-2025 in response to the non-final office action mailed 20-JUNE-2025 has been entered. Claims 1-20 remain pending in the application.
With regard to the non-final office action’s rejection under 35 U.S.C. 101, the amendments to the claims are not sufficient to overcome the original rejection of the claims as being directed to an abstract idea.
With regard to the non-final office action’s rejections under 35 U.S.C. 103, the amendments to the claims necessitated a new consideration of the art. After this consideration, the examiner respectfully disagrees with the applicant’s arguments that the art referenced in the previous office action does not teach the amended claim limitations. A new 103 rejection over the prior art is provided below.
Regarding the applicant’s arguments that claims 1, 10, and 19 have incorporated the subject matter of claims 8 and 17 and hence are allowable, the examiner respectfully disagrees that the amended limitations of claims 1, 10, and 19 and the limitations of claims 8 and 17 are equivalent. The examiner’s detailed reasoning, including the relevant mappings to Gabrielson and Kim, is set forth in the Response to Arguments section below.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
MPEP 2106.04(a)(2)(III): “Accordingly, the ‘mental processes’ abstract idea grouping is defined as concepts performed in the human mind, and examples of mental processes include observations, evaluations, judgments, and opinions.”
Further, the MPEP recites: “The courts do not distinguish between mental processes that are performed entirely in the human mind and mental processes that require a human to use a physical aid (e.g., pen and paper or a slide rule) to perform the claim limitation.”
MPEP 2106.04(a)(2)(I): “The mathematical concepts grouping is defined as mathematical relationships, mathematical formulas or equations, and mathematical calculations.”
Regarding claim 1:
Step 2A, Prong 1 will now be evaluated for this claim:
A judicial exception is recited in this claim as it recites a mental process:
A system for machine learning architecture for predicting the execution of a subsequent data process:
Prediction at a high level is an evaluation that can be performed in the human mind. Furthermore, the ‘execution of a subsequent data process’ is broadly stated and may refer to any form of data analytics or allocation, which may be performed in the human mind.
generating from a plurality of executed data processes a sequence of data records, the plurality of executed data processes executing historical resource allocations from a user associated with a first identifier to another user associated with a second identifier
This claim describes an evaluation of past actions and allocations in order to determine records, which is an evaluation performable in the human mind i.e., compiling a report.
derive record features based on the sequence of data records representing the historical resource allocations for identifying irregular record features
Deriving record features describes recording important parts of the data in order to best represent it, which would be an evaluation.
determine a prospective subsequent data process for executing a future resource allocation associated with the first identifier and the second identifier based on a neural network model and the derived record features
Determining a prospective resource allocation would be an evaluation. The neural network model would be part of a generic computer.
determine, based on the neural network model, a selection score associated with the prospective subsequent data process; and when the selection score is above a minimum threshold
Determining a selection score would be determining whether a particular resource allocation overcomes a threshold; comparing the data process to a particular requirement is an evaluation.
A judicial exception is recited in this claim as it describes a mathematical concept:
the neural network model associated with a network loss including a selective prediction loss which is a function of selective empirical risk and an empirical coverage
A selective prediction loss that is a function of selective empirical risk and an empirical coverage describes a mathematical formula.
Step 2A, Prong 2 will now be evaluated for this claim:
Furthermore, the additional elements:
a processor; and a memory coupled to the processor and storing processor-executable instructions that, when executed, configure the processor to
are interpreted as a general purpose computer under MPEP 2106.05(f)
Furthermore, MPEP 2106.05(g) (Insignificant Extra-Solution Activity) identifies mere data gathering and post-solution activity as insignificant extra-solution activity.
The following steps are merely post-solution activity:
cause to display, at a display device, the prospective resource allocation corresponding to the second identifier.
Displaying a result is a form of post-solution activity.
The additional elements have been considered both individually and as an ordered combination in order to determine whether they integrate the exception into a practical application; they do not. Therefore, no meaningful limits are imposed on practicing the abstract idea.
Therefore, the claim is directed to an abstract idea.
Step 2B will now be discussed with regards to this claim:
The claim does not provide an inventive concept. There are no additional elements, beyond the insignificant extra-solution activity identified in Step 2A Prong Two, that provide an inventive concept.
Adding insignificant extra-solution activity to the judicial exception, e.g., mere data gathering in conjunction with a law of nature or abstract idea (such as a step of obtaining information about credit card transactions so that the information can be analyzed by an abstract mental process, as discussed in CyberSource v. Retail Decisions, Inc., 654 F.3d 1366, 1375, 99 USPQ2d 1690, 1694 (Fed. Cir. 2011); see MPEP § 2106.05(g)), does not overcome a rejection.
The additional elements have been considered both individually and as an ordered combination as to whether they warrant significantly more consideration; they do not.
The claim is ineligible.
Regarding claim 2, which depends upon claim 1:
This claim further limits the neural network of claim 1. Further specifying the neural network does not overcome the parent claim’s rejection. The architecture of the neural network that is recited is not demonstrated to provide an improvement to the technology and is not essential to the function of the system. Therefore, in light of their parent claims, claims 2, 11, and 20 are considered to be insignificant extra-solution activity to the abstract idea.
This claim is rejected for incorporating the parent claim in full.
This claim is ineligible.
Regarding claim 3, which depends upon claim 2:
The following would be a generic computer function:
wherein the neural network model is configured to generate one or more outputs associated with one or more time steps, the one or more outputs comprising a predicted date-delta, a predicted normalized amount, the selection score, and an auxiliary prediction including an auxiliary amount and an auxiliary date
A neural network generating outputs is a generic computer function.
This claim is ineligible.
Regarding claim 4, which depends upon claim 3:
This claim further limits the training of the neural network of claim 3. Further specifying the training of the neural network does not overcome the parent claim’s rejection.
This claim is rejected for incorporating the parent claim in full.
This claim is ineligible.
Regarding claim 5, which depends upon claim 3:
The following would be a mental process:
based on the selection score, associate a weight with an identified data record corresponding to an irregular record feature
Association of a particular value with data would be a mental process that can be performed with the aid of pen and paper.
This claim is ineligible.
Regarding claim 6, which depends upon claim 5:
The following would be a mental process:
associating a zero weight to the identified data record marks the identified data record as the irregular record feature for abstaining from generating a prospective resource allocation
Association of a particular value with data would be a mental process that can be performed with the aid of pen and paper. Furthermore, abstaining from using a particular data record would also be a mental process, as a human would be able to disregard the marked data record.
This claim is ineligible.
Regarding claim 7, which depends upon claim 1:
The following would be a mental process:
generate one or more adjusted prospective resource allocations corresponding to the second identifier based on self-attention operations
Creation of a prospective resource allocation would be an evaluation.
The following would be a mathematical calculation:
wherein the adjusted prospective resource allocations comprise a dynamic weighted average of prior observed resource allocation values
A dynamic weighted average describes a calculation from the resource allocation values, which would be a mathematical calculation.
This claim is ineligible.
Regarding claim 8, which depends upon claim 1:
This claim describes and provides an equation, which would be a mathematical formula and thus an abstract idea.
This claim is ineligible.
Regarding claim 9, which depends upon claim 8:
This claim describes and provides an equation, which would be a mathematical formula and thus an abstract idea.
This claim is ineligible.
Claims 10-18 recite a method that parallels the system of claims 1-9 respectively. Therefore, the analysis discussed above with respect to claims 1-9 also applies to claims 10-18 respectively. Accordingly, claims 10-18 are rejected based on substantially the same rationale as set forth above with respect to claims 1-9.
Claims 19-20 recite a non-transitory computer readable storage medium that parallels the system of claims 1-2 respectively. Therefore, the analysis discussed above with respect to claims 1-2 also applies to claims 19-20 respectively. Accordingly, claims 19-20 are rejected based on substantially the same rationale as set forth above with respect to claims 1-2.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-7, 10-16, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Gabrielson et al. (Pub. No. US 20200301740 A1, filed March 22nd 2019, hereinafter Gabrielson) in view of Watson (Pub. No. US 20180248895 A1, filed February 27th 2017, hereinafter Watson), further in view of Kim et al. (Pub. No. WO 2021002719 A1, filed July 3rd 2020, hereinafter Kim).
Regarding claim 1:
Claim 1 recites:
A system for machine learning architecture for predicting the execution of a subsequent data process comprising: a processor; and a memory coupled to the processor and storing processor-executable instructions that, when executed, configure the processor to: generating from a plurality of executed data processes a sequence of data records, the plurality of executed data processes executing historical resource allocations from a user associated with a first identifier to another user associated with a second identifier; derive record features based on the sequence of data records representing the historical resource allocations for identifying irregular record features; determine a prospective subsequent data process for executing a future resource allocation associated with the first identifier and the second identifier based on a neural network model and the derived record features, the neural network model associated with a network loss including a selective prediction loss which is a function of selective empirical risk and an empirical coverage; determine, based on the neural network model, a selection score associated with the prospective resource allocation; and when the selection score is above a minimum threshold, cause to display, at a display device, the prospective resource allocation corresponding to the second identifier.
Gabrielson discloses a system for machine learning architecture for predicting the execution of a subsequent data process comprising: a processor; and a memory coupled to the processor and storing processor-executable instructions that, when executed, configure the processor to: generating from a plurality of executed data processes a sequence of data records, the plurality of executed data processes executing historical resource allocations from a user associated with a first identifier to another user associated with a second identifier
Gabrielson teaches that its systems are implemented by computers (Paragraph 67), which would include a processor; and a memory coupled to the processor and storing processor-executable instructions that, when executed, configure the processor.
Furthermore, Gabrielson teaches selecting specific users within an organization to have administrative privileges that allow the selected users to set workload priorities (Paragraph 41) as well as to control which workloads have the ability to access a compute instance pool (Paragraph 39). These users would be users with a first identifier, wherein the users are allocating resources to other users in the organization with a second identifier. Furthermore, users may view historical data related to the use of compute instances in a compute instance pool by a primary workload (Paragraph 39), which would be a generated plurality of executed data processes, as it describes the historical use of the compute instance pool, i.e., executed data processes which executed historical resource allocations.
Gabrielson discloses derive record features based on the sequence of data records representing the historical resource allocations [for identifying irregular record features]:
Gabrielson teaches that a forecasting and scheduling service can obtain metrics collected by a data monitoring service to learn attributes associated with users’ workloads (Paragraph 43). This would be deriving record features based on the sequence of data records representing the historical resource allocations.
However, Gabrielson does not teach that this is for identifying irregular record features. Identification of irregular record features is taught by Watson further below.
Gabrielson discloses determine a prospective subsequent data process for executing a future resource allocation associated with the first identifier and the second identifier based on a neural network model and the derived record features:
Gabrielson teaches the optimization of allocation of computing resources among computing workloads (Paragraph 15). The optimization would be a prospective subsequent data process for executing a future resource allocation, wherein this optimization is performed using the users described previously with the first and second identifiers as well as the derived record features.
Furthermore, Gabrielson teaches that this system may use a neural network (Paragraph 86).
Gabrielson discloses a selective prediction [loss which is a function] of selective empirical risk and an empirical coverage:
Gabrielson teaches that for individual features the user may adjust a confidence threshold such that only more extreme deviations of that particular feature result in an alert (Paragraph 39). This is an example of a selective prediction: when an alert would be unclear, the instances that result in the unclarity are filtered out. Furthermore, this is based on empirical coverage, as the threshold is approximated such that true alerts would come through. Likewise, it is based on empirical risk, as the alerts result from the training data of the neural network (Paragraph 35).
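For clarity of the record, the following non-limiting Python sketch illustrates the examiner’s understanding of per-feature confidence thresholding as a selective prediction; all names and values are hypothetical and are not drawn from Gabrielson.

```python
import numpy as np

# Hypothetical illustration: per-feature confidence thresholds filter out
# low-confidence deviations so that only extreme deviations of a particular
# feature result in an alert, i.e., the system abstains on unclear instances.
def selective_alerts(deviation_scores: np.ndarray, thresholds: np.ndarray) -> np.ndarray:
    """Return a mask selecting only deviations exceeding the user-adjusted
    per-feature threshold; all other instances are filtered out."""
    return deviation_scores > thresholds

scores = np.array([0.2, 0.95, 0.6])      # model-derived deviation per feature
thresholds = np.array([0.5, 0.9, 0.7])   # user-adjusted confidence thresholds
print(selective_alerts(scores, thresholds))  # [False  True False]
```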
The concept of a loss function in a neural network is explicitly taught by Kim below.
Watson discloses identifying irregular record features:
Watson in the same field of endeavor of learning methods for neural networks teaches identifying anomalous user activity (Paragraph 12). In combination with Gabrielson, Watson’s identification of anomalous user activity would pair with Gabrielson’s record features of historical data records produced by users’ workloads to identify irregular record features.
Gabrielson, Watson, and the present application are all analogous art because they are in the same field of endeavor of learning methods for neural networks.
Watson discloses determine, based on the neural network model, a selection score associated with the prospective subsequent data process:
Watson teaches monitoring user activity in order to determine a risk score to determine if a behavior is anomalous (Paragraph 12). The risk score would be analogous to a selection score as it would select the user as an anomalous user based on their activities, which includes their request for particular computational resources (Paragraph 16) that would be a subsequent data process.
Watson discloses when the selection score is above a minimum threshold, cause to display, at a display device, the prospective resource allocation corresponding to the second identifier:
Watson teaches that when a risk score passes a threshold, an alert is generated (Paragraph 27). The alert would constitute a display of the prospective resource allocation corresponding to the second identifier, as the alert provides information on the potentially suspicious behavior (e.g., the resource allocation) (Paragraph 29) including, for example, unusual IP addresses (Paragraph 42) that would act as an identifier.
Kim in the same field of endeavor of machine learning discloses the neural network model associated with a network loss:
Kim recites: “a loss function for minimizing a difference between a first neural network parameter and a plurality of second model parameters based on the entire pre-trained video.”
Kim teaches a loss function for use in a neural network, which may be combined with the teachings of Gabrielson for reason of the advantage below.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement a system that utilized the teachings of Gabrielson, the teachings of Watson, and the teachings of Kim. This would have provided the advantage of improving the security of computational systems in a shared-resource environment (Watson, Paragraph 1) as well as the advantage of improving transmission of neural network parameters (Kim, Background, “In this case, efficient transmission of neural network parameters is essential”).
Regarding claim 2, which depends upon claim 1:
Claim 2 recites:
The system of claim 1, wherein the neural network model is based on a residual long short-term memory (LSTM) network including blocks of stacked LSTMs with residual connections between blocks.
Gabrielson in view of Watson teach the system of claim 1 upon which claim 2 depends. However, neither Gabrielson nor Watson teaches the limitation of claim 2:
Kim recites: “The neural network may include a deep neural network. Neural networks include […] LSTM”.
Kim in the same field of endeavor of learning methods for neural networks teaches a neural network that may include an LSTM. Furthermore:
Kim recites: “At least one of the first neural network and the plurality of second neural networks may include a light-weight residual dense block including at least one convolutional layer.”
This teaches that the neural networks of Kim may be LSTMs connected through residual blocks, i.e., blocks using residual connections.
Kim teaches that the plurality of second neural networks may include residual blocks including at least one convolutional layer. This is interpreted as saying that the plurality specifically may include these blocks, meaning that the blocks themselves are neural networks. Therefore, Kim’s further teaching that the neural networks may be LSTMs supports stacked LSTMs, as the residual blocks are stacked together through their residual connections.
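For clarity of the record, the following non-limiting Python (PyTorch) sketch shows one way to realize blocks of stacked LSTMs with residual connections between blocks, consistent with the examiner’s interpretation above; all hyperparameters are hypothetical.

```python
import torch
import torch.nn as nn

class ResidualLSTMBlock(nn.Module):
    """One block of stacked LSTM layers wrapped by a residual connection."""
    def __init__(self, dim: int, layers: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(dim, dim, num_layers=layers, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)
        return x + out  # residual connection around the stacked LSTM block

class ResidualLSTMNetwork(nn.Module):
    """Blocks of stacked LSTMs chained through their residual connections."""
    def __init__(self, dim: int = 64, blocks: int = 3):
        super().__init__()
        self.blocks = nn.ModuleList(ResidualLSTMBlock(dim) for _ in range(blocks))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            x = block(x)
        return x

net = ResidualLSTMNetwork()
print(net(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```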
Gabrielson, Watson, Kim and the present application are all analogous art because they are in the same field of endeavor of learning methods for neural networks.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement a system that utilized the teachings of Gabrielson in view of Watson and the teachings of Kim. This would have provided the advantage of improving transmission of neural network parameters (Kim, Background, “In this case, efficient transmission of neural network parameters is essential”).
Regarding claim 3, which depends upon claim 2:
Claim 3 recites:
The system of claim 2, wherein the neural network model is configured to generate one or more outputs associated with one or more time steps, the one or more outputs comprising a predicted date-delta, a predicted normalized amount, the selection score, and an auxiliary prediction including an auxiliary amount and an auxiliary date.
Gabrielson in view of Watson further in view of Kim disclose the system of claim 2 upon which claim 3 depends. Furthermore, regarding the limitation of claim 3:
Gabrielson teaches a forecasting and scheduling service that uses metrics collected by a data monitoring service in order to produce outputs associated with various workloads over time (Paragraph 37), which would be output data associated with one or more time steps. The outputs may include predictions about future resource usage patterns (Paragraph 37), i.e., the predicted normalized amount, as it would be the amount of compute resources normally used.
Furthermore, Gabrielson teaches that the data monitoring service that collects the data may be a part of a different service (Paragraph 37), which would make it auxiliary to the main system, and hence produce auxiliary predictions including auxiliary amounts and dates.
Furthermore, Watson has previously taught a selection score (Paragraph 12).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement a system that utilized the teachings of Gabrielson in view of Kim and the teachings of Watson. This would have provided the advantage of improving the security of computational systems in a shared-resource environment (Watson, Paragraph 1).
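For clarity of the record, the following non-limiting Python (PyTorch) sketch illustrates a model head emitting, at each time step, the outputs recited in claim 3; all dimensions and names are hypothetical.

```python
import torch
import torch.nn as nn

class MultiOutputHead(nn.Module):
    """Per-time-step heads for a date-delta, a normalized amount, a selection
    score, and an auxiliary (amount, date) prediction."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.date_delta = nn.Linear(hidden, 1)
        self.norm_amount = nn.Linear(hidden, 1)
        self.selection = nn.Sequential(nn.Linear(hidden, 1), nn.Sigmoid())
        self.aux = nn.Linear(hidden, 2)  # auxiliary amount and auxiliary date

    def forward(self, h: torch.Tensor):
        # h: (batch, time, hidden) -> one output set per time step
        return self.date_delta(h), self.norm_amount(h), self.selection(h), self.aux(h)

outs = MultiOutputHead()(torch.randn(2, 10, 64))
print([o.shape for o in outs])
```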
Regarding claim 4, which depends upon claim 3:
Claim 4 recites:
The system of claim 3, wherein training of the neural network model is based on the auxiliary amount and the auxiliary date.
Gabrielson in view of Watson further in view of Kim disclose the system of claim 3 upon which claim 4 depends. Furthermore, regarding the limitation of claim 4:
Gabrielson specifies that the data used to train its neural network may not overlap with the historical data about a particular computing workload whose future resource usage is to be predicted by the trained neural network (Paragraph 71). That is, the neural network may be trained on data that is outside of the historical data for the computing workloads whose resource usage it is predicting, which would be auxiliary amounts and auxiliary dates, as the data of Gabrielson has previously been described as being collected over time (Paragraph 37).
Regarding claim 5, which depends upon claim 3:
Claim 5 recites:
The system of claim 3, wherein the processor-executable instructions, when executed, configure the processor to: based on the selection score, associate a weight with an identified data record corresponding to an irregular record feature.
Gabrielson in view of Watson further in view of Kim disclose the system of claim 3 upon which claim 5 depends. Furthermore, regarding the limitation of claim 5:
Watson teaches that its risk score, which has previously been analogous to the selection score of the present application, may have weights such that documents with particular irregular traits, such as a high number of social security numbers listed on it, would be considered more risky (Paragraph 40). This would describe a risk (or selection score) that associates a weight, or a higher level of risk, with an identified data record, here a document, corresponding to an irregular record feature, such as particularly sensitive data.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement a system that utilized the teachings of Gabrielson in view of Kim and the teachings of Watson. This would have provided the advantage of improving the security of computational systems in a shared-resource environment (Watson, Paragraph 1).
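For clarity of the record, the following non-limiting Python sketch illustrates associating a weight with an identified data record based on an irregular feature, in the manner of Watson’s weighted risk scoring as understood by the examiner; all values are hypothetical.

```python
import numpy as np

records = ["doc_a", "doc_b", "doc_c"]          # identified data records
ssn_counts = np.array([0, 42, 3])              # irregular feature per record
weights = np.where(ssn_counts > 10, 2.0, 1.0)  # heavier weight when irregular
risk_scores = weights * ssn_counts             # weighted risk (selection) score
print(dict(zip(records, risk_scores)))         # doc_b is weighted as riskier
```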
Regarding claim 6, which depends upon claim 5:
Claim 6 recites:
The system of claim 5, wherein associating a zero weight to the identified data record marks the identified data record as the irregular record feature for abstaining from generating a prospective resource allocation.
Gabrielson in view of Watson further in view of Kim teach the system of claim 5 upon which claim 6 depends. However, neither Gabrielson nor Watson entirely teaches the limitation of claim 6:
Watson does teach the marking of an identified data record with an irregular feature, as described above (Paragraph 40). However, Watson does not teach the association of a zero weight for abstaining from generating a prospective resource allocation.
Kim recites: “As described above, the processor 130 may perform binary masking for weights equal to 0 for residuals and perform K-means clustering for weights other than zero.”
Kim teaches binary masking for weights equal to 0, which would be analogous to a zero weight for abstaining from generating a prediction, such as the prospective resource allocation of Gabrielson which has been previously discussed.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement a system that utilized the teachings of Gabrielson and the teachings of Watson and the teachings of Kim. This would have provided the advantage of improving transmission of neural network parameters (Kim, Background, “In this case, efficient transmission of neural network parameters is essential”) as well as improving the security of computational systems in a shared-resource environment (Watson, Paragraph 1).
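For clarity of the record, the following non-limiting Python sketch illustrates zero-weight binary masking as the examiner reads it onto the claim: a zero weight marks a record as irregular and the system abstains from generating an allocation for it; all values are hypothetical.

```python
import numpy as np

weights = np.array([1.0, 0.0, 0.7])          # a zero weight flags record 1
records = np.array([120.0, 9999.0, 95.0])    # candidate allocation inputs
mask = weights != 0                          # binary masking of zero weights
kept = records[mask]                         # abstain on the masked record
print(kept)  # [120.  95.]
```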
Regarding claim 7, which depends upon claim 1:
Claim 7 recites:
The system of claim 1, wherein the processor-executable instructions, when executed, configure the processor to: generate one or more adjusted prospective resource allocations corresponding to the second identifier based on self-attention operations, wherein the adjusted prospective resource allocations comprise a dynamic weighted average of prior observed resource allocation values.
Gabrielson in view of Watson further in view of Kim disclose the system of claim 1 upon which claim 7 depends. Gabrielson discloses generate one or more adjusted prospective resource allocations corresponding to the second identifier, though not based on self-attention operations:
Gabrielson teaches optimization of compute resources for workloads associated with specific users (Paragraph 15), where the optimization of compute resources would be an adjusted prospective resource allocation and the users associated with the workloads would be the second identifier.
Furthermore, Gabrielson teaches wherein the adjusted prospective resource allocations comprise a dynamic weighted average of prior observed resource allocation values:
Gabrielson teaches a capacity forecasting and scheduling service for generating predictions of compute capacity usage, or resource allocations, from historical trends (Paragraph 54). The output of a neural network would be analogous to a dynamic weighted average, as neural networks are designed to predict likely output values from historical inputs using weights, wherein the outputs would adapt with new inputs (which would be dynamic).
However, neither Gabrielson nor Watson entirely teaches self-attention operations:
Kim recites: “The neural network may include a deep neural network. Neural networks include […] Attention Network”
Kim teaches the use of an attention network, which would use self-attention operations.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to implement a system that utilized the teachings of Gabrielson in view of Watson and the teachings of Kim. This would have provided the advantage of improving transmission of neural network parameters (Kim, Background, “In this case, efficient transmission of neural network parameters is essential”).
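For clarity of the record, the following non-limiting Python (PyTorch) sketch illustrates why self-attention output can be characterized as a dynamic weighted average of prior observed values: the softmax rows sum to one and the weights are recomputed from the inputs themselves. All shapes are hypothetical.

```python
import torch
import torch.nn.functional as F

def self_attention(values: torch.Tensor) -> torch.Tensor:
    # values: (time, dim); queries, keys, and values share the same input
    scores = values @ values.T / values.shape[-1] ** 0.5
    weights = F.softmax(scores, dim=-1)  # each row sums to 1: an average
    return weights @ values              # dynamic weighted average of inputs

prior = torch.randn(5, 8)  # five prior observed resource-allocation vectors
print(self_attention(prior).shape)  # torch.Size([5, 8])
```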
Claims 10-16 recite a method that parallels the system of claims 1-7 respectively. Therefore, the analysis discussed above with respect to claims 1-7 also applies to claims 10-16 respectively. Accordingly, claims 10-16 are rejected based on substantially the same rationale as set forth above with respect to claims 1-7 respectively.
Claims 19-20 recite a non-transitory computer readable storage medium that parallels the system of claims 1-2 respectively. Therefore, the analysis discussed above with respect to claims 1-2 also applies to claims 19-20 respectively. Accordingly, claims 19-20 are rejected based on substantially the same rationale as set forth above with respect to claims 1-2 respectively.
Allowable Subject Matter
Claims 8-9 and 17-18 contain allowable subject matter.
Claims 8-9 and 17-18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
Regarding claim 8:
The references of record alone or in combination do not disclose or suggest the limitations found within claim 8, which recites:
The system of claim 1, wherein the selective prediction loss expressed as
\mathcal{L}_{(f,g)} = \hat{r}(f,g) + \lambda\,\Psi\left(c - \hat{\phi}(g)\right)
wherein
\hat{r}(f,g) = \frac{\frac{1}{m}\sum_{i=1}^{m} \ell\left(f(x_i), y_i\right) g(x_i)}{\hat{\phi}(g)}
represents the selective empirical risk, and
\hat{\phi}(g) = \frac{1}{m}\sum_{i=1}^{m} g(x_i)
represents the empirical coverage, f is a prediction function, g is a selection function for generating the selection score, c is a target coverage, lambda is a balancing hyperparameter, and psi is a quadratic penalty function.
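For clarity of the record, the following non-limiting Python (PyTorch) sketch shows the examiner’s understanding of a selective prediction loss of the claimed form, assuming a standard selective-risk formulation in which psi is the quadratic penalty max(0, a)^2; all tensor values and the hyperparameters c and lambda are hypothetical.

```python
import torch

def selective_prediction_loss(f_loss: torch.Tensor, g: torch.Tensor,
                              c: float = 0.8, lam: float = 32.0) -> torch.Tensor:
    # f_loss: per-example loss l(f(x_i), y_i); g: selection scores g(x_i)
    coverage = g.mean()                            # empirical coverage phi(g)
    sel_risk = (f_loss * g).mean() / coverage      # selective empirical risk
    psi = torch.clamp(c - coverage, min=0.0) ** 2  # quadratic penalty psi
    return sel_risk + lam * psi

f_loss = torch.tensor([0.3, 1.2, 0.1, 0.7])
g = torch.tensor([0.9, 0.2, 1.0, 0.6])
print(selective_prediction_loss(f_loss, g))
```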
Kim teaches the use of a loss function, as it recites “The total loss function for training a neural network may include an SR loss and a weight residual (WR) cost. SR loss can be expressed as in Equation 1”. However, Kim does not teach the precise equation nor the components of the loss function as described in claim 8, nor does Gabrielson or Watson.
Regarding claim 9:
The references of record alone or in combination do not disclose or suggest the limitations found within claim 9, which recites:
The system of claim 8, wherein the network loss includes a combination of the selective prediction loss and an auxiliary loss expressed as
[Equation image media_image4.png: the network loss as a combination of the selective prediction loss and an auxiliary loss]
wherein
[Equation image media_image5.png: the auxiliary loss]
Kim teaches the use of a loss function, as it recites “The total loss function for training a neural network may include an SR loss and a weight residual (WR) cost. SR loss can be expressed as in Equation 1”. However, Kim does not teach the precise equation nor the components of the loss function as described in claim 9, nor does Gabrielson or Watson.
Claims 17-18 recite a method that parallels the system of claims 8-9 respectively. Therefore, the analysis discussed above with respect to claims 8-9 also applies to claims 17-18 respectively. Accordingly, claims 17-18 contain allowable subject matter based on substantially the same rationale as set forth above with respect to claims 8-9 respectively.
Response to Arguments
Applicant’s arguments filed 20-OCTOBER-2025 have been fully considered, but they are not fully persuasive.
Regarding the applicant’s remarks on the non-final office action’s 101 rejection of the claims as an abstract idea, the applicant argues that, regardless of the original claims, the amended claims are not directed to an abstract idea. The examiner respectfully requests the applicant’s consideration of the following:
The recitation of execution of data processes, or variants thereof, is considered to be a generic computer function because it is recited at a high level of generality (see MPEP 2106.05(b)). Use of conventional computer functions, such as generic data processing, does not constitute use of a particular machine. Furthermore, merely executing a data process does not impose a meaningful limitation on the allocation of resources that claims 1, 10, and 19 describe.
Furthermore, regarding claims 2, 11, and 20, it has been previously shown that reciting a specific neural network does not necessarily overcome a claim’s rejection as an abstract idea (see Example 47, claim 2 of the July 2024 Subject Matter Eligibility Examples). The architecture of the neural network that is recited is not demonstrated to provide an improvement to the technology and is not essential to the function of the system. Therefore, in light of their parent claims, claims 2, 11, and 20 are considered to be insignificant extra-solution activity to the abstract idea.
Regarding the applicant’s remarks on the non-final office action’s 103 rejection of the claims, the applicant argues that none of Gabrielson, Watson, or Kim teach the amended limitations of these claims. As such, the applicant argues that all claims dependent on the above would additionally not be obvious under 103. However, the examiner believes that Gabrielson in view of Watson further in view of Kim does teach the amended limitations and respectfully requests applicant’s consideration of the following:
Regarding the applicant’s arguments that claims 1, 10, and 19 have incorporated the subject matter of claims 8 and 17 and hence are allowable, the examiner respectfully disagrees that the amended limitations of claims 1, 10, and 19 and the limitations of claims 8 and 17 are equivalent:
Gabrielson discloses a selective prediction [loss which is a function] of selective empirical risk and an empirical coverage:
Gabrielson teaches that for individual features the user may adjust a confidence threshold such that only more extreme deviations of that particular feature result in an alert (Paragraph 39). This is an example of a selective prediction: when an alert would be unclear, the instances that result in the unclarity are filtered out. Furthermore, this is based on empirical coverage, as the threshold is approximated such that true alerts would come through. Likewise, it is based on empirical risk, as the alerts result from the training data of the neural network (Paragraph 35).
The concept of a loss function in a neural network is explicitly taught by Kim below.
Kim in the same field of endeavor of machine learning discloses the neural network model associated with a network loss:
Kim recites: “a loss function for minimizing a difference between a first neural network parameter and a plurality of second model parameters based on the entire pre-trained video.”
Kim teaches a loss function for use in a neural network, which may be combined with the teachings of Gabrielson as it would have provided the advantage of improving transmission of neural network parameters (Kim, Background, “In this case, efficient transmission of neural network parameters is essential”).
Furthermore, regarding the applicant’s arguments that the prior art fails to teach stacked LSTM blocks, Kim recites the following:
“The neural network may include a deep neural network. Neural networks include […] LSTM”
“At least one of the first neural network and the plurality of second neural networks may include a light-weight residual dense block including at least one convolutional layer.”
“…receiving residuals between each of a plurality of second model parameters corresponding to a plurality of second neural networks for processing each of the plurality of temporal portions and the first model parameter…”
Kim teaches that the plurality of second neural networks may include residual blocks including at least one convolutional layer. The examiner interprets this as saying that the plurality specifically may include these blocks, meaning that the blocks themselves are neural networks. Therefore, Kim’s further teaching that the neural networks may be LSTMs supports stacked LSTMs, as the residual blocks are stacked together through their residual connections.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDRIA JOSEPHINE MILLER whose telephone number is (703)756-5684. The examiner can normally be reached Monday-Thursday: 7:30 - 5:00 pm, every other Friday 7:30 - 4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela Reyes can be reached at (571) 270-1006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/A.J.M./Examiner, Art Unit 2142
/Mariela Reyes/Supervisory Patent Examiner, Art Unit 2142