Prosecution Insights
Last updated: April 19, 2026
Application No. 17/746,317

INFORMATION PROCESSING METHOD, INFORMATION PROCESSING APPARATUS, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM

Final Rejection — §101, §103, §112

Filed: May 17, 2022
Examiner: LAHAM BAUZO, ALVARO SALIM
Art Unit: 2146
Tech Center: 2100 — Computer Architecture & Software
Assignee: Actapio Inc.
OA Round: 2 (Final)

Grant Probability: 33% (At Risk)
Expected OA Rounds: 3-4
Time to Grant: 3y 4m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 33% (1 granted / 3 resolved; -21.7% vs TC avg)
Interview Lift: +100.0% (resolved cases with an interview vs. without)
Typical Timeline: 3y 4m average prosecution; 27 applications currently pending
Career History: 30 total applications across all art units

Statute-Specific Performance

Statute   Allow Rate   vs TC Avg
§101      32.4%        -7.6%
§103      44.3%        +4.3%
§102       7.3%        -32.7%
§112      16.0%        -24.0%

Tech Center averages are estimates. Based on career data from 3 resolved cases.

Office Action

Rejections under §101, §103, and §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Amendments

This Office Action is in response to the amendment filed on November 5, 2025. Claims 1 and 16-18 have been amended. No claims have been cancelled. No new claims have been added. The objections and rejections from the prior correspondence that are not restated herein are withdrawn.

Response to Arguments

Applicant's arguments filed on November 5, 2025 have been fully considered. Applicant's arguments regarding the 35 U.S.C. 101 rejections of the previous office action have been fully considered but are not persuasive.

Applicant argues that the amended claims are directed to a specific technological improvement in neural network training: a method of coordinating dropout and batch normalization operations to suppress non-training targets during back propagation, and that the amended claims do not merely recite a mathematical concept standing alone, but a specific technological process for improving neural network training. Applicant argues that performing batch normalization after dropout addresses the technical problem of how to properly coordinate dropout and batch normalization to avoid wasting computational resources on non-training targets.

The examiner respectfully disagrees. According to MPEP § 2106.05(a), the judicial exception alone cannot provide the improvement; the improvement must be provided by one or more additional elements. Stating that performing batch normalization after dropout (both mathematical calculations under Step 2A Prong 1) avoids wasting computational resources on non-training targets (Applicant Arguments/Remarks, pg. 2) is simply the result of rearranging the order in which the mathematical calculations of dropout and batch normalization are performed. Moreover, the additional elements recited in claim 16, such as "a processor; and a memory storing instructions that, when executed by the processor, cause the processor to […]", simply use a computer as a tool to perform the abstract idea of batch normalization after dropout. Therefore, performing batch normalization after dropout does not provide an improvement to computer functionality or any other technology or technical field.

Additionally, the amended claim 1 recites the following abstract ideas:

generating […] the model in a manner in which the first partial model is trained by first dropout based on a first dropout rate and the second partial model is trained by second dropout based on a second dropout rate different from the first dropout rate. (Mathematical concept – generating/training a model based on a dropout rate involves mathematical calculations (see [0162]) – see MPEP § 2106.04(a)(2)(I))

wherein performing batch normalization after the first dropout and after the second dropout suppresses nodes that are not activated by the respective dropout from being subjected to batch normalization during back propagation, thereby suppressing non-training targets from being subjected to batch normalization. (Mathematical concept – performing batch normalization after a dropout suppresses nodes involves mathematical calculations (see [0162], [0148]-[0153], and [FIG. 10]) – see MPEP § 2106.04(a)(2)(I))

If claim limitations, under their broadest reasonable interpretation, cover performance of the limitations as a mental process, but for the recitation of generic computer components, then the claim limitations fall within the mathematical or mental process grouping of abstract ideas. Accordingly, the claim "recites" an abstract idea.

Applicant argues that the amended claims integrate any abstract idea into a practical application under Step 2A Prong 2 by providing a specific improvement to computer functionality. The examiner respectfully disagrees. Under Step 2A Prong 2, claim 1 recites the following additional elements:

An information processing method executed by a computer (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP § 2106.05(f).)

acquiring learning data (Mere data gathering – adding insignificant extra-solution activity of mere data gathering to the judicial exception – see MPEP § 2106.05(g).)

by using the learning data (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP § 2106.05(f).)

Since the claim as a whole, looking at the additional elements individually and in combination, does not contain any other additional elements that are indicative of integration into a practical application, the claim is directed to an abstract idea.

Applicant argues that, under Step 2B, the Examiner has not established that the specific combination of (1) training multiple partial models with different dropout rates, (2) performing batch normalization after dropout for each partial model, and (3) suppressing non-activated nodes from batch normalization during back propagation is well-understood, routine, or conventional. The examiner respectfully disagrees. Under Step 2B, claim 1 does not include additional elements that are sufficient to amount to significantly more than the judicial exception:

An information processing method executed by a computer, the information processing method comprising (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP § 2106.05(f).)

acquiring learning data (Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (WURC) – see MPEP § 2106.05(d)(II)(i) – receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information).)

by using the learning data (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP § 2106.05(f).)

Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible. Accordingly, independent claims 16-18 recite similar limitations as corresponding claim 1 and are rejected for similar reasons as claim 1 using similar teachings and rationale.

Furthermore, the additional elements recited in claim 16, when considered individually and in combination, do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea:

a processor; (Mere recitation of a generic computer component – see MPEP § 2106.05(b)(I))

and a memory storing instructions that, when executed by the processor, cause the processor to: (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP § 2106.05(f).)

Applicant argues that claims 2-15 are patent-eligible for at least the same reasons as independent claim 1, as they add specific technological details that further define the improved neural network architecture. The examiner respectfully disagrees. Dependent claims 6, 9, and 11-15 recite additional abstract ideas (see rejections below), and the additional elements recited in dependent claims 2-15, when considered individually and in combination, do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea, as shown in the detailed 35 U.S.C. 101 rejections below.

Applicant's arguments regarding the 35 U.S.C. 103 rejections of the previous office action have been fully considered but are moot because the newly applied prior art references to NOGUCHI, LI, POUDEL, SAMPSON, KUMAR, and SRIVASTAVA teach the added limitations, as shown in the 35 U.S.C. 103 rejections below.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(d):

(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA 35 U.S.C. 112, fourth paragraph:

Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA 35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

Claims 10 and 11 are rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which they depend, or for failing to include all the limitations of the claim upon which they depend.

Regarding Claim 10, the claim depends from claim 1 and merely restates the step of generating the model by performing batch normalization after the first dropout for training, without further narrowing the method of claim 1.

Regarding Claim 11, the claim depends from claim 1 and merely restates the step of generating the model by performing batch normalization after the second dropout for training, without further narrowing the method of claim 1.

Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.
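For orientation on the architecture in dispute, here is a minimal PyTorch-style sketch of the arrangement as the claim language and the examiner's citations describe it: two partial models trained with different dropout rates, batch normalization applied after dropout in each, and a combining layer that averages the partial outputs before softmax. This is an editorial illustration, not the applicant's actual implementation; all module and variable names (PartialModel, CombinedModel, the specific rates and dimensions) are hypothetical.

```python
# Hypothetical sketch of the claimed arrangement: two partial models with
# different dropout rates, batch normalization placed AFTER dropout in each,
# and a combining layer that averages the partial outputs before softmax.
import torch
import torch.nn as nn

class PartialModel(nn.Module):
    def __init__(self, in_dim: int, hidden: int, out_dim: int, dropout_rate: float):
        super().__init__()
        self.fc = nn.Linear(in_dim, hidden)
        self.drop = nn.Dropout(p=dropout_rate)  # zeroes units with prob. dropout_rate
        self.bn = nn.BatchNorm1d(hidden)        # batch norm placed after dropout
        self.out = nn.Linear(hidden, out_dim)

    def forward(self, x):
        h = torch.relu(self.fc(x))
        h = self.drop(h)   # dropout first ...
        h = self.bn(h)     # ... then batch normalization (the claimed ordering)
        return self.out(h)

class CombinedModel(nn.Module):
    def __init__(self, in_dim=128, hidden=256, out_dim=10):
        super().__init__()
        # Different dropout rates for the two partial models (claim 1).
        self.pm1 = PartialModel(in_dim, hidden, out_dim, dropout_rate=0.2)
        self.pm2 = PartialModel(in_dim, hidden, out_dim, dropout_rate=0.5)

    def forward(self, x):
        # Combining layer: average the partial outputs, then softmax (claims 6-9).
        combined = (self.pm1(x) + self.pm2(x)) / 2.0
        return torch.softmax(combined, dim=-1)

model = CombinedModel()
probs = model(torch.randn(32, 128))  # e.g., a mini-batch of 32 examples
```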
Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-18 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1: Claims 1-15 are directed to a process. Claims 16-18 are directed to a machine or an article of manufacture.

With respect to claims 1, 16, 17, and 18:

2A Prong 1: The claims recite an abstract idea. Specifically:

(Claims 1, 16, and 17) generating/generate […] the model […]; (Claim 18) the model being trained […]
[…] by using the learning data, the model in a manner in which the first partial model is trained by first dropout based on a first dropout rate and the second partial model is trained by second dropout based on a second dropout rate different from the first dropout rate. (Mathematical concept – generating/training a model based on a dropout rate involves mathematical calculations (see [0162]) – see MPEP § 2106.04(a)(2)(I))

(Claims 1 and 17) wherein performing batch normalization after the first dropout and after the second dropout suppresses nodes that are not activated by the respective dropout from being subjected to batch normalization during back propagation, thereby suppressing non-training targets from being subjected to batch normalization. (Mathematical concept – performing batch normalization after a dropout suppresses nodes involves mathematical calculations (see [0162], [0148]-[0153], and [FIG. 10]) – see MPEP § 2106.04(a)(2)(I))

(Claim 16) wherein performing batch normalization after the first dropout and after the second dropout suppresses nodes that are not activated by the respective dropout from being subjected to batch normalization during back propagation, thereby improving training efficiency. (Mathematical concept – performing batch normalization after a dropout suppresses nodes involves mathematical calculations (see [0162], [0148]-[0153], and [FIG. 10]) – see MPEP § 2106.04(a)(2)(I))

(Claim 18) wherein performing batch normalization after the dropout for the first partial model and after the dropout for the second partial model suppresses nodes that are not activated by the respective dropout from being subjected to batch normalization during back propagation. (Mathematical concept – performing batch normalization after a dropout suppresses nodes involves mathematical calculations (see [0162], [0148]-[0153], and [FIG. 10]) – see MPEP § 2106.04(a)(2)(I))

If claim limitations, under their broadest reasonable interpretation, cover performance of the limitations as a mental process, but for the recitation of generic computer components, then the claim limitations fall within the mathematical or mental process grouping of abstract ideas. Accordingly, the claim "recites" an abstract idea.

2A Prong 2: The additional elements recited in the claims do not integrate the abstract idea into a practical application, individually or in combination. Additional elements:

(Claim 1) An information processing method executed by a computer, the information processing method comprising (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP § 2106.05(f).)

(Claim 16) a processor; (Mere recitation of a generic computer component – see MPEP § 2106.05(b)(I))

(Claim 16) and a memory storing instructions that, when executed by the processor, cause the processor to: (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP § 2106.05(f).)

(Claims 1, 16, and 17) acquiring/acquire learning data used for training of a model including a first partial model and a second partial model; (Mere data gathering – adding insignificant extra-solution activity of mere data gathering to the judicial exception – see MPEP § 2106.05(g).)

(Claim 17) A non-transitory computer-readable storage medium having stored therein an information processing program for causing a computer to execute: (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP § 2106.05(f).)

(Claim 18) A non-transitory computer-readable storage medium having stored therein an information processing program for causing a computer to be operated as a model including a first partial model and a second partial model, (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP § 2106.05(f).)

Since the claim as a whole, looking at the additional elements individually and in combination, does not contain any other additional elements that are indicative of integration into a practical application, the claim is directed to an abstract idea.

2B: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. Additional elements:

(Claim 1) An information processing method executed by a computer, the information processing method comprising (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP § 2106.05(f).)

(Claim 16) a processor; (Mere recitation of a generic computer component – see MPEP § 2106.05(b)(I))

(Claim 16) and a memory storing instructions that, when executed by the processor, cause the processor to: (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP § 2106.05(f).)

(Claims 1, 16, and 17) acquiring/acquire learning data used for training of a model including a first partial model and a second partial model; (Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (WURC) – see MPEP § 2106.05(d)(II)(i) – receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information).)

(Claim 17) A non-transitory computer-readable storage medium having stored therein an information processing program for causing a computer to execute: (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP § 2106.05(f).)

(Claim 18) A non-transitory computer-readable storage medium having stored therein an information processing program for causing a computer to be operated as a model including a first partial model and a second partial model, (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP § 2106.05(f).)

Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible.

With respect to claim 2: Under 2A Prong 2 and 2B, the additional element "wherein the second partial model includes a larger number of layers than the first partial model" is mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea (see MPEP § 2106.05(f)). Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

With respect to claim 3: Under 2A Prong 2 and 2B, the additional element "wherein the second partial model includes a hidden layer" is mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea (see MPEP § 2106.05(f)). Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim 4: Under 2A Prong 2 and 2B, the additional element "wherein the model includes an input layer to which the learning data is input, and an output from the input layer is input to each of the first partial model and the second partial model" is mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea (see MPEP § 2106.05(f)). Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

With respect to claim 5: Under 2A Prong 2 and 2B, the additional element "wherein the first partial model includes a first embedding layer in which an input from the input layer is embedded, and the second partial model includes a second embedding layer in which an input from the input layer is embedded" is mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea (see MPEP § 2106.05(f)). Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

With respect to claim 6: 2A Prong 1: The claim recites an abstract idea. Specifically: "wherein the model includes a combining layer that combines an output from the first partial model and an output from the second partial model." (Mathematical concept – combining outputs from a first and second partial model in a combining layer involves mathematical calculations. The combining layer includes a processing layer EL31 that combines the outputs of the partial model PM1 and the outputs of the partial model PM2, and calculates an average of these outputs (see [0132]). – see MPEP § 2106.04(a)(2)(I)) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

With respect to claim 7: Under 2A Prong 2 and 2B, the additional element "wherein the first partial model includes a first output layer whose output is input to the combining layer, and the second partial model includes a second output layer whose output is input to the combining layer" is mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea (see MPEP § 2106.05(f)). Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

With respect to claim 8: Under 2A Prong 2 and 2B, the additional element "wherein the combining layer includes a softmax layer" adds the words "apply it" (or an equivalent) to the judicial exception, or is mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea (see MPEP § 2106.05(f)). Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

With respect to claim 9: 2A Prong 1: The claim recites an abstract idea. Specifically: "wherein the combining layer performs combining processing for the output of the first partial model and the output of the second partial model before the softmax layer." (Mathematical concept & mental process – combining processing, or combining, can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

With respect to claim 10: 2A Prong 1: The claim recites an abstract idea. Specifically: "generating the model by performing batch normalization after the first dropout for training." (Mathematical concepts – batch normalization and dropout involve mathematical calculations (see [0162], [0148]-[0153], and [FIG. 10]) – see MPEP § 2106.04(a)(2)(I)) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

With respect to claim 11: 2A Prong 1: The claim recites an abstract idea. Specifically: "generating the model by performing batch normalization after the second dropout for training." (Mathematical concept & mental process – batch normalization and dropout involve mathematical calculations and/or can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(I)) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

With respect to claim 12: 2A Prong 1: The claim recites an abstract idea. Specifically: "generating the model including the first partial model having a size based on the first dropout rate." (Mathematical concept & mental process – generating a model having a size based on a dropout rate involves mathematical calculations and/or can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(I)) 2A Prong 2: The additional element "acquiring information indicating the first dropout rate" is mere data gathering – adding insignificant extra-solution activity of mere data gathering to the judicial exception (see MPEP § 2106.05(g)). 2B: The same element simply appends well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (WURC – see MPEP § 2106.05(d)(II)(i) – receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information)). Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

With respect to claim 13: 2A Prong 1: The claim recites an abstract idea. Specifically: "generating the model including the second partial model having a size based on the second dropout rate." (Mathematical concept & mental process – generating a model having a size based on a dropout rate involves mathematical calculations and/or can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(I)) 2A Prong 2: The additional element "acquiring information indicating the second dropout rate" is mere data gathering – adding insignificant extra-solution activity of mere data gathering to the judicial exception (see MPEP § 2106.05(g)). 2B: The same element simply appends well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (WURC – see MPEP § 2106.05(d)(II)(i) – receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information)). Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
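An editorial aside on the "size based on the dropout rate" language in claims 12-15: one plausible reading, consistent with the LI [0019] passage quoted later in this action (a higher dropout rate removes more connections, making the model smaller), is that a layer's effective size shrinks with its dropout rate. The helper below is a hypothetical illustration of that relationship, not a formula from the application or any cited reference.

```python
# Hypothetical illustration: expected number of units still active under
# dropout at a given rate (higher rate -> smaller effective model, per LI [0019]).
def effective_size(base_units: int, dropout_rate: float) -> int:
    """Expected active-unit count when each unit is dropped with prob. dropout_rate."""
    assert 0.0 <= dropout_rate < 1.0
    return round(base_units * (1.0 - dropout_rate))

print(effective_size(512, 0.2))  # 410 -- lower rate, larger effective model
print(effective_size(512, 0.6))  # 205 -- higher rate, smaller effective model
```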
With respect to claim 14: 2A Prong 1: The claim recites an abstract idea. Specifically: "generating the model including the second partial model including a hidden layer based on the second dropout rate." (Mathematical concept & mental process – generating a model based on a dropout rate involves mathematical calculations and/or can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(I)) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

With respect to claim 15: 2A Prong 1: The claim recites an abstract idea. Specifically: "generating the model including the second partial model including a hidden layer having a size determined based on the second dropout rate." (Mathematical concept & mental process – generating a model having a size determined based on a dropout rate involves mathematical calculations and/or can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(I)) Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4, 6-7, 10-13, and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over NOGUCHI (US 20190005399 A1) in view of LI (US 20220044033 A1) and XIANG LI ("Understanding the Disharmony between Dropout and Batch Normalization by Variance Shift"), hereafter NOGUCHI, LI, and XIANG LI, respectively.

Regarding Claim 1, NOGUCHI teaches:

An information processing method executed by a computer, the information processing method comprising: (NOGUCHI [0020] teaches: "The information providing device 10 is an information processing device that executes learning processing described later".)

acquiring learning data used for training of a model including a first partial model and a second partial model; (NOGUCHI [0059] teaches: "Next, the information providing device 10 acquires learning data used for learning (i.e., for training) the processing model (i.e., of a model) from the data server 50 (Step S2)." NOGUCHI [0118] teaches: "[…] the processing model M1 includes a partial model PM1 (i.e., including a first partial model). […] The processing model M1 also includes a partial model PM2 (i.e., including a second partial model) […].")

NOGUCHI is not relied upon for teaching: generating, by using the learning data, the model in a manner in which the first partial model is trained by first dropout based on a first dropout rate and the second partial model is trained by second dropout based on a second dropout rate different from the first dropout rate, wherein performing batch normalization after the first dropout and after the second dropout suppresses nodes that are not activated by the respective dropout from being subjected to batch normalization during back propagation, thereby suppressing non-training targets from being subjected to batch normalization.

However, LI teaches: generating, by using the learning data, the model in a manner in which the first partial model is trained by first dropout based on a first dropout rate and the second partial model is trained by second dropout based on a second dropout rate different from the first dropout rate, […] (LI [0009] teaches: "In some implementations, training the plurality of DNNs comprises using a different dropout rate for each of the plurality of DNNs." Examiner's note: under BRI, the first dropout rate and second dropout rate can be interpreted as using different dropout rates for each of the plurality of DNNs. Additionally, "generating, […] the model in a manner in which the first partial model is trained by" can be reasonably interpreted as training the plurality of DNNs (i.e., first/second partial models) using different dropout rates.)

Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of NOGUCHI and LI before them, to apply LI's different dropout rates to NOGUCHI's partial models in the information providing device. One would have been motivated to make such a combination in order to reach the desired level of accuracy and robustness when training the neural networks (i.e., models) (LI [0002, 0005]).

NOGUCHI in view of LI is not relied upon for teaching: wherein performing batch normalization after the first dropout and after the second dropout suppresses nodes that are not activated by the respective dropout from being subjected to batch normalization during back propagation, thereby suppressing non-training targets from being subjected to batch normalization.

However, XIANG LI teaches: wherein performing batch normalization after the first dropout and after the second dropout suppresses nodes that are not activated by the respective dropout from being subjected to batch normalization during back propagation, thereby suppressing non-training targets from being subjected to batch normalization. (XIANG LI [pg. 1, Figure 1, top] teaches performing batch normalization after dropout. Furthermore, XIANG LI [pg. 3, 3. Theoretical Analyses] teaches that the BN (i.e., batch normalization) layer is directly subsequent to the dropout layer. Furthermore, XIANG LI [pg. 2, 2. Related Work and Preliminaries] teaches that dropout can be interpreted as a way of regularizing a neural network by adding noise to its hidden units (i.e., nodes) by multiplying hidden activations of hidden units by Bernoulli distributed random variables which take the value 1 with probability p (0 ≤ p ≤ 1) and 0 otherwise (i.e., suppresses nodes). Hidden units (i.e., nodes) that are multiplied by a Bernoulli random variable with a value of 0 will result in a 0 for that hidden unit. Under broadest reasonable interpretation, "suppresses nodes" can be interpreted as the hidden unit being multiplied by the Bernoulli random variable with a value of 0, which makes the activation for that hidden unit (i.e., node) equal to 0 by the dropout (i.e., nodes that are not activated by the respective dropout). Additionally, XIANG LI [pg. 2, 2. Related Work and Preliminaries] also teaches in eq. (2) that μ and σ² (see also Figure 1) participate in the backpropagation. When the value for the instance in the mini-batch is processed by a hidden unit multiplied by a Bernoulli random variable of 0, the instance x_{i…m} will not end up participating in the backpropagation (i.e., during back propagation), and is thus suppressed (i.e., thereby suppressing non-training targets from being subjected to batch normalization).)

Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of NOGUCHI, LI, and XIANG LI before them, to include XIANG LI's performing of batch normalization after dropout in each partial model of NOGUCHI and LI's information processing device. One would have been motivated to make such a combination in order to track the accuracy of the model as it trains and to achieve cooperation between dropout and batch normalization for regularizing neural networks with a very large feature dimension (XIANG LI [Abstract], [pg. 1, 1. Introduction], [pg. 2, 2. Related Work and Preliminaries], and [pg. 3, 3. Theoretical Analyses]).
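To make the Bernoulli-mask mechanics quoted above concrete, here is a small NumPy sketch of dropout as multiplication by Bernoulli variables, followed by batch-normalization statistics computed over the mini-batch. It follows the keep-probability convention of the quoted passage (a unit is kept with probability p); the array shapes and names are hypothetical, and this is an editorial illustration rather than code from any cited reference.

```python
# Dropout as a Bernoulli mask, followed by per-unit batch-norm statistics.
import numpy as np

rng = np.random.default_rng(0)
h = rng.normal(size=(8, 4))  # activations: mini-batch of 8 examples, 4 hidden units
p = 0.5                      # keep probability, 0 <= p <= 1

mask = rng.binomial(1, p, size=h.shape)  # Bernoulli(p): 1 keeps a unit, 0 suppresses it
h_drop = h * mask                        # suppressed units contribute 0 downstream

mu = h_drop.mean(axis=0)                    # per-unit batch mean (the mu of eq. (2))
var = h_drop.var(axis=0)                    # per-unit batch variance (sigma^2)
h_bn = (h_drop - mu) / np.sqrt(var + 1e-5)  # normalized activations
```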
Regarding Claim 4, NOGUCHI in view of LI and XIANG LI teaches the elements of claim 1 as outlined above. NOGUCHI further teaches:

wherein the model includes an input layer to which the learning data is input, and an output from the input layer is input to each of the first partial model and the second partial model. (NOGUCHI [Fig. 1] teaches pieces of input information, such as image and writing, that are input into the partial models. Under BRI, the input layer can be interpreted as the image layer and writing layer in Fig. 1, which receive data from the learning data, to input the data into each of the partial models (i.e., first/second partial model).)

Regarding Claim 6, NOGUCHI in view of LI and XIANG LI teaches the elements of claim 1 as outlined above. NOGUCHI further teaches:

wherein the model includes a combining layer that combines an output from the first partial model and an output from the second partial model. (NOGUCHI [Fig. 5] teaches a synthesis model SM1 that combines the outputs of each partial model (e.g., PM1, PM2, PM3) in the model M1. Therefore, under BRI, the combining layer can be interpreted as the synthesis model SM1.)

Regarding Claim 7, NOGUCHI in view of LI and XIANG LI teaches the elements of claim 6 as outlined above. NOGUCHI further teaches:

wherein the first partial model includes a first output layer whose output is input to the combining layer, and the second partial model includes a second output layer whose output is input to the combining layer. (NOGUCHI [Fig. 5] teaches a synthesis model SM1 that combines the outputs of each partial model (e.g., PM1, PM2, PM3) in the model M1. Under BRI, the output layer of each partial model can be interpreted as the layer containing the output characteristic information of each partial model.)

Regarding Claim 10, NOGUCHI in view of LI and XIANG LI teaches the elements of claim 1 as outlined above. XIANG LI further teaches:

generating the model by performing batch normalization after the first dropout for training. (XIANG LI [pg. 2, 2. Related Work and Preliminaries] teaches that BN (i.e., by performing batch normalization) after dropout accumulates the moving averages of neural means and variances during the learning (i.e., for training) to track the accuracy of a model as it trains (i.e., generating the model) (see XIANG LI [pg. 1, Figure 1, "Train Mode"]). Examiner's note: as stated above in claim 1, one of ordinary skill in the art would be motivated to apply XIANG LI's batch normalization after dropout to each of NOGUCHI's partial models PM1 (i.e., after the first dropout) and PM2 in order to track the accuracy as the model trains and achieve cooperation between dropout and batch normalization for regularizing neural networks with a very large feature dimension (XIANG LI [Abstract], [pg. 1, 1. Introduction], [pg. 2, 2. Related Work and Preliminaries], and [pg. 3, 3. Theoretical Analyses]).)

Regarding Claim 11, NOGUCHI in view of LI and XIANG LI teaches the elements of claim 1 as outlined above. XIANG LI further teaches:

generating the model by performing batch normalization after the second dropout for training. (XIANG LI [pg. 2, 2. Related Work and Preliminaries] teaches that BN (i.e., by performing batch normalization) after dropout accumulates the moving averages of neural means and variances during the learning (i.e., for training) to track the accuracy of a model as it trains (i.e., generating the model) (see XIANG LI [pg. 1, Figure 1, "Train Mode"]). Examiner's note: as stated above in claim 1, one of ordinary skill in the art would be motivated to apply XIANG LI's batch normalization after dropout to each of NOGUCHI's partial models PM1 and PM2 (i.e., after the second dropout) in order to track the accuracy as the model trains and achieve cooperation between dropout and batch normalization for regularizing neural networks with a very large feature dimension (XIANG LI [Abstract], [pg. 1, 1. Introduction], [pg. 2, 2. Related Work and Preliminaries], and [pg. 3, 3. Theoretical Analyses]).)

Regarding Claim 12, NOGUCHI in view of LI and XIANG LI teaches the elements of claim 1 as outlined above. LI further teaches:

acquiring information indicating the first dropout rate, (LI [0019] teaches: "For example only, the dropout rate for DNN-0 220-0 could be 0.2, the dropout rate for DNN 220-1 could be 0.4, and the dropout rate for DNN-N 220-N (e.g., for N=3) could be 0.6.")

generating the model including the first partial model having a size based on the first dropout rate. (LI [0019] teaches: "In other words, the higher the dropout rate, the more connections are removed, and the model is smaller." LI [0009] teaches: "In some implementations, training the plurality of DNNs comprises using a different dropout rate for each of the plurality of DNNs." Examiner's note: under BRI, "generating the model" can be interpreted as training the plurality of DNNs using different dropout rates.)

Regarding Claim 13, NOGUCHI in view of LI and XIANG LI teaches the elements of claim 1 as outlined above. LI further teaches:

acquiring information indicating the second dropout rate, and (LI [0019] teaches: "For example only, the dropout rate for DNN-0 220-0 could be 0.2, the dropout rate for DNN 220-1 could be 0.4, and the dropout rate for DNN-N 220-N (e.g., for N=3) could be 0.6.")

generating the model including the second partial model having a size based on the second dropout rate. (LI [0019] teaches: "In other words, the higher the dropout rate, the more connections are removed, and the model is smaller." LI [0009] teaches: "In some implementations, training the plurality of DNNs comprises using a different dropout rate for each of the plurality of DNNs." Examiner's note: under BRI, "generating the model" can be interpreted as training the plurality of DNNs using different dropout rates.)

Regarding Claim 16, the claim recites similar limitations as corresponding claim 1 and is rejected for similar reasons as claim 1 using similar teachings and rationale. Additionally, NOGUCHI further teaches:

a processor; and a memory storing instructions that, when executed by the processor, cause the processor to: (NOGUCHI [0099] teaches: "The control unit 40 is a controller, and is implemented when various programs stored in a storage device inside the information providing device 10 are executed by a processor such as a central processing unit (CPU) and a micro processing unit (MPU) using a RAM and the like as a working area, for example.")

NOGUCHI is not relied upon for teaching the following limitation; however, XIANG LI teaches: wherein performing batch normalization after the first dropout and after the second dropout suppresses nodes that are not activated by the respective dropout from being subjected to batch normalization during back propagation, thereby improving training efficiency. (XIANG LI [pg. 1, Figure 1, top] teaches performing batch normalization after […] dropout. Furthermore, XIANG LI [pg. 3, 3. Theoretical Analyses] teaches that the BN (i.e., batch normalization) layer is directly subsequent to the dropout layer. Furthermore, XIANG LI [pg. 2, 2. Related Work and Preliminaries] teaches that dropout can be interpreted as a way of regularizing a neural network by adding noise to its hidden units (i.e., nodes) by multiplying hidden activations of hidden units by Bernoulli distributed random variables which take the value 1 with probability p (0 ≤ p ≤ 1) and 0 otherwise (i.e., suppresses nodes). Hidden units (i.e., nodes) that are multiplied by a Bernoulli random variable with a value of 0 will result in a 0 for that hidden unit. Under broadest reasonable interpretation, "suppresses nodes" can be interpreted as the hidden unit being multiplied by the Bernoulli random variable with a value of 0, which makes the activation for that hidden unit (i.e., node) equal to 0 by the dropout (i.e., nodes that are not activated by the respective dropout). Additionally, XIANG LI [pg. 2, 2. Related Work and Preliminaries] also teaches in eq. (2) that μ and σ² […]

Prosecution Timeline

May 17, 2022
Application Filed
Jun 23, 2025
Non-Final Rejection — §101, §103, §112
Sep 25, 2025
Applicant Interview (Telephonic)
Sep 25, 2025
Examiner Interview Summary
Nov 05, 2025
Response Filed
Dec 08, 2025
Final Rejection — §101, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12475388: MACHINE LEARNING MODEL SEARCH METHOD, RELATED APPARATUS, AND DEVICE
Granted Nov 18, 2025 (2y 5m to grant)
Study what changed to get past this examiner. Based on the most recent grant.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 33%
With Interview: 99% (+100.0%)
Median Time to Grant: 3y 4m
PTA Risk: Moderate
Based on 3 resolved cases by this examiner. Grant probability derived from career allow rate.
