Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
This Office Action is in response to the amendment filed on 9/2/2025. By this amendment, claims 1, 5, 6, 7 and 9 are amended. Claim 8 is canceled. Therefore, claims 1-7 and 9-11 are pending in the application and have been examined.
Any claim objection/rejection not repeated below is withdrawn due to Applicant’s amendments and persuasive arguments.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-7 and 9-11 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
As per claims 1, 5, 6, 7 and 9, “an absolute value of the first factor becoming greater than 1 representing a stagnation in the learning rate” is recited. However, it is not clear whether this is the same “an absolute value of the first factor” as recited earlier in the same claim or a different “an absolute value of the first factor”. Further, “the effect of the absolute value of the first factor” is recited. However, it is not clear whether it refers to the first or the second “an absolute value of the first factor” recited earlier in the same claim. In order to further examine the claims on the merits, the examiner has interpreted both recitations of “an absolute value of the first factor” as referring to the same element.
Any claim not specifically mentioned above is rejected due to its dependency on a rejected claim.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2, 5-7 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Ida et al. (U.S. Pub. No. 2019/0156240) hereinafter as Ida in view of Andoni et al. (U.S. Pub. No. 2019/0073591) hereinafter as Andoni.
As per claims 1 and 5, Ida discloses a trained model generation system/method that generates a trained model (a learning apparatus 10 in fig. 1 that performs learning using a stochastic gradient descent method – e.g. par. [0047]), comprising:
estimation circuitry configured to perform estimation on learning data (the statistic calculation unit 12 takes the first-order gradient g.sub.t output from the gradient calculation unit 11 and the standard values of hyperparameters α, β.sub.1, and β.sub.2 as inputs, and uses Formula (3) to calculate the approximate value m.sub.t of the moving average of the first-order gradient g.sub.t (Step S5 in fig. 2) – e.g. par. [0059]);
a loss gradient calculating circuitry configured to calculate a gradient of loss for a result of estimation from the estimation circuitry (In addition, the statistic calculation unit 12 uses Formula (5) to calculate the moving average c.sub.t of the variance of the first-order gradient g.sub.t (Step S6 in fig. 2) – e.g. par. [0059]); and
optimizer circuitry configured to calculate a plurality of parameters constituting the trained model on the basis of the gradient of loss (The parameter updating unit 15 updates the parameters of the learning model using the learning rate adjusted by the learning rate adjustment unit 14 – e.g. par. [0055]),
Although Ida further discloses wherein the optimizer circuitry uses an expression for calculating the learning rate used to calculate the plurality of parameters as an expression (The parameter updating unit 15 updates the parameters of the learning model using the learning rate adjusted by the learning rate adjustment unit 14. Specifically, the parameter updating unit 15 updates the model parameter θ.sub.t based on the calculation result of the learning rate adjustment unit 14 – e.g. par. [0055]), Ida does not explicitly disclose the expression including a first factor, an absolute value of the first factor based on a number of epochs increasing and a change in moving average, an absolute value of the first factor becoming greater than 1 representing a stagnation in the learning rate, the effect of the absolute value of the first factor being greater than 1 in the expression achieving an increasing in the learning rate at the optimizer circuitry.
However, Andoni discloses the expression including a first factor (a stagnation metric – e.g. par. [0065]), an absolute value of the first factor based on a number of epochs increasing and a change in moving average, an absolute value of the first factor becoming greater than 1 representing a stagnation in the learning rate, the effect of the absolute value of the first factor being greater than 1 in the expression achieving an increasing in the learning rate at the optimizer circuitry (the stagnation metric may indicate that output of the prior epoch has become stagnant if the average (or highest) fitness for a particular number of epochs remains within a particular range (e.g. +/-5%). If the stagnation metric satisfies a threshold, the system 100 may determine to increase the epoch size for the current epoch as compared to the prior epoch…to attempt to overcome the stagnation. In other examples, the system 100 may determine the epoch size (or change the epoch size) based on other metrics or factors – e.g. par. [0065]. Please note that the stagnation metric satisfying a threshold such that the system may determine to increase the epoch size for the current epoch as compared to the prior epoch corresponds to Applicant’s “an absolute value” “becoming greater than 1”, and the fitness remaining within a particular range (e.g. +/-5%) corresponds to Applicant’s “a change in moving average”).
Ida and Andoni are considered to be analogous to the claimed invention because they are in the same field of endeavor of automated model building. Therefore, it would have been obvious to a person with ordinary skill in the art before the effective filing date of the invention to have combined the teachings of Ida with Andoni’s teaching of the expression including a first factor, an absolute value of the first factor based on a number of epochs increasing and a change in moving average, an absolute value of the first factor becoming greater than 1 representing a stagnation in the learning rate, the effect of the absolute value of the first factor being greater than 1 in the expression achieving an increasing in the learning rate at the optimizer circuitry. A person of ordinary skill in the art would be motivated to implement this solution to enable generating a neural network that models a particular data set with acceptable accuracy and in less time than using a genetic algorithm having a fixed epoch size or backpropagation alone (e.g. par. [0004] of Andoni) and to allow the model to avoid stagnation and converge more reliably.
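For clarity of the mapping above, the combined stagnation-based adjustment can be illustrated with a short sketch. This code is for illustration only and is not disclosed by either Ida or Andoni; all names, the window, the ±5% band, and the scaling factor are hypothetical. It pairs exponential moving averages in the spirit of Ida’s Formulas (3) and (5) with a stagnation check in the spirit of Andoni par. [0065], scaling the learning rate by a factor greater than 1 when recent fitness stagnates:

```python
# Illustrative sketch only; hypothetical names and thresholds.

def update_moving_averages(m, v, g, beta1=0.9, beta2=0.999):
    """Exponential moving averages of the gradient and its square,
    in the spirit of Ida's Formulas (3) and (5)."""
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    return m, v

def is_stagnant(fitness_history, window=5, band=0.05):
    """Stagnation check in the spirit of Andoni par. [0065]:
    stagnant if the last `window` fitness values stay within
    +/- 5% of their mean."""
    recent = fitness_history[-window:]
    if len(recent) < window:
        return False
    mean = sum(recent) / len(recent)
    return all(abs(f - mean) <= band * abs(mean) for f in recent)

def adjust_learning_rate(lr, fitness_history, factor=1.5):
    """On stagnation, scale the learning rate by a factor
    greater than 1; otherwise leave it unchanged."""
    return lr * factor if is_stagnant(fitness_history) else lr
```

Under this reading, the factor exceeding 1 plays the role the claims assign to the first factor: it is applied only when progress has stagnated, and its effect is to increase the learning rate.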
As per claim 2, Ida – Andoni discloses a trained model generation system as applied above in claim 1. Andoni further discloses wherein the first factor enables an effect of suppressing the learning rate to be achieved more as the absolute value of the gradient increases and increases the effect of suppressing the learning rate as the number of epochs increases (the backpropagation trainer 180 may be disabled based on the epoch size. For example, if the epoch size of a particular epoch is greater than a particular size, the backpropagation trainer 180 may be disabled for the particular epoch – e.g. par. [0075]).
Ida and Andoni are considered to be analogous to the claimed invention because they are in the same field of endeavor of automated model building. Therefore, it would have been obvious to a person with ordinary skill in the art before the effective filing date of the invention to have combined the teachings of Ida with Andoni’s teaching wherein the first factor enables an effect of suppressing the learning rate to be achieved more as the absolute value of the gradient increases and increases the effect of suppressing the learning rate as the number of epochs increases. A person of ordinary skill in the art would be motivated to implement this solution to enable processing resources and/or memory resources that are allocated to the one or more instances of the backpropagation trainer to be reallocated to other operations, thereby improving the efficiency of the system (e.g. par. [0075] of Andoni).
As per claim 6, Ida discloses an information processing device comprising:
acquiring circuitry configured to acquire a gradient of loss calculated from a result of estimation of learning data (the gradient calculation unit 11 uses Formula (2) to calculate the first-order gradient g.sub.t and outputs the result to the statistic calculation unit 12 – e.g. par. [0051]); and
optimizer circuitry configured to calculate a plurality of parameters constituting a trained model on the basis of the gradient of loss calculated from a result of estimation of learning data (The parameter updating unit 15 updates the parameters of the learning model using the learning rate adjusted by the learning rate adjustment unit 14 – e.g. par. [0055]),
Although Ida further discloses wherein the optimizer circuitry uses an expression for calculating the learning rate used to calculate the plurality of parameters as an expression (The parameter updating unit 15 updates the parameters of the learning model using the learning rate adjusted by the learning rate adjustment unit 14. Specifically, the parameter updating unit 15 updates the model parameter θ.sub.t based on the calculation result of the learning rate adjustment unit 14 – e.g. par. [0055]), Ida does not explicitly disclose the expression including a first factor, an absolute value of the first factor based on a number of epochs increasing and a change in moving average, an absolute value of the first factor becoming greater than 1 representing a stagnation in the learning rate, the effect of the absolute value of the first factor being greater than 1 in the expression achieving an increasing in the learning rate at the optimizer circuitry.
However, Andoni discloses the expression including a first factor (a stagnation metric – e.g. par. [0065]), an absolute value of the first factor based on a number of epochs increasing and a change in moving average, an absolute value of the first factor becoming greater than 1 representing a stagnation in the learning rate, the effect of the absolute value of the first factor being greater than 1 in the expression achieving an increasing in the learning rate at the optimizer circuitry (the stagnation metric may indicate that output of the prior epoch has become stagnant if the average (or highest) fitness for a particular number of epochs remains within a particular range (e.g. +/-5%). If the stagnation metric satisfies a threshold, the system 100 may determine to increase the epoch size for the current epoch as compared to the prior epoch…to attempt to overcome the stagnation. In other examples, the system 100 may determine the epoch size (or change the epoch size) based on other metrics or factors – e.g. par. [0065]. Please note that the stagnation metric satisfying a threshold such that the system may determine to increase the epoch size for the current epoch as compared to the prior epoch corresponds to Applicant’s “an absolute value” “becoming greater than 1”, and the fitness remaining within a particular range (e.g. +/-5%) corresponds to Applicant’s “a change in moving average”).
Ida and Andoni are considered to be analogous to the claimed invention because they are in the same field of endeavor of automated model building. Therefore, it would have been obvious to a person with ordinary skill in the art before the effective filing date of the invention to have combined the teachings of Ida with Andoni’s teaching of the expression including a first factor, an absolute value of the first factor based on a number of epochs increasing and a change in moving average, an absolute value of the first factor becoming greater than 1 representing a stagnation in the learning rate, the effect of the absolute value of the first factor being greater than 1 in the expression achieving an increasing in the learning rate at the optimizer circuitry. A person of ordinary skill in the art would be motivated to implement this solution to enable generating a neural network that models a particular data set with acceptable accuracy and in less time than using a genetic algorithm having a fixed epoch size or backpropagation alone (e.g. par. [0004] of Andoni) and to allow the model to avoid stagnation and converge more reliably.
As per claim 7, Ida – Andoni further discloses a non-transitory computer-readable storage medium storing instructions that, when executed by at least one processor (the program module 1093 stored in a detachable storage medium and read out by the CPU – e.g. par. [0117] of Ida), cause the at least one processor to carry out the method steps as applied above in claim 5.
As per claim 9, Ida discloses an estimation device comprising:
acquiring circuitry configured to acquire input information (the gradient calculation unit 11 uses Formula (2) to calculate the first-order gradient g.sub.t and outputs the result to the statistic calculation unit 12 – e.g. par. [0051]); and
estimation circuitry configured to estimate for the input information using a trained model that is generated by calculating a plurality of parameters constituting the trained model on the basis of a gradient of loss calculated from a result of estimation of learning data (The parameter updating unit 15 updates the parameters of the learning model using the learning rate adjusted by the learning rate adjustment unit 14 – e.g. par. [0055]),
Although Ida further discloses wherein the optimizer circuitry uses an expression for calculating the learning rate used to calculate the plurality of parameters as an expression (The parameter updating unit 15 updates the parameters of the learning model using the learning rate adjusted by the learning rate adjustment unit 14. Specifically, the parameter updating unit 15 updates the model parameter θ.sub.t based on the calculation result of the learning rate adjustment unit 14 – e.g. par. [0055]), Ida does not explicitly disclose the expression including a first factor, an absolute value of the first factor based on a number of epochs increasing and a change in moving average, an absolute value of the first factor becoming greater than 1 representing a stagnation in the learning rate, the effect of the absolute value of the first factor being greater than 1 in the expression achieving an increasing in the learning rate at the optimizer circuitry.
However, Andoni discloses the expression including a first factor (a stagnation metric – e.g. par. [0065]), an absolute value of the first factor based on a number of epochs increasing and a change in moving average, an absolute value of the first factor becoming greater than 1 representing a stagnation in the learning rate, the effect of the absolute value of the first factor being greater than 1 in the expression achieving an increasing in the learning rate at the optimizer circuitry (the stagnation metric may indicate that output of the prior epoch has become stagnant if the average (or highest) fitness for a particular number of epochs remains within a particular range (e.g. +/-5%). If the stagnation metric satisfies a threshold, the system 100 may determine to increase the epoch size for the current epoch as compared to the prior epoch…to attempt to overcome the stagnation. In other examples, the system 100 may determine the epoch size (or change the epoch size) based on other metrics or factors – e.g. par. [0065]. Please note that the stagnation metric satisfying a threshold such that the system may determine to increase the epoch size for the current epoch as compared to the prior epoch corresponds to Applicant’s “an absolute value” “becoming greater than 1”, and the fitness remaining within a particular range (e.g. +/-5%) corresponds to Applicant’s “a change in moving average”).
Ida and Andoni are considered to be analogous to the claimed invention because they are in the same field of endeavor of automated model building. Therefore, it would have been obvious to a person with ordinary skill in the art before the effective filing date of the invention to have combined the teachings of Ida with Andoni’s teaching of the expression including a first factor, an absolute value of the first factor based on a number of epochs increasing and a change in moving average, an absolute value of the first factor becoming greater than 1 representing a stagnation in the learning rate, the effect of the absolute value of the first factor being greater than 1 in the expression achieving an increasing in the learning rate at the optimizer circuitry. A person of ordinary skill in the art would be motivated to implement this solution to enable generating a neural network that models a particular data set with acceptable accuracy and in less time than using a genetic algorithm having a fixed epoch size or backpropagation alone (e.g. par. [0004] of Andoni) and to allow the model to avoid stagnation and converge more reliably.
Claims 3-4 and 10-11 are rejected under 35 U.S.C. 103 as being unpatentable over Ida in view of Andoni as applied above in claims 1 and 2, and further in view of Iwamoto et al. (U.S. Patent No. 5287430) hereinafter as Iwamoto.
As per claims 3 and 10, Ida – Andoni discloses a trained model generation system as applied above in claims 1 and 2.
Andoni further discloses wherein the expression for calculating the learning rate includes a second factor which suppresses the learning rate and of which a maximum value according to a cumulative amount of update of each of the plurality of parameters through learning at the beginning of learning and does not include the second factor subsequently to the beginning of learning (the user may set a threshold for one or more metrics that are used to vary the number of models generated during one or more epochs of the genetic algorithm 110. The user can define a number of trainable models 122 to be trained by the backpropagation trainer 180 and fed back into the genetic algorithm 110 as trained models 182. As yet another example, the user can define a threshold fitness to be used to enable (disable) the backpropagation trainer 180 – e.g. par. [0037] and generate a fitness value. The fitness value may be compared to a threshold to determine whether to enable or disable the backpropagation trainer 180 by refraining from generating and providing trainable models for at least one epoch – e.g. par. [0068]).
Although par. [0037] of Andoni teaches that the user can define a threshold fitness, Ida – Andoni does not explicitly disclose that the maximum value is 1. However, Iwamoto teaches that the maximum value is 1 (e.g. col. 12, lines 27-28).
Ida – Andoni and Iwamoto are considered to be analogous to the claimed invention because they are in the same field of endeavor of using neural networks. Therefore, it would have been obvious to a person with ordinary skill in the art before the effective filing date of the invention to have combined the teachings of Ida – Andoni with Iwamoto’s maximum value of 1. A person of ordinary skill in the art would be motivated to implement this solution to minimize the probability of occurrences of error classification due to noise in the input data for discrimination (e.g. col. 12, lines 30-32 of Iwamoto).
As per claims 4 and 11, Ida – Andoni – Iwamoto discloses a trained model generation system as applied above in claims 3 and 10. Andoni further discloses wherein the second factor has an absolute value which is less than 1 when the cumulative amount of update is less than a threshold value and monotonically decreases when the cumulative amount of update is greater than the threshold value (the input set 120 (and the output set 130) of a first epoch may include a first number of models (N), and the input set 120 (and the output set 130) of a second epoch that is subsequent to the first epoch may include a second number of models (M). The second number is different than the first number (e.g., N and M are different positive integers). In a particular aspect, earlier epochs may generate and evolve a large number of models in order to rapidly identify one or more “promising” neural network topologies that achieve better fitness, and later epochs may generate smaller numbers of models (e.g., N is greater than M) to tune characteristics of the “promising” neural network topologies – e.g. par. [0030], and if the fitness value satisfies a first fitness threshold, the system 100 may determine to reduce the epoch size as compared to prior epochs – e.g. par. [0063]).
Response to Arguments
Applicant's arguments regarding the 103 rejections, on pages 3-4 of the remarks filed 9/2/2025, have been carefully and fully considered, but they are not persuasive.
The Applicant’s arguments are summarized below:
a) Andoni does not disclose any expression including a factor whose value depends on a number of epochs and a change in a moving average;
b) Andoni does not teach the newly added claim limitation of “wherein the optimizer circuitry uses an expression including a first factor, an absolute value of the first factor is based on a number of epochs increasing and a change in moving average, the first factor becoming greater than 1 in the expression achieving an increasing in the learning rate at the optimizer circuitry”; and
c) The dependent claims are allowable due to their dependency.
In response to arguments a) and b), the examiner respectfully disagrees. First, without an explicit definition of “expression,” the examiner reasonably interprets “expression,” in light of the specification, as any expression that explains a relationship/context among elements. It appears the Applicant interprets the expression as an equation. Please note that “expression” and “equation” are distinct from each other. Second, the examiner has clearly mapped and articulated in the above 103 rejection that Andoni discloses the expression including a first factor (a stagnation metric – e.g. par. [0065]), an absolute value of the first factor based on a number of epochs increasing and a change in moving average, an absolute value of the first factor becoming greater than 1 representing a stagnation in the learning rate, the effect of the absolute value of the first factor being greater than 1 in the expression achieving an increasing in the learning rate at the optimizer circuitry (the stagnation metric may indicate that output of the prior epoch has become stagnant if the average (or highest) fitness for a particular number of epochs remains within a particular range (e.g. +/-5%). If the stagnation metric satisfies a threshold, the system 100 may determine to increase the epoch size for the current epoch as compared to the prior epoch…to attempt to overcome the stagnation. In other examples, the system 100 may determine the epoch size (or change the epoch size) based on other metrics or factors – e.g. par. [0065]. Please note that the stagnation metric satisfying a threshold such that the system may determine to increase the epoch size for the current epoch as compared to the prior epoch corresponds to Applicant’s “an absolute value” “becoming greater than 1”, and the fitness remaining within a particular range (e.g. +/-5%) corresponds to Applicant’s “a change in moving average”).
Therefore, contrary to Applicant’s arguments, the 103 rejections are maintained.
In response to argument c), the examiner respectfully disagrees. Since none of the independent claims is allowable, the dependent claims are not allowable either.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Ebrahimi et al. (U.S. Pub. No. 2022/0187841) discloses solving problems that arise at different stages of training a neural network using adaptive learning rate optimization methods.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to APRIL Y BLAIR whose telephone number is (571)270-1014. The examiner can normally be reached Monday-Friday, 9:00am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Cordelia Zecher can be reached at (571)272-7771. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/APRIL Y BLAIR/ Supervisory Patent Examiner, Art Unit 2196