DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Style
In this action, unitalicized bold is used for claim language, while italicized bold is used for emphasis.
Applicant Reply
“The claims may be amended by canceling particular claims, by presenting new claims, or by rewriting particular claims as indicated in 37 CFR 1.121(c). The requirements of 37 CFR 1.111(b) must be complied with by pointing out the specific distinctions believed to render the claims patentable over the references in presenting arguments in support of new claims and amendments. . . . The prompt development of a clear issue requires that the replies of the applicant meet the objections to and rejections of the claims. Applicant should also specifically point out the support for any amendments made to the disclosure. See MPEP § 2163.06. . . . An amendment which does not comply with the provisions of 37 CFR 1.121(b), (c), (d), and (h) may be held not fully responsive. See MPEP § 714.” MPEP § 714.02. Generic statements or listing of numerous paragraphs do not “specifically point out the support for” claim amendments. “With respect to newly added or amended claims, applicant should show support in the original disclosure for the new or amended claims. See, e.g., Hyatt v. Dudas, 492 F.3d 1365, 1370, n.4, 83 USPQ2d 1373, 1376, n.4 (Fed. Cir. 2007) (citing MPEP § 2163.04 which provides that a ‘simple statement such as ‘applicant has not pointed out where the new (or amended) claim is supported, nor does there appear to be a written description of the claim limitation ‘___’ in the application as filed’ may be sufficient where the claim is a new or amended claim, the support for the limitation is not apparent, and applicant has not pointed out where the limitation is supported.’)” MPEP § 2163(II)(A).
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 5-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 5 and 17 substantially recite “selecting a single trainable model from the plurality of modified models” and carrying out various other operations on the “single trainable model.” The Specification describes a “single trainable model” being provided to a backpropagation trainer and receiving the single trainable model back from the backpropagation trainer. See Spec. ¶53 (“In the example of FIGS. 5 and 7, a single trainable model 122 is provided to the backpropagation trainer 180 and a single trained model 182 is received from the backpropagation trainer 180. When the trained model 182 is received, the backpropagation trainer 180 becomes available to train another trainable model. Thus, because training takes more than one epoch, trained models 182 may be input into the genetic algorithm 110 sporadically rather than every epoch after the initial epoch. In some implementations, the backpropagation trainer 180 may have a queue or stack of trainable models 122 that are awaiting training. The genetic algorithm 110 may add trainable models 122 to the queue or stack as they are generated and the backpropagation trainer 180 may remove a training model 122 from the queue or stack at the start of a training cycle. In some implementations, the system 100 includes multiple backpropagation trainers 180 (e.g., executing on different devices, processors, cores, or threads). Each of the backpropagation trainers 180 may be configured to simultaneously train a different trainable model 122 to generate a different trained model 182. In such examples, more than one trainable model 122 may be generated during an epoch and/or more than one trained model 182 may be input into an epoch.”) This is not sufficient to limit the claims such that reciting “a single trainable model” should be interpreted to mean only one trainable model. 
While this is implied by the description and by the use of terms from the Specification in the claims, the use of “single” could also reasonably be interpreted as synonymous with “one”, in which case “a single trainable model” would read on each of two or more selected trainable models. Further, the Specification explains that the “genetic algorithm 110 may add trainable models 122 to the queue or stack as they are generated.” In other words, the Specification explains that multiple individual models are trained. It is submitted that, if only one model is to be trained within a given time window or between some other set of occurrences, the claims should be amended to reflect this. Because it is not clear whether “single” refers to only one or to at least one, the claim language is indefinite.
All dependent claims are rejected as including the material of the claims from which they depend.
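For illustration only (the Specification provides no code, and all names here are hypothetical), the queue-based hand-off described in Spec. ¶53, in which the genetic algorithm adds trainable models to a queue as they are generated and a backpropagation trainer removes one model at the start of each training cycle, can be sketched as:

```python
import queue
import threading

# Illustrative sketch only (not Applicant's code): a genetic algorithm
# pushes trainable models onto a queue; a backpropagation trainer pops
# one model per training cycle, as described in Spec. paragraph 53.

def trainer_worker(pending: queue.Queue, trained: list) -> None:
    # The trainer removes one trainable model at the start of each cycle.
    while True:
        model = pending.get()
        if model is None:          # sentinel: no further models
            break
        trained.append(f"trained({model})")

pending: queue.Queue = queue.Queue()
trained: list = []
worker = threading.Thread(target=trainer_worker, args=(pending, trained))
worker.start()

# The genetic algorithm adds trainable models as they are generated.
for epoch in range(3):
    pending.put(f"model_epoch_{epoch}")
pending.put(None)                  # signal completion
worker.join()
print(trained)                     # models trained in queue order
```

Multiple such workers could be started to mirror the Specification's multiple backpropagation trainers executing on different devices or threads.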
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-14 and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Dehghani (Minimum miscibility pressure prediction based on a hybrid neural genetic algorithm, 2007) and Bukharov (Development of a decision support system based on neural networks and a genetic algorithm, 2015).
1. A computer system comprising: a memory configured to store an input data set and a plurality of data structures, each of the plurality of data structures including data representative of a neural network; and (“In this paper, the advantage of a neural genetic computing technique in modeling the prediction of MMP in a gas injection process is examined.” Dehghani PP. 174-175. “The prediction begins by generating ANNs that define the relationship between the input/output data. Then, the optimization of ANNs set is done with genetic algorithms as a robust optimization tool.” Dehghani P. 175. “Firstly, ANNs are constructed with BP training algorithm based on a collection of training samples. Then, the GA is employed to examine good solutions among solution space.” Dehghani P. 178. “A major issue in genetic based design of neural network is that of representation (encoding). The encoding should be capable of capturing all the important aspects of the problem. Therefore, in GA, the representation scheme should be capable of allowing new, meaningful and valid network architecture to be produced by the genetic operators, like crossover or mutation. Genetic algorithms are applied to neural networks in two different ways: . . . They are used in designing the structure of the network.” Dehghani P. 176. Therefore the evolution that has been introduced to neural networks can be divided generally into different levels: (a) connection weights; (b) architecture; (c) learning rules. In the application of GA to training of neural networks, the parameters of the problem are encoded as a set of chromosomes called the population and the candidate solutions are assigned fitness values based on the constraints of the problem. Based on each individual’s fitness, a selection mechanism selects a mate with high fitness value for genetic manipulation. Dehghani P. 176.
While one of ordinary skill in the art would understand the machine learning algorithm of Dehghani refers to something implemented on a computer with a memory and a processor, Dehghani does not explicitly state that a memory is used to implement the machine learning algorithms taught in the reference.
Bukharov teaches “The evolutionary approach to the system we have developed is manifested in the use of genetic algorithm, interval neural networks, and general-purpose graphics processing units (GPGPU). GPU calculations involve using central processing units (CPU) together with GPUs to accelerate the calculation by means of large-scale parallelization of algorithms. This calculation method was invented more than ten years ago (Fung & Mann, 2004). It is now actively used to solve a wide range of tasks requiring fast performance of cumbersome calculations. In spite of the fact that the cores of graphic processors are not as fast as those of central processors, the former are superior due to the number of the cores (from about 300 cores on standard graphic cards to more than 4000 cores on one of the latest ASUS products). One of the most expensive products of the ‘home’ range of Intel processors, Intel Core i7-975 XE of 3.33 GHz produced in 2009, has 53.3 Gflops peak productivity whereas the GPU Nvidia Tesla K10 (3072 cores) has as many as 4.58 Tflops (NVIDIA official website) due to parallel calculations. Moreover, the cost of the graphic card is much lower than that of a CPU cluster with a similar productivity.” Bukharov P. 6180. As would be understood by POSA, an Nvidia Tesla K10 includes memory.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teachings of Dehghani and Bukharov because a processor with associated memory implementing a machine learning algorithm can carry out trillions of calculations per second, which may save time compared with carrying out the calculations manually.)
a processor configured to (While it is clear from the context of the document that a model run on a processor is being discussed throughout the document, Dehghani further clarifies that the genetic algorithm is automated: “The design of neural networks using GA principles can be very beneficial in terms of two main issues: It automates the design of the network which will otherwise have to be done by hand using trial and error.” Dehghani P. 174. Dehghani further explains that “processing speed” is a performance criterion during implementation of ANNs. See Dehghani Fig. 1. Note also that the motivation to combine Bukharov’s teaching of a processor applies to this limitation.) execute a recursive search for a plurality of iterations, wherein executing the recursive search comprises, during a first iteration of the recursive search: generating a plurality of modified data structures based on the plurality of data structures: (“Once the GA generates a new solution, the ANN will be used to determine its fitness value for the GA to continue its searching process. Until the stopping criterion of the GA is satisfied, the strategy will output the best solution resulted by the GA and its performance determine by detail evaluation based on comparison between measured and predicted MMP values.” Dehghani P. 178.) selecting a trainable data structure from the plurality of modified data structures; (“Then, the GA is employed to examine good solutions among solution space. Once the GA generates a new solution, the ANN will be used to determine its fitness value for the GA to continue its searching process. Until the stopping criterion of the GA is satisfied, the strategy will output the best solution resulted by the GA and its performance determine by detail evaluation based on comparison between measured and predicted MMP values.
In order to optimize the network structure, the parameters of the network, such as number of neurons in the hidden layers, momentum and learning coefficients, are treated as variables. Here, the number of neurons in the two hidden layers are coded as binary variables and are allowed to take only integer values. The other two parameters, learning and momentum coefficients are coded as real variables and are allowed to take real values. The fitness is evaluated after training the network for specific iterations, which is kept as the mean square error (MSE) of the network and the final values are chosen. The optimization is carried out with a population of 30 and the stopping criteria as the maximum number of generations, which is kept 50.” Dehghani P. 178.) initiating an optimization process on the trainable data structure, the optimization process executed concurrently with at least a portion of the recursive search, (The claim language reads on executing the optimization process within the same process loop implementing the recursive search. For clarity, note that this language does not recite carrying out specific mutation/crossover/reproduction operations of the recursive search on one set of models concurrently with carrying out an optimization process (i.e. backpropagation) on a different model. If this is the desired scope, the claim language may be amended to clarify, assuming support. Dehghani figure 5 illustrates a loop in which ANNs are trained using backpropagation (BPNN), within a recursive genetic algorithm:
[Image: Dehghani Fig. 5 (flowchart of the hybrid GA-BPNN loop), reproduced in greyscale]
This is sufficient to teach the claim language, in its current form. Dehghani teaches implementing the loop above with a population of 30 models: “Then, the GA is employed to examine good solutions among solution space. Once the GA generates a new solution, the ANN will be used to determine its fitness value for the GA to continue its searching process. Until the stopping criterion of the GA is satisfied, the strategy will output the best solution resulted by the GA and its performance determine by detail evaluation based on comparison between measured and predicted MMP values. In order to optimize the network structure, the parameters of the network, such as number of neurons in the hidden layers, momentum and learning coefficients, are treated as variables. . . . The fitness is evaluated after training the network for specific iterations, which is kept as the mean square error (MSE) of the network and the final values are chosen. The optimization is carried out with a population of 30 and the stopping criteria as the maximum number of generations, which is kept 50.” Dehghani P. 178 col. 2.
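For illustration only (neither reference provides code, and all names here are hypothetical), a loop of the kind shown in Dehghani Fig. 5, using the reference's population of 30 and stopping criterion of 50 generations, can be sketched as follows; a stand-in fitness function replaces actual backpropagation training:

```python
import random

random.seed(0)

# Illustrative sketch only: selection, crossover, and mutation over a
# population of candidate network configurations, with a stand-in
# fitness function in place of MSE after backpropagation training.
# Population size (30) and generation count (50) mirror Dehghani P. 178.

POP, GENS = 30, 50

def fitness(genome):               # stand-in for negated MSE after training
    return -sum((g - 0.5) ** 2 for g in genome)

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.1):
    return [random.random() if random.random() < rate else g for g in genome]

population = [[random.random() for _ in range(4)] for _ in range(POP)]
for _ in range(GENS):
    population.sort(key=fitness, reverse=True)
    parents = population[: POP // 2]                  # reproduction
    children = [mutate(crossover(random.choice(parents),
                                 random.choice(parents)))
                for _ in range(POP - len(parents))]
    population = parents + children

best = max(population, key=fitness)
print(round(-fitness(best), 4))    # small residual "error" after 50 generations
```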
In the interest of compact prosecution, the claim is also evaluated with the scope limited to executing the specific operations of a genetic algorithm (e.g. mutation, crossover, and reproduction) concurrently with an optimization process (i.e. backpropagation). That is, the claim is also evaluated as though it required the operations of figure 8B (operations 806, 808, 826, 828, 830, . . . 806 . . .) to be carried out on one model while the operations of figure 8A are simultaneously carried out on other models.
Under this interpretation, which is narrower than the scope of the claim in its current form, it is not clear from the teaching of Dehghani whether the individual steps within the genetic algorithm and backpropagation are carried out concurrently.
Carrying out backpropagation on neural networks concurrently with implementing a genetic algorithm on other models in the population is obvious in view of the combined teachings of Dehghani and Bukharov. As shown above, Dehghani teaches carrying out optimization on a population of 30 models during 50 generations of the genetic algorithm. See Dehghani P. 178 col. 2. The reference does not expressly state whether any of the 30 models undergo backpropagation optimization while the genetic algorithm operates on the other models. Bukharov teaches a technique for improving training time in an algorithm that combines genetic and backpropagation features by training a series of models using backpropagation in a separate location before they undergo operations that are part of the genetic algorithm. See Bukharov P. 6182. (“The scheme shows that one of the key advantages of the proposed system is the opportunity to carry out forecasting as soon as the first cycle of genetic algorithm has been finished. This is possible due to replenishment of the pool with the networks with the smallest forecasting error on each step of the genetic algorithm, which results in a more precise value prediction. The second and main advantage of the system is the parallelization of the most labor-intensive procedure, i.e. training a set of neural networks. Owing to the GPGPU model, this algorithm works significantly faster than a similar one using just a CPU.”) See also Bukharov Fig. 3, P. 6181.
[Image: Bukharov Fig. 3 (system flowchart), reproduced in greyscale]
As shown in Figure 3 of Bukharov (directly above), different models are input for “neural network learning” (backpropagation/optimization) and carried through the loop implementing a recursive genetic algorithm. The combined teaching of the references includes training 30 models using backpropagation, modified 50 times by the genetic algorithm (Dehghani), where the backpropagation is carried out on separate hardware from the genetic algorithm (Bukharov). Both references are directed to continuous processes. Given that the stated goal of Bukharov is reduced training time by using separate hardware to train the models using backpropagation, one of ordinary skill in the art would understand the backpropagation and genetic algorithms to continue uninterrupted, and therefore simultaneously.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of Dehghani and Bukharov with respect to this limitation, because offloading backpropagation to execute independently at a separate location, including during execution of the genetic algorithm, reduces training time.) wherein the optimization process is configured to: train the trainable data structure based on a portion of the input data set to generate a trained data structure; (“Back propagation neural network was assumed for all the runs.” Dehghani P. 178. “Then, the GA is employed to examine good solutions among solution space. Once the GA generates a new solution, the ANN will be used to determine its fitness value for the GA to continue its searching process. Until the stopping criterion of the GA is satisfied, the strategy will output the best solution resulted by the GA and its performance determine by detail evaluation based on comparison between measured and predicted MMP values.” Dehghani P. 178. See also Fig. 5, block 3, showing networks trained using backpropagation (abbreviated as BPNN) after reproduction, crossover, and mutation.) and provide the trained data structure as input to a second iteration of the recursive search, the second iteration subsequent to the first iteration wherein the second iteration and the first iteration are separated by at least one intervening iteration. (See Fig. 5 showing a continuous loop including BPNN and reproduction, crossover, mutation being carried out until a criterion is met. “The optimization is carried out with a population of 30 and the stopping criteria as the maximum number of generations, which is kept 50.” Dehghani p. 178. Note that 50 generations, as contemplated in the reference, teaches the claimed iterations “separated by at least one intervening iteration.”)
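For illustration only, the arrangement the combination suggests, in which backpropagation is offloaded to a separate worker (standing in for Bukharov's GPU) while the genetic algorithm continues to execute, can be sketched as follows; the toy train/evolve bodies and all names are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative sketch only: backpropagation training runs on a separate
# worker while the genetic-algorithm step proceeds without waiting, and
# the trained model is fed back into a later epoch.

def backprop_train(model: str) -> str:        # runs on separate hardware
    return f"trained({model})"

def genetic_step(population: list) -> list:   # mutation/crossover stand-in
    return [m + "*" for m in population]

population = ["m0", "m1", "m2"]
with ThreadPoolExecutor(max_workers=1) as trainer:
    # Offload one model for training; the GA proceeds concurrently.
    future = trainer.submit(backprop_train, population[0])
    population = genetic_step(population)     # GA continues uninterrupted
    trained = future.result()                 # collected when ready
    population.append(trained)                # input to a subsequent epoch

print(population)
```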
2. The computer system of claim 1, wherein generating the plurality of modified data structures includes performing a genetic operation on at least one data structure of the plurality of data structures, (See rejection of claim 1.) and wherein training the trainable data structure includes changing a connection weight of the trainable data structure. (See Dehghani P. 175 section 2.1 explaining generic neural network training. “The classic BP updates the weight by following summarized rule” shown in equation 5. See Dehghani P. 175 section 2.1.)
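For context, the classic backpropagation weight update with momentum, which Dehghani's equation 5 summarizes (the exact notation of the reference is not reproduced here; the generic form is shown), is:

```latex
\Delta w_{ij}(t) = -\eta \,\frac{\partial E}{\partial w_{ij}} + \alpha \,\Delta w_{ij}(t-1)
```

where $\eta$ is the learning coefficient and $\alpha$ the momentum coefficient, both of which Dehghani codes as real-valued variables for the genetic algorithm. See Dehghani P. 178.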
3. The computer system of claim 1, wherein generating the plurality of modified data structures comprises at least one of performing a crossover operation or performing a mutation operation with respect to the one or more data structures. (See Dehghani Fig. 5.)
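For illustration only (no such code appears in the references, and all names are hypothetical), single-point crossover and bit-flip mutation on binary-coded chromosomes, of the kind Dehghani uses for the hidden-layer neuron counts, can be sketched as:

```python
# Hypothetical sketch of the genetic operators named in the claim; the
# references describe these operators in prose but provide no code.

def crossover(a: str, b: str, point: int) -> tuple:
    # Single-point crossover: swap the tails after the crossover point.
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(bits: str, i: int) -> str:
    # Bit-flip mutation at position i.
    flipped = "1" if bits[i] == "0" else "0"
    return bits[:i] + flipped + bits[i + 1:]

parent1, parent2 = "110010", "001101"   # binary-coded parameters
child1, child2 = crossover(parent1, parent2, 3)
print(child1, child2)                   # 110101 001010
print(mutate(child1, 0))                # 010101
```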
4. The computer system of claim 1, wherein the optimization process is executed on a different device, graphics processing unit (GPU), processor, core, thread, or any combination thereof, than the recursive search. (The primary reference does not expressly discuss which GPU, processor, core or thread are used to execute different parts of the algorithm.
Bukharov teaches: “GPU calculations involve using central processing units (CPU) together with GPUs to accelerate the calculation by means of large-scale parallelization of algorithms. This calculation method was invented more than ten years ago (Fung & Mann, 2004). It is now actively used to solve a wide range of tasks requiring fast performance of cumbersome calculations. In spite of the fact that the cores of graphic processors are not as fast as those of central processors, the former are superior due to the number of the cores (from about 300 cores on standard graphic cards to more than 4000 cores on one of the latest ASUS products). One of the most expensive products of the ‘‘home’’ range of Intel processors, Intel Core i7-975 XE of 3.33 GHz produced in 2009, has 53.3 Gflops peak productivity whereas the GPU Nvidia Tesla K10 (3072 cores) has as many as 4.58 Tflops (NVIDIA official website) due to parallel calculations.” Bukharov P. 6180. “Thus, using a combination of CPU and GPU results in achieving an unprecedented productivity with an ordinary PC by virtue of simultaneous processing of the sequential part of the code on a CPU and calculating its parallelized parts on a GPU.” Bukharov P. 6181. “The system algorithm: . . . 2. The next step is to train a set of neural networks for each set of parameters. Learning is carried out using CUDA for parallel computing on GPU, which significantly reduces the algorithm’s operating time. . . . 3. If there is a network working better than the others (i.e. with a smaller error) among the trained networks, the parameters of this network are saved in the high-quality network pool. As soon as the first network is in the pool, the forecasting algorithm can work simultaneously with the search algorithm, choosing a network with the smallest error.” Bukharov P. 6181. “The second and main advantage of the system is the parallelization of the most labor-intensive procedure, i.e. training a set of neural networks. 
Owing to the GPGPU model, this algorithm works significantly faster than a similar one using just a CPU.” Bukharov P. 6182. One of ordinary skill in the art would understand that training of the neural networks (using backpropagation, as indicated at Bukharov P. 6179) is carried out using GPUs, while other, less parallel operations, e.g. a recursive search, are carried out using a CPU. Note also that only step 2, “neural network learning,” is labeled “CUDA GPGPU” in figure 3 of Bukharov. Note further that parallel processing on GPUs, which reduces the time to carry out parallel tasks, uses different cores.
With respect to this limitation, it would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of Bukharov because using GPUs for NN training saves time, while keeping other tasks on the CPU avoids the bandwidth limitations of sending data to GPUs.)
5. A method comprising: during a first epoch of a genetic algorithm with a plurality of iterations: (See Dehghani P. 178 Fig. 5.) generating a plurality of modified models based on a plurality of models, each of the plurality of models including data representative of a neural network, (“Firstly, ANNs are constructed with BP training algorithm based on a collection of training samples.” Dehghani P. 178, col. 2. See also Dehghani Fig. 5, box titled “Train the network up to a specified number of iterations by BPNN and evaluate fitness (MSE).”) selecting a single trainable model from the plurality of modified models; (“Then, the GA is employed to examine good solutions among solution space. Once the GA generates a new solution, the ANN will be used to determine its fitness value for the GA to continue its searching process. Until the stopping criterion of the GA is satisfied, the strategy will output the best solution resulted by the GA and its performance determine by detail evaluation based on comparison between measured and predicted MMP values. In order to optimize the network structure, the parameters of the network, such as number of neurons in the hidden layers, momentum and learning coefficients, are treated as variables. Here, the number of neurons in the two hidden layers are coded as binary variables and are allowed to take only integer values. The other two parameters, learning and momentum coefficients are coded as real variables and are allowed to take real values. The fitness is evaluated after training the network for specific iterations, which is kept as the mean square error (MSE) of the network and the final values are chosen. The optimization is carried out with a population of 30 and the stopping criteria as the maximum number of generations, which is kept 50.” Dehghani P. 178. Note also that Bukharov teaches “If there is a network working better than the others (i.e. 
with a smaller error) among the trained networks, the parameters of this network are saved in the high-quality network pool.” Bukharov p. 6181.) and initiating an optimization process on the single trainable model, the optimization process executed concurrently with at least one portion of the genetic algorithm; (See corresponding section of the rejection of claim 1. For a “single” trainable model, see section above.) and adding a trained model, output by the optimization process based on the trainable model, as input to a second epoch of the genetic algorithm, the second epoch subsequent to the first epoch. (See Fig. 5 showing a continuous loop including BPNN and reproduction, crossover, mutation being carried out until a criterion is met. See also rejection of claim 1. Claim 1 includes limitations directed to computer hardware necessitating a secondary reference, but only portions of Dehghani cited in claim 1 are indicated as teaching the limitations of this claim.)
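For illustration only, selecting a single best-performing model from the plurality of modified models, which both references describe in prose, can be sketched as follows; the fitness values and names are hypothetical (Dehghani's fitness is an MSE to be minimized, whereas the sketch treats higher as better):

```python
# Hypothetical sketch: choose the one model with the best fitness
# from the plurality of modified models.
modified_models = {"model_1": 0.62, "model_2": 0.81, "model_3": 0.47}
single_trainable = max(modified_models, key=modified_models.get)
print(single_trainable)   # model_2
```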
6. The method of claim 5, further comprising selecting the plurality of models based on their respective fitness values. (See rejection of claim 2. Note that both references are to continuous processes operating on a population of models as is clear from the rejection of claim 1, including the process flow figures.)
7. The method of claim 5, further comprising generating the single trainable model including performing at least one crossover operation with respect to the plurality of models. (See Dehghani Fig. 5. See also rejection of claim 1.)
8. The method of claim 5, wherein the optimization process is configured to use a portion of an input data set associated with the genetic algorithm to train the single trainable model. (“The experimental data that were applied to develop and authenticate the new model are presented in Table 1. Fig. 5 shows the offered flowchart of neural genetic based algorithm for gas–oil MMP modeling. Back propagation neural network was assumed for all the runs.” Dehghani P. 178. See also Dehghani Fig. 5)
9. The method of claim 5, wherein a particular model of the plurality of models includes data representative of a particular neural network, and the data representative of the particular neural network is indicative of connections between nodes of the particular neural network. (See Dehghani fig. 4. “The fitness is evaluated after training the network for specific iterations, which is kept as the mean square error (MSE) of the network and the final values are chosen. The optimization is carried out with a population of 30[.]” Dehghani P. 178. See also Dehghani Section 2, explaining the structure of a neural network in general.)
10. The method of claim 5, wherein a particular model of the plurality of models includes data representative of a particular neural network, and the data representative of the particular neural network is indicative of an activation function associated with one or more nodes of the particular neural network. (See Dehghani Section 2.1 explaining the structure of a neural network. The function labeled g(x) refers to the activation function.)
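For context, an activation function g(x) of the kind referenced in Dehghani Section 2.1 can be sketched as follows; Dehghani's exact g is not reproduced here, and the logistic sigmoid shown is merely the conventional choice for backpropagation networks:

```python
import math

def g(x: float) -> float:
    # Logistic sigmoid activation (illustrative; not necessarily Dehghani's g).
    return 1.0 / (1.0 + math.exp(-x))

def node_output(inputs: list, weights: list, bias: float) -> float:
    # A node applies g to the weighted sum of its inputs plus a bias.
    return g(sum(i * w for i, w in zip(inputs, weights)) + bias)

print(g(0.0))   # 0.5 at zero net input
```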
11. The method of claim 5, wherein the first epoch is an initial epoch of the genetic algorithm. (See Dehghani Figs. 3 and 5 teaching implementation of a genetic algorithm as a continuous process.)
12. The method of claim 5, wherein the first epoch is a non-initial epoch of the genetic algorithm. (See Dehghani Figs. 3 and 5 teaching implementation of a genetic algorithm as a continuous process.)
13. The method of claim 5, wherein the second epoch and the first epoch are separated by at least one intervening epoch. (See Dehghani Figs. 3 and 5 teaching implementation of a genetic algorithm as a continuous process. “[T]he stopping criteria as the maximum number of generations, which is kept 50.” Dehghani P. 178.)
14. The method of claim 5, further comprising, during the first epoch or the second epoch, removing from the plurality of models one or more models that satisfy a stagnation criterion. (The previously cited art indicates that models which perform better are used for other steps in the genetic algorithm, but does not expressly state that other models are removed. Although one of ordinary skill in the art would understand a teaching that high-performing models move on to imply leaving poor-performing models behind, an express teaching is cited below. The length of the explanation offers context to the portion in bold, which expressly teaches the claimed subject matter.
Bukharov teaches: “Imitating this process, genetic algorithms are capable of ‘‘developing’’ solutions to real tasks if the latter are coded appropriately. They work with sets of ‘‘individuals’’, i.e. populations, each individual presenting a possible solution to the problem. Each individual is estimated by a measure of its fitness function (how close a given solution is to achieving the objectives). The most adapted individuals are capable of ‘‘reproducing’’ the next generation by means of a ‘‘crossover’’ with other individuals of the population. It helps to create new individuals combining some characteristics inherited from the parents. The least adapted individuals have a lower probability of reproduction. Therefore, their individual features will gradually disappear from the population in the course of evolution. A new population of possible solutions is reproduced this way, by selecting the best representatives of the previous generation, crossing them and producing a set of new individuals. This new generation contains a better set of characteristics inherited from the best representatives of the previous generation. Thus, from generation to generation, ‘‘good’’ characteristics extend to the entire population. Crossing the fittest individuals allows us to investigate the most promising areas of the search space. Finally, the population will converge to the optimum task solution. The initial generation of parameter sets (individuals) for a genetic algorithm is determined randomly. Thereafter, the sets are accepted as the fittest if the network trained with them gives the minimum error. The new generation of individuals is obtained by crossing the fittest individuals of the previous generation and mutation. The crossing (crossover) is performed as follows: both chromosomes are randomly divided into parts, which are later swapped. New generations are produced until the population has converged. 
On every step, the individuals are subject to low-probability mutation to prevent premature convergence. In case of a mutation, a separate gene parameter is replaced with another one from the general set of parameters.” Bukharov P. 6180. Note that for the features to “disappear from the population,” the individual must also disappear from the population.)
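For illustration only (the following sketch is not drawn from Bukharov or Dehghani; all names and parameters are hypothetical), the selection, crossover, and low-probability mutation operations described in the quoted passage may be sketched as:

```python
import random

def evolve(population, fitness, generations=10, mutation_rate=0.05):
    """Illustrative genetic algorithm sketch: the fittest individuals
    reproduce via crossover; the least-fit half is left behind each
    generation, so its features disappear from the population."""
    for _ in range(generations):
        # Rank individuals by fitness (higher is fitter).
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: len(ranked) // 2]  # least fit are dropped
        children = []
        while len(children) < len(population) - len(parents):
            # Crossover: divide both chromosomes at a random point
            # and swap the parts.
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(a))
            child = a[:cut] + b[cut:]
            # Low-probability mutation: replace a single gene.
            if random.random() < mutation_rate:
                i = random.randrange(len(child))
                child = child[:i] + [random.random()] + child[i + 1:]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

Because the fittest half survives each generation unmodified, the best individual found so far is never lost, consistent with the quoted description of "good" characteristics extending to the entire population.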
16. The method of claim 5, further comprising generating the single trainable model including performing at least one mutation operation with respect to the plurality of models. (See Dehghani Fig. 5. See also rejection of claim 1.)
17. A non-transitory computer-readable storage device storing instructions that, when executed, cause a computer to perform operations comprising: (Dehghani teaches: “In this study, a hybrid neural genetic algorithm (GA-ANN) is proposed with the purpose of automate the design of neural network for dissimilar type of structures.” While one of ordinary skill in the art would understand that an algorithm used to automate design of a neural network refers to computer-readable instructions stored on a device and executed (i.e., software run on a computer), a secondary reference will be used as an express teaching of software used to implement machine learning.
Bukharov teaches “The evolutionary approach to the system we have developed is manifested in the use of genetic algorithm, interval neural networks, and general-purpose graphics processing units (GPGPU). GPU calculations involve using central processing units (CPU) together with GPUs to accelerate the calculation by means of large-scale parallelization of algorithms. This calculation method was invented more than ten years ago (Fung & Mann, 2004). It is now actively used to solve a wide range of tasks requiring fast performance of cumbersome calculations. In spite of the fact that the cores of graphic processors are not as fast as those of central processors, the former are superior due to the number of the cores (from about 300 cores on standard graphic cards to more than 4000 cores on one of the latest ASUS products). One of the most expensive products of the ‘home’ range of Intel processors, Intel Core i7-975 XE of 3.33 GHz produced in 2009, has 53.3 Gflops peak productivity whereas the GPU Nvidia Tesla K10 (3072 cores) has as many as 4.58 Tflops (NVIDIA official website) due to parallel calculations. Moreover, the cost of the graphic card is much lower than that of a CPU cluster with a similar productivity.” As would be understood by POSA, an Nvidia Tesla K10 includes memory. “Nvidia CUDA (Compute Unified Device Architecture) was used to implement our system. CUDA is a parallel computing platform and programming model making parallel calculations with the help of Nvidia graphic processors (NVIDIA official website) to support the GPGPU approach. . . . Thus, using a combination of CPU and GPU results in achieving an unprecedented productivity with an ordinary PC by virtue of simultaneous processing of the sequential part of the code on a CPU and calculating its parallelized parts on a GPU.” Bukharov P. 6180-6181.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of Bukharov for this limitation because using software (CUDA) to implement a machine learning algorithm on a CPU and GPU (which includes memory) enables carrying out trillions of calculations per second, thereby saving time compared with carrying out the calculations manually.) during a first iteration of a plurality of iterations: generating a plurality of modified models based on a plurality of models, each of the plurality of models including data representative of a neural network; selecting a single trainable model from the plurality of modified models; and initiating an optimization process on the single trainable model, the optimization process executed concurrently with at least a portion of the plurality of iterations; and adding a trained model, output by the optimization trainer based on the single trainable model, as input to a second iteration, the second iteration subsequent to the first iteration. (See rejection of claim 1. Both Dehghani and Bukharov teach iteratively selecting individual models. See Dehghani p. 178 (“the strategy will output the best solution resulted by the GA”) and Bukharov p. 6181 (“If there is a network working better than the others (i.e. with a smaller error) among the trained networks, the parameters of this network are saved in the high-quality network pool.”) Note also that selecting a “single model” does not limit to selecting only one model (or only one model per iteration).)
18. The non-transitory computer-readable storage device of claim 17, wherein one or more iterations of the plurality of iterations occur between an end of the first iteration and a start of the second iteration. (“Back propagation neural network was assumed for all the runs.” Dehghani P. 178. See also rejection of claim 1 citing Dehghani.)
19. The non-transitory computer-readable storage device of claim 17, wherein the operations further comprise selecting the plurality of models based on their respective fitness values. (See rejection of claim 2. Note that both references are to continuous processes operating on a population of models as is clear from the rejection of claim 1, including the process flow figures.)
20. The non-transitory computer-readable storage device of claim 17, wherein the single trainable model is generated by performing at least one of a crossover operation or a mutation operation with respect to the plurality of models. (See Dehghani Fig. 5. See also rejection of claim 1.)
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Dehghani (Minimum miscibility pressure prediction based on a hybrid neural genetic algorithm, 2007), Bukharov, and Wang (Improved Genetic Neural Network for Image Segmentation, 2011).
15. The method of claim 5, wherein each of the plurality of models includes at least one output node configured to generate a classifier result. (The previously cited art does not expressly teach models configured to generate a classifier result. Wang teaches: “Image processing is a technology that carried out by computer image analysis and gets the results. Image segmentation is the key technology of image processing; it is a process which can decompose an image into a lot of sets of special, non-overlapping, with a collection of strong correlation. In this paper, image segmentation is regarded as a classification problem, so how to get an efficient classification method is the focus of this study. Genetic neural network is a common method to solve classification problems, it is a network model that use genetic algorithm to optimize the BP neural network.” Wang P. 1694. “BP neural network is designed to consider the number of network layer, input layer, hidden layer and output layer nodes, transfer function, the choice of training parameters and methods of selection.” Wang P. 1694. “Image (G) in each pixel ( Gij ) is to be classified a sample of this sample into the genetic neural network ( sim ) classification, a feature of the output value of Vi, the characteristic value determining the sample belongs to a class probability. We can decide, if the value is greater than 0.5, then it is the prospect that (F), otherwise it is the background (B).” Wang P. 1695.
It would have been obvious to one of ordinary skill in the art before the effective filing date to combine the teaching of Wang because using models which include an output node configured to generate a classification result in the system of the primary reference would have been understood by one of ordinary skill in the art to be a viable way of improving models used for image recognition.
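For illustration only (this sketch is not taken from Wang; the function names, weights, and features are hypothetical), an output node configured to generate a classifier result by thresholding an output value at 0.5, as described in the quoted passage, may be sketched as:

```python
import math

def output_node(features, weights, bias):
    """Hypothetical output node: a weighted sum of input features
    passed through a sigmoid, yielding a value in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def classify_pixel(features, weights, bias, threshold=0.5):
    """Return 'F' (foreground) if the output value exceeds the
    threshold, else 'B' (background), per the quoted scheme."""
    return "F" if output_node(features, weights, bias) > threshold else "B"
```

The output value plays the role of the probability that the sample belongs to a class, and the 0.5 cutoff corresponds to the foreground/background decision Wang describes.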
Response to Arguments
Applicant's arguments filed 01/22/2026 have been fully considered but they are not persuasive.
Rejections under § 103
Applicant states that Dehghani fails to teach a “trained data structure as input to a second iteration of the recursive search, the second iteration subsequent to the first iteration, wherein the second iteration and the first iteration are separated by at least one intervening iteration” because, according to Applicant, “each candidate of the new population of Dehghani is based on manipulations of candidates of the immediately previous population.” Rem. 6. It is not clear whether Applicant is asserting that each candidate of the new population is based on all candidates of the immediately previous population, or merely asserting some relationship between models in consecutive iterations. However, it is not clear from the Remarks why either reading would be inconsistent with a teaching of a trained data structure being input to a subsequent iteration separated by an intervening iteration. Absent some explanation, the amended claim language is obvious in view of the art of record. Applicant also asserts that Bukharov fails to teach this limitation, but no distinction is clearly articulated. It is submitted that merely juxtaposing claim language with language from the references does not constitute a substantive argument.
Applicant cites language describing a genetic algorithm as exhibiting “implicit parallelism which does not evaluate and improve a single solution” as support for the proposition that the reference fails to teach “selecting a single trainable model.” Rem. 7. The Remarks do not consider that the prior art may be referring to fundamental genetic operations that combine aspects of two models (e.g., crossover), which are also described in the Specification of this application and claimed. See, e.g., Spec. ¶45 and Claims 3, 20. Nothing in the Remarks explains why crossover is inconsistent with selecting a single model. Further, the claims do not limit the selection to only one model, and the Specification does not describe the selection of a single model in a way that would require “single” in the claims to be construed as selecting exactly one model. If only one model, or only one model per training iteration, is the intended scope, it is recommended that this be clarified in the claims. See also rejection of claim 5.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL M KNIGHT whose telephone number is (571) 272-8646. The examiner can normally be reached Monday - Friday 9-5 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michelle Bechtold can be reached on (571) 431-0762. The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
PAUL M. KNIGHT
Examiner
Art Unit 2148
/PAUL M KNIGHT/Examiner, Art Unit 2148