DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the amendment filed on 12/17/2025. Claims 1-20 are pending in the case.
Applicant Response
In Applicant’s response dated 12/17/2025, Applicant amended Claims 1-5, 8, 9, 11, 14, 15, 17, and 18 and argued against all objections and rejections previously set forth in the Office Action dated 09/18/2025.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1
According to the first part of the analysis, in the instant case, claims 1-7 are directed to a computer-implemented method, claims 8-14 are directed to a computer program product, and claims 15-20 are directed to a computer system. Thus, each of the claims falls within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Regarding Claims 1, 8, and 15,
Step 2A, Prong 1: Does the claim recite a judicial exception?
Claim 1 further recites the steps of:
attempting to retrieve attribute values corresponding to a set of attributes and associated with a transaction (This step involves data gathering and falls under the certain methods of organizing human activity or data gathering grouping of abstract ideas.)
determining, at a first point in time subsequent to the attempting to retrieve the attribute values, that a first subset of the attribute values is available and that a second subset of the attribute values is unavailable (This step involves determining which values are available or unavailable and falls under the mental processes grouping of abstract ideas.)
in response to determining that the first subset of the attribute values is available and that the second subset of the attribute values is unavailable, modifying a first instance of a machine learning model (This step involves performing operations on models that involve mathematical calculations and falls under the mathematical concepts grouping of abstract ideas.)
wherein the first instance of the machine learning model comprises a plurality of input layers, wherein each input layer comprises input nodes corresponding to a corresponding subset of the attributes, wherein a first set of input nodes from a first input layer of the plurality of input layers is connected to a second set of input nodes from a second input layer of the plurality of input layers via one or more first connections, wherein the second set of input nodes is connected to a third set of input nodes from a third input layer of the plurality of input layers via one or more second connections, and wherein the modifying comprises removing the one or more second connections in the first instance of the machine learning model; …” (This step involves a machine learning model algorithm and falls under the mathematical concepts grouping of abstract ideas.)
providing the first subset of the attribute values as input values to the modified first instance of the machine learning model, wherein the modified first instance of the machine learning model is configured (i) to impute the second subset of the attribute values based on the first subset of the attribute values and (ii) to produce a first output based on the first subset of the attribute values and the imputed second subset of the attribute values … (This step involves data processing operations, i.e., evaluation, and falls under the mathematical operation/algorithm category of abstract ideas.)
performing, for the transaction, an action based on the first output (This step involves data processing operations and falls under the certain methods of organizing human activity grouping of abstract ideas.)
The claim thus recites mathematical concepts/mathematical operations involving data retrieval and data availability, which fall within the “Mathematical concepts” grouping of abstract ideas. Accordingly, the claims recite an abstract idea.
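For context only, the sequence of claimed steps characterized above (data gathering, availability determination, model modification, imputation, and output generation) can be sketched as follows. All names, weights, and values here are hypothetical illustrations and are not drawn from Applicant's disclosure:

```python
import numpy as np

def impute_and_score(attributes, impute_weights, output_weights):
    """Determine which attribute values are available, impute the
    unavailable ones from the available ones, and produce one output
    from the completed attribute vector."""
    available = {k: v for k, v in attributes.items() if v is not None}
    missing = [k for k, v in attributes.items() if v is None]
    x = np.array([available[k] for k in sorted(available)])
    # "Modified" model: only connections fed by available attributes are
    # used; each missing value is imputed linearly from available values.
    imputed = {k: float(impute_weights[k] @ x) for k in missing}
    full = {**available, **imputed}
    z = np.array([full[k] for k in sorted(full)])
    # First output is based on both the available and imputed subsets.
    return float(output_weights @ z), imputed
```

For example, with attributes {'a': 1.0, 'b': 2.0, 'c': None}, the value of 'c' is imputed from 'a' and 'b' before the output is computed.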
Step 2A prong 2: Does the claim recite additional elements? Do those additional elements, individually and in combination, integrate the judicial exception into a practical application?
Further, the claim does not recite any additional element that could integrate this abstract idea into a practical application, because the additional elements recited consist of:
“a non-transitory memory; and one or more hardware processors coupled with the non-transitory memory and configured to read instructions from the non-transitory memory to cause the system to,” that is, generic computer components on which to implement the abstract idea (see MPEP 2106.05(f));
“attempting to retrieve attribute values corresponding to a set of attributes and associated with a transaction,” that is, insignificant extra-solution activity of data gathering (see MPEP 2106.05(g)); and
“performing, for the transaction, an action based on the first output,” which is merely “using a computer or other machinery” as a tool to perform the abstract idea step of generating an output (see MPEP 2106.05(f)).
The additional elements are recited at a high level of generality and do not amount to significantly more than the abstract idea (MPEP 2106.05(f)). Thus, the claim is directed to the abstract idea.
Step 2B: Do the additional elements, considered individually and in combination, amount to significantly more than the judicial exception?
No. As shown above with respect to integration of the abstract idea into a practical application, the additional elements are: “a non-transitory memory; and one or more hardware processors coupled with the non-transitory memory and configured to read instructions from the non-transitory memory to cause the system to,” that is, generic computer components on which to implement the abstract idea (see MPEP 2106.05(f));
“attempting to retrieve attribute values corresponding to a set of attributes and associated with a transaction,” that is, insignificant extra-solution activity of data gathering (see MPEP 2106.05(g)); and
“performing, for the transaction, an action based on the first output.” The additional elements, alone and in combination, fail to integrate the abstract idea into a practical application. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept; neither can insignificant extra-solution activity. All of these additional elements as generically claimed are considered well-understood, routine, and conventional. Therefore, these limitations, taken alone or in combination, do not integrate the abstract idea into a practical application or recite significantly more than the abstract idea. Thus, the claims are not patent eligible.
Thus, these independent claims are not patent eligible.
The dependent claims respectively recite a judicial exception in the limitations of:
“wherein the first input layer corresponds to the first subset of the attribute values, wherein the second input layer corresponds to the second subset of the attribute values, and wherein the third input layer corresponds to a third subset of the attribute values.” (claim 2);
“wherein at least one of the first input layer, the second input layer, or the third input layer is connected to a hidden layer of the modified first instance of the machine learning model.” (claims 3, 14);
“wherein the modified first instance of the machine learning model is further configured to generate a set of intermediate values based on the first subset of the attribute values and the imputed second subset of the attribute values, and provide the set of intermediate values to a hidden layer of the modified first instance of the machine learning model.” (claim 4);
“wherein the operations further comprise: determining that a third subset of the attribute values is unavailable at the first point in time; and providing the first subset of the attribute values as input values to a second instance of the machine learning model, wherein the second instance of the machine learning model is configured to produce a second output based on the first subset of the attribute values.” (claims 5, 9, 18);
“wherein the operations further comprise: comparing the first output against the second output, wherein the performing the action is further based on the comparing.” (claims 6, 19);
“wherein the operations further comprise: determining that a difference between the first output and the second output is below a threshold; and calculating a merged output value based on the first output and the second output, wherein the performing the action is further based on the merged output value.” (claims 7, 20);
“determining that the difference exceeds a threshold; and withholding the processing the transaction request until one or more attribute values from the second subset of the attribute values are available.” (claim 10);
“determining that the one or more attribute values are available; generating a third modified machine learning model based on the one or more attribute values; and providing the first subset of the attribute values … the transaction request is processed further based on the third output.” (claim 11);
“wherein the training the machine learning model comprises: selecting different subsets of the training data set corresponding to different subsets of the set of attributes for training the machine learning model.” (claim 12);
“wherein the training the machine learning model further comprises: selecting a first subset of the training data set; generating a fourth modified machine learning model based on a selection of the first subset of the training data set; providing the first subset of the training data set to the fourth instance of the machine learning model; and adjusting parameters of the machine learning model based on an output value obtained from the fourth modified machine learning model.” (claim 13);
“wherein the device is configured to process the transaction request based on the prediction output.” (claim 16); and
“wherein the plurality of input layers comprises a first input layer corresponding to the first subset of the attribute values, a second input layer corresponding to the second subset of the attribute values, and a third input layer corresponding to a third subset of the attribute values.” (claim 17).
These additional limitations (in claims 2-7, 9-14, and 16-20) also constitute concepts that can be performed in the human mind, which fall within the “Mental Processes” grouping of abstract ideas.
This judicial exception is not integrated into a practical application. The additional elements of a “computer readable medium comprising: computer program code” (in claims 2-7, 9-14, and 16-20) amount to no more than adding insignificant extra-solution activity/specifications related to data gathering, data input, or data transmittal. These additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The dependent claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of a non-transitory computer readable medium comprising computer program code are again insignificant extra-solution activity steps that cannot provide an inventive concept. All of these additional elements as generically claimed are considered well-understood, routine, and conventional.
Therefore, these limitations, taken alone or in combination, do not integrate the abstract idea into a practical application or recite significantly more than the abstract idea. Thus, all of the dependent claims are also not patent eligible.
Examiner Comments
6. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claim Rejections - 35 USC § 103
7. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
8. Claims 1, 8, and 12-16 are rejected under 35 U.S.C. 103 as being unpatentable over Mishra (Pat. No. US 10733515 B1, Pub. Date: 2020-08-04) in view of Gupta (Pub. No. US 20240354767 A1, Pub. Date: 2024-10-24) and further in view of CHAPADOS (Pub. No. US 20220067437 A1, Pub. Date: 2022-03-03).
Regarding independent Claim 1,
Mishra teaches a system, comprising:
a non-transitory memory; and one or more hardware processors coupled with the non-transitory memory and configured to read instructions from the non-transitory memory to cause the system to perform operations (see Mishra: Fig.1, illustrating that the processor 102 executes instructions stored on computer-readable storage media 104), comprising:
attempting to retrieve attribute values corresponding to a set of attributes and associated with a transaction (see Mishra: Fig.3, Col. 7, Line 13-17, “At block 302, the data records are partitioned into Dataset A and Dataset B (a set of attributes). The data records may be partitioned randomly, such that the missing feature values are randomly distributed between Dataset A and Dataset B, such as by applying a randomizing algorithm to the data records and specifying that the data records are to be broken up into two or more subsets.”). (Examiner notes that Mishra’s data is associated with, for example, personal information that includes information about an individual, such as name, residence address, birthday, marriage status, length of time at residence, occupation, annual salary, number of people in the household, educational information, shopping habits, online activity, and other such types of information about a person; see, e.g., Col.4, Line 1-6.);
determining, at a first point in time subsequent to the attempting to retrieve the attribute values, that a first subset of the attribute values is available and that a second subset of the attribute values is unavailable (see Mishra: Fig.1, Col.4, Line 20-25, “the partitioning algorithm 106 may divide the dataset into two subsets, one in which the data records are complete, and the other where the records contain missing feature values.” … Col.4, Line 64-67, “the dataset may be partitioned to create Dataset A and Dataset B. In some instances, Dataset A is determined based upon no missing feature values in the data records, while Dataset B comprises the data records that are missing at least one feature value.” That is, Mishra discloses a sequential processing pipeline that first attempts to retrieve attribute values and then determines which attribute values are available or unavailable after the attempt, which corresponds to “at a first point in time.”);
in response to determining that the first subset of the attribute values is available and that the second subset of the attribute values is unavailable, modifying a first instance of a machine learning model (see Mishra: Fig.4, Col.9, Line 13-20, “a first machine learning model is trained on Dataset A” (i.e., modifying/changing the first ML model based on Dataset A). “In some cases, Dataset A is subdivided into two parts, one for training and one for verification. While this may be a useful validation tool, it is optional and may not be performed in every case. The first machine learning model is trained by looking for patterns within the Dataset, such that an input feature corresponds with a target feature.” See also Fig.2, Line 20-27, “a first machine learning model is trained using Dataset A as training data. As described, in some instances, Dataset A includes the target features since it contains all the feature values in the data records.”), […]
providing the first subset of the attribute values as input values to the modified first instance of the machine learning model (see Mishra: Fig.4, Col.10, Line 1-3, “At block 410, the second machine learning model is applied on Dataset A and missing feature values from Dataset A are imputed by the machine learning model.” … Col.9, Line 57-61, “the second machine learning model is the first machine learning model after it has been trained on Dataset A, and may be considered a second machine learning model.”), wherein the modified first instance of the machine learning model is configured (i) to impute the second subset of the attribute values based on the one or more first connections and the first subset of the attribute values (see Mishra: Fig.4, Col.9, Line 60-67, “a second machine learning model is trained on Dataset B. In many cases, the second machine learning model is the first machine learning model after it has been trained on Dataset A, and may be considered a second machine learning model. Dataset B may be partitioned into training data and validation data. In some scenarios, the missing values in Dataset B are filled in with mean, median, or mode values in order to train the machine learning model. As the model progresses, those initial mean, median, or mode values, may be replaced with imputed values from the machine learning model.”), and (ii) to produce a first output based on the first subset of the attribute values and the imputed second subset of the attribute values (see Mishra: Fig.4, Col.10, Line 17-19, “the Dataset may be recombined in a database to form complete data records with little or no missing feature values of interest.”); and
Mishra does not teach the system wherein:
performing, for the transaction, an action based on the first output.
the first instance of the machine learning model comprises a plurality of input layers, wherein each input layer comprises input nodes corresponding to a corresponding subset of the set of attributes, wherein a first set of input nodes from a first input layer of the plurality of input layers is connected to a second set of input nodes from a second input layer of the plurality of input layers via one or more first connections, wherein the second set of input nodes is connected to a third set of input nodes from a third input layer of the plurality of input layers via one or more second connections, and wherein the modifying comprises removing the one or more second connections in the first instance of the machine learning model;
However, Gupta teaches the system wherein:
performing, for the transaction, an action based on the first output (see Gupta: Fig.1, [0084], “the issuer service 121 can authorize the pending transaction 203. For example, the issuer service 121 can send a notification to the transaction terminal 106 and/or the client device 109 indicating that the pending transaction is authorized. Thereafter, this portion of the process proceeds to completion.”)
Because both Mishra and Gupta are in the same/similar field of endeavor of analyzing transactions using machine learning models, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the teaching of Mishra to include a system that performs, for the transaction, an action based on the first output, as taught by Gupta. After modification of Mishra, the machine learning model that imputes the missing attribute values can also incorporate time-series transaction data and produce a transaction output as taught by Gupta. One would have been motivated to make such a combination in order to help companies that deploy machine-learning models address problems in customer marketing, fraud detection and prevention, and credit decisioning, and to improve future decisions, enabling machine-learning models to deliver greater accuracy and predictability in their decisions over time (see Gupta: [0002]).
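The Mishra workflow cited above (partition the records into a complete Dataset A and an incomplete Dataset B, train a model on Dataset A, and apply it to impute the missing values) can be illustrated with a minimal sketch. The data and the least-squares model below are hypothetical stand-ins chosen for illustration, not Mishra's actual algorithm:

```python
import numpy as np

# Records with two features (x, y); y may be missing (np.nan).
records = np.array([[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, np.nan]])

# Partition: Dataset A = complete records, Dataset B = records missing
# at least one feature value (as in Mishra's Fig. 3/4 flow).
complete = ~np.isnan(records).any(axis=1)
dataset_a, dataset_b = records[complete], records[~complete]

# Train a model on Dataset A: here, a least-squares fit of y from x.
slope, intercept = np.polyfit(dataset_a[:, 0], dataset_a[:, 1], 1)

# Apply the trained model to impute the missing values in Dataset B.
dataset_b[:, 1] = slope * dataset_b[:, 0] + intercept
```

After imputation, the two datasets could be recombined into complete records, corresponding to Mishra's recombining step at Col.10.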
As shown above, Mishra and Gupta do not teach the system wherein:
wherein the first instance of the machine learning model comprises a plurality of input layers, wherein each input layer comprises input nodes corresponding to a corresponding subset of the set of attributes, wherein a first set of input nodes from a first input layer of the plurality of input layers is connected to a second set of input nodes from a second input layer of the plurality of input layers via one or more first connections, wherein the second set of input nodes is connected to a third set of input nodes from a third input layer of the plurality of input layers via one or more second connections, and wherein the modifying comprises removing the one or more second connections in the first instance of the machine learning model;
However, CHAPADOS teaches the system wherein:
the first instance of the machine learning model comprises a plurality of input layers (see CHAPADOS: Fig.1, [0065], “each modality is initially processed by its own convolutional pipeline, independently of all others.”), wherein each input layer comprises input nodes corresponding to a corresponding subset of the set of attributes (see CHAPADOS: Fig.1, [0072],” More precisely, each set of a plurality of convolution kernels to be trained is for receiving a specific modality of the image and for generating a plurality of corresponding feature maps.”), wherein a first set of input nodes from a first input layer of the plurality of input layers is connected to a second set of input nodes from a second input layer of the plurality of input layers via one or more first connections (see CHAPADOS: Fig.2, [0088], “the combining unit 202 is used for combining, for each convolution kernel to be trained of the plurality of convolution kernels to be trained, each feature map generated by a given convolution kernel to be trained in each set of the more than one set of a plurality of convolution kernels to be trained to thereby provide a plurality of corresponding combined feature maps.”), wherein the second set of input nodes is connected to a third set of input nodes from a third input layer of the plurality of input layers via one or more second connections (see CHAPADOS: Fig.2, [0098], “second group of convolution kernels is an embodiment of a second feature map generating unit. The second feature map generating unit is used for receiving the at least one corresponding combined feature map from the unit for generating combined feature maps and for generating at least one final feature map using at least one corresponding transformation. 
It will be further appreciated that the generating of the at least one final feature map is performed by applying each of the at least one corresponding transformation on at least one of the at least one corresponding feature map received from the unit for generating combined feature maps”), and wherein the modifying comprises removing the one or more second connections in the first instance of the machine learning model (see CHAPADOS: Fig.1, [0068], “generating of each of the more than one corresponding feature map is performed by applying a given corresponding transformation on a given corresponding modality. It will be further appreciated that the more than one corresponding transformation is generated following an initial training performed in accordance with the processing task to be performed. As further explained below, the initial training is performed according to a pseudo-curriculum learning scheme wherein after a few iterations where all modalities are presented, modalities are randomly dropped”)
Because Mishra, Gupta, and CHAPADOS are in the same/similar field of endeavor of machine learning and artificial intelligence (AI) models, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the teaching of Mishra to include a neural network architecture that operates with some inputs (modalities) missing by processing each input independently, removing input layers whose inputs are missing, and allowing the model to operate using only the available inputs, as taught by CHAPADOS. After modification of Mishra, the machine learning model that imputes the missing attribute values can also incorporate the removal or modification of missing-data input links or nodes as taught by CHAPADOS. One would have been motivated to make such a combination in order to create more stable and accurate ML models that deliver greater accuracy and predictability in their decisions over time.
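The CHAPADOS-style behavior relied on above, in which the branch of a missing modality is dropped so the network runs only on the available inputs, can be sketched as follows. The group names and weight shapes are hypothetical illustrations, not CHAPADOS's actual network:

```python
import numpy as np

def forward(inputs, weights, available):
    """Sum each input group's contribution to a shared hidden vector,
    skipping (i.e., removing the connections of) unavailable groups."""
    hidden = None
    for name, x in inputs.items():
        if not available.get(name, False):
            continue  # this modality is missing: drop its branch entirely
        contribution = weights[name] @ x
        hidden = contribution if hidden is None else hidden + contribution
    return hidden
```

With an image branch available and a text branch missing, only the image branch's connections contribute to the hidden representation.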
Regarding independent Claim 8,
Claim 8 is directed to a method claim and recites limitations that are the same as or similar to those of Claim 1, and is rejected under the same rationale.
Regarding Claim 12,
As shown above, Mishra, Gupta, and CHAPADOS teach all the limitations of Claim 8. Mishra teaches the method further comprising:
training the machine learning model using a training data set corresponding to the set of attributes (see Mishra: Fig.4, Col.9, Line 13-14, “first machine learning model is trained on Dataset A.”), wherein the training the machine learning model comprises:
selecting different subsets of the training data set corresponding to different subsets of the set of attributes for training the machine learning model (see Mishra: Fig.4, Col.9, Line 14-25, “In some cases, Dataset A is subdivided into two parts, one for training and one for verification. While this may be a useful validation tool, it is optional and may not be performed in every case. The first machine learning model is trained by looking for patterns within the Dataset, such that an input feature corresponds with a target feature. For example, the machine learning model may recognize a pattern between height and shoe size. That is, if the input feature is a male standing 6 feet 4 inches tall, the target feature may be that he wears size 13 shoes. The machine learning model may find this correlation between these two feature values.”)
Regarding Claim 13,
As shown above, Mishra, Gupta, and CHAPADOS teach all the limitations of Claim 12. Mishra teaches the method wherein:
the training the machine learning model further comprises selecting a first subset of the training data set (see Mishra: Fig.4, Col.9, Line 13-14, “first machine learning model is trained on Dataset A.”); generating a fourth instance of the machine learning model by modifying the machine learning model based on a selection of the first subset of the training data set; providing the first subset of the training data set to the fourth instance of the machine learning model; and adjusting parameters of the machine learning model based on an output value obtained from the fourth instance of the machine learning model (see Mishra: Fig.4, Col.10, Line 4-11, “The machine-based method may further be applied to train the second trained model by inputting the first dataset (with imputed values) into the second trained model to create a third trained model. This third trained model may then be applied on the second dataset by generating residual values for the second dataset. These residual values may then be used to generate imputed values which can be inserted into the second dataset”).
Regarding Claim 14,
Claim 14 is directed to a method claim and recites limitations that are the same as or similar to those of Claim 3, and is rejected under the same rationale.
Regarding independent Claim 15,
Claim 15 is directed to a non-transitory machine readable medium claim and recites limitations that are the same as or similar to those of Claims 1 and 8, and is rejected under the same rationale.
Regarding Claim 16,
As shown above, Mishra, Gupta, and CHAPADOS teach all the limitations of Claim 15. Mishra teaches the system wherein:
the device is configured to process the transaction request based on the prediction output (see Mishra: Fig.4, Col.5, Line 15-20, “This is possible because, in this scenario, the training data and the verification data both contain all the feature values of interest, and the machine learning model's prediction can therefore be checked against actual values for accuracy.”)
Claims 2-7, 9-11, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Mishra in view of Gupta and CHAPADOS, as applied to claims 1, 8, and 12-16 above, and further in view of KHARAGHANI (Pub. No. US 20180300629 A, Pub. Date: 2018-10-18).
Regarding Claim 2,
As shown above, Mishra, Gupta, and CHAPADOS teach all the limitations of Claim 1. Mishra, Gupta, and CHAPADOS do not teach the system wherein the plurality of input layers comprises a first input layer corresponding to the first subset of the attribute values, a second input layer corresponding to the second subset of the attribute values, and a third input layer corresponding to a third subset of the attribute values.
However, KHARAGHANI teaches the system wherein:
the plurality of input layers comprises a first input layer corresponding to the first subset of the attribute values, a second input layer corresponding to the second subset of the attribute values, and a third input layer corresponding to a third subset of the attribute values (see KHARAGHANI: Fig.2B, [0044], “illustrates a sparse full connected layer 100′, in accordance with one embodiment. The sparse fully connected layer 100′ is similar in structure to the fully connected layer (reference 100 in FIG. 1) and comprises the same number (n) of input nodes 102′.sub.1, 102′.sub.2, 102′.sub.3 . . . 102′.sub.n and the same number (m) of output nodes 104′.sub.1, 104′.sub.2, . . . 104′.sub.m, as the fully connected layer 100. The sparse fully connected layer 100′ however comprises fewer connections than the fully connected layer 100.”).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the teaching of Mishra to include the system that comprise an artificial neural network with node, layer and connection of the plurality of connections as taught by KHARAGHANI. One would have been motivated to make such a combination in order to provide efficient and improved system and method for training a neural network.
Regarding Claim 3,
As shown above, Mishra, Gupta, CHAPADOS, and KHARAGHANI teach all the limitations of Claim 2. KHARAGHANI further teaches the system wherein:
at least one of the first input layer, the second input layer, or the third input layer is connected to a hidden layer of the modified first instance of the machine learning model (see KHARAGHANI: Fig.2B, [0044], “if the fully connected layer 100 is the starting point, the pruned connection 106′.sub.1,1 (shown in dotted lines for clarity purposes) between input node 102′.sub.1 and output node 104′.sub.1, the pruned connection 106′.sub.2,1 between input node 102′.sub.2 and output node 104′.sub.1, and the pruned connection 106′.sub.m,n between input node 102′.sub.n and output node 104′.sub.m are removed so arrive at the sparse fully connected layer 100′.”).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the teaching of Mishra to include the system that more first connections connect the third input layer to the first input layer as taught by KHARAGHANI. One would have been motivated to make such a combination in order to provide efficient and improved system and method for training a neural network.
Regarding Claim 4,
As shown above, Mishra, Gupta, CHAPADOS, and KHARAGHANI teach all the limitations of Claim 1. KHARAGHANI further teaches wherein:
the modified first instance of the machine learning model is further configured to generate a set of intermediate values based on the first subset of the attribute values and the imputed second subset of the attribute values, and provide the set of intermediate values to a hidden layer of the modified first instance of the machine learning model (see KHARAGHANI: Fig.2B, [0044], “illustrates a sparse full connected layer 100′, in accordance with one embodiment. The sparse fully connected layer 100′ is similar in structure to the fully connected layer (reference 100 in FIG. 1) and comprises the same number (n) of input nodes 102′.sub.1, 102′.sub.2, 102′.sub.3 . . . 102′.sub.n and the same number (m) of output nodes 104′.sub.1, 104′.sub.2, . . . 104′.sub.m, as the fully connected layer 100. The sparse fully connected layer 100′ however comprises fewer connections than the fully connected layer 100.”), and
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the teaching of Mishra to include a system in which the one or more second connections connect the first input layer to the second layer, as taught by KHARAGHANI. One would have been motivated to make such a combination in order to provide an efficient and improved system and method for training a neural network.
Regarding Claim 5,
As shown above, Mishra, Gupta, and CHAPADOS teach all the limitations of Claim 1. Mishra, Gupta, and CHAPADOS do not teach the system wherein: determining that a third subset of the attribute values is available at the first point in time; and providing the first subset of the attribute values and the third subset of the attribute values as input values to a second instance of the machine learning model, wherein the second instance of the machine learning model is configured to produce a second output based on the first subset of the attribute values and the third subset of the attribute values.
However, KHARAGHANI teaches the system wherein:
determining that a third subset of the attribute values is available at the first point in time (see KHARAGHANI: Fig.2B, [0044], “illustrates a sparse full connected layer 100′, in accordance with one embodiment. The sparse fully connected layer 100′ is similar in structure to the fully connected layer (reference 100 in FIG. 1) and comprises the same number (n) of input nodes 102′.sub.1, 102′.sub.2, 102′.sub.3 . . . 102′.sub.n and the same number (m) of output nodes 104′.sub.1, 104′.sub.2, . . . 104′.sub.m, as the fully connected layer 100. The sparse fully connected layer 100′ however comprises fewer connections than the fully connected layer 100.”), and
providing the first subset of the attribute values as input values to a second instance of the machine learning model, wherein the second instance of the machine learning model is configured to produce a second output based on the first subset of the attribute values (see KHARAGHANI: Fig.2A, [0041], “step 204 comprises feeding the input data through the neural network layers over a randomly (or pseudo-randomly) selected subset of connections, as will be discussed further below. Step 204 may also comprise proceeding with the backward propagation (or backpropagation) phase. In the backpropagation phase, errors between the output values generated during the feed-forward phase and desired output values are computed and propagated back through the neural network layers.”),
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify the teaching of Mishra to include a system that provides a first portion of the first subset of the attribute values, corresponding to input nodes in the first input layer of the artificial neural network, as input values to a second instance of the machine learning model, as taught by KHARAGHANI. One would have been motivated to make such a combination in order to provide an efficient and improved system and method for training a neural network.
Regarding Claim 6,
As shown above, Mishra, Gupta, CHAPADOS, and KHARAGHANI teach all the limitations of Claim 5. Mishra further teaches wherein:
the operations further comprise: comparing the first output against the second output, wherein the performing the action is further based on the comparing (see Mishra: Fig.5, Col. 11, Line 6-10, “the model can further be validated by comparing the newly acquired actual values to the imputed values predicted by the machine learning model. In this way, the machine learning model is continuously improved as more data becomes available.”)
Regarding Claim 7,
As shown above, Mishra, Gupta, CHAPADOS, and KHARAGHANI teach all the limitations of Claim 5. Mishra further teaches the system wherein:
the operations further comprise determining that a difference between the first output and the second output is below a threshold (see Mishra: Fig.5, Col. 10, Line 36-45, “machine learning model is applied on Dataset B and is used to compute, determine, or predict the missing feature values based upon the previous mapping of the input data features to the target features. For example, where a data record is missing a user's cadence for online shopping transactions, if the data record indicates that the user visits a social media website a number of times that is above a threshold number, the machine learning model may predict that the user makes purchases from an online retailer on average of twice per month.”); and
calculating a merged output value based on the first output and the second output, wherein the performing the action is further based on the merged output value (see Mishra: Fig.4, Col. 10, Line 17-20 “the Dataset may be recombined in a database to form complete data records with little or no missing feature values of interest.”)
Regarding Claim 9,
Claim 9 is directed to a method claim, recites the same or similar limitations as Claim 5, and is rejected under the same rationale.
Regarding Claim 10,
As shown above, Mishra, Gupta, CHAPADOS, and KHARAGHANI teach all the limitations of Claim 9. Mishra further teaches the system wherein:
determining that the difference exceeds a threshold (see Mishra: Fig.4, Col.8, Line 6-11 “the model will tend to converge and the process may stop at convergence, or within a predetermined threshold of convergence. In other words, the residual value that indicates the difference between the actual value and the predicted value will converge at zero, or close to zero.”); and
withholding the processing the transaction request until one or more attribute values from the second subset of the attribute values are available (see Mishra: Fig.4, Col.10, Line 4-11 “the Dataset A values are adjusted for noise, as previously described above. An error metric may be calculated based on the mean absolute error. That is, for each record, the absolute error is the absolute value of the difference between the actual value and the predicted value. The average of the absolute error is the MAE. Blocks 410 and 412 may be iterated a number of times until the error metrics are no longer decreasing in Dataset A.”)
Regarding Claim 11,
As shown above, Mishra, Gupta, CHAPADOS, and KHARAGHANI teach all the limitations of Claim 10. Mishra further teaches the system wherein:
determining that the one or more attribute values are available (see Mishra: Fig.4, Col.3, Line 5-13, “first machine learning model is trained on Dataset A.”),
generating a third modified machine learning model based on the one or more attribute values (see Mishra: Fig.4, Col.9, Line 60-67, “a second machine learning model is trained on Dataset B. In many cases, the second machine learning model is the first machine learning model after it has been trained on Dataset A, and may be considered a second machine learning model. Dataset B may be partitioned into training data and validation data. In some scenarios, the missing values in Dataset B are filled in with mean, median, or mode values in order to train the machine learning model. As the model progresses, those initial mean, median, or mode values, may be replaced with imputed values from the machine learning model.”), and
providing the first subset of the attribute values, and the one or more attribute values to the third modified machine learning model (see Mishra: Fig.4, Col.10, Line 1-11, “The machine-based method may further be applied to train the second trained model by inputting the first dataset (with imputed values) into the second trained model to create a third trained model. This third trained model may then be applied on the second dataset by generating residual values for the second dataset. These residual values may then be used to generate imputed values which can be inserted into the second dataset”), wherein the third modified machine learning model is configured to produce a third output based on the first subset of the attribute values, and the one or more attribute values, and wherein the transaction request is processed further based on the third output (see Mishra: Fig.4, Col.10, Line 1-11, “The machine-based method may further be applied to train the second trained model by inputting the first dataset (with imputed values) into the second trained model to create a third trained model. This third trained model may then be applied on the second dataset by generating residual values for the second dataset. These residual values may then be used to generate imputed values which can be inserted into the second dataset”)
Regarding Claim 17,
Claim 17 is directed to a non-transitory machine-readable medium claim, recites the same or similar limitations as Claim 2, and is rejected under the same rationale.
Regarding Claim 18,
Claim 18 is directed to a non-transitory machine-readable medium claim, recites the same or similar limitations as Claim 5, and is rejected under the same rationale.
Regarding Claim 19,
Claim 19 is directed to a non-transitory machine-readable medium claim, recites the same or similar limitations as Claim 6, and is rejected under the same rationale.
Regarding Claim 20,
Claim 20 is directed to a non-transitory machine-readable medium claim, recites the same or similar limitations as Claim 7, and is rejected under the same rationale.
Response to Arguments
Claim Rejections - 35 U.S.C. § 101,
The 35 U.S.C. 101 rejection of the claims as being directed to non-statutory subject matter has been updated based on Applicant's amendments. Therefore, the 35 U.S.C. 101 rejection is sustained.
Claim Rejections - 35 U.S.C. § 103,
Applicant’s arguments with respect to the claim amendments have been considered but are moot in view of the new combination of references used in the current rejection. The new combination of references was necessitated by Applicant’s claim amendments. Therefore, the claims are rejected under the new combination of references as indicated above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
PGPUB Number: US 20230032822 A1
Inventor: Wang; Luming
Title: SYSTEMS AND METHODS FOR ADAPTING MACHINE LEARNING MODELS
Description: Machine learning models are widely applied to multiple different types of problems in multiple different applications. A machine learning model contains multiple parameters. Prior to being applied to a particular problem, a machine learning model is trained by using training data to estimate values of its parameters. The resulting trained machine learning model may be applied to input data to produce corresponding outputs.
PGPUB Number: US 20210272121 A1
Inventor: HARRIS; Theodore
Title: PRIVACY-PRESERVING GRAPH COMPRESSION WITH AUTOMATED FUZZY VARIABLE DETECTION
Description: A disclosed method includes a) receiving by a server computer network data comprising a plurality of transaction data for a plurality of transactions. Each transaction data comprises a plurality of data elements with data values. At least one of the plurality of data elements comprises a user identifier for a user. The server computer can then b) generate one or more graphs comprising a plurality of communities based on the network data.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZELALEM W SHALU whose telephone number is (571) 272-3003. The examiner can normally be reached M-F, 8:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Cesar Paula can be reached on (571) 272-4128. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Zelalem Shalu/Examiner, Art Unit 2145
/CESAR B PAULA/Supervisory Patent Examiner, Art Unit 2145