DETAILED ACTION
This non-final Office action is responsive to the claims filed 29 March 2023. Claims 1-20 are pending. Claims 1, 12, and 16 are independent claims.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-3, 7-18, and 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1
Under the first step of the analysis, in the instant case, claims 1-15 are directed to a method and claims 16-20 are directed to an apparatus. Each of these claims falls within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Regarding Claim 1
Step 2A Prong One
and determining an output prediction from the initial predictions and the weights.
(This step for collecting and recognizing information is understood to be a recitation of a mental process)
Step 2A Prong Two
A method comprising: providing input to a plurality of prediction models; obtaining an initial prediction from each of the plurality of prediction models; providing the input to one or more weight models; obtaining from the one or more weight models a weight for each initial prediction,
(This step for providing information to the models to produce output is considered insignificant extra solution activity. See MPEP § 2106.05(g))
wherein the weight for each initial prediction is based upon the input and behavior of each of the plurality of prediction models;
(This step for obtaining a weight value from the weight models based on the input and behavior of the prediction models is considered insignificant extra solution activity. See MPEP § 2106.05(g))
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites mental processes such as determining an output from weights and predictions, while the additional elements of obtaining inputs and outputs from prediction models, determining weights for these outputs, and determining the weights from predictions and model behavior are well-understood, routine, and conventional activity, as recognized by the court decisions listed in MPEP § 2106.05(d).
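For illustration only (this sketch is not part of the record and does not represent the applicant's disclosed implementation), the arrangement recited in claim 1 — obtaining initial predictions from a plurality of prediction models, obtaining a per-prediction weight from a weight model, and determining an output prediction from the two — can be sketched in Python. All function and variable names are hypothetical.

```python
# Hypothetical sketch of the claim 1 arrangement; not the applicant's code.

def ensemble_predict(x, prediction_models, weight_model):
    # Obtain an initial prediction from each of the plurality of models.
    initial = [m(x) for m in prediction_models]
    # Obtain a weight for each initial prediction; the weight depends on
    # the input and on which prediction model produced the prediction.
    weights = [weight_model(x, i) for i in range(len(prediction_models))]
    # Determine the output prediction from the initial predictions and
    # the weights (here, a normalized weighted combination).
    total = sum(w * p for w, p in zip(weights, initial))
    return total / sum(weights)

# Trivial stand-in models for demonstration:
models = [lambda x: x + 1.0, lambda x: x - 1.0]
weight_model = lambda x, i: 0.75 if i == 0 else 0.25
print(ensemble_predict(2.0, models, weight_model))
```

The weighted combination shown is only one way to "determine an output prediction from the initial predictions and the weights"; the claim language does not limit the combination to an average.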
Regarding claim 2
Step 2A Prong One
The method of claim 1,
(Claim 2 depends on claim 1, which has been determined to recite abstract ideas including mental processes. Therefore, claim 2 also recites an abstract idea.)
Step 2A Prong Two
wherein each of the plurality of prediction models comprises a machine learning model.
(This step for applying the prediction models using a machine learning model is to be understood as mere instructions to apply an exception. See MPEP § 2106.05(f))
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites abstract ideas while the additional element of applying the prediction models using a generic machine learning model is a well-understood, routine, and conventional activity, as recognized by the court decisions listed in MPEP § 2106.05(d).
Regarding claim 3
Step 2A Prong One
The method of claim 1,
(Claim 3 depends on claim 1, which has been determined to recite abstract ideas including mental processes. Therefore, claim 3 also recites an abstract idea.)
Step 2A Prong Two
wherein the one or more weight models comprises a machine learning model.
(This step for applying the weight models using a machine learning model is to be understood as mere instructions to apply an exception. See MPEP § 2106.05(f))
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites the judicial exceptions of claim 1, while the additional element of applying the weight model using a generic machine learning model is a well-understood, routine, and conventional activity, as recognized by the court decisions listed in MPEP § 2106.05(d).
Regarding claim 4
Step 2A Prong One
The method of claim 1,
(Claim 4 depends on claim 1, which has been determined to recite abstract ideas including mental processes. Therefore, claim 4 also recites an abstract idea.)
Step 2A Prong Two
wherein determining the output prediction comprises determining a plurality of weighted predictions by weighting each of the initial predictions with the respective weight for the initial prediction.
(This step incorporates the judicial exceptions from claim 1 into a process for determining the output prediction. In [0021] of the applicant’s specification, it is further explained that this process significantly improves the accuracy of the machine learning results. Therefore, this step is considered an improvement to the technology (see MPEP § 2106.04(d)(1)) and is not rejected under 35 U.S.C. § 101)
Regarding Claim 5
Step 2A Prong One
The method of claim 4,
(Claim 5 depends on claim 4, which has been determined to include abstract ideas including mental processes. However, claim 4 integrates the abstract ideas into a practical application. Claim 5 does not introduce any new abstract ideas, and therefore is not rejected under 35 USC § 101)
Regarding Claim 6
Step 2A Prong One
The method of claim 4,
(Claim 6 depends on claim 4, which has been determined to include abstract ideas including mental processes. However, claim 4 integrates the abstract ideas into a practical application. Claim 6 does not introduce any new abstract ideas, and therefore is not rejected under 35 USC § 101)
Regarding Claim 7
Step 2A Prong One
The method of claim 1, wherein the input comprises features extracted from text.
(This step defining the type of input data to be used is understood as a recitation of a mental process)
Step 2A Prong Two
The claim does not include additional elements, when considered separately and in combination, that integrate the judicial exception into a practical application.
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites mental processes such as defining the input as text data, without any technological improvement or inventive step.
Regarding Claim 8
Step 2A Prong One
The method of claim 7,
(Claim 8 depends on claim 7, which has been determined to recite abstract ideas including mental processes. Therefore, claim 8 also recites an abstract idea.)
Step 2A Prong Two
wherein the text is derived from human speech.
(This step recites gathering the text data from human speech, which is considered insignificant extra-solution activity. See MPEP § 2106.05(g))
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites abstract ideas while the additional element of deriving text from human speech is a well-understood, routine, and conventional activity, as recognized by the court decisions listed in MPEP § 2106.05(d).
Regarding Claim 9
Step 2A Prong One
The method of claim 7, further comprising determining an overall prediction from a plurality of output predictions determined from different features extracted from the text.
(This step for recognizing information is understood to be a recitation of a mental process)
Step 2A Prong Two
The claim does not include additional elements, when considered separately and in combination, that integrate the judicial exception into a practical application.
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites mental processes such as determining predictions from text data without any technological improvement or inventive step.
Regarding Claim 10
Step 2A Prong One
The method of claim 1,
(Claim 10 depends on claim 1, which has been determined to recite abstract ideas including mental processes. Therefore, claim 10 also recites an abstract idea.)
Step 2A Prong Two
wherein each initial prediction comprises a prediction class and a probability for the prediction class.
(This step is understood as gathering statistics and information, which is insignificant extra-solution activity. See MPEP § 2106.05(g))
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim is directed to the mental process of a prediction while the additional elements of associating a probability and class with a prediction are well-understood, routine, and conventional activity, as recognized by the court decisions listed in MPEP § 2106.05(d).
Regarding Claim 11
Step 2A Prong One
The method of claim 10,
(Claim 11 depends on claim 10, which has been determined to recite abstract ideas including mental processes. Therefore, claim 11 also recites an abstract idea.)
Step 2A Prong Two
wherein the probability is based upon the behavior of one of the plurality of prediction models and the input.
(This step is understood as gathering statistics and information, which is insignificant extra-solution activity. See MPEP § 2106.05(g))
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites an abstract idea while the additional elements of generically deriving probability using a model and input are a well-understood, routine, and conventional activity, as recognized by the court decisions listed in MPEP § 2106.05(d).
Regarding Claim 12
Step 2A Prong One
determining a weight for each prediction from the plurality of prediction models;
(This step for deriving a value based on recognized information is understood as a mental process)
Step 2A Prong Two
A method comprising: providing an input to a plurality of prediction models; obtaining, for the input, a prediction from each of the plurality of prediction models;
(This step for providing information to the models to produce output is considered insignificant extra solution activity. See MPEP § 2106.05(g))
generating a training dataset comprising the input labeled with the weights for each of the predictions from the plurality of prediction models;
(This step recites generating a training dataset using the aforementioned input, weights, and predictions which is extra-solution activity. See MPEP § 2106.05(g))
and training a weight model using the training dataset.
(This step recites using the training dataset to train a model, which is an insignificant application. See MPEP § 2106.05(g))
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites mental processes such as determining a weight for each prediction while the additional elements of providing input to and receiving output from prediction models, generating a training dataset, and training a model using the training dataset are a well-understood, routine, and conventional activity, as recognized by the court decisions listed in MPEP § 2106.05(d).
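For illustration only (again, not part of the record and not the applicant's disclosed implementation), the claim 12 flow — determining a weight for each model's prediction, labeling the input with those weights to generate a training dataset, and training a weight model on that dataset — can be sketched as follows. The per-prediction weighting rule and the memorizing "training" step are hypothetical stand-ins.

```python
# Hypothetical sketch of the claim 12 training flow; all names invented.

def build_training_dataset(inputs, correct, prediction_models):
    dataset = []
    for x, y in zip(inputs, correct):
        preds = [m(x) for m in prediction_models]
        # Determine a weight for each prediction (cf. claim 13: based on a
        # predetermined correct prediction and each model's prediction).
        weights = [1.0 if p == y else 0.0 for p in preds]
        # The training dataset comprises the input labeled with the weights.
        dataset.append((x, weights))
    return dataset

def train_weight_model(dataset):
    # Trivial "training" for demonstration: memorize labeled weights.
    table = {x: w for x, w in dataset}
    default = [1.0] * len(next(iter(table.values())))
    return lambda x: table.get(x, default)

models = [lambda x: x % 2, lambda x: 0]
data = build_training_dataset([1, 2, 3], [1, 0, 1], models)
weight_model = train_weight_model(data)
print(weight_model(2))
```

In practice the weight model would be a learned regressor rather than a lookup table; the sketch only mirrors the generate-dataset-then-train sequence the claim recites.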
Regarding Claim 13
Step 2A Prong One
The method of claim 12,
(Claim 13 depends on claim 12, which has been determined to recite abstract ideas including mental processes. Therefore, claim 13 also recites an abstract idea.)
wherein determining the weight for each prediction comprises determining the weight based upon a predetermined correct prediction for the input and the predictions from each of the plurality of prediction models.
(This step for determining the weight of each prediction using predictions is understood as a mental process.)
Step 2A Prong Two
The claim does not include additional elements, when considered separately and in combination, that integrate the judicial exception into a practical application.
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites an abstract idea of determining weights of each prediction using prediction information, without any technological improvement or inventive step.
Regarding claim 14
Step 2A Prong One
The method of claim 12, wherein the input comprises features extracted from text.
(This step defining the type of input data to be used is understood as a recitation of a mental process)
Step 2A Prong Two
The claim does not include additional elements, when considered separately and in combination, that integrate the judicial exception into a practical application.
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites mental processes such as defining the input as text data, without any technological improvement or inventive step.
Regarding Claim 15
Step 2A Prong One
The method of claim 14,
(Claim 15 depends on claim 14, which has been determined to recite abstract ideas including mental processes. Therefore, claim 15 also recites an abstract idea.)
Step 2A Prong Two
wherein the text comprises text derived from human speech.
(This step recites gathering the text data from human speech, which is considered insignificant extra-solution activity. See MPEP § 2106.05(g))
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites abstract ideas while the additional element of deriving text from human speech is a well-understood, routine, and conventional activity, as recognized by the court decisions listed in MPEP § 2106.05(d).
Regarding Claim 16
Step 2A Prong One
obtain from the one or more weight models a weight for each initial prediction,
(This step for deriving a value based on recognized information is understood as a mental process)
Step 2A Prong Two
provide input to a plurality of prediction models; obtain an initial prediction from each of the plurality of prediction models; provide the input to one or more weight models;
(This step for providing input to and receiving output from models is considered insignificant extra-solution activity. See MPEP § 2106.05(g))
One or more tangible, non-transitory computer readable storage media encoded with instructions that, when executed by one or more processors, cause the one or more processors to:
(This step invokes non-transitory computer readable storage media merely as a tool to receive, store, and transmit data. See MPEP § 2106.05(f))
wherein the weight for each initial prediction is based upon the input and behavior of each of the plurality of prediction models;
(This step for obtaining a weight value from the weight models based on the input and behavior of the prediction models is considered insignificant extra solution activity. See MPEP § 2106.05(g))
and determine an output prediction from the initial predictions and the weights.
(This step for determining an output using predictions and weights is considered insignificant extra solution activity. See MPEP § 2106.05(g))
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites mental processes such as determining weights for the predictions while the additional elements of obtaining inputs and outputs from prediction models, using a computer readable medium to perform generic computing functions, determining the weights from predictions and model behavior, and determining an output using predictions and weights are well-understood, routine, and conventional activity, as recognized by the court decisions listed in MPEP § 2106.05(d).
Regarding Claim 17
Step 2A Prong One
The one or more computer readable storage media of claim 16,
(Claim 17 depends on claim 16, which has been determined to recite abstract ideas including mental processes. Therefore, claim 17 also recites an abstract idea.)
Step 2A Prong Two
wherein each of the plurality of prediction models comprises a machine learning model.
(This step for applying the prediction models using a machine learning model is to be understood as mere instructions to apply an exception. See MPEP § 2106.05(f))
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites abstract ideas while the additional element of applying the prediction models using a generic machine learning model is a well-understood, routine, and conventional activity, as recognized by the court decisions listed in MPEP § 2106.05(d).
Regarding Claim 18
Step 2A Prong One
The one or more computer readable storage media of claim 16,
(Claim 18 depends on claim 16, which has been determined to recite abstract ideas including mental processes. Therefore, claim 18 also recites an abstract idea.)
Step 2A Prong Two
wherein the one or more weight models comprises a machine learning model.
(This step for applying the weight models using a machine learning model is to be understood as mere instructions to apply an exception. See MPEP § 2106.05(f))
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim recites abstract ideas while the additional element of applying the one or more weight models using a generic machine learning model is a well-understood, routine, and conventional activity, as recognized by the court decisions listed in MPEP § 2106.05(d).
Regarding claim 19
Step 2A Prong One
The one or more computer readable storage media of claim 16,
(Claim 19 depends on claim 16, which has been determined to recite abstract ideas including mental processes. Therefore, claim 19 also recites an abstract idea.)
Step 2A Prong Two
wherein the instructions operable to determine the output prediction comprise instructions operable to determine the output prediction by determining a plurality of weighted predictions by weighting each of the initial predictions with the respective weight for the initial prediction.
(This step incorporates the judicial exceptions from claim 16 into a process for determining the output prediction. In [0021] of the applicant’s specification, it is further explained that this process significantly improves the accuracy of the machine learning results. Therefore, this step is considered an improvement to the technology (see MPEP § 2106.04(d)(1)) and is not rejected under 35 U.S.C. § 101)
Regarding Claim 20
Step 2A Prong One
The one or more computer readable storage media of claim 16,
(Claim 20 depends on claim 16, which has been determined to recite abstract ideas including mental processes. Therefore, claim 20 also recites an abstract idea.)
Step 2A Prong Two
wherein each initial prediction comprises a prediction class and a probability for the prediction class.
(This step is understood as gathering statistics and information, which is insignificant extra-solution activity. See MPEP § 2106.05(g))
Step 2B
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered individually and in combination, they do not add significantly more (also known as an inventive concept) to the exception. The claim is directed to the mental process of a prediction while the additional elements of associating a probability and class with a prediction are a well-understood, routine, and conventional activity, as recognized by the court decisions listed in MPEP § 2106.05(d).
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-7, 9-11, and 16-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Spencer et al. (US 9424246 B2) (hereinafter Spencer).
Regarding claim 1, Spencer teaches
A method comprising: providing input to a plurality of prediction models;
obtaining an initial prediction from each of the plurality of prediction models;
([Col.33, ln.62-64] “The text prediction engine 3000 is configured to generate concurrently using user text input text predictions from each of the plurality of language models 320, 300, 340.” Teaches a plurality of prediction models (the language models) using text as input, and each of the models produce initial predictions.)
providing the input to one or more weight models;
([Col.33 ln.65 – Col.34 ln.1] “The Multi-LM 380 is configured take the products of the probabilities of the text predictions generated from a given language model and the context-specific weighting factor associated with that language model.”
The Multi-LM assigns a weighting factor to the predictions; therefore, the Multi-LM is a weight model.
[Col.8, ln.18-21] “In response to the input of end-of-sentence punctuation or a ‘return’ character, or at an otherwise predetermined time, the user inputted text sequence is passed to the Multi-LM”
The weight model is being provided input here.)
obtaining from the one or more weight models a weight for each initial prediction, wherein the weight for each initial prediction is based upon the input and behavior of each of the plurality of prediction models;
([Col.33 ln.65 – Col.34 ln.1] “The Multi-LM 380 is configured take the products of the probabilities of the text predictions generated from a given language model and the context-specific weighting factor associated with that language model.”
Here we are being shown the weight model / Multi-LM applying a weight to each prediction. Additionally,
[Col.28 ln.11-20] “In one embodiment, the Multi-LM 180 is configured to apply a context-specific weighting factor when combining the predictions from the relevant context-specific language model CS.sub.RLM and the general language model 170, which weighting factor is tailored to take into account the fact that a smaller language model (e.g. a context-specific dynamic language model CS.sub.RLM) may provide less accurate text predictions than a larger language model (e.g. the general language model 170), because it has been trained on much less data than the larger language model.”,
here we are shown that the weighting factor applied by the multi-LM depends on the input (input size here) and behavior of the plurality of prediction models (general language model behavior vs context-specific language model behavior))
and determining an output prediction from the initial predictions and the weights.
([Col.34 ln.1-5] “The Multi-LM 380 is then configured to combine the weighted predictions from the multiple language models to provide a final set of text predictions 390, which may include for example, the top n most likely text predictions.” The output prediction here is determined using the weights and initial predictions.)
Regarding claim 2, Spencer teaches:
The method of claim 1, wherein each of the plurality of prediction models comprises a machine learning model.
([Col.34 ln.1-5] “The Multi-LM 380 is then configured to combine the weighted predictions from the multiple language models to provide a final set of text predictions 390, which may include for example, the top n most likely text predictions.” the prediction models are language models, which are machine learning models)
Regarding claim 3, Spencer teaches:
The method of claim 1, wherein the one or more weight models comprises a machine learning model.
([Col.34 ln.1-5] “The Multi-LM 380 is then configured to combine the weighted predictions from the multiple language models to provide a final set of text predictions 390, which may include for example, the top n most likely text predictions.”, the weight model is a Multi-LM, which is a machine learning model)
Regarding claim 4, Spencer teaches:
The method of claim 1, wherein determining the output prediction comprises determining a plurality of weighted predictions by weighting each of the initial predictions with the respective weight for the initial prediction.
([Col.33 ln.62 – Col.34 ln.1-5] “The text prediction engine 3000 is configured to generate concurrently using user text input text predictions from each of the plurality of language models 320, 300, 340. The Multi-LM 380 is configured take the products of the probabilities of the text predictions generated from a given language model and the context-specific weighting factor associated with that language model. The Multi-LM 380 is then configured to combine the weighted predictions from the multiple language models to provide a final set of text predictions 390, which may include for example, the top n most likely text predictions.”
This describes the output prediction being determined by producing a plurality of predictions from the plurality of prediction models, which are then weighted by the weight model (Multi-LM), and combined to create a final output prediction set.)
Regarding claim 5, Spencer teaches:
The method of claim 4, wherein the output prediction comprises all of the weighted predictions.
([Col.33 ln.62 – Col.34 ln.1-5] “The text prediction engine 3000 is configured to generate concurrently using user text input text predictions from each of the plurality of language models 320, 300, 340. The Multi-LM 380 is configured take the products of the probabilities of the text predictions generated from a given language model and the context-specific weighting factor associated with that language model. The Multi-LM 380 is then configured to combine the weighted predictions from the multiple language models to provide a final set of text predictions 390, which may include for example, the top n most likely text predictions.”
the example that the top n most likely weighted predictions are selected could be all of the weighted predictions when n equals the number of predictions)
Regarding claim 6, Spencer teaches:
The method of claim 4, wherein the output prediction comprises one of the weighted predictions.
([Col.33 ln.62 – Col.34 ln.1-5] “The text prediction engine 3000 is configured to generate concurrently using user text input text predictions from each of the plurality of language models 320, 300, 340. The Multi-LM 380 is configured take the products of the probabilities of the text predictions generated from a given language model and the context-specific weighting factor associated with that language model. The Multi-LM 380 is then configured to combine the weighted predictions from the multiple language models to provide a final set of text predictions 390, which may include for example, the top n most likely text predictions.”
the example that the top n most likely weighted predictions are selected could be only one of the predictions when n equals 1)
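For illustration only (hypothetical names; not Spencer's code or the applicant's), the combination Spencer describes — multiplying each language model's prediction probabilities by that model's context-specific weighting factor, merging the weighted predictions, and returning the top n — can be sketched as follows. With n equal to the total number of predictions the output comprises all weighted predictions (claim 5); with n = 1 it comprises a single weighted prediction (claim 6).

```python
# Hypothetical sketch of a Multi-LM-style weighted combination.

def combine_predictions(model_outputs, weighting_factors, n):
    weighted = {}
    for preds, factor in zip(model_outputs, weighting_factors):
        for word, prob in preds.items():
            # Product of the prediction probability and the model's
            # context-specific weighting factor; merge across models.
            weighted[word] = weighted.get(word, 0.0) + prob * factor
    # Final set: the top n most likely text predictions.
    ranked = sorted(weighted.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]

outputs = [{"the": 0.6, "then": 0.2}, {"the": 0.3, "they": 0.5}]
factors = [0.7, 0.3]
print(combine_predictions(outputs, factors, 1))
```

Summing the weighted probabilities for words proposed by more than one model is an assumption of this sketch; the cited passages state only that the weighted predictions are combined.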
Regarding claim 7, Spencer teaches:
The method of claim 1, wherein the input comprises features extracted from text.
([Col.4 ln.40-45] “the user interface is configured to display the word in the typing pane and pass the current input sequence including that word to the text prediction engine as a context input. In response to a word key press and hold or left-to-right gesture on the word key, the user interface is configured to display the word in the typing pane, pass the current input sequence excluding that word to the text prediction engine as a context input, and pass the characters of that word to the text prediction engine as a current word input.”
This excerpt shows features including context and current word input being extracted from the text input.)
Regarding claim 9, Spencer teaches:
The method of claim 7, further comprising determining an overall prediction from a plurality of output predictions determined from different features extracted from the text.
([Col.7 ln.35-42] “The text prediction engine 100 operates to generate concurrently text predictions 20 from the multiple language models present. It does this by employing a multi-language model 8 (Multi-LM) to combine the predictions 20 sourced from each of the multiple language models to generate final predictions 9 that are provided to a user interface for display and user selection. The final predictions 9 are a set (i.e. a specified number) of the overall most probable predictions.”
This excerpt shows that the overall prediction (a set of most probable predictions) is determined from a plurality of output predictions (generate predictions from multiple language models) determined from the input where the input includes different features extracted from the text
([Col.4 ln.40-45] “the user interface is configured to display the word in the typing pane and pass the current input sequence including that word to the text prediction engine as a context input. In response to a word key press and hold or left-to-right gesture on the word key, the user interface is configured to display the word in the typing pane, pass the current input sequence excluding that word to the text prediction engine as a context input, and pass the characters of that word to the text prediction engine as a current word input.”
This excerpt shows features including context and current word input being extracted from the text input.))
Regarding claim 10, Spencer teaches;
The method of claim 1, wherein each initial prediction comprises a prediction class and a probability for the prediction class.
([Col.5 ln.30-35] “In an embodiment, the language model further comprises a topic filter and the method includes the further steps of predicting topic categories represented in a current input text, predicting topic categories for the terms in the prediction set and adjusting the probabilities of the predictions in the prediction set based on the topic category predictions.”
This excerpt shows that the prediction models comprise a topic filter wherein predictions comprise a prediction class (topic category) and an associated probability for the prediction class.)
Regarding claim 11, Spencer teaches;
The method of claim 10, wherein the probability is based upon the behavior of one of the plurality of prediction models and the input.
([Col.5 ln.30-35] “In an embodiment, the language model further comprises a topic filter and the method includes the further steps of predicting topic categories represented in a current input text, predicting topic categories for the terms in the prediction set and adjusting the probabilities of the predictions in the prediction set based on the topic category predictions.”,
the probability is produced by an LM, and is therefore based upon the behavior of one of the plurality of prediction models. Additionally:
[Col.6 ln.47-48] “The language models are generated from language texts.” Given that the LM is built upon the input, the probability produced by the LM is based on that same input. Therefore, the probability generated by the prediction models is based upon both the behavior of one of the plurality of prediction models and the input.)
Regarding claim 16, Spencer teaches;
One or more tangible, non-transitory computer readable storage media encoded with instructions that, when executed by one or more processors, cause the one or more processors to:
([Col.5 ln.52-56] “There is also provided, in accordance with the disclosure a computer program product including a computer readable medium having stored thereon computer program means for causing a processor to carry out the method of the disclosure.”
This excerpt discloses a non-transitory computer readable medium (the language “having stored thereon” implies non-transitory media) encoded with instructions (“having stored thereon computer program means”) that, when executed by one or more processors, perform the method of the disclosure (“for causing a processor to carry out the method of the disclosure”).)
provide input to a plurality of prediction models; obtain an initial prediction from each of the plurality of prediction models;
([Col.33, ln.62-64] “The text prediction engine 3000 is configured to generate concurrently using user text input text predictions from each of the plurality of language models 320, 300, 340.” Teaches a plurality of prediction models (the language models) using text as input, and each of the models produce initial predictions.)
provide the input to one or more weight models;
([Col.33 ln.65 – Col.34 ln.1] “The Multi-LM 380 is configured take the products of the probabilities of the text predictions generated from a given language model and the context-specific weighting factor associated with that language model.”
The Multi-LM assigns a weighting factor to the predictions; therefore, the Multi-LM is a weight model. Additionally:
([Col.8, ln.18-21] “In response to the input of end-of-sentence punctuation or a ‘return’ character, or at an otherwise predetermined time, the user inputted text sequence is passed to the Multi-LM”
The weight model is being provided input here)
obtain from the one or more weight models a weight for each initial prediction, wherein the weight for each initial prediction is based upon the input and behavior of each of the plurality of prediction models;
([Col.33 ln.65 – Col.34 ln.1] “The Multi-LM 380 is configured take the products of the probabilities of the text predictions generated from a given language model and the context-specific weighting factor associated with that language model.”
Here we are being shown the weight model / Multi-LM applying a weight to each prediction. Additionally,
[Col.28 ln.11-20] “In one embodiment, the Multi-LM 180 is configured to apply a context-specific weighting factor when combining the predictions from the relevant context-specific language model CS.sub.RLM and the general language model 170, which weighting factor is tailored to take into account the fact that a smaller language model (e.g. a context-specific dynamic language model CS.sub.RLM) may provide less accurate text predictions than a larger language model (e.g. the general language model 170), because it has been trained on much less data than the larger language model.”,
here we are shown that the weighting factor applied by the Multi-LM depends on the input (here, the size of the input data used to train each model) and on the behavior of the plurality of prediction models (the general language model’s behavior versus the context-specific language model’s behavior))
and determine an output prediction from the initial predictions and the weights.
([Col.34 ln.1-5] “The Multi-LM 380 is then configured to combine the weighted predictions from the multiple language models to provide a final set of text predictions 390, which may include for example, the top n most likely text predictions.” The output prediction here is determined using the weights and initial predictions.)
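For illustration only, a minimal sketch (invented names; not the claimed invention or the patented Multi-LM) of the weighted-combination step the excerpt describes: each model’s prediction probabilities are multiplied by that model’s weighting factor, and the top n weighted predictions are returned.

```python
def combine_predictions(model_predictions, model_weights, n=3):
    """model_predictions: {model_name: {candidate: probability}};
    model_weights: {model_name: weighting factor}.
    Returns the top-n (candidate, weighted score) pairs."""
    combined = {}
    for model, preds in model_predictions.items():
        weight = model_weights[model]
        for candidate, prob in preds.items():
            # Product of the prediction probability and the model's
            # weighting factor; scores for the same candidate coming
            # from different models are accumulated.
            combined[candidate] = combined.get(candidate, 0.0) + weight * prob
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)[:n]

preds = {"general": {"there": 0.6, "the": 0.3},
         "context_specific": {"the": 0.7, "they": 0.2}}
top = combine_predictions(preds, {"general": 0.8, "context_specific": 0.2}, n=2)
# top holds the two highest weighted candidates; with n=1 only a single
# prediction would be selected, consistent with the top-n language above.
```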
Regarding claim 17, Spencer teaches;
The one or more computer readable storage media of claim 16, wherein each of the plurality of prediction models comprises a machine learning model.
([Col.34 ln.1-5] “The Multi-LM 380 is then configured to combine the weighted predictions from the multiple language models to provide a final set of text predictions 390, which may include for example, the top n most likely text predictions.” the prediction models are language models, which are machine learning models)
Regarding claim 18, Spencer teaches;
The one or more computer readable storage media of claim 16, wherein the one or more weight models comprises a machine learning model.
([Col.34 ln.1-5] “The Multi-LM 380 is then configured to combine the weighted predictions from the multiple language models to provide a final set of text predictions 390, which may include for example, the top n most likely text predictions.”, the weight model is a Multi-LM, which is a machine learning model)
Regarding claim 19, Spencer teaches;
The one or more computer readable storage media of claim 16, wherein the instructions operable to determine the output prediction comprise instructions operable to determine the output prediction by determining a plurality of weighted predictions by weighting each of the initial predictions with the respective weight for the initial prediction.
(As described above with respect to claim 16, the embodiment described in Spencer’s disclosure is performed using the computer readable storage media of claim 16.
[Col.33 ln.62 – Col.34 ln.5] “The text prediction engine 3000 is configured to generate concurrently using user text input text predictions from each of the plurality of language models 320, 300, 340. The Multi-LM 380 is configured take the products of the probabilities of the text predictions generated from a given language model and the context-specific weighting factor associated with that language model. The Multi-LM 380 is then configured to combine the weighted predictions from the multiple language models to provide a final set of text predictions 390, which may include for example, the top n most likely text predictions.”
This describes the output prediction being determined by producing a plurality of predictions from the plurality of prediction models, which are then weighted by the weight model (Multi-LM), and combined to create a final output prediction set.)
Regarding claim 20, Spencer teaches;
The one or more computer readable storage media of claim 16, wherein each initial prediction comprises a prediction class and a probability for the prediction class.
([Col.5 ln.30-35] “In an embodiment, the language model further comprises a topic filter and the method includes the further steps of predicting topic categories represented in a current input text, predicting topic categories for the terms in the prediction set and adjusting the probabilities of the predictions in the prediction set based on the topic category predictions.”
This excerpt shows that the prediction models comprise a topic filter wherein predictions comprise a prediction class (topic category) and an associated probability for the prediction class.)
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Spencer et al. (US 9424246 B2) (hereinafter Spencer).
Regarding claim 8, Spencer teaches;
The method of claim 7,
Spencer fails to teach;
wherein the text is derived from human speech.
However, Spencer also discloses:
([Col.2 ln.8-10] “A somewhat different model of text entry is offered by the University of Cambridge's ‘Dasher’ system, in which text input is driven by natural, continuous pointing gestures rather than keystrokes. It relies heavily on advanced language model-based character prediction, and is aimed primarily at improving accessibility for handicapped users, although it can also be used in mobile and speech recognition-based applications.”)
This excerpt discloses a different language model-based text prediction system (text prediction being the subject of Spencer’s disclosure) which can be utilized in speech recognition-based applications.
Spencer further explains that:
(“[Col.2 ln.16-21] Many of the input models discussed above utilize some form of text prediction technology. Known prediction models for enhancing text input have two main functions:
1) Disambiguation of multiple-character keystrokes.
2) Offering potential completions for partially-entered sequences.”)
From this, one of ordinary skill in the art would recognize that these functions could also be used to improve the accuracy of speech recognition.
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the method of claim 7 by using text derived from human speech, in order to apply these prediction functions to improving the accuracy of speech recognition.
Claim(s) 12-13 is/are rejected under 35 U.S.C. 103 as being unpatentable over Abhishek et al. (US 20230169323 A1) (hereinafter Abhishek).
Regarding claim 12, Abhishek teaches;
A method comprising: providing an input to a plurality of prediction models; obtaining, for the input, a prediction from each of the plurality of prediction models;
([0019] “In one embodiment of the present disclosure, a classification model is built in which a classified dataset for which classes are pre-labeled is inputted to the classification model. In one embodiment, the classified dataset includes label noise. Based on the input, the classification model generates a prediction of class probabilities. Furthermore, a second model is built with the same architecture as the classification model, where the second model is a moving average of the classification model. Similarly, as the classification model, the second model generates a prediction of class probabilities.” This excerpt discloses a plurality of prediction models (classification models which produce predictions of class probabilities) which are provided an input. Each of the plurality of prediction models then obtain a prediction for the input)
determining a weight for each prediction from the plurality of prediction models;
([0111] “The predictions of class probabilities of the classification model and the second model are then combined using an artificial neural network. Weight factors used to weight the combined predictions of class probabilities of the classification model and the second model are generated by the artificial neural network.” The artificial neural network determines a weight for each prediction from the plurality of prediction models)
Abhishek does not explicitly teach;
generating a training dataset comprising the input labeled with the weights for each of the predictions from the plurality of prediction models;
However, Abhishek discloses;
([0019] “A prediction of class probabilities using these weighted predictions is then obtained by the artificial neural network. The predictions of class probabilities of the artificial neural network and the classification model are then combined to train the machine learning model, such as the deep learning model. In this manner, a machine learning model, such as a deep learning model, is trained using noisily labeled data.”
This excerpt discloses generating a training dataset having the input data labeled with class probabilities derived from the weighted predictions generated by the artificial neural network. Abhishek thus already discloses generating a training dataset in which the input data is labeled with the weighted predictions from the plurality of prediction models; all that is missing is labeling the training dataset directly with the already-derived weights rather than with the prediction probabilities.)
It would therefore have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify this process by labeling the input data in the training dataset using the prediction weights instead of the prediction probabilities from the plurality of prediction models.
Abhishek does not explicitly disclose;
and training a weight model using the training dataset.
However;
(Abhishek discloses using the above-described method of generating a training dataset to train a machine learning model. Abhishek also discloses a weight model which assigns weights to the predictions from the plurality of prediction models (as described above, the weight model here is to be understood as the artificial neural network))
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to use the generated training dataset to train the weight model.
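For illustration only, a sketch (invented names; not Abhishek’s actual implementation) of the modification the rejection proposes: pair each input with its already-derived weight as the label, then fit a small weight model on that generated dataset.

```python
def build_weight_dataset(inputs, derived_weights):
    """Pair each input feature value with the weight derived for it."""
    return list(zip(inputs, derived_weights))

def train_weight_model(dataset, lr=0.1, epochs=200):
    """Fit w, b so that w*x + b approximates the stored weight label,
    using plain per-sample gradient descent on squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in dataset:
            err = (w * x + b) - target
            w -= lr * err * x   # gradient step for the slope
            b -= lr * err       # gradient step for the bias
    return w, b

# Toy data: the weight label grows linearly with the (scalar) input feature.
data = build_weight_dataset([0.0, 0.5, 1.0], [0.2, 0.45, 0.7])
w, b = train_weight_model(data)
# After training, w*x + b approximates the labeled weights.
```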
Regarding claim 13, Abhishek teaches;
The method of claim 12, wherein determining the weight for each prediction comprises determining the weight based upon a predetermined correct prediction for the input.
([0094] “As discussed above, in one embodiment, MLP neural network 306 is configured to generate weight factors (weight values) from the predictions of the class probabilities 303, 305, which are used to weight the predictions 303, 305 of classification model 301 and guider model 304, respectively.” [0097] “Weights are learnable parameters inside the network. A teachable neural network will randomize the weight values before learning initially begins. As training continues, the learnable parameters (weights) are adjusted toward the desired values and the correct output. Weights indicate the strength of the connection between the input and output. Weight affects the amount of influence a change in the input will have upon the output. A low weight value will have no change on the input, and alternatively, a larger weight value will more significant”
Teaches adjusting the weights toward a correct output in the MLP, which is responsible for assigning weights to each prediction. In other words, this teaches determining the weight for each prediction, where determining the weight is based upon a predetermined correct prediction for the input.)
Claim(s) 14-15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Abhishek et al. (US 20230169323 A1) (hereinafter Abhishek) in view of Wang et al. (US 20080249762 A1) (hereinafter Wang).
Regarding claim 14, Abhishek teaches;
The method of claim 12,
Abhishek fails to teach;
wherein the input comprises features extracted from text.
However, Wang teaches;
wherein the input comprises features extracted from text.
([0019] “In some embodiments, the classification system may use several different models for term n-grams and part-of-speech n-grams for n-grams of varying lengths (e.g., unigrams, bigrams, and trigrams). To generate a combined score for the models, the classification system learns weights for the various models. To learn the weights, the classification system may collect additional training documents and label those training documents. The classification system then uses each model to classify the additional training documents. The classification system may use a linear regression technique to calculate weights for each of the models to minimize the error between a classification generated by the weighted models and the label”
Wang discloses processes very similar to those of Abhishek, including a plurality of classification models, a machine learning model that assigns weights based on the predictions of other models, and the use of labeled data to train machine learning algorithms.
Additionally; [Abstract] “A method and system is provided for classifying documents based on the subjectivity of the content of the documents using a part-of-speech analysis to help account for unseen words.” [0030] “The classification system may use smoothing techniques to overcome the problem of underestimated probability of any word unseen in a document. In general, smoothing techniques try to discount the probabilities of the words seen in the text and then assign an extra probability mass to the unseen words.”
These excerpts describe the input as documents, a term which Wang later uses interchangeably with text. The input is therefore text.
Finally; [0006] “A classification system trains a classifier using the parts of speech of training documents so that the classifier can classify an unseen word based on the part of speech of the unseen word. The classification system identifies n-grams of the parts of speech of the words of each training document. The classification system also identifies n-grams of the terms of the training documents. The classification system then trains a part-of-speech model using the parts of speech of the n-grams and labels of the training documents, and trains a term model using the term unigrams and labels.”
This excerpt discloses that the input comprises features extracted from the text. Here, these features are to be considered the identified n-grams and parts of speech used as input for the models)
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to perform the method of Abhishek’s disclosure using input data comprising features extracted from text.
Regarding claim 15, in light of Wang, Abhishek teaches;
The method of claim 14, wherein the text comprises text
Abhishek does not explicitly teach;
derived from human speech.
However, Abhishek also discloses;
([0072] “As stated above, deep-learning architectures, such as deep neural networks, deep belief networks, deep reinforcement learning, recurrent neural networks and convolutional neural networks, have been applied to fields including computer vision, speech recognition, natural language processing, machine translation, bioinformatics, drug design, medical image analysis, material inspection and board game programs, where they have produced results comparable to and in some cases surpassing human expert performance.”
This excerpt explains that deep learning (the general concept presented in Abhishek’s disclosure) has been applied to all of these fields, including speech recognition.)
It is also mentioned that the results produced by deep-learning architectures in these applications have been comparable to or even surpassing human expert performance, which indicates that the usage of this technology in speech recognition applications has a reasonable expectation of success.
Additionally, Abhishek’s disclosure further relates to using noisily labeled data to train the system to be more robust to mistakes or errors in the data. When applied in the context of speech data, it would have been obvious to use this same system to improve robustness to the mistakes and errors characteristic of speech recognition.
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to perform the method of Abhishek’s disclosure using text input data derived from human speech.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Matthew Alan Cady whose telephone number is (571) 272-7229. The examiner can normally be reached Monday - Friday, 7:30 am - 5:00 pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Cesar Paula can be reached on (571)272-4128. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC)
at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MATTHEW ALAN CADY/ Examiner, Art Unit 2145
/CESAR B PAULA/ Supervisory Patent Examiner, Art Unit 2145