DETAILED ACTION
This Office action is responsive to the above-identified application filed 6/7/2023. The application contains claims 1-20, all of which have been examined and are rejected.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Information Disclosure Statement
The Information Disclosure Statements with references submitted 6/7/2023 and 3/28/2024 have been considered and entered into the file.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 8-9 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Claim 8 recites the limitation "The system as claimed in claim 6, when dependent on claim 4". However, claim 6 depends on claim 1 and not claim 4. Dependent claim 9 inherits the deficiency of claim 8, from which it depends. The claim language is unclear and cannot define the metes and bounds of the claim.
Claim 14 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Claim 14 recites the limitation "The system as claimed in claim 13, when dependent on claim 7". However, claim 13 depends on claim 12 and not claim 7. The claim language is unclear and cannot define the metes and bounds of the claim.
Claims 17-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 17-19 recite the limitation "the method as claimed in claim 11". There is insufficient antecedent basis for this limitation in the claims. For examination purposes, the examiner will interpret the claims as reciting "the method as claimed in claim 16".
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 16 and 20 are rejected under 35 U.S.C. 102(a)(1) and 35 U.S.C. 102(a)(2) as being anticipated by Malik et al. [US 2021/0117780 A1, hereinafter D2].
With regard to Claim 16,
D2 teaches a computer-implemented method of training task-independent personal machine learning, ML, models for a user, the method comprising:
obtaining a training dataset specific to the user (¶117, “Each of the client systems 130a, 130b, 130c, and 130d may have stored in a local data store a respective plurality of examples 530a, 530b, 530c and 530d”, “ Client system 130a may then retrieve the plurality of examples 530 a from the local data store“);
obtaining, from a task-independent shared ML model, a set of shared features (¶117, “global neural network model 820 having a plurality of federated model parameters stored on a server 510”, “Client system 130a may then receive a current version of the global neural network model 820a from server 510”, ¶119, “federated parameters that were shared across all client systems …”); and
training, using the training dataset and set of shared features, the task-independent personal ML model to learn a set of personal features specific to the user (¶109, “ the model parameters … of the machine-learning model may include both a global neural network model and a private local personalization model”, ¶117, “ Client system 130 a may then train the received global neural network model 820 a together with the local personalization model 830 a on the pluralities of examples 530 a to generate a plurality of updated federated model parameters and a plurality of updated local model parameters. Client system 130 a may then store in the local data store the trained local personalization model 830 a including the updated local model parameters”, ¶108, “jointly training private local model parameters and federated parameters of a machine-learning model locally on the client systems, storing updated local parameters on the client systems ..”).
With regard to Claim 20,
Claim 20 is similar in scope to claim 16; therefore, it is rejected under a similar rationale. D2 further teaches a non-transitory computer-readable medium storing computer-readable program code or instructions which are executable by a processor to perform the method (see at least Fig. 16, ¶¶156-157, “processor 1602 includes hardware for executing instructions, such as those making up a computer program. As an example and not by way of limitation, to execute instructions, processor 1602 may retrieve (or fetch) the instructions from an internal register, an internal cache, memory 1604, or storage”, ¶163, “computer-readable non-transitory storage medium or media”).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-9 and 11-14 are rejected under 35 U.S.C. 103 as being unpatentable over Chakraborty et al. [US 2023/0419339 A1, hereinafter D1] in view of Malik et al. [US 2021/0117780 A1, hereinafter D2], further in view of Kozhaya et al. [US 2021/0266225 A1, hereinafter D3].
With regard to Claim 1,
D1 teaches a system for providing personal machine learning, ML, models for users, the system comprising:
a server (¶32, “computing environment 100 includes a service system 101, which can include a hub system 110 and one or more edge systems”, ¶33, “ hub system 110 is a network server (e.g. a hub server)”), comprising a task-independent shared ML model (¶3, “user representation model may be … a task agnostic learning model that generates the user representation in the form of a task agnostic embedding”, ¶50, “ Unlike the task specific model 205 and the multitask model 205, the task agnostic embedding is “task agnostic” because it does not learn dependency relationships between events within the user event sequence data 115 and downstream tasks”, ¶32, “ hub system 110 includes … a representation generator subsystem 111, which applies a user representation model 112 …”, ¶38, “user representation model 112 is configured to generate a user representation 117 in the form of a task agnostic embedding”);
a user platform device (¶32, “computing environment 100 includes a service system 101, which can include a hub system 110 and one or more edge systems”, ¶37, “edge system 130”, ¶40, “edge system 130 executes an edge processing system to provide one or more machine learning based services 133”), comprising a task-independent personal ML model for a user (taught by D2, US 2021/0117780 A1, as set forth below); and
a user device (¶33, “ Edge clients 140 may be client computing devices, such as personal computers, mobile devices, tablets, etc”, ¶40, “edge client 140 (e.g. a user computing device or application”, ¶40, “ edge client 140 may submit requests …”, ¶68, “FIG. 6 depicts an example illustration 600 of performing, by an edge client 140, machine learning based services”).
D1 discloses that the edge client performs ML-based services (¶40). However, D1 does not explicitly teach a task-independent personal ML model for a user.
D2 teaches a system for providing personal machine learning, ML, models for users, the system comprising:
a server, comprising a task-independent shared ML model (Fig. 8, 510, ¶117, “The neural network model 810 may include a global neural network model 820 having a plurality of federated model parameters stored on a server 510”).
a user platform device, comprising a task-independent personal ML model for a user (Fig. 8, ¶117, “The neural network model 810 may include a global neural network model 820 having a plurality of federated model parameters stored on a server 510 and a local personalization model 830 having a plurality of local model parameters stored on each of the plurality of client systems 130”, “Client system 130a may then train the received global neural network model 820a together with the local personalization model 830a”, “Client system 130a may then store in the local data store the trained local personalization model 830a including the updated local model parameters.”, ¶119, “User embeddings were considered private parameters and were jointly trained with the federated parameters, but kept privately on the client systems”).
D1 and D2 are analogous art to the claimed invention because they are from a similar field of endeavor: machine learning systems for generating and using personalized user models or user representations based on a user's data. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1 to incorporate the teachings of D2, with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify D1 as described above to incorporate local personalization logic into the edge system for improved personalization (D2, ¶¶4-6, “assistant system may create and store a user profile comprising both personal and contextual information associated with the user. In particular embodiments, the assistant system may analyze the user input using natural-language understanding. The analysis may be based on the user profile of the user for more personalized and context-aware understanding”).
The combination of D1 and D2 does not explicitly teach a user device comprising a task-specific personal ML model for the user.
D3 teaches a user device comprising a task-specific personal ML model for the user (Fig. 2, ¶3, “deploying, by the edge server, on the respective ones of edge devices, machine learning models that are associated with the next set of the activities. Applications on the respective ones of the devices, which execute the next set of the activities, leverage the machine learning models that are associated with the next set of the activities”, ¶20, “Assume that a set of machine learning models have been trained and optimized for various activities or objectives”).
D1-D2 and D3 are analogous art to the claimed invention because they are from a similar field of endeavor: machine learning systems for generating and using personalized user models or user representations based on a user's data. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1-D2 to incorporate the teachings of D3, with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify D1-D2 as described above to determine which machine learning model needs to be deployed to meet demands of the users (D3, ¶2, “There is a burgeoning demand for solutions to deploying machine learning models to edge devices and training custom models on edge devices … furthermore, in determining which machine learning model needs to be deployed to meet demands of the users, the ingestion of personalized heuristics of a user is not taken into account”).
With regard to Claim 2,
D1-D2-D3 teach the system as claimed in claim 1, wherein the server comprises at least one processor coupled to memory (D1, ¶32, “ computing environment 100 includes a service system 101, which can include a hub system 110”, ¶33, “hub system 110 is a network server (e.g. a hub server) “, ¶70, D2, Fig. 16, ¶¶156-157, “ computer system 1600 includes a processor 1602, memory 1604, storage 1606, an input/output (I/O) interface 1608, a communication interface 1610, and a bus 1612”) for training the task-independent shared ML model using a first training dataset (D1, ¶32, “representation generator subsystem 111 also includes a task prediction model 113, which is used in training the user representation model 112 using a training data set 116 selected from the user event sequence data “, ¶¶55-56, D2, ¶117, “framework 800 may include a neural network model 810 and a set of training data distributed over the plurality of client systems 130 . The neural network model 810 may include a global neural network model 820 having a plurality of federated model parameters stored on a server 510”, “ Client system 130 a may then retrieve the plurality of examples 530 a from the local data store of client system 130 a. Client system 130 a may then train the received global neural network model 820 a … on the pluralities of examples 530 a … send … updated federated model parameters to server 510 … then generate an updated global neural network model 820 by aggregating …”). The same motivation to combine for claim 1 equally applies for current claim.
With regard to Claim 3,
D1-D2-D3 teach the system as claimed in claim 2, wherein the at least one processor of the server trains the task-independent shared ML model to learn a set of shared features (D1, ¶32, “The hub system 110 includes processing devices that execute a representation generator subsystem 111, which applies a user representation model 112 to the user event sequence data 115 to generate the user representation 117. The a representation generator subsystem 111 also includes a task prediction model 113, which is used in training the user representation model 112 using a training data set 116”, D2, ¶117, “global neural network model 820 having a plurality of federated model parameters stored on a server 510 …“, “ Server 510 may then generate an updated global neural network model 820 by aggregating the updated federated model parameters …“). The same motivation to combine for claim 2 equally applies for current claim.
With regard to Claim 4,
D1-D2-D3 teach the system as claimed in claim 2, wherein the task-independent shared ML model comprises a feature extractor to extract features from data in the first training dataset (D1, ¶46, “ includes an embedding layer that converts a one-hot representation of input events from the user event sequence data 115 into a fixed dimensional embedding”, D2, ¶119, “ text was encoded into an input representation vector using character-level embeddings and a BLSTM layer”) , and a classifier to classify the extracted features (D1, ¶47, “The output of the LSTM layers of the task specific learning model 205 is passed to the task prediction model 113”, “the task prediction model 113 is a 2-layered fully connected neural network”, “The fully connected layer acts as an inference engine”, D2, ¶119, “An MLP produced the prediction from the concatenation of the input representation and the user embedding”). The same motivation to combine for claim 2 equally applies for current claim.
With regard to Claim 5,
D1-D2-D3 teach the system as claimed in claim 1, wherein training the task-independent shared ML model comprises using any one of the following:
supervised learning (D1, ¶45, “ training data set 116 includes target labels to indicate this correspondence”, D2, ¶123, “each of the plurality of examples 530 comprises one or more features and one or more labels”), unsupervised learning, semi-supervised learning, and self-supervised learning. The same motivation to combine for claim 1 equally applies for current claim.
With regard to Claim 6,
D1-D2-D3 teach the system as claimed in claim 1, wherein the user platform device (D1, ¶32, “a hub system 110 and one or more edge systems 130”, D2, “servers 510 configured to communicate with a plurality of client systems 130”) comprises at least one processor coupled to memory (D1, ¶¶70-71, “the computer system 700 includes a processing device 702 communicatively coupled to one or more memory components 704”, ¶40, “edge system 130 executes an edge processing system”, D2, ¶117, “Client system 130a may then train the received global neural network model 820a together with the local personalization model 830a”, “Client system 130a may then store in the local data store the trained local personalization model 830a”) for training the task-independent personal ML model using a second training dataset (D2, ¶117, “The neural network model 810 may include a global neural network model 820 having a plurality of federated model parameters stored on a server 510 and a local personalization model 830 having a plurality of local model parameters stored on each of the plurality of client systems 130”, “Client system 130a may then store in the local data store the trained local personalization model 830a”, “Client system 130a may then train … the local personalization model 830a on the pluralities of examples 530a …”). The same motivation to combine set forth for claim 1 applies equally to the current claim.
With regard to Claim 7,
D1-D2-D3 teach the system as claimed in claim 6, wherein the at least one processor of the user platform device trains the task-independent personal ML model to learn a set of personal features specific to the user (D2, ¶108, “jointly training private local model parameters and federated parameters of a machine-learning model locally on the client systems, storing updated local parameters on the client systems”, ¶117, “The neural network model 810 may include … a local personalization model 830 having a plurality of local model parameters stored on each of the plurality of client systems 130”, “Client system 130 a may then retrieve the plurality of examples 530 a from the local data store of client system 130 a. Client system 130 a may then train the received global neural network model 820 a together with the local personalization model 830 a”). The same motivation to combine for claim 1 equally applies for current claim.
With regard to Claim 8,
D1-D2-D3 teach the system as claimed in claim 6, when dependent on claim 4, wherein training the task-independent personal ML model comprises using the shared features (D2, “¶117, “The neural network model 810 may include a global neural network model 820 … and a local personalization model 830 …”, “ Client system 130 a may then train the received global neural network model 820 a together with the local personalization model 830 a on the pluralities of examples 530 …”). The same motivation to combine for claim 6 equally applies for current claim.
With regard to Claim 9,
D1-D2-D3 teach the system as claimed in claim 8, wherein the task-independent personal ML model (D2, Fig. 8, ¶117, “The neural network model 810 may include a global neural network model 820 having a plurality of federated model parameters stored on a server 510 and a local personalization model 830 having a plurality of local model parameters stored on each of the plurality of client systems 130”, “Client system 130 a may then train the received global neural network model 820 a together with the local personalization model 830 a“, “Client system 130 a may then store in the local data store the trained local personalization model 830 a including the updated local model parameters.”, ¶119, “User embeddings were considered private parameters and were jointly trained with the federated parameters, but kept privately on the client systems”) comprises using an encoder to encode features of data in the second training dataset and the shared features, and decoder to decode the encoded features (D1, ¶50, “the task agnostic model 215 generates the task agnostic embedding 203 using an autoencoder model. In certain embodiments, the task agnostic learning model 215 uses a single layered LSTM as the architecture for the encoder and decoder”, “user event sequence is input to the encoder LSTM”, “task agnostic embedding 203 is input to the decoder LSTM”). The same motivation to combine for claim 1 equally applies for current claim.
With regard to Claim 11,
D1-D2-D3 teach the system as claimed in claim 6, wherein the second training dataset comprises labelled (D1, ¶45, “training data set 116 includes target labels to indicate this correspondence”, D2, ¶123, “each of the plurality of examples 530 comprises one or more features and one or more labels”) and unlabelled data items (D1, ¶50, “the task agnostic model 215 generates the task agnostic embedding 203 using an autoencoder model”, “user event sequence is input to the encoder LSTM”, “The task agnostic embedding 203 is input to the decoder LSTM”; autoencoder training reconstructs input sequences and does not require labels), and training the task-independent personal ML model comprises using any one of the following:
supervised learning, unsupervised learning (D1, ¶50, “using an autoencoder model. In certain embodiments, the task agnostic learning model 215 uses a single layered LSTM as the architecture for the encoder and decoder and the user event sequence is input to the encoder LSTM, which generates the task agnostic embedding 203. The task agnostic embedding 203 is input to the decoder LSTM to generate back the user behavior sequence. The task agnostic learning model 215 determines a task agnostic loss 213, for training the autoencoder model”), semi- supervised learning, and self-supervised learning. The same motivation to combine for claim 6 equally applies for current claim.
With regard to Claim 12,
D1-D2-D3 teach the system as claimed in claim 1, wherein the user device comprises at least one processor coupled to memory (D3, Fig. 4, ¶38) for training the task-specific personal ML model using a third training dataset (D3, ¶2, “deploying machine learning models to edge devices and training custom models on edge devices”, ¶3, “deploying, by the edge server, on the respective ones of edge devices, machine learning models that are associated with the next set of the activities. Applications on the respective ones of the devices, which execute the next set of the activities, leverage the machine learning models that are associated with the next set of the activities”, ¶20, “Assume that a set of machine learning models have been trained and optimized for various activities or objectives”). The same motivation to combine for claim 1 equally applies for current claim.
With regard to Claim 13,
D1-D2-D3 teach the system as claimed in claim 12, wherein the at least one processor of the user device trains the task-specific personal ML model to learn a set of task-specific personal features specific to the user (D3, ¶2, “deploying machine learning models to edge devices and training custom models on edge devices”, ¶14, “optimizes machine learning model deployment at granularity of user specific edge device properties … This advantage allows for creating machine learning models that are unique and specific to the user specific edge device”, “preserves end user privacy by avoiding transferring any user specific data to the cloud and rather only communication response metadata while keeping the user data safe on edge devices”, ¶35, “For example, a model trained with data of mid-life device may perform better on devices which have been in operation for a while, while a model trained on data of a relatively new device performs better for devices recently deployed. The model trained on data from mid-life devices may be better tuned and handle parameters such as noise due to aging devices”. D3 teaches that ML models are trained and customized at the edge-device level, and that such models are tuned using data specific to that device; for example, ¶2 explicitly states “training custom models on edge devices”, and ¶35 demonstrates that the model learns parameters and characteristics derived from the specific operational data of the device.
In addition, ¶14 states that the system enables creation of ML models “that are unique and specific to the user specific edge device”. See also D2, ¶108, “jointly training private local model parameters and federated parameters of a machine-learning model locally on the client systems, storing updated local parameters on the client systems”, ¶117, “Client system 130a may then train the received global neural network model 820a together with the local personalization model 830a on the pluralities of examples 530 …”, “User embeddings were considered private parameters and were jointly trained with the federated parameters, but kept privately on the client systems. Even though user embeddings were trained independently on each client system”). The same motivation to combine set forth for claim 12 applies equally to the current claim.
With regard to Claim 14,
D1-D2-D3 teach the system as claimed in claim 13, when dependent on claim 7, wherein training the task-specific personal ML model comprises using the personal features (D2, “¶117, “ Client system 130 a may then train the received global neural network model 820 a together with the local personalization model 830 a on the pluralities of examples 530 …”, “User embeddings were considered private parameters and were jointly trained with the federated parameters, but kept privately on the client systems. Even though user embeddings were trained independently on each client system”, ¶108, “jointly training private local model parameters and federated parameters of a machine-learning model locally on the client systems, storing updated local parameters on the client systems ..”, D3, ¶2, “training custom models on edge devices”, ¶14, “optimizes machine learning model deployment at granularity of user specific edge device properties … This advantage allows for creating machine learning models that are unique and specific to the user specific edge device”, “preserves end user privacy by avoiding transferring any user specific data to the cloud and rather only communication response metadata while keeping the user data safe on edge devices”). The same motivation to combine for claim 13 equally applies for current claim.
Claims 10 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Chakraborty et al. [US 2023/0419339 A1, hereinafter D1] in view of Malik et al. [US 2021/0117780 A1, hereinafter D2], further in view of Kozhaya et al. [US 2021/0266225 A1, hereinafter D3], further in view of Gheorghita et al. [US 2022/0093270 A1, hereinafter Gheorghita].
With regard to Claim 10,
D1-D2-D3 teach the system as claimed in claim 6, wherein the second training dataset comprises labelled data items (D2, ¶123, “each of the plurality of examples 530 comprises one or more features and one or more labels”), and training the task-independent personal ML model (D2, “¶117, “The neural network model 810 may include a global neural network model 820 … and a local personalization model 830 …”, “ Client system 130 a may then train the received global neural network model 820 a together with the local personalization model 830 a on the pluralities of examples 530 …”).
The combination of D1, D2, and D3 does not explicitly teach using zero-shot or few-shot learning.
Gheorghita discloses using zero-shot or few-shot learning (¶5, “The trained model is then adapted using few-shot learning to predict …”, ¶6, “machine-learned model having been trained for classification with few-shot learning”).
D1-D2-D3 and Gheorghita are analogous art to the claimed invention because they are from a similar field of endeavor: training machine learning models. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1-D2-D3 to incorporate the teachings of Gheorghita, with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify D1-D2-D3 as described above to allow training of machine learning models using a fewer number of samples of training data (Gheorghita ¶9, “The few-shot learning allows for a fewer number of samples of training data”). This is simply combining prior art elements according to known methods to yield predictable results; use of known technique to improve similar devices (methods, or products) in the same way; and applying a known technique to a known device (method, or product) ready for improvement to yield predictable results (MPEP 2143).
With regard to Claim 15,
D1-D2-D3 teach the system as claimed in claim 12, wherein the third training dataset comprises labelled data items (D1, ¶45, “training data set 116 includes target labels to indicate this correspondence”, D2, ¶123, “each of the plurality of examples 530 comprises one or more features and one or more labels”), and training the task-specific personal ML model (D3, ¶2, “deploying machine learning models to edge devices and training custom models on edge devices”, ¶3).
The combination of D1, D2, and D3 does not explicitly teach using zero-shot or few-shot learning.
Gheorghita discloses using zero-shot or few-shot learning (¶5, “The trained model is then adapted using few-shot learning to predict …”, ¶6, “machine-learned model having been trained for classification with few-shot learning”).
D1-D2-D3 and Gheorghita are analogous art to the claimed invention because they are from a similar field of endeavor of training machine learning models. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D1-D2-D3 to incorporate the teachings of Gheorghita with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify D1-D2-D3 as described above to allow training of machine learning models using a fewer number of samples of training data (Gheorghita ¶9, “The few-shot learning allows for a fewer number of samples of training data”). This is simply combining prior art elements according to known methods to yield predictable results; use of known technique to improve similar devices (methods, or products) in the same way; and applying a known technique to a known device (method, or product) ready for improvement to yield predictable results (MPEP 2143).
Claims 17 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Malik et al. [US 2021/0117780 A1, hereinafter D2] in view of Chakraborty et al. [US 2023/0419339 A1, hereinafter D1].
With regard to Claim 17,
D2 teaches the method as claimed in claim 11.
D2 does not teach using an encoder to encode features of data in the training dataset and the shared features, and using a decoder to decode the encoded features.
D1 teaches using an encoder to encode features of data in the training dataset and the shared features, and using a decoder to decode the encoded features (D1, ¶50, “the task agnostic model 215 generates the task agnostic embedding 203 using an autoencoder model. In certain embodiments, the task agnostic learning model 215 uses a single layered LSTM as the architecture for the encoder and decoder”, “user event sequence is input to the encoder LSTM”, “task agnostic embedding 203 is input to the decoder LSTM”).
D2 and D1 are analogous art to the claimed invention because they are from a similar field of endeavor of providing machine learning systems for generating and using personalized user models or user representations based on a user’s data. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D2 to incorporate the teachings of D1 with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify D2 as described above because the encoder-decoder architecture offers higher flexibility: the model can be adapted to different tasks by simply changing the input and output formats; the encoder captures the context of the input data, allowing the decoder to generate outputs that are contextually appropriate; and encoder-decoder models have been shown to outperform traditional models in tasks requiring sequence-to-sequence learning, such as natural language processing and machine translation. These benefits make the encoder-decoder architecture a powerful tool in deep learning, enabling the development of advanced models capable of handling complex data and generating meaningful outputs. This is simply combining prior art elements according to known methods to yield predictable results; use of a known technique to improve similar devices (methods, or products) in the same way; and applying a known technique to a known device (method, or product) ready for improvement to yield predictable results (MPEP 2143).
With regard to Claim 19,
D2 teaches the method as claimed in claim 11, wherein the training dataset comprises labelled data items, and wherein the training comprises using any one of the following: supervised learning, unsupervised learning, semi-supervised learning, and self-supervised learning.
D2 does not teach unlabelled data items.
D1 teaches unlabelled data items (D1, ¶50, “the task agnostic model 215 generates the task agnostic embedding 203 using an autoencoder model”, “user event sequence is input to the encoder LSTM”, “The task agnostic embedding 203 is input to the decoder LSTM”; autoencoder training reconstructs input sequences and does not require labels).
D2 and D1 are analogous art to the claimed invention because they are from a similar field of endeavor of providing machine learning systems for generating and using personalized user models or user representations based on a user’s data. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D2 to incorporate the teachings of D1 with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify D2 as described above to incorporate the unsupervised representation learning technique of D1 into the federated learning personalization system of D2 to improve personalization performance when labelled examples are limited or unavailable. Both references learn user representations from user-specific data on distributed client systems. This is simply combining prior art elements according to known methods to yield predictable results; use of a known technique to improve similar devices (methods, or products) in the same way; and applying a known technique to a known device (method, or product) ready for improvement to yield predictable results (MPEP 2143).
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Malik et al. [US 2021/0117780 A1, hereinafter D2] in view of Gheorghita et al. [US 2022/0093270 A1, hereinafter Gheorghita].
With regard to Claim 18,
D2 teaches the method as claimed in claim 11, wherein the training dataset comprises labelled data items (D2, ¶123, “each of the plurality of examples 530 comprises one or more features and one or more labels”).
D2 does not explicitly teach using zero-shot or few-shot learning.
Gheorghita discloses that the training comprises using zero-shot or few-shot learning (¶5, “The trained model is then adapted using few-shot learning to predict …”, ¶6, “machine-learned model having been trained for classification with few-shot learning”).
D2 and Gheorghita are analogous art to the claimed invention because they are from a similar field of endeavor of training machine learning models. Thus, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify D2 to incorporate the teachings of Gheorghita with a reasonable expectation of success.
One of ordinary skill in the art would be motivated to modify D2 as described above to allow training of machine learning models using a fewer number of samples of training data (Gheorghita ¶9, “The few-shot learning allows for a fewer number of samples of training data”). This is simply combining prior art elements according to known methods to yield predictable results; use of a known technique to improve similar devices (methods, or products) in the same way; and applying a known technique to a known device (method, or product) ready for improvement to yield predictable results (MPEP 2143).
Conclusion
The prior art made of record and not relied upon is considered pertinent to the applicant’s disclosure.
US Patent Application Publication No. 2022/0300804, filed by Guan et al., discloses training using few-shot learning. See at least ¶5, ¶19, ¶¶35-36.
Examiner has pointed out particular references contained in the prior art of record in the body of this action for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claims, other passages and figures may apply as well. It is respectfully requested that the applicant, in preparing the response, fully consider the entire references as potentially teaching all or part of the claimed invention, as well as the context of each passage as taught by the prior art or disclosed by the examiner. It is noted that any citation to specific pages, columns, figures, or lines in the prior art references, and any interpretation of the references, should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331-33, 216 USPQ 1038-39 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMED ABOU EL SEOUD whose telephone number is (303)297-4285. The examiner can normally be reached Monday-Thursday 9:00am-6:00pm MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michelle Bechtold can be reached at (571) 431-0762. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MOHAMED ABOU EL SEOUD/Primary Examiner, Art Unit 2148