DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-2, 6, 8-10, 14-15, 17-18 and 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite a mental process of observation, judgment, and evaluation. This judicial exception is not integrated into a practical application, nor do the claims include additional elements that are sufficient to amount to significantly more than the judicial exception, because the additional elements are mere extra-solution activity in combination with generic computer hardware used to execute the abstract idea. See the analysis below for further details and explanation.
Claims 1, 10, and 18
Step 1: Claims 1-20 recite a method, a system, and a processor; therefore, they fall into the statutory categories of a process and a machine.
Step 2A Prong 1: The claim recites, inter alia:
Determining an output associated with the input data (claim 1 and claim 18). (This is a mental process of observation, judgment, and evaluation wherein a user considers an input and determines a corresponding output.)
Step 2A Prong 2:
This judicial exception is not integrated into a practical application. Aside from the limitations above, the claim recites:
obtaining a base model associated with one or more machine learning models; obtaining a domain specific part associated with the one or more machine learning models; receive input data associated with a domain (claim 10); (These limitations amount to the data collection of receiving data, which is insignificant extra-solution activity, see MPEP 2106.05(g).)
Performing one or more operations using the output (claim 10). (This amounts to adding the words “apply it” (or an equivalent) to the judicial exception, mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea; see MPEP 2106.05(f).)
based on the base model and domain specific part, processing input data; one or more processing units (claims 10 and 18); processing the input data using one or more machine learning models to generate an output, the one or more machine learning models including one or more first layers associated with a base model and one or more second layers associated with the domain (claim 10); and using one or more machine learning models and based at least on input data associated with a first domain, an output associated with the input data, wherein the one or more machine learning models include one or more first layers associated with the first domain activated and one or more second layers associated with a second domain deactivated (claim 18). (These limitations amount to using generic computer hardware, as the machine learning models are recited at a high level of generality and used to execute the abstract idea, see MPEP 2106.05(f).)
The additional elements as disclosed above, alone or in combination, do not integrate the judicial exception into a practical application as they are mere insignificant extra-solution activity in combination with generic computer functions that are implemented to perform the disclosed abstract idea above.
Step 2B:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional elements of “obtaining a base model associated with one or more machine learning models; obtaining a domain specific part associated with the one or more machine learning models; receive input data associated with a domain (claim 10);” all amount to data collection, which is well-understood, routine, and conventional. See MPEP 2106.05(d)(ii), which states: “The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity. i. Receiving or transmitting data over a network, e.g., using the Internet to gather data”. The remaining additional elements of “based on the base model and domain specific part, processing input data; one or more processing units (claims 10 and 18); processing the input data using one or more machine learning models to generate an output, the one or more machine learning models including one or more first layers associated with a base model and one or more second layers associated with the domain (claim 10); and using one or more machine learning models and based at least on input data associated with a first domain, an output associated with the input data, wherein the one or more machine learning models include one or more first layers associated with the first domain activated and one or more second layers associated with a second domain deactivated (claim 18)” all amount to the training and use of machine learning models as a tool to apply an abstract idea, see MPEP 2106.05(f).
The additional elements as disclosed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are well-understood, routine, and conventional activity in combination with generic computer functions that are implemented to perform the disclosed abstract idea above.
Claim 2
Step 2A Prong 1:
determining the output associated with the input data without using the second domain specific part. (This is a mental process of observation, judgment, and evaluation wherein a user considers an input and determines a corresponding output.)
Step 2A Prong 2:
This judicial exception is not integrated into a practical application. Aside from the limitations above, the claim recites:
obtaining a second domain specific part associated with the one or more machine learning models, wherein the determining the output associated. (This limitation amounts to the data collection of receiving data, which is extra-solution activity, see MPEP 2106.05(g).)
The additional elements as disclosed above, alone or in combination, do not integrate the judicial exception into a practical application as they are mere insignificant extra-solution activity in combination with generic computer functions that are implemented to perform the disclosed abstract idea above.
Step 2B:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of “obtaining a second domain specific part associated with the one or more machine learning models, wherein the determining the output associated.” amounts to data collection, which is well-understood, routine, and conventional. See MPEP 2106.05(d)(ii), which states: “The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity. i. Receiving or transmitting data over a network, e.g., using the Internet to gather data”.
The additional elements as disclosed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are well-understood, routine, and conventional activity in combination with generic computer functions that are implemented to perform the disclosed abstract idea above.
Claim 6
Step 2A Prong 1:
No additional mental process is claimed, but claim 6 inherits the mental process of claim 1.
Step 2A Prong 2:
This judicial exception is not integrated into a practical application. Aside from the limitations above, the claim recites:
updating, using first training data associated with one or more general domains, one or more first parameters of one or more first layers associated with the base model; and updating, using second training data associated with a specific domain, one or more second parameters of one or more second layers associated with the domain specific part. (This is training recited at a high level of generality, wherein a base model is trained using general data and a fine-tuned model is trained using domain-specific data; because it is recited at a high level of generality, it amounts to using the machine learning models as a tool to execute the abstract idea, see MPEP 2106.05(f).)
The additional elements as disclosed above, alone or in combination, do not integrate the judicial exception into a practical application as they are merely generic computer functions that are implemented to perform the disclosed abstract idea above.
Step 2B:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional elements of “updating, using first training data associated with one or more general domains, one or more first parameters of one or more first layers associated with the base model; and updating, using second training data associated with a specific domain, one or more second parameters of one or more second layers associated with the domain specific part.” are training recited at a high level of generality, wherein a base model is trained using general data and a fine-tuned model is trained using domain-specific data; because it is recited at a high level of generality, it amounts to using the machine learning models as a tool to execute the abstract idea, see MPEP 2106.05(f).
The additional elements as disclosed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are generic computer functions that are implemented to perform the disclosed abstract idea above.
Claims 8 and 15
Step 2A Prong 1:
Determining a second output associated with the input data; and determining, based at least on the output and the second output, a third output associated with the input data. (This is a mental process of observation, judgment, and evaluation wherein a user considers an input and determines a corresponding output.)
Step 2A Prong 2:
This judicial exception is not integrated into a practical application. Aside from the limitations above, the claim recites:
Using one or more second machine learning models. (This is claimed at a high level of generality and results in using the machine learning models as generic computer hardware to execute the abstract idea, see MPEP 2106.05(f).)
The additional elements as disclosed above, alone or in combination, do not integrate the judicial exception into a practical application as they are merely generic computer functions that are implemented to perform the disclosed abstract idea above.
Step 2B:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of “Using one or more second machine learning models.” is recited at a high level of generality and results in using the machine learning models as a tool to execute the abstract idea, see MPEP 2106.05(f).
The additional elements as disclosed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are generic computer functions that are implemented to perform the disclosed abstract idea above.
Claim 9
Step 2A Prong 1:
determining the third output associated with the input data comprises: determining, based at least on the output and the second output, the third output associated with the input data and a fourth output associated with the input data. (This is a mental process of observation, judgment, and evaluation wherein a user considers an input and determines a corresponding output.)
determining a first score associated with the third output and a second score associated with the fourth output; (This is a mental process of a user determining a score for each result.)
determining the third output based at least on the first score being greater than the second score. (This is a mental process of a user determining a third output based on the comparison of two scores.)
Step 2A Prong 2:
This judicial exception is not integrated into a practical application. Aside from the limitations above, the claim recites:
Using one or more third machine learning models. (This is claimed at a high level of generality and results in using the machine learning models as generic computer hardware to execute the abstract idea, see MPEP 2106.05(f).)
The additional elements as disclosed above, alone or in combination, do not integrate the judicial exception into a practical application as they are merely generic computer functions that are implemented to perform the disclosed abstract idea above.
Step 2B:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of “Using one or more third machine learning models.” is recited at a high level of generality and results in using the machine learning models as a tool to execute the abstract idea, see MPEP 2106.05(f).
The additional elements as disclosed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are generic computer functions that are implemented to perform the disclosed abstract idea above.
Claim 14
Step 2A Prong 1:
determining a second output associated with the second input data. (This is a mental process of observation, judgment, and evaluation wherein a user considers an input and determines a corresponding output.)
Step 2A Prong 2:
This judicial exception is not integrated into a practical application. Aside from the limitations above, the claim recites:
wherein the input data is input into the one or more machine learning models at a first time (this amounts to using a machine learning model); and wherein the one or more processing units are further to: input the second input data into the one or more machine learning models at a second time; and using the one or more machine learning models. (All of these claim elements amount to using generic computer hardware to execute the abstract idea, see MPEP 2106.05(f).)
receive second input data associated with a second domain; (This amounts to data collection and is extra-solution activity, see MPEP 2106.05(g).)
one or more machine learning models including the one or more first layers associated with the base model and one or more third layers associated with the second domain at the second time; (This amounts to linking the abstract idea to a technological field, see MPEP 2106.05(h).)
The additional elements as disclosed above, alone or in combination, do not integrate the judicial exception into a practical application as they are merely generic computer functions in combination with extra-solution activity that are implemented to perform the disclosed abstract idea above.
Step 2B:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of “receive second input data associated with a second domain;” is well-understood, routine, and conventional; see MPEP 2106.05(d)(ii), which discloses that data collection and transmitting data are well-understood, routine, and conventional. The additional elements of “wherein the input data is input into the one or more machine learning models at a first time; and wherein the one or more processing units are further to: input the second input data into the one or more machine learning models at a second time; and using the one or more machine learning models” are all recited at a high level of generality and result in using generic computer hardware to execute the abstract idea, see MPEP 2106.05(f). The limitation of “one or more machine learning models including the one or more first layers associated with the base model and one or more third layers associated with the second domain at the second time;” amounts to linking the abstract idea to a technological field, see MPEP 2106.05(h).
The additional elements as disclosed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they are generic computer functions in combination with extra-solution activity that are implemented to perform the disclosed abstract idea above.
Claims 17 and 20
Step 2A Prong 1:
No additional mental process is claimed; claims 17 and 20 inherit the mental processes of claims 10 and 18, respectively.
Step 2A Prong 2:
This judicial exception is not integrated into a practical application. Aside from the limitations above, the claim recites:
wherein the system is comprised in at least one of: an infotainment system for an autonomous or semi-autonomous machine; an entertainment system for an autonomous or semi-autonomous machine; a system for performing simulation operations; a system for hosting real-time streaming applications; a system for generating content for one or more of virtual reality (VR), augmented reality (AR), or mixed reality (MR); a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations; a system implemented using an edge device; a system implemented using a robot; a system for performing conversational AI operations; a system for generating synthetic data; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources. (All of these amount to linking the abstract idea to technological fields, see MPEP 2106.05(h).)
The additional elements as disclosed above, alone or in combination, do not integrate the judicial exception into a practical application, as they merely link the abstract idea to a technological field.
Step 2B:
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The additional element of “wherein the system is comprised in at least one of: an infotainment system for an autonomous or semi-autonomous machine; an entertainment system for an autonomous or semi-autonomous machine; a system for performing simulation operations; a system for hosting real-time streaming applications; a system for generating content for one or more of virtual reality (VR), augmented reality (AR), or mixed reality (MR); a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations; a system implemented using an edge device; a system implemented using a robot; a system for performing conversational AI operations; a system for generating synthetic data; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.” amounts to linking the abstract idea to a technological field, see MPEP 2106.05(h).
The additional elements as disclosed above, in combination with the abstract idea, are not sufficient to amount to significantly more than the judicial exception, as they merely link the abstract idea to a technological field.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-7, 10-14, and 16-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Houlsby et al. (“Parameter-Efficient Transfer Learning for NLP” – hereinafter Houlsby).
In regards to claim 1, Houlsby discloses a method comprising:
obtaining a base model associated with one or more machine learning models; (Houlsby's abstract teaches “fine-tuning large pre-trained models is an effective transfer mechanism in NLP.” This teaches obtaining a base model, as it is the pre-trained model that is fine-tuned. Also see section 3.1, first paragraph, which cites “We use the public, pre-trained BERT Transformer network as our base model.”)
obtaining a domain specific part associated with the one or more machine learning models; and (Houlsby's abstract cites “…we propose transfer with adapter modules. Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task,…”; page 2, left column, second paragraph cites “Adapters are new modules added between layers of a pre-trained network.”; and page 2, section 2, second paragraph cites “in adapter-tuning, the parameters of the original network are frozen and therefore may be shared by many tasks.” The adapter is the domain-specific part and is associated with the base model, as it is added to it.)
determining, based at least on the base model and the domain specific part processing input data, an output associated with the input data. (Houlsby page 3, section 2.1, second paragraph cites “The adapter is always applied directly to the output of the sub-layer, after the projection back to the input size, but before adding the skip connection back. The output of the adapter is then passed directly into the following layer normalization.” and figure 2 shows the data in the feed-forward layers passing through the frozen base model and through the adapter layer, so that the output is generated based on the input, the base model, and the domain-specific part.)
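For illustration, the adapter mechanism Houlsby describes above can be summarized in a minimal sketch, assuming a PyTorch-style implementation; the class, module, and dimension names below are illustrative assumptions, not taken from Houlsby's code:

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter per Houlsby: down-projection, nonlinearity,
    up-projection back to the input size, plus an internal skip connection."""
    def __init__(self, hidden_size: int, bottleneck_size: int):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The output keeps the sub-layer's dimension, so the adapter can be
        # inserted between existing (frozen) layers of the pre-trained network.
        return x + self.up(self.act(self.down(x)))
```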
In regards to claim 2, Houlsby discloses the method of claim 1, further comprising: obtaining a second domain specific part associated with the one or more machine learning models, wherein the determining the output associated with the input data is without using the second domain specific part in the processing of the input data. (Houlsby's abstract cites “Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task, and new tasks can be added without revisiting previous ones.” This teaches more than one task, each having its own adapter module, implying the existence of multiple domain-specific parts. Also, page 2, section 2, second paragraph cites “During training, only v are tuned…. Since w is fixed, the model can be extended to new tasks without affecting previous ones.” This further teaches the model being extended to multiple tasks, which means using a second domain-specific part (adapter module). Also, page 2, right column, second to last paragraph cites “The adapter modules may also be ignored if not required; in Section 3.6 we observe that some adapters have more influence on the network than others.” This teaches that while an adapter may be present, it can be ignored when not needed or not applicable. Also, figure 2 shows a skip connection with two adapters, wherein either adapter can be skipped; in this case, the first adapter is used along with the base model to determine the output while the second adapter is skipped.)
In regards to claim 3, Houlsby discloses the method of claim 1, further comprising: determining that the input data is associated with a specific domain; (Houlsby's introduction section cites “In this paper we address the online setting, where tasks arrive in a stream. The goal is to build a system that performs well on all of them, but without training an entire new model for every new task.” This teaches that tasks arrive in a stream, meaning the input data arrives in a stream and the tasks are known. Further, Houlsby section 2, second paragraph teaches the new adapter layer is trained on the downstream task while the base model is frozen; hence the downstream data is associated with the domain-specific part.) and based at least on the input data being associated with the specific domain, causing the domain specific part to be coupled to the base model, wherein the determining the output associated with the input data occurs while the domain specific part is coupled to the base model. (Houlsby page 2, left column, second paragraph teaches adapters are new modules added between layers and fine-tuned on the new task, thus causing the domain-specific part to be coupled to the base model. Also, figure 2 and section 2.1 teach the adapter is always applied directly to the output of the sub-layers, and is thus used in determining the output associated with the input using the coupled domain-specific part and base model.)
In regards to claim 4, Houlsby discloses the method of claim 3, wherein the determining that the input data is associated with the specific domain comprises at least one of: receiving, from a user device, an indication that the input data is associated with the specific domain; or analyzing the input data to determine that the input data is associated with the specific domain. (Houlsby page 2, section 3.1 cites “We use the public, pre-trained BERT Transformer network as our base model. To perform classification with BERT, we follow the approach in Devlin et al. (2018). The first token in each sequence is a special “classification token”. We attach a linear layer to the embedding of this token to predict the class label.” This would be an indication that the data is associated with a specific domain.)
In regards to claim 5, Houlsby discloses the method of claim 3, wherein the causing the domain specific part to be coupled to the base model comprises causing one or more first layers associated with the domain specific part to be coupled to one or more second layers associated with the base model. (Houlsby page 2, left column, second paragraph cites “Adapters are new modules added between layers of a pre-trained network.” and page 2, section 2, second paragraph cites “In particular, the adapter tuning strategy involves injecting new layers into the original network.” These citations disclose injecting (coupling) the adapter modules as new layers between the base model layers; thus the first layers are associated with the domain-specific part (adapter module) and the second layers with the base model.)
In regards to claim 6, Houlsby discloses the method of claim 1, further comprising: updating, using first training data associated with one or more general domains, one or more first parameters of one or more first layers associated with the base model; (Houlsby's abstract, section 1, first paragraph, and section 3.1 teach using a pre-trained BERT network as the base model, and the BERT network is pre-trained on a large text corpus (general domain). This teaches updating the parameters of the base model, as training is what determines those parameters.) and updating, using second training data associated with a specific domain, one or more second parameters of one or more second layers associated with the domain specific part. (Houlsby's abstract cites “Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task, and new tasks can be added without revisiting previous ones.” and page 2, section 2 teaches “in adapter-tuning, the parameters of the original network are frozen…”. Both of these teach that when the adapter layer is being trained, the base model layers are frozen. Also, the figure 2 caption teaches adapters are trained on downstream data.)
In regards to claim 7, Houlsby discloses the method of claim 6, further comprising: during the updating using the first training data, refraining from updating the one or more second parameters of the one or more second layers associated with the domain specific part; and during the updating using the second training data, refraining from updating the one or more first parameters of the one or more first layers associated with the base model. (Houlsby's abstract, section 1, first paragraph, and section 3.1 teach using a pre-trained BERT network as the base model, and the BERT network is pre-trained on a large text corpus (general domain), so the parameters of the base model are tuned without updating the second parameters of the second layers (adapter layer), as the adapter has not been added yet. Houlsby's abstract cites “Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task, and new tasks can be added without revisiting previous ones.” and page 2, section 2 teaches “in adapter-tuning, the parameters of the original network are frozen…”. Both of these teach that when the adapter layer is being trained, the base model layers are frozen. Also, the figure 2 caption teaches adapters are trained on downstream data.)
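The two-stage scheme mapped to claims 6 and 7 above (pre-train the base model, then tune only the adapter while the base is frozen) can be sketched as follows; this is an illustrative PyTorch-style sketch, and the stand-in modules and sizes are assumptions rather than Houlsby's actual configuration:

```python
import torch
import torch.nn as nn

# Stand-ins for the pre-trained base model and the domain-specific adapter.
base_model = nn.Sequential(nn.Linear(768, 768), nn.GELU(), nn.Linear(768, 768))
adapter = nn.Sequential(nn.Linear(768, 64), nn.GELU(), nn.Linear(64, 768))

# During adapter-tuning, refrain from updating the base model's first parameters:
for p in base_model.parameters():
    p.requires_grad = False

# Only the adapter's second parameters are passed to the optimizer and updated.
optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-4)
```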
In regards to claim 10, Houlsby discloses a system comprising: one or more processing units to: (Houlsby section 4 teaches using pre-trained language models, which includes computers with processors. Also see page 3, section 3.1, which teaches the system runs on 4 Google Cloud TPUs.) receive input data associated with a domain; (Houlsby section 2, second paragraph teaches the new adapter layer is trained on the downstream task while the base model is frozen; hence the downstream data is associated with the domain-specific part. Also, the figure 2 caption teaches adapters are trained on downstream data.) processing the input data using one or more machine learning models to generate an output, the one or more machine learning models including one or more first layers associated with a base model and one or more second layers associated with the domain; (Houlsby page 3, section 2.1, second paragraph cites “The adapter is always applied directly to the output of the sub-layer, after the projection back to the input size, but before adding the skip connection back. The output of the adapter is then passed directly into the following layer normalization.” and figure 2 shows the data in the feed-forward layers passing through the frozen base model and through the adapter layer, so that the output is generated based on the input, the base model, and the domain-specific part.) and perform one or more operations using the output. (Houlsby page 6, section 3.5 teaches using the system and output in a SQuAD question answering system, thus performing operations.)
In regards to claim 11, Houlsby discloses the system of claim 10, wherein the one or more machine learning models further include one or more third layers associated with a second domain, and wherein the one or more processing units are further to: cause the one or more second layers to be activated and the one or more third layers to be deactivated, wherein the output is generated when the one or more second layers are activated and the one or more third layers are deactivated. (Houlsby's abstract cites “Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task, and new tasks can be added without revisiting previous ones.” This teaches more than one task, each having its own adapter module, implying the existence of multiple domain-specific parts. Also, page 2, section 2, second paragraph cites “During training, only v are tuned…. Since w is fixed, the model can be extended to new tasks without affecting previous ones.” This further teaches the model being extended to multiple tasks, which means using a second domain-specific part (adapter module). Also, page 2, right column, second to last paragraph cites “The adapter modules may also be ignored if not required; in Section 3.6 we observe that some adapters have more influence on the network than others.” This teaches that while an adapter may be present, it can be ignored when not needed or not applicable. Also, figure 2 shows a skip connection with two adapters, wherein either adapter can be skipped; in this case, the first adapter is used along with the base model to determine the output while the second adapter is skipped, meaning one adapter is active and another is deactivated, and the data is processed to get the output.)
In regards to claim 12, Houlsby discloses the system of claim 11, wherein the one or more processing units are further to determine to activate the one or more second layers and deactivate the one or more third layers based at least on one or more of: receiving, from a user device, an indication to at least one of activate the one or more second layers or deactivate the one or more third layers; or analyzing the input data to determine that the input data is associated with the domain. (Houlsby page 2, section 3.1 cites “We use the public, pre-trained BERT Transformer network as our base model. To perform classification with BERT, we follow the approach in Devlin et al. (2018). The first token in each sequence is a special “classification token”. We attach a linear layer to the embedding of this token to predict the class label.” This would be an indication that the data is associated with a specific domain.)
In regards to claim 13, Houlsby discloses the system of claim 11, wherein the one or more second layers are caused to be activated and the one or more third layers are caused to be deactivated based at least on: a first memory component associated with the one or more second layers being connected to the one or more first layers associated with the base model; and a second memory component associated with the one or more third layers being disconnected from the one or more first layers associated with the base model. (Houlsby's abstract cites “…we propose transfer with adapter modules. Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task,…”; page 2, left column, second paragraph cites “Adapters are new modules added between layers of a pre-trained network.”; page 2, section 2, second paragraph cites “in adapter-tuning, the parameters of the original network are frozen and therefore may be shared by many tasks.”; and page 2, left column, third paragraph teaches adapters differ in that tasks do not interact and the shared parameters are frozen, meaning the model has perfect memory of previous tasks. This means that each task has its own adapter, and when one task is active the others are not; thus the other adapters and the base model are frozen, meaning any memory associated with them is also not active.)
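The activate/deactivate arrangement mapped to claims 11-13 above can be illustrated with a minimal sketch, assuming a PyTorch-style implementation with one adapter per domain; the class and parameter names are hypothetical:

```python
import torch
import torch.nn as nn

class MultiDomainBlock(nn.Module):
    """One frozen shared (base) layer plus one adapter per domain; only the
    adapter for the active domain runs, and the others are skipped entirely."""
    def __init__(self, hidden: int, domains: list):
        super().__init__()
        self.shared = nn.Linear(hidden, hidden)
        self.shared.requires_grad_(False)  # base-model layer stays frozen
        self.adapters = nn.ModuleDict({
            d: nn.Sequential(nn.Linear(hidden, 64), nn.GELU(), nn.Linear(64, hidden))
            for d in domains})

    def forward(self, x: torch.Tensor, active_domain: str) -> torch.Tensor:
        h = self.shared(x)
        # Deactivated adapters are never invoked, so only the active domain's
        # layers contribute to the output.
        return h + self.adapters[active_domain](h)

block = MultiDomainBlock(768, ["news", "medical"])
out = block(torch.randn(1, 768), active_domain="news")
```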
In regards to claim 14, it is the system embodiment of claim 6 with similar limitations and is thus rejected using the same reasoning found in claim 6.
In regards to claim 16, it is the system embodiment of claim 7 with similar limitations and is thus rejected using the same reasoning found in claim 7.
In regards to claim 17, Houlsby discloses the system of claim 10, wherein the system is comprised in at least one of:
an infotainment system for an autonomous or semi-autonomous machine;
an entertainment system for an autonomous or semi-autonomous machine;
a system for performing simulation operations;
a system for hosting real-time streaming applications;
a system for generating content for one or more of virtual reality (VR), augmented reality (AR), or mixed reality (MR);
a system for performing digital twin operations;
a system for performing light transport simulation;
a system for performing collaborative content creation for 3D assets;
a system for performing deep learning operations;
a system implemented using an edge device;
a system implemented using a robot;
a system for performing conversational AI operations; (Houlsby page 6, section 3.5 teaches a question and answering system, which is a form of conversational AI, as the system takes questions from users, processes them using natural language processing, and returns an answer.)
a system for generating synthetic data;
a system incorporating one or more virtual machines (VMs);
a system implemented at least partially in a data center; or
a system implemented at least partially using cloud computing resources.
In regards to claim 18, Houlsby discloses a processor comprising: one or more processing units to determine, (Houlsby section 4 teaches using pre-trained language models, which includes computers with processors. Also see page 3, section 3.1, which teaches the system runs on 4 Google Cloud TPUs.) using one or more machine learning models and based at least on input data associated with a first domain, an output associated with the input data, wherein the one or more machine learning models include one or more first layers associated with the first domain activated and one or more second layers associated with a second domain deactivated. (Houlsby's abstract cites “Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task, and new tasks can be added without revisiting previous ones.” This teaches more than one task, each having its own adapter module, implying the existence of multiple domain-specific parts. Also, page 2, section 2, second paragraph cites “During training, only v are tuned…. Since w is fixed, the model can be extended to new tasks without affecting previous ones.” This further teaches the model being extended to multiple tasks, which means using a second domain-specific part (adapter module). Also, page 2, right column, second to last paragraph cites “The adapter modules may also be ignored if not required; in Section 3.6 we observe that some adapters have more influence on the network than others.” This teaches that while an adapter may be present, it can be ignored when not needed or not applicable. Also, figure 2 shows a skip connection with two adapters, wherein either adapter can be skipped; in this case, the first adapter is used along with the base model to determine the output while the second adapter is skipped, meaning one adapter is active and another is deactivated, and the data is processed to get the output.)
In regards to claim 19, Houlsby discloses the processor of claim 18, wherein the one or more processing units are further to: activate the one or more first layers and deactivate the one or more second layers based at least on one or more of: receiving, from a user device, an indication to at least one of activate the one or more first layers or deactivate the one or more second layers; or analyzing the input data to determine that the input data is associated with the first domain. (Houlsby page 2, section 3.1 cites “We use the public, pre-trained BERT Transformer network as our base model. To perform classification with BERT, we follow the approach in Devlin et al. (2018). The first token in each sequence is a special “classification token”. We attach a linear layer to the embedding of this token to predict the class label.” This would be an indication that the data is associated with a specific domain or adapter layer.)
Claim 20 is the processor embodiment of system claim 17 with similar limitations and is thus rejected using the same reasoning found in claim 17.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 8-9 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Houlsby et al. (“Parameter-Efficient Transfer Learning for NLP” – hereinafter Houlsby) in view of Rebuffi et al. (“Learning Multiple Visual Domains with Residual Adapters” – hereinafter Rebuffi).
In regards to claim 8, Houlsby discloses the method of claim 1, but does not disclose the further limitations of: determining, using one or more second machine learning models, a second output associated with the input data; and determining, based at least on the output and the second output, a third output associated with the input data.
Rebuffi discloses determining, using one or more second machine learning models, a second output associated with the input data; and determining, based at least on the output and the second output, a third output associated with the input data. (Rebuffi page 2, second paragraph states “The layers in the resulting parametric network are either domain-agnostic, hence shared between domains, or domain-specific, hence parametric. The domain-specific layers are changed based on the ground-truth domain of the input image, or based on an estimate of the latter obtained from an auxiliary network. In the latter configuration, our architecture is analogous to the learnet of [2].” This teaches a parametric network (first model) and an auxiliary network (second model). Section 3, second paragraph, presents an equation [reproduced in the record as an image, media_image1.png] teaching a parametric network (ɸad), a classifier (ψd), and a result. The auxiliary network is disclosed in section 3.1, wherein it cites “As shown later, it often easy to construct an auxiliary network that predict d from x.” This teaches the second machine learning model taking input x and producing a second output, which is the domain. Then page 2, second paragraph states “The domain-specific layers are changed based on the ground-truth domain of the input image, or based on an estimate of the latter obtained from an auxiliary network. In the latter configuration, our architecture is analogous to the learnet of [2].” This teaches that the domain-parametric model is changed based on the output (the domain) of the auxiliary network; thus the third output is based on the first and second outputs of the first and second models.)
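The two-model arrangement Rebuffi describes (an auxiliary network estimating the input's domain, which then selects the domain-specific layers of the main parametric network) can be sketched as follows; this is an illustrative PyTorch-style sketch with assumed names and sizes, not Rebuffi's actual residual-adapter code:

```python
import torch
import torch.nn as nn

hidden, num_domains = 256, 3
aux_net = nn.Linear(hidden, num_domains)      # second model: estimates the domain
shared = nn.Linear(hidden, hidden)            # domain-agnostic (shared) layers
domain_layers = nn.ModuleList(                # domain-specific layers, one per domain
    [nn.Linear(hidden, hidden) for _ in range(num_domains)])

x = torch.randn(1, hidden)                    # input data
domain = aux_net(x).argmax(dim=-1).item()     # second output: estimated domain of x
y = domain_layers[domain](shared(x))          # third output, based on both outputs
```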
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Houlsby with those of Rebuffi in order to allow combining the output of two models to determine an output, as both references deal with fine-tuning a base model using domain-specific data, and the benefit of doing so is that it allows for more accurate model selection, as the auxiliary model in Rebuffi allows for finding the domain of the data and then tuning the primary or base model using that data.
In regards to claim 9, Houlsby in view of Rebuffi discloses the method of claim 8, wherein the determining the third output associated with the input data comprises: determining, based at least on the output and the second output, the third output associated with the input data and a fourth output associated with the input data; determining, using one or more third machine learning models, a first score associated with the third output and a second score associated with the fourth output; and determining the third output based at least on the first score being greater than the second score. (See Rebuffi page 6, the “challenge and evaluation” section, for scoring outputs, and page 8 of Rebuffi.)
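The score-based selection recited in claim 9 (a third model scores two candidate outputs and the higher-scoring one is kept) can be illustrated with a hypothetical sketch; the function and parameter names are assumptions, not drawn from either reference:

```python
def select_output(score_model, input_data, third_output, fourth_output):
    """Use a third model to score two candidate outputs and keep the one
    whose score is greater, per the comparison recited in claim 9."""
    first_score = score_model(input_data, third_output)
    second_score = score_model(input_data, fourth_output)
    return third_output if first_score > second_score else fourth_output
```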
In regards to claim 15, it is the system embodiment of claim 8 with similar limitations and is thus rejected using the same reasoning found in claim 8.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAULINHO E SMITH whose telephone number is (571)270-1358. The examiner can normally be reached Mon-Fri. 10AM-6PM CST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar can be reached at 571-270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PAULINHO E SMITH/ Primary Examiner, Art Unit 2127