DETAILED ACTION
This Office action is in response to the amendments filed on 11/07/2025.
Claims 1, 2, 9, and 15 have been amended. Claims 1-20 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
35 U.S.C. 101 Rejections:
Applicant’s arguments regarding the claim rejections under 35 U.S.C. 101 (pg. 9-10) have
been fully considered and are persuasive. The rejection of the pending claims has been withdrawn.
Prior Art Rejections:
Applicant’s arguments regarding the prior art rejections (pg. 11-15) have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Applicant argues that the amended independent claims 1, 9, and 15 recite features not disclosed by any of the cited references. Specifically, applicant argues that the cited references do not disclose “deploying the third neural network to analyze sensor data from the second environment and generate control signals for controlling operation of production line equipment in the second environment,” as recited in the amended independent claims 1, 9, and 15. Applicant asserts that the references Przewięźlikowski and Rozantsev are directed to domain-adaptation methods for offline image classification, and thus lack any teaching or suggestion of real-time sensor data analysis or physical production line equipment control. Examiner notes that the Cella reference, which teaches autonomous production line control using neural networks operating on real-time sensor data, has been newly applied to teach this limitation.
Applicant also argues that the references Przewięźlikowski and Rozantsev address the problem of domain adaptation, which is fundamentally different from the technical problem addressed by the instant application, namely that “analytics solutions are usually custom developed for one line/product and much system integration is needed to scale the solution onto another line due to a change in sensor deployments” (spec. 0050). However, examiner respectfully notes that the described problem (modifying the weights of a neural network to adapt to a different but related input data distribution) is effectively a domain adaptation problem, and while Przewięźlikowski and Rozantsev do not appear to specifically suggest applying domain adaptation techniques in an industrial automation environment, this deficiency is remedied by Cella.
In light of the amendments to the claims, the anticipation rejections have been withdrawn and replaced with obviousness rejections. Claims 1-5, 9-12, and 15-18 are now rejected as being unpatentable over Przewięźlikowski in view of Cella. Claims 6-8, 13-14, and 19-20 are now rejected as being unpatentable over Przewięźlikowski in view of Cella and further in view of Rozantsev.
The prior art rejections have been updated to include the amended limitations and to clarify the reasoning given for the limitations that were not amended.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-5, 9-12, and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over
Przewięźlikowski et al. (hereinafter Przewięźlikowski), “HyperMAML: Few-Shot Adaptation of Deep Models with Hypernetworks” (published 05/31/2022) in view of
Cella et al. (hereinafter Cella), U.S. Patent Application Publication US-20190129408-A1 (filed 12/19/2018).
Regarding Claim 1,
Przewięźlikowski teaches A method […] comprising:
generating at least (1) a first set of weights for a first neural network associated with a first task performed in a first environment and (2) a second set of weights for a second neural network associated with the first task performed in a second environment; (Examiner notes that references to the tasks disclosed by Przewięźlikowski will be italicized, while references to the tasks claimed by applicant will be left plain. Pg. 6, algorithm 2: “1: randomly initialize θ, γ, η… 5: Compute adapted parameters θ′i from Si using formula given by eq. (5).” Algorithm 2 describes the training procedure for HyperMAML. The global parameters θ are the weights of the general model fθ (first NN), and they are initialized (i.e. generated). Adapted parameters θ′i are the weights of target model fθ’ (second NN) for the training task Ti, and they are computed (i.e. generated). The general model is associated with the first environment (general case), while the adapted model is associated with the second environment (task-specific). Pg. 7-8, section 4.2: “In order to benchmark the performance of HyperMAML in cross-domain adaptation, we combine two datasets so that the training fold is drawn from the first dataset and validation and the testing fold – from another one… Omniglot → EMNIST classification.” The training and validation/testing datasets used are Omniglot and EMNIST, respectively. Classification performed on the training data drawn from the Omniglot dataset corresponds to the claimed first task. Pg. 16, section 10.1: “The universal classifier is a single fully-connected layer with the input size equal to the encoder embedding size (in our case 64) and the output size equal to the number of classes.” The general model (universal classifier) is a neural network, as is the target model since its architecture mirrors the general model and merely updates its parameters.)
training a metamodel based on at least the first set of weights and the second set of weights; (Pg. 6, section 3.3: “The parameters of the encoder, hypernetwork, and global parameters θ represent the meta parameters of the system, and they are updated with stochastic gradient descent (SGD) by optimizing LHyperMAML(fθ).” The hypernetwork is a metamodel, and its parameters are updated (trained) based on the loss function defined in equation 6, which includes terms fθ and fθ’ corresponding to the first and second model (and their weights), respectively.)
generating, based on the metamodel, a third set of weights for a third neural network associated with a second task in the second environment; and (Figure 2 (pg. 6) shows the inference flow of the HyperMAML architecture, where the output of the trained hypernetwork is ∆θ representing updates to the weights which are used to calculate the target weights θ′ for the target classifier in the task-specific second environment. In the testing phase, this target classifier will perform classification on the test data from the EMNIST dataset, corresponding to the claimed second task.)
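For illustration only, the weight-generation flow mapped above can be sketched as follows (toy dimensions and parameter names, not the implementation of Przewięźlikowski): a hypernetwork metamodel consumes encoded support examples and emits a weight update ∆θ, which is added to the universal weights θ to produce the adapted weights θ′.

```python
import numpy as np

rng = np.random.default_rng(0)

embed_dim, n_classes = 64, 5
# Universal weights theta of the general model f_theta (the "first set of weights").
theta = rng.standard_normal((embed_dim, n_classes))

def hypernetwork(support_embeddings, W_h, n_classes):
    """Toy metamodel: map pooled support embeddings to a weight update (delta-theta)."""
    pooled = support_embeddings.mean(axis=0)           # aggregate the support set S_i
    return np.outer(pooled @ W_h, np.ones(n_classes))  # delta-theta, same shape as theta

W_h = 0.01 * rng.standard_normal((embed_dim, embed_dim))  # hypernetwork parameters (toy)
support = rng.standard_normal((10, embed_dim))            # encoded support examples
delta_theta = hypernetwork(support, W_h, n_classes)
theta_prime = theta + delta_theta  # adapted weights theta' of target classifier f_theta'
```

The salient point of the mapping is that the adapted weights θ′ are generated by the metamodel's forward pass rather than by gradient descent on the target task.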
Przewięźlikowski does not appear to explicitly disclose
A method for automated deployment of neural network analytics across different production line environments, the method comprising:
deploying the third neural network to analyze sensor data from the second environment and generate control signals for controlling operation of production line equipment in the second environment.
However, Cella teaches A method for automated deployment of neural network analytics across different production line environments, the method comprising:
deploying the third neural network to analyze sensor data from the second environment and generate control signals for controlling operation of production line equipment in the second environment. (0014: “The present disclosure describes a monitoring system for data collection in an industrial environment, the system according to one disclosed non-limiting embodiment of the present disclosure can include a plurality of input sensors operatively coupled to a production line, the plurality of sensors communicatively coupled to a data collector having a controller, the controller including: a data collection band circuit structured to determine at least one collection parameter for at least one of the plurality of sensors from which to process output data, a machine learning data analysis circuit structured to receive output data from the at least one of the plurality of sensors and learn output data patterns indicative of a state of the production line, and a response circuit structured to adjust an operating parameter of a component of the production line based on one of a mismatch or a match of the output data pattern and the state of the production line.” 0346: “In examples, the many types of machine learning algorithms may include decision tree based learning, association rule learning, deep learning, artificial neural networks…” A machine learning data analysis circuit, which can include a neural network, is deployed on a controller to analyze sensor data and adjust an operating parameter (i.e. generate control signals) of a component of a production line (i.e. production line equipment).)
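The deployment limitation mapped above can be sketched as follows (hypothetical sensor values, threshold, and signal names; illustrative only, not Cella's disclosed circuits): a deployed set of weights scores a window of sensor readings, and a response step issues a control signal based on the score.

```python
import numpy as np

def analyze(sensor_window: np.ndarray, weights: np.ndarray) -> float:
    """Toy 'machine learning data analysis circuit': score a window of
    sensor readings using deployed (third) neural network weights."""
    return float(np.tanh(sensor_window @ weights).mean())

def control_signal(score: float, setpoint: float = 0.0) -> str:
    """Toy 'response circuit': adjust an operating parameter when the learned
    pattern mismatches the expected production-line state."""
    return "slow_down" if score < setpoint else "maintain_speed"

weights = np.array([0.2, -0.1, 0.4])                    # deployed weights (toy)
window = np.array([[0.5, 1.2, -0.3], [0.1, 0.9, 0.2]])  # e.g. vibration/temp/acoustic
signal = control_signal(analyze(window, weights))       # -> "slow_down"
```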
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Przewięźlikowski and Cella. Przewięźlikowski teaches adapting a machine learning model to an unfamiliar environment using a hypernetwork to update model weights. Cella teaches leveraging machine learning for autonomous production line control and optimization in industrial environments based on sensor data. One of ordinary skill would have motivation to combine Przewięźlikowski and Cella because Cella’s production line neural networks operate on sensor data, including images (Cella, 0338), in order to optimize “parameters that may be relevant to successful outcomes (such as outcomes in a wide range of environments)” (Cella, 0239), and Przewięźlikowski’s hypernetwork adaptation method enables neural networks to perform image classification in new environments while “outperform[ing] the classical MAML in a number of standard Few-Shot learning benchmarks and achiev[ing] results similar to various other state-of-the-art methods” (Przewięźlikowski, pg. 9, section 5).
Regarding Claim 2, Przewięźlikowski and Cella teach The method of claim 1, as shown above.
Przewięźlikowski also teaches wherein:
generating the first set of weights is based on data captured by a first set of sensors in the first environment, (Pg. 6, section 3.3: “The parameters of the encoder, hypernetwork, and global parameters θ represent the meta parameters of the system, and they are updated with stochastic gradient descent (SGD) by optimizing LHyperMAML(fθ).” The global parameters θ are the weights of the general model fθ (first NN, general environment), and they are updated based on the loss calculated when training on the Omniglot dataset, which includes data captured by sensors: “Omniglot, a data set we collected of multiple examples of 1623 handwritten characters from 50 writing systems…Both images and pen strokes were collected” (Lake et al., “Human-level concept learning through probabilistic program induction,” pg. 1333, col. 2).)
generating the second set of weights is based on data captured by a second set of sensors in the second environment, (Pg. 6, figure 2: “The input support examples are processed by encoding network E(·) and delivered to the hypernetwork H(·)… The hypernetwork transforms them, and returns the update of weights ∆θ for target classifier fθ′.” In the training phase, the weights of the target classifier (second NN, task-specific environment) are generated by the hypernetwork based on tasks drawn from the Omniglot dataset, which includes data captured by sensors as described above.)
generating, based on the metamodel, the third set of weights is based on data captured by the second set of sensors in the second environment. (Pg. 6, figure 2: “The input support examples are processed by encoding network E(·) and delivered to the hypernetwork H(·)… The hypernetwork transforms them, and returns the update of weights ∆θ for target classifier fθ′.” In the testing phase, the weights of the target classifier (third NN, task-specific environment) are generated by the hypernetwork based on tasks drawn from the EMNIST dataset, which includes data captured by sensors: “This paper introduces such a suite of datasets, known as Extended Modified NIST (EMNIST). Derived from the NIST Special Database 19… The NIST Special Database 19 [11] contains handwritten digits and characters collected from over 500 writers. The dataset contains binary scans of the handwriting sample collection forms, and individually segmented and labeled characters which were extracted from the forms” (Cohen et al., “EMNIST: an extension of MNIST to handwritten letters,” pg. 2).)
Cella teaches wherein at least one of the first set of sensors or the second set of sensors comprises at least one of: cameras, vibration sensors, acoustic sensors, distance sensors, or temperature sensors, and (1026: “In still additional examples, sensors may be ultrasonic, microphone, touch, capacitive, vibration, acoustic, pressure, strain gauges, thermographic (e.g., camera), imaging (e.g., camera, laser, IR, structured light), a field detector, an EMF meter to measure an AC electromagnetic field, a gaussmeter, a motion detector, a chemical detector, a gas detector, a CBRNE detector, a vibration transducer, a magnetometer, positional, location-based, a velocity sensor, a displacement sensor, a tachometer, a flow sensor, a level sensor, a proximity sensor, a pH sensor, a hygrometer/moisture sensor, a densitometric sensor, an anemometer, a viscometer, or any analog industrial sensor and/or digital industrial sensor.”)
Regarding Claim 3, Przewięźlikowski and Cella teach The method of claim 2, as shown above.
Przewięźlikowski also teaches wherein:
the first neural network is for analysis of the data captured by the first set of sensors related to the first task in the first environment, the second neural network is for analysis of the data captured by the second set of sensors related to the first task in the second environment, and the third neural network is for analysis of the data captured by the second set of sensors related to the second task in the second environment. (Pg. 7-8, section 4.2: “In order to benchmark the performance of HyperMAML in cross-domain adaptation, we combine two datasets so that the training fold is drawn from the first dataset and validation and the testing fold – from another one… Omniglot → EMNIST classification.” In the training phase, the general model (first NN) operates on the Omniglot dataset (first task) in the general environment, and the target model (second NN) operates on the Omniglot dataset (first task) in the task-specific environment. In the testing phase, the target model operates on the EMNIST dataset (second task) in the task-specific environment.)
Regarding Claim 4, Przewięźlikowski and Cella teach The method of claim 2, as shown above.
Przewięźlikowski also teaches wherein:
generating the first set of weights is further based on a first set of labels associated with the data captured by the first set of sensors in the first environment, and (Pg. 6, section 3.3: “The parameters of the encoder, hypernetwork, and global parameters θ represent the meta parameters of the system, and they are updated with stochastic gradient descent (SGD) by optimizing LHyperMAML(fθ).” In the training phase, the global parameters θ are the weights of the general model fθ (first NN, general environment), and they are updated based on the loss calculated when training on the Omniglot dataset (first task) according to “true support labels” YS for each support set, as shown in figure 2 (pg. 6).)
generating the second set of weights is further based on a second set of labels associated with the data captured by the second set of sensors in the second environment. (Pg. 6, figure 2: “The input support examples are processed by encoding network E(·) and delivered to the hypernetwork H(·) together with the true support labels… The hypernetwork transforms them, and returns the update of weights ∆θ for target classifier fθ′.” In the training phase, the weights of the target classifier (second NN, task-specific environment) are generated by the hypernetwork, which takes “true support labels” YS from the Omniglot dataset (first task) as one of its input features.)
Regarding Claim 5, Przewięźlikowski and Cella teach The method of claim 1, as shown above.
Przewięźlikowski also teaches wherein:
generating the third set of weights is further based on a fourth set of weights for a fourth neural network associated with a second task in the first environment. (Pg. 6, figure 2: “The input support examples are processed by encoding network E(·) and delivered to the hypernetwork H(·) together with…predictions from general model fθ(·). The hypernetwork transforms them, and returns the update of weights ∆θ for target classifier fθ′.” In the testing phase, the weights of the target classifier (third NN, task-specific environment) are generated by the hypernetwork, which takes predictions ŶS from the trained general model (fourth NN, general environment) from the EMNIST dataset (second task) as one of its input features, as shown in figure 2 (pg. 6). Therefore, the output of the hypernetwork is based on the weights of the trained general model which determine ŶS.)
Claims 9-12 are system claims, containing substantially the same elements as method claims 1-4. Przewięźlikowski and Cella teach the elements of claims 1-4, as shown above.
Przewięźlikowski also teaches A system comprising: a memory; and a set of processors coupled to the memory, the set of processors configured to perform the method. (Examiner notes that this limitation is interpreted as a general-purpose computing environment. Pg. 16, section 11: “We implement HyperMAML using the PyTorch framework… Each experiment described in this work was run on a single NVIDIA RTX 2080 GPU.” The use of PyTorch and a GPU necessitates implementation of the described methods in a computing environment.)
Claims 15-18 are product claims, containing substantially the same elements as method claims 1-4. Przewięźlikowski and Cella teach the elements of claims 1-4, as shown above.
Przewięźlikowski also teaches A non-transitory machine readable medium storing sets of instructions that, when executed by a set of processors, causes the set of processors to perform the method. (Examiner notes that this limitation is interpreted as a general-purpose computing environment. Pg. 16, section 11: “We implement HyperMAML using the PyTorch framework… Each experiment described in this work was run on a single NVIDIA RTX 2080 GPU.” The use of PyTorch and a GPU necessitates implementation of the described methods in a computing environment.)
Claims 6-8, 13-14, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Przewięźlikowski in view of Cella and further in view of
Rozantsev et al. (hereinafter Rozantsev), “Residual Parameter Transfer for Deep Domain Adaptation” (published 11/21/2017).
Regarding Claim 6, Przewięźlikowski and Cella teach The method of claim 5, as shown above.
Przewięźlikowski also teaches wherein generating the third set of weights comprises:
outputting the third set of weights from the metamodel. (Pg. 6, figure 2: “The hypernetwork transforms them, and returns the update of weights ∆θ for target classifier fθ′.” As shown in figure 2, the update of weights ∆θ is combined with the universal weights θ to output the target weights θ’, which during the testing phase correspond to the weights of the third NN.)
Przewięźlikowski and Cella do not appear to explicitly disclose wherein generating the third set of weights comprises: providing the fourth set of weights as an input to the metamodel;
However, Rozantsev teaches wherein generating the third set of weights comprises:
providing the fourth set of weights as an input to the metamodel; (Pg. 2, section 1: “We model the domain shift by learning meta parameters that transform the weights and biases of each layer of the network. They are depicted by the horizontal branches in Fig. 1.” As can be seen in figure 1 (pg. 1) as well as figure 2 (pg. 3), the parameters of the source model (i.e. the fourth set of weights) are provided directly as inputs to the residual transform network (i.e. the metamodel) which outputs the weights of the target model.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Przewięźlikowski, Cella, and Rozantsev. Przewięźlikowski teaches a method for adapting a model to an unfamiliar environment by using a hypernetwork to update model weights based on support data. Cella teaches leveraging machine learning for autonomous production line control and optimization in industrial environments based on sensor data. Rozantsev teaches a method for adapting a model to a new domain by using auxiliary networks to predict target model weights directly from the weights of the source model. One of ordinary skill would have motivation to combine Przewięźlikowski, Cella, and Rozantsev because “This architecture enables us to flexibly preserve the similarities between domains where they exist and model the differences when necessary. We demonstrate that our approach yields higher accuracy than state-of-the-art methods without undue complexity” (Rozantsev, pg. 1, abstract). Additionally, Rozantsev’s method is applicable even in the absence of annotated samples from the target domain (Rozantsev, pg. 3, section 3).
Regarding Claim 7, Przewięźlikowski and Cella teach The method of claim 5, as shown above.
Przewięźlikowski and Cella do not appear to explicitly disclose wherein each set of weights comprises a vector of weight values and training the metamodel comprises generating at least a first matrix for converting sets of weights associated with different tasks in the first environment to corresponding sets of weights associated with the different tasks in the second environment, and generating the third set of weights comprises using the first matrix to convert the fourth set of weights into the third set of weights.
However, Rozantsev teaches wherein
each set of weights comprises a vector of weight values and (Pg. 3, section 3.1: “let us first consider a vector representation of the source and target stream parameters as θᵢˢ and θᵢᵗ, respectively.” Source and target stream parameters are the weights of the source and target models.)
training the metamodel comprises generating at least a first matrix for converting sets of weights associated with different tasks in the first environment to corresponding sets of weights associated with the different tasks in the second environment, and (Pg. 3, section 3.1: “we could learn all the coefficients of these matrices [Aᵢ and Bᵢ] for all layers, along with their rank, by minimizing a loss function…” Aᵢ and Bᵢ are “transformation matrices” according to table 1, which are used to convert weights from the source domain to the target domain.)
generating the third set of weights comprises using the first matrix to convert the fourth set of weights into the third set of weights. (Pg. 3, section 3.1: “A natural way to transform the source parameters into the target ones is to write θᵢᵗ = Bᵢσ(Aᵢᵀθᵢˢ + dᵢ) + θᵢˢ, ∀i ∈ Ω, for which the notation is given in Table 1.” In the testing phase, source parameters θᵢˢ are the fourth set of weights, which are converted by the transformation matrices Aᵢ and Bᵢ to obtain the target parameters θᵢᵗ, which represent the third set of weights.)
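As a concrete illustration of the conversion mapped above (toy dimensions, with σ chosen as tanh for concreteness; not Rozantsev's actual implementation), the fourth set of weights θᵢˢ can be converted into the third set θᵢᵗ via learned matrices Aᵢ and Bᵢ:

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 8, 3                        # weight-vector length and transform rank (toy)

theta_s = rng.standard_normal(d)   # fourth set of weights (source environment)
A = rng.standard_normal((d, r))    # transformation matrices A_i, B_i (learned during
B = rng.standard_normal((d, r))    # metamodel training; random placeholders here)
bias = rng.standard_normal(r)      # offset d_i

def residual_transfer(theta_s, A, B, bias):
    """theta_t = B @ sigma(A.T @ theta_s + d) + theta_s (Rozantsev, section 3.1)."""
    return B @ np.tanh(A.T @ theta_s + bias) + theta_s

theta_t = residual_transfer(theta_s, A, B, bias)  # third set of weights (target)
```

The residual form (adding θᵢˢ back after the transform) preserves similarities between domains while the low-rank matrices model the differences.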
Regarding Claim 8, Przewięźlikowski, Cella, and Rozantsev teach The method of claim 7, as shown above.
Przewięźlikowski, Cella, and Rozantsev also teach wherein training the metamodel comprises generating at least a second matrix for converting sets of weights associated with different tasks in the second environment to corresponding sets of weights associated with the different tasks in the first environment, the method further comprising: generating a fifth set of weights for a fifth neural network associated with a fourth task in the second environment; and using the second matrix to convert the fifth set of weights into a sixth set of weights for a sixth neural network associated with the fourth task in the first environment. (This claim amounts to an additional iteration of the methods taught above, with the only significant difference being that the weights are converted from the second environment to the first environment rather than from the first environment to the second. Rozantsev teaches this bidirectionality when testing on the Office dataset. Table 6 (pg. 8) shows model adaptation in both directions for pairs of domains (environments) (e.g., D → W and W → D, where D refers to DSLR data and W refers to webcam data).)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Przewięźlikowski, Cella, and Rozantsev. Przewięźlikowski teaches a method for adapting a model to an unfamiliar environment by using a hypernetwork to update model weights based on support data. Cella teaches leveraging machine learning for autonomous production line control and optimization in industrial environments based on sensor data. Rozantsev teaches a method for adapting a model to a new domain by using auxiliary networks to predict target model weights directly from the weights of the source model. One of ordinary skill would have motivation to combine Przewięźlikowski, Cella, and Rozantsev because “This architecture enables us to flexibly preserve the similarities between domains where they exist and model the differences when necessary. We demonstrate that our approach yields higher accuracy than state-of-the-art methods without undue complexity” (Rozantsev, pg. 1, abstract). Additionally, Rozantsev’s method is applicable even in the absence of annotated samples from the target domain (Rozantsev, pg. 3, section 3), and allows for bidirectionality in domain/environment adaptation (Rozantsev, pg. 8, table 6), enabling conversions from the task-specific environment to the general environment for single-domain generalization, “where only one source domain is available for training” (Wang et al., “Learning to Diversify for Single Domain Generalization”, pg. 1, abstract).
Regarding Claim 13, Przewięźlikowski and Cella teach The system of claim 9, as shown above.
Przewięźlikowski also teaches wherein the set of processors configured to generate the third set of weights is further configured to:
output the third set of weights from the metamodel. (Pg. 6, figure 2: “The hypernetwork transforms them, and returns the update of weights ∆θ for target classifier fθ′.” As shown in figure 2, the update of weights ∆θ is combined with the universal weights θ to output the target weights θ’, which during the testing phase correspond to the weights of the third NN.)
Przewięźlikowski and Cella do not appear to explicitly disclose wherein the set of processors configured to generate the third set of weights is further configured to: provide a fourth set of weights for a fourth neural network associated with a second task in the first environment as an input to the metamodel;
However, Rozantsev teaches wherein the set of processors configured to generate the third set of weights is further configured to:
provide a fourth set of weights for a fourth neural network associated with a second task in the first environment as an input to the metamodel; (Pg. 2, section 1: “We model the domain shift by learning meta parameters that transform the weights and biases of each layer of the network. They are depicted by the horizontal branches in Fig. 1.” As can be seen in figure 1 (pg. 1) as well as figure 2 (pg. 3), the parameters of the source model (i.e. trained general model during testing phase, fourth set of weights) are provided directly as inputs to the residual transform network (i.e. the metamodel) which outputs the weights of the target model (i.e. task-specific model during testing phase, third set of weights).)
Regarding Claim 14, Przewięźlikowski, Cella, and Rozantsev teach The system of claim 13, as shown above.
Rozantsev also teaches wherein
each set of weights comprises a vector of weight values and (Pg. 3, section 3.1: “let us first consider a vector representation of the source and target stream parameters as θᵢˢ and θᵢᵗ, respectively.” Source and target stream parameters are the weights of the source and target models.)
training the metamodel comprises generating at least a first matrix for converting sets of weights associated with different tasks in the first environment to corresponding sets of weights associated with the different tasks in the second environment, and (Pg. 3, section 3.1: “we could learn all the coefficients of these matrices [Aᵢ and Bᵢ] for all layers, along with their rank, by minimizing a loss function…” Aᵢ and Bᵢ are “transformation matrices” according to table 1, which are used to convert weights from the source domain to the target domain.)
generating the third set of weights comprises using the first matrix to convert the fourth set of weights into the third set of weights. (Pg. 3, section 3.1: “A natural way to transform the source parameters into the target ones is to write θᵢᵗ = Bᵢσ(Aᵢᵀθᵢˢ + dᵢ) + θᵢˢ, ∀i ∈ Ω, for which the notation is given in Table 1.” In the testing phase, source parameters θᵢˢ are the fourth set of weights, which are converted by the transformation matrices Aᵢ and Bᵢ to obtain the target parameters θᵢᵗ, which represent the third set of weights.)
Regarding Claim 19, Przewięźlikowski and Cella teach The non-transitory machine readable medium of claim 15, as shown above.
Przewięźlikowski also teaches wherein the sets of instructions causing the set of processors to generate the third set of weights further causes the set of processors to:
output the third set of weights from the metamodel. (Pg. 6, figure 2: “The hypernetwork transforms them, and returns the update of weights ∆θ for target classifier fθ′.” As shown in figure 2, the update of weights ∆θ is combined with the universal weights θ to output the target weights θ’, which during the testing phase correspond to the weights of the third NN.)
Przewięźlikowski and Cella do not appear to explicitly disclose wherein the sets of instructions causing the set of processors to generate the third set of weights further causes the set of processors to: provide a fourth set of weights for a fourth neural network associated with a second task in the first environment as an input to the metamodel;
However, Rozantsev teaches wherein the sets of instructions causing the set of processors to generate the third set of weights further causes the set of processors to:
provide a fourth set of weights for a fourth neural network associated with a second task in the first environment as an input to the metamodel; (Pg. 2, section 1: “We model the domain shift by learning meta parameters that transform the weights and biases of each layer of the network. They are depicted by the horizontal branches in Fig. 1.” As can be seen in figure 1 (pg. 1) as well as figure 2 (pg. 3), the parameters of the source model (i.e. trained general model during testing phase, fourth set of weights) are provided directly as inputs to the residual transform network (i.e. the metamodel) which outputs the weights of the target model (i.e. task-specific model during testing phase, third set of weights).)
Regarding Claim 20, Przewięźlikowski, Cella, and Rozantsev teach The non-transitory machine readable medium of claim 19, as shown above.
Rozantsev also teaches wherein
each set of weights comprises a vector of weight values and (Pg. 3, section 3.1: “let us first consider a vector representation of the source and target stream parameters as θᵢˢ and θᵢᵗ, respectively.” The source and target stream parameters are the weights of the source and target models.)
training the metamodel comprises generating at least a first matrix for converting sets of weights associated with different tasks in the first environment to corresponding sets of weights associated with the different tasks in the second environment, and (Pg. 3, section 3.1: “we could learn all the coefficients of these matrices [Aᵢ and Bᵢ] for all layers, along with their rank, by minimizing a loss function…” Aᵢ and Bᵢ are “transformation matrices” according to Table 1, which are used to convert weights from the source domain to the target domain.)
generating the third set of weights comprises using the first matrix to convert the fourth set of weights into the third set of weights. (Pg. 3, section 3.1: “A natural way to transform the source parameters into the target ones is to write θᵢᵗ = Bᵢσ(Aᵢᵀθᵢˢ + dᵢ) + θᵢˢ, ∀i ∈ Ω, for which the notation is given in Table 1.” In the testing phase, the source parameters θᵢˢ are the fourth set of weights, which are converted by the transformation matrices Aᵢ and Bᵢ to obtain the target parameters θᵢᵗ, which represent the third set of weights.)
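For purposes of illustration only (not part of the record of the cited reference), the residual weight transformation quoted from Rozantsev above can be sketched in the following hypothetical Python/NumPy snippet. The vector length, rank, random values, and the choice of tanh for the nonlinearity σ are assumptions made for the example and are not drawn from the reference:

```python
import numpy as np

def transform_weights(theta_s, A, B, d):
    """Residual transform of source weights into target weights:
    theta_t = B @ sigma(A.T @ theta_s + d) + theta_s
    (sigma is taken to be tanh here, as an assumption)."""
    return B @ np.tanh(A.T @ theta_s + d) + theta_s

rng = np.random.default_rng(0)
n, r = 6, 2                       # weight-vector length and transform rank (assumed)
theta_s = rng.standard_normal(n)  # source parameters (the "fourth set of weights")
A = rng.standard_normal((n, r))   # learned transformation matrices A_i, B_i
B = rng.standard_normal((n, r))
d = rng.standard_normal(r)        # learned bias d_i

theta_t = transform_weights(theta_s, A, B, d)  # target parameters ("third set of weights")
print(theta_t.shape)
```

Note the residual structure: if the learned matrix B is zero, the target weights reduce to the source weights unchanged, which matches the θᵢˢ term added outside the nonlinearity in the quoted equation.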
Conclusion
Claims 1-20 are rejected.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN M ROHD whose telephone number is (571)272-6445. The examiner can normally be reached Mon-Thurs 8:00-6:00 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Viker Lamardo, can be reached at (571) 270-5871. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/B.M.R./Examiner, Art Unit 2147 /ERIC NILSSON/Primary Examiner, Art Unit 2151