Prosecution Insights
Last updated: April 19, 2026
Application No. 18/163,542

XOR OPERATION LEARNING PROBABILITY OF MULTIVARIATE NONLINEAR ACTIVATION FUNCTION AND PRACTICAL APPLICATION METHOD THEREOF

Status: Non-Final Office Action (§102, §103)
Filed: Feb 02, 2023
Examiner: KAWSAR, ABDULLAH AL
Art Unit: 2127
Tech Center: 2100 (Computer Architecture & Software)
Assignee: IUCF-HYU (Industry-University Cooperation Foundation Hanyang University)
OA Round: 1 (Non-Final)

Grant Probability: 79% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 4y 11m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 79% (above average; 312 granted / 395 resolved; +24.0% vs TC avg)
Interview Lift: +58.0% allowance lift for resolved cases with interview
Typical Timeline: 4y 11m average prosecution; 14 applications currently pending
Career History: 409 total applications across all art units

Statute-Specific Performance

§101: 16.3% (-23.7% vs TC avg)
§102: 12.4% (-27.6% vs TC avg)
§103: 43.5% (+3.5% vs TC avg)
§112: 23.1% (-16.9% vs TC avg)

Tech Center averages are estimates. Based on career data from 395 resolved cases.

Office Action

Rejections under §102 and §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority

Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The certified copy has been filed in parent Application No. KR10-2022-0016, filed on February 8, 2022.

Claim Objections

Claim 3 is objected to because of the following informalities: “wherein the constructing of the inner network comprises constructing the inner network using a convolution with a preset size” should read “wherein the constructing of the inner network comprises a convolution with a preset size”. Claim 6 is objected to because of the following informalities: “pretraining” should read “pre-training”. Claim 9 is objected to because of the following informalities: “claim. 1” should read “claim 1”. Appropriate correction is required.

Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):

(f) Element in Claim for a Combination. An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:

An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.
The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked. As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:

(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;

(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always, linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and

(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.

Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. That presumption is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. That presumption is rebutted when the claim limitation recites function without reciting sufficient structure, material, or acts to entirely perform the recited function.

Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

This application includes one or more claim limitations that do not use the word “means” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because each such limitation uses a generic placeholder coupled with functional language without reciting sufficient structure to perform the recited function, and the generic placeholder is not preceded by a structural modifier. Such claim limitations are: the “inner network constructor” of claims 10, 11, and 12, and the “model trainer” of claims 13, 14, and 15. Because these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof (not found in the specification).

If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitations to avoid such interpretation (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid interpretation under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 U.S.C. § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 6, 9, 10, and 13 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by LaBute (US 10761897 B2).

Regarding Claim 1, LaBute teaches a learning method of an activation function performed by a computer device, the method comprising: (See Col. 1, lines 53-58; LaBute recites “The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor”, which represents the computer device on which the learning method is performed. See Col. 5, lines 47-48; LaBute recites “Provisioning system 300 comprises required threads determiner”. Furthermore, LaBute recites “First input z(t) comprises a first input—for example, a number of required threads (e.g., from a required threads determiner). In some embodiments, z(t) are one-hot encoded vectors of time-based features indicating month, week, day, day of month, weekday, hour, minute”; see Col. 7, lines 15-19. Next, LaBute recites “First input z(t) is processed by input layer 402 and neuron layer(s) 404. In some embodiments, neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400, where they are ‘frozen’, i.e. the parameters of these layers are not allowed to change during training of the full model 400”, where the pre-training is the learning method; see Col. 5, lines 24-29. Finally, LaBute recites “The layer(s) of neuron layer(s) 404 comprise an output function, for example a rectified linear output function or a tan h output function”, where the output functions are the activation functions; see Col. 7, lines 35-37. Therefore, the computer device performs the learning method of an activation function.);

constructing an inner network using a multivariate nonlinear activation function (See Col. 5, lines 43-44; LaBute recites that the “thread usage” is “produced by” the “required threads determiner 302”.
Furthermore, LaBute recites “Required threads determiner 302 determines an estimate of a number of required processing threads at a given time. The estimate is based at least in part on historical usage data (e.g., historical processor queue data). Required threads determiner 302 determines the number of required threads at a regular interval (e.g., a new determination is made every 5 minutes, every 10 minutes, every hour, etc.)” and that “Required threads determiner 302 provides multi-scale seasonality (e.g., daily, weekly, monthly, and annually) patterns, learned from the number-of-threads time series determined”, which means that the thread usage has multiple inputs at a given interval of time; see Col. 5, lines 48-63. Then, LaBute recites “In some embodiments, neuron layer(s) 404 are separately pre-trained on thread usages”, which is interpreted as the construction of the neuron layer(s) 404, or the inner network; see Col. 7, lines 24-25. In addition, LaBute recites “The layer(s) of neuron layer(s) 404 comprise an output function, for example a rectified linear output function or a tan h output function”, where the rectified linear or tanh output function is the nonlinear activation function; see Col. 7, lines 35-37. Therefore, the neuron layers 404, or the inner network, with the multiple inputs (multivariate) from the thread usages, are created using a rectified linear or tanh output function, i.e. the nonlinear activation function, which becomes a multivariate nonlinear activation function due to the inputs from the thread usages.); and

training a combination model generated by merging the constructed inner network and an outer network (See Col. 7, lines 24-29; LaBute recites “In some embodiments, neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400, where they are ‘frozen’, i.e. the parameters of these layers are not allowed to change during training of the full model 400”. Furthermore, the neuron layer(s) 404 are inserted between the input layer 402 and the neuron layer(s) 410 of the full model 400; see Fig. 4. This is interpreted as the neuron layers, or the inner network, being pre-trained and inserted between the layers of the main network model, or the outer network. Furthermore, LaBute recites “Model 400 additionally receives training data. Training data comprises data for training model 400 (e.g., to make correct decisions). For example, training data comprises a resource utilization score received from a resource utilization score determiner. Model 400 is trained in one or more ways, include using reinforcement learning using a resource utilization score and a provisioner performance reward, pre-training training neural network layer(s) 404 using seasonal job request data and then freezing these before training the full network, or training using supervised learning with historical data”, which is interpreted as the model 400 being trained with the neuron layers 404 merged into it; see Col. 8, lines 4-14.).

Regarding Claim 6, LaBute teaches the method of claim 1, wherein the training comprises pre-training the inner network using reinforcement learning on the multivariate nonlinear activation function (See Col. 5, lines 43-44; LaBute recites that the “thread usage” is “produced by” the “required threads determiner 302”. Furthermore, LaBute recites “Required threads determiner 302 determines an estimate of a number of required processing threads at a given time. The estimate is based at least in part on historical usage data (e.g., historical processor queue data). Required threads determiner 302 determines the number of required threads at a regular interval (e.g., a new determination is made every 5 minutes, every 10 minutes, every hour, etc.)” and that “Required threads determiner 302 provides multi-scale seasonality (e.g., daily, weekly, monthly, and annually) patterns, learned from the number-of-threads time series determined”, which means that the thread usage has multiple inputs at a given interval of time; see Col. 5, lines 48-63. Then, LaBute recites “neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400”, where the neuron layers are the inner network and the model 400 is the outer network; see Col. 7, lines 24-27. In addition, LaBute recites “the layer(s) of neuron layer(s) 404 comprise an output function, for example a rectified linear output function or a tan h output function”, where the rectified linear or tanh output functions with multiple input neurons are multivariate nonlinear activation functions; see Col. 7, lines 35-37. Furthermore, LaBute recites “Model 400 additionally receives training data. Training data comprises data for training model 400 (e.g., to make correct decisions). For example, training data comprises a resource utilization score received from a resource utilization score determiner. Model 400 is trained in one or more ways, include using reinforcement learning using a resource utilization score and a provisioner performance reward, pre-training training neural network layer(s) 404 using seasonal job request data and then freezing these before training the full network, or training using supervised learning with historical data”, where the neuron layers 404, or the inner network, are pre-trained using reinforcement learning; see Col. 8, lines 4-14. Therefore, the neuron layers 404, or the inner network, are pre-trained using reinforcement learning on the rectified linear or tanh output function with the thread usages as the multiple inputs, i.e. the multivariate nonlinear activation function.).
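To make the pattern concrete that the examiner maps claims 1 and 6 onto (pre-training an inner network that uses a nonlinear activation, freezing it, and then training the merged combination model), the following is a minimal NumPy sketch. All sizes, data, and the single-layer structure are illustrative assumptions, not taken from LaBute or from the application:

```python
import numpy as np

rng = np.random.default_rng(0)

# Inner network: one layer whose tanh output depends on several inputs at
# once, i.e. a multivariate nonlinear activation (cf. LaBute's "tan h
# output function"). The 8-in / 16-out sizes are arbitrary.
W_in = rng.normal(size=(8, 16)) * 0.1

def inner(x):
    return np.tanh(x @ W_in)

# Hypothetical pre-training of the inner network alone, via a few gradient
# steps on auxiliary data (LaBute's seasonal thread usages would play this
# role in the examiner's reading).
x_aux = rng.normal(size=(64, 8))
y_aux = rng.normal(size=(64, 16))
for _ in range(50):
    h = np.tanh(x_aux @ W_in)
    grad = x_aux.T @ ((h - y_aux) * (1.0 - h**2)) / len(x_aux)
    W_in -= 0.5 * grad

# "Freeze" the pre-trained inner network: its weights are not updated
# while the combination (inner + outer) model is trained.
W_frozen = W_in.copy()

# Outer network merged on top of the frozen inner network; only W_out
# is trained in the combination model.
W_out = rng.normal(size=(16, 1)) * 0.1
x = rng.normal(size=(64, 8))
y = rng.normal(size=(64, 1))
err_before = np.mean((inner(x) @ W_out - y) ** 2)
for _ in range(200):
    h = inner(x)                                   # frozen features
    W_out -= 0.1 * (h.T @ (h @ W_out - y)) / len(x)
final_err = np.mean((inner(x) @ W_out - y) ** 2)
```

Training only `W_out` while `W_in` stays fixed mirrors the "frozen" insertion LaBute describes; in a framework such as PyTorch the same effect would come from setting `requires_grad = False` on the pre-trained layers before training the full model.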
Regarding Claim 9, LaBute teaches a non-transitory computer-readable recording medium storing a computer program to perform the learning method of the activation function of claim 1 on the computer device (See Col. 1, lines 53-58; LaBute recites “The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor”, which is the computer program stored on a computer readable storage medium. Then, LaBute recites “Provisioning system 300 comprises required threads determiner”; see Col. 5, lines 47-48. Furthermore, LaBute recites “First input z(t) comprises a first input—for example, a number of required threads (e.g., from a required threads determiner). In some embodiments, z(t) are one-hot encoded vectors of time-based features indicating month, week, day, day of month, weekday, hour, minute”; see Col. 7, lines 15-19. Next, LaBute recites “First input z(t) is processed by input layer 402 and neuron layer(s) 404. In some embodiments, neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400, where they are ‘frozen’, i.e. the parameters of these layers are not allowed to change during training of the full model 400”, where the pre-training is the learning method; see Col. 5, lines 24-29. Finally, LaBute recites “The layer(s) of neuron layer(s) 404 comprise an output function, for example a rectified linear output function or a tan h output function”, where the output functions are the activation functions; see Col. 7, lines 35-37. Therefore, the non-transitory computer-readable recording medium stores a computer program to perform the learning method of the activation function on the computer device.).

Regarding Claim 10, LaBute teaches a computer device comprising: an inner network constructor configured to construct an inner network using a multivariate nonlinear activation function (See Col. 1, lines 53-58; LaBute recites “The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor”, where the computer components represent the computer device. Then, LaBute recites that the “thread usage” is “produced by” the “required threads determiner 302”; see Col. 5, lines 43-44. Next, LaBute recites “Provisioning system 300 comprises required threads determiner”; see Col. 5, lines 47-48. Furthermore, LaBute recites “Required threads determiner 302 determines an estimate of a number of required processing threads at a given time. The estimate is based at least in part on historical usage data (e.g., historical processor queue data). Required threads determiner 302 determines the number of required threads at a regular interval (e.g., a new determination is made every 5 minutes, every 10 minutes, every hour, etc.)” and that “Required threads determiner 302 provides multi-scale seasonality (e.g., daily, weekly, monthly, and annually) patterns, learned from the number-of-threads time series determined”, which means that the thread usage has multiple inputs at a given interval of time; see Col. 5, lines 48-63. Then, LaBute recites “In some embodiments, neuron layer(s) 404 are separately pre-trained on thread usages”, which is interpreted as the construction of the neuron layer(s) 404, or the inner network, with the pre-training serving as the inner network constructor; see Col. 7, lines 25-26. Furthermore, LaBute recites “Neuron layer(s) 404 comprises one or more neuron layers for data processing—for example, a feedforward neuron layer, a recurrent neuron layer, a long short term memory layer, a dense layer, a dropout layer, etc. Neuron layer(s) 404 feed into neuron layer(s) 410. The layer(s) of neuron layer(s) 404 comprise an output function, for example a rectified linear output function or a tan h output function.”, where the neuron layer(s) 404 represent the inner network and the multiple neurons are input to the rectified linear or tanh output function, which represents the multivariate nonlinear activation function since tanh and rectified linear (ReLU) are both nonlinear functions; see Col. 7, lines 30-36. Therefore, the pre-training, or the inner network constructor, constructs the neuron layers 404, or the inner network, using the output functions with the thread usages, i.e. the multivariate nonlinear activation function.), and

a model trainer configured to train a combination model generated by merging the constructed inner network and an outer network (See Col. 7, lines 25-31; LaBute recites “In some embodiments, neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400, where they are ‘frozen’, i.e. the parameters of these layers are not allowed to change during training of the full model 400”. Furthermore, the neuron layer(s) 404 are inserted between the input layer 402 and the neuron layer(s) 410 of the full model 400; see Fig. 4. This is interpreted as the neuron layers, or the inner network, being pre-trained and inserted between the layers of the main network model, or the outer network. Then, LaBute recites “Model 400 is trained in one or more ways, include using reinforcement learning using a resource utilization score and a provisioner performance reward, pre-training training neural network layer(s) 404 using seasonal job request data and then freezing these before training the full network, or training using supervised learning with historical data”, where the reinforcement learning using the resource utilization score and provisioner performance reward is the model trainer; see Col. 8, lines 8-14.).

Regarding Claim 13, LaBute teaches the computer device of claim 10, wherein the model trainer is configured to pretrain the inner network using reinforcement learning on the multivariate nonlinear activation function (See Col. 5, lines 43-44; LaBute recites that the “thread usage” is “produced by” the “required threads determiner 302”. Furthermore, LaBute recites “Required threads determiner 302 determines an estimate of a number of required processing threads at a given time. The estimate is based at least in part on historical usage data (e.g., historical processor queue data). Required threads determiner 302 determines the number of required threads at a regular interval (e.g., a new determination is made every 5 minutes, every 10 minutes, every hour, etc.)” and that “Required threads determiner 302 provides multi-scale seasonality (e.g., daily, weekly, monthly, and annually) patterns, learned from the number-of-threads time series determined”, which means that the thread usage has multiple inputs at a given interval of time; see Col. 5, lines 48-63.
Then, LaBute recites “neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400”, where the neuron layers are the inner network and the model 400 is the outer network; see Col. 7, lines 24-27. In addition, LaBute recites “the layer(s) of neuron layer(s) 404 comprise an output function, for example a rectified linear output function or a tan h output function”, where the rectified linear or tanh output functions with inputs from the thread usages are multivariate nonlinear activation functions; see Col. 7, lines 35-37. Furthermore, LaBute recites “Model 400 additionally receives training data. Training data comprises data for training model 400 (e.g., to make correct decisions). For example, training data comprises a resource utilization score received from a resource utilization score determiner. Model 400 is trained in one or more ways, include using reinforcement learning using a resource utilization score and a provisioner performance reward, pre-training training neural network layer(s) 404 using seasonal job request data and then freezing these before training the full network, or training using supervised learning with historical data”, where the neuron layers 404, or the inner network, are pre-trained using reinforcement learning, and the reinforcement learning using the resource utilization score and provisioner performance reward is the model trainer; see Col. 8, lines 4-14. Therefore, the neuron layers 404, or the inner network, are pre-trained using reinforcement learning on the rectified linear or tanh output function with the inputs from the thread usages, i.e. the multivariate nonlinear activation function.).

Claim Rejections - 35 U.S.C. § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or non-obviousness.

Claims 2 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over LaBute (US 10761897 B2) in view of Xiao (US 20170323636 A1).

Regarding Claim 2, LaBute teaches the method of claim 1 comprising the construction of an inner network using a multivariate nonlinear activation function (See Col. 5, lines 43-44; LaBute recites that the “thread usage” is “produced by” the “required threads determiner 302”. Furthermore, LaBute recites “Required threads determiner 302 determines an estimate of a number of required processing threads at a given time. The estimate is based at least in part on historical usage data (e.g., historical processor queue data). Required threads determiner 302 determines the number of required threads at a regular interval (e.g., a new determination is made every 5 minutes, every 10 minutes, every hour, etc.)” and that “Required threads determiner 302 provides multi-scale seasonality (e.g., daily, weekly, monthly, and annually) patterns, learned from the number-of-threads time series determined”, which means that the thread usage has multiple inputs at a given interval of time; see Col. 5, lines 48-63. Then, LaBute recites “In some embodiments, neuron layer(s) 404 are separately pre-trained on thread usages”, which is interpreted as the construction of the neuron layer(s) 404, or the inner network; see Col. 7, lines 25-26. Furthermore, LaBute recites “Neuron layer(s) 404 comprises one or more neuron layers for data processing—for example, a feedforward neuron layer, a recurrent neuron layer, a long short term memory layer, a dense layer, a dropout layer, etc. Neuron layer(s) 404 feed into neuron layer(s) 410. The layer(s) of neuron layer(s) 404 comprise an output function, for example a rectified linear output function or a tan h output function.”, where the neuron layer(s) 404 represent the inner network and the inputs from the thread usages are used for the rectified linear or tanh output function, which represents the multivariate nonlinear activation function since tanh and rectified linear (ReLU) are both nonlinear functions; see Col. 7, lines 30-36. Therefore, the inner network is constructed using the multivariate nonlinear activation function.).
LaBute fails to teach However, Xiao teaches the modeling of the multivariate nonlinear activation function using a multilayer perceptron (MLP) having a plurality of input arguments and at least one output terminal (See Paragraph 0049, Xiao recites “the first MLP 72 includes multiple layers of nodes in a directed graph, with each layer fully connected to the next one. In the exemplary embodiment, u.sub.b can be a representation 74 of the natural utterance computed by the multilayer perceptron (MLP) 72 over bags of n-grams of the sentence such that u.sub.b=s(W.sub.2(s(W.sub.1(n)), where n is the n-gram representation of the sentence, W.sub.1, W.sub.2 are parameter matrices, and s is a non-linear activation function that maps the weighted inputs to the output of each neuron”, which is interpreted as the multilayer perceptron forming a non-linear activation that maps multiple, or plurality of, inputs to a single output, or terminal, making this the multivariate nonlinear activation function.). It would have been obvious to a person of ordinary skill in art before the effective filling date of the invention to implement the function of Xiao into the method of LaBute to model the multivariate nonlinear activation layer with a multilayer perceptron. The modification would have been obvious because one of the ordinary skills of the art would be motivated to utilize the feature of Xiao as all the references are in the field of activation functions. A person of ordinary skill of the art would have been motivated to map multiple inputs to a single output for each layer to learn and represent complex, nonlinear relationships. Regarding Claim 11, While LaBute teaches the computer device of claim 10 comprising the inner network constructor constructing an inner network using a multivariate nonlinear activation function (See Col. 
1 lines 53-58, LaBute recites “The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor”, where the computer components represent the computer device. Then, LaBute recites the “thread usage” is “produced by” the “required threads determiner 302”, See Col. 5 lines 43-44. Furthermore, LaBute recites “Required threads determiner 302 determines an estimate of a number of required processing threads at a given time. The estimate is based at least in part on historical usage data (e.g., historical processor queue data). Required threads determiner 302 determines the number of required threads at a regular interval (e.g., a new determination is made every 5 minutes, every 10 minutes, every hour, etc.). where the “Required threads determiner 302 provides multi-scale seasonality (e.g., daily, weekly, monthly, and annually) patterns, learned from the number-of-threads time series determined”, which means that the thread usage has multiple inputs at a given interval of time, see Col. 5, lines 48-63. Then, LaBute recites “In some embodiments, neuron layer(s) 404 are separately pre-trained on thread usages”, which is interpreted as the construction of the neuron layer(s) 404, or the inner network and the pre-training on thread usages represents the inner network constructor, see Col. 7 lines 25-26. Furthermore, LaBute recites “Neuron layer(s) 404 comprises one or more neuron layers for data processing—for example, a feedforward neuron layer, a recurrent neuron layer, a long short term memory layer, a dense layer, a dropout layer, etc. Neuron layer(s) 404 feed into neuron layer(s) 410. 
The layer(s) of neuron layer(s) 404 comprise an output function, for example a rectified linear output function or a tan h output function”, where the neuron layer(s) 404 represent the inner network and the inputs from the thread usages are used for the rectified linear or tanh output function, which represents the multivariate nonlinear activation function since tanh and rectified linear (ReLU) are both nonlinear functions, see Col. 7 lines 30-36. Therefore, the inner network is constructed with pre-training, or the inner network constructor, using the multivariate nonlinear activation function.), LaBute fails to teach the modeling of the multivariate nonlinear activation function using a multilayer perceptron (MLP) having a plurality of input arguments and at least one output terminal. However, Xiao teaches the modeling of the multivariate nonlinear activation function using a multilayer perceptron (MLP) having a plurality of input arguments and at least one output terminal (See Paragraph 0049, as cited above with respect to Claim 1.). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to implement the function of Xiao into the method of LaBute to model the multivariate nonlinear activation layer with a multilayer perceptron. The modification would have been obvious because one of ordinary skill in the art would be motivated to utilize the feature of Xiao, as all the references are in the field of activation functions. Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over LaBute (US 10761897 B2) in view of Liu (US 12380318 B2). Regarding Claim 3, while LaBute teaches the construction of the inner network (See Col. 7, lines 25-31, LaBute recites “In some embodiments, neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400, where they are “frozen”, i.e. 
the parameters of these layers are not allowed to change during training of the full model 400”. Furthermore, the neuron layer(s) 404 are inserted between the input layer 402 and the neuron layer(s) 410 of the full model 400, see Fig. 4. This is interpreted as the neuron layers, or the inner network, being pretrained and inserted between the layers of the main network model, or the outer network), LaBute fails to teach the constructing of the inner network using a convolution with a preset size. However, Liu teaches the construction of the inner network using a convolution with a preset size (See Col. 2 lines 37-43, Liu teaches “A generator network is constructed, and the network sequentially consists of: fully connected layer-upsampling layer-convolutional layer-upsampling layer-convolutional layer-output layer, herein the number of nodes in the fully connected layer is 16*the number of spectral wavebands, the convolutional layer is one-dimensional convolution, the size of a convolution kernel is 1×5”, which is interpreted as the network being constructed with a convolution with a preset size.). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to implement the function of Liu into the method of LaBute to construct the inner network with a convolution with a preset size. The modification would have been obvious because one of ordinary skill in the art would be motivated to utilize the feature of Liu, as all the references are in the field of network construction. A person of ordinary skill in the art would have been motivated to construct the inner network with a convolution with a preset size to have faster training, efficient use of memory, and lower overfitting due to defined layer sizes. Claims 4 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over LaBute (US 10761897 B2) in view of Biadsy (US 20220414542 A1). 
Regarding Claim 4, while LaBute teaches the method of claim 1, wherein the training comprises merging the constructed inner network with the outer network (See Col. 7, lines 25-31, LaBute recites “In some embodiments, neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400, where they are “frozen”, i.e. the parameters of these layers are not allowed to change during training of the full model 400”. Furthermore, the neuron layer(s) 404 are inserted between the input layer 402 and the neuron layer(s) 410 of the full model 400, see Fig. 4. This is interpreted as the neuron layers, or the inner network, being pretrained and merged with the main network model, or the outer network), LaBute fails to teach the providing of the inner network between the hidden layers of the outer network. However, Biadsy teaches the providing of the inner network between the hidden layers of the outer network (See Paragraph 0049, Biadsy recites “the submodel can be inserted into the base model at an intermediate or hidden location” that can be “one or more additional” “hidden layers”, which is interpreted as the sub-model, or the inner network, being inserted in between the hidden layers of the base model, or the outer network, since it can be an additional hidden layer.). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to implement the function of Biadsy into the method of LaBute to insert the inner network between the hidden layers of the outer network. The modification would have been obvious because one of ordinary skill in the art would be motivated to utilize the feature of Biadsy, as all the references are in the field of merging models. A person of ordinary skill in the art would have been motivated to train both models simultaneously instead of separately to save time. 
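Purely as an illustrative sketch (the layer names, dictionary representation, and helper functions below are invented for this note and do not appear in LaBute or Biadsy), the pattern at issue in this rejection, a pre-trained inner network spliced between the hidden layers of an outer network with its parameters frozen, can be modeled as:

```python
# Hypothetical sketch only: layers are modeled as labeled dicts, not real
# neural network modules, to show the splice-and-freeze pattern.

def make_layer(name):
    """Stand-in for a network layer; a label plus a trainable/frozen flag."""
    return {"name": name, "frozen": False}

def freeze(layers):
    """Mark layers as frozen so their parameters would not change during
    training of the full model (cf. the 'frozen' neuron layers 404)."""
    for layer in layers:
        layer["frozen"] = True
    return layers

def insert_inner_network(outer_layers, inner_layers, position):
    """Return a merged stack with inner_layers spliced in at `position`,
    i.e. between two hidden layers of the outer network."""
    return outer_layers[:position] + inner_layers + outer_layers[position:]

outer = [make_layer("input"), make_layer("hidden_1"),
         make_layer("hidden_2"), make_layer("output")]
inner = freeze([make_layer("pretrained_a"), make_layer("pretrained_b")])

# Splice the pre-trained inner network between hidden_1 and hidden_2.
merged = insert_inner_network(outer, inner, position=2)
print([layer["name"] for layer in merged])
# ['input', 'hidden_1', 'pretrained_a', 'pretrained_b', 'hidden_2', 'output']
```

In this toy reading, only the frozen inner layers keep their parameters fixed; the surrounding outer layers remain trainable, which matches the freeze-then-train interpretation applied to LaBute above.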
Regarding Claim 12, while LaBute teaches the computer device of claim 10, wherein the inner network constructor is configured to merge the constructed inner network and the outer network (See Col. 7, lines 25-31, LaBute recites “In some embodiments, neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400, where they are “frozen”, i.e. the parameters of these layers are not allowed to change during training of the full model 400”. Furthermore, the neuron layer(s) 404 are inserted between the input layer 402 and the neuron layer(s) 410 of the full model 400, see Fig. 4. This is interpreted as the neuron layers, or the inner network, being pretrained with thread usages, or the inner network constructor, and merged with the main network model, or the outer network), LaBute fails to teach that the inner network is inserted between hidden layers of the outer network. However, Biadsy teaches the providing of the inner network between the hidden layers of the outer network (See Paragraph 0049, Biadsy recites “the submodel can be inserted into the base model at an intermediate or hidden location” that can be “one or more additional” “hidden layers”, which is interpreted as the sub-model, or the inner network, being inserted in between the hidden layers of the base model, or the outer network, since it can be an additional hidden layer.). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to implement the function of Biadsy into the method of LaBute to insert the inner network between the hidden layers of the outer network. The modification would have been obvious because one of ordinary skill in the art would be motivated to utilize the feature of Biadsy, as all the references are in the field of merging models. 
A person of ordinary skill in the art would have been motivated to train both models simultaneously instead of separately to save time. Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over LaBute (US 10761897 B2) in view of Santhar (US 20230229904 A1). Regarding Claim 5, while LaBute teaches the method of claim 1, wherein the training comprises the merging of the constructed inner network with the outer network (See Col. 7, lines 25-31, LaBute recites “In some embodiments, neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400, where they are “frozen”, i.e. the parameters of these layers are not allowed to change during training of the full model 400”. Furthermore, the neuron layer(s) 404 are inserted between the input layer 402 and the neuron layer(s) 410 of the full model 400, see Fig. 4. This is interpreted as the neuron layers, or the inner network, being pretrained and inserted between the hidden layers of the main network model, or the outer network), LaBute fails to teach the merging of the constructed inner network and the outer network through a slice and concatenation operation from a depth dimension of the inner network. However, Santhar teaches the merging of the inner network and the outer network through a slice and concatenation operation from a depth dimension of the inner network (See Paragraph 0022, Santhar recites “The present invention proposes a system where multiple supervised models, once trained, are sliced at each layer taking the output from the attention layers. These slices are fed to a shallow neural network which takes the inputs from the sliced portion of the network and outputs the multiple classes of different features that contribute to the detection of the overall object. 
The independent features are determined, and accuracy is calculated for each of these features after every slice” where the slices are “stacked together to form a composite model”, which is interpreted as the supervised model, or the inner network, being sliced and fed, or concatenated, to the shallow neural network, or the outer network; the features of each slice, which refer to the depth dimension of the inner network, are determined for accuracy, and the slices finally stack together to form a composite model, or the combined model.). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to implement the function of Santhar into the method of LaBute to merge the inner network with the outer network through slicing and concatenating. The modification would have been obvious because one of ordinary skill in the art would be motivated to utilize the feature of Santhar, as all the references are in the field of combining networks. A person of ordinary skill in the art would have been motivated to perform the combination for having an accurate combined model (“The features with high accuracy across heterogenous model slices are picked and concatenated in sequence of low level to high level features. The overall accuracy of the final model is contributed by individual accuracies of each slice that are connected together” as suggested by Santhar at Paragraph 0022.). Claims 7, 8, 14, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over LaBute (US 10761897 B2) in view of Tang (US 11636348 B2). Regarding Claim 7, while LaBute teaches the method of claim 6, wherein the training comprises simultaneously training the inner network and the outer network to generate the combination model by merging the pretrained inner network and the outer network (See Col. 
7 lines 25-28, LaBute recites “neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400”. Furthermore, LaBute recites “Model 400 additionally receives training data. Training data comprises data for training model 400 (e.g., to make correct decisions). For example, training data comprises a resource utilization score received from a resource utilization score determiner. Model 400 is trained in one or more ways, include using reinforcement learning using a resource utilization score and a provisioner performance reward, pre-training training neural network layer(s) 404 using seasonal job request data and then freezing these before training the full network, or training using supervised learning with historical data”, see Col. 8 lines 4-14. Therefore, the neuron layers 404, or the inner network, are pretrained and the model 400 is trained, where the neuron layers 404 are merged with the model 400 and are trained as well.), LaBute fails to teach that the networks are merged through parameter sharing. However, Tang teaches that the networks are merged through parameter sharing (See Col. 21 lines 4-12, Tang recites “For example, the discriminative DNN may be trained to predict a class label, or recognize abnormal activities in the scene (supervised), while the generative DNN can be trained to reconstruct its own input (unsupervised). In this case, instead of training two separate networks completely independently, model parameters or weights may be shared between the two training regimes, resulting in a single model that can make use of both unlabeled and labeled data”, which is interpreted as two networks sharing parameters to create a combined model.). 
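As a hypothetical illustration (the toy branches and single weight vector below are invented here and are not drawn from Tang), parameter sharing in the sense of the quoted passage, two training regimes reading one parameter set so that a step taken on either branch changes the single combined model, can be sketched as:

```python
# Hypothetical sketch only: two "networks" (a discriminative branch and a
# generative branch) both read the SAME parameter dict, so a training step
# taken on either branch updates the one shared model.

shared = {"w": [0.1, 0.2]}  # the single parameter set shared by both branches

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def classifier(x):
    """Discriminative branch: a toy class label from the shared weights."""
    return 1 if dot(shared["w"], x) > 0 else 0

def reconstructor(x):
    """Generative branch: a toy 'reconstruction' scaled by the squared
    norm of the same shared weights."""
    scale = dot(shared["w"], shared["w"])
    return [xi * scale for xi in x]

def training_step(delta):
    """A gradient-like update made while training EITHER branch; because
    the parameters are shared, both branches see the change."""
    shared["w"] = [wi + d for wi, d in zip(shared["w"], delta)]

training_step([0.05, 0.05])  # e.g. a step taken by the generative regime
# classifier now also reads the updated shared weights
```

This is the sense in which the two regimes yield “a single model”: there is one weight set, not two independently trained copies.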
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to implement the function of Tang into the method of LaBute to merge the inner network and outer network through parameter sharing. The modification would have been obvious because one of ordinary skill in the art would be motivated to utilize the feature of Tang, as all the references are in the field of parameter sharing. A person of ordinary skill in the art would have been motivated to have a single model with features of both the separated models (Since the discriminative DNN “predicts class labels” and the generative DNN “reconstructs its own input”, then “in this case, instead of training two separate networks completely independently, model parameters or weights may be shared between the two training regimes, resulting in a single model that can make use of both unlabeled and labeled data” as suggested by Tang at Col. 21 lines 4-12.). Regarding Claim 8, LaBute teaches the method of claim 7, wherein the training comprises fixing the trained inner network and then initializing the trained outer network, and retraining the initialized outer network (See Col. 7 lines 24-29, LaBute recites “neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400, where they are ‘frozen’, i.e. the parameters of these layers are not allowed to change during training of the full model 400”, which is interpreted as the neuron layers 404 being frozen, or fixed, while the full model 400, or the outer network, is being initialized. Furthermore, LaBute recites “Model 400 additionally receives training data. Training data comprises data for training model 400 (e.g., to make correct decisions). For example, training data comprises a resource utilization score received from a resource utilization score determiner. 
Model 400 is trained in one or more ways, include using reinforcement learning using a resource utilization score and a provisioner performance reward, pre-training training neural network layer(s) 404 using seasonal job request data and then freezing these before training the full network, or training using supervised learning with historical data” where the full model 400, or the outer network, is retrained after fixing the neuron layers 404, or the inner network, with the training data, see Col. 8 lines 4-14.). Regarding Claim 14, LaBute teaches the computer device of claim 13, wherein the model trainer is configured to simultaneously train the inner network and the outer network to generate the combination model by merging the pretrained inner network and the outer network (See Col. 7 lines 25-28, LaBute recites “neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400”. Furthermore, LaBute recites “Model 400 additionally receives training data. Training data comprises data for training model 400 (e.g., to make correct decisions). For example, training data comprises a resource utilization score received from a resource utilization score determiner. Model 400 is trained in one or more ways, include using reinforcement learning using a resource utilization score and a provisioner performance reward, pre-training training neural network layer(s) 404 using seasonal job request data and then freezing these before training the full network, or training using supervised learning with historical data”, see Col. 8 lines 4-14. Therefore, the neuron layers 404, or the inner network, are pretrained and the model 400 is trained with reinforcement learning using the resource utilization score and provisioner performance reward, or the model trainer, and the neuron layers 404 are merged with the model 400 and are trained as well.). 
LaBute fails to teach that the networks are merged through parameter sharing. However, Tang teaches that the networks are merged through parameter sharing (See Col. 21 lines 4-12, Tang recites “For example, the discriminative DNN may be trained to predict a class label, or recognize abnormal activities in the scene (supervised), while the generative DNN can be trained to reconstruct its own input (unsupervised). In this case, instead of training two separate networks completely independently, model parameters or weights may be shared between the two training regimes, resulting in a single model that can make use of both unlabeled and labeled data”, which is interpreted as two networks sharing parameters to create a combined model.). It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to implement the function of Tang into the method of LaBute to merge the inner network and outer network through parameter sharing. The modification would have been obvious because one of ordinary skill in the art would be motivated to utilize the feature of Tang, as all the references are in the field of parameter sharing. A person of ordinary skill in the art would have been motivated to have a single model with features of both the separated models (Since the discriminative DNN “predicts class labels” and the generative DNN “reconstructs its own input”, then “in this case, instead of training two separate networks completely independently, model parameters or weights may be shared between the two training regimes, resulting in a single model that can make use of both unlabeled and labeled data” as suggested by Tang at Col. 21 lines 4-12.). Regarding Claim 15, LaBute teaches the device of claim 14, wherein the model trainer is configured to fix the trained inner network and then initializing the trained outer network, and retraining the initialized outer network (See Col. 
7 lines 25-30, LaBute recites “neuron layer(s) 404 are separately pre-trained on thread usages so they learn and encode the seasonality of thread usage and are then inserted into the main network model 400, where they are “frozen”, i.e. the parameters of these layers are not allowed to change during training of the full model 400”, which is interpreted as the neuron layers 404 being frozen, or fixed, while the full model 400 is being initialized. Furthermore, LaBute recites “Model 400 additionally receives training data. Training data comprises data for training model 400 (e.g., to make correct decisions). For example, training data comprises a resource utilization score received from a resource utilization score determiner. Model 400 is trained in one or more ways, include using reinforcement learning using a resource utilization score and a provisioner performance reward, pre-training training neural network layer(s) 404 using seasonal job request data and then freezing these before training the full network, or training using supervised learning with historical data”, where the full model 400, or the outer network, is retrained after fixing the inner network and providing training data, and the reinforcement learning using the resource utilization score and provisioner performance reward is the model trainer, see Col. 8, lines 8-14.).

Conclusion

11. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. LaBute (US 10761897 B2) teaches the construction of the combination model by merging the constructed inner network with an outer network. 12. Any inquiry concerning this communication or earlier communications from the examiner should be directed to YASIN A HASSAN whose telephone number is (571)272-1567. The examiner can normally be reached Mon-Fri. 8am-5pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. 
To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Al Kawsar can be reached at (571) 270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /YASIN ABDULLAH HASSAN/Examiner, Art Unit 2127 /ABDULLAH AL KAWSAR/ Supervisory Patent Examiner, Art Unit 2127

Prosecution Timeline

Feb 02, 2023
Application Filed
Feb 23, 2026
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12572799
METHODS FOR RELIABLE OVER-THE-AIR COMPUTATION AND FEDERATED EDGE LEARNING
2y 5m to grant Granted Mar 10, 2026
Patent 12541568
Method, System, and Computer Program Product for Recurrent Neural Networks for Asynchronous Sequences
2y 5m to grant Granted Feb 03, 2026
Patent 12536434
Computing Method And Apparatus For Convolutional Neural Network Model
2y 5m to grant Granted Jan 27, 2026
Patent 11501195
SYSTEMS AND METHODS FOR QUANTUM PROCESSING OF DATA USING A SPARSE CODED DICTIONARY LEARNED FROM UNLABELED DATA AND SUPERVISED LEARNING USING ENCODED LABELED DATA ELEMENTS
2y 5m to grant Granted Nov 15, 2022
Patent 11455545
Computer-Implemented System And Method For Building Context Models In Real Time
2y 5m to grant Granted Sep 27, 2022
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
79%
Grant Probability
99%
With Interview (+58.0%)
4y 11m
Median Time to Grant
Low
PTA Risk
Based on 395 resolved cases by this examiner. Grant probability derived from career allow rate.
