Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments, see page 7, filed 10/31/2025, with respect to the objection to the specification have been fully considered and are persuasive. Therefore, the objection to the specification has been withdrawn.
Applicant's arguments, see page 7, filed 10/31/2025, with respect to the rejection of claims 1-10 under 35 U.S.C. § 112 have been fully considered and are persuasive. Therefore, the rejection of claims 1-10 under 35 U.S.C. § 112 has been withdrawn.
Applicant's arguments, see pages 7-11, filed 10/31/2025, with respect to the rejection of claims 1-10 under 35 U.S.C. § 101 have been fully considered and are persuasive. Therefore, the rejection of claims 1-10 under 35 U.S.C. § 101 has been withdrawn.
Applicant's arguments, see pages 12-15, filed 10/31/2025, with respect to the rejection of claims 1-10 under 35 U.S.C. § 103 have been fully considered and are moot in view of the new grounds of rejection below.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a): (a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112: The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 1-10 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
With respect to claims 1, 5, and 6, the instant specification does not describe a “second function in which each of the adjusted weights is set to each weight of the each different function combined linearly by the first function”. This limitation is interpreted as incorporating new matter into the claims, as it lacks support in the original disclosure. The remaining claims are rejected based on their dependence from the rejected claims.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-10 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claims 1 and 5-6 recite “a second function in which each of the adjusted weights is set to each weight of the each different function combined linearly by the first function”, which is indefinite: it is unclear what is meant by “each of the adjusted weights is set to each weight of the each different function combined linearly by the first function”, and the specification does not provide a definition. For examination purposes, the examiner will interpret “each of the adjusted weights is set to each weight of the each different function combined linearly by the first function” to mean “each of the adjusted weights is set to each weight of the first function”. Further clarification is required.
Claim 10 recites “specify a type of input data based on characteristics of the input data”, which is indefinite: it is unclear what is meant by “specify a type of input data based on characteristics of the input data”, and the specification does not provide a definition. For examination purposes, the examiner will interpret “specify a type of input data based on characteristics of the input data” to mean “specify a type of input data based on the neural network’s input data type”. Further clarification is required.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-2 and 5-8 are rejected under 35 U.S.C. 103 as being unpatentable over Zamora (US20200005143A1) in view of WANG (US20210334587A1).
Regarding claim 1, Zamora teaches An information processing apparatus comprising a memory and one or a plurality of processors, wherein the memory stores: a learning model using a neural network ([0015] Turning now to FIG. 1, a training architecture 10 is shown in which a neural network (NN) model 14 receives training data.)
each different function that is usable in a hidden layer of the neural network ([0015] In an embodiment, the NN model 14 includes one or more layers of neurons, where each neuron calculates a weighted sum of the inputs to the neuron, adds a bias, and then decides the extent to which the neuron should be fired/activated in accordance with an activation function that is dedicated and/or specific to the neuron.)
apply the first function commonly to nodes within at least one of node groups of each layer in a hidden layer of the learning model ([0015] In an embodiment, the NN model 14 includes one or more layers of neurons, where each neuron calculates a weighted sum of the inputs to the neuron, adds a bias, and then decides the extent to which the neuron should be fired/activated in accordance with an activation function that is dedicated and/or specific to the neuron.)
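For clarity of the mapping, the per-neuron computation described in Zamora [0015] can be illustrated with a minimal sketch; the function and variable names below are the examiner's own illustration and do not appear in the reference:

```python
# Minimal sketch of the neuron computation Zamora [0015] describes: a weighted
# sum of the inputs, plus a bias, passed through an activation function
# specific to the neuron. Names and values are illustrative only.
import math

def neuron_output(inputs, weights, bias, activation):
    z = sum(w * x for w, x in zip(weights, inputs)) + bias  # weighted sum + bias
    return activation(z)                                    # neuron-specific activation

sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
print(neuron_output([0.5, -1.2], [0.8, 0.3], bias=0.1, activation=sigmoid))
```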
perform learning by inputting the acquired labeled learning data to the learning model in which the first function has been applied to the hidden layer ([0015], Fig. 1; Zamora's Fig. 1 training-architecture diagram is reproduced in the record as media_image1.png. The examiner notes that Zamora teaches inputting training data from a database into a neural network that has activation functions that activate the neurons of the neural network. Furthermore, the examiner notes that WANG was shown to teach the use of labeled data.)
when learning the learning model, update a parameter of a neural network of the learning model by error back propagation, based on the labeled learning data; adjust each weight of the each different function combined linearly by the first function when the parameter of the neural network is updated; and produce, after the learning model is learned, a second function in which each of the adjusted weights is set to each weight of the each different function combined linearly by the first function ([0016] In general, error functions are calculated during forward propagation of the data from the database 12 through the NN model 14, with the weights 16 and fractional derivative values 18 being adjusted during backpropagation to reduce error. Once the NN model 14 achieves (e.g., converges on) a target level of accuracy (e.g., acceptable level of error), the NN model 14 may be output as a trained model that is used to draw real-time inferences (e.g., automotive navigation inferences, speech recognition inferences, etc.).)
the adjusting of the each weight comprises iteratively modifying weight values during backpropagation to minimize a loss function of the learning model ([0016] In the illustrated example, a plurality of weights 16 and fractional derivative values 18 for multiple activation functions are iteratively input to the NN model 14. In general, error functions are calculated during forward propagation of the data from the database 12 through the NN model 14, with the weights 16 and fractional derivative values 18 being adjusted during backpropagation to reduce error.)
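The claimed arrangement, as the examiner reads it onto Zamora [0016], can be sketched as follows. This is an illustrative reconstruction that approximates the gradient by finite differences in place of analytic backpropagation; none of the names or values below come from the reference:

```python
# Sketch of the claimed scheme: a "first function" linearly combines several
# base activations with trainable weights; training adjusts those weights to
# reduce a loss, and the converged weights are frozen into a "second function".
import math

base_fns = [math.tanh, lambda z: max(0.0, z)]   # illustrative base activations
mix_w = [0.5, 0.5]                              # trainable mixture weights

def first_function(z):                          # weighted linear combination
    return sum(w * f(z) for w, f in zip(mix_w, base_fns))

# One illustrative update step on a toy squared-error loss at a single point.
z_in, target, lr, eps = 1.0, 0.9, 0.1, 1e-6
loss = lambda: (first_function(z_in) - target) ** 2
grads = []
for i in range(len(mix_w)):                     # numerical gradient per weight
    mix_w[i] += eps; up = loss()
    mix_w[i] -= 2 * eps; down = loss()
    mix_w[i] += eps
    grads.append((up - down) / (2 * eps))
for i, g in enumerate(grads):
    mix_w[i] -= lr * g                          # adjust each mixture weight

frozen = list(mix_w)                            # weights fixed after training
second_function = lambda z: sum(w * f(z) for w, f in zip(frozen, base_fns))
print(second_function(z_in))
```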
However, Zamora is not relied upon to explicitly teach the one or a plurality of processors configured to: acquire labeled learning data. Zamora is also not relied upon to explicitly teach a first function that is produced by weighting and linearly combining each of the different functions.
On the other hand, WANG teaches the one or a plurality of processors configured to: acquire labeled learning data ([0049] Additionally, the training method of FIG. 1 includes a step of adjusting network parameters characterizing the convolutional neural network through a training loss function based on the target feature vectors and prelabeled defect labels corresponding to different types of defects. The examiner notes that Zamora and WANG are both directed to machine learning and both are reasonably analogous to each other. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified Zamora’s data acquisition to incorporate the one or a plurality of processors configured to: acquire labeled learning data as taught by WANG [0049] to allow for adjusting network parameters characterizing the convolutional neural network [0049])
Furthermore, WANG teaches a first function that is produced by weighting and linearly combining each of the different functions ([0049] Optionally, the training loss function includes one loss function or a weighted linear combination of at least two different loss functions. The examiner notes that Zamora and WANG are both directed to machine learning and both are reasonably analogous to each other. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified Zamora’s training routine to incorporate a first function that is produced by weighting and linearly combining each of the different functions as taught by WANG [0049] to allow for adjusting network parameters characterizing the convolutional neural network [0049])
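The “weighted linear combination” language of WANG [0049] is directed to loss functions; a minimal sketch of such a combined training loss, with examiner-chosen placeholder loss terms and weights not taken from WANG, is:

```python
# Sketch of a training loss built as a weighted linear combination of at
# least two different loss functions, per WANG [0049]. The component losses
# and weights below are illustrative placeholders.
def combined_loss(pred, target, losses, weights):
    return sum(w * L(pred, target) for w, L in zip(weights, losses))

mse = lambda p, t: (p - t) ** 2      # squared-error term
mae = lambda p, t: abs(p - t)        # absolute-error term
print(combined_loss(0.7, 1.0, losses=[mse, mae], weights=[0.8, 0.2]))
```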
Regarding claim 2, Zamora teaches select, when an activation function is used for each of the different functions, one of a first group including smoothed activation functions and a second group including arbitrary activation functions, and activation functions in the selected first or second group are used as a plurality of functions to be used in the first function ([0017] FIG. 2A demonstrates that activation functions may be organized into groups based on the derivative behavior of the activation functions. In the illustrated example, the first derivative of a softplus function 20 is a sigmoid function 22 and the second derivative of the softplus function 20 is a radial basis function (RBF) 24 that is similar to a Gaussian function.)
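The derivative relationships Zamora [0017] relies on for grouping activation functions can be verified numerically; the check below is the examiner's illustration and is not code from the reference:

```python
# Numerical check of Zamora [0017]: the first derivative of softplus is the
# sigmoid function, and the derivative of sigmoid is sigmoid(z)(1 - sigmoid(z)),
# a bell-shaped (RBF-like, approximately Gaussian) curve.
import math

softplus = lambda z: math.log1p(math.exp(z))
sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))

z, h = 0.5, 1e-6
d_softplus = (softplus(z + h) - softplus(z - h)) / (2 * h)     # central difference
assert abs(d_softplus - sigmoid(z)) < 1e-6                     # softplus' == sigmoid
d_sigmoid = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
assert abs(d_sigmoid - sigmoid(z) * (1 - sigmoid(z))) < 1e-6   # bell-shaped derivative
```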
Claim 5 is rejected based upon the same rationale as the rejection of claim 1 since it is the method claim corresponding to the apparatus claim.
Claim 6 is rejected based upon the same rationale as the rejection of claim 1 since it is the non-transitory computer readable medium claim corresponding to the apparatus claim.
Regarding claim 7, Zamora teaches The information processing apparatus according to claim 1. However, Zamora is not relied upon to explicitly teach wherein the learning model comprises at least one of a convolutional neural network (CNN), a recurrent neural network (RNN), a deep neural network (DNN), a long short-term memory (LSTM) network, a deep Q-network (DQN), a variational autoencoder (VAE), generative adversarial networks (GANs), and a flow-based production model. On the other hand, WANG teaches wherein the learning model comprises at least one of a convolutional neural network (CNN), a recurrent neural network (RNN), a deep neural network (DNN), a long short-term memory (LSTM) network, a deep Q-network (DQN), a variational autoencoder (VAE), generative adversarial networks (GANs), and a flow-based production model ([0034] Visual inspection on microdefects in electronic devices based on human experience is not reliable and very low in efficiency. Neural network or particularly Convolutional Neural Network (CNN) can be used to identify and classify all solder joint images of the BGA chip with high efficiency and high reliability. The examiner notes that Zamora and WANG are both directed to neural networks and both are reasonably analogous to each other. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified Zamora’s neural network model to incorporate wherein the learning model comprises at least one of a convolutional neural network (CNN), a recurrent neural network (RNN), a deep neural network (DNN), a long short-term memory (LSTM) network, a deep Q-network (DQN), a variational autoencoder (VAE), generative adversarial networks (GANs), and a flow-based production model as taught by WANG [0034] to improve the efficiency and reliability of inspections of microdefects in electronic devices [0034])
Regarding claim 8, Zamora teaches The information processing apparatus according to claim 1. However, Zamora is not relied upon to explicitly teach wherein the labeled learning data comprises at least one of image data, series data, and text data, and wherein the one or a plurality of processors are configured to perform at least one of classifying, producing, and optimizing the prescribed learning data. On the other hand, WANG teaches wherein the labeled learning data comprises at least one of image data, series data, and text data, and wherein the one or a plurality of processors are configured to perform at least one of classifying, producing, and optimizing the prescribed learning data ([0049] In the CNN, different types of defects have been pre-labeled with corresponding defect labels. For example, a label of solder-joint bubble is classified as a first type of defect label, a label of solder-joint bridge is classified as a second type of defect label, a label of solder-joint size irregularity is classified as a third type of defect label, and a label of cold joint is classified as a fourth type of defect label. The examiner notes that WANG teaches classifying input images of solder defects according to the label of the defect. The examiner further notes that Zamora and WANG are both directed to neural networks and both are reasonably analogous to each other. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified Zamora’s neural network model to incorporate wherein the labeled learning data comprises at least one of image data, series data, and text data, and wherein the one or a plurality of processors are configured to perform at least one of classifying, producing, and optimizing the prescribed learning data as taught by WANG [0049] to extract target feature vectors respectively associated with multiple solder joint images in the training sample set [0048])
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Zamora (US20200005143A1) in view of WANG (US20210334587A1), and further in view of Liang (US20220300823A1).
Regarding claim 3, Zamora teaches The information processing apparatus according to claim 1. However, Zamora is not relied upon to explicitly teach each of the functions is one of a normalization function, a standardization function, a denoising operation function, a smoothing function, and a regularization function. On the other hand, Liang teaches each of the functions is one of a normalization function, a standardization function, a denoising operation function, a smoothing function, and a regularization function ([0111] The normalization method may be batch normalization, layer normalization, group normalization, and the activation functions may be ReLU, sigmoid, tanh, etc. The examiner notes that Zamora and Liang are both directed to machine learning and both are reasonably analogous to each other. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified Zamora’s training routine to incorporate each of the functions is any one of a normalization function, a standardization function, a denoising operation function, a smoothing function, and a regularization function as taught by Liang [0111] to change data into a common scale for faster convergence during training [0111])
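Batch normalization, one of the normalization methods Liang [0111] names, can be sketched as follows; the epsilon constant and sample values are the examiner's illustrative assumptions, not taken from Liang:

```python
# Sketch of batch normalization per Liang [0111]: inputs are rescaled to a
# common scale (zero mean, unit variance) for faster convergence. The learned
# scale/shift parameters of full batch norm are omitted for brevity.
def batch_norm(xs, eps=1e-5):
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / (var + eps) ** 0.5 for x in xs]

print(batch_norm([2.0, 4.0, 6.0, 8.0]))   # roughly [-1.34, -0.45, 0.45, 1.34]
```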
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Zamora (US20200005143A1) in view of WANG (US20210334587A1), and further in view of Lin (US20170344848A1).
Regarding claim 4, Zamora teaches The information processing apparatus according to claim 1. However, Zamora is not relied upon to explicitly teach associate the second function and a type of the labeled learning data with each other and store the association in the memory. On the other hand, Lin teaches associate the second function and a type of the labeled learning data with each other and store the association in the memory ([0061] At operation 604, the computer system accesses a second training dataset. The second training dataset corresponds to a second training domain. Further, the second training dataset includes image or non-image data and a set of data labels. The second training domain also includes a second training tasks defined based on the data labels and associated with a second training loss function. The examiner notes that Zamora and Lin are both directed to machine learning and both are reasonably analogous to each other. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified Zamora’s training routine to incorporate associate the second function and a type of the labeled learning data with each other and store the association in the memory as taught by Lin [0061] to train a machine learning model across multiple training datasets and multiple training tasks [0049].)
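The claimed association can be illustrated as a simple in-memory mapping; this is the examiner's sketch under assumed placeholder names, not Lin's implementation:

```python
# Sketch of associating a produced "second function" with the type of the
# labeled learning data and storing the association in memory. The data-type
# key and the function body are illustrative placeholders.
second_function = lambda z: 0.6 * max(0.0, z) + 0.4 * z   # placeholder learned mixture
stored_functions = {}                                     # in-memory association
stored_functions["image"] = second_function               # data type -> second function
```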
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Zamora (US20200005143A1) in view of WANG (US20210334587A1), and further in view of KUEHNEL (US20210095995A1).
Regarding claim 9, Zamora teaches The information processing apparatus according to claim 1. However, Zamora is not relied upon to explicitly teach compare a first evaluation result obtained using a single activation function with a second evaluation result obtained using the second function; and apply the second function to the learning model when the second evaluation result indicates higher accuracy than the first evaluation result. On the other hand, KUEHNEL teaches compare a first evaluation result obtained using a single activation function with a second evaluation result obtained using the second function; and apply the second function to the learning model when the second evaluation result indicates higher accuracy than the first evaluation result ([0013] subdividing the data into training data and test data; setting a first target accuracy value for a first artificial neural network that includes linear and/or non-linear activation functions; training the first artificial neural network using the training data; inputting the test data into the trained first artificial neural network in order to obtain a first output value of the first artificial neural network; establishing a first output accuracy value based on a comparison result between the first output value and the test data; storing weightings and the linear and/or non-linear activation functions of the first artificial neural network in a memory unit of the inertial sensor if the first output accuracy value is greater than the first target accuracy value, or training the first artificial neural network again using the training data if the first output accuracy value is lower than the first target accuracy value; establishing an upper limitation value and a lower limitation value for a second artificial neural network that includes non-linear activation functions, based on a predefined constant and on the first output value of the first artificial neural network; training the second artificial neural network using the training data; inputting the test data into the trained second artificial neural network in order to obtain a second output value of the second artificial neural network; comparing the second output value of the second artificial neural network with a value range from the upper limitation value to the lower limitation value; establishing a third output value on the second output value if the second output value is within the value range, or on the first output value if the second output value is not within the value range; and storing weightings and the non-linear activation functions of the second artificial neural network and the predefined constant in the memory unit. The examiner notes that Zamora and KUEHNEL are both directed to neural networks and both are reasonably analogous to each other. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified Zamora’s neural network model to incorporate compare a first evaluation result obtained using a single activation function with a second evaluation result obtained using the second function; and apply the second function to the learning model when the second evaluation result indicates higher accuracy than the first evaluation result as taught by KUEHNEL [0013] to enable the self-calibration of an inertial sensor [0012])
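The comparison step, as mapped to KUEHNEL [0013], reduces to evaluating the model under each activation choice and keeping the more accurate one. In the examiner's sketch below, evaluate() is a hypothetical stand-in for any held-out accuracy measurement:

```python
# Sketch of selecting between a single activation function and the produced
# "second function" by comparing evaluation accuracy. evaluate() is assumed
# to return an accuracy score for a model using the given activation.
def select_activation(evaluate, single_fn, second_fn):
    acc_single = evaluate(single_fn)   # first evaluation result
    acc_second = evaluate(second_fn)   # second evaluation result
    return second_fn if acc_second > acc_single else single_fn
```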
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Zamora (US20200005143A1) in view of WANG (US20210334587A1), and further in view of TORKAMANI (US20190114531A1).
Regarding claim 10, Zamora teaches The information processing apparatus according to claim 1. However, Zamora is not relied upon to explicitly teach specify a type of input data based on characteristics of the input data; extract from the memory a stored second function corresponding to the specified type; and apply the extracted second function to a prescribed layer of the hidden layer of the learning model for performing inference on the input data. On the other hand, TORKAMANI teaches specify a type of input data based on characteristics of the input data; extract from the memory a stored second function corresponding to the specified type; and apply the extracted second function to a prescribed layer of the hidden layer of the learning model for performing inference on the input data ([0056] As an example, if the healthcare outcome to be predicted is associated with type-II diabetes, then the activation functions may be selected based on data gathered for diet, weight, blood sugar, age, gender, and the like. The examiner notes that Zamora and TORKAMANI are both directed to neural networks and both are reasonably analogous to each other. Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified Zamora’s neural network model to incorporate specify a type of input data based on characteristics of the input data; extract from the memory a stored second function corresponding to the specified type; and apply the extracted second function to a prescribed layer of the hidden layer of the learning model for performing inference on the input data as taught by TORKAMANI [0056] to allow for customized regularization functions [0055])
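The inference-time flow mapped to TORKAMANI [0056] can be sketched as a type lookup followed by application of the stored function; specify_type() below is a hypothetical stand-in for any classifier of input-data characteristics, and all names are the examiner's illustration:

```python
# Sketch: specify the type of the input data from its characteristics, extract
# the stored second function for that type, and apply it to a hidden layer's
# pre-activations during inference. All names here are illustrative.
def specify_type(input_data):
    return "series" if isinstance(input_data, list) else "image"

stored_functions = {"series": lambda z: max(0.0, z), "image": lambda z: z}

def infer_layer(input_data, pre_activations):
    fn = stored_functions[specify_type(input_data)]   # extract stored second function
    return [fn(z) for z in pre_activations]           # apply to the prescribed layer

print(infer_layer([1.0, 2.0], [-0.5, 0.3]))           # -> [0.0, 0.3]
```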
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
The following references have been determined to be related to the application, but were not applied in any specific rejection. They are nonetheless listed below for reference.
ZANPURE (US 2018/0268282 A1)
“ZANPURE teaches a method for predicting a non-linear relationship between a plurality of parameters in a deep neural network framework”
Subramanian (US 2019/0361808 A1)
“Subramanian teaches using a machine learning method for generating an estimated cache performance of a cache configuration”
Dinh (US 2020/0126263 A1)
“Dinh teaches a method for normalization by transforming the outputs of deep neural networks”
Galloway (Batch Normalization is a Cause of Adversarial Vulnerability)
“Galloway teaches mean-field analysis that shows that batch norm causes exploding gradients”
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAMCY ALGHAZZY whose telephone number is (571) 272-8824. The examiner can normally be reached Monday-Friday 8:00am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, OMAR FERNANDEZ RIVAS can be reached on (571) 272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SHAMCY ALGHAZZY/Examiner, Art Unit 2128
/OMAR F FERNANDEZ RIVAS/Supervisory Patent Examiner, Art Unit 2128