Prosecution Insights
Last updated: April 19, 2026
Application No. 18/252,231

DYNAMIC CONDITIONAL POOLING FOR NEURAL NETWORK PROCESSING

Non-Final OA: §101, §103, §112

Filed: May 09, 2023
Examiner: BALDWIN, RANDALL KERN
Art Unit: 2125
Tech Center: 2100 — Computer Architecture & Software
Assignee: Intel Corporation
OA Round: 1 (Non-Final)

Grant Probability: 80% (Favorable)
OA Rounds: 1-2
To Grant: 3y 5m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 80% (185 granted / 232 resolved; +24.7% vs TC avg, above average)
Interview Lift: +26.9% (strong; resolved cases with vs. without interview)
Avg Prosecution: 3y 5m (typical timeline); 12 applications currently pending
Total Applications: 244 (career history, across all art units)

Statute-Specific Performance

§101: 17.4% (-22.6% vs TC avg)
§103: 43.2% (+3.2% vs TC avg)
§102: 6.4% (-33.6% vs TC avg)
§112: 26.6% (-13.4% vs TC avg)

TC averages are estimates. Based on career data from 232 resolved cases.
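The headline numbers above are simple ratios over the examiner's 232 resolved cases. As a sanity check of the dashboard arithmetic, using only the figures shown on this page:

```python
# Recomputing the dashboard's headline examiner statistics from the
# figures shown above; no new data, just the arithmetic.
granted, resolved = 185, 232

allow_rate = granted / resolved
print(f"Career allow rate: {allow_rate:.1%}")   # 79.7%, displayed as 80%

# The "+24.7% vs TC avg" delta implies a Tech Center 2100 baseline of roughly:
implied_tc_avg = allow_rate - 0.247
print(f"Implied TC 2100 average: {implied_tc_avg:.1%}")
```

The per-statute percentages in the table work the same way: each is the examiner's career rate, with the signed delta giving the distance from the (estimated) Tech Center average.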

Office Action

§101 §103 §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is in response to the application filed 5/09/2023. Claims 1-20 are pending and have been examined. Claims 1-20 are rejected.

Priority

Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, 365(c), or 386(c) is acknowledged. The present application is a national stage application under 35 U.S.C. 371 of International Application No. PCT/CN2020/138906, filed on December 24, 2020.

Information Disclosure Statement

Acknowledgment is made of the information disclosure statements filed 5/9/2023, 8/14/2023 and 7/25/2025, which comply with 37 CFR 1.97. As such, the information disclosure statements have been placed in the application file and the information referred to therein has been considered by the examiner.

Drawings

The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because reference character “135” has been used to designate “Dynamic Conditional Pooling”, “Dynamic Pooling Conditioning”, “Weighting of Feature Pixels” and “Normalizing of Aggregated Features Conditioning” in FIG. 1; and because reference character “560” has been used to designate both a convolution operation ⊗ and the value X_L′ in FIG. 5.

The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference characters not mentioned in the description: reference characters 116, 120 and 135 shown in Figure 1 are not found in the detailed description (see, paragraph 22 describing FIG. 1); reference character 370 shown in Figure 3 is not found in the detailed description (see, paragraphs 34-37 describing FIG. 3); reference character 662 shown in Figure 6 is not found in the detailed description (see, paragraphs 49-50 describing FIG. 6); reference characters 755 and 770 shown in Figure 7 are not found in the detailed description (see, paragraphs 54-57 describing FIG. 7); and reference characters 800, 832, 834 and 836 shown in Figure 8 are not found in the detailed description (see, paragraph 61 describing FIG. 8 and reciting “N convolutional filters 932, and applying the generated soft weights α_1, α_2, …, α_N in a convolution operation 934 and generate an aggregated value X_L′ 936.”).

The drawings are further objected to as failing to comply with 37 CFR 1.84(p)(5) because they do not include the following reference signs mentioned in the description: 125 (see, paragraph 22 describing FIG. 1 and reciting “a deep neural network 125”); 310 (see, paragraphs 34-37 describing FIG. 3 and reciting “the value X̂_L 310”); 550 (see, paragraphs 44-45 describing FIG. 5 and reciting “illustrated convolution operation 550”); 770 (see, paragraphs 55-56 describing FIG. 7 and reciting “global average pooling (GAP) 707”); and 932, 934 and 936 (see, paragraph 61 describing FIG. 8 and reciting “N convolutional filters 932, and applying the generated soft weights α_1, α_2, …, α_N in a convolution operation 934 and generate an aggregated value X_L′ 936.”).

The drawings are additionally objected to as failing to comply with 37 CFR 1.84(p)(3) because Figures 4-9 include letters which do not measure at least .32 cm (1/8 inch) in height (i.e., most of the lowercase and subscript characters in FIGs. 4-9).

Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d).
If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification

The disclosure is objected to because of the following informalities:

Reference characters 116, 120 and 135 shown in Figure 1 are not found in the detailed description (see, e.g., paragraph 22 describing FIG. 1). Appropriate correction is required.

Reference character 370 shown in Figure 3 is not found in the detailed description (see, e.g., paragraphs 34-37 describing FIG. 3). Appropriate correction is required.

Reference character 662 shown in Figure 6 is not found in the detailed description (see, paragraphs 49-50 describing FIG. 6).

Reference characters 755 and 770 shown in Figure 7 are not found in the detailed description (see, paragraphs 54-57 describing FIG. 7).

Reference characters 800, 832, 834 and 836 shown in Figure 8 are not found in the detailed description (see, paragraph 61 describing FIG. 8 and reciting “N convolutional filters 932, and applying the generated soft weights α_1, α_2, …, α_N in a convolution operation 934 and generate an aggregated value X_L′ 936.” – examiner notes that it appears that recitations of 932, 934 and 936 in paragraph 61 should recite 832, 834 and 836).

Claim Objections

Claims 1-16, 18 and 20 are objected to because of the following informalities:

The preamble of claim 1 recites “One or more non-transitory computer-readable storage mediums”, which is grammatically incorrect. The plural of a “storage medium” is “storage media”, not “storage mediums” [sic]. Thus, “One or more non-transitory computer-readable storage mediums” should recite “One or more non-transitory computer-readable storage media”. Appropriate correction is required.
Also, claims 2-9, which each depend directly or indirectly from claim 1, each recite “The medium of claim” [1 or an intervening claim] in their preambles and should recite “The one or more computer-readable storage media of claim” [1 or the intervening claim]. Appropriate correction is required.

The last line of independent claim 10 recites “the convolutional layer.” As applicant previously introduced “a first convolutional layer” in line 6 of the claim, for consistency and clarity, the subsequent recitation of “the convolutional layer” should recite “the first convolutional layer.” Appropriate correction is required.

Claims 3, 12 and 18 each recite “aggregate the input sample along all but one input dimensions” (see, lines 2-3 of claims 3 and 12, and lines 3-4 of claim 18). These recitations are grammatically incorrect. If supported by the original specification, examiner suggests that one possible way to address these objections would be to amend recitations of “dimensions” to recite “dimension”. Appropriate correction is required.

The last two lines of claims 8, 15 and 20 each recite “the standardized feature map”. As applicant previously introduced “a standardized representation of a feature map” (see, lines 3-4 of claims 8, 15 and 20), for consistency and clarity, the recitations of “the standardized feature map” should recite “the standardized representation of the feature map”. Appropriate correction is required.

Also, claims 4-6 and 13, which each depend directly or indirectly from claims 3 and 12, respectively, are objected to based on their respective dependencies from claims 3 and 12. Also, claims 11-16, which each depend directly or indirectly from claim 10, are objected to based on their respective dependencies from claim 10.

Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):

(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) is invoked.

As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f): (A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; (B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always, linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and (C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.

Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.

Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in an Office action.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f), because the claim limitations use a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function, and the generic placeholder is not preceded by a structural modifier. Such claim limitations are: “at least one soft agent is to perform: global aggregation of the input sample to aggregate the input sample along all but one input dimensions; mapping of the aggregated input sample; and scaling of the mapped input sample to generate the plurality of soft weights” in claims 3, 12 and 18; and “a first soft agent to support the conditional aggregation” and “a second soft agent to support the conditional normalization” in claims 4 and 13.
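The three-prong test and the two presumptions described above combine into a fixed order of questions. The sketch below is a reading aid only: each boolean stands in for a legal determination that the examiner actually makes; nothing here is computable from a claim.

```python
# Reading aid: the MPEP § 2181 three-prong test plus the "means" presumptions
# described above, encoded as a yes/no decision procedure. The arguments are
# stand-ins for legal determinations, not anything derivable from claim text.

def invokes_112f(uses_means_or_nonce: bool,
                 modified_by_functional_language: bool,
                 recites_sufficient_structure: bool) -> bool:
    # Prongs (A) and (B): a "means"/nonce term coupled with functional language.
    if not (uses_means_or_nonce and modified_by_functional_language):
        return False
    # Prong (C), which also captures the rebuttal of either presumption:
    # sufficient structure to entirely perform the function defeats 112(f).
    return not recites_sufficient_structure

# The examiner's finding for the "soft agent" limitations of claims 3, 12, 18:
# nonce term + functional language, no sufficient structure recited.
print(invokes_112f(True, True, False))   # True -> interpreted under 112(f)
```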
Regarding claims 3, 12 and 18 and the above-noted three-prong test, the recited “soft agent” is a generic placeholder; “to perform: global aggregation of the input sample … ; mapping of the aggregated input sample; and scaling of the mapped input sample” is functional language; and there is no recitation in claims 3, 12 and 18 of sufficient structure to perform the aggregating, mapping and scaling.

Regarding claims 4 and 13 and the above-noted three-prong test, the recited “first soft agent” is a generic placeholder, “to support the conditional aggregation” is functional language, and there is no recitation in claims 4 and 13 of sufficient structure to perform the supporting. Also in claims 4 and 13, the recited “second soft agent” is a generic placeholder, “to support the conditional normalization” is functional language, and there is no recitation in claims 4 and 13 of sufficient structure to perform the supporting.

Because these claim limitations are being interpreted under 35 U.S.C. 112(f), they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. A review of the specification shows that the corresponding structure is not described in the specification for the 35 U.S.C. 112(f) limitations.

Regarding the above-noted soft agents recited in claims 2, 4, 12, 13 and 18, although agents are depicted in the black box block diagrams of FIGs. 3-7 and generally mentioned in paragraphs 20, 91-94, 100-101, 106 and 111-114 (which merely repeat the claim language) and 34-56 (which describe the aforementioned figures), the corresponding structure of the claimed agents capable of performing the claimed functions is not described in applicant’s specification.

For example, regarding the above-noted soft agent claim limitations in claims 2, 4, 12, 13 and 18, with reference to the black box block diagrams of FIGs. 3-7 depicting generic boxes/blocks for a soft agent, paragraphs 35, 39 and 55 mention “a soft agent 330 for generating soft weights conditional on input samples to regulate the aggregation and normalization blocks”, “a soft agent for dynamic conditional pooling” and “two soft agents are implemented to provide conditional aggregation and conditional normalization blocks separately”, and paragraphs 42 and 55-56 of Applicant’s specification generally state “the soft agent 400 thus provides easily implementable operations, and can be effectively trained using forward or backward propagation algorithms in deep learning. Further, the soft agent 400 can serve as a general bridge between the entire input sample 405 and local operations”, “a first soft agent includes global average pooling (GAP) 707 for global aggregation, a fully-connected (FC) layer 730 with N output units for mapping, and a SoftMax layer 732 for scaling.” and “a second soft agent again includes the global average pooling (GAP) 707 for global aggregation, and further includes a long short-term memory (LSTM) block 750 (LSTM referring to an RNN architecture) included to provide mapping and scaling”.

The drawings merely show black boxes designed to perform the entire claimed function (see, e.g., agents shown in FIGs. 3-7). As such, the specification either fails to describe the claimed modules or agents as noted above, or, at best, describes the claimed modules and agents by their respective functions without disclosing any specific structure performing the claimed functions. Accordingly, for these claim limitations, the written description fails to disclose both an algorithm(s) and special-purpose computer hardware to perform the algorithm(s). For more information, see MPEP § 2181.
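The specification passages quoted above describe the soft agent as global average pooling, then a fully-connected layer with N outputs, then SoftMax, producing soft weights α_1, …, α_N that gate the aggregation. A minimal NumPy sketch of that pipeline follows; it is not the applicant's implementation, and the candidate pooling operations, tensor shapes and layer sizes are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (not the applicant's code) of the soft-agent pipeline
# the Office Action quotes: GAP -> FC layer with N outputs -> SoftMax,
# yielding soft weights that gate N candidate pooling operations.

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def soft_agent(x, fc_w, fc_b):
    """x: feature map (C, H, W). Returns N soft weights summing to 1."""
    gap = x.mean(axis=(1, 2))        # global aggregation over spatial dims
    logits = fc_w @ gap + fc_b       # mapping: FC layer with N output units
    return softmax(logits)           # scaling: SoftMax over the N units

def dynamic_conditional_pooling(x, candidates, fc_w, fc_b):
    """Blend N candidate pooling ops with input-conditioned soft weights,
    producing the aggregated value (the X'_L of the specification quotes)."""
    alpha = soft_agent(x, fc_w, fc_b)
    pooled = np.stack([op(x) for op in candidates])    # (N, C, H/2, W/2)
    return np.tensordot(alpha, pooled, axes=1)         # conditional aggregation

def max_pool2(x):   # assumed candidate op: 2x2 max pooling
    return x.reshape(x.shape[0], x.shape[1] // 2, 2, x.shape[2] // 2, 2).max(axis=(2, 4))

def avg_pool2(x):   # assumed candidate op: 2x2 average pooling
    return x.reshape(x.shape[0], x.shape[1] // 2, 2, x.shape[2] // 2, 2).mean(axis=(2, 4))

C, H, W, N = 4, 8, 8, 2
x = rng.standard_normal((C, H, W))
fc_w, fc_b = rng.standard_normal((N, C)), np.zeros(N)

out = dynamic_conditional_pooling(x, [max_pool2, avg_pool2], fc_w, fc_b)
print(out.shape)   # (4, 4, 4)
```

Note how this shape of disclosure, a concrete algorithm (GAP, FC, SoftMax, weighted sum), is exactly what the examiner says is missing from the written description for the claimed agents.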
If applicant wishes to provide further explanation or dispute the examiner's interpretation of the corresponding structure, applicant must identify the corresponding structure with reference to the specification by page and line number, and to the drawing, if any, by reference characters in response to this Office action.

If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f), applicant may: (1) amend the claim limitations to avoid them being interpreted under 35 U.S.C. 112(f) (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid them being interpreted under 35 U.S.C. 112(f).

Claim Rejections - 35 USC § 112

The following is a quotation of the first paragraph of 35 U.S.C. 112(a):

(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

Claims 3-6, 12-13 and 18 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. Dependent claims 3, 4, 12, 13 and 18 each contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, at the time the application was filed, had possession of the claimed invention.
In particular, and as previously noted, the claim limitations “at least one soft agent is to perform: global aggregation of the input sample to aggregate the input sample along all but one input dimensions; mapping of the aggregated input sample; and scaling of the mapped input sample to generate the plurality of soft weights” in claims 3, 12 and 18, and “a first soft agent to support the conditional aggregation” and “a second soft agent to support the conditional normalization” in claims 4 and 13, invoke 35 U.S.C. 112(f). However, as noted above, the written description of the current application fails to disclose the corresponding structure, material, or acts for performing each of the above-identified claimed functions and to clearly link the structure, material, or acts to the function. In particular, for each of the claimed functions, the written description fails to disclose both an algorithm(s) and special-purpose computer hardware to perform the algorithm. For more information, see MPEP § 2181.

Accordingly, claims 3, 4, 12, 13 and 18 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. Also, claims 4-6 and 13, which each depend directly or indirectly from claims 3 and 12, respectively, are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement under the same rationale as claims 3 and 12.

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 3-6, 12-13 and 18 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Claims 3, 12 and 18 each recite “mapping of the aggregated input sample” (see, line 4 of claims 3 and 12 and line 5 of claim 18). These recitations are unclear and appear to be missing one or more words. In particular, it is unclear what, if anything, “the aggregated input sample” is being mapped to in these claims. For the purposes of determining patent eligibility and comparison with the prior art, the examiner is interpreting “mapping of the aggregated input sample” as any mapping, plotting, graphing or correlation of “the aggregated input sample” to any value, variable or feature. Appropriate correction is required.

As discussed above, the claim limitations “at least one soft agent is to perform: global aggregation of the input sample to aggregate the input sample along all but one input dimensions; mapping of the aggregated input sample; and scaling of the mapped input sample to generate the plurality of soft weights” in claims 3, 12 and 18, and “a first soft agent to support the conditional aggregation” and “a second soft agent to support the conditional normalization” in claims 4 and 13, invoke 35 U.S.C. 112(f). However, as also discussed above with regard to the rejections of dependent claims 3, 4, 12, 13 and 18 under 35 U.S.C. 112(a), the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function.
In particular, the specification fails to clearly link the structure, material, or acts to the function for the limitations “at least one soft agent is to perform: global aggregation of the input sample to aggregate the input sample along all but one input dimensions; mapping of the aggregated input sample; and scaling of the mapped input sample to generate the plurality of soft weights” in claims 3, 12 and 18, and “a first soft agent to support the conditional aggregation” and “a second soft agent to support the conditional normalization” in claims 4 and 13. As further noted above, there is insufficient disclosure in the specification of algorithms and specific computer hardware for implementing the above-noted, claimed modules and agents. As such, the above-noted limitations recited in claims 3, 4, 12, 13 and 18 are indefinite. Therefore, claims 3, 4, 12, 13 and 18 are indefinite and are rejected under 35 U.S.C. 112(b).

For the purposes of determining patent eligibility and comparison with the prior art, the examiner is interpreting the above-listed agents as any combination of software (i.e., a set of instructions, code, one or more functions or software agents or modules) and/or hardware (i.e., circuitry and/or hardware logic components/modules) capable of performing the claimed functions.

Also, claims 4-6 and 13, which each depend directly or indirectly from claims 3 and 12, respectively, are rejected under 35 U.S.C. 112(b) as being indefinite under the same rationale as claims 3 and 12.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
The analysis below of the claims’ subject matter eligibility follows the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50-57 (January 7, 2019) (“2019 PEG”) and the 2024 Guidance Update on Patent Subject Matter Eligibility, Including on Artificial Intelligence, 89 Fed. Reg. 58128-58138 (July 17, 2024) (“2024 AI SME Update”).

When considering subject matter eligibility under 35 U.S.C. 101, it must be determined whether the claim is directed to one of the four statutory categories of invention, i.e., process, machine, manufacture, or composition of matter (Step 1). If the claim does fall within one of the statutory categories, the second step in the analysis is to determine whether the claim is directed to a judicial exception (Step 2A). The Step 2A analysis is broken into two prongs. In the first prong (Step 2A, Prong 1), it is determined whether or not the claims recite a judicial exception (e.g., mathematical concepts, mental processes, certain methods of organizing human activity). If it is determined in Step 2A, Prong 1 that the claims recite a judicial exception, the analysis proceeds to the second prong (Step 2A, Prong 2), where it is determined whether or not the claims integrate the judicial exception into a practical application. If it is determined at Step 2A, Prong 2 that the claims do not integrate the judicial exception into a practical application, the analysis proceeds to determining whether the claim is a patent-eligible application of the exception (Step 2B). If an abstract idea is present in the claim, any element or combination of elements in the claim must be sufficient to ensure that the claim integrates the judicial exception into a practical application, or else amounts to significantly more than the abstract idea itself.

Regarding claims 1, 10 and 17, these claims are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1 Analysis: Claim 1 is directed to one or more non-transitory computer-readable storage mediums [sic – media], corresponding to an article of manufacture, claim 10 is directed to an apparatus, and claim 17 is directed to a system, corresponding to machines, which are each one of the statutory categories.

Step 2A Prong 1 Analysis: The claims are directed to an abstract idea. In particular, the claims recite mental processes that are concepts performed in the human mind (including an observation, evaluation, judgment, opinion) combined with a mathematical concept (i.e., mathematical relationships, mathematical formulas or equations, and mathematical calculations). The limitations recited in claims 1, 10 and 17, using respective similar language: generating a plurality of soft weights based on the input sample; performing conditional aggregation on the input sample utilizing the plurality of soft weights to generate an aggregated value; and performing conditional normalization on the aggregated value to generate an output - as drafted, under their broadest reasonable interpretation (BRI), in view of the specification, cover concepts performed in the human mind (evaluation, judgement, or opinion to generate weights based on observing received input, using the weights to generate an aggregated value by aggregating/grouping the input, combined with a mathematical concept - i.e., mathematical relationships, mathematical formulas or equations, and mathematical calculations - to normalize the aggregated value to generate an output value).
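The two-step eligibility framework walked through above follows a fixed order of questions. As a plain reading aid (the step names are those of the 2019 PEG as quoted above; the booleans stand in for the examiner's legal determinations, nothing more):

```python
# Reading aid only: the 2019 PEG eligibility flow described above, encoded
# as a sequence of yes/no questions in the order the Office Action applies
# them (Step 1, Step 2A Prong 1, Step 2A Prong 2, Step 2B).

def eligibility_flow(statutory_category, recites_judicial_exception,
                     integrates_practical_application, significantly_more):
    if not statutory_category:            # Step 1: process/machine/manufacture/composition?
        return "ineligible (not a statutory category)"
    if not recites_judicial_exception:    # Step 2A, Prong 1
        return "eligible (no judicial exception recited)"
    if integrates_practical_application:  # Step 2A, Prong 2
        return "eligible (exception integrated into practical application)"
    if significantly_more:                # Step 2B
        return "eligible (significantly more than the exception)"
    return "ineligible (abstract idea without significantly more)"

# The examiner's answers for claims 1, 10 and 17: yes, yes, no, no.
print(eligibility_flow(True, True, False, False))
# -> ineligible (abstract idea without significantly more)
```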
The above limitations in the context of these claims encompass, inter alia, generating weights based on input, aggregating/grouping the input based on the weights to generate an aggregated value, and normalizing the aggregated value to generate an output (can be performed as mental processes, evaluation/judgement/opinion to decide on weights based on the input sample/data, aggregating/grouping the observed input data and weights and normalizing the aggregated value - corresponding to mental processes which can be done mentally or by pen and paper). The claim limitations, under their broadest reasonable interpretations (BRIs), cover performance of the limitations in the mind but for the recitation of generic computer components (e.g., “One or more non-transitory computer-readable storage mediums [sic – media] having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations” (claim 1), “An apparatus comprising: one or more processors; and a memory to store data, including data of a convolutional neural network (CNN), the CNN having a plurality of layers including one or more convolutional layers, wherein the one or more processors are to:” <perform operations> (claim 10) and “A computing system comprising: one or more processors; a data storage to store data including instructions for the one or more processors; and a memory including random access memory (RAM) to store data, including data of a convolutional neural network (CNN), the CNN having a plurality of layers including one or more convolutional layers, wherein the computing system is to:” <perform operations> “at least one soft agent” (claim 17)) combined with a mathematical concept - mathematical relationships, mathematical formulas or equations, or mathematical calculations (e.g., “performing conditional normalization on the aggregated value”). 
Therefore, the claims are directed to an abstract idea - mental processes combined with a mathematical concept.

Step 2A Prong Two Analysis: The judicial exceptions are not integrated into a practical application. In particular, the claims recite, using respective similar language, the additional elements: receiving an input at a convolutional layer of a convolutional neural network (CNN); receiving an input sample at a pooling stage of the convolutional layer; and generate an output for the convolutional layer. These are insignificant extra-solution activities that are not integrated into the claims as a whole and do not add a meaningful limitation to the above-noted mental processes and mathematical concept specified in these claims. That is, “receiving an input” at a generically-recited convolutional layer and pooling stage of a generically-recited convolutional neural network and “generate an output for the” generically-recited convolutional layer amounts to mere data gathering (i.e., receiving provided/transmitted input data) and necessary data outputting (see MPEP § 2106.05(g)).
The claims also recite the additional elements: “One or more non-transitory computer-readable storage mediums [sic – media] having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations” (claim 1), “An apparatus comprising: one or more processors; and a memory to store data, including data of a convolutional neural network (CNN), the CNN having a plurality of layers including one or more convolutional layers, wherein the one or more processors are to:” <perform operations> (claim 10) and “A computing system comprising: one or more processors; a data storage to store data including instructions for the one or more processors; and a memory including random access memory (RAM) to store data, including data of a convolutional neural network (CNN), the CNN having a plurality of layers including one or more convolutional layers, wherein the computing system is to:” <perform operations> “wherein the plurality of soft weights are generated by at least one soft agent” (claim 17). The computer-readable medium storing instructions, apparatus and system comprising one or more processors and memory, convolutional neural network (CNN) and soft agent are recited at a high level of generality as mere instructions to implement an abstract idea on a computer (i.e., a computer including generically-recited one or more processors, computer-readable medium and memory storing instructions and the generically-recited CNN) and soft agent amount to the recitation of the words “apply it” (or an equivalent) or amount to no more than mere instructions to implement an abstract idea or other exception on a computer or merely use a computer as a tool to perform an abstract idea (i.e., as generic computer components performing generic computer functions). See MPEP 2106.05(f). 
Regarding the “convolutional neural network (CNN)”, aside from stating “the CNN having a plurality of layers including one or more convolutional layers”, no details of the neural network are recited; the neural network is recited at a high level of generality, and the network can be constructed by hand with pen and paper. Thus, the claimed “neural network”, under the BRI, in light of the specification, could be any neural network “having a plurality of layers including one or more convolutional layers”, which could be constructed and updated/retrained by hand with pen and paper. That is, the “neural network” limitations give the indication that the neural network can be constructed by hand with pen and paper. The neural network is recited at a high level of generality and therefore is being interpreted as performing a mental process on a generic computer. See MPEP 2106.04(a)(2) § III.C, which states that “a concept that is performed in the human mind and applicant is merely claiming that concept performed 1) on a generic computer, or 2) in a computer environment, or 3) is merely using a computer as a tool to perform the concept” still recites a mental process.

Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.

Step 2B Analysis: The claims do not recite additional elements that are sufficient to amount to significantly more than the judicial exception. Receiving, transmitting and communicating data are insignificant extra-solution activities that are well-understood, routine, and conventional. See MPEP § 2106.05(d)(II) (“The courts have recognized the following computer functions as well-understood, routine, and conventional functions… i. Receiving or transmitting data over a network…iv. Storing and retrieving information in memory”) (citing OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015)).

Therefore, recitations of “receiving an input at a convolutional layer of a convolutional neural network (CNN); receiving an input sample at a pooling stage of the convolutional layer” and “generate an output for the convolutional layer” are the well-understood, routine, conventional activities of receiving or transmitting data over a network, as discussed in MPEP § 2106.05(d). Mere instructions to apply the mental process electronically, or to implement an abstract idea or other exception on a computer (i.e., with the recited “non-transitory computer-readable storage mediums [sic – media] having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations” of claim 1, and apparatus and system comprising “one or more processors” and a memory and a data storage of claims 10 and 17), do not amount to significantly more than the judicial exception. As noted above, merely asserting that a judicial exception is to be carried out on a generic computer cannot provide significantly more than the judicial exception. See MPEP § 2106.05(f).

These claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there are no additional elements recited that impose any meaningful limits on practicing the abstract idea. Therefore, the additional elements of these claims are not sufficient to amount to significantly more than the abstract idea. These claims are not patent eligible.

Regarding claims 2 and 11, these claims are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 2 is directed to a non-transitory computer-readable storage medium as depending from claim 1, and claim 11 is directed to an apparatus as depending from claim 10; thus the analyses of the patent eligibility of claims 1 and 10 are incorporated herein.

Step 2A Prong 1: The claims both recite “wherein the plurality of soft weights are generated by at least one soft agent”. This limitation does nothing to alter the fundamental nature of the claims as a mental process combined with a mathematical concept. This is because the additional limitation merely limits the invention to a narrower abstract idea by further narrowing what generating the soft weights includes, i.e., using a generically-recited “soft agent.” Dependent claims 2 and 11, when analyzed as a whole, are not patent eligible under 35 U.S.C. 101 because the additional recited limitation fails to establish that the claims are not directed to an abstract idea. The additional limitation added by these claims covers a mental process of evaluation, judgement, or opinion to generate weights based on observing received input. This can be performed as a mental process. A person can merely decide on weights based on observing the input data. Thus, this limitation does nothing to alter the analysis of claims 1 and 10.

Step 2A Prong Two Analysis: The judicial exceptions are not integrated into a practical application. In particular, the claims recite the additional element: “the plurality of soft weights are generated by at least one soft agent.” The “by at least one soft agent” amounts to the recitation of the words “apply it” (or an equivalent), or amounts to no more than mere instructions to implement an abstract idea or other exception on a computer, or merely uses a computer as a tool to perform an abstract idea (i.e., as generic computer components performing generic computer functions). See MPEP 2106.05(f).
Step 2B Analysis: The claims do not recite additional elements that are sufficient to amount to significantly more than the judicial exception. Mere instructions to apply the mental process electronically, or to implement an abstract idea or other exception on a computer (i.e., with the recited “soft agent”, the “non-transitory computer-readable storage mediums [sic – media] having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations” of base claim 1, and the apparatus comprising “one or more processors” and memory of base claim 10) do not amount to significantly more than the judicial exception. As noted above, merely asserting that a judicial exception is to be carried out on a generic computer cannot provide significantly more than the judicial exception. See MPEP § 2106.05(f). These claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there are no additional elements recited that impose any meaningful limits on practicing the abstract idea. Therefore, the additional element of these dependent claims is not sufficient to amount to significantly more than the abstract idea. These claims are not patent eligible.

Regarding claims 3, 12 and 18, these claims are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1 Analysis: Claim 3 is directed to a non-transitory computer-readable storage medium as depending from claim 2, claim 12 is directed to an apparatus as depending from claim 11, and claim 18 is directed to a system as depending from claim 17; thus the analyses of the patent eligibility of claims 2, 11 and 17, and of base claims 1 and 10 are incorporated herein.
Step 2A Prong 1: The claims each recite “global aggregation of the input sample to aggregate the input sample along all but one input dimensions; mapping of the aggregated input sample; and scaling of the mapped input sample to generate the plurality of soft weights.” These limitations do nothing to alter the fundamental nature of the claims as a mental process combined with a mathematical concept. This is because the additional limitations merely limit the invention to a narrower abstract idea by further narrowing what generating the soft weights includes, i.e., using a generically-recited “soft agent” to aggregate the observed input sample/data, map/correlate the input sample, and then a mathematical concept to scale the mapped input to generate weights. Dependent claims 3, 12 and 18, when analyzed as a whole, are not patent eligible under 35 U.S.C. 101 because the additional recited limitations fail to establish that the claims are not directed to an abstract idea. The additional limitations added by these claims cover a mental process of evaluation, judgement, or opinion to generate weights based on observing received input by aggregating the observed input sample/data and to map/correlate the input sample (corresponding to mental processes which can be done mentally or by pen and paper), combined with a mathematical concept - scaling the mapped input (corresponding to a mathematical concept - mathematical relationships, mathematical formulas or equations, or mathematical calculations). Thus, these limitations do nothing to alter the analysis of claims 2, 11 and 17.

Step 2A Prong Two Analysis: The judicial exceptions are not integrated into a practical application.
In particular, the claims recite the additional element: “wherein the at least one soft agent is to perform:” <the above-noted operations/mental processes combined with a mathematical concept>. The “the at least one soft agent is to perform:” <the operations> amounts to the recitation of the words “apply it” (or an equivalent), or amounts to no more than mere instructions to implement an abstract idea or other exception on a computer, or merely uses a computer as a tool to perform an abstract idea (i.e., as generic computer components performing generic computer functions). See MPEP 2106.05(f).

Step 2B Analysis: The claims do not recite additional elements that are sufficient to amount to significantly more than the judicial exception. Mere instructions to implement the abstract idea electronically (i.e., with the recited “soft agent”, the “non-transitory computer-readable storage mediums [sic – media] having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations” of base claim 1, and the apparatus and system comprising “one or more processors” and a memory and a data storage of base claims 10 and 17) do not amount to significantly more than the judicial exception. As noted above, merely asserting that a judicial exception is to be carried out on a generic computer cannot provide significantly more than the judicial exception. See MPEP § 2106.05(f). These claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there are no additional elements recited that impose any meaningful limits on practicing the abstract idea. Therefore, the additional elements of these dependent claims are not sufficient to amount to significantly more than the abstract idea. These claims are not patent eligible.
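The limitation quoted above describes a three-step pipeline: global aggregation, mapping, and scaling. A minimal sketch of such a soft agent follows; the array shapes, the mean-based aggregation, the fully connected mapping, and the softmax scaling are illustrative assumptions, not details taken from the claims or the specification:

```python
import numpy as np

def soft_agent(x, w_fc, b_fc):
    """Sketch of a soft agent: global aggregation, mapping, scaling.

    x: input sample of shape (H, W, C); w_fc, b_fc: parameters of an
    assumed fully connected mapping layer producing K logits.
    """
    # Global aggregation: average along all but one input dimension
    # (here, all but the channel dimension).
    pooled = x.mean(axis=(0, 1))                  # shape (C,)
    # Mapping: a fully connected layer projects the aggregate to K logits.
    logits = pooled @ w_fc + b_fc                 # shape (K,)
    # Scaling: softmax rescales the logits into a plurality of soft weights.
    e = np.exp(logits - logits.max())
    return e / e.sum()

rng = np.random.default_rng(0)
sample = rng.standard_normal((4, 4, 8))           # H=4, W=4, C=8
soft_weights = soft_agent(sample, rng.standard_normal((8, 3)), np.zeros(3))
```

Under these assumptions the resulting weights are positive and sum to one, which is what allows candidate operations to be blended softly rather than selected discretely.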
Regarding claims 4 and 13, these claims are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1 Analysis: Claim 4 is directed to a non-transitory computer-readable storage medium as depending from claim 3 and claim 13 is directed to an apparatus as depending from claim 12; thus the analyses of the patent eligibility of claims 2-3 and 11-12, and of base claims 1 and 10 are incorporated herein.

Step 2A Prong 1: The claims both recite “wherein the at least one soft agent includes a first soft agent to support the conditional aggregation and a second soft agent to support the conditional normalization.” These limitations do nothing to alter the fundamental nature of the claims as a mental process combined with a mathematical concept. This is because the additional limitations merely limit the invention to a narrower abstract idea by further narrowing what the conditional aggregation and normalization include, i.e., using a generically-recited 1st and 2nd “soft agent”. Dependent claims 4 and 13, when analyzed as a whole, are not patent eligible under 35 U.S.C. 101 because the additional recited limitations fail to establish that the claims are not directed to an abstract idea. The additional limitations added by these claims cover a mental process of evaluation, judgement, or opinion to conditionally aggregate a data value based on observing received input sample/data (corresponding to mental processes which can be done mentally or by pen and paper), and then conditionally normalize the aggregated value (corresponding to a mathematical concept - mathematical relationships, mathematical formulas or equations, or mathematical calculations). Thus, these limitations do nothing to alter the analysis of claims 1-3 and 10-12.

Step 2A Prong Two Analysis: The judicial exceptions are not integrated into a practical application.
In particular, the claims recite the additional elements: “wherein the at least one soft agent includes a first soft agent to support” <the above-noted mental process> “and a second soft agent to support” <the above-noted mathematical concept>. The 1st and 2nd soft agent “to support” <the above-noted mental process and mathematical concept> amount to the recitation of the words “apply it” (or an equivalent), or amount to no more than mere instructions to implement an abstract idea or other exception on a computer, or merely use a computer as a tool to perform an abstract idea (i.e., as generic computer components performing generic computer functions). See MPEP 2106.05(f).

Step 2B Analysis: The claims do not recite additional elements that are sufficient to amount to significantly more than the judicial exception. Mere instructions to apply the mental process electronically, or to implement an abstract idea or other exception on a computer (i.e., with the recited 1st and 2nd “soft agent”, the “non-transitory computer-readable storage mediums [sic – media] having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations” of base claim 1, and the apparatus comprising “one or more processors” and memory of base claim 10) do not amount to significantly more than the judicial exception. As noted above, merely asserting that a judicial exception is to be carried out on a generic computer cannot provide significantly more than the judicial exception. See MPEP § 2106.05(f). These claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there are no additional elements recited that impose any meaningful limits on practicing the abstract idea.
Therefore, the additional elements of these dependent claims are not sufficient to amount to significantly more than the abstract idea. These claims are not patent eligible.

Regarding claim 5, this claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1 Analysis: Claim 5 is directed to a non-transitory computer-readable storage medium as depending from claim 4; thus the analyses of the patent eligibility of claims 2-4, and of base claim 1 are incorporated herein.

Step 2A Prong 1: The claim recites “wherein the first soft agent includes a fully connected layer for mapping and a layer for scaling.” This limitation does nothing to alter the fundamental nature of the claim as a mental process combined with a mathematical concept. This is because the additional limitation merely limits the invention to a narrower abstract idea by further narrowing what the 1st soft agent includes, i.e., “a fully connected layer for mapping and a layer for scaling.” Dependent claim 5, when analyzed as a whole, is not patent eligible under 35 U.S.C. 101 because the additional recited limitation fails to establish that the claim is not directed to an abstract idea. The additional limitation added by this claim covers a mental process of evaluation, judgement, or opinion to map data based on observing received input sample/data (corresponding to mental processes which can be done mentally or by pen and paper), and scaling data (corresponding to a mathematical concept - mathematical relationships, mathematical formulas or equations, or mathematical calculations). Also, the limitation includes intended use language with no patentable weight (e.g., “layer for mapping and a layer for scaling.”). Thus, this limitation does nothing to alter the analysis of claims 1-4.

Step 2A Prong Two Analysis: The judicial exceptions are not integrated into a practical application.
In particular, the claim recites the additional elements: “a fully connected layer for” <the above-noted mental process> “and a layer for” <the above-noted mathematical concept>. The generically-recited layers “for” performing <the above-noted mental process and mathematical concept> amount to the recitation of the words “apply it” (or an equivalent), or amount to no more than mere instructions to implement an abstract idea or other exception on a computer, or merely use a computer as a tool to perform an abstract idea (i.e., as generic computer components performing generic computer functions). See MPEP 2106.05(f).

Step 2B Analysis: The claim does not recite additional elements that are sufficient to amount to significantly more than the judicial exception. Mere instructions to apply the mental process electronically, or to implement an abstract idea or other exception on a computer (i.e., with the recited layers and the “non-transitory computer-readable storage mediums [sic – media] having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations” of base claim 1) do not amount to significantly more than the judicial exception. As noted above, merely asserting that a judicial exception is to be carried out on a generic computer cannot provide significantly more than the judicial exception. See MPEP § 2106.05(f). This claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there are no additional elements recited that impose any meaningful limits on practicing the abstract idea. Therefore, the additional element of this dependent claim is not sufficient to amount to significantly more than the abstract idea. This claim is not patent eligible.

Regarding claim 6, this claim is rejected under 35 U.S.C.
101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1 Analysis: Claim 6 is directed to a non-transitory computer-readable storage medium as depending from claim 4; thus the analyses of the patent eligibility of claims 2-4, and of base claim 1 are incorporated herein.

Step 2A Prong 1: The claim recites “wherein the second soft agent includes a long short-term memory (LSTM) block to provide mapping and scaling.” This limitation does nothing to alter the fundamental nature of the claim as a mental process combined with a mathematical concept. This is because the additional limitation merely limits the invention to a narrower abstract idea by further narrowing what the 2nd soft agent includes, i.e., “a long short-term memory (LSTM) block to provide mapping and scaling.” Dependent claim 6, when analyzed as a whole, is not patent eligible under 35 U.S.C. 101 because the additional recited limitation fails to establish that the claim is not directed to an abstract idea. The additional limitation added by this claim covers a mental process of evaluation, judgement, or opinion to map data based on observing received input sample/data (corresponding to mental processes which can be done mentally or by pen and paper), and scaling data (corresponding to a mathematical concept - mathematical relationships, mathematical formulas or equations, or mathematical calculations). Also, the limitation includes intended use language with no patentable weight (e.g., “block to provide mapping and scaling.”). Thus, this limitation does nothing to alter the analysis of claims 1-4.

Step 2A Prong Two Analysis: The judicial exceptions are not integrated into a practical application. In particular, the claim recites the additional elements: “a long short-term memory (LSTM) block to provide” <the above-noted mental process and mathematical concept>.
The generically-recited LSTM block “to provide” <the above-noted mental process and mathematical concept> amounts to the recitation of the words “apply it” (or an equivalent), or amounts to no more than mere instructions to implement an abstract idea or other exception on a computer, or merely uses a computer as a tool to perform an abstract idea (i.e., as generic computer components performing generic computer functions). See MPEP 2106.05(f).

Step 2B Analysis: The claim does not recite additional elements that are sufficient to amount to significantly more than the judicial exception. Mere instructions to apply the mental process electronically, or to implement an abstract idea or other exception on a computer (i.e., with the recited LSTM block and the “non-transitory computer-readable storage mediums [sic – media] having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations” of base claim 1) do not amount to significantly more than the judicial exception. As noted above, merely asserting that a judicial exception is to be carried out on a generic computer cannot provide significantly more than the judicial exception. See MPEP § 2106.05(f). This claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there are no additional elements recited that impose any meaningful limits on practicing the abstract idea. Therefore, the additional element of this dependent claim is not sufficient to amount to significantly more than the abstract idea. This claim is not patent eligible.

Regarding claims 7, 14 and 19, these claims are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 7 is directed to a non-transitory computer-readable storage medium as depending from claim 1, claim 14 is directed to an apparatus as depending from claim 10, and claim 19 is directed to a system as depending from claim 17; thus the analyses of the patent eligibility of claims 1, 10 and 17 are incorporated herein.

Step 2A Prong 1: The claims each recite “wherein performing the conditional aggregation includes: … weighting an output of each of the convolutional filters with a respective soft weight of the plurality of soft weights.” This limitation does nothing to alter the fundamental nature of the claims as a mental process combined with a mathematical concept. This is because the additional limitation merely limits the invention to a narrower abstract idea by further narrowing what “performing the conditional aggregation includes”, i.e., “weighting an output of each of the convolutional filters with a respective soft weight of the plurality of soft weights.” Dependent claims 7, 14 and 19, when analyzed as a whole, are not patent eligible under 35 U.S.C. 101 because the additional recited limitation fails to establish that the claims are not directed to an abstract idea. The additional limitation added by these claims covers a mental process of evaluation, judgement, or opinion to weight an observed output of each of the generically-recited convolutional filters with a respective soft weight of an observed set of soft weights (corresponding to mental processes which can be done mentally or by pen and paper). Thus, this limitation does nothing to alter the analysis of claims 1, 10 and 17.

Step 2A Prong Two Analysis: The judicial exceptions are not integrated into a practical application.
In particular, the claims recite the additional element: “receiving the input sample at a plurality of convolutional kernels for a plurality of convolutional filters.” This is an insignificant extra-solution activity that is not integrated into the claims as a whole and does not add a meaningful limitation to the above-noted mental processes and mathematical concept specified in these claims. That is, “receiving the input sample” at a generically-recited “plurality of convolutional kernels for a plurality of convolutional filters” amounts to mere data gathering (i.e., receiving provided/transmitted input data) (See MPEP § 2106.05(g)).

Step 2B Analysis: The claims do not recite additional elements that are sufficient to amount to significantly more than the judicial exception. Receiving, transmitting and communicating data are insignificant extra-solution activities that are well-understood, routine, and conventional. See MPEP § 2106.05(d)(II) (“The courts have recognized the following computer functions as well‐understood, routine, and conventional functions… i. Receiving or transmitting data over a network…iv. Storing and retrieving information in memory”) (citing OIP Techs., Inc., v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015)). Therefore, recitations of “receiving the input sample at a plurality of convolutional kernels for a plurality of convolutional filters” are the well-understood, routine, conventional activities of receiving or transmitting data over a network, as discussed in MPEP § 2106.05(d).
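The conditional aggregation recited in claims 7, 14 and 19 weights the output of each convolutional filter with a respective soft weight. A minimal sketch of that operation follows; the (K, H, W) response shape and the final summation across filters are illustrative assumptions, not details taken from the claims:

```python
import numpy as np

def conditional_aggregation(filter_outputs, soft_weights):
    """Weight the output of each convolutional filter with its respective
    soft weight, then aggregate. Shapes and the summation are assumptions."""
    # filter_outputs: stack of K filter responses, shape (K, H, W).
    # soft_weights: (K,) weights, e.g., produced by a soft agent.
    weighted = soft_weights[:, None, None] * filter_outputs
    return weighted.sum(axis=0)          # aggregated (H, W) feature map

# With weights that sum to one, a stack of identical responses is unchanged.
agg = conditional_aggregation(np.ones((3, 2, 2)), np.array([0.2, 0.3, 0.5]))
```

Broadcasting the (K,) weight vector against the (K, H, W) stack applies each weight to an entire filter response before the sum collapses the filter axis.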
Mere instructions to apply the mental process electronically, or to implement an abstract idea or other exception on a computer (i.e., with the recited “convolutional kernels for a plurality of convolutional filters”, the “non-transitory computer-readable storage mediums [sic – media] having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations” of claim 1, and the apparatus and system comprising “one or more processors” and a memory and a data storage of claims 10 and 17) do not amount to significantly more than the judicial exception. As noted above, merely asserting that a judicial exception is to be carried out on a generic computer cannot provide significantly more than the judicial exception. See MPEP § 2106.05(f). These claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there are no additional elements recited that impose any meaningful limits on practicing the abstract idea. Therefore, the additional elements of these dependent claims are not sufficient to amount to significantly more than the abstract idea. These claims are not patent eligible.

Regarding claims 8, 15 and 20, these claims are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1 Analysis: Claim 8 is directed to a non-transitory computer-readable storage medium as depending from claim 1, claim 15 is directed to an apparatus as depending from claim 10, and claim 20 is directed to a system as depending from claim 17; thus the analyses of the patent eligibility of claims 1, 10 and 17 are incorporated herein.
Step 2A Prong 1: The claims each recite “wherein performing the conditional normalization includes: performing standardization to generate a standardized representation of a feature map; and performing an affine transform to re-scale and re-shift the standardized feature map.” These limitations do nothing to alter the fundamental nature of the claims as a mental process combined with a mathematical concept. This is because the additional limitations merely limit the invention to a narrower abstract idea by further narrowing what “performing the conditional normalization includes”, i.e., “performing standardization to generate a standardized representation of a feature map; and performing an affine transform to re-scale and re-shift the standardized feature map.” Dependent claims 8, 15 and 20, when analyzed as a whole, are not patent eligible under 35 U.S.C. 101 because the additional recited limitations fail to establish that the claims are not directed to an abstract idea. The additional limitations added by these claims cover mathematical concepts - generating a standardized representation of a feature map via standardization and then performing an affine transform to re-scale and re-shift the standardized feature map (corresponding to mathematical concepts - mathematical relationships, mathematical formulas or equations, or mathematical calculations). Thus, these limitations do nothing to alter the analysis of claims 1, 10 and 17.

Step 2A Prong Two Analysis: The judicial exceptions are not integrated into a practical application. The claims do not recite any additional elements that integrate the abstract idea into a practical application or provide significantly more than the abstract idea, and thus the claims are subject-matter ineligible. Mere instructions to apply the mathematical concept electronically do not meaningfully integrate the judicial exception into a practical application. See MPEP 2106.05(f).
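The conditional normalization recited in claims 8, 15 and 20 pairs a standardization step with an affine re-scale and re-shift. A minimal sketch of that pair of operations follows; the use of scalar gamma/beta parameters (rather than soft-agent-generated values) and whole-map statistics are illustrative assumptions, not details taken from the claims:

```python
import numpy as np

def conditional_normalization(feature_map, gamma, beta, eps=1e-5):
    """Standardize a feature map, then apply an affine transform to
    re-scale and re-shift it. gamma/beta are plain scalars here for
    illustration; the claims tie the conditioning to a soft agent."""
    # Standardization: zero mean, unit variance over the whole map.
    mu = feature_map.mean()
    sigma = feature_map.std()
    standardized = (feature_map - mu) / (sigma + eps)
    # Affine transform: re-scale by gamma and re-shift by beta.
    return gamma * standardized + beta

normalized = conditional_normalization(np.arange(12.0).reshape(3, 4), 2.0, 1.0)
```

After the transform the map's mean equals beta and its standard deviation approximately equals gamma, which is the standard standardize-then-affine pattern underlying batch-style normalization layers.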
Step 2B Analysis: The claims do not recite additional elements that are sufficient to amount to significantly more than the judicial exception. Mere instructions to apply the mathematical concepts electronically (i.e., with the “non-transitory computer-readable storage mediums [sic – media] having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations” of claim 1, and the apparatus and system comprising “one or more processors” and a memory and a data storage of claims 10 and 17) do not amount to significantly more than the judicial exception. See MPEP § 2106.05(f). These claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there are no additional elements recited that impose any meaningful limits on practicing the abstract idea. Therefore, the additional elements of these dependent claims are not sufficient to amount to significantly more than the abstract idea. These claims are not patent eligible.

Regarding claims 9 and 16, these claims are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1 Analysis: Claim 9 is directed to a non-transitory computer-readable storage medium as depending from claim 1, and claim 16 is directed to an apparatus as depending from claim 10; thus the analyses of the patent eligibility of claims 1 and 10 are incorporated herein.

Step 2A Prong 1: The claims recite, using respective similar language, “performing convolution and detection to generate the input sample from the input received”. This limitation does nothing to alter the fundamental nature of the claims as a mental process combined with a mathematical concept. Dependent claims 9 and 16, when analyzed as a whole, are not patent eligible under 35 U.S.C.
101 because the additional recited limitation fails to establish that the claims are not directed to an abstract idea. The additional limitation added by these claims covers a mental process of evaluation, judgement, or opinion to generate/detect the input sample based on the received/observed input. This can be performed as a mental process. A person can merely generate an input sample by detecting/identifying the sample based on observing the received input data. Thus, this limitation does nothing to alter the analysis of claims 1 and 10.

Step 2A Prong Two Analysis: The judicial exceptions are not integrated into a practical application. In particular, the claims recite the additional element: “the input received at the convolutional layer.” This is an insignificant extra-solution activity that is not integrated into the claims as a whole and does not add a meaningful limitation to the above-noted mental processes and mathematical concept specified in these claims. That is, “the input received” at a generically-recited “convolutional layer” amounts to mere data gathering (i.e., receiving provided/transmitted input data) (See MPEP § 2106.05(g)).

Step 2B Analysis: The claims do not recite additional elements that are sufficient to amount to significantly more than the judicial exception. Receiving, transmitting and communicating data are insignificant extra-solution activities that are well-understood, routine, and conventional. See MPEP § 2106.05(d)(II) (“The courts have recognized the following computer functions as well‐understood, routine, and conventional functions… i. Receiving or transmitting data over a network…iv. Storing and retrieving information in memory”) (citing OIP Techs., Inc., v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015)).
Therefore, recitations of “the input received at the convolutional layer” are the well-understood, routine, conventional activities of receiving or transmitting data over a network, as discussed in MPEP § 2106.05(d). Mere instructions to apply the mental process electronically, or to implement an abstract idea or other exception on a computer (i.e., with the “non-transitory computer-readable storage mediums [sic – media] having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations” of base claim 1, and the apparatus comprising “one or more processors” and memory of base claim 10) do not amount to significantly more than the judicial exception. As noted above, merely asserting that a judicial exception is to be carried out on a generic computer cannot provide significantly more than the judicial exception. See MPEP § 2106.05(f). These claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, there are no additional elements recited that impose any meaningful limits on practicing the abstract idea. Therefore, the additional element of these dependent claims is not sufficient to amount to significantly more than the abstract idea. These claims are not patent eligible.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C.
103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows: 1. Determining the scope and contents of the prior art. 2. Ascertaining the differences between the prior art and the claims at issue. 3. Resolving the level of ordinary skill in the pertinent art. 4. Considering objective evidence present in the application indicating obviousness or nonobviousness. This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention. Claims 1-6, 9-13 and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over non-patent literature Luo et al. (“Switchable Normalization for Learning-to-Normalize Deep Representation”, 2019, arXiv:1907.10473v1, hereinafter referred to as “Luo”) in view of non-patent literature Wang et al. 
(“Semi-Supervised Domain Adaptation for Weakly Labeled Semantic Video Object Segmentation”, 2016, arXiv:1606.02280v1, hereinafter “Wang”) and further in view of Nemlekar et al. (U.S. Patent Application Pub. No. 2020/0192631 A1, hereinafter “Nemlekar”). With respect to claim 1, Luo discloses the invention as claimed, including instructions that, when executed by one or more processors, cause the one or more processors to perform operations (see, e.g., Abstract, “We address a learning-to-normalize problem by proposing Switchable Normalization (SN), which learns to select different normalizers for different normalization layers of a deep neural network. … maintaining high performance even when small minibatch is presented (e.g. 2 images/GPU). … The code of SN has been released” and pages 1-2, Sect. 1, “The gradients in training are averaged over all GPUs and the statistics of normalizers are estimated in each GPU.”, “We make the code of SN available” [i.e., executable software code/instructions causing a GPU/processor to perform operations]) comprising: receiving an input at a convolutional layer of a convolutional neural network (CNN) (see, e.g., pages 2, Sect. 1, right column, “learn arbitrary normalization for different convolutional layers.”, 3, Sect. 3, left column, paragraph 6, “We take CNN as an illustrative example. Let h be the input data of an arbitrary normalization layer”, 13, Sect. 4.7, right column, “a convolutional neural network (CNN) is constructed by stacking multiple convolutional cells.” and 15, Sect. 
5, right column, “it is valuable to design the algorithm to lean [sic – learn] arbitrary normalization operations for different convolutional layers in a deep ConvNet.” [i.e., receive input data at a convolutional layer of a CNN]); receiving an input sample at a … stage of the convolutional layer (see, e.g., pages 3, Sects. 2-3, left column, paragraphs 2, 4 and 6, “SN trained with a single stage”, “SN can be generally optimized within a single stage in the same dataset”, “Let h be the input data of an arbitrary normalization layer represented by a 4D tensor (N, C, H, W), indicating number of samples, number of channels, height and width of a channel respectively”, page 13, Sect. 4.7, right column, “a convolutional neural network (CNN) is constructed by stacking multiple convolutional cells.” and page 15, Sect. 5, right column, “it is valuable to design the algorithm to lean [sic – learn] arbitrary normalization operations for different convolutional layers in a deep ConvNet.” [i.e., receive input sample at a stage of the convolutional layer]); generating a plurality of soft weights based on the input sample (see, e.g., pages 5, Sect. 4.1.2, right column, last paragraph, “using only 16 and 32 samples, such that their batch sizes are the same as (8; 2) and (8; 4).” [i.e., input sample has batch sizes] and 6, Sect. 4.1.3, left column, paragraph 4, “Fig.1 (a) and Fig.4 plot histograms to compare the importance weights of SN with respect to different tasks and batch sizes. … SN adapts to various scenarios by changing its importance weights. 
For example, SN prefers BN when the minibatch is sufficiently large [i.e., based on the size of the input sample], while it selects LN instead when small minibatch is presented” [i.e., generate changeable/updateable soft weights based on the input sample size]); … and performing conditional normalization on the aggregated value to generate an output for the convolutional layer (see, e.g., pages 2, left column, paragraph 1 and right column, paragraphs 2 and 4, “We introduce Switchable Normalization (SN), which is applicable in … CNNs … enabling each normalization layer in a deep network to have its own operation”, “Dynamic Normalization (DN) to learn arbitrary normalization for different convolutional layers. … we compare SN with five popular normalization methods, i.e. BN, IN, LN, GN and WN”, 5, Sect. 4.1.1, “For each setting, the gradients are aggregated … and the means and variances of the normalization methods are computed” and 16, Appendix A, “Let ĥ be the output of the SN layer” [i.e., conditional normalization on an aggregated value to generate the CNN’s convolutional layer output]). Although Luo substantially discloses the claimed invention, and Luo discloses input “samples per GPU). For each setting, the gradients are aggregated over all GPUs” and “spatial pyramid pooling (i.e. one type of multi-scale global pooling)” (see, pages 5, Sect. 4.1.1 and 13, Sect. 4.3), Luo is not relied on for explicitly disclosing receiving an input sample at a pooling stage of the convolutional layer; and performing conditional aggregation on the input sample utilizing the plurality of soft weights to generate an aggregated value. In the same field, analogous art Wang teaches receiving an input sample at a pooling stage of the convolutional layer (see, e.g., Fig. 
2, reproduced below, an “illustration of the weighted spatial average pooling strategy” where an input sample is received and pages 5, “VGG-16 net uses 3×3 convolution interleaved with max pooling and 3 fully-connected layers. … firstly warp the image data in each region into a form that is compatible with the CNN (VGG-16 net requires inputs” [i.e., receiving an input sample at a convolutional layer of the CNN] and 6, paragraph 1, “Spatial Average Pooling After the initial discovery, a large number of region proposals are positively detected … We adopt a simple weighted spatial average pooling strategy to aggregate the region-wise score, confidence as well as their spatial extent. For each proposal r_i, we rescore it by multiplying its score and classification confidence, which is denoted by s̃_{r_i} = s_{r_i} · c_{r_i}. We then generate score map S_{r_i} of the size of image frame, which is composited as the binary map of current region proposal multiplied by its score s̃_{r_i}. We perform an average pooling over the score maps of all the proposals to compute a confidence map” [i.e., receive input sample at a pooling stage of the layer after initial discovery/input stage]); [Wang, FIG. 2: illustration of the weighted spatial average pooling strategy] performing conditional aggregation on the input sample utilizing the plurality of soft weights to generate an aggregated value (see, e.g., pages 6, paragraph 1, “We adopt a simple weighted spatial average pooling strategy to aggregate the region-wise score, confidence as well as their spatial extent. For each proposal r_i, we rescore it by multiplying its score and classification confidence, which is denoted by s̃_{r_i} = s_{r_i} · c_{r_i}. We then generate score map S_{r_i} of the size of image frame, which is composited as the binary map of current region proposal multiplied by its score s̃_{r_i}. 
We perform an average pooling over the score maps of all the proposals to compute a confidence map C_t … The resulted confidence map C_t aggregates not only the region-wise score but also their spatial extent. … the weighted spatial average pooling is shown in Fig. 2” [i.e., perform average pooling and conditional aggregation using the weights to generate an aggregated value/score]). Wang relates to adaptive convolutional neural networks and is analogous to the claimed invention. Luo teaches an apparatus that has unique composite normalization for data at each layer. Wang teaches an apparatus that performs spatial average pooling at a pooling stage on CNN data. Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine Luo and Wang by pooling Luo’s input data before further processing. Doing so would achieve the predictable result of emphasizing the most important aspects of the images while ensuring their compatibility with further processing in the CNN, with Luo’s normalization and Wang’s pooling performing the same functions in combination as they did separately. (See, MPEP 2143 I. (A) Combining prior art elements according to known methods to yield predictable results). Although Luo in view of Wang substantially teaches the claimed invention, Luo in view of Wang is not relied on to teach One or more non-transitory computer-readable storage mediums having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations. 
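For orientation only, the weighted spatial average pooling strategy that the Office Action quotes from Wang can be sketched in a few lines of Python. This is a reader's aid, not Wang's implementation; the function name and array shapes are illustrative assumptions.

```python
import numpy as np

def weighted_spatial_average_pooling(masks, scores, confidences):
    """Average-pool per-proposal score maps into a confidence map C_t.

    masks: (R, H, W) binary maps of R region proposals.
    scores, confidences: (R,) detection scores s_ri and classification
    confidences c_ri. Each proposal is rescored as s~_ri = s_ri * c_ri,
    its mask is weighted by that value, and the maps are averaged.
    """
    rescored = scores * confidences                # s~_ri = s_ri * c_ri
    score_maps = masks * rescored[:, None, None]   # S_ri, one map per proposal
    return score_maps.mean(axis=0)                 # average pooling -> C_t
```

The resulting map aggregates both the region-wise rescored values and their spatial extent, which is the behavior the rejection maps to "conditional aggregation ... utilizing the plurality of soft weights."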
In the same field, analogous art Nemlekar teaches One or more non-transitory computer-readable storage mediums having stored thereon instructions that, when executed by one or more processors, cause the one or more processors to perform operations (see, e.g., paragraphs 28-29, “A computer readable storage medium may include any non-transitory storage medium, or combination of non-transitory storage media, accessible by a computer system during use to provide instructions and/or data to the computer system”, “aspects of the techniques described above may implemented by one or more processors of a processing system executing software. The software includes one or more sets of executable instructions stored or otherwise tangibly embodied on a non-transitory computer readable storage medium. The software can include the instructions and certain data that, when executed by the one or more processors, manipulate the one or more processors to perform one or more aspects of the techniques described above. The non-transitory computer readable storage medium can include, for example, a magnetic or optical disk storage device, solid state storage devices such as Flash memory, a cache, random access memory (RAM) or other non-volatile memory device or devices, and the like. The executable instructions stored on the non-transitory computer readable storage medium may be in source code, assembly language code, object code, or other instruction format that is interpreted or otherwise executable by one or more processors.”). Nemlekar relates to normalization in convolutional neural networks and is analogous to the claimed invention. Luo in view of Wang teaches techniques, methods and an apparatus for performing normalization in convolutional neural networks. The claimed invention improves upon these methods by storing them in the form of instructions on computer hardware. Nemlekar teaches computer hardware for normalization in CNNs, applicable to Luo. 
Before the effective filing date of the claimed invention, one of ordinary skill in the art would have recognized that storing Luo in view of Wang’s techniques/methods as computer-executable instructions on Nemlekar’s hardware would lead to the predictable result of the method being executable by a computing system, and would improve the known device by allowing it to be performed with real data (See, MPEP 2143 I. (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results). With respect to independent claim 10, claim 10 is substantially similar to claim 1 and therefore is rejected on the same grounds as claim 1, discussed above. In particular, claim 10 is an apparatus claim with operations that correspond to the operations of claim 1. Although Luo in view of Wang substantially teaches the claimed invention and Wang discloses “We implement our method using MATLAB and C/C++, with Caffe … on a commodity desktop with a Quad-Core 4.0 GHz processor, 16 GB of RAM, and GTX 980 GPU” (see, page 10, Sect. 3.3), Luo in view of Wang is not relied on to teach an apparatus comprising: one or more processors; and a memory to store data, including data of a convolutional neural network (CNN), the CNN having a plurality of layers including one or more convolutional layers. 
In the same field, analogous art Nemlekar teaches an apparatus comprising: one or more processors; and a memory to store data, including data of a convolutional neural network (CNN), the CNN having a plurality of layers including one or more convolutional layers (see, e.g., paragraphs 11, “the CNN includes layers … operations to implement the CNN can be grouped into 3 categories, or phases: a convolution phase, a batch normalization (BN) phase, and an activation phase, … phases can be repeated for each layer of the CNN”, 26, “fusing a convolution phase of a CNN with a reduction phase of a batch normalization phase of the CNN” [i.e., a CNN with multiple layers and a convolution layer] and 29 “aspects of the techniques described above may implemented by one or more processors of a processing system executing software. The software includes one or more sets of executable instructions stored or otherwise tangibly embodied on a non-transitory computer readable storage medium. The software can include the instructions and certain data that, when executed by the one or more processors, manipulate the one or more processors to perform one or more aspects of the techniques described above. The non-transitory computer readable storage medium can include, for example, a magnetic or optical disk storage device, solid state storage devices such as Flash memory, a cache, random access memory (RAM) or other non-volatile memory device or devices, and the like. The executable instructions stored on the non-transitory computer readable storage medium may be in source code, assembly language code, object code, or other instruction format that is interpreted or otherwise executable by one or more processors.”). Nemlekar relates to normalization in convolutional neural networks and is analogous to the claimed invention. Luo in view of Wang teaches techniques, methods and an apparatus for performing normalization in convolutional neural networks. 
The claimed invention improves upon these methods by storing them in the form of instructions on computer hardware. Nemlekar teaches computer hardware for normalization in CNNs, applicable to Luo. Before the effective filing date of the claimed invention, one of ordinary skill in the art would have recognized that storing Luo in view of Wang’s techniques/methods as computer-executable instructions on Nemlekar’s hardware would lead to the predictable result of the method being executable by a computing system, and would improve the known device by allowing it to be performed with real data (See, MPEP 2143 I. (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results). Regarding claims 2 and 11, as discussed above, Luo in view of Wang and Nemlekar teaches the computer-readable storage medium of claim 1 and the apparatus of claim 10. Luo further discloses wherein the plurality of soft weights are generated by at least one soft agent (see, e.g., pages 4, Sect. 3.1, left column, equation (5), paragraph 2, “Each wk or w’k is a scalar variable, which is shared across all channels. There are 3 x 2 = 6 importance weights in SN. … [Luo, eqn. (5)] Here each wk [i.e., soft weight] is computed by using a softmax function with λin, λln, and λbn as the control parameters, which can be learned by back-propagation (BP). w’k are defined similarly by using another three control parameters λ’in, λ’ln, and λ’bn.” and 6, Sect. 4.1.3, left column, paragraph 4, “Fig.1 (a) and Fig.4 plot histograms to compare the importance weights of SN with respect to different tasks and batch sizes. … SN adapts to various scenarios by changing its importance weights.” [i.e., updateable/changeable soft weights wk and w’k are computed/generated by a soft agent/function]). 
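As a reader's aid (not part of the record), the softmax-weighted normalization that the rejection cites from Luo can be sketched in Python. The array shapes and parameter names below are illustrative assumptions consistent with Luo's description of importance weights wk and w'k over the IN/LN/BN statistics, not Luo's released code.

```python
import numpy as np

def softmax(lam):
    """Softmax over the control parameters (e.g., lambda_in, lambda_ln, lambda_bn)."""
    e = np.exp(lam - np.max(lam))
    return e / e.sum()

def switchable_norm(h, lam_mu, lam_var, gamma=1.0, beta=0.0, eps=1e-5):
    """Blend IN/LN/BN statistics of a 4D tensor h of shape (N, C, H, W)
    using softmax importance weights w_k (for means) and w'_k (for variances)."""
    stats = []
    for axes in [(2, 3), (1, 2, 3), (0, 2, 3)]:  # IN, LN, BN reduction axes
        stats.append((h.mean(axis=axes, keepdims=True),
                      h.var(axis=axes, keepdims=True)))
    w_mu, w_var = softmax(lam_mu), softmax(lam_var)
    mu = sum(w * m for w, (m, _) in zip(w_mu, stats))       # weighted mean
    var = sum(w * v for w, (_, v) in zip(w_var, stats))     # weighted variance
    return gamma * (h - mu) / np.sqrt(var + eps) + beta
```

Because the weights are a softmax over learnable scalars, each normalization layer can "select" its own mixture of normalizers, which is the behavior the examiner maps onto the claimed soft agent generating a plurality of soft weights.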
Regarding claims 9 and 16, as discussed above, Luo in view of Wang and Nemlekar teaches the computer-readable storage medium of claim 1 and the apparatus of claim 10. Luo further discloses performing convolution and detection to generate the input sample from the input received at the convolutional layer (see, e.g., pages 2, Sect. 2, right column, “learn arbitrary normalization for different convolutional layers”, 3, Sect. 3, left column, “We take CNN as an illustrative example. Let h be the input data of an arbitrary normalization layer represented by a 4D tensor (N;C;H;W), indicating number of samples, number of channels, height and width of a channel respectively” and right column, “a mean value and a variance value are computed in (C;H;W) for each one of the N samples”, 8, Sects. 4.2-4.2.1, right column, “SN selects different operations in different components of a detection system”, “we implement object detection … on existing detection softwares of PyTorch and Caffe2-Detectron … Faster R-CNN … R-CNN+FPN … Mask R-CNN” [i.e., performing convolution and detection operations], 13, Sect. 4.7, right column, “a convolutional neural network (CNN) is constructed by stacking multiple convolutional cells.” and 14-15, Sect. 5, “This work has demonstrated SN in multiple tasks of CV [computer vision] such as recognition, detection … it is valuable to design the algorithm to lean [sic – learn] arbitrary normalization operations for different convolutional layers in a deep ConvNet.” [i.e., perform convolution and detection to generate the input sample N from input data received at the convolutional layer]). With respect to independent claim 17, claim 17 is substantially similar to claim 1 and therefore is rejected on the same grounds as claim 1, discussed above. In particular, claim 17 is a system claim with operations that correspond to the operations of claim 1. 
Luo further discloses one or more processors; a data storage to store data including instructions for the one or more processors (see, e.g., Abstract, “We address a learning-to-normalize problem by proposing Switchable Normalization (SN), which learns to select different normalizers for different normalization layers of a deep neural network. … maintaining high performance even when small minibatch is presented (e.g. 2 images/GPU). … The code of SN has been released” and pages 1-2, Sect. 1, “The gradients in training are averaged over all GPUs and the statistics of normalizers are estimated in each GPU.”, “We make the code of SN available” [i.e., software code/instructions for one or more GPUs/processors]), wherein the plurality of soft weights are generated by at least one soft agent (see, e.g., pages 4, Sect. 3.1, left column, equation (5), paragraph 2, “Each wk or w’k is a scalar variable, which is shared across all channels. There are 3 x 2 = 6 importance weights in SN. … [Luo, eqn. (5)] Here each wk [i.e., soft weight] is computed by using a softmax function with λin, λln, and λbn as the control parameters, which can be learned by back-propagation (BP). w’k are defined similarly by using another three control parameters λ’in, λ’ln, and λ’bn.” and 6, Sect. 4.1.3, left column, paragraph 4, “Fig.1 (a) and Fig.4 plot histograms to compare the importance weights of SN with respect to different tasks and batch sizes. … SN adapts to various scenarios by changing its importance weights.” [i.e., updateable/changeable soft weights wk and w’k are computed/generated by a soft agent/function]). Although Luo in view of Wang substantially teaches the claimed invention, and Wang discloses “We implement our method using MATLAB and C/C++, with Caffe … on a commodity desktop with a Quad-Core 4.0 GHz processor, 16 GB of RAM, and GTX 980 GPU” (see, page 10, Sect. 
3.3), Luo in view of Wang is not relied on to teach a computing system comprising: one or more processors; a data storage to store data including instructions for the one or more processors; and a memory including random access memory (RAM) to store data, including data of a convolutional neural network (CNN), the CNN having a plurality of layers including one or more convolutional layers. In the same field, analogous art Nemlekar teaches a computing system comprising: one or more processors; a data storage to store data including instructions for the one or more processors; and a memory including random access memory (RAM) to store data, including data of a convolutional neural network (CNN), the CNN having a plurality of layers including one or more convolutional layers (see, e.g., paragraphs 11, “the CNN includes layers … operations to implement the CNN can be grouped into 3 categories, or phases: a convolution phase, a batch normalization (BN) phase, and an activation phase, … phases can be repeated for each layer of the CNN”, 26, “fusing a convolution phase of a CNN with a reduction phase of a batch normalization phase of the CNN” [i.e., a CNN with multiple layers and a convolution layer], and 29, “aspects of the techniques described above may implemented by one or more processors of a processing system executing software. The software includes one or more sets of executable instructions stored or otherwise tangibly embodied on a non-transitory computer readable storage medium. The software can include the instructions and certain data that, when executed by the one or more processors, manipulate the one or more processors to perform one or more aspects of the techniques described above. The non-transitory computer readable storage medium can include, for example, a magnetic or optical disk storage device, solid state storage devices such as Flash memory, a cache, random access memory (RAM) or other non-volatile memory device or devices, and the like. 
The executable instructions stored on the non-transitory computer readable storage medium may be in source code, assembly language code, object code, or other instruction format that is interpreted or otherwise executable by one or more processors.”). Nemlekar relates to normalization in convolutional neural networks and is analogous to the claimed invention. Luo in view of Wang teaches techniques, methods and an apparatus for performing normalization in convolutional neural networks. The claimed invention improves upon these methods by storing them in the form of instructions on computer hardware. Nemlekar teaches computer hardware for normalization in CNNs, applicable to Luo. Before the effective filing date of the claimed invention, one of ordinary skill in the art would have recognized that storing Luo in view of Wang’s techniques/methods as computer-executable instructions on Nemlekar’s hardware would lead to the predictable result of the method being executable by a computing system, and would improve the known device by allowing it to be performed with real data (See, MPEP 2143 I. (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results). Regarding claims 3, 12 and 18, as discussed above, Luo in view of Wang and Nemlekar teaches the computer-readable storage medium of claim 2, the apparatus of claim 11, and the system of claim 17. 
Luo further discloses wherein the at least one soft agent is to perform: … scaling of the mapped input sample to generate the plurality of soft weights (see, e.g., page 3, Table 1, “γ, β denote the scale and shift parameters … wk is an importance weight of each kind”, and right column, paragraphs 1 and 4-5, “γ and β are a scale and a shift parameter respectively … each pixel is normalized by using μ and σ, and then re-scale and re-shift by γ and β.”, “mean value and a variance value are computed in (C;H;W) for each one of the N samples.”, “normalizing the hidden feature maps of CNNs.” [i.e., mapped input samples/images with pixels], and 4, equation (3), Sect. 3.1, left column, paragraphs 1-2, “SN has an intuitive expression [Luo, eqn. (3)] where Ω is a set of statistics estimated in different ways. In this work, we define Ω = {in, ln, bn} the same as above where μ_k and σ_k can be calculated by following Eqn. (2).”, “each wk [i.e., soft weight] is computed by using a softmax function with λin, λln, and λbn as the control parameters, which can be learned by back-propagation (BP). w’k are defined similarly by using another three control parameters λ’in, λ’ln, and λ’bn.” [i.e., scaling input sample to generate/compute updateable/changeable soft weights wk and w’k]). Although Luo substantially discloses the claimed invention, and Luo discloses input “samples per GPU). For each setting, the gradients are aggregated over all GPUs” and “spatial pyramid pooling (i.e. one type of multi-scale global pooling)” (see, pages 5, Sect. 4.1.1 and 13, Sect. 4.3), Luo is not relied on for explicitly disclosing global aggregation of the input sample to aggregate the input sample along all but one input dimensions; and mapping of the aggregated input sample. 
In the same field, analogous art Wang teaches global aggregation of the input sample to aggregate the input sample along all but one input dimensions (see, e.g., pages 6, paragraph 1, “We adopt a simple weighted spatial average pooling strategy to aggregate the region-wise score, confidence as well as their spatial extent. For each proposal r_i, we rescore it by multiplying its score and classification confidence, which is denoted by s̃_{r_i} = s_{r_i} · c_{r_i}. We then generate score map S_{r_i} of the size of image frame, which is composited as the binary map of current region proposal multiplied by its score s̃_{r_i}. We perform an average pooling over the score maps of all the proposals [i.e., global aggregation and pooling] to compute a confidence map C_t … The resulted confidence map C_t aggregates not only the region-wise score but also their spatial extent. … the weighted spatial average pooling is shown in Fig. 2” [i.e., perform global aggregation/over all the proposals of the input sample/image data/frames to aggregate the input sample along all but one spatial dimension of the input image]); and mapping of the aggregated input sample (see, e.g., pages 6, paragraph 1, “aggregate the region-wise score, confidence as well as their spatial extent. For each proposal r_i, we rescore it by multiplying its score and classification confidence, which is denoted by s̃_{r_i} = s_{r_i} · c_{r_i}. We then generate score map S_{r_i} of the size of image frame, which is composited as the binary map of current region proposal multiplied by its score s̃_{r_i}. We perform an average pooling over the score maps of all the proposals to compute a confidence map C_t … The resulted confidence map C_t aggregates not only the region-wise score but also their spatial extent.” [i.e., map aggregated input image frame/sample]). Wang relates to adaptive convolutional neural networks and is analogous to the claimed invention. 
Luo teaches an apparatus that has unique composite normalization for data at each layer. Wang teaches an apparatus that performs spatial average pooling at a pooling stage on CNN data. Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine Luo and Wang by pooling Luo’s input data before further processing. Doing so would achieve the predictable result of emphasizing the most important aspects of the images while ensuring their compatibility with further processing in the CNN, with Luo’s normalization and Wang’s pooling performing the same functions in combination as they did separately. (See, MPEP 2143 I. (A) Combining prior art elements according to known methods to yield predictable results). Regarding claims 4 and 13, as discussed above, Luo in view of Wang and Nemlekar teaches the computer-readable storage medium of claim 3 and the apparatus of claim 12. Luo further discloses wherein the at least one soft agent includes … a second soft agent to support the conditional normalization (see, e.g., pages 2, left column, paragraph 1 and right column, paragraphs 2 and 4, “We introduce Switchable Normalization (SN), which is applicable in … CNNs … enabling each normalization layer in a deep network to have its own operation”, “Dynamic Normalization (DN) to learn arbitrary normalization for different convolutional layers. … we compare SN with five popular normalization methods, i.e. BN, IN, LN, GN and WN”, 4, Sect. 3.1, left column, paragraph 2, “each wk is computed by using a softmax function with λin, λln, and λbn as the control parameters” and 5, Sect. 4.1.1, “For each setting, the gradients are aggregated … and the means and variances of the normalization methods are computed” [i.e., conditional normalization is supported/computed by a 2nd soft agent/function computing method for the normalization]). Although Luo substantially discloses the claimed invention, and Luo discloses input “samples per GPU). 
For each setting, the gradients are aggregated over all GPUs” and “spatial pyramid pooling (i.e. one type of multi-scale global pooling)” (see, pages 5, Sect. 4.1.1 and 13, Sect. 4.3), Luo is not relied on for explicitly disclosing wherein the at least one soft agent includes a first soft agent to support the conditional aggregation. In the same field, analogous art Wang teaches wherein the at least one soft agent includes a first soft agent to support the conditional aggregation (see, e.g., pages 6, paragraph 1, “We adopt a simple weighted spatial average pooling strategy to aggregate the region-wise score, confidence as well as their spatial extent. For each proposal r_i, we rescore it by multiplying its score and classification confidence, which is denoted by s̃_{r_i} = s_{r_i} · c_{r_i}. We then generate score map S_{r_i} of the size of image frame, which is composited as the binary map of current region proposal multiplied by its score s̃_{r_i}. We perform an average pooling over the score maps of all the proposals to compute a confidence map C_t … The resulted confidence map C_t aggregates not only the region-wise score but also their spatial extent.” and 10, Sect. 3.3, “Implementation We implement our method using MATLAB and C/C++, with Caffe … on a commodity desktop with a Quad-Core 4.0 GHz processor, 16 GB of RAM, and GTX 980 GPU” [i.e., a 1st soft agent/C/C++ software module to calculate/support the conditional aggregation]). The motivation to combine Luo, Wang and Nemlekar is the same as discussed above with respect to claims 3 and 12. Regarding claim 5, as discussed above, Luo in view of Wang and Nemlekar teaches the computer-readable storage medium of claim 4. Luo further discloses wherein the first soft agent includes a fully connected layer for mapping and a layer for scaling (see, e.g., pages 2, Sect. 1, left column, “to use SN in modern deep neural networks. 
(3) By enabling each normalization layer in a deep network to have its own operation, SN helps ease the usage of normalizers” [i.e., each layer of agent has its own operation]; Sect. 2, right column, “batch normalization in fully-connected feedforward neural networks”; page 3, Table 1, “γ, β denote the scale and shift parameters”; Sect. 3, right column, paragraphs 1 and 5-6, “γ and β are a scale and a shift parameter respectively … each pixel is normalized by using μ and σ, and then re-scale and re-shift by γ and β.”, “normalizing the hidden feature maps of CNNs.”; page 6, Sect. 4.1.3, right column, “the SN layers in different places of a network may select distinct operations” [i.e., layers of agent have distinct operations]; and page 14, Table 11, listing “practices that help training CNNs with SN,” including “Adding 0.5 dropout in the last fully-connected layer helps generalization in ImageNet.” and “Do not put SN after global pooling when feature map size is 1x1” [i.e., 1st agent includes a fully-connected layer for mapping and another layer for scaling]).

Regarding claim 6, as discussed above, Luo in view of Wang and Nemlekar teaches the computer-readable storage medium of claim 4. Luo further discloses wherein the second soft agent includes a long short-term memory (LSTM) block to provide mapping and scaling (see, e.g., pages 1-2, Sect. 1, “SN is applied to LSTM”, “We introduce Switchable Normalization (SN), which is applicable in both CNNs and RNNs/LSTMs”; page 3, Table 1, “γ, β denote the scale and shift parameters”; Sect. 3, right column, paragraphs 1 and 5-6, “γ and β are a scale and a shift parameter respectively … each pixel is normalized by using μ and σ, and then re-scale and re-shift by γ and β.”, “normalizing the hidden feature maps of CNNs.”; pages 13-14, Sect. 4.7, “We investigate SN in LSTM for efficient neural architecture … (CNN) is constructed by stacking multiple convolutional cells.
… training controllers … A controller is a LSTM whose parameters are trained”; and page 14, Table 11, listing “practices that help training CNNs with SN,” including “Do not put SN after global pooling when feature map size is 1x1” [i.e., 2nd agent includes an LSTM block for mapping and scaling]).

Conclusion

The prior art made of record, listed on form PTO-892, and not relied upon, is considered pertinent to applicant’s disclosure. The references listed on form PTO-892 are all generally related to techniques, methods and systems for normalization of data used in convolutional neural networks (CNNs). For example, Luo et al. (U.S. Patent Application Pub. No. 2020/0257979 A1, hereinafter “Luo ’979”) discloses a system highly similar to Luo’s “Switchable Normalization” method. Also, for example, non-patent literature Jia et al. (“Instance-Level Meta Normalization”, 2019, arXiv:1904.03516v1, hereinafter “Jia”) discloses a method of normalizing data in a CNN with dynamic, separately calculated normalization parameters that can be combined with existing normalization methods. Further, for example, non-patent literature Li et al. (“Attentive Normalization”, 2019, arXiv:1908.01259v2, hereinafter “Li”) discloses normalization through a weighted sum of different normalization parameters based on attention.

The examiner requests that, in response to this office action, support be shown for language added to any original claims on amendment and for any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.

When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made.
He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571) 270-5222. The examiner can normally be reached Mon - Fri 9:00-6:00. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at (571) 272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/RANDALL K. BALDWIN/
Primary Examiner, Art Unit 2125

1 As noted in the objections to the drawings and specification above, no “global average pooling (GAP) 707” is shown in FIG. 7 or any other drawings.
2 Paragraph 39 of the instant specification discloses “As used here, a soft weight refers to a weight value that is determined in operation based on certain values or conditions.” Therefore, “soft weights”, under the broadest reasonable interpretation (BRI), in light of the specification, are any weight values that can be determined or updated/changed in operation based on values or conditions.

3 As noted in the objections to these claims above, it appears “dimensions” should recite “dimension”.

4 As noted in the rejections of these claims above, “mapping of the aggregated input sample” has been interpreted as any mapping, plotting, graphing or correlation of “the aggregated input sample” to any value, variable or feature.

5 As noted in the rejections of these claims under 112(b) above, the “soft agent” has been interpreted as any combination of software (i.e., a set of instructions, code, one or more functions or software agents or modules) and/or hardware (i.e., circuitry and/or hardware logic components/modules) capable of performing the claimed functions.

6 As noted in the rejections of these claims under 112(b) above, the 1st and 2nd “soft agent” have been interpreted as any combination of software (i.e., a set of instructions, code, one or more functions or software agents or modules) and/or hardware (i.e., circuitry and/or hardware logic components/modules) capable of performing the claimed functions.

7 As noted in the rejections of these claims under 112(b) above, the 1st and 2nd “soft agent” have been interpreted as any combination of software (i.e., a set of instructions, code, one or more functions or software agents or modules) and/or hardware (i.e., circuitry and/or hardware logic components/modules) capable of performing the claimed functions.

8 As noted in the objections to these claims above, “the standardized feature map” should recite “the standardized representation of the feature map”.
9 Paragraph 39 of the specification discloses “As used here, a soft weight refers to a weight value that is determined in operation based on certain values or conditions.” Therefore, “soft weights”, under the BRI, in light of the specification, are any weight values that can be determined or updated/changed in operation based on values or conditions.

10 As indicated above, “soft weights”, under the BRI, in light of the specification, are any weight values that can be determined or updated/changed in operation based on values or conditions.

11 As noted above in the objection to this claim, it appears “mediums” should recite “media”.

12 As indicated above, “soft weights”, under the BRI, in light of the specification, are any weight values that can be determined or updated/changed in operation based on values or conditions.

13 As indicated above, a “soft agent”, under the BRI, in view of the specification, is any combination of software functions and/or hardware capable of performing the claimed weight generation.

14 As indicated above, “soft weights”, under the BRI, in light of the specification, are any weight values that can be determined or updated/changed in operation based on values or conditions.

15 Paragraphs 35 and 42 of applicant’s specification mention “a soft agent 330 for generating soft weights conditional on input samples to regulate the aggregation and normalization blocks” and “the soft agent 400 thus provides easily implementable operations, and can be effectively trained using forward or backward propagation … the soft agent 400 can serve as a general bridge between the entire input sample 405 and local operations”. Therefore, a “soft agent”, under the BRI, in view of the specification, is any combination of software functions and/or hardware capable of performing the claimed weight generation.
16 As noted in the rejections of these claims under 112(b) above, the “soft agent” has been interpreted as any combination of software (i.e., a set of instructions, code, one or more functions or software agents or modules) and/or hardware (i.e., circuitry and/or hardware logic components/modules) capable of performing the claimed functions.

17 As noted in the objections to these claims above, it appears “dimensions” should recite “dimension”.

18 As noted in the rejections of these claims above, “mapping of the aggregated input sample” has been interpreted as any mapping, plotting, graphing or correlation of “the aggregated input sample” to any value, variable or feature.

19 As noted in the rejections of these claims under 112(b) above, the 2nd “soft agent” has been interpreted as any combination of software (i.e., a set of instructions, code, one or more functions or software agents or modules) and/or hardware (i.e., circuitry and/or hardware logic components/modules) capable of performing the claimed functions.

20 As noted in the rejections of these claims under 112(b) above, the 1st “soft agent” has been interpreted as any combination of software (i.e., a set of instructions, code, one or more functions or software agents or modules) and/or hardware (i.e., circuitry and/or hardware logic components/modules) capable of performing the claimed functions.
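For context on the mechanism the examiner relies on, Luo's quoted weighting scheme ("each w_k is computed by using a softmax function with λ_in, λ_ln, and λ_bn as the control parameters", with γ and β as scale and shift parameters) can be sketched in a few lines. This is an illustrative NumPy sketch only, not Luo's or applicant's implementation; the function and variable names are ours, and for brevity a single set of softmax weights is shared between the blended means and variances, with scalar γ and β standing in for the learned per-channel parameters.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array of control parameters."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def switchable_norm(x, lam_in, lam_ln, lam_bn, gamma=1.0, beta=0.0, eps=1e-5):
    """Illustrative Switchable Normalization sketch: blend instance-norm (IN),
    layer-norm (LN) and batch-norm (BN) statistics of x (shape N, C, H, W)
    using softmax weights w_k derived from the control parameters, then
    re-scale and re-shift by gamma and beta."""
    stats = [
        (x.mean(axis=(2, 3), keepdims=True),
         x.var(axis=(2, 3), keepdims=True)),      # IN: per sample, per channel
        (x.mean(axis=(1, 2, 3), keepdims=True),
         x.var(axis=(1, 2, 3), keepdims=True)),   # LN: per sample
        (x.mean(axis=(0, 2, 3), keepdims=True),
         x.var(axis=(0, 2, 3), keepdims=True)),   # BN: per channel, over batch
    ]
    # Each w_k comes from a softmax over the control parameters.
    w = softmax(np.array([lam_in, lam_ln, lam_bn]))
    mu = sum(wk * m for wk, (m, _) in zip(w, stats))
    var = sum(wk * v for wk, (_, v) in zip(w, stats))
    # Normalize each pixel by the blended mu and sigma, then re-scale/re-shift.
    return gamma * (x - mu) / np.sqrt(var + eps) + beta
```

In a trained network the λ values (and γ, β) would be learned, so each normalization layer can "have its own operation" as the quoted passage describes; with λ_bn dominant, the blend reduces to ordinary batch normalization.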

Prosecution Timeline

May 09, 2023
Application Filed
Mar 04, 2026
Non-Final Rejection — §101, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602573
NEURAL NETWORK ROBUSTNESS VIA BINARY ACTIVATION
2y 5m to grant Granted Apr 14, 2026
Patent 12596918
ACCELERATOR FOR DEEP NEURAL NETWORKS
2y 5m to grant Granted Apr 07, 2026
Patent 12579000
SCHEDULING METHOD FOR A MULTI-LAYER CONVOLUTIONAL NEURAL NETWORK, ELECTRONIC DEVICE AND STORAGE MEDIUM
2y 5m to grant Granted Mar 17, 2026
Patent 12574477
DISTRIBUTED DEEP LEARNING USING A DISTRIBUTED DEEP NEURAL NETWORK
2y 5m to grant Granted Mar 10, 2026
Patent 12572789
BLOCKWISE FACTORIZATION OF HYPERVECTORS
2y 5m to grant Granted Mar 10, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
80%
Grant Probability
99%
With Interview (+26.9%)
3y 5m
Median Time to Grant
Low
PTA Risk
Based on 232 resolved cases by this examiner. Grant probability derived from career allow rate.
