DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for pre-AIA, the applicant) regards as the invention.
Claims 1 and 11 recite the limitation: “… receiv[ing] a pooling output from each of the maximum pool, the minimum pool, the average pool, and the learnable pool … input … the pooling output from each …” (emphasis added to accentuate the ambiguity).
Since there are four distinct pools, receiving “a pooling output from each” physically results in four pooling outputs. Referring to them subsequently in the singular as “the pooling output” creates ambiguity.
For the purposes of examination, the limitation is interpreted as the following:
“… receiv[ing], within the pooling layer, a respective pooling output from each of the maximum pool, the minimum pool, the average pool, and the learnable pool;
input, within the pooling layer, the respective pooling outputs from each of the maximum pool, the minimum pool, the average pool, and the learnable pool into the concatenating function such that the concatenating function outputs a concatenated output within the pooling layer formed based on the respective pooling outputs from each of the maximum pool, the minimum pool, the average pool, and the learnable pool …”.
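Purely as an illustration of the interpretation adopted above (and not part of the record), the interpreted arrangement — one input, four parallel pools each producing a respective output, and a concatenating function producing one concatenated output — can be sketched as follows. The window size, stride, and learnable-pool weights are hypothetical placeholders.

```python
def windows(x, size=2, stride=2):
    """Split the input into pooling windows (the stride acts as the downsample factor)."""
    return [x[i:i + size] for i in range(0, len(x) - size + 1, stride)]

def pooling_layer(x, learned_weights=(0.25, 0.75)):
    """Four pools run on copies of the same input; their respective outputs are concatenated."""
    copies = [list(x) for _ in range(4)]                      # one copy per pool
    max_out = [max(w) for w in windows(copies[0])]            # maximum pool
    min_out = [min(w) for w in windows(copies[1])]            # minimum pool
    avg_out = [sum(w) / len(w) for w in windows(copies[2])]   # average pool
    # "Learnable" pool: a weighted sum whose weights would be trained in practice.
    lrn_out = [sum(wi * vi for wi, vi in zip(learned_weights, w))
               for w in windows(copies[3])]
    # Concatenating function: one concatenated output formed from the
    # respective pooling outputs of all four pools.
    return max_out + min_out + avg_out + lrn_out

print(pooling_layer([1.0, 3.0, 2.0, 6.0]))
```

Note that each of the four pools contributes a distinct, recoverable segment of the concatenated output, which is the structure the claim language is interpreted to require.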
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-9 and 11-19 are rejected under 35 U.S.C. 103 as being unpatentable over Socher et al., hereinafter referred to as Socher (US 2019/0213482 A1), in view of Gholamalinezhad et al., hereinafter referred to as Gholamalinezhad (“Pooling Methods in Deep Neural Networks, a Review”; ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 16 September 2020 (2020-09-16), XP081764209) [Provided in IDS].
As per claim 1, Socher discloses a method of preserving signals for artificial neural networks when downsampling (Socher: Abstract.), the method comprising:
receiving, by a processor (Socher: Para. [0098] discloses server 1502 operates with any sort of conventional processing hardware, such as a processor.), a set of hyperparameters for a model of an artificial neural network (ANN) and a downsample factor for the model (Socher: Para. [0047] discloses receiving hyperparameters such as fixed-size patches/kernels and a stride [which acts as the downsample factor] for the model.), wherein the ANN includes a first layer (pre-processing layer 310), a pooling layer (subnetwork 200A), and a second layer (subnetwork 200B), wherein the first layer (pre-processing layer 310) feeds the pooling layer (200A), wherein the pooling layer (subnetwork 200A) feeds the second layer (subnetwork 200B), wherein the pooling layer (subnetwork 200A) is positioned between the first layer (pre-processing layer 310) and the second layer (Socher: Paras. [0051], [0055], [0057] disclose the ANN including a first layer, a pooling/subnetwork layer, and a second layer positioned sequentially (e.g., pre-processing layers feeding into a sequence of module subnetworks 200A, 200B, 200C, where subnetwork 200A acts as the pooling layer positioned between the preceding layer and the subsequent subnetwork 200B).), wherein the pooling layer (Socher: Paras. [0047], [0049], [0050], [0088] disclose a pooling layer (subnetwork 200A) containing parallel paths including 3D convolutional layer paths (e.g., 3D convolution layer paths 216) serving as dimensionality reduction layers [learnable pools as they use trained weights and nonlinearity activations to downsample], and a 3D pooling path including “max, min or average pooling operations”, along with a concatenation layer 234 [includes concatenating function].);
receiving, by the processor, within the pooling layer, an input set of data from the first layer (Socher: Para. [0049] subnetwork 200A takes a feature map as the input.);
forming, by the processor, within the pooling layer, a plurality of copies of the input set of data (Socher: Para. [0049] discloses forming copies of the input data by applying the input “in parallel” to the several 3D convolutional layer paths and pooling layers [applying operations in parallel requires splitting/copying the input dataset to route to each parallel path].);
inputting, by the processor, within the pooling layer, the copies to each of the maximum pool … (Socher: Paras. [0049], [0059] disclose inputting the copies to the parallel paths [the pooling paths and the learnable convolutional pools] according to the hyperparameters (e.g., 3x3x3 sizes, strides).);
receiving, by the processor, within the pooling layer, a pooling output (Socher: Para. [0049] discloses that each layer or layer path within subnetwork 200A generates different outputs or feature maps.);
inputting, by the processor, within the pooling layer, the pooling output from each of the maximum pool … (Socher: Paras. [0049], [0088] disclose inputting the pooling outputs from the parallel pool paths into a concatenating function such that it outputs a concatenated output.);
inputting, by the processor, the concatenated output from the pooling layer into the second layer (Socher: Paras. [0049], [0051], [0055] disclose that each layer or layer path within subnetwork 200A generates different outputs or feature maps that are concatenated into one feature map as the final output at the concatenation layer 234, and that the output of a preceding subnetwork (e.g., subnetwork 200A) is used as the input for the next subnetwork's convolution and pooling (e.g., subnetwork 200B) [the concatenated output being passed to the second layer].); and
taking, by the processor, an action based on the concatenated output being in the second layer (Socher: Paras. [0063]-[0064] disclose taking an action based on the network's output by classifying the image for abnormalities like intracranial hemorrhaging and forwarding the results/alerts to a health service provider.).
However, Socher does not explicitly disclose “… the pooling layer contains a maximum pool, a minimum pool, an average pool … the copies to each of the maximum pool, the minimum pool, the average pool … a pooling output from each of the maximum pool, the minimum pool, the average pool … the pooling output from each of the maximum pool, the minimum pool, the average pool, … into the concatenating function such that the concatenating function outputs a concatenated output within the pooling layer formed based on the pooling output from each of the maximum pool, the minimum pool, the average pool …”.
Further, Gholamalinezhad is in the same field of endeavor and teaches the pooling layer contains a maximum pool, a minimum pool, and an average pool; the copies input to each of the maximum pool, the minimum pool, and the average pool; a pooling output from each of the maximum pool, the minimum pool, and the average pool; and the pooling output from each of the maximum pool, the minimum pool, and the average pool input into the concatenating function such that the concatenating function outputs a concatenated output within the pooling layer formed based on the pooling output from each of the maximum pool, the minimum pool, and the average pool (Gholamalinezhad: Pgs. 3-4; Pg. 4, Section 2.3; Pgs. 6-7, Section 3.1; Pg. 11, Section 3.10 disclose mixed pooling, which is a “combination of average pooling and max pooling”, and the structural arrangement of the “concatenation of several pooling techniques” to preserve different statistical representations of the data.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Socher and Gholamalinezhad before him or her, to modify the parallel convolutional layer architecture of Socher to include the maximum, minimum, and average pooling operations simultaneously in a concatenated pooling layer as described in Gholamalinezhad. The motivation for doing so would have been to improve classification accuracy by providing a configuration that prevents the loss of discriminative information.
As per claim 2, Socher-Gholamalinezhad disclose the method of claim 1, wherein the learnable pool is a first learnable pool, wherein the pooling layer includes a set of learnable pools including the first learnable pool and a second learnable pool, wherein the copies are input into each of the maximum pool, the minimum pool, the average pool, the first learnable pool, and the second learnable pool according to the set of hyperparameters based on the downsample factor, wherein the pooling output is received from each of the maximum pool, the minimum pool, the average pool, the first learnable pool, and the second learnable pool (Socher: Para. [0049] discloses the parallel outputs “are concatenated into one feature map” [concatenating the outputs of parallel convolutional/pooling branches to form a single multi-channel feature map axiomatically relies on channel-wise (depth-wise) concatenation, making this configuration obvious].).
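For illustration only, the channel-wise (depth-wise) concatenation referred to above — parallel branches producing feature maps of identical spatial size that are stacked along the channel axis into one multi-channel map — can be sketched as below; the feature-map shapes are hypothetical.

```python
def concat_channels(*feature_maps):
    """Each feature map is a list of channels; depth-wise concatenation
    appends the channel lists into one multi-channel feature map."""
    merged = []
    for fm in feature_maps:
        merged.extend(fm)
    return merged

branch_a = [[1, 2], [3, 4]]   # 2 channels, each with 2 spatial values
branch_b = [[5, 6]]           # 1 channel with the same spatial size
merged = concat_channels(branch_a, branch_b)
print(len(merged))            # channel count of the single merged map
```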
As per claim 3, Socher-Gholamalinezhad disclose the method of claim 1, wherein the learnable pool is executed concurrent with at least one of the maximum pool, the minimum pool, or the average pool within the pooling layer on respective copies of the input set of data (Socher: Para. [0049] discloses the parallel outputs “are concatenated into one feature map”.).
As per claim 4, Socher-Gholamalinezhad disclose the method of claim 3, wherein the learnable pool is executed concurrent with at least two of the maximum pool, the minimum pool, or the average pool within the pooling layer on respective copies of the input set of data (Socher: Paras. [0047], [0049] disclose the hyperparameters (such as receptive field size, strides, and convolution volumes) dictate how the convolution paths [learnable pool] operate on the input data.).
As per claim 5, Socher-Gholamalinezhad disclose the method of claim 4, wherein the learnable pool is executed concurrent with each of the maximum pool, the minimum pool, or the average pool within the pooling layer on respective copies of the input set of data (Socher: Para. [0049] discloses applying in parallel several 3D convolutional layer paths and 3D max pooling layers.).
As per claim 6, Socher-Gholamalinezhad disclose the method of claim 1, wherein the concatenated output from the pooling layer is a single output (Socher: Para. [0049] discloses that the various layer paths are concatenated into one feature map as the final output.).
As per claim 7, Socher-Gholamalinezhad disclose the method of claim 1, wherein the learnable pool is programmed to or a logic is programmed to cause the learnable pool to better fit itself to best downsample the input set of data based on a set of criteria (Socher: Paras. [0037], [0047] disclose the neural network layers have trainable parameters whose weights are adjusted during training using backpropagation to minimize a cost function [acting as the set of criteria to better fit the data].).
As per claim 8, Socher-Gholamalinezhad disclose the method of claim 1, wherein the learnable pool includes a convolutional neuron with a learnable activation function that are programmed such that the learnable pool processes the copy according to the set of hyperparameters based on the downsample factor, wherein the convolutional neuron has a stride and a kernel size each set according to how the learnable pool is sized (Socher: Paras. [0047], [0054] disclose convolutional paths utilizing fixed-size 3D patches or 3D kernels and setting a stride, and also applying a learned nonlinearity activation that is trained during backpropagation using two parameters: a scale parameter and a shift parameter.).
As per claim 9, Socher-Gholamalinezhad disclose the method of claim 1, wherein the learnable pool includes a convolutional neuron that is convolved such that the learnable pool processes the copy according to the set of hyperparameters based on the downsample factor, wherein the convolutional neuron is programmed to generate a set of values that are condensed using a global max pooling operation (Socher: Paras. [0051], [0062] disclose a vertical/global max pooling operation on the final output representation, and Gholamalinezhad: Pg. 2, Abstract discloses various pooling operations, including those that drastically reduce spatial dimensions to condense feature maps.).
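As a hedged illustration of the global max pooling operation cited for claim 9 (not part of the record), each channel of a feature map is condensed to its single maximum value; the feature-map values below are hypothetical.

```python
def global_max_pool(feature_map):
    """Condense each channel of the feature map to one value: its maximum."""
    return [max(channel) for channel in feature_map]

fm = [[0.1, 0.9, 0.4],    # channel 0
      [2.0, 1.5, 1.7]]    # channel 1
print(global_max_pool(fm))   # one condensed value per channel
```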
As per claims 11-19, these claims recite limitations analogous to those of claims 1-9 above and are therefore rejected on the same premise.
Claims 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Socher in view of Gholamalinezhad, and further in view of Ding et al., hereinafter referred to as Ding (“Adaptive Multi-Scale Detection of Acoustic Events”; IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, [Online] 24 November 2019 (2019-11-24), pages 1-13, XP055918735, USA, ISSN: 2329-9290, DOI: 10.1109/TASLP.2019.2953350, URL: https://arxiv.org/pdf/1911.06878.pdf) [Provided in IDS].
As per claim 10, Socher-Gholamalinezhad disclose the method of claim 1 (Socher: Abstract.).
However, Socher-Gholamalinezhad do not explicitly disclose “… wherein the learnable pool includes a recurrent neuron that is convolved such that the learnable pool processes the copy according to the set of hyperparameters based on the downsample factor, wherein the recurrent neuron is programmed to run within a designated pooling area and to generate a value that is used as the value of the learnable pool.”
Further, Ding is in the same field of endeavor and teaches wherein the learnable pool (branches) includes a recurrent neuron (GRU) that is convolved such that the learnable pool processes the copy (the bi-directional GRU module is cascaded to the outputs in each scale of the hourglass) according to the set of hyperparameters based on the downsample factor (the output dimensions of all scales are different), wherein the recurrent neuron is programmed to run within a designated pooling area (the specific time-frequency scale or resolution dimension of the respective branch) and to generate a value (Pk) that is used as the value of the learnable pool (Ding: Pg. 1, Abstract; Pg. 4, Section V.A General Structure; Pg. 5, Section V.C Adaptive Fusion disclose a neural network utilizing an hourglass module to generate multi-scale feature representations that are fed into parallel GRU branches. Since each time-frequency scale is a different downsampled dimension, the corresponding GRU branches act as weak classifiers that utilize different input and hidden layer dimensions to generate a specific prediction value for their designated scale, which are then merged into a final output.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Socher-Gholamalinezhad and Ding before him or her, to modify the parallel convolutional layer architecture of Socher-Gholamalinezhad to include the recurrent neuron feature as described in Ding. The motivation for doing so would have been to improve classification accuracy to detect a variety of events by providing a configuration that leads to greater noise-resistant capability.
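Purely as an illustration of the mechanism at issue in claim 10 (not part of the record), a recurrent unit run within a designated pooling area to generate a single value can be sketched as below. A minimal tanh RNN step stands in for the GRU of Ding, and all weights are hypothetical placeholders.

```python
import math

def recurrent_pool(window, w_in=0.5, w_rec=0.3):
    """Run a recurrent neuron over the values in one designated pooling
    area and return a single value used as the pool's output."""
    h = 0.0
    for v in window:
        h = math.tanh(w_in * v + w_rec * h)  # recur over the pooling area
    return h                                  # one value per pooling area

val = recurrent_pool([1.0, 2.0, 0.5])
print(val)
```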
As per claim 20, the claim recites limitations analogous to those of claim 10 above and is therefore rejected on the same premise.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure and can be viewed in the list of references.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PEET DHILLON whose telephone number is (571)270-5647. The examiner can normally be reached M-F: 5am-1:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sath V. Perungavoor can be reached at 571-272-7455. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PEET DHILLON/Primary Examiner
Art Unit: 2488
Date: 03-17-2026