DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 9 February 2026 has been entered.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The certified copy has been filed in Application No. 17/978,458 (the instant application), filed on 11/01/2022.
Response to Arguments
Claims 1, 7, and 13 have been amended. Claims 3 and 9 have been canceled. Claims 16, 18, and 20 were previously canceled. Claims 1-2, 4-8, 10-15, 17, and 19 are pending in this action.
Applicant’s arguments, see pg. 9, filed 9 February 2026, with respect to the rejection of claims 3 and 9 under 35 U.S.C. 112(d) have been fully considered and are persuasive. Specifically, claims 3 and 9 have been canceled. The rejection of claims 3 and 9 under 35 U.S.C. 112(d) has been withdrawn.
Applicant’s arguments, see pgs. 9-10, filed 9 February 2026, with respect to the rejection of claims 1-6 and 15 under 35 U.S.C. 102(a)(1) and claims 7-14, 17, and 19 under 35 U.S.C. 103 have been fully considered and are not persuasive. The applicant argues that "Qin et al. does not disclose a specific process for generating a new input image frame by reducing the number of channels at an input stage based on channel-wise binary determination using an AI engine, as recited in claim 1." (applicant's remarks filed 9 February 2026, pg. 10). The examiner disagrees. Qin generates a new input image frame by reducing the number of channels at an input stage (Qin, pg. 1740 col. 2 para. 1 and fig. 3, the input channels go from 224 X 224 channels with 3 layers to 56 X 56 channels with 64 layers, which is understood as reducing the number of channels. The new version of the image, 56 X 56 X 64, is understood as a new input image frame. Pg. 1740 col. 2 para. 2 and fig. 3, the new input image frame is input to a model. Therefore, this is performed at an input stage) based on channel-wise binary determination using an AI engine (Qin, pg. 1741 col. 1 para. 1 and fig. 4, an AI model, the dynamic gate module, assigns a binary score to each channel. The channels which are assigned a value of zero are "detached from the network", which is understood as reducing the number of channels). The applicant further argues that "Qin et al. does not disclose channel filtering at all" (applicant's remarks filed 9 February 2026, pg. 10). This argument amounts to a conclusory statement without supporting evidence showing that Qin does not disclose channel filtering. Further, while Qin may not employ the words "channel filtering" in describing the process, Qin performs a channel selection which reduces the number of channels, which is understood as channel filtering (see Qin, pg. 1741 col. 1 para. 1-2 and fig. 4).
Therefore, the applicant's arguments are not persuasive and the rejections of claims 1-2, 4-6, and 15 under 35 U.S.C. 102(a)(1) and claims 7-8, 10-14, 17, and 19 under 35 U.S.C. 103 are maintained (claims 3 and 9 being canceled).
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-2, 4-6, and 15 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Xu et al. ("Learning in the Frequency Domain" full reference on PTO-892 filed with this action; hereafter, Qin).
Regarding claim 1, Qin discloses:
A method comprising: receiving, by an electronic device (pg. 1739 col. 2 para. 2, the method performs preprocessing on a CPU which is understood as an electronic device),
[Image: media_image1.png]
an image frame (pg. 1740 col. 1 para. 2 and Fig. 2, the image is received, as seen by the image being input into the flow diagram and being resized);
[Image: media_image2.png]
[Image: media_image3.png]
transforming, by the electronic device, the image frame from a spatial domain comprising a first plurality of channels to a non-spatial domain comprising a second plurality of channels (pg. 1740 col. 1 para. 2 and Fig. 2, the image is converted into a frequency domain),
[Image: media_image4.png]
wherein a number of the second plurality of channels is greater than a number of the first plurality of channels (pg. 1740 col. 1 para. 1 and Fig. 2, when converted to the frequency domain, the number of the channels of the image increases);
[Image: media_image5.png]
removing, by the electronic device, at least one channel comprising irrelevant information from among the second plurality of channels (pg. 1741 col. 1 para. 1 and Fig. 2, a subset of channels is selected. Therefore, the non-selected channels are understood to be removed. The selected channels are "salient". Therefore, the removed channels are understood as unimpactful or comprising irrelevant information)
[Image: media_image6.png]
using an Artificial Intelligence (AI) engine (pg. 1741 col. 1 para. 1, the channel selection is "learning-based", which is understood as using an AI engine)
[Image: media_image7.png]
to generate a low-resolution image frame in the non-spatial domain (pg. 1741 col. 1 para. 1 and fig. 2, the non-salient channels are removed which causes the input data size to be reduced. This is understood as generating a low-resolution image)
[Image: media_image8.png]
and providing, by the electronic device, the low-resolution image frame to a neural network for an inference of the image frame (pg. 1739 col. 2 para. 2, after the task of image compression, the preserved channels are input into an AI accelerator, which may be a neural network, for inference. See also fig. 1(b) for an example of inputting the output of pre-processing into a neural network)
[Image: media_image9.png]
[Image: media_image10.png]
wherein a generic stub layer is embedded at an input of the neural network (pg. 1740 col. 2 para. 2 and Fig. 3, figure 3 shows that the pre-processed input is embedded such that it may be input directly into the neural network. The block "DCT: 56X56X64" is understood as the generic stub layer. Alternatively, the combination of the three dashed boxes may also be interpreted as the generic stub layer in a way nearly identical to the applicant’s specification) for compatibility of the neural network in receiving the low-resolution image frame (pg. 1740 col. 2 para. 2, "Then we adjust the channel size of the next layer to match the number of channels in the frequency domain", showing that the generic stub layer is compatible with the frequency domain), wherein the generic stub layer bypasses input layers of the neural network that are relevant for the image frame in the spatial domain (pg. 1740 col. 2 para. 2 and Fig. 3, "Since the input feature maps in the frequency domain are smaller in the H and W dimensions but larger in the C dimension than the spatial-domain counterpart, we skip the input layer of a conventional CNN model," therefore, the generic stub layer skips, or bypasses, input layers of the neural network),
[Image: media_image11.png]
[Image: media_image12.png]
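For illustration of the "skip the input layer / adjust the channel size of the next layer" reading applied above, the sketch below shows a first layer whose input-channel count is matched to a 56 X 56 X 64 frequency-domain tensor, so a conventional spatial stem (expecting 3 X 224 X 224) is bypassed. The minimal convolution and all names and shapes are illustrative assumptions, not Qin's implementation.

```python
import numpy as np

# Minimal valid 2-D convolution, used only to show shape compatibility.
def conv2d(x, w):
    """x: (C_in, H, W); w: (C_out, C_in, k, k) -> (C_out, H-k+1, W-k+1)."""
    c_out, c_in, k, _ = w.shape
    c, h, wd = x.shape
    assert c == c_in, "layer channel count must match the input"
    out = np.zeros((c_out, h - k + 1, wd - k + 1))
    for i in range(h - k + 1):
        for j in range(wd - k + 1):
            out[:, i, j] = (w * x[:, i:i + k, j:j + k]).sum(axis=(1, 2, 3))
    return out

rng = np.random.default_rng(0)
freq_input = rng.standard_normal((64, 56, 56))  # frequency-domain input
# "Adjusted" next layer: its input channels match the 64 frequency channels,
# so the spatial-domain input layers are skipped entirely.
w_next = rng.standard_normal((8, 64, 3, 3)) * 0.01
features = conv2d(freq_input, w_next)
print(features.shape)  # (8, 54, 54)
```

The only point of the sketch is the assertion inside `conv2d`: compatibility holds exactly when the next layer's channel size matches the number of frequency channels, which is the adjustment the quoted passage describes.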
and wherein the removing, by the electronic device, the at least one channels comprising the irrelevant information comprises: determining, by the electronic device using the AI engine (pg. 1741 col. 1 para. 1, the dynamic gate module is understood as an AI engine as it makes selections, pg. 1741 col. 1 para. 1, contains at least two trainable parameters, pg. 1741 col. 2 para. 1-3, and is trained by loss functions. A person of ordinary skill in the art would recognize this as an AI engine for these reasons), a binary value corresponding to each of the second plurality of channels (pg. 1741 col. 1 para. 1, "dynamic gate module that assigns a binary score to each frequency channel.");
filtering, by the electronic device, channels based on an application of the binary value to the second plurality of channels to reduce a number of channels (pg. 1741 col. 1 para. 1, "The salient channels are rated as one, the others as zero. The input frequency channels with zero scores are detached from the network. Thus, the input data size is reduced," The channels with zero scores being detached is understood as filtering);
and generating, by the electronic device, the low-resolution image frame in the non-spatial domain using the filtered channels (pg. 1740 col. 2 para. 1 and fig. 3, the input channels go from 224 X 224 channels with 3 layers to 56 X 56 channels with 64 layers which is understood as reducing the number of channels. The new version of the image, 56 X 56 X 64, is understood as a new input image frame in the non-spatial domain. Pg. 1740 col. 2 para. 2 and fig. 3, the new input image frame is input to a model. Therefore, this is performed at an input stage. Finally, pg. 1740 col. 2 para. 3 and fig. 2, "As discussed in Section 3.3, the majority of the frequency channels can be pruned without sacrificing accuracy. The frequency channel pruning operation is referred to as DCT channel select in Figure 2." This is understood to show that the low resolution image frame uses the filtered channels).
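The channel-wise binary determination and filtering mapped above can be sketched in a few lines of Python. The function name, the shapes, and the random binary scores are illustrative assumptions; they are not taken from Qin or from the claims.

```python
import numpy as np

# Illustrative sketch: a 0/1 score per frequency channel (as a gate module
# might assign) is applied elementwise, and zero-scored channels are dropped,
# reducing the number of channels in the resulting frame.

def filter_channels(freq_tensor, binary_scores):
    """freq_tensor: (C, H, W) frequency-domain channels.
    binary_scores: (C,) array of 0.0/1.0 values."""
    gated = freq_tensor * binary_scores[:, None, None]  # elementwise product
    kept = binary_scores.astype(bool)
    return gated[kept]  # detach zero-scored channels, reducing C

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 56, 56))          # e.g., 56 X 56 with 64 channels
scores = (rng.random(64) > 0.5).astype(float)  # binary per-channel decision
low_res = filter_channels(x, scores)
print(low_res.shape)  # channels with zero scores removed
```

Under this reading, the output tensor is the "low-resolution image frame": same spatial extent per channel, fewer channels overall.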
Regarding claim 2, Qin discloses the subject matter of claim 1. Qin further discloses:
wherein the transforming comprises performing, by the electronic device, a Discrete Cosine Transformation (DCT) or a Fourier transformation on the image frame (pg. 1740 col. 1 para. 2 and Fig. 2, the image is converted into a frequency domain by DCT).
[Image: media_image4.png]
Regarding claim 4, Qin discloses the subject matter of claim 1. Qin further discloses:
wherein the non-spatial domain comprises a Luminance, Red difference, Blue difference (Y, Cb, Cr) domain, a Hue, Saturation, Value (H, S, V) domain, or a Luminance, Chrominance (YUV) domain (pg. 1740 col. 1 para. 2, the images are transformed into the YCbCr color space).
[Image: media_image13.png]
Regarding claim 5, Qin discloses the subject matter of claim 1. Qin further discloses:
grouping, by the electronic device, components of the transformed image frame with a same frequency into a channel of the second plurality of channels (pg. 1740 col. 1 para. 2 and Fig. 2, the frequency domain coefficients at the same frequency are gathered into "cubes")
[Image: media_image14.png]
by preserving spatial position information of each component (pg. 1740 col. 2 para. 1, the coefficients are grouped while maintaining their spatial relations at each frequency).
[Image: media_image15.png]
Regarding claim 6, Qin discloses the subject matter of claim 1. Qin further discloses:
generating, by the electronic device, a tensor by performing a depth-wise convolution and average pool on each channel of the second plurality of channels (pg. 1741 col. 1 para. 2 and Fig. 4, a tensor is generated by performing convolution and average pooling on the collected channels);
[Image: media_image16.png]
[Image: media_image17.png]
adding, by the electronic device, two trainable parameters with each component of the tensor (pg. 1741 col. 1 para. 2 and Fig. 4, each element in the tensor is multiplied by two trainable parameters which is understood as adding the parameters to the tensor);
[Image: media_image18.png]
determining, by the electronic device, values of the two trainable parameters using the AI engine (pg. 1741 col. 1 para. 3, the trainable parameters, "two numbers" in this paragraph, are determined and output by the gate module. Pg. 1741 col. 1 para. 1, the gate module is learning based and dynamic, therefore it is understood as an AI engine);
[Image: media_image19.png]
[Image: media_image20.png]
determining, by the electronic device, a binary value of each component of the tensor based on the values of the two trainable parameters (pg. 1741 col. 1 para. 3, the trainable parameters are used to determine the probability for each number to be a binary value);
[Image: media_image21.png]
performing, by the electronic device, an elementwise product between the second plurality of channels and the binary value of the components of the tensor (pg. 1741 col. 1 para. 3, each channel is multiplied, by a 1X1 convolution layer understood as elementwise multiplication, by 0 or 1 depending on the value of the trainable parameters);
[Image: media_image22.png]
filtering, by the electronic device, at least one channel without a zero value among the second plurality of channels upon performing the elementwise product (pg. 1741 col. 1 para. 2 and Fig. 4, the process outputs a tensor with some channels off and other channels on. The “on” channels are understood as being filtered without a zero value. See Tensor 5, which has channels that are not canceled);
[Image: media_image23.png]
[Image: media_image17.png]
and generating, by the electronic device, the low-resolution image frame in the nonspatial domain using the at least one filtered channel (pg. 1741 col. 1 para. 1 and Fig. 4, the above process is the "channel selection" step. pg. 1740 col. 1 para. 2 and fig. 2, the selected channels from the selection step are concatenated which is understood as a low-resolution image frame being generated using the at least one filtered channel).
[Image: media_image24.png]
[Image: media_image25.png]
[Image: media_image26.png]
Regarding claim 15, Qin discloses the subject matter of claim 1. Qin further discloses:
wherein the removing is based on operations on the second plurality of channels (pg. 1741 col. 1 para. 3, each channel is multiplied, by a 1X1 convolution layer understood as elementwise multiplication, which is understood as an operation) performed using values determined from trainable parameters (pg. 1741 col. 1 para. 3, each channel is multiplied, by a 1X1 convolution layer understood as elementwise multiplication, by 0 or 1 depending on the value of the trainable parameters. Therefore, the values are based on the trainable parameters), determined by an Artificial Intelligence (AI) engine (pg. 1741 col. 1 para. 3, the trainable parameters, "two numbers" in this paragraph, are determined and output by the gate module. Pg. 1741 col. 1 para. 1, the gate module is learning based and dynamic, therefore it is understood as an AI engine),
[Image: media_image19.png]
[Image: media_image20.png]
associated with a tensor (pg. 1741 col. 1 para. 2, the trainable parameters are associated with tensor 4. Tensor 4 is from tensor 3 which is from tensor 2. The included edits are interpreted by the examiner per the explanation in the rejection under 35 U.S.C. 112(a)) generated from the second plurality of channels (pg. 1741 col. 1 para. 2, tensor 2 is generated from tensor 1 which is understood as the second plurality of channels. The included edits are interpreted by the examiner per the explanation in the rejection under 35 U.S.C. 112(a)).
[Image: media_image27.png]
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 7-8, 10-14, 17, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. ("Learning in the Frequency Domain" full reference on PTO-892 filed with this action; hereafter, Qin) in view of Xu et al. (U.S. Publ. No. 20210201538; hereafter Xu).
Regarding claim 7, Qin discloses:
receive an image frame (pg. 1740 col. 1 para. 2 and Fig. 2, the image is received, as seen by the image being input into the flow diagram and being resized);
[Image: media_image2.png]
[Image: media_image3.png]
transform the image frame from a spatial domain comprising a first plurality of channels to a non-spatial domain comprising a second plurality of channels (pg. 1740 col. 1 para. 2 and Fig. 2, the image is converted into a frequency domain),
[Image: media_image4.png]
wherein a number of the second plurality of channels is greater than a number of the first plurality of channels (pg. 1740 col. 1 para. 1 and Fig. 2, when converted to the frequency domain, the number of the channels of the image increases);
[Image: media_image5.png]
remove at least one channel comprising irrelevant information from among the second plurality of channels (pg. 1741 col. 1 para. 1 and Fig. 2, a subset of channels is selected. Therefore, the non-selected channels are understood to be removed. The selected channels are "salient". Therefore, the removed channels are understood as unimpactful or comprising irrelevant information)
[Image: media_image6.png]
using an Artificial Intelligence (AI) engine (pg. 1741 col. 1 para. 1, the channel selection is "learning-based", which is understood as using an AI engine)
[Image: media_image7.png]
to generate a low-resolution image frame in the non-spatial domain (pg. 1741 col. 1 para. 1 and fig. 2, the non-salient channels are removed which causes the input data size to be reduced. This is understood as generating a low-resolution image)
[Image: media_image8.png]
and provide the low-resolution image frame to a neural network for an inference of the image frame (pg. 1739 col. 2 para. 2, after the task of image compression, the preserved channels are input into an AI accelerator, which may be a neural network, for inference. See also fig. 1(b) for an example of inputting the output of pre-processing into a neural network),
[Image: media_image9.png]
[Image: media_image10.png]
wherein a generic stub layer is embedded at an input of the neural network (pg. 1740 col. 2 para. 2 and Fig. 3, figure 3 shows that the pre-processed input is embedded such that it may be input directly into the neural network. The block "DCT: 56X56X64" is understood as the generic stub layer. Alternatively, the combination of the three dashed boxes may also be interpreted as the generic stub layer in a way nearly identical to the applicant’s specification) for compatibility of the neural network in receiving the low-resolution image frame (pg. 1740 col. 2 para. 2, "Then we adjust the channel size of the next layer to match the number of channels in the frequency domain", showing that the generic stub layer is compatible with the frequency domain), wherein the generic stub layer bypasses input layers of the neural network that are relevant for the image frame in the spatial domain (pg. 1740 col. 2 para. 2 and Fig. 3, "Since the input feature maps in the frequency domain are smaller in the H and W dimensions but larger in the C dimension than the spatial-domain counterpart, we skip the input layer of a conventional CNN model," therefore, the generic stub layer skips, or bypasses, input layers of the neural network),
[Image: media_image11.png]
[Image: media_image12.png]
and wherein the image frame inferencing engine is further configured to: determine, using the AI engine (pg. 1741 col. 1 para. 1, the dynamic gate module is understood as an AI engine as it makes selections, pg. 1741 col. 1 para. 1, contains at least two trainable parameters, pg. 1741 col. 2 para. 1-3, and is trained by loss functions. A person of ordinary skill in the art would recognize this as an AI engine for these reasons), a binary value corresponding to each of the second plurality of channels (pg. 1741 col. 1 para. 1, "dynamic gate module that assigns a binary score to each frequency channel.");
filter channels based on an application of the binary value to the second plurality of channels to reduce a number of channels (pg. 1741 col. 1 para. 1, "The salient channels are rated as one, the others as zero. The input frequency channels with zero scores are detached from the network. Thus, the input data size is reduced," The channels with zero scores being detached is understood as filtering);
and generate the low-resolution image frame in the non-spatial domain using the filtered channels (pg. 1740 col. 2 para. 1 and fig. 3, the input channels go from 224 X 224 channels with 3 layers to 56 X 56 channels with 64 layers which is understood as reducing the number of channels. The new version of the image, 56 X 56 X 64, is understood as a new input image frame in the non-spatial domain. Pg. 1740 col. 2 para. 2 and fig. 3, the new input image frame is input to a model. Therefore, this is performed at an input stage. Finally, pg. 1740 col. 2 para. 3 and fig. 2, "As discussed in Section 3.3, the majority of the frequency channels can be pruned without sacrificing accuracy. The frequency channel pruning operation is referred to as DCT channel select in Figure 2." This is understood to show that the low resolution image frame uses the filtered channels).
Qin does not expressly disclose a device comprising a memory, a processor, and an image inferencing engine operably connected to the memory and processor.
Xu discloses:
An electronic device comprising: memory ([0028] the system includes a computer readable storage medium);
at least one processor comprising processing circuitry ([0028] the system includes at least one general-purpose processor. A general purpose processor is understood to include processing circuitry);
and an image frame inferencing engine comprising image processing circuitry ([0073]-[0074] the image is input into a neural network for inference calculations. The neural network is understood as an inferencing engine), operably coupled to the memory and the processor ([0040] the steps of process 200, such as inputting into the neural network of [0073]-[0074], may be performed by the general purpose and special purpose processors. [0029] the general purpose and special purpose processors are connected to each other and the memory by a bus. The edited text reflects the way the claim is being interpreted by the examiner per the explanation in the rejection of claim 7 under 35 U.S.C. 112(b))
Qin and Xu are combinable because they are from the same field of endeavor of resizing images for inference computations (Qin, pg. 1737 col. 2 para. 1; Xu, [0012]).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the memory and processor of Xu with the method of Qin.
The motivation for doing so would have been that doing so is combining prior art elements (the method of Qin and the system of Xu) according to known methods (it is well known in the art, as demonstrated by Xu, to use a memory and a processor for performing image-based methods) to yield predictable results (the result of performing the method of Qin with the aid of the digital processing of Xu). Both the method of Qin and the system of Xu operate in combination performing the same function as they did separately: the method of Qin continues to remove irrelevant channels and the system of Xu continues to perform calculations for image processing. Further, while Qin does not expressly disclose a memory, a processor, and an image frame inferencing engine coupled to the memory and the processor, a person having ordinary skill in the art would understand based on the disclosure of Qin that such elements are in use for the described method and calculations.
Therefore, it would have been obvious to combine Xu with Qin to obtain the invention as specified in claim 7.
Regarding claim 8, Qin in view of Xu discloses the subject matter of claim 7. Qin further discloses:
perform a Discrete Cosine Transformation (DCT) or a Fourier transformation on the image frame for transforming the image frame from the spatial domain to the non-spatial domain (pg. 1740 col. 1 para. 2 and Fig. 2, the image is converted into a frequency domain by DCT).
[Image: media_image4.png]
Regarding claim 10, Qin in view of Xu discloses the subject matter of claim 7. Qin further discloses:
wherein the non-spatial domain comprises a Luminance, Red difference, Blue difference (Y, Cr, Cb) domain, a Hue, Saturation, Value (H, S, V) domain, or a Luminance, Chrominance (YUV) domain (pg. 1740 col. 1 para. 2, the images are transformed into the YCbCr color space).
[Image: media_image13.png]
Regarding claim 11, Qin in view of Xu discloses the subject matter of claim 7. Qin further discloses:
group components of the transformed image frame with a same frequency into a channel of the second plurality of channels (pg. 1740 col. 1 para. 2 and Fig. 2, the frequency domain coefficients at the same frequency are gathered into "cubes")
[Image: media_image14.png]
by preserving spatial position information of each component (pg. 1740 col. 2 para. 1, the coefficients are grouped while maintaining their spatial relations at each frequency).
[Image: media_image15.png]
Regarding claim 12, Qin in view of Xu discloses the subject matter of claim 7. Qin further discloses:
generate a tensor by performing a depth-wise convolution and average pool on each channel of the second plurality of channels (pg. 1741 col. 1 para. 2 and Fig. 4, a tensor is generated by performing convolution and average pooling on the collected channels);
[Image: media_image16.png]
[Image: media_image17.png]
add two trainable parameters with each component of the tensor (pg. 1741 col. 1 para. 2 and Fig. 4, each element in the tensor is multiplied by two trainable parameters which is understood as adding the parameters to the tensor);
[Image: media_image18.png]
determine values of the two trainable parameters using the AI engine (pg. 1741 col. 1 para. 3, the trainable parameters, "two numbers" in this paragraph, are determined and output by the gate module. Pg. 1741 col. 1 para. 1, the gate module is learning based and dynamic, therefore it is understood as an AI engine);
[Image: media_image19.png]
[Image: media_image20.png]
determine a binary value of each component of the tensor based on the values of the two trainable parameters (pg. 1741 col. 1 para. 3, the trainable parameters are used to determine the probability for each number to be a binary value);
[Image: media_image21.png]
perform an elementwise product between the second plurality of channels and the binary value of the components of the tensor (pg. 1741 col. 1 para. 3, each channel is multiplied, by a 1X1 convolution layer understood as elementwise multiplication, by 0 or 1 depending on the value of the trainable parameters);
[Image: media_image22.png]
filter at least one channel without a zero value among the second plurality of channels upon performing the elementwise product (pg. 1741 col. 1 para. 2 and Fig. 4, the process outputs a tensor with some channels off and other channels on. The “on” channels are understood as being filtered without a zero value. See Tensor 5, which has channels that are not canceled);
[Image: media_image23.png]
[Image: media_image17.png]
and generate the low-resolution image frame in the nonspatial domain using the at least one filtered channel (pg. 1741 col. 1 para. 1 and Fig. 4, the above process is the "channel selection" step. pg. 1740 col. 1 para. 2 and fig. 2, the selected channels from the selection step are concatenated which is understood as a low-resolution image frame being generated using the at least one filtered channel).
[Image: media_image24.png]
[Image: media_image25.png]
[Image: media_image26.png]
Regarding claim 13, Qin discloses:
receive an image frame (pg. 1740 col. 1 para. 2 and Fig. 2, the image is received, as seen by the image being input into the flow diagram and being resized);
[Image: media_image2.png]
[Image: media_image3.png]
transform the image frame from a spatial domain comprising a first plurality of channels to a non-spatial domain comprising a second plurality of channels (pg. 1740 col. 1 para. 2 and Fig. 2, the image is converted into a frequency domain),
[Image: media_image4.png]
wherein a number of the second plurality of channels is greater than a number of the first plurality of channels (pg. 1740 col. 1 para. 1 and Fig. 2, when converted to the frequency domain, the number of the channels of the image increases);
[Image: media_image5.png]
remove at least one channel comprising irrelevant information from among the second plurality of channels (pg. 1741 col. 1 para. 1 and Fig. 2, a subset of channels is selected. Therefore, the non-selected channels are understood to be removed. The selected channels are "salient". Therefore, the removed channels are understood as unimpactful or comprising irrelevant information)
[Image: media_image6.png]
using an Artificial Intelligence (AI) engine (pg. 1741 col. 1 para. 1, the channel selection is "learning-based", which is understood as using an AI engine)
[Image: media_image7.png]
to generate a low-resolution image frame in the non-spatial domain (pg. 1741 col. 1 para. 1 and fig. 2, the non-salient channels are removed which causes the input data size to be reduced. This is understood as generating a low-resolution image)
and provide the low-resolution image frame to a neural network for an inference of the image frame (pg. 1739 col. 2 para. 2, after the task of image compression, the preserved channels are input into an AI accelerator, which may be a neural network, for inference. See also fig. 1(b) for an example of inputting the output of pre-processing into a neural network).
wherein a generic stub layer is embedded at an input of the neural network (pg. 1740 col. 2 para. 2 and Fig. 3, figure 3 shows that the pre-processed input is embedded such that it may be input directly into the neural network. The block "DCT: 56X56X64" is understood as the generic stub layer. Alternatively, the combination of the three dashed boxes may also be interpreted as the generic stub layer in a way nearly identical to the applicant’s specification) for compatibility of the neural network in receiving the low-resolution image frame (pg. 1740 col. 2 para. 2, "Then we adjust the channel size of the next layer to match the number of channels in the frequency domain", showing that the generic stub layer is compatible with the frequency domain), wherein the generic stub layer bypasses input layers of the neural network that are relevant for the image frame in the spatial domain (pg. 1740 col. 2 para. 2 and Fig. 3, "Since the input feature maps in the frequency domain are smaller in the H and W dimensions but larger in the C dimension than the spatial-domain counterpart, we skip the input layer of a conventional CNN model," therefore, the generic stub layer skips, or bypasses, input layers of the neural network),
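The cited stub-layer concept can be sketched as follows. This is a minimal illustration assuming a 1x1 channel projection; the specific channel counts and the projection weights are hypothetical, not Qin's exact architecture:

```python
import numpy as np

def stub_layer(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    # Generic stub embedded at the network input: it accepts the
    # frequency-domain tensor directly, bypassing the conventional spatial
    # stem (which expects e.g. 224 x 224 x 3), and adjusts the channel size
    # with a 1x1 projection to match the next layer of the model.
    return x @ w   # (56, 56, C_freq) @ (C_freq, C_next) -> (56, 56, C_next)

freq = np.random.rand(56, 56, 64)   # frequency-domain input: smaller H/W, more channels
w = np.random.rand(64, 256)         # hypothetical next-layer channel count
out = stub_layer(freq, w)
print(out.shape)                    # (56, 56, 256)
```

As in the cited passage, the spatial input layer is skipped entirely and only the channel dimension is adjusted for compatibility with the remainder of the network.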
and determine, using the AI engine (pg. 1741 col. 1 para. 1, the dynamic gate module is understood as an AI engine as it makes selections, pg. 1741 col. 1 para. 1, contains at least two trainable parameters, pg. 1741 col. 2 para. 1-3, and is trained by loss functions. A person of ordinary skill in the art would recognize this as an AI engine for these reasons), a binary value corresponding to each of the second plurality of channels (pg. 1741 col. 1 para. 1, "dynamic gate module that assigns a binary score to each frequency channel."),
filter channels based on an application of the binary value to the second plurality of channels to reduce a number of channels (pg. 1741 col. 1 para. 1, "The salient channels are rated as one, the others as zero. The input frequency channels with zero scores are detached from the network. Thus, the input data size is reduced," The channels with zero scores being detached is understood as filtering),
and generate the low-resolution image frame in the non-spatial domain using the filtered channels (pg. 1740 col. 2 para. 1 and Fig. 3, the input goes from 224 X 224 with 3 channels to 56 X 56 with 64 channels, which is understood as reducing the number of channels. The new version of the image, 56 X 56 X 64, is understood as a new image frame in the non-spatial domain. Pg. 1740 col. 2 para. 2 and Fig. 3, this frame is input to a model; therefore, the generation is performed at an input stage. Finally, pg. 1740 col. 2 para. 3 and Fig. 2, "As discussed in Section 3.3, the majority of the frequency channels can be pruned without sacrificing accuracy. The frequency channel pruning operation is referred to as DCT channel select in Figure 2." This is understood to show that the low-resolution image frame is generated using the filtered channels).
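The cited gate-and-prune behavior can be sketched as follows. This is a minimal illustration of applying a per-channel binary score and detaching zero-score channels; the keep-count of 16 is an arbitrary assumption, not a value from Qin:

```python
import numpy as np

def gate_and_prune(freq: np.ndarray, scores: np.ndarray) -> np.ndarray:
    # freq: (H, W, C) frequency-domain tensor; scores: (C,) binary scores.
    kept = np.flatnonzero(scores)   # channels rated 1 (salient) are kept
    return freq[:, :, kept]         # zero-score channels are detached

freq = np.random.rand(56, 56, 64)
scores = np.zeros(64, dtype=int)
scores[:16] = 1                     # suppose the gate keeps 16 salient channels
reduced = gate_and_prune(freq, scores)
print(reduced.shape)                # (56, 56, 16): input data size is reduced
```

Detaching the zero-score channels reduces the channel dimension, which is the input-data-size reduction the citation relies on.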
Qin does not disclose expressly a device comprising a non-transitory computer-readable medium storing instructions executed by a processor.
Xu discloses:
At least one non-transitory computer-readable memory ([0029] the memory of the device may be separate embodiments, including non-transitory memory) storing instructions ([0028] instructions are stored on the memory) that, when executed by at least one processor, perform the function ([0028] stored instructions may be performed by at least one processor)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the memory and processor of Xu with the method of Qin.
The motivation for doing so would have been that doing so is combining prior art elements (the method of Qin and the system of Xu) according to known methods (it is well known in the art, as demonstrated by Xu, to use a memory and a processor for performing image-based methods) to yield predictable results (the result of performing the method of Qin with the aid of the digital processing of Xu). The method of Qin and the system of Xu operate in combination performing the same functions as they did separately: the method of Qin continues to remove irrelevant channels, and the system of Xu continues to perform calculations for image processing. Further, while Qin does not expressly disclose a memory and a processor, a person having ordinary skill in the art would understand, based on the disclosure of Qin, that such elements are in use for the described method and calculations.
Therefore, it would have been obvious to combine Xu with Qin to obtain the invention as specified in claim 13.
Regarding claim 14, Qin in view of Xu discloses the subject matter of claim 13. Qin does not disclose that an electronic device comprises the non-transitory computer-readable memory of claim 13.
Xu discloses:
An electronic device comprising the at least one non-transitory computer-readable memory of claim 13 ([0028] and Fig. 1, the electronic device includes the memory).
Regarding claim 17, Qin in view of Xu discloses the subject matter of claim 7. Qin further discloses:
wherein the removing is based on operations on the second plurality of channels (pg. 1741 col. 1 para. 3, each channel is multiplied by a 1X1 convolution layer, understood as elementwise multiplication, which is understood as an operation) performed using values determined from trainable parameters (pg. 1741 col. 1 para. 3, each channel is multiplied, via the 1X1 convolution layer, by 0 or 1 depending on the values of the trainable parameters; therefore, the values are based on the trainable parameters), determined by an Artificial Intelligence (AI) engine (pg. 1741 col. 1 para. 3, the trainable parameters, the "two numbers" in this paragraph, are determined and output by the gate module. Pg. 1741 col. 1 para. 1, the gate module is learning-based and dynamic; therefore, it is understood as an AI engine),
associated with a tensor (pg. 1741 col. 1 para. 2, the trainable parameters are associated with tensor 4. Tensor 4 is from tensor 3 which is from tensor 2. The included edits are interpreted by the examiner per the explanation in the rejection under 35 U.S.C. 112(a)) generated from the second plurality of channels (pg. 1741 col. 1 para. 2, tensor 2 is generated from tensor 1 which is understood as the second plurality of channels. The included edits are interpreted by the examiner per the explanation in the rejection under 35 U.S.C. 112(a)).
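The cited "two numbers" mechanism can be sketched as follows. This is a hedged illustration only: the rule that the larger of the two numbers decides the binary score, and the weight values shown, are assumptions rather than Qin's exact formulation:

```python
import numpy as np

def dynamic_gate(features: np.ndarray, w: np.ndarray) -> np.ndarray:
    # features: (C, H, W); w: (C, 2) trainable "two numbers" per channel.
    # The larger of the two numbers decides the binary score (1 = keep, 0 = drop).
    scores = (w[:, 1] > w[:, 0]).astype(features.dtype)   # (C,) binary scores
    return features * scores[:, None, None]               # elementwise multiply by 0 or 1

feats = np.ones((4, 2, 2))
w = np.array([[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.6, 0.4]])
gated = dynamic_gate(feats, w)
print(gated[:, 0, 0])   # [0. 1. 1. 0.]: channels 1 and 2 survive
```

The multiplication by 0 or 1, driven by the trainable per-channel parameters, is the operation the citation maps to the claimed removal.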
Regarding claim 19, Qin in view of Xu discloses the subject matter of claim 13. Qin further discloses:
wherein the removing is based on operations on the second plurality of channels (pg. 1741 col. 1 para. 3, each channel is multiplied by a 1X1 convolution layer, understood as elementwise multiplication, which is understood as an operation) performed using values determined from trainable parameters (pg. 1741 col. 1 para. 3, each channel is multiplied, via the 1X1 convolution layer, by 0 or 1 depending on the values of the trainable parameters; therefore, the values are based on the trainable parameters), determined by an Artificial Intelligence (AI) engine (pg. 1741 col. 1 para. 3, the trainable parameters, the "two numbers" in this paragraph, are determined and output by the gate module. Pg. 1741 col. 1 para. 1, the gate module is learning-based and dynamic; therefore, it is understood as an AI engine),
associated with a tensor (pg. 1741 col. 1 para. 2, the trainable parameters are associated with tensor 4. Tensor 4 is from tensor 3 which is from tensor 2. Edits interpreted by the examiner per the explanation in the rejection under 35 U.S.C. 112(a)) generated from the second plurality of channels (pg. 1741 col. 1 para. 2, tensor 2 is generated from tensor 1 which is understood as the second plurality of channels. Edits interpreted by the examiner per the explanation in the rejection under 35 U.S.C. 112(a)).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Shao-Yuan et al. ("Exploring Semantic Segmentation on the DCT Representation", full reference on PTO-892 included with this action) discloses a system which performs Discrete Cosine Transformation (DCT) on an image and inputs the frequency image into a neural network to perform semantic segmentation in the spatial domain. The system removes high frequency components without impacting accuracy.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOSHUA B CROCKETT whose telephone number is (571)270-7989. The examiner can normally be reached Monday-Thursday 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, John M Villecco can be reached on (571) 272-7319. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JOSHUA B. CROCKETT/Examiner, Art Unit 2661
/JOHN VILLECCO/Supervisory Patent Examiner, Art Unit 2661