DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 9 February 2026 has been entered.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The certified copy has been filed in Application No. 17/978,458 (the instant application), filed on 11/01/2022.
Response to Arguments
Claims 1, 7, and 13 have been amended. Claims 3 and 9 have been canceled. Claims 16, 18, and 20 were previously canceled. Claims 1-2, 4-8, 10-15, 17, and 19 are pending in this action.
Applicant’s arguments, see pg. 9, filed 9 February 2026, with respect to the rejection of claims 3 and 9 under 35 U.S.C. 112(d) have been fully considered and are persuasive. Specifically, claims 3 and 9 have been canceled. The rejection of claims 3 and 9 under 35 U.S.C. 112(d) has been withdrawn.
Applicant’s arguments, see pgs. 9-10, filed 9 February 2026, with respect to the rejection of claims 1-6 and 15 under 35 U.S.C. 102(a)(1) and claims 7-14, 17, and 19 under 35 U.S.C. 103 have been fully considered and are not persuasive. The applicant argues that "Qin et al. does not disclose a specific process for generating a new input image frame by reducing the number of channels at an input stage based on channel-wise binary determination using an AI engine, as recited in claim 1." (applicant's remarks filed 9 February 2026, pg. 10). The examiner disagrees. Qin generates a new input image frame by reducing the number of channels at an input stage (Qin, pg. 1740 col. 2 para. 1 and fig. 3, the input channels go from 224 X 224 channels with 3 layers to 56 X 56 channels with 64 layers, which is understood as reducing the number of channels. The new version of the image, 56 X 56 X 64, is understood as a new input image frame. Pg. 1740 col. 2 para. 2 and fig. 3, the new input image frame is input to a model. Therefore, this is performed at an input stage) based on channel-wise binary determination using an AI engine (Qin, pg. 1741 col. 1 para. 1 and fig. 4, an AI model, the dynamic gate module, assigns a binary score to each channel. The channels which are assigned a value of zero are "detached from the network", which is understood as reducing the number of channels). The applicant further argues that "Qin et al. does not disclose channel filtering at all" (applicant's remarks filed 9 February 2026, pg. 10). This argument amounts to a conclusory statement without supporting evidence showing that Qin does not disclose channel filtering. Further, while Qin may not employ the words "channel filtering" in describing the process, Qin performs a channel selection which reduces the number of channels, which is understood as channel filtering (see Qin, pg. 1741 col. 1 para. 1-2 and fig. 4).
Therefore, the applicant's arguments are not persuasive and the rejections of claims 1-2, 4-6, and 15 under 35 U.S.C. 102(a)(1) and claims 7-8, 10-14, 17, and 19 under 35 U.S.C. 103 are maintained (claims 3 and 9 being canceled).
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-2, 4-6, and 15 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Xu et al. ("Learning in the Frequency Domain" full reference on PTO-892 filed with this action; hereafter, Qin).
Regarding claim 1, Qin discloses:
A method comprising: receiving, by an electronic device (pg. 1739 col. 2 para. 2, the method performs preprocessing on a CPU which is understood as an electronic device),
[Image: media_image1.png]
an image frame (pg. 1740 col. 1 para. 2 and Fig. 2, the image is received, as seen by the image being input into the flow diagram and being resized);
[Image: media_image2.png]
[Image: media_image3.png]
transforming, by the electronic device, the image frame from a spatial domain comprising a first plurality of channels to a non-spatial domain comprising a second plurality of channels (pg. 1740 col. 1 para. 2 and Fig. 2, the image is converted into a frequency domain),
[Image: media_image4.png]
wherein a number of the second plurality of channels is greater than a number of the first plurality of channels (pg. 1740 col. 1 para. 1 and Fig. 2, when converted to the frequency domain, the number of the channels of the image increases);
[Image: media_image5.png]
removing, by the electronic device, at least one channel comprising irrelevant information from among the second plurality of channels (pg. 1741 col. 1 para. 1 and Fig. 2, a subset of channels is selected. Therefore, the non-selected channels are understood to be removed. The selected channels are "salient". Therefore, the removed channels are understood as unimpactful or comprising irrelevant information)
[Image: media_image6.png]
using an Artificial Intelligence (AI) engine (pg. 1741 col. 1 para. 1, the channel selection is "learning-based", which is understood as using an AI engine)
[Image: media_image7.png]
to generate a low-resolution image frame in the non-spatial domain (pg. 1741 col. 1 para. 1 and fig. 2, the non-salient channels are removed which causes the input data size to be reduced. This is understood as generating a low-resolution image)
[Image: media_image8.png]
and providing, by the electronic device, the low-resolution image frame to a neural network for an inference of the image frame (pg. 1739 col. 2 para. 2, after the task of image compression, the preserved channels are input into an AI accelerator, which may be a neural network, for inference. See also fig. 1(b) for an example of inputting the output of pre-processing into a neural network)
[Image: media_image9.png]
[Image: media_image10.png]
wherein a generic stub layer is embedded at an input of the neural network (pg. 1740 col. 2 para. 2 and Fig. 3, figure 3 shows that the pre-processed input is embedded such that it may be input directly into the neural network. The block "DCT: 56X56X64" is understood as the generic stub layer. Alternatively, the combination of the three dashed boxes may also be interpreted as the generic stub layer in a way nearly identical to the applicant’s specification) for compatibility of the neural network in receiving the low-resolution image frame (pg. 1740 col. 2 para. 2, "Then we adjust the channel size of the next layer to match the number of channels in the frequency domain", showing that the generic stub layer is compatible with the frequency domain), wherein the generic stub layer bypasses input layers of the neural network that are relevant for the image frame in the spatial domain (pg. 1740 col. 2 para. 2 and Fig. 3, "Since the input feature maps in the frequency domain are smaller in the H and W dimensions but larger in the C dimension than the spatial-domain counterpart, we skip the input layer of a conventional CNN model," therefore, the generic stub layer skips, or bypasses, input layers of the neural network),
[Image: media_image11.png]
[Image: media_image12.png]
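For illustration of the "skip the input layer / adjust the channel size of the next layer" reading applied above, the sketch below shows a first layer whose input-channel count is matched to a 56 X 56 X 64 frequency-domain tensor, so a conventional spatial stem (expecting 3 X 224 X 224) is bypassed. The minimal convolution and all names and shapes are illustrative assumptions, not Qin's implementation.

```python
import numpy as np

# Minimal valid 2-D convolution, used only to show shape compatibility.
def conv2d(x, w):
    """x: (C_in, H, W); w: (C_out, C_in, k, k) -> (C_out, H-k+1, W-k+1)."""
    c_out, c_in, k, _ = w.shape
    c, h, wd = x.shape
    assert c == c_in, "layer channel count must match the input"
    out = np.zeros((c_out, h - k + 1, wd - k + 1))
    for i in range(h - k + 1):
        for j in range(wd - k + 1):
            out[:, i, j] = (w * x[:, i:i + k, j:j + k]).sum(axis=(1, 2, 3))
    return out

rng = np.random.default_rng(0)
freq_input = rng.standard_normal((64, 56, 56))  # frequency-domain input
# "Adjusted" next layer: its input channels match the 64 frequency channels,
# so the spatial-domain input layers are skipped entirely.
w_next = rng.standard_normal((8, 64, 3, 3)) * 0.01
features = conv2d(freq_input, w_next)
print(features.shape)  # (8, 54, 54)
```

The only point of the sketch is the assertion inside `conv2d`: compatibility holds exactly when the next layer's channel size matches the number of frequency channels, which is the adjustment the quoted passage describes.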
and wherein the removing, by the electronic device, the at least one channels comprising the irrelevant information comprises: determining, by the electronic device using the AI engine (pg. 1741 col. 1 para. 1, the dynamic gate module is understood as an AI engine as it makes selections, pg. 1741 col. 1 para. 1, contains at least two trainable parameters, pg. 1741 col. 2 para. 1-3, and is trained by loss functions. A person of ordinary skill in the art would recognize this as an AI engine for these reasons), a binary value corresponding to each of the second plurality of channels (pg. 1741 col. 1 para. 1, "dynamic gate module that assigns a binary score to each frequency channel.");
filtering, by the electronic device, channels based on an application of the binary value to the second plurality of channels to reduce a number of channels (pg. 1741 col. 1 para. 1, "The salient channels are rated as one, the others as zero. The input frequency channels with zero scores are detached from the network. Thus, the input data size is reduced," The channels with zero scores being detached is understood as filtering);
and generating, by the electronic device, the low-resolution image frame in the non-spatial domain using the filtered channels (pg. 1740 col. 2 para. 1 and fig. 3, the input channels go from 224 X 224 channels with 3 layers to 56 X 56 channels with 64 layers which is understood as reducing the number of channels. The new version of the image, 56 X 56 X 64, is understood as a new input image frame in the non-spatial domain. Pg. 1740 col. 2 para. 2 and fig. 3, the new input image frame is input to a model. Therefore, this is performed at an input stage. Finally, pg. 1740 col. 2 para. 3 and fig. 2, "As discussed in Section 3.3, the majority of the frequency channels can be pruned without sacrificing accuracy. The frequency channel pruning operation is referred to as DCT channel select in Figure 2." This is understood to show that the low resolution image frame uses the filtered channels).
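The channel-wise binary determination and filtering mapped above can be sketched in a few lines of Python. The function name, the shapes, and the random binary scores are illustrative assumptions; they are not taken from Qin or from the claims.

```python
import numpy as np

# Illustrative sketch: a 0/1 score per frequency channel (as a gate module
# might assign) is applied elementwise, and zero-scored channels are dropped,
# reducing the number of channels in the resulting frame.

def filter_channels(freq_tensor, binary_scores):
    """freq_tensor: (C, H, W) frequency-domain channels.
    binary_scores: (C,) array of 0.0/1.0 values."""
    gated = freq_tensor * binary_scores[:, None, None]  # elementwise product
    kept = binary_scores.astype(bool)
    return gated[kept]  # detach zero-scored channels, reducing C

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 56, 56))          # e.g., 56 X 56 with 64 channels
scores = (rng.random(64) > 0.5).astype(float)  # binary per-channel decision
low_res = filter_channels(x, scores)
print(low_res.shape)  # channels with zero scores removed
```

Under this reading, the output tensor is the "low-resolution image frame": same spatial extent per channel, fewer channels overall.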
Regarding claim 2, Qin discloses the subject matter of claim 1. Qin further discloses:
wherein the transforming comprises performing, by the electronic device, a Discrete Cosine Transformation (DCT) or a Fourier transformation on the image frame (pg. 1740 col. 1 para. 2 and Fig. 2, the image is converted into a frequency domain by DCT).
[Image: media_image4.png]
Regarding claim 4, Qin discloses the subject matter of claim 1. Qin further discloses:
wherein the non-spatial domain comprises a Luminance, Red difference, Blue difference (Y, Cb, Cr) domain, a Hue, Saturation, Value (H, S, V) domain, or a Luminance, Chrominance (YUV) domain (pg. 1740 col. 1 para. 2, the images are transformed into the YCbCr color space).
[Image: media_image13.png]
Regarding claim 5, Qin discloses the subject matter of claim 1. Qin further discloses:
grouping, by the electronic device, components of the transformed image frame with a same frequency into a channel of the second plurality of channels (pg. 1740 col. 1 para. 2 and Fig. 2, the frequency domain coefficients at the same frequency are gathered into "cubes")
[Image: media_image14.png]
by preserving spatial position information of each component (pg. 1740 col. 2 para. 1, the coefficients are grouped while maintaining their spatial relations at each frequency).
[Image: media_image15.png]
Regarding claim 6, Qin discloses the subject matter of claim 1. Qin further discloses:
generating, by the electronic device, a tensor by performing a depth-wise convolution and average pool on each channel of the second plurality of channels (pg. 1741 col. 1 para. 2 and Fig. 4, a tensor is generated by performing convolution and average pooling on the collected channels);
[Image: media_image16.png]
[Image: media_image17.png]
adding, by the electronic device, two trainable parameters with each component of the tensor (pg. 1741 col. 1 para. 2 and Fig. 4, each element in the tensor is multiplied by two trainable parameters which is understood as adding the parameters to the tensor);
[Image: media_image18.png]
determining, by the electronic device, values of the two trainable parameters using the AI engine (pg. 1741 col. 1 para. 3, the trainable parameters, "two numbers" in this paragraph, are determined and output by the gate module. Pg. 1741 col. 1 para. 1, the gate module is learning based and dynamic, therefore it is understood as an AI engine);
[Image: media_image19.png]
[Image: media_image20.png]
determining, by the electronic device, a binary value of each component of the tensor based on the values of the two trainable parameters (pg. 1741 col. 1 para. 3, the trainable parameters are used to determine the probability for each number to be a binary value);
[Image: media_image21.png]
performing, by the electronic device, an elementwise product between the second plurality of channels and the binary value of the components of the tensor (pg. 1741 col. 1 para. 3, each channel is multiplied, by a 1X1 convolution layer understood as elementwise multiplication, by 0 or 1 depending on the value of the trainable parameters);
[Image: media_image22.png]
filtering, by the electronic device, at least one channel without a zero value among the second plurality of channels upon performing the elementwise product (pg. 1741 col. 1 para. 2 and Fig. 4, the process outputs a tensor with some channels off and other channels on. The “on” channels are understood as being filtered without a zero value. See Tensor 5, which has channels that are not canceled);
[Image: media_image23.png]
[Image: media_image17.png]
and generating, by the electronic device, the low-resolution image frame in the nonspatial domain using the at least one filtered channel (pg. 1741 col. 1 para. 1 and Fig. 4, the above process is the "channel selection" step. pg. 1740 col. 1 para. 2 and fig. 2, the selected channels from the selection step are concatenated which is understood as a low-resolution image frame being generated using the at least one filtered channel).
[Image: media_image24.png]
[Image: media_image25.png]
[Image: media_image26.png]
Regarding claim 15, Qin discloses the subject matter of claim 1. Qin further discloses:
wherein the removing is based on operations on the second plurality of channels (pg. 1741 col. 1 para. 3, each channel is multiplied, by a 1X1 convolution layer understood as elementwise multiplication, which is understood as an operation) performed using values determined from trainable parameters (pg. 1741 col. 1 para. 3, each channel is multiplied, by a 1X1 convolution layer understood as elementwise multiplication, by 0 or 1 depending on the value of the trainable parameters. Therefore, the values are based on the trainable parameters), determined by an Artificial Intelligence (AI) engine (pg. 1741 col. 1 para. 3, the trainable parameters, "two numbers" in this paragraph, are determined and output by the gate module. Pg. 1741 col. 1 para. 1, the gate module is learning based and dynamic, therefore it is understood as an AI engine),
[Image: media_image19.png]
[Image: media_image20.png]
associated with a tensor (pg. 1741 col. 1 para. 2, the trainable parameters are associated with tensor 4. Tensor 4 is from tensor 3 which is from tensor 2. The included edits are interpreted by the examiner per the explanation in the rejection under 35 U.S.C. 112(a)) generated from the second plurality of channels (pg. 1741 col. 1 para. 2, tensor 2 is generated from tensor 1 which is understood as the second plurality of channels. The included edits are interpreted by the examiner per the explanation in the rejection under 35 U.S.C. 112(a)).
[Image: media_image27.png]
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 7-8, 10-14, 17, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. ("Learning in the Frequency Domain" full reference on PTO-892 filed with this action; hereafter, Qin) in view of Xu et al. (U.S. Publ. No. 20210201538; hereafter Xu).
Regarding claim 7, Qin discloses:
receive an image frame (pg. 1740 col. 1 para. 2 and Fig. 2, the image is received, as seen by the image being input into the flow diagram and being resized);
[Image: media_image2.png]
[Image: media_image3.png]
transform the image frame from a spatial domain comprising a first plurality of channels to a non-spatial domain comprising a second plurality of channels (pg. 1740 col. 1 para. 2 and Fig. 2, the image is converted into a frequency domain),
[Image: media_image4.png]
wherein a number of the second plurality of channels is greater than a number of the first plurality of channels (pg. 1740 col. 1 para. 1 and Fig. 2, when converted to the frequency domain, the number of the channels of the image increases);
[Image: media_image5.png]
remove at least one channel comprising irrelevant information from among the second plurality of channels (pg. 1741 col. 1 para. 1 and Fig. 2, a subset of channels is selected. Therefore, the non-selected channels are understood to be removed. The selected channels are "salient". Therefore, the removed channels are understood as unimpactful or comprising irrelevant information)
[Image: media_image6.png]
using an Artificial Intelligence (AI) engine (pg. 1741 col. 1 para. 1, the channel selection is "learning-based", which is understood as using an AI engine)
[Image: media_image7.png]
to generate a low-resolution image frame in the non-spatial domain (pg. 1741 col. 1 para. 1 and fig. 2, the non-salient channels are removed which causes the input data size to be reduced. This is understood as generating a low-resolution image)
[Image: media_image8.png]
and provide the low-resolution image frame to a neural network for an inference of the image frame (pg. 1739 col. 2 para. 2, after the task of image compression, the preserved channels are input into an AI accelerator, which may be a neural network, for inference. See also fig. 1(b) for an example of inputting the output of pre-processing into a neural network),
[Image: media_image9.png]
[Image: media_image10.png]
wherein a generic stub layer is embedded at an input of the neural network (pg. 1740 col. 2 para. 2 and Fig. 3, figure 3 shows that the pre-processed input is embedded such that it may be input directly into the neural network. The block "DCT: 56X56X64" is understood as the generic stub layer. Alternatively, the combination of the three dashed boxes may also be interpreted as the generic stub layer in a way nearly identical to the applicant’s specification) for compatibility of the neural network in receiving the low-resolution image frame (pg. 1740 col. 2 para. 2, "Then we adjust the channel size of the next layer to match the number of channels in the frequency domain", showing that the generic stub layer is compatible with the frequency domain), wherein the generic stub layer bypasses input layers of the neural network that are relevant for the image frame in the spatial domain (pg. 1740 col. 2 para. 2 and Fig. 3, "Since the input feature maps in the frequency domain are smaller in the H and W dimensions but larger in the C dimension than the spatial-domain counterpart, we skip the input layer of a conventional CNN model," therefore, the generic stub layer skips, or bypasses, input layers of the neural network),
[Image: media_image11.png]
[Image: media_image12.png]
and wherein the image frame inferencing engine is further configured to: determine, using the AI engine (pg. 1741 col. 1 para. 1, the dynamic gate module is understood as an AI engine as it makes selections, pg. 1741 col. 1 para. 1, contains at least two trainable parameters, pg. 1741 col. 2 para. 1-3, and is trained by loss functions. A person of ordinary skill in the art would recognize this as an AI engine for these reasons), a binary value corresponding to each of the second plurality of channels (pg. 1741 col. 1 para. 1, "dynamic gate module that assigns a binary score to each frequency channel.");
filter channels based on an application of the binary value to the second plurality of channels to reduce a number of channels (pg. 1741 col. 1 para. 1, "The salient channels are rated as one, the others as zero. The input frequency channels with zero scores are detached from the network. Thus, the input data size is reduced," The channels with zero scores being detached is understood as filtering);
and generate the low-resolution image frame in the non-spatial domain using the filtered channels (pg. 1740 col. 2 para. 1 and fig. 3, the input channels go from 224 X 224 channels with 3 layers to 56 X 56 channels with 64 layers which is understood as reducing the number of channels. The new version of the image, 56 X 56 X 64, is understood as a new input image frame in the non-spatial domain. Pg. 1740 col. 2 para. 2 and fig. 3, the new input image frame is input to a model. Therefore, this is performed at an input stage. Finally, pg. 1740 col. 2 para. 3 and fig. 2, "As discussed in Section 3.3, the majority of the frequency channels can be pruned without sacrificing accuracy. The frequency channel pruning operation is referred to as DCT channel select in Figure 2." This is understood to show that the low resolution image frame uses the filtered channels).
Qin does not expressly disclose a device comprising a memory, a processor, and an image inferencing engine operably connected to the memory and processor.
Xu discloses:
An electronic device comprising: memory ([0028] the system includes a computer readable storage medium);
at least one processor comprising processing circuitry ([0028] the system includes at least one general-purpose processor. A general purpose processor is understood to include processing circuitry);
and an image frame inferencing engine comprising image processing circuitry ([0073]-[0074] the image is input into a neural network for inference calculations. The neural network is understood as an inferencing engine), operably coupled to the memory and the processor ([0040] the steps of process 200, such as inputting into the neural network of [0073]-[0074], may be performed by the general purpose and special purpose processors. [0029] the general purpose and special purpose processors are connected to each other and the memory by a bus. The edited text reflects the way the claim is being interpreted by the examiner per the explanation in the rejection of claim 7 under 35 U.S.C. 112(b))
Qin and Xu are combinable because they are from the same field of endeavor of resizing images for inference computations (Qin, pg. 1737 col. 2 para. 1; Xu, [0012]).
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the memory and processor of Xu with the method of Qin.
The motivation for doing so would have been that doing so is combining prior art elements (the method of Qin and the system of Xu) according to known methods (it is well known in the art, as demonstrated by Xu, to use a memory and a processor for performing image-based methods) to yield predictable results (the result of performing the method of Qin with the aid of the digital processing of Xu). Both the method of Qin and the system of Xu operate in combination performing the same function as they did separately: the method of Qin continues to remove irrelevant channels and the system of Xu continues to perform calculations for image processing. Further, while Qin does not expressly disclose a memory, a processor, and an image frame inferencing engine coupled to the memory and the processor, a person having ordinary skill in the art would understand based on the disclosure of Qin that such elements are in use for the described method and calculations.
Therefore, it would have been obvious to combine Xu with Qin to obtain the invention as specified in claim 7.
Regarding claim 8, Qin in view of Xu discloses the subject matter of claim 7. Qin further discloses:
perform a Discrete Cosine Transformation (DCT) or a Fourier transformation on the image frame for transforming the image frame from the spatial domain to the non-spatial domain (pg. 1740 col. 1 para. 2 and Fig. 2, the image is converted into a frequency domain by DCT).
[Image: media_image4.png]
Regarding claim 10, Qin in view of Xu discloses the subject matter of claim 7. Qin further discloses:
wherein the non-spatial domain comprises a Luminance, Red difference, Blue difference (Y, Cr, Cb) domain, a Hue, Saturation, Value (H, S, V) domain, or a Luminance, Chrominance (YUV) domain (pg. 1740 col. 1 para. 2, the images are transformed into the YCbCr color space).
[Image: media_image13.png]
Regarding claim 11, Qin in view of Xu discloses the subject matter of claim 7. Qin further discloses:
group components of the transformed image frame with a same frequency into a channel of the second plurality of channels (pg. 1740 col. 1 para. 2 and Fig. 2, the frequency domain coefficients at the same frequency are gathered into "cubes")
[Image: media_image14.png]
by preserving spatial position information of each component (pg. 1740 col. 2 para. 1, the coefficients are grouped while maintaining their spatial relations at each frequency).
[Image: media_image15.png]
Regarding claim 12, Qin in view of Xu discloses the subject matter of claim 7. Qin further discloses:
generate a tensor by performing a depth-wise convolution and average pool on each channel of the second plurality of channels (pg. 1741 col. 1 para. 2 and Fig. 4, a tensor is generated by performing convolution and average pooling on the collected channels);
[Image: media_image16.png]
[Image: media_image17.png]
add two trainable parameters with each component of the tensor (pg. 1741 col. 1 para. 2 and Fig. 4, each element in the tensor is multiplied by two trainable parameters which is understood as adding the parameters to the tensor);
[Image: media_image18.png]
determine values of the two trainable parameters using the AI engine (pg. 1741 col. 1 para. 3, the trainable parameters, "two numbers" in this paragraph, are determined and output by the gate module. Pg. 1741 col. 1 para. 1, the gate module is learning based and dynamic, therefore it is understood as an AI engine);
[Image: media_image19.png]
[Image: media_image20.png]
determine a binary value of each component of the tensor based on the values of the two trainable parameters (pg. 1741 col. 1 para. 3, the trainable parameters are used to determine the probability for each number to be a binary value);
[Image: media_image21.png]
perform an elementwise product between the second plurality of channels and the binary value of the components of the tensor (pg. 1741 col. 1 para. 3, each channel is multiplied, by a 1X1 convolution layer understood as elementwise multiplication, by 0 or 1 depending on the value of the trainable parameters);
[Image: media_image22.png]
filter at least one channel without a zero value among the second plurality of channels upon performing the elementwise product (pg. 1741 col. 1 para. 2 and Fig. 4, the process outputs a tensor with some channels off and other channels on. The “on” channels are understood as being filtered without a zero value. See Tensor 5, which has channels that are not canceled);
[Image: media_image23.png]
[Image: media_image17.png]
and generate the low-resolution image frame in the nonspatial domain using the at least one filtered channel (pg. 1741 col. 1 para. 1 and Fig. 4, the above process is the "channel selection" step. pg. 1740 col. 1 para. 2 and fig. 2, the selected channels from the selection step are concatenated which is understood as a low-resolution image frame being generated using the at least one filtered channel).
[Image: media_image24.png]
[Image: media_image25.png]
[Image: media_image26.png]
Regarding claim 13, Qin discloses:
receive an image frame (pg. 1740 col. 1 para. 2 and Fig. 2, the image is received, as seen by the image being input into the flow diagram and being resized);
[Image: media_image2.png]
[Image: media_image3.png]
transform the image frame from a spatial domain comprising a first plurality of channels to a non-spatial domain comprising a second plurality of channels (pg. 1740 col. 1 para. 2 and Fig. 2, the image is converted into a frequency domain),
[Image: media_image4.png]
wherein a number of the second plurality of channels is greater than a number of the first plurality of channels (pg. 1740 col. 1 para. 1 and Fig. 2, when converted to the frequency domain, the number of the channels of the image increases);
[Image: media_image5.png]
remove at least one channel comprising irrelevant information from among the second plurality of channels (pg. 1741 col. 1 para. 1 and Fig. 2, a subset of channels is selected. Therefore, the non-selected channels are understood to be removed. The selected channels are "salient". Therefore, the removed channels are understood as unimpactful or comprising irrelevant information)
[Image: media_image6.png]
using an Artificial Intelligence (AI) engine (pg. 1741 col. 1 para. 1, the channel selection is "learning-based", which is understood as using an AI engine)
[Image: media_image7.png]
to generate a low-resolution image frame in the non-spatial domain (pg. 1741 col. 1 para. 1 and fig. 2, the non-salient channels are removed which causes the input data size to be reduced. This is understood as generating a low-resolution image)
and provide the low-resolution image frame to a neural network for an inference of the image frame (pg. 1739 col. 2 para. 2, after the task of image compression, the preserved channels are input into an AI accelerator, which may be a neural network, for inference. See also fig. 1(b) for an example of inputting the output of pre-processing into a neural network).
wherein a generic stub layer is embedded at an input of the neural network (pg. 1740 col. 2 para. 2 and Fig. 3, figure 3 shows that the pre-processed input is embedded such that it may be input directly into the neural network. The block "DCT: 56X56X64" is understood as the generic stub layer. Alternatively, the combination of the three dashed boxes may also be interpreted as the generic stub layer in a way nearly identical to the applicant’s specification) for compatibility of the neural network in receiving the low-resolution image frame (pg. 1740 col. 2 para. 2, "Then we adjust the channel size of the next layer to match the number of channels in the frequency domain", showing that the generic stub layer is compatible with the frequency domain), wherein the generic stub layer bypasses input layers of the neural network that are relevant for the image frame in the spatial domain (pg. 1740 col. 2 para. 2 and Fig. 3, "Since the input feature maps in the frequency domain are smaller in the H and W dimensions but larger in the C dimension than the spatial-domain counterpart, we skip the input layer of a conventional CNN model," therefore, the generic stub layer skips, or bypasses, input layers of the neural network),
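The cited stub-layer concept can be sketched as follows. This is a minimal illustration assuming a 1x1 channel projection; the specific channel counts and the projection weights are hypothetical, not Qin's exact architecture:

```python
import numpy as np

def stub_layer(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    # Generic stub embedded at the network input: it accepts the
    # frequency-domain tensor directly, bypassing the conventional spatial
    # stem (which expects e.g. 224 x 224 x 3), and adjusts the channel size
    # with a 1x1 projection to match the next layer of the model.
    return x @ w   # (56, 56, C_freq) @ (C_freq, C_next) -> (56, 56, C_next)

freq = np.random.rand(56, 56, 64)   # frequency-domain input: smaller H/W, more channels
w = np.random.rand(64, 256)         # hypothetical next-layer channel count
out = stub_layer(freq, w)
print(out.shape)                    # (56, 56, 256)
```

As in the cited passage, the spatial input layer is skipped entirely and only the channel dimension is adjusted for compatibility with the remainder of the network.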
and determine, using the AI engine (pg. 1741 col. 1 para. 1, the dynamic gate module is understood as an AI engine as it makes selections, pg. 1741 col. 1 para. 1, contains at least two trainable parameters, pg. 1741 col. 2 para. 1-3, and is trained by loss functions. A person of ordinary skill in the art would recognize this as an AI engine for these reasons), a binary value corresponding to each of the second plurality of channels (pg. 1741 col. 1 para. 1, "dynamic gate module that assigns a binary score to each frequency channel."),
filter channels based on an application of the binary value to the second plurality of channels to reduce a number of channels (pg. 1741 col. 1 para. 1, "The salient channels are rated as one, the others as zero. The input frequency channels with zero scores are detached from the network. Thus, the input data size is reduced," The channels with zero scores being detached is understood as filtering),
and generate the low-resolution image frame in the non-spatial domain using the filtered channels (pg. 1740 col. 2 para. 1 and Fig. 3, the input goes from 224 X 224 with 3 channels to 56 X 56 with 64 channels, which is understood as reducing the number of channels. The new version of the image, 56 X 56 X 64, is understood as a new image frame in the non-spatial domain. Pg. 1740 col. 2 para. 2 and Fig. 3, this frame is input to a model; therefore, the generation is performed at an input stage. Finally, pg. 1740 col. 2 para. 3 and Fig. 2, "As discussed in Section 3.3, the majority of the frequency channels can be pruned without sacrificing accuracy. The frequency channel pruning operation is referred to as DCT channel select in Figure 2." This is understood to show that the low-resolution image frame is generated using the filtered channels).
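The cited gate-and-prune behavior can be sketched as follows. This is a minimal illustration of applying a per-channel binary score and detaching zero-score channels; the keep-count of 16 is an arbitrary assumption, not a value from Qin:

```python
import numpy as np

def gate_and_prune(freq: np.ndarray, scores: np.ndarray) -> np.ndarray:
    # freq: (H, W, C) frequency-domain tensor; scores: (C,) binary scores.
    kept = np.flatnonzero(scores)   # channels rated 1 (salient) are kept
    return freq[:, :, kept]         # zero-score channels are detached

freq = np.random.rand(56, 56, 64)
scores = np.zeros(64, dtype=int)
scores[:16] = 1                     # suppose the gate keeps 16 salient channels
reduced = gate_and_prune(freq, scores)
print(reduced.shape)                # (56, 56, 16): input data size is reduced
```

Detaching the zero-score channels reduces the channel dimension, which is the input-data-size reduction the citation relies on.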
Qin does not disclose expressly a device comprising a non-transitory computer-readable medium storing instructions executed by a processor.
Xu discloses:
At least one non-transitory computer-readable memory ([0029] the memory of the device may be separate embodiments, including non-transitory memory) storing instructions ([0028] instructions are stored on the memory) that, when executed by at least one processor, perform the function ([0028] stored instructions may be performed by at least one processor)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the memory and processor of Xu with the method of Qin.
The motivation for doing so would have been that doing so is combining prior art elements (the method of Qin and the system of Xu) according to known methods (it is well known in the art, as demonstrated by Xu, to use a memory and a processor for performing image-based methods) to yield predictable results (the result of performing the method of Qin with the aid of the digital processing of Xu). The method of Qin and the system of Xu operate in combination performing the same functions as they did separately: the method of Qin continues to remove irrelevant channels, and the system of Xu continues to perform calculations for image processing. Further, while Qin does not expressly disclose a memory and a processor, a person having ordinary skill in the art would understand, based on the disclosure of Qin, that such elements are in use for the described method and calculations.
Therefore, it would have been obvious to combine Xu with Qin to obtain the invention as specified in claim 13.
Regarding claim 14, Qin in view of Xu discloses the subject matter of claim 13. Qin does not disclose that an electronic device comprises the non-transitory computer-readable memory of claim 13.
Xu discloses:
An electronic device comprising the at least one non-transitory computer-readable memory of claim 13 ([0028] and Fig. 1, the electronic device includes the memory).
Regarding claim 17, Qin in view of Xu discloses the subject matter of claim 7. Qin further discloses:
wherein the removing is based on operations on the second plurality of channels (pg. 1741 col. 1 para. 3, each channel is multiplied by a 1X1 convolution layer, understood as elementwise multiplication, which is understood as an operation) performed using values determined from trainable parameters (pg. 1741 col. 1 para. 3, each channel is multiplied, via the 1X1 convolution layer, by 0 or 1 depending on the values of the trainable parameters; therefore, the values are based on the trainable parameters), determined by an Artificial Intelligence (AI) engine (pg. 1741 col. 1 para. 3, the trainable parameters, the "two numbers" in this paragraph, are determined and output by the gate module. Pg. 1741 col. 1 para. 1, the gate module is learning-based and dynamic; therefore, it is understood as an AI engine),
associated with a tensor (pg. 1741 col. 1 para. 2, the trainable parameters are associated with tensor 4. Tensor 4 is from tensor 3 which is from tensor 2. The included edits are interpreted by the examiner per the explanation in the rejection under 35 U.S.C. 112(a)) generated from the second plurality of channels (pg. 1741 col. 1 para. 2, tensor 2 is generated from tensor 1 which is understood as the second plurality of channels. The included edits are interpreted by the examiner per the explanation in the rejection under 35 U.S.C. 112(a)).
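The cited "two numbers" mechanism can be sketched as follows. This is a hedged illustration only: the rule that the larger of the two numbers decides the binary score, and the weight values shown, are assumptions rather than Qin's exact formulation:

```python
import numpy as np

def dynamic_gate(features: np.ndarray, w: np.ndarray) -> np.ndarray:
    # features: (C, H, W); w: (C, 2) trainable "two numbers" per channel.
    # The larger of the two numbers decides the binary score (1 = keep, 0 = drop).
    scores = (w[:, 1] > w[:, 0]).astype(features.dtype)   # (C,) binary scores
    return features * scores[:, None, None]               # elementwise multiply by 0 or 1

feats = np.ones((4, 2, 2))
w = np.array([[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.6, 0.4]])
gated = dynamic_gate(feats, w)
print(gated[:, 0, 0])   # [0. 1. 1. 0.]: channels 1 and 2 survive
```

The multiplication by 0 or 1, driven by the trainable per-channel parameters, is the operation the citation maps to the claimed removal.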
Regarding claim 19, Qin in view of Xu discloses the subject matter of claim 13. Qin further discloses:
wherein the removing is based on operations on the second plurality of channels (pg. 1741 col. 1 para. 3, each channel is multiplied by a 1X1 convolution layer, understood as elementwise multiplication, which is understood as an operation) performed using values determined from trainable parameters (pg. 1741 col. 1 para. 3, each channel is multiplied, via the 1X1 convolution layer, by 0 or 1 depending on the values of the trainable parameters; therefore, the values are based on the trainable parameters), determined by an Artificial Intelligence (AI) engine (pg. 1741 col. 1 para. 3, the trainable parameters, the "two numbers" in this paragraph, are determined and output by the gate module. Pg. 1741 col. 1 para. 1, the gate module is learning-based and dynamic; therefore, it is understood as an AI engine),
associated with a tensor (pg. 1741 col. 1 para. 2, the trainable parameters are associated with tensor 4. Tensor 4 is from tensor 3 which is from tensor 2. Edits interpreted by the examiner per the explanation in the rejection under 35 U.S.C. 112(a)) generated from the second plurality of channels (pg. 1741 col. 1 para. 2, tensor 2 is generated from tensor 1 which is understood as the second plurality of channels. Edits interpreted by the examiner per the explanation in the rejection under 35 U.S.C. 112(a)).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Shao-Yuan et al. ("Exploring Semantic Segmentation on the DCT Representation", full reference on PTO-892 included with this action) discloses a system which performs Discrete Cosine Transformation (DCT) on an image and inputs the frequency image into a neural network to perform semantic segmentation in the spatial domain. The system removes high frequency components without impacting accuracy.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOSHUA B CROCKETT whose telephone number is (571)270-7989. The examiner can normally be reached Monday-Thursday 8am-5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, John M Villecco can be reached on (571) 272-7319. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JOSHUA B. CROCKETT/Examiner, Art Unit 2661
/JOHN VILLECCO/Supervisory Patent Examiner, Art Unit 2661