Prosecution Insights
Last updated: April 19, 2026
Application No. 18/294,134

TARGET DETECTION OPTIMIZATION METHOD AND DEVICE

Status: Non-Final OA (§101, §103)
Filed: Jan 31, 2024
Examiner: ALLEN, LUCIUS CAMERON GREE
Art Unit: 2673
Tech Center: 2600 (Communications)
Assignee: BOE TECHNOLOGY GROUP CO., LTD.
OA Round: 1 (Non-Final)

Grant Probability: 71% (Favorable)
Estimated OA Rounds: 1-2
Estimated Time to Grant: 3y 0m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 71% (27 granted / 38 resolved; +9.1% vs TC average, above average)
Interview Lift: +39.3% (strong; based on resolved cases with an interview)
Avg Prosecution: 3y 0m (typical timeline)
Currently Pending: 20
Total Applications: 58 (across all art units)

Statute-Specific Performance

Statute   Rate     vs TC avg (est.)
§101      12.6%    -27.4%
§103      53.7%    +13.7%
§102       8.5%    -31.5%
§112      23.7%    -16.3%

Based on career data from 38 resolved cases; Tech Center averages are estimates.

Office Action

Grounds: §101, §103
DETAILED ACTION

Notice of AIA Status
The present application is being examined under the first inventor to file provisions of the AIA.

Priority
Receipt is acknowledged of certified copies of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 07/17/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Claim Objections
Claims 1, 7, and 14 are objected to because of the following informalities: in claim 1 (lines 4 and 5), claim 7 (line 2), and claim 14 (line 3), the term "the target detection model" should be changed to "the trained target detection model" to correct the typographical issue and avoid a clarity rejection under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim 18, along with its dependent claims, is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. Claim 18 is drawn to a computer storage medium that may include additional digital storage devices comprising transitory and non-transitory memory, as defined in paragraph [0202] of the specification ("the present disclosure can take the form of an entire hardware embodiment, an entire software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein."). The medium is thus explicitly defined to encompass both transitory and non-transitory embodiments and can be software or a signal; it therefore fails to fall within at least one of the four categories of patent-eligible subject matter. See MPEP 2106. The Office respectfully recommends that the applicant amend the claim 18 limitation "a computer storage medium" to recite "a non-transitory computer storage medium".

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 9-12, and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over An et al. (US 20230131518 A1), hereafter An, in view of Ran et al. (US 20210150170 A1), hereafter Ran, and Muralidharan et al. (US 20210065052 A1), hereafter Muralidharan.

Regarding claim 1, An teaches an optimized target detection method (Fig. 1, Paragraph [0006]: a model generation method and apparatus, an object detection method and apparatus, a device, and a storage medium that increase model detection speed through model compression), comprising: inputting an image comprising an object into a trained target detection model for detection (Fig. 1, Paragraph [0013]: the to-be-detected image is input into the object detection model, and an object detection result of a to-be-detected object in the to-be-detected image is obtained according to an output result of the object detection model), and determining coordinates of the object in the image and a category of the object (Fig. 1, Paragraph [0039]: the sample annotation result may be a category probability and position coordinates); and the target detection model is obtained by: training using a first training sample set to obtain a to-be-optimized model (Fig. 1, Paragraph [0039]: after the original detection model is trained based on the multiple training samples, the initially-trained intermediate detection model may be obtained, and each training sample may include the sample image and the sample annotation result of the known object in the sample image), pruning model parameters in the to-be-optimized model using an optimal pruning scheme (Fig. 4A, Paragraph [0088]: the to-be-pruned channel screening unit screens, from multiple channels of multiple convolution layers of the intermediate detection model, an output channel of the current convolution layer corresponding to the current pruning coefficient and an input channel of the next convolution layer, and regards the output channel and the input channel as the to-be-pruned channel), and training the pruned to-be-optimized model using a second training sample set (Fig. 4A, Paragraph [0094-95]: the channel pruning unit performs channel pruning on the to-be-pruned channel to obtain a pruning detection model, and the fine-tuning training unit performs fine-tuning training on the pruning detection model to generate the object detection model).

An fails to explicitly teach wherein the target detection model comprises a plurality of depthwise convolutional network layers. However, Ran discloses this limitation (Fig. 10, Paragraph [0136]: to further reduce the size of the entire target detection model and make it applicable to the terminal, some convolutional layers in the first residual block are replaced with depthwise convolutional layers to reduce the size of the residual network and improve its processing speed while ensuring recognition accuracy). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify An's pruned object detection network with Ran's depthwise convolutional network layers. The motivation would have been to obtain a faster and smaller system: both An and Ran use machine learning for object detection, An's system increases the speed of the model, and Ran's depthwise layers further improve speed and reduce network size. See An, Paragraph [0006] and Ran, Paragraph [0136].

An in view of Ran fails to explicitly teach wherein the optimal pruning scheme is obtained by screening pruning schemes determined according to different pruning methods and pruning rates. However, Muralidharan discloses this limitation (Fig. 2, Paragraph [0030]: compression scheme 240 includes a type of compression to be applied to produce compressed machine learning model 226, selected from a list that includes quantization of parameters (e.g., from 32-bit floating point to 16-bit floating point), unstructured magnitude-based pruning of weights in a DNN, pruning of neurons, pruning of blocks of various dimensions, and/or low-rank tensor factorization; Fig. 2, Paragraph [0031]: Bayesian optimizer 122 and compression engine 124 perform iterations 202-204 that sample different sparsity ratios 218 for producing compressed model 226 from reference model 222 using compression scheme 240; during each iteration, the Bayesian optimizer selects a sparsity ratio and the compression engine produces the compressed model in a way that adheres to the selected sparsity ratio). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify An in view of Ran with Muralidharan's screening of pruning schemes by pruning method and pruning rate. The motivation would have been to improve runtime performance, since both An and Muralidharan compress machine learning models. See An, Paragraph [0006] and Muralidharan, Paragraph [0023].
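To make the claim 1 flow concrete, here is a minimal, hypothetical sketch of the train, prune, and fine-tune pipeline described above. It is an illustration under assumptions, not code from An, Ran, or Muralidharan: the TinyDetector model, the random stand-in sample sets, and the magnitude-pruning helper are all invented for this example.

```python
import torch
import torch.nn as nn

class TinyDetector(nn.Module):
    # Toy detector: backbone features -> box coordinates + category logits.
    def __init__(self, num_classes=3):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1, groups=16), nn.ReLU(),  # depthwise
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.box_head = nn.Linear(16, 4)           # (x1, y1, x2, y2)
        self.cls_head = nn.Linear(16, num_classes)

    def forward(self, x):
        feats = self.backbone(x)
        return self.box_head(feats), self.cls_head(feats)

def train(model, images, boxes, labels, epochs=3):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        pred_boxes, logits = model(images)
        loss = (nn.functional.smooth_l1_loss(pred_boxes, boxes)
                + nn.functional.cross_entropy(logits, labels))
        opt.zero_grad()
        loss.backward()
        opt.step()

def prune_magnitude(model, rate):
    # Unstructured magnitude pruning: zero the smallest-|w| fraction per conv.
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            w = m.weight.data
            k = int(w.numel() * rate)
            if k:
                thresh = w.abs().flatten().kthvalue(k).values
                w.mul_((w.abs() > thresh).float())

# First sample set -> to-be-optimized model; prune; second set -> fine-tune.
first = (torch.randn(8, 3, 32, 32), torch.rand(8, 4), torch.randint(0, 3, (8,)))
second = (torch.randn(4, 3, 32, 32), torch.rand(4, 4), torch.randint(0, 3, (4,)))
model = TinyDetector()
train(model, *first)
prune_magnitude(model, rate=0.5)
train(model, *second, epochs=1)   # fine-tune the pruned model
box, logits = model(torch.randn(1, 3, 32, 32))
print("coords:", box[0].tolist(), "category:", int(logits.argmax()))
```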
Regarding claim 9, An in view of Ran and Muralidharan teaches the method according to claim 1. An further teaches structured pruning (Fig. 3, Paragraph [0076]: in the case where part of the convolution layers and the BN layers are retained, the object detection method with structured pruning can achieve a relatively noticeable compression effect while the detection result mAP (the average of the five subsets) has only a slight decrease). An in view of Ran fails to explicitly teach wherein the pruning method comprises at least one of block pruning or unstructured pruning. However, Muralidharan explicitly teaches both (Fig. 2, Paragraph [0030]: compression scheme 240 may be selected from a list that includes quantization of parameters, unstructured magnitude-based pruning of weights in a DNN, pruning of neurons, pruning of blocks of various dimensions, and/or low-rank tensor factorization). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify An in view of Ran with Muralidharan's block and unstructured pruning methods, for the same runtime-performance motivation given for claim 1. See An, Paragraph [0006] and Muralidharan, Paragraph [0023].
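A small sketch of the two pruning methods named in claim 9, unstructured pruning and block pruning, expressed as weight masks. This is a generic illustration of the techniques, not Muralidharan's implementation; the mask helpers and the 8x8 toy weight matrix are assumptions.

```python
import numpy as np

def unstructured_mask(w, rate):
    # Zero the smallest-magnitude fraction `rate` of individual weights.
    k = int(w.size * rate)
    thresh = np.sort(np.abs(w), axis=None)[k - 1] if k else -np.inf
    return np.abs(w) > thresh

def block_mask(w, rate, block=(4, 4)):
    # Score 2-D blocks by mean |w| and zero the lowest-scoring fraction.
    bh, bw = block
    h, w_ = w.shape
    scores = np.abs(w).reshape(h // bh, bh, w_ // bw, bw).mean(axis=(1, 3))
    k = int(scores.size * rate)
    cut = np.sort(scores, axis=None)[k - 1] if k else -np.inf
    keep = scores > cut
    return keep.repeat(bh, axis=0).repeat(bw, axis=1)

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))
print("unstructured kept:", unstructured_mask(w, 0.5).mean())
print("block kept:", block_mask(w, 0.5).mean())
```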
Wherein having An’s system for creating a pruned network for object detection wherein the pruning method comprises at least one of block pruning or unstructured pruning. The motivation behind the modification would have been to allow for improvement to runtime performance, since both An and Muralidharan are systems that compress machine learning models. Wherein An’s system wherein increased the speed of the machine learning model, while Muralidharan’s system wherein further improved runtime performance. Please see An et al. (US 20230131518 A1), Paragraph [0006] and Muralidharan et al. (US 20210065052 A1) Paragraph [0023]. Regarding claim 10, An in view of Ran and Muralidharan teaches the method according to claim 1, An further teaches wherein the pruning the model parameters in the to-be-optimized model using the optimal pruning scheme, comprises: pruning model parameters corresponding to at least one network layer in the to-be- optimized model using the optimal pruning scheme (Fig. 4, Paragraph [0096]- An discloses the model generation module may screen the to-be-pruned channel corresponding to the to-be-pruned coefficient from the multiple channels of the intermediate detection model and perform the channel pruning on the to-be-pruned channel to generate the object detection model.). Regarding claim 11, An in view of Ran and Muralidharan teaches the method according to claim 1, An in view of Ran fails to explicitly teach wherein the optimal pruning scheme is determined by: determining pruning schemes based on different pruning methods and pruning rates evaluating performance of each to-be-optimized model corresponding to each pruning scheme separately according to Bayesian optimization, and obtaining evaluation performance of each to-be-optimized model; determining the optimal pruning scheme corresponding to optimal evaluation performance from obtained evaluation performance. However, Muralidharan explicitly teaches wherein the optimal pruning scheme is determined by: determining pruning schemes based on different pruning methods and pruning rates (Fig. 2, Paragraph [0030]- Muralidharan discloses compression scheme 240 includes a type of compression to be applied to produce compressed machine learning model 226. For example, a user selects compression scheme 240 from a list that includes, but is not limited to, quantization of parameters (e.g., from 32-bit floating point to 16-bit floating point), unstructured magnitude-based pruning of weights in a DNN, pruning of neurons in the DNN, pruning of blocks of various dimensions in the DNN, and/or low-rank tensor factorization. Further in Fig. 2, Paragraph [0031]- Muralidharan discloses Bayesian optimizer 122 and compression engine 124 perform a number of iterations 202-204 that sample different sparsity ratios 218 for producing compressed machine learning model 226 from reference machine learning model 222 using compression scheme 240. During each iteration, Bayesian optimizer 122 selects a sparsity ratio for compressed machine learning model 226, and compression engine 124 uses compression scheme 240 to produce compressed machine learning model 226 from reference machine learning model 222 in a way that adheres to the selected sparsity ratio.), evaluating performance of each to-be-optimized model corresponding to each pruning scheme separately according to Bayesian optimization, and obtaining evaluation performance of each to-be-optimized model (Fig. 
2, Paragraph [0031]- Muralidharan discloses during each iteration, Bayesian optimizer 122 selects a sparsity ratio for compressed machine learning model 226, and compression engine 124 uses compression scheme 240 to produce compressed machine learning model 226 from reference machine learning model 222 in a way that adheres to the selected sparsity ratio. Compression engine 124 also measures an accuracy 236, performance 238, and/or another value related to the execution or output of compressed machine learning model 226 and provides the measurement to Bayesian optimizer 122.); determining the optimal pruning scheme corresponding to optimal evaluation performance from obtained evaluation performance (Fig. 3, Paragraph [0056]- Muralidharan discloses after a number of iterations 202, Bayesian optimizer 122 identifies sample 318 as having a sparsity ratio that produces a compressed machine learning model 226 with accuracy 236 that is equal to accuracy loss limit 206. In turn, Bayesian optimizer 122 performs additional iterations 204 that sample sparsity ratios 220 within search space 224 that is bounded by the sparsity ratio of sample 318 to select another sparsity ratio for producing compressed machine learning model 226 that optimizes a user-specified objective function 208 while meeting accuracy loss limit 206.). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention was made to combine the teachings of An in view of Ran of an optimized target detection method, comprising: inputting an image comprising an object into a trained target detection model for detection with the teachings of Muralidharan wherein the optimal pruning scheme is determined by: determining pruning schemes based on different pruning methods and pruning rates evaluating performance of each to-be-optimized model corresponding to each pruning scheme separately according to Bayesian optimization, and obtaining evaluation performance of each to-be-optimized model; determining the optimal pruning scheme corresponding to optimal evaluation performance from obtained evaluation performance. Wherein having An’s system for creating a pruned network for object detection wherein the optimal pruning scheme is determined by: determining pruning schemes based on different pruning methods and pruning rates evaluating performance of each to-be-optimized model corresponding to each pruning scheme separately according to Bayesian optimization, and obtaining evaluation performance of each to-be-optimized model; determining the optimal pruning scheme corresponding to optimal evaluation performance from obtained evaluation performance. The motivation behind the modification would have been to allow for improvement to runtime performance, since both An and Muralidharan are systems that compress machine learning models. Wherein An’s system wherein increased the speed of the machine learning model, while Muralidharan’s system wherein further improved runtime performance. Please see An et al. (US 20230131518 A1), Paragraph [0006] and Muralidharan et al. (US 20210065052 A1) Paragraph [0023]. 
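A minimal sketch of the scheme-screening idea in claim 11: enumerate (pruning method, pruning rate) pairs, evaluate each candidate, and keep the best. The evaluate function here is entirely synthetic; in practice it would prune a copy of the to-be-optimized model and measure validation mAP, and Muralidharan's system drives the sampling with a Bayesian optimizer rather than an exhaustive grid.

```python
import numpy as np

def evaluate(method, rate, rng):
    # Hypothetical stand-in for "prune a model copy, then measure mAP".
    base = {"unstructured": 0.72, "block": 0.70}[method]
    return base - 0.3 * rate ** 2 + rng.normal(scale=0.01)

def screen_schemes(methods, rates, seed=0):
    # Evaluate every (method, rate) pruning scheme; keep the best performer.
    rng = np.random.default_rng(seed)
    results = {(m, r): evaluate(m, r, rng) for m in methods for r in rates}
    best = max(results, key=results.get)
    return best, results[best]

best, perf = screen_schemes(["unstructured", "block"], [0.3, 0.5, 0.7])
print("optimal scheme:", best, "evaluated performance:", round(perf, 3))
```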
Regarding claim 12, An in view of Ran and Muralidharan teaches the method according to claim 11. An in view of Ran fails to explicitly teach wherein the evaluating of the performance of each to-be-optimized model corresponding to each pruning scheme separately according to Bayesian optimization comprises: initially evaluating the performance of each to-be-optimized model corresponding to each pruning scheme according to Bayesian optimization, and obtaining initial evaluation performance of each to-be-optimized model; screening each pruning scheme according to a preset number of iterations and a degree of influence, on performance, of a gradient of mean values of a Gaussian process obeyed by each to-be-optimized model, and reevaluating the performance of each to-be-optimized model corresponding to each screened pruning scheme; and determining the evaluation performance of each to-be-optimized model according to the evaluation performance corresponding to each pruning scheme obtained after a last iteration is completed. However, Muralidharan explicitly teaches these steps (Fig. 2, Paragraph [0031]: during each iteration, Bayesian optimizer 122 selects a sparsity ratio, compression engine 124 produces compressed model 226 in a way that adheres to the selected ratio, and the engine measures an accuracy 236, performance 238, and/or another value related to the execution or output of the compressed model and provides the measurement to the Bayesian optimizer; Fig. 2, Paragraph [0042]: Bayesian optimizer 122 utilizes a Gaussian Process (GP) prior distribution for a given black-box function f(x), which can include accuracy 236 and/or objective function 208; the GP is a distribution over functions specified by a mean function m: X → R and a covariance function K: X × X → R, and as observations D_{1:t} = {x_{1:t}, y_{1:t}} accumulate, the optimizer combines the prior P(f) with the likelihood P(D_{1:t} | f) to produce the posterior P(f | D_{1:t}) ∝ P(D_{1:t} | f) P(f); Fig. 4, Paragraph [0060]: the Bayesian optimizer performs a second series of iterations that use a GP-UCB (or GP-LCB) sampling criterion to sample sparsity ratios that maximize (or minimize) the objective function, the sampled ratios ranging from 0 to the first sparsity ratio). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify An in view of Ran with Muralidharan's Bayesian evaluation and iterative screening of pruning schemes, for the same runtime-performance motivation given for claim 1. See An, Paragraph [0006] and Muralidharan, Paragraph [0023].
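For the Bayesian machinery in claims 11-12, here is a self-contained numpy sketch of GP-UCB sampling over a single sparsity-ratio dimension, ending with the gradient of the GP posterior mean (the quantity claim 12's screening step refers to). The objective function is synthetic and the RBF kernel and its length scale are assumptions; this is not Muralidharan's implementation.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel between two 1-D point sets.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-4):
    # Standard GP regression: posterior mean and variance at query points.
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_obs, x_query)
    Kinv = np.linalg.inv(K)
    mean = Ks.T @ Kinv @ y_obs
    var = np.diag(rbf(x_query, x_query) - Ks.T @ Kinv @ Ks)
    return mean, np.maximum(var, 0)

def objective(rate):
    # Hypothetical accuracy-vs-sparsity trade-off (illustration only).
    return 0.75 - 0.4 * rate ** 2 + 0.05 * np.sin(8 * rate)

grid = np.linspace(0.0, 0.9, 91)
x_obs = np.array([0.1, 0.8])
y_obs = objective(x_obs)
for _ in range(10):                     # preset number of iterations
    mean, var = gp_posterior(x_obs, y_obs, grid)
    ucb = mean + 2.0 * np.sqrt(var)     # GP-UCB acquisition criterion
    x_next = grid[np.argmax(ucb)]
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, objective(x_next))
mean, _ = gp_posterior(x_obs, y_obs, grid)
grad = np.gradient(mean, grid)          # gradient of the GP mean values
print("best rate:", grid[np.argmax(mean)], "max |mean gradient|:", np.abs(grad).max())
```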
Regarding claim 17, An in view of Ran and Muralidharan teaches the method according to claim 1. An further teaches an optimized target detection device comprising a processor and a memory, the memory configured to store a program executable by the processor, and the processor configured to read the program in the memory and perform steps of the method according to claim 1 (Fig. 7, Paragraph [0021]: the device may include at least one processor and a memory configured to store the at least one program, and the at least one program, when executed by the at least one processor, causes the at least one processor to perform the model generation method or the object detection method according to any embodiment of the present application).

Regarding claim 18, An in view of Ran and Muralidharan teaches the method according to claim 1. An further teaches a computer storage medium storing a computer program thereon, wherein the computer program, when executed by a processor, implements steps of the method according to claim 1 (Fig. 8, Paragraph [0022]: embodiments further provide a computer-readable storage medium storing a computer program that, when executed by a processor, implements the model generation method or the object detection method according to any embodiment of the present application).

Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over An in view of Ran and Muralidharan, and further in view of Coban et al. (US 20220086463 A1), hereafter Coban.

Regarding claim 2, An in view of Ran and Muralidharan teaches the method according to claim 1 but fails to explicitly teach wherein, before the inputting of the image comprising the object into the trained target detection model for detection, the method further comprises: decoding an obtained video stream comprising the object to obtain frames of images comprising the object in three-channel RGB format, or performing format conversion on an obtained unprocessed image comprising the object to obtain an image comprising the object in RGB format. However, Coban explicitly teaches these steps (Fig. 4, Paragraph [0114]: a machine-learning based video coding system where frames of the video coding data are divided among standardized channel inputs, e.g., the three channels of an RGB format frame; Fig. 5, Paragraph [0117]: with a video format where channels have different characteristics, either a separate copy is needed for each type of channel or a format conversion precedes the encoding and follows the decoding; for YUV format data, the data could be converted to RGB data prior to the initial encoding layer 511 and converted back to YUV data following the final operations 524). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify An in view of Ran and Muralidharan with Coban's RGB decoding and format conversion. The motivation would have been to improve efficiency by standardizing input information, since both An and Coban use machine learning models for detection. See An, Paragraph [0006] and Coban, Paragraph [0006].
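Claim 2's preprocessing maps to routine OpenCV calls; a short sketch assuming OpenCV is available (note that OpenCV decodes to BGR, so an explicit conversion to RGB is needed):

```python
import cv2

def frames_as_rgb(video_path):
    # Decode a video stream into frames and convert each to 3-channel RGB.
    cap = cv2.VideoCapture(video_path)
    try:
        while True:
            ok, frame_bgr = cap.read()   # OpenCV decodes frames as BGR
            if not ok:
                break
            yield cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    finally:
        cap.release()

def image_as_rgb(image_path):
    # Format-convert a single unprocessed image to RGB.
    bgr = cv2.imread(image_path, cv2.IMREAD_COLOR)
    return cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
```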
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over An in view of Ran and Muralidharan, and further in view of Miao et al. (US 20210166058 A1), hereafter Miao.

Regarding claim 3, An in view of Ran and Muralidharan teaches the method according to claim 1 but fails to explicitly teach wherein, before the inputting of the image comprising the object into the trained target detection model for detection, the method further comprises: under a condition of maintaining an original ratio of the image, normalizing a size of the image to obtain an image of a preset size. However, Miao explicitly teaches this step (Fig. 1, Paragraph [0017]: the acquired original images may have differences in format, size, or image quality; therefore, after acquiring the plurality of original images, it is necessary to normalize each original image, and the image database can be created using the normalized original images). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify An in view of Ran and Muralidharan with Miao's size normalization. The motivation would have been to improve the normalization of input data, since both An and Miao use machine learning models for detection. See An, Paragraph [0006] and Miao, Paragraph [0017-20 and 0031].

Claims 4-5 are rejected under 35 U.S.C. 103 as being unpatentable over An in view of Ran and Muralidharan, and further in view of Sun et al. (US 20170206431 A1), hereafter Sun.

Regarding claim 4, An in view of Ran and Muralidharan teaches the method according to claim 1. An further teaches inputting the image comprising the object into the trained target detection model for detection (Fig. 3, Paragraph [0073]: the to-be-detected image is input into the object detection model, and an object detection result of a to-be-detected object in the to-be-detected image is obtained according to an output result of the object detection model), and obtaining coordinates of each candidate frame of the object in the image and a confidence degree of a category corresponding to each candidate frame (Fig. 1, Paragraph [0039]: the sample image may be a frame of an image, a video sequence, etc., and the sample annotation result may be a category probability and position coordinates). An in view of Ran and Muralidharan fails to explicitly teach screening out each preferred candidate frame with a confidence degree greater than a threshold from the candidate frames, determining the coordinates of the object in the image according to coordinates of each preferred candidate frame, and determining the category of the object according to a category corresponding to each preferred candidate frame. However, Sun explicitly teaches these steps (Fig. 4, Paragraph [0070]: if the objectness score is above a threshold, the RPN can determine that an object or portion thereof is located at a particular point; Fig. 5, Paragraph [0086]: the REG conv layer 516 can calculate four coordinate values (t_x, t_y, t_w, t_h) for each anchor, and can shift the center and the height and width (scale) of each of the anchors 508 so that the anchors most effectively cover the object; Fig. 3, Paragraph [0063]: the proposal classifier can generate a confidence score associated with each object category, based on a similarity between the object in the proposal and an object associated with a pre-defined object category). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify An in view of Ran and Muralidharan with Sun's confidence-based screening of candidate frames. The motivation would have been to improve region proposal, since both An and Sun use machine learning models for detection; An's system increases the speed of the model, while Sun's improves the detection and classification of images. See An, Paragraph [0006] and Sun, Paragraph [0116].
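A small sketch of claims 3 and 4 combined: aspect-ratio-preserving ("letterbox") size normalization to a preset input size, plus screening candidate frames by a confidence threshold. The pad value and top-left placement are illustrative assumptions.

```python
import numpy as np
import cv2

def letterbox(img, size=640, pad_value=114):
    # Normalize image size while maintaining the original aspect ratio:
    # scale the long side to `size`, then pad the remainder (top-left anchored).
    h, w = img.shape[:2]
    scale = size / max(h, w)
    resized = cv2.resize(img, (int(round(w * scale)), int(round(h * scale))))
    canvas = np.full((size, size, 3), pad_value, dtype=img.dtype)
    canvas[:resized.shape[0], :resized.shape[1]] = resized
    return canvas, scale

def filter_candidates(boxes, scores, threshold=0.5):
    # Keep only "preferred" candidate frames whose confidence exceeds the threshold.
    keep = scores > threshold
    return boxes[keep], scores[keep]

img = np.zeros((480, 640, 3), dtype=np.uint8)
padded, scale = letterbox(img)
print(padded.shape, scale)
```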
Wherein having An’s system for creating a pruned network for object detection wherein screening out each preferred candidate frame with a confidence degree greater than a threshold from candidate frames; determining the coordinates of the object in the image according to coordinates of each preferred candidate frame, and determining the category of the object according to a category corresponding to each preferred candidate frame. The motivation behind the modification would have been to allow for improvement to region proposal, since both An and Sun are systems that use machine learning models for detection. Wherein An’s system wherein increased the speed of the machine learning model, while Sun’s system wherein improved the detection and classification of images. Please see An et al. (US 20230131518 A1), Paragraph [0006] and Sun et al. (US 20170206431 A1) Paragraph [0116]. Regarding claim 5, An in view of Ran, Muralidharan, and Sun teaches the method according to claim 4, An in view of Ran and Muralidharan fails to explicitly teach wherein the determining the coordinates of the object in the image according to the coordinates of each preferred candidate frame, and determining the category of the object according to the category corresponding to each preferred candidate frame, comprises: screening out an optimal candidate frame from preferred candidate frames according to a Non-Maximum Suppression, NMS, method; determining coordinates of the optimal candidate frame as the coordinates of the object in the image, and determining a category corresponding to the optimal candidate frame as the category of the object. However, Sun explicitly teaches wherein the determining the coordinates of the object in the image according to the coordinates of each preferred candidate frame, and determining the category of the object according to the category corresponding to each preferred candidate frame, comprises: screening out an optimal candidate frame from preferred candidate frames according to a Non-Maximum Suppression, NMS, method (Fig. 5, Paragraph [0087]- Sun discloses the RPN 500 component can use non-maximum suppression to determine the anchors that can provide the basis for each proposal. In such examples, the RPN 500 component can take the anchor with the highest objectness score (e.g., highest IoU overlap ratio) at a particular point, and suppress all other anchors at the particular point.); determining coordinates of the optimal candidate frame as the coordinates of the object in the image (Fig. 5, Paragraph [0087]- Sun discloses the RPN 500 component can use non-maximum suppression to determine the anchors that can provide the basis for each proposal. In such examples, the RPN 500 component can take the anchor with the highest objectness score (e.g., highest IoU overlap ratio) at a particular point, and suppress all other anchors at the particular point.), and determining a category corresponding to the optimal candidate frame as the category of the object (Fig. 5, Paragraph [0087]- Sun discloses the RPN 500 component can use non-maximum suppression to determine the anchors that can provide the basis for each proposal. In such examples, the RPN 500 component can take the anchor with the highest objectness score (e.g., highest IoU overlap ratio) at a particular point, and suppress all other anchors at the particular point. Additionally, the component of the RPN 500 can suppress other anchors that overlap significantly with the highest scoring anchors, but have lower objectness scores. 
Thus, the RPN 500 component can reduce the number of anchors that are used to determine proposals and the number of anchors that are output to the proposal classifier.). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention was made to combine the teachings of An in view of Ran and Muralidharan of an optimized target detection method, comprising: inputting an image comprising an object into a trained target detection model for detection with the teachings of Sun wherein the determining the coordinates of the object in the image according to the coordinates of each preferred candidate frame, and determining the category of the object according to the category corresponding to each preferred candidate frame, comprises: screening out an optimal candidate frame from preferred candidate frames according to a Non-Maximum Suppression, NMS, method; determining coordinates of the optimal candidate frame as the coordinates of the object in the image, and determining a category corresponding to the optimal candidate frame as the category of the object. Wherein having An’s system for creating a pruned network for object detection wherein the determining the coordinates of the object in the image according to the coordinates of each preferred candidate frame, and determining the category of the object according to the category corresponding to each preferred candidate frame, comprises: screening out an optimal candidate frame from preferred candidate frames according to a Non-Maximum Suppression, NMS, method; determining coordinates of the optimal candidate frame as the coordinates of the object in the image, and determining a category corresponding to the optimal candidate frame as the category of the object. The motivation behind the modification would have been to allow for improvement to region proposal, since both An and Sun are systems that use machine learning models for detection. Wherein An’s system wherein increased the speed of the machine learning model, while Sun’s system wherein improved the detection and classification of images. Please see An et al. (US 20230131518 A1), Paragraph [0006] and Sun et al. (US 20170206431 A1) Paragraph [0116]. Claim 6 is rejected under 35 U.S.C 103 as being unpatentable over An et al. (US 20230131518 A1) hereafter referenced as An in view of Ran et al. (US 20210150170 A1) hereafter referenced as Ran, Muralidharan et al. (US 20210065052 A1) hereafter referenced as Muralidharan, Sun et al. (US 20170206431 A1) hereafter referenced as Sun, and Chadha et al. (US 20220083819 A1) hereafter referenced as Chadha. Regarding claim 6, An in view of Ran and Muralidharan teaches the method according to claim 5, An in view of Ran, Muralidharan, and Sun fails to explicitly teach wherein, when the image comprising the object is normalized in size and then input into the trained target detection model for detection, before the determining the coordinates of the optimal candidate frame as the coordinates of the object in the image, the method further comprises: converting the coordinates of the optimal candidate frame into a coordinate system of the image before normalization, and determining coordinates obtained after conversion as the coordinates of the optimal candidate frame. 
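For claims 5 and 6, a minimal numpy sketch of Non-Maximum Suppression followed by converting the surviving box back into the coordinate system of the image before normalization. The conversion assumes top-left letterbox padding as in the earlier sketch, so a simple division by the scale suffices.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    # Keep the highest-scoring box; suppress boxes that overlap it too much.
    order = scores.argsort()[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = ((boxes[order[1:], 2] - boxes[order[1:], 0])
                 * (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + areas - inter + 1e-9)
        order = order[1:][iou <= iou_thresh]
    return keep

def to_original_coords(box, scale):
    # Undo the letterbox scaling (valid for top-left padding, no offset).
    return box / scale

boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [80, 80, 120, 120]], float)
scores = np.array([0.9, 0.8, 0.7])
best = nms(boxes, scores)[0]
print("optimal frame in original image:", to_original_coords(boxes[best], scale=0.5))
```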
Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over An in view of Ran and Muralidharan, and further in view of Hassan et al. (US 20220398405 A1), hereafter Hassan, and Zhou et al. ("Rethinking Bottleneck Structure for Efficient Mobile Network Design"), hereafter Zhou.

Regarding claim 7, An in view of Ran and Muralidharan teaches the method according to claim 1. An further teaches that the backbone network is configured to extract features of the image (Fig. 1, Paragraph [0051]: the backbone network may be used for extracting a feature map). An in view of Ran and Muralidharan fails to explicitly teach wherein the target detection model comprises a backbone network, a neck network, and a head network; the neck network is configured to perform feature fusion on features extracted by the backbone network to obtain a fused feature map; and the head network is configured to detect an object in the fused feature map to obtain coordinates of the object in the image and a category of the object. However, Hassan explicitly teaches this architecture (Hassan discloses a backbone network comprising a convolutional neural network (CNN), e.g., an EfficientNet, and a feature pyramid network (FPN), e.g., a bi-directional FPN, coupled to the CNN; the FPN 206 receives a plurality of detected features from the backbone network 204 and repeatedly applies top-down and bottom-up bidirectional feature fusion; the fused features generated by FPN 206 are then supplied to one or more downstream prediction heads 208a-208n for prediction or classification, such as the pose estimation prediction head 400, which determines pose parameters, e.g., X and Y 2D coordinates of body landmarks, for each feature identified by the backbone network and FPN). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify An in view of Ran and Muralidharan with Hassan's backbone, neck, and head architecture. The motivation would have been to achieve higher classification accuracy, since both An and Hassan use machine learning models for object detection. See An, Paragraph [0006] and Hassan, Paragraph [0029-30].

An in view of Ran, Muralidharan, and Hassan fails to explicitly teach wherein the backbone network comprises a plurality of depthwise convolutional network layers and a plurality of unit convolutional network layers, the depthwise convolutional network layers being symmetrically distributed at the head and tail of the backbone network and the unit convolutional network layers being distributed in the middle of the backbone network. However, Zhou explicitly teaches this arrangement (Fig. 3(b), Section 3.2, Paragraph [0007]: regarding the positions of the pointwise convolutions, instead of directly putting the depthwise convolution between the two pointwise convolutions, Zhou proposes adding depthwise convolutions at the ends of the residual path as shown in Figure 3(b), which shows depthwise convolutional layers at the head and tail of the network with two unit (pointwise) convolutions in the middle). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify An in view of Ran, Muralidharan, and Hassan with Zhou's placement of depthwise and unit convolutions. The motivation would have been to achieve higher classification accuracy, since both An and Zhou use machine learning models for object detection. See An, Paragraph [0006] and Zhou, Section 4.2 [0003-4].
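A toy PyTorch sketch of the claim 7 topology: depthwise convolutions at the head and tail of the backbone with unit (1x1 pointwise) convolutions in the middle, feeding a neck stand-in and a detection head. Layer sizes and the single-layer "neck" are illustrative assumptions, not the architecture of Hassan or Zhou.

```python
import torch
import torch.nn as nn

def dw(c):
    # Depthwise 3x3 convolution: groups equals the channel count.
    return nn.Sequential(nn.Conv2d(c, c, 3, padding=1, groups=c), nn.ReLU())

def pw(cin, cout):
    # Unit (1x1 pointwise) convolution.
    return nn.Sequential(nn.Conv2d(cin, cout, 1), nn.ReLU())

class Detector(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        # Depthwise layers at head and tail, unit convolutions in the middle.
        self.backbone = nn.Sequential(dw(3), pw(3, 32), pw(32, 32), dw(32))
        self.neck = pw(32, 32)   # stand-in for feature fusion (e.g., an FPN)
        self.head = nn.Conv2d(32, 4 + num_classes, 1)  # box coords + class scores

    def forward(self, x):
        fused = self.neck(self.backbone(x))
        out = self.head(fused)
        return out[:, :4], out[:, 4:]  # per-location coordinates, categories

boxes, classes = Detector()(torch.randn(1, 3, 64, 64))
print(boxes.shape, classes.shape)
```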
Wherein An’s system wherein increased the speed of the machine learning model, while Hassan’s system wherein improved the accuracy of classification. Please see An et al. (US 20230131518 A1), Paragraph [0006] and Hassan et al. (US 20220398405 A1) Paragraph [0029-30]. An in view of Ran, Muralidharan, and Hassan fails to explicitly teach, the backbone network comprises a plurality of depthwise convolutional network layers and a plurality of unit convolutional network layers, wherein the depthwise convolutional network layers are symmetrically distributed at head and tail of the backbone network, the unit convolutional network layers are distributed in middle of the backbone network. However, Zhou explicitly teaches the backbone network comprises a plurality of depthwise convolutional network layers and a plurality of unit convolutional network layers (Fig. 3(b), Section 3.2 Paragraph [0007]- Zhou discloses Regarding the positions of the pointwise convolutions, instead of directly putting the depthwise convolution between the two pointwise convolutions, we propose to add depthwise convolutions at the ends of the residual path as shown in Figure 3(b). (wherein fig. 3(b), shows both unit convolutional and depthwise convolutional layers)), wherein the depthwise convolutional network layers are symmetrically distributed at head and tail of the backbone network (Fig. 3(b), Section 3.2 Paragraph [0007]- Zhou discloses Regarding the positions of the pointwise convolutions, instead of directly putting the depthwise convolution between the two pointwise convolutions, we propose to add depthwise convolutions at the ends of the residual path as shown in Figure 3(b). (wherein fig. 3(b), shows both a depthwise convolutional layer at the head and tail of the network)), the unit convolutional network layers are distributed in middle of the backbone network (Fig. 3(b), Section 3.2 Paragraph [0007]- Zhou discloses Regarding the positions of the pointwise convolutions, instead of directly putting the depthwise convolution between the two pointwise convolutions, we propose to add depthwise convolutions at the ends of the residual path as shown in Figure 3(b). (wherein fig. 3(b), shows 2 unit convolutional networks in the middle of the 2 depthwise layers making the backbone network)); Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention was made to combine the teachings of An in view of Ran, Muralidharan, and Sun of an optimized target detection method, comprising: inputting an image comprising an object into a trained target detection model for detection with the teachings of Zhou the backbone network comprises a plurality of depthwise convolutional network layers and a plurality of unit convolutional network layers, wherein the depthwise convolutional network layers are symmetrically distributed at head and tail of the backbone network, the unit convolutional network layers are distributed in middle of the backbone network. Wherein having An’s system for creating a pruned network for object detection wherein the backbone network comprises a plurality of depthwise convolutional network layers and a plurality of unit convolutional network layers, wherein the depthwise convolutional network layers are symmetrically distributed at head and tail of the backbone network, the unit convolutional network layers are distributed in middle of the backbone network. 
The motivation behind the modification would have been to allow for higher classification accuracy, since both An and Zhou use machine learning models for object detection: An's system increases the speed of the machine learning model, while Zhou's improves the accuracy of classification. Please see An et al. (US 20230131518 A1), Paragraph [0006] and Zhou et al. (Rethinking Bottleneck Structure for Efficient Mobile Network Design), Section 4.2, Paragraphs [0003-0004]. Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over An et al. (US 20230131518 A1), hereafter referenced as An, in view of Ran et al. (US 20210150170 A1), hereafter referenced as Ran, Muralidharan et al. (US 20210065052 A1), hereafter referenced as Muralidharan, and Kim et al. (US 20200356860 A1), hereafter referenced as Kim. Regarding claim 8, An in view of Ran and Muralidharan teaches the method according to claim 1, but fails to explicitly teach wherein a data volume of training samples in the second training sample set is smaller than a data volume of training samples in the first training sample set. However, Kim explicitly teaches this limitation (Kim discloses the accuracy or sensitivity of an entire artificial neural network may be first analyzed and then pruning may be performed thereon using a post-training method; thus, an artificial neural network may be quickly trained using a smaller amount of data than a re-training method of the related art.). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of An in view of Ran and Muralidharan of an optimized target detection method comprising inputting an image comprising an object into a trained target detection model for detection with the teachings of Kim of a second training sample set whose data volume is smaller than that of the first training sample set. The motivation behind the modification would have been to allow for faster training of the neural network while improving accuracy, since both An and Kim are systems that prune neural networks: An's system increases the speed of the machine learning model, while Kim's improves the accuracy of the neural network and how quickly it is trained. Please see An et al. (US 20230131518 A1), Paragraph [0006] and Kim et al. (US 20200356860 A1), Paragraphs [0099-0100].
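Claim 8's limitation, a second (fine-tuning) sample set smaller than the first, corresponds to the familiar prune-then-recover recipe Kim invokes. A minimal sketch using PyTorch's built-in magnitude pruning follows; `first_loader` and `second_loader` are hypothetical DataLoaders, with the second assumed to hold far fewer samples.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

def magnitude_prune(model: nn.Module, amount: float = 0.5) -> None:
    """Zero out the smallest-magnitude weights in every conv/linear layer."""
    for module in model.modules():
        if isinstance(module, (nn.Conv2d, nn.Linear)):
            prune.l1_unstructured(module, name="weight", amount=amount)

def fine_tune(model: nn.Module, loader, epochs: int = 1, lr: float = 1e-4) -> None:
    """Short recovery pass after pruning. The loader is assumed to hold
    far fewer samples than the set used for the initial training."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            opt.zero_grad()
            loss_fn(model(images), labels).backward()
            opt.step()

# Hypothetical usage: `first_loader` (large) drives the initial training
# elsewhere; `second_loader` (much smaller) drives the post-pruning pass.
# magnitude_prune(model, amount=0.5)
# fine_tune(model, second_loader)
```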
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over An et al. (US 20230131518 A1), hereafter referenced as An, in view of Ran et al. (US 20210150170 A1), hereafter referenced as Ran, Muralidharan et al. (US 20210065052 A1), hereafter referenced as Muralidharan, Ding et al. (CN 110717574 A), hereafter referenced as Ding, and Wu et al. (US 20200210228 A1), hereafter referenced as Wu. Regarding claim 14, An in view of Ran and Muralidharan teaches the method according to claim 1, but fails to explicitly teach further comprising: determining a calculation amount of each network layer in the target detection model. However, Ding discloses determining a calculation amount of each network layer in the target detection model (Fig. 1, Page 10, Paragraph [0001]- Ding discloses the layer splitting module 730 is configured to obtain the calculation amount of each to-be-run layer through analysis and to determine whether the calculation amount of each to-be-run layer is greater than a preset threshold; for a to-be-run layer whose calculation amount is greater than the preset threshold, the module splits that layer into multiple sub-layers.). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of An in view of Ran and Muralidharan of an optimized target detection method comprising inputting an image comprising an object into a trained target detection model for detection with the teachings of Ding of determining a calculation amount of each network layer in the target detection model. The motivation behind the modification would have been to allow for improved operating efficiency of the neural network, since both An and Ding are systems that use neural networks: An's system increases the speed of the machine learning model, while Ding's improves the operating efficiency of the neural network. Please see An et al. (US 20230131518 A1), Paragraph [0006] and Ding et al. (CN 110717574 A), Page 1, Paragraph [0003]. An in view of Ran, Muralidharan, and Ding fails to explicitly teach using a Graphics Processing Unit (GPU) to process data of the network layer with the calculation amount higher than a data threshold, and using a Central Processing Unit (CPU) to process data of the network layer with the calculation amount not higher than the data threshold. However, Wu discloses using a GPU to process data of a layer whose calculation amount is higher than a threshold (Fig. 1, Paragraph [0030]- Wu discloses if the current utilization rate is equal to or above the utilization threshold, the scheduling module 104 assigns the particular application to the GPU processing queue.) and using a CPU to process data of a layer whose calculation amount is not higher than the threshold (Fig. 1, Paragraph [0030]- Wu discloses prior to comparing the CPU processing cost and the GPU processing cost for the particular application, the scheduling module 104 may determine whether a current utilization rate for the CPUs 130 is below a utilization threshold; if so, the scheduling module 104 may determine which processing queue to store the particular application in as described above (e.g., by comparing the CPU processing cost and the GPU processing cost).). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of An in view of Ran, Muralidharan, and Ding of an optimized target detection method with the teachings of Wu of using a GPU to process data of a network layer whose calculation amount is higher than a data threshold and a CPU to process data of a network layer whose calculation amount is not higher than the data threshold. The motivation behind the modification would have been to allow for improved efficiency and throughput, since both An and Wu are systems that use neural networks: An's system increases the speed of the machine learning model, while Wu's improves efficiency and throughput. Please see An et al. (US 20230131518 A1), Paragraph [0006] and Wu et al. (US 20200210228 A1), Paragraph [0018].
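Ding's per-layer calculation amount and Wu's threshold-based placement compose naturally, which the sketch below illustrates by estimating Conv2d multiply-accumulates and routing each layer by a threshold. It is an illustrative reading of the combination, not code from either reference; the layer names and the threshold value are made up.

```python
import torch.nn as nn

def conv_macs(layer: nn.Conv2d, out_h: int, out_w: int) -> int:
    """Rough multiply-accumulate count for one Conv2d forward pass,
    standing in for Ding's per-layer 'calculation amount'."""
    kh, kw = layer.kernel_size
    per_position = kh * kw * (layer.in_channels // layer.groups) * layer.out_channels
    return per_position * out_h * out_w

def placement_plan(costs: dict, threshold: int) -> dict:
    """Wu-style routing: layers above the threshold go to the GPU,
    the rest stay on the CPU."""
    return {name: ("cuda" if macs > threshold else "cpu")
            for name, macs in costs.items()}

# Hypothetical two-layer model and threshold.
costs = {
    "conv1": conv_macs(nn.Conv2d(3, 32, 3), 128, 128),  # ~14.2M MACs
    "conv2": conv_macs(nn.Conv2d(32, 64, 3), 64, 64),   # ~75.5M MACs
}
print(placement_plan(costs, threshold=20_000_000))  # conv1 -> cpu, conv2 -> cuda
```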
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over An et al. (US 20230131518 A1), hereafter referenced as An, in view of Ran et al. (US 20210150170 A1), hereafter referenced as Ran, Muralidharan et al. (US 20210065052 A1), hereafter referenced as Muralidharan, and Chen et al. (US 20240046504 A1), hereafter referenced as Chen. Regarding claim 15, An in view of Ran and Muralidharan teaches the method according to claim 1, but fails to explicitly teach wherein, after the determining the coordinates of the object in the image and the category of the object, the method further comprises: screening out an image comprising the object with a largest size from images in which the category belongs to a preset category; or screening out an image comprising the object with a highest definition from images in which the category belongs to a preset category and a size of the object is larger than a size threshold; or screening out an image comprising the object with a highest definition from images in which the category belongs to a preset category; or screening out an image comprising the object with a largest size from images in which the category belongs to a preset category and a definition of the object is greater than a definition threshold. However, Chen explicitly teaches screening out an image comprising the object with a largest size from images in which the category belongs to a preset category (Fig. 6, Paragraph [0126]- Chen discloses it is assumed that the first preset condition includes that the occupied area is the largest and that the first to-be-detected object belongs to the first preset category.); screening by highest definition (Fig. 6, Paragraph [0143]- Chen discloses the priority of the image is determined only by the definition of the photographed subject in the image; a higher definition of the photographed subject indicates a higher priority of the image.); and screening with a size or definition threshold (Fig. 7, Paragraph [0123]- Chen discloses the first preset condition may include at least one of the following: the definition is greater than or equal to a first preset threshold, the occupied area is greater than or equal to a second preset threshold, the first to-be-detected object belongs to the first preset category, and the region in which a focus point is located is in the region in which the first to-be-detected object is located. The occupied-area threshold corresponds to the claimed object size threshold.). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of An in view of Ran and Muralidharan of an optimized target detection method comprising inputting an image comprising an object into a trained target detection model for detection with the teachings of Chen of screening images, after determining the coordinates and category of the object, according to the claimed size, definition, and category conditions. The motivation behind the modification would have been to allow for improved detection accuracy by using the best image of the object, since both An and Chen are systems that use neural networks for object detection: An's system increases the speed of the machine learning model, while Chen's improves the authenticity and reliability of the images. Please see An et al. (US 20230131518 A1), Paragraph [0006] and Chen et al. (US 20240046504 A1), Paragraph [0042].
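The four claimed screening alternatives differ only in which attribute is ranked (size or definition) and which optional floor is applied, so a single helper can express all of them. A hedged sketch follows; `detections` and its keys are a hypothetical data shape, not Chen's data model.

```python
def screen_image(detections, preset_category, *, rank_by="size",
                 size_threshold=None, definition_threshold=None):
    """Pick one image per the claim-15 alternatives: filter to a preset
    category (plus an optional size or definition floor), then keep the
    image whose object has the largest size or the highest definition.
    `detections` is a list of dicts with hypothetical keys
    'image', 'category', 'size', and 'definition'."""
    pool = [d for d in detections if d["category"] == preset_category]
    if size_threshold is not None:
        pool = [d for d in pool if d["size"] > size_threshold]
    if definition_threshold is not None:
        pool = [d for d in pool if d["definition"] > definition_threshold]
    if not pool:
        return None
    return max(pool, key=lambda d: d[rank_by])["image"]

# Alternative 1: largest size within the preset category.
# best = screen_image(dets, "face", rank_by="size")
# Alternative 2: highest definition among objects above a size threshold.
# best = screen_image(dets, "face", rank_by="definition", size_threshold=100)
```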
Claim 16 is rejected under 35 U.S.C. 103 as being unpatentable over An et al. (US 20230131518 A1), hereafter referenced as An, in view of Ran et al. (US 20210150170 A1), hereafter referenced as Ran, Muralidharan et al. (US 20210065052 A1), hereafter referenced as Muralidharan, Chen et al. (US 20240046504 A1), hereafter referenced as Chen, and Liu et al. (US 20230298382 A1), hereafter referenced as Liu. Regarding claim 16, An in view of Ran, Muralidharan, and Chen teaches the method according to claim 15, but fails to explicitly teach further comprising: obtaining position information of each key point of the object in the screened image according to a preset key point; aligning the object in the screened image according to the position information; and extracting features from the aligned image to obtain features of the object. However, Liu explicitly teaches obtaining position information of each key point of the object in the screened image according to a preset key point (Fig. 2, Paragraph [0057]- Liu discloses S2 is extracting facial feature points: the facial feature point detection method in the dlib library is adopted to extract 68 feature points of the face from the facial picture data in S1, the 68 feature points corresponding to the eyes, eyebrows, nose, mouth, and facial contour, respectively, and the facial feature point sequence P.sup.(t) is obtained; Liu further discloses (x.sub.i.sup.(t), y.sub.i.sup.(t)) is the coordinate position of the i-th key point of the facial picture in the t-th video frame in the video sequence, 1≤i≤68.); aligning the object in the screened image according to the position information (Fig. 1, Paragraph [0049]- Liu discloses S102: correcting the facial picture in each video frame on the basis of location information of facial feature points, so that the facial picture in each video frame is aligned relative to the plane rectangular coordinate system.); and extracting features from the aligned image to obtain features of the object (Fig. 1, Paragraph [0050]- Liu discloses S103: inputting the aligned facial picture in each video frame of the video sequence into a residual neural network and extracting a spatial feature of a facial expression corresponding to the facial picture.). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of An in view of Ran, Muralidharan, and Chen of an optimized target detection method comprising inputting an image comprising an object into a trained target detection model for detection with the teachings of Liu of obtaining position information of each key point of the object in the screened image according to a preset key point, aligning the object according to the position information, and extracting features from the aligned image to obtain features of the object. The motivation behind the modification would have been to allow for improved recognition accuracy, since both An and Liu are systems that use neural networks for object detection: An's system increases the speed of the machine learning model, while Liu's improves the accuracy of recognition. Please see An et al. (US 20230131518 A1), Paragraph [0006] and Liu et al. (US 20230298382 A1), Paragraphs [0005 and 0039].
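Liu's keypoint-then-align step can be sketched with OpenCV, assuming dlib's 68-point ordering (indices 36-41 and 42-47 being the left and right eye contours). This is an illustrative toy, not Liu's pipeline, which additionally feeds the aligned frames into a residual network for feature extraction.

```python
import cv2
import numpy as np

def align_by_eyes(image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
    """Rotate the image so the line between the eye centers is horizontal.
    `landmarks` is assumed to be a (68, 2) array in dlib's ordering."""
    left_eye = landmarks[36:42].mean(axis=0)   # points 36-41
    right_eye = landmarks[42:48].mean(axis=0)  # points 42-47
    dx, dy = right_eye - left_eye
    angle = float(np.degrees(np.arctan2(dy, dx)))
    cx, cy = (left_eye + right_eye) / 2.0
    rot = cv2.getRotationMatrix2D((float(cx), float(cy)), angle, 1.0)
    h, w = image.shape[:2]
    return cv2.warpAffine(image, rot, (w, h))
```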
Allowable Subject Matter Claim 13, along with its dependent claims, is objected to as being dependent upon rejected base claim 1, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The following is a statement of reasons for the indication of allowable subject matter: Regarding claim 13, the prior art fails to explicitly teach screening pruning schemes by replacing the pruning scheme of the to-be-optimized model with the gradient probability greater than a first threshold with the pruning scheme of the to-be-optimized model with the gradient probability less than a second threshold, wherein the first threshold is greater than the second threshold, as claimed in claim 13.
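Since claim 13 carries the allowable subject matter, it may help to see one literal reading of its screening step in code. Everything here is an assumption: the claim fixes neither the data shape nor which low-gradient scheme is swapped in.

```python
def screen_pruning_schemes(schemes, first_threshold, second_threshold):
    """One literal reading of claim 13: every pruning scheme whose gradient
    probability exceeds the first threshold is replaced by a scheme whose
    gradient probability falls below the (smaller) second threshold.
    `schemes` is a hypothetical list of dicts with keys 'scheme' and
    'grad_prob'."""
    assert first_threshold > second_threshold
    low = [s for s in schemes if s["grad_prob"] < second_threshold]
    if not low:
        return list(schemes)  # no low-gradient scheme available to swap in
    replacement = min(low, key=lambda s: s["grad_prob"])
    return [replacement if s["grad_prob"] > first_threshold else s
            for s in schemes]
```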
Conclusion Listed below are the prior arts made of record and not relied upon, but considered pertinent to applicant's disclosure. Xie et al. (US 20190370658 A1)- A method of compressing a pre-trained deep neural network model includes inputting the pre-trained model as a candidate model. The candidate model is compressed by increasing its sparsity, removing at least one batch normalization layer, and quantizing all remaining weights into fixed-point representation. Accuracy of the compressed model is then determined using an end-user training and validation data set; compression is repeated when accuracy improves, hyperparameters are adjusted and compression is repeated when accuracy declines, and the compressed model is output for inference utilization when accuracy meets or exceeds the end-user performance metric and target. Please see Fig. 1, Abstract. Zhang et al. (US 20190205759 A1)- A method and apparatus for compressing a neural network includes: acquiring a to-be-compressed trained neural network; selecting at least one layer as a to-be-compressed layer; for each such layer, in descending order of layer level, determining a pruning ratio based on the total number of parameters in the layer, selecting parameters for pruning based on the pruning ratio and a parameter value threshold, and training the pruned network on a preset training sample using a machine learning method; and storing the resulting compressed neural network. Please see Fig. 1, Abstract.
Jiang et al. (US 20210406691 A1)- A method of multi-rate neural image compression includes selecting encoding masks based on a hyperparameter and performing a convolution of a first plurality of weights of a first neural network with the selected encoding masks to obtain first masked weights. The method further includes encoding an input image to obtain an encoded representation using the first masked weights, and encoding the obtained encoded representation to obtain a compressed representation. Please see Fig. 1, Abstract. Samek et al. (US 20220114455 A1)- Pruning and/or quantizing a machine learning predictor (e.g., a neural network) is rendered more efficient if the pruning and/or quantizing is performed using relevance scores determined for portions of the predictor on the basis of the activation of those portions in one or more inferences performed by the machine learning (ML) predictor. Please see Fig. 1, Abstract. Any inquiry concerning this communication or earlier communications from the examiner should be directed to LUCIUS C.G. ALLEN, whose telephone number is (703) 756-5987. The examiner can normally be reached Mon-Fri, 8 a.m. to 5 p.m. (EST). Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Chineyere Wills-Burns, can be reached at (571) 272-9752. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.
Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LUCIUS CAMERON GREEN ALLEN/
Examiner, Art Unit 2673

/CHINEYERE WILLS-BURNS/
Supervisory Patent Examiner, Art Unit 2673

Prosecution Timeline

Jan 31, 2024
Application Filed
Mar 17, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597105
SEMANTIC-AWARE AUTO WHITE BALANCE
2y 5m to grant Granted Apr 07, 2026
Patent 12579755
OVERLAYING AUGMENTED REALITY (AR) CONTENT WITHIN AN AR HEADSET COUPLED TO A MAGNIFYING LOUPE
2y 5m to grant Granted Mar 17, 2026
Patent 12541972
Computing Device and Method for Handling an Object in Recorded Images
2y 5m to grant Granted Feb 03, 2026
Patent 12536247
Roughness Compensation Method and System, Image Processing Device, and Readable Storage Medium
2y 5m to grant Granted Jan 27, 2026
Patent 12529684
INSPECTION DEVICE, INSPECTION METHOD, AND INSPECTION PROGRAM
2y 5m to grant Granted Jan 20, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
71%
Grant Probability
99%
With Interview (+39.3%)
3y 0m
Median Time to Grant
Low
PTA Risk
Based on 38 resolved cases by this examiner. Grant probability derived from career allow rate.
