Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
RESPONSE TO ARGUMENTS
35 USC 101 REJECTION
The examiner acknowledges the amendment of claims 1-3, 8-9, 13, 16 & 18-19 filed 12/22/2025. After carefully reviewing applicant's amendments, the 35 USC 101 guidance, and the claim limitations, the examiner respectfully disagrees.
Applicant submits that the amended claims recite a technical solution integrated into a practical application. Specifically, applicant submits that identifying and detecting a pedestrian behavior from a video stream is itself a practical application. Applicant also argues that the new CNN/channel-exchange time sequence limitations provide significantly more.
In response, the examiner applies the 35 USC 101 Alice test. Step 2A, Prong 1 inquires: “does the claim recite an abstract idea, law of nature, or natural phenomenon?” The examiner submits that applicant's new limitations of feature extraction, exchanging feature channels, and fusing time sequence information still fall under the classification of data manipulation and mathematical processing. “2D CNN”, “convolutional layer”, “feature map”, “feature channels”, and “classification feature vector” are implementation details of an algorithm.
In Step 2A, Prong 2, this step inquires: “does the claim recite additional elements that integrate the judicial exception into a practical application?” The examiner submits that the improvement asserted by applicant does not recite a technological improvement to computer functionality (or a particular machine or transformation) beyond generic machine learning processing. The examiner further submits that no practical application is present and that the claim is viewed as collecting data, applying an algorithm, and outputting a classification implemented in a security context. The examiner submits that the “2D CNN with convolutional layer, feature maps/channels, and channel exchange” is disclosed at a relatively high, functional level, without tying those steps to a particular improvement in computer technology (e.g., a specific memory architecture, a defined bandwidth-saving mechanism, a particular hardware pipeline, or a quantified, non-generic image processing improvement). As to applicant's asserted improvement in overhead/resource use, the claim language recites results (“feature data fusing time sequence information”, “identify behavior”) rather than a concrete mechanism that produces the asserted benefit.
In Step 2B, the critical inquiry is whether the claim recites additional elements that amount to “significantly more” than the judicial exception. The examiner submits that the CNN layers, FC layers, thresholding, feature maps, etc. are generic computer components and/or insignificant extra-solution activity. These generic computer components are used to execute the abstract idea of analyzing data and outputting a classification. The examiner also submits that “exchanging a part of the feature channels” is data manipulation/mathematical processing rather than an unconventional computer technique, as the sketch below illustrates.
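For illustration of this characterization only, the following minimal sketch (the examiner's own generic rendering, not applicant's disclosed implementation; the array shapes and shift pattern are assumptions) shows that exchanging a part of the feature channels across frames reduces to ordinary array indexing and copying:

```python
import numpy as np

# Hypothetical feature maps for T frames: shape (T, C, H, W).
T, C, H, W = 4, 8, 16, 16
feature_maps = np.random.rand(T, C, H, W)

# "Exchanging a part of the feature channels" reduces to slicing and
# reassignment: move the first C//4 channels of each frame's feature
# map to the next frame in the sequence.
k = C // 4
fused = feature_maps.copy()
fused[1:, :k] = feature_maps[:-1, :k]  # channels now carry prior-frame data

# The result "fuses time sequence information" purely by data movement.
```

Nothing in the operation above requires non-generic hardware or a particular machine; it is data manipulation of the kind routinely performed on a general-purpose computer.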
In view of the above 35 USC 101 Alice test analysis, the examiner submits the rejection is proper and is respectfully maintained.
PRIOR ART REJECTION
The examiner acknowledges the amendment of claims 1-3, 8-9, 13, 16 & 18-19 filed 12/22/2025. Applicant's arguments filed 12/22/2025 have been fully considered but are deemed moot in view of the new grounds of rejection. Due to the variation in claim scope via the amendments, a new ground of rejection is proper.
CLAIM REJECTIONS - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (an abstract idea) without significantly more, as determined under the subject matter eligibility test. Under the Subject Matter Eligibility Test for Products and Processes (Federal Register, Vol. 79, No. 241, dated Tuesday, December 16, 2014, page 74621), the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional device elements, which are recited at a high level of generality, provide conventional computer functions that do not add meaningful limits to practicing the abstract idea.
Claim 1
Step 1
This step inquires: “is the claim to a process, machine, manufacture, or composition of matter?” Yes.
Claim 1 recites a method, which is a process.
Step 2A - Prong 1
This step inquires: “does the claim recite an abstract idea, law of nature, or natural phenomenon?” This claim appears to be directed to an abstract idea.
The limitation of “acquiring data of a plurality of video image frames from a video stream; and detecting a pedestrian behavior in the video stream according to the data of the plurality of video image frames, wherein the detecting a pedestrian behavior in the video stream according to the data of the plurality of video image frames comprises at least: inputting the data of the plurality of video image frames into a two-dimensional convolutional neural network, and identifying the pedestrian behavior in the video stream according to the data of the plurality of video image frames and an association relationship between time sequences of the data of the plurality of video image frames, wherein the two-dimensional convolutional neural network comprises at least one convolutional layer, inputting the data of the plurality of video image frames into the two-dimensional convolutional neural network, and identifying the pedestrian behavior in the video stream according to the data of the plurality of video image frames and the association relationship between the time sequences of the data of the plurality of video image frames comprises: performing feature extraction on the data of the plurality of video image frames through the at least one convolutional layer, to obtain a plurality of feature maps corresponding to the data of the plurality of video image frames one by one, with each feature map comprising a plurality of feature channels: exchanging a part of the feature channels of the feature maps to obtain feature data fusing time sequence information of the data of the plurality of video image frames, with the time sequence information representing the association relationship between the time sequences of the data of the plurality of video image frames; and identifying the pedestrian behavior in the video stream according to the feature data.”, as drafted, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind (e.g. mathematical concepts, mental processes or certain methods of organizing human activity) but for the recitation of generic computer components.
STEP 2A – PRONG 1 - CONCLUSION
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A - Prong 2
This step inquires: “does the claim recite additional elements that integrate the judicial exception into a practical application?” The additional elements (generic computer components performing the data acquisition, CNN processing, and classification steps) are recited at a high level of generality. This judicial exception is not integrated into a practical application.
STEP 2A – PRONG 2 - CONCLUSION
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B
The critical inquiry here is whether the claim recites additional elements that amount to “significantly more” than the judicial exception. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. The claim is not patent eligible.
Dependent Claims
As to claim 2, this claim is directed to generic computer components (“CNN with at least one convolutional layer and at least one fully connected layer”), a mental process (“identifying the pedestrian behavior according to the feature data”) and insignificant extra-solution activity (“generic machine learning implementation”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 3, this claim is directed to generic computer components (“plurality of serially connected convolutional layers; feature maps/feature channels data structures”), a mental process (“selecting/deciding which channels to exchange; determining when it is the first/last layer and routing data”) and insignificant extra-solution activity (“generating feature maps; exchanging channels; passing intermediate results layer to layer”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 4, this claim is directed to generic computer components (“handling N sequential frames and N sequential feature maps; grouped feature channels”), a mental process (“choosing i and j and determining a corresponding map; selecting which groups to swap”) and insignificant extra-solution activity (“dividing channels into groups; exchanging groups; generic parameterization with integers N, i, j”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 5, this claim is directed to generic computer components (“fully connected layer producing a classification feature vector”), a mental process (“determining a classification probability; identifying…according to the classification probability”) and insignificant extra-solution activity (“computing and outputting a vector of scores”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 6, this claim is directed to generic computer components (“threshold parameter; comparison operation; selection logic”), a mental process (“determining whether the probability exceeds the threshold; determining that no target behavior is identified”) and insignificant extra-solution activity (“filtering by threshold; outputting the chosen label”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 7, this claim is directed to generic computer components (“same CNN; output data structures of the network”), a mental process (“detecting a spatial position…according to output data”) and insignificant extra-solution activity (“additional analysis of already generated outputs”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 8, this claim is directed to generic computer components (“classification feature vector plus feature maps of a target convolutional layer”), a mental process (“determining a spatial position…according to the maps and vector”) and insignificant extra-solution activity (“using intermediate tensors to localize”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 9, this claim is directed to generic computer components (“edge extraction algorithm; feature maps; classification vector”), a mental process (“determining an edge contour; determining the spatial position according to the edge contour”) and insignificant extra-solution activity (“deriving contours from score maps; using contours to report location”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 10, this claim is directed to generic computer components (“gradient/derivative computation; weight map; space prediction maps; resizing to frame size”), a mental process (“choosing the highest classification confidence; selecting which map to treat as second/third; deciding the edge contour based on the maps”) and insignificant extra-solution activity (“computing derivatives/weights, multiplying, upsampling/generating maps”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 11, this claim is directed to generic computer components (“drawing routine; frame buffer”) and insignificant extra-solution activity (“drawing the edge contour on frames”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 12, this claim is directed to generic computer components (“video generation buffer area; storage/export/I/O”) and insignificant extra-solution activity (“storing annotated frames; generating/exporting a video clip”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 13, this claim is directed to generic computer components (“check of whether a behavior is identified; buffer existence check”), a mental process (“determining whether any frame is identified; deciding to generate a video clip”) and insignificant extra-solution activity (“generating/exporting a clip when there was an overlay”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 14, this claim is directed to generic computer components (“foreground area calculation; momentum calculation; sampling/preprocessing pipeline”), a mental process (“determining…greater than a threshold; determining a starting point of sampling”) and insignificant extra-solution activity (“acquiring/sampling frames; preprocessing, data gathering, and filtering before analysis”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 15, this claim is directed to generic computer components (“processor; memory; I/O interface”) and insignificant extra-solution activity (“generic information interaction between the processor and memory, implementing the method on standard hardware”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 16, this claim is directed to generic computer components (“computer-readable storage medium”) and a mental process (“the same classification/threshold decisions as claim 1, encapsulated as instructions”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 17, this claim is directed to generic computer components (“same CNN outputs; using output data of the CNN to detect spatial position”), a mental process (“detecting/inferring spatial position from outputs”) and insignificant extra-solution activity (“additional analysis of already produced outputs”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 18, this claim is directed to generic computer components (“target convolutional layer; classification vector and feature maps”), a mental process (“determining a spatial position according to the maps and vectors”) and insignificant extra-solution activity (“localization post-processing”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 19, this claim is directed to generic computer components (“buffer check; clip generation/export I/O”), a mental process (“deciding whether any edge-overlay frame exists and whether to generate/export a clip”) and insignificant extra-solution activity (“generating and exporting a video clip”). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
As to claim 20, this claim is directed to generic computer components (“foreground segmentation/area calculation; motion momentum calculation; sampling logic”), a mental process (“threshold comparisons; choosing a sampling start; uniform sampling decision”) and insignificant extra-solution activity (“data collection/sampling and preprocessing of frames”; see the illustrative sketch below). Thus, this claim does not integrate the abstract idea into a practical application or constitute significantly more than the abstract idea.
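Consistent with the characterization of claims 14 and 20 as pre-solution data gathering and filtering, the following minimal sketch (the examiner's own generic rendering; all names, thresholds, and the foreground/momentum formulas are assumptions, not applicant's disclosure) shows such sampling logic as routine comparisons on generic hardware:

```python
import numpy as np

def sampling_start(frames, area_thresh=0.02, momentum_thresh=1.0):
    """Scan frames; compute a crude foreground-area ratio and a motion
    momentum (mean absolute frame difference); return the index of the
    first frame where both exceed their thresholds as the starting
    point of sampling."""
    prev = frames[0].astype(float)
    for idx, frame in enumerate(frames[1:], start=1):
        cur = frame.astype(float)
        diff = np.abs(cur - prev)
        foreground_ratio = (diff > 25).mean()  # fraction of changed pixels
        momentum = diff.mean()                 # crude motion momentum
        if foreground_ratio > area_thresh and momentum > momentum_thresh:
            return idx                         # sample uniformly from here
        prev = cur
    return 0
```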
CLAIM REJECTIONS - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-5, 13 & 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (U.S. Publication 2017/0262705) in view of Venkatesh (U.S. Publication 2021/0019633).
As to claim 1, Li discloses a behavior detection method, comprising: acquiring data of a plurality of video image frames from a video stream (702, Fig. 7 & [0065] disclose receiving a video stream; 710, Fig. 7 & [0068] disclose predicting an action label for each frame based on a prior hidden state plus attention and features; examples include actions like running/diving); and detecting a pedestrian behavior in the video stream according to the data of the plurality of video image frames, wherein the detecting a pedestrian behavior in the video stream according to the data of the plurality of video image frames
(708, Fig. 3B & [0049-0051] disclose calculating a feature map from an upper convolution layer of a CNN; Fig. 3B discloses a DCNN with “Feature Maps”; 710, Fig. 7 & [0068] disclose predicting an action label for each frame based on a previous hidden state plus attention and features; examples include actions like running/diving.)
Li further discloses performing feature extraction via convolution (e.g., feature extraction to classification; convolution to subsampling to fully connected layers; Figs. 3A-3B; Fig. 7 discloses receiving a video stream, calculating optical flow, generating an attention map, calculating a feature map, and predicting a label; [0017-0022, 0028-0031, 0034-0039] disclose that feature maps are obtained for frames over time; Figs. 5A/5B/6 and [0028-0031] discuss sequential feature maps/temporal sequence).
Li is silent to exchanging a part of the feature channels of the feature maps to obtain feature data fusing time sequence information (i.e. mixing temporal information inside early CNN layers rather than relying solely on RNN/attention).
However, Venkatesh discloses exchanging a part of the feature channels of the feature maps to obtain feature data fusing time sequence information (i.e., mixing temporal information inside early CNN layers rather than relying solely on RNN/attention). (Venkatesh discloses partitioning channels and shuffling/mixing channel groups between convolutions to exchange information across channel partitions; see “Channel Shuffle” 225, Fig. 2B; Fig. 2C, partitions across time and channels 221-223, with partitions 27/28/29 across time and arrows 234/235 indicating inter-partition movement; [0003-0007]; [0085-0092] disclose group convolution followed by a shuffle to exchange information.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Li's disclosure to include the above limitations in order to reduce RNN overhead, latency, and memory bandwidth while improving classification accuracy, and/or in order to fuse temporal associations inside the 2D CNN in a low-cost, hardware-friendly manner.
The examiner submits that, after the above modification, claim 1's “exchanging…to obtain feature data fusing time sequence information” is met by the channel shuffle/exchange across time/channel partitions, and “identifying the pedestrian behavior…according to the feature data” is met by Li's prediction/classification from the resulting features (710, Fig. 7, “Predict A Label”).
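As a non-limiting illustration of how the examiner reads Venkatesh's channel shuffle (a generic sketch assuming an (N, C, H, W) tensor layout; neither reference is limited to this code), the shuffle is a reshape/transpose that redistributes channel groups:

```python
import numpy as np

def channel_shuffle(x, groups):
    """Generic channel shuffle for grouped-convolution pipelines:
    reshape (N, C, H, W) into (N, g, C//g, H, W), swap the group and
    per-group channel axes, and flatten back, so each output group
    contains channels drawn from every input group."""
    n, c, h, w = x.shape
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(n, c, h, w)

# Treating the leading axis as T stacked video frames, the shuffle
# mixes channel partitions across frames before the next layer.
frames = np.random.rand(4, 8, 16, 16)  # (T, C, H, W)
mixed = channel_shuffle(frames, groups=4)
```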
As to claim 2, Li in view of Venkatesh discloses everything as disclosed in claim 1. In addition, Li discloses wherein the two-dimensional convolutional neural network further comprises at least one fully-connected layer, and the identifying the pedestrian behavior in the video stream according to the feature data comprises: identifying the pedestrian behavior in the video stream according to the feature data through the at least one fully-connected layer. (Li discloses identifying the behavior according to the feature data through at least one fully connected layer; see Feature Extraction to Classification with a fully connected layer; Fig. 3A, classification head 322 “output”; 710, Fig. 7, “Predict A Label”; [0034-0039].)
As to claim 3, Li in view of Venkatesh discloses everything as disclosed in claim 2. In addition, Li discloses multiple serial convolutional layers producing feature maps, passed forward for classification. (Figs. 3A/3B & [0036-0039])
The current embodiment of Li in view of Venkatesh is silent to exchanging a part of the feature channels to obtain “first data” and then conditionally routing the “first data” as input to a next convolutional layer.
However, Venkatesh discloses grouped convolutions followed by a channel shuffle/mixing across groups between successive convolutions (GConv1 -> shuffle -> GConv2), exchanging subsets of channels before feeding the next layer. See Fig. 2B (GConv1 221 -> 225 “Channel Shuffle” -> GConv2 223; groups 21-23, 24-26); see [0085-0092].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the current embodiment of Li in view of Venkatesh to include the above limitations in order to improve representational efficiency by mixing groups between serial convolutional layers to propagate cross-group (e.g., cross-frame) information while keeping compute/cache costs low.
As to claim 4, Li in view of Venkatesh discloses everything as disclosed in claim 3. In addition, Li discloses sequential feature maps over N frames (temporal sequence); see Figs. 5A/5B/6 & [0028-0031].
The current embodiment of Li in view of Venkatesh is silent to dividing channels into N groups and exchanging the i-th group of channels with a group in a j-th feature map.
However, Venkatesh discloses channel partitions (partitions 27-29/37-49) and channel re-assignment/shuffle across partitions, which maps to exchanging selected channel groups between feature maps/partitions. See Fig. 2C (partitions across time and channels), Fig. 2B (channel shuffle); see [0085-0092].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the current embodiment of Li in view of Venkatesh to include the above limitations in order to fuse temporal associations inside the 2D CNN in a low-cost, hardware-friendly manner (see Background of Invention and Summary).
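For clarity of the claim 4 mapping, a minimal sketch of an i-th/j-th group exchange follows (the examiner's illustrative rendering; the function and parameter names are hypothetical and do not appear in either reference):

```python
import numpy as np

# N sequential feature maps of shape (C, H, W), channels split into
# N groups per the claim's parameterization.
N, C, H, W = 4, 8, 16, 16
maps = np.random.rand(N, C, H, W)
group = C // N  # channels per group

def exchange(maps, i, j):
    """Swap the i-th channel group of the first feature map with the
    i-th channel group of the j-th feature map -- index arithmetic
    plus copying."""
    out = maps.copy()
    sl = slice(i * group, (i + 1) * group)
    out[0, sl], out[j, sl] = maps[j, sl].copy(), maps[0, sl].copy()
    return out

mixed = exchange(maps, i=1, j=2)
```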
As to claim 5, Li in view of Venkatesh discloses everything as disclosed in claim 2. In addition, Li discloses forming a classification output vector from CNN features via fully connected layers for behavior/action type. (See Fig. 3A, classification head 322 “output”; 710, Fig. 7, “Predict A Label”; [0036-0039].)
As to claim 13, Li discloses everything as disclosed in claim 1. In addition, Li discloses CNNs with convolutional layer feeding classification (Figs. 3A/3B and corresponding disclosure)
Li is silent to a series of convolutional layers and a series of fully-connected layers of decreasing size.
However, Venkatesh discloses CNN blocks (GConv -> FC) and fully connected components in the pipeline (Figs. 2A-2B and corresponding disclosure; system Fig. 1D)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Li's disclosure to include the above limitations in order to enhance computing and memory efficiencies.
As to claim 15, Li in view of Venkatesh discloses everything as disclosed in claim 1. In addition, Li discloses at least one processor; a memory having at least one computer program stored thereon, the at least one computer program, when executed by the at least one processor, causing the at least one processor to implement the behavior detection method according to claim 1; and at least one I/O interface connected between the processor and the memory and configured to implement information interaction between the processor and the memory. (See Fig. 1 & [0031-0041].)
As to claim 16, Li in view of Venkatesh discloses everything as disclosed in claim 1. In addition, Li discloses a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, causes the processor to implement the behavior detection method according to claim 1. (See [0012].)
Claims 6-12 & 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (U.S. Publication 2017/0262705) in view of Venkatesh (U.S. Publication 2021/0019633) as applied to claim 5 above, and further in view of Bilenko et al. (U.S. Patent 10,937,156).
As to claim 6, Li in view of Venkatesh discloses everything as disclosed in claim 5. In addition, Li discloses producing class predictions (“label”); see 710, Fig. 7.
Li in view of Venkatesh is silent to thresholding of classification probabilities to declare a target behavior.
However, Bilenko discloses thresholding probabilities or gradient-derived confidence to decide when to overlay/accept (Fig. 3 and corresponding disclosure).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the current embodiment of Li in view of Venkatesh to include the above limitations in order to suppress false positives and stabilize detections before signaling a target behavior.
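To make the mapped thresholding step concrete, a minimal sketch follows (the examiner's generic rendering; the threshold value, labels, and function name are illustrative assumptions, not Bilenko's code):

```python
import numpy as np

def identify_behavior(class_probs, labels, threshold=0.5):
    """Accept the top-scoring behavior only if its classification
    probability exceeds the threshold; otherwise report that no
    target behavior is identified -- a generic compare-and-select."""
    top = int(np.argmax(class_probs))
    if class_probs[top] > threshold:
        return labels[top]
    return None  # no target behavior identified

probs = np.array([0.1, 0.7, 0.2])
print(identify_behavior(probs, ["walking", "running", "falling"]))  # running
```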
As to claim 7, Li in view of Venkatesh discloses everything as disclosed in claim 1. In addition, Li discloses identifying the action over a frame sequence and producing attention maps (spatial saliency hints) from video frames (706, Fig. 7 & [0066]; see Generating An Attention Map).
Li in view of Venkatesh is silent to detecting the spatial position of the pedestrian behavior from the network output data (classification vector + feature maps) post-identification.
However, Bilenko discloses a class-specific, pixel-wise map overlay (Figs. 1-3 and corresponding disclosure).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the current embodiment of Li in view of Venkatesh to include the above limitations in order to output a spatial position for the detected behavior.
As to claim 8, Li in view of Venkatesh & Bilenko discloses everything as disclosed in claim 7. In addition, Li discloses CNN conv layers + classification vector (Fig. 3A; [0036-0039]) and an attention saliency concept.
Li in view of Venkatesh & Bilenko is silent to determining the spatial position according to the classification feature vector and the feature maps of the target convolutional layer.
However, Bilenko discloses linking class outputs (probabilities) to pixel-wise attributions (i.e., spatial maps) that can be overlaid (Figs. 1-3 and corresponding disclosure).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the current embodiment of Li in view of Venkatesh & Bilenko to include the above limitations in order to produce a class-specific explanation offering transparent localization.
As to claim 9, Li in view of Venkatesh & Bilenko discloses everything as disclosed in claim 8. In addition, Li in view of Venkatesh & Bilenko discloses a class specific map.
Li in view of Venkatesh & Bilenko is silent to extracting an edge contour of the target behavior from the feature map/class vector attribution.
However, Bilenko discloses overlaying pixel-level attributions and discusses generating maps suitable for display, including edge extraction from the attribution map (Figs. 1-3 and corresponding disclosure).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the current embodiment of Li in view of Venkatesh & Bilenko to include the above limitations in order to delineate contours of the target behavior for clearer visualization/annotation.
As to claim 10, Li in view of Venkatesh & Bilenko discloses everything as disclosed in claim 9. In addition, claim 10 recites using gradients (derivatives) of the classification output features to (i) compute gradients/weights (taking a derivative to obtain gradient values), (ii) combine them with feature responses into a prediction map, (iii) select the top class map, (iv) resize/overlay to frame size, and (v) perform edge extraction before overlay/visual emphasis.
Li in view of Venkatesh & Bilenko is silent to computing the derivative of a classification feature vector with respect to the feature maps to obtain a weight map, multiplying to obtain a first space prediction map, selecting the top class map, upsampling to frame size, and extracting edges.
However, Bilenko discloses gradient-based saliency, thresholding, and overlay on the frame (Figs. 1-3 and corresponding disclosure).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the current embodiment of Li in view of Venkatesh & Bilenko to include the above limitations in order to obtain a class-specific spatial map compatible with Li's CNN feature maps.
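For clarity, a gradient-weighted localization sketch of the general kind mapped above follows (the examiner's illustrative rendering, assuming gradients of the top class score with respect to the target layer's feature maps are precomputed; this is not the code of either reference):

```python
import numpy as np

def space_prediction_map(feature_maps, grads, frame_hw):
    """From (C, h, w) feature maps and matching gradients of the top
    class score: average gradients per channel to form a weight map,
    multiply with the feature maps and sum to form a space prediction
    map, resize it to the frame size, and extract a crude edge mask."""
    weights = grads.mean(axis=(1, 2))                  # per-channel weights
    cam = np.maximum((weights[:, None, None] * feature_maps).sum(axis=0), 0)
    H, W = frame_hw                                    # nearest-neighbor upsample
    ys = np.arange(H) * cam.shape[0] // H
    xs = np.arange(W) * cam.shape[1] // W
    cam = cam[ys][:, xs]
    gy, gx = np.gradient(cam)                          # edge extraction via
    edges = (np.abs(gy) + np.abs(gx)) > cam.mean()     # gradient magnitude
    return cam, edges
```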
As to claim 11, Li in view of Venkatesh & Bilenko discloses everything as disclosed in claim 10. In addition, Li discloses overlaying a graphical indicator on the video frames to show localized evidence (Figs. 1-3).
Li in view of Venkatesh & Bilenko is silent to drawing the edge contour of the target behavior on the video frames.
However, Bilenko discloses overlaying graphical indicators aligned to the attribution on the displayed imagery (Figs. 1-3 and corresponding disclosure).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the current embodiment of Li in view of Venkatesh & Bilenko to include the above limitations in order to enhance legibility and operator acceptance.
As to claim 12, Li in view of Venkatesh & Bilenko discloses everything as disclosed in claim 11, but is silent to, after drawing the edge contour of the target behavior on the plurality of video image frames, storing the plurality of video image frames drawn thereon with the edge contour of the target behavior into a video generation buffer area.
However, Bilenko discloses storing attribution values/maps and performing the overlaying during display (Figs. 1-3 and corresponding disclosure).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the current embodiment of Li in view of Venkatesh & Bilenko to include the above limitations in order to provide action information to the user.
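Illustrating the contour-drawing and buffering steps of claims 11-12 as conventional post-solution activity, a minimal sketch follows (assuming the OpenCV library is available; the buffer structure and function names are the examiner's hypothetical rendering, not either reference's code):

```python
import cv2  # assumes OpenCV is available
import numpy as np

buffer_area = []  # illustrative "video generation buffer area"

def annotate_and_buffer(frame, edge_mask):
    """Draw the extracted edge contour on the frame, then store the
    annotated frame in the buffer for later clip generation."""
    contours, _ = cv2.findContours(edge_mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cv2.drawContours(frame, contours, -1, (0, 0, 255), 2)  # red in BGR
    buffer_area.append(frame)

def export_clip(path, fps=25):
    """Generate and export a video clip from the buffered frames."""
    h, w = buffer_area[0].shape[:2]
    out = cv2.VideoWriter(path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for f in buffer_area:
        out.write(f)
    out.release()
```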
As to claim 17, Li in view of Venkatesh discloses everything as disclosed in claim 2. In addition, Li discloses producing class labels and output data including feature maps/attention (Figs. 5B-7 and corresponding disclosure).
Li in view of Venkatesh is silent to detecting a spatial position post-identification according to output data of the network.
However, Bilenko discloses linking probability outputs to pixel-level maps and overlays (Figs. 1-3 and corresponding disclosure).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the current embodiment of Li in view of Venkatesh to include the above limitations in order to provide the operator useful localization at low cost.
As to claim 18, Li in view of Venkatesh & Bilenko discloses everything as disclosed in claim 17 but is silent to outputting data comprising the classification feature vector and feature maps output by a target convolutional layer and using the output to determine the spatial position.
However, Bilenko discloses class-specific saliency tied to the output vector and overlaid on the image (see Fig. 3 and corresponding disclosure).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the current embodiment of Li in view of Venkatesh & Bilenko to include the above limitations in order to provide localization.
CONCLUSION
No prior art was found for claims 14 & 19-20 in their current form.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Stephen P Coleman whose telephone number is (571)270-5931. The examiner can normally be reached Monday-Thursday 8AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Moyer can be reached at (571) 272-9523. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
Stephen P. Coleman
Primary Examiner
Art Unit 2675
/STEPHEN P COLEMAN/Primary Examiner, Art Unit 2675