Prosecution Insights
Last updated: April 19, 2026
Application No. 18/355,788

METHOD AND APPARATUS WITH FRAME CLASS IDENTIFICATION

Status: Non-Final OA (§103)
Filed: Jul 20, 2023
Examiner: VANCHY JR, MICHAEL J
Art Unit: 2666
Tech Center: 2600 — Communications
Assignee: Seoul National University R&DB Foundation
OA Round: 1 (Non-Final)

Grant Probability: 67% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 3y 4m
With Interview: 87%
Examiner Intelligence

Career Allow Rate: 67% (404 granted / 606 resolved), +4.7% vs TC avg (above average)
Interview Lift: +20.1% (strong), based on resolved cases with interview
Typical Timeline: 3y 4m avg prosecution; 16 applications currently pending
Career History: 622 total applications across all art units
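The headline figures above follow directly from the raw counts on this card. A minimal sketch, assuming (as the page's own footnote suggests) that the allow rate is simply granted over resolved, and that the interview-adjusted figure adds the 20.1-point lift to the career rate:

```python
# Reproducing the examiner-card figures from the raw counts shown above.
# Assumption: "With Interview" = career allow rate + interview lift,
# both expressed in percentage points.
granted, resolved = 404, 606

allow_rate = granted / resolved               # ~0.667 -> displayed as 67%
interview_lift = 0.201                        # +20.1 points with an interview
with_interview = allow_rate + interview_lift  # ~0.868 -> displayed as 87%

print(f"Career allow rate: {allow_rate:.1%}")    # 66.7%
print(f"With interview:    {with_interview:.1%}")
```

This also explains why the two probability cards on the page agree: 67% + 20.1 points rounds to the 87% shown under "With Interview".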

Statute-Specific Performance

§101: 11.7% (-28.3% vs TC avg)
§103: 60.8% (+20.8% vs TC avg)
§102: 8.4% (-31.6% vs TC avg)
§112: 10.4% (-29.6% vs TC avg)
Tech Center averages are estimates • Based on career data from 606 resolved cases

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claim(s) 1, 2, 4-6, 8-10, 12-15, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Sundareson, US 2020/0410241 A1 (Sundareson).
Regarding claim 1, Sundareson teaches a processor-implemented method (wherein various functions may be carried out by a processor executing instructions stored in memory) (Fig. 1; [0024]), the method comprising: generating respective final feature vectors of a plurality of frames of time-series data (generating feature vectors for a plurality of frames in video clips using feature extractor 156) (Fig. 1; [0058]), while sequentially processing the plurality of frames (wherein the frames are sequentially processed being each frame of a video clip) (Fig. 1; [0006] and [0037-0038]) by using a neural network comprising a plurality of layers (wherein the feature extractor 156 is a machine-learning model; such as a neural network) (Fig. 1; [0006] and [0058]); determining a class of the time-series data based on at least one final feature vector of the respective final feature vectors (wherein the video clip can be classified based on the one or more feature vectors) ([0006] and [0019-0020]); generating a reference feature vector based on the at least one final feature vector (wherein now a feature vector is associated with a class) ([0020]); calculating a similarity score (distance metric) ([0020-0023]) between the reference feature vector and a feature vector of at least one second frame (wherein the next frame can be compared to the feature vector to determine similarity based on using a distance metric) ([0020]), wherein the second frame comprises a non-final feature frame for which the final feature vector is not generated (wherein the second frame is not yet used to generate the final feature vector within the representative feature vector; cluster) ([0020]); and determining the at least one second frame to be a frame corresponding to the class (wherein the second frame corresponds to the class based on using a distance metric associated with the detected feature vector class) ([0020]), based on a result of comparing the similarity score and a threshold value (wherein the feature vectors similarity is based on a distance metric and a distance threshold value) (Abstract and [0020-0021]).

Although Sundareson does not explicitly teach a “similarity score” it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention that the “distance metric” in Sundareson is a type of similarity score; wherein the closer the feature vectors are to each other the higher the similarity (Sundareson; [0020-0021]).

Regarding claim 2, Sundareson teaches wherein the generating the respective final feature vectors (generating feature vectors) ([0020] and [0058]) comprises: determining whether to proceed with a sequence for generating the final feature vector of a frame of the plurality of frames by using the layers for each of the plurality of frames (continuing to generate a feature vector and class for the clip until a distance is above a threshold, and thus a new feature vector class has been detected) (Abstract and [0021]); and generating the final feature vector of the frame when the sequence reaches a final stage (wherein the clip has a final feature vector when it reaches that last frame) ([0020-0021] and [0066-0067]).

Regarding claim 4, Sundareson teaches wherein the neural network (feature extractor 156 that is a neural network) (Fig. 1; [0006] and [0058]) is configured to perform an operation for a frame of the plurality of frames based on an internal state of the layers calculated in a previous frame of which a final feature vector is generated (wherein the neural network is a trained feature vector and thus uses previous frames and vectors for generating the final feature vector) ([0006] and [0058-0060]).
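The pivotal mapping in the claim 1 rejection, reading Sundareson's distance metric as a similarity score and assigning a frame to a class when the score clears a threshold, reduces to a few lines. A hedged sketch for illustration only: the function names, the sample vectors, and the conversion similarity = 1 / (1 + distance) are assumptions of this sketch, not taken from either the claims or the reference.

```python
import math

def similarity(ref_vec, frame_vec):
    """Convert a Euclidean distance into a similarity score:
    the closer the two feature vectors, the higher the score
    (the reading the rejection applies to the distance metric).
    The 1/(1+d) mapping is an illustrative choice."""
    dist = math.dist(ref_vec, frame_vec)
    return 1.0 / (1.0 + dist)

def assign_to_class(ref_vec, frame_vec, threshold=0.5):
    """Assign the frame to the reference class when the
    similarity score meets or exceeds the threshold."""
    return similarity(ref_vec, frame_vec) >= threshold

# Hypothetical 3-dimensional reference feature vector for a class:
ref = [2.0, 0.0, 1.0]
print(assign_to_class(ref, [2.1, 0.0, 1.0]))  # True  (near the reference)
print(assign_to_class(ref, [9.0, 4.0, 7.0]))  # False (far from it)
```

The inverse relationship between distance and similarity is what the examiner relies on; any monotone decreasing mapping would serve the same argument.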
Regarding claim 5, Sundareson teaches wherein the determining the class comprises determining the class for a first frame comprising a frame determined to be the frame corresponding to the class of the at least one second frame or a frame of which a final feature vector is generated (determining if the frame corresponds to the final feature vector created, the clustered feature vector, and/or if it corresponds also to the class of the second/next frame based on a distance metric and threshold) ([0019-0021]).

Regarding claim 6, Sundareson teaches wherein the generating the reference feature vector comprises generating reference feature vectors for each stage corresponding to each of stages comprising a sequence (wherein generating the feature vectors that correspond to each sequence and video clips of the video) (Fig. 1; [0019-0023]).

Regarding claim 8, Sundareson teaches wherein a feature vector of the at least one second frame comprises a feature vector for each stage that is generated based on a sequence corresponding to the at least one second frame (wherein the second frame has a feature vector that is based on the sequence or video clip that the second frame is part of) ([0019-0021] and [0066-0067]).

Regarding claim 9, Sundareson teaches wherein a feature vector of the at least one second frame comprises a feature vector for each stage corresponding to a stage at which a sequence stops (wherein the feature vector for the second frame does not correspond to the video clip feature vector, that clip sequence is stopped and a new class is generated or found) ([0019-0021]).
Regarding claim 10, Sundareson teaches wherein the calculating the similarity score comprises calculating a similarity score (distance metric) ([0020-0023]) between a feature vector for each stage corresponding to a stage at which a sequence stops corresponding to the at least one second frame and a reference feature vector for each stage corresponding to a same stage as the stage at which the sequence stops (wherein the feature vector for the second frame does not correspond to the video clip feature vector, that video clip sequence is stopped and a new class is generated or found; all which is based on a distance metric) ([0019-0021]).

Although Sundareson does not explicitly teach a “similarity score” it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention that the “distance metric” in Sundareson is a type of similarity score; wherein the closer the feature vectors are to each other the higher the similarity (Sundareson; [0020-0021]).

Regarding claim 12, Sundareson teaches wherein, when the time-series data is a video (wherein the time-series data is a video clip) ([0006]) for detecting an abnormality in a production process (for detecting a specific class; wherein the class can be anything desired by the user, and thus could be an “abnormality”) ([0006], [0019], and [0022-0023]), the class is the abnormality in the production process (for detecting a specific class; wherein the class can be anything desired by the user, and thus could be an “abnormality”) ([0006], [0019], and [0022-0023]), and a first frame is a frame corresponding to the abnormality in the production process (wherein the first frame can correspond to a specific class set by the user, which can be an “abnormality” class) ([0006], [0019], and [0022-0023]).
Regarding claim 13, Sundareson teaches wherein, when the time-series data is streaming data and the class is a streaming filter (wherein the video classification system 100 may be part of a game streaming system) ([0031]), a first frame is a filtering target frame (wherein the first frame can be used as the class for detecting frames that match the feature vector and thus are part of that class) ([0057-0060] and [0063-0064]).

Regarding claim 14, see the rejection made to claim 1, as well as prior art Sundareson for an electronic device (wherein the video classification system 100 may be implemented using one or more computing devices) (Figs. 1 and 6; [0025]), the device comprising: a processor configured to execute a plurality of instructions (processor for executing instructions) (Fig. 6; [0024]); and a memory storing the plurality of instructions (wherein the instructions are stored in a memory) (Fig. 6; [0024]), wherein execution of the plurality of instructions configures the processor (wherein the processor executes the instructions stored in the memory) (Fig. 6; [0024]), for they teach all the limitations within this claim.

Regarding claim 15, Sundareson teaches wherein a first frame comprises a frame determined to be the frame corresponding to the class of the at least one second frame and a final feature frame where the final feature vector is generated (determining if the frame corresponds to the final feature vector created, the clustered feature vector, and if it corresponds also to the class of the second/next frame based on a distance metric and threshold) ([0019-0021]).

Regarding claim 17, see the rejection made to claim 10, as well as prior art Sundareson for an electronic device (wherein the video classification system 100 may be implemented using one or more computing devices) (Figs. 1 and 6; [0025]), the device comprising: a processor configured to execute a plurality of instructions (processor for executing instructions) (Fig. 6; [0024]); and a memory storing the plurality of instructions (wherein the instructions are stored in a memory) (Fig. 6; [0024]), wherein execution of the plurality of instructions configures the processor (wherein the processor executes the instructions stored in the memory) (Fig. 6; [0024]), for they teach all the limitations within this claim.

Claim(s) 3, 7, 11, 16, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sundareson, US 2020/0410241 A1 (Sundareson), and further in view of Alcock et al., US 2020/0082212 A1 (Alcock).

Regarding claim 3, Sundareson teaches wherein the sequence comprises: generating feature vectors (generating feature vectors for a plurality of frames in video clips using feature extractor 156) (Fig. 1; [0058]), for each of a plurality of stages (Fig. 1, item 156), of the frame by using a first neural network corresponding to each of the stages comprising the sequence (wherein the feature extractor 156 is a machine-learning model; such as a neural network) (Fig. 1; [0006] and [0058]) (for the frames of the video clips) (Fig. 1; [0019-0021]); and determining whether to proceed with the sequence corresponding to each of the stages and the feature vectors for each of the stages (continuing to generate a feature vector and class for the clip until a distance is above a threshold, and thus a new feature vector class has been detected) (Abstract and [0021]). However, Sundareson does not explicitly teach “using a second neural network”.
Alcock teaches a method and system for processing images for a search is provided, including: receiving a plurality of images selected from search results (Abstract); for each image in the plurality of images, retrieving a feature vector associated with the image (Abstract); selecting a subset of the feature vectors based on similarity of feature vectors associated with the images in the plurality of images (Abstract); performing a search for feature vectors in a database similar to the feature vectors in the subset of feature vectors (Abstract); and wherein multiple learning engines (such as multiple different neural networks) can be used ([0067]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Sundareson to include a second/plurality of neural networks since using multiple neural networks can have the goal of yielding results that are in average better than the each of the engines (neural networks) alone (Alcock; [0067]).

Regarding claim 7, Sundareson teaches wherein the generating the reference feature vector comprises determining, to be a reference feature vector for each stage (wherein generating the feature vectors that correspond to each sequence and video clips of the video) (Fig. 1; [0019-0023]), feature vectors for each stage calculated for each of stages comprising a sequence corresponding to each frame of which a final feature vector is generated (wherein taking the majority of frames to be considered part of a class for the video clip) ([0066-0067]); and wherein the feature vectors are clustered to identify classes ([0006]). However, Sundareson does not explicitly teach “an average of feature vectors”.
Alcock teaches a method and system for processing images for a search is provided, including: receiving a plurality of images selected from search results (Abstract); for each image in the plurality of images, retrieving a feature vector associated with the image (Abstract); selecting a subset of the feature vectors based on similarity of feature vectors associated with the images in the plurality of images (Abstract); performing a search for feature vectors in a database similar to the feature vectors in the subset of feature vectors (Abstract); and wherein using an average of feature vectors can be used (in an alternative exemplary embodiment a feature vector may be generated to represent each cluster, for example an average feature vector may be determined) ([0074]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Sundareson to include generating an average of the feature vectors with the clusters of Sundareson since it can be used to determine the best image/frame for the class (Alcock; [0074]).

Regarding claim 11, Sundareson teaches wherein the determining whether the at least one second frame to be the frame corresponding to the class comprises determining a second frame corresponding to the similarity score to be the frame corresponding to the class when the similarity score satisfies a distance threshold (wherein the next frame can be compared to the feature vector to determine similarity based on using a distance metric) ([0006] and [0020-0021]).

Although Sundareson does not explicitly teach a “similarity score” it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention that the “distance metric” in Sundareson is a type of similarity score; wherein the closer the feature vectors are to each other the higher the similarity (Sundareson; [0020-0021]).
However, Sundareson does not explicitly teach “the similarity score is greater than or equal to the threshold value”. Alcock teaches determining that the candidate feature vectors similar to at least one feature vector in the subset of feature vectors, i.e. passing a threshold confidence level ([0075]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Sundareson to include another way of determining similarity because it can have the goal of yielding results that are in average better than using a single type of similarity matching (Alcock; [0067]).

Regarding claim 16, see the rejection made to claim 7, as well as prior art Sundareson for an electronic device (wherein the video classification system 100 may be implemented using one or more computing devices) (Figs. 1 and 6; [0025]), the device comprising: a processor configured to execute a plurality of instructions (processor for executing instructions) (Fig. 6; [0024]); and a memory storing the plurality of instructions (wherein the instructions are stored in a memory) (Fig. 6; [0024]), wherein execution of the plurality of instructions configures the processor (wherein the processor executes the instructions stored in the memory) (Fig. 6; [0024]), for they teach all the limitations within this claim.

Regarding claim 18, Sundareson teaches a processor implemented method (wherein various functions may be carried out by a processor executing instructions stored in memory) (Fig. 1; [0024]), the method comprising processing, by a neural network (wherein the feature extractor 156 is a machine-learning model; such as a neural network) (Fig. 1; [0006] and [0058]), a first frame of a plurality of frames (a first frame of a plurality of frames within the video clip) ([0019-0020]) of time-series data in a sequence (wherein the time-series data is a video clip) ([0006]); determining, whether to stop the processing of the first frame before an end of the sequence (determining whether to stop based on the similarity distance metric exceeding a threshold value) ([0020-0021]); generating a final feature vector for the first frame responsive to reaching the end of the sequence (generating a feature vector for the first frame; wherein if the first frame is not similar to later frames, it can be its own class and thus reaches the end) ([0019-0021]); generating a first frame reference feature vector for the final feature vector (generating a feature vector for the first frame that is based on a specific class) ([0020-0021]); and determining a class (determine a specific class) ([0006], [0019], and [0022-0023]) of the time-series data based on the final feature vector (wherein the single frame is a single class for the time series data of the frame and/or the class for the whole video clip) ([0019-0023]). However, Sundareson does not explicitly teach “a series of neural networks”.

Alcock teaches a method and system for processing images for a search is provided, including: receiving a plurality of images selected from search results (Abstract); for each image in the plurality of images, retrieving a feature vector associated with the image (Abstract); selecting a subset of the feature vectors based on similarity of feature vectors associated with the images in the plurality of images (Abstract); performing a search for feature vectors in a database similar to the feature vectors in the subset of feature vectors (Abstract); and wherein multiple learning engines (such as a series of different neural networks) can be used ([0067]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Sundareson to include a second/plurality of neural networks since using multiple neural networks can have the goal of yielding results that are in average better than the each of the engines (neural networks) alone (Alcock; [0067]).

Regarding claim 19, Sundareson teaches further comprising: generating a second frame reference feature vector for a second frame responsive to stopping the processing of the second frame (generating a feature vector for the second frame based on stopping the similar classification; i.e. potential new class) ([0019-0023]); calculating a similarity score (distance metric) ([0020-0023]) between the first frame reference feature vector and the second frame reference feature vector (wherein the next frame can be compared to the feature vector to determine similarity based on using a distance metric) ([0020-0023]); and assigning the second frame to the class (wherein the second frame corresponds to the class based on using a distance metric associated with the detected feature vector class) ([0020]) based on a result of comparing the similarity score and a threshold value (wherein the feature vectors similarity is based on a distance metric and a distance threshold value) (Abstract and [0020-0021]).

Although Sundareson does not explicitly teach a “similarity score” it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention that the “distance metric” in Sundareson is a type of similarity score; wherein the closer the feature vectors are to each other the higher the similarity (Sundareson; [0020-0021]).

Regarding claim 20, Sundareson teaches wherein the processing of the first frame comprises sequentially processing each frame of the plurality of frames (wherein the frames are sequentially processed being each frame of a video clip) (Fig. 1; [0006] and [0037-0038]) until reaching the end of the sequence for the each frame or responsive to stopping the processing of the each frame (wherein the feature vector for each frame either corresponds or doesn’t correspond to the video clip feature vector, that video clip sequence is stopped if there isn’t a similarity and a new class is generated or found; all which is based on a distance metric) ([0019-0021]).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Aliamiri et al., US 2021/0142068 A1: teaches receiving the batch of frames of video data (Abstract); mapping, by a feature extractor, the frames of the batch to corresponding feature vectors in a feature space, each of the feature vectors having a lower dimension than a corresponding one of the frames of the batch (Abstract); and selecting a set of dissimilar frames from the plurality of frames of video data based on dissimilarities between corresponding ones of the feature vectors (Abstract).

Contact

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL J VANCHY JR whose telephone number is (571)270-1193. The examiner can normally be reached Monday - Friday 9am - 5pm.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emily Terrell can be reached at (571) 270-3717. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users.
To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MICHAEL J VANCHY JR/
Primary Examiner, Art Unit 2666
Michael.Vanchy@uspto.gov
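The "average of feature vectors" point raised for claim 7 (Alcock [0074]) is the standard centroid construction: the reference vector representing a class or cluster is the element-wise mean of the per-frame feature vectors assigned to it. A minimal sketch; the function name and the sample vectors are hypothetical, not drawn from either reference:

```python
# Illustrative centroid construction ("average of feature vectors"):
# the reference vector for a class is the element-wise mean of the
# per-frame feature vectors assigned to that class.
def reference_vector(frame_features):
    n = len(frame_features)
    dims = zip(*frame_features)        # group values by dimension
    return [sum(d) / n for d in dims]  # mean per dimension

# Three hypothetical 4-dimensional frame feature vectors in one cluster:
feats = [[1.0, 0.0, 2.0, 1.0],
         [3.0, 0.0, 0.0, 1.0],
         [2.0, 0.0, 1.0, 1.0]]
print(reference_vector(feats))  # [2.0, 0.0, 1.0, 1.0]
```

New frames would then be compared against this centroid with the distance metric discussed in the claim 1 rejection.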

Prosecution Timeline

Jul 20, 2023: Application Filed
Jan 10, 2026: Non-Final Rejection, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602906: IMAGE RECOGNITION APPARATUS. Granted Apr 14, 2026 (2y 5m to grant)
Patent 12579596: MANAGING ARTIFICIAL-INTELLIGENCE DERIVED IMAGE ATTRIBUTES. Granted Mar 17, 2026 (2y 5m to grant)
Patent 12579634: REAL-TIME PROCESS DEFECT DETECTION AUTOMATION SYSTEM AND METHOD USING MACHINE LEARNING MODEL. Granted Mar 17, 2026 (2y 5m to grant)
Patent 12573225: METHODS AND SYSTEMS OF FIELD DETECTION IN A DOCUMENT. Granted Mar 10, 2026 (2y 5m to grant)
Patent 12551101: SYSTEM AND METHOD FOR DIGITAL MEASUREMENTS OF SUBJECTS. Granted Feb 17, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 67%
With Interview: 87% (+20.1%)
Median Time to Grant: 3y 4m
PTA Risk: Low
Based on 606 resolved cases by this examiner. Grant probability derived from career allow rate.
