Prosecution Insights
Last updated: April 19, 2026
Application No. 17/468,224

SELF-SUPERVISED VIDEO REPRESENTATION LEARNING BY EXPLORING SPATIOTEMPORAL CONTINUITY

Non-Final OA §103
Filed
Sep 07, 2021
Examiner
BRAHMACHARI, MANDRITA
Art Unit
2144
Tech Center
2100 — Computer Architecture & Software
Assignee
Huawei Technologies Co., Ltd.
OA Round
3 (Non-Final)
Grant Probability
76% (Favorable)
OA Rounds
3-4
To Grant
3y 0m
With Interview
99%

Examiner Intelligence

Career Allow Rate
76% (above average): 311 granted / 407 resolved, +21.4% vs TC avg
Interview Lift
+29.8% in resolved cases with interview (a strong, roughly +30% lift)
Typical Timeline
3y 0m avg prosecution; 27 currently pending
Career History
434 total applications across all art units
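The card figures above can be reproduced with a few lines of arithmetic (a sketch; applying the interview lift multiplicatively is an assumption about how the page computes its 99% figure, not something it states):

```python
# Reproduce the examiner statistics shown above from the raw counts.
granted, resolved = 311, 407

# Career allow rate: granted / resolved cases.
allow_rate = granted / resolved
print(f"Career allow rate: {allow_rate:.1%}")   # ~76.4%, shown as 76%

# "With interview" grant probability, assuming the +29.8% lift
# multiplies the base rate (an assumption about the page's math).
with_interview = allow_rate * (1 + 0.298)
print(f"With interview: {with_interview:.0%}")  # ~99%
```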

Statute-Specific Performance

§101
10.5% (-29.5% vs TC avg)
§103
54.5% (+14.5% vs TC avg)
§102
7.8% (-32.2% vs TC avg)
§112
17.9% (-22.1% vs TC avg)
Tech Center averages are estimates, based on career data from 407 resolved cases.
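Reading each "(±x% vs TC avg)" delta as the examiner's rate minus the Tech Center average (an assumption about the chart's convention), the implied TC average can be backed out of every row, and all four statutes recover the same ~40% estimate:

```python
# Back out the implied Tech Center average from each statute row,
# assuming delta = examiner_rate - tc_average (a reading of the chart,
# not something the page states explicitly).
rows = {
    "§101": (10.5, -29.5),
    "§103": (54.5, +14.5),
    "§102": (7.8, -32.2),
    "§112": (17.9, -22.1),
}
implied_tc_avg = {s: round(rate - delta, 1) for s, (rate, delta) in rows.items()}
print(implied_tc_avg)  # every statute backs out to 40.0
```

The fact that all four rows back out to one number suggests a single per-TC baseline rather than per-statute baselines.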

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission has been entered.

This action is in response to claims dated 12/01/2025.
Claims pending in the case: 1-5, 7-14, 16-20
Cancelled claims: 6, 15

This is a transferred case. Previous examinations were done by Tamara Kyle.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-5, 7-9, 11-14, 16-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Li ("Frame Deletion Detection Based on Optical Flow Orientation Variation," March 2021) in view of Luo ("Exploring Relations in Untrimmed Videos for Self-Supervised Learning," August 2020).

Regarding Claim 1, Li teaches a method comprising: feeding a primary video segment, representative of a concatenation of a first and a second nonadjacent video segments obtained from a video source (Li: Pg. 37198 Fig. 2, Pg. 37203 section VI.A.1: video segments with deleted segments in between), … extract spatiotemporal representations from the concatenation of the first and the second nonadjacent video segments (Li: Pg. 37198-37199 section B: optical flow analysis, which involves extracting a small set of stable, repeatable image locations and computing compact descriptors for each; as illustrated in Fig. 3, flow analysis uses spatiotemporal information of each segment, adjacent and non-adjacent); embedding, …, the primary video segment into a first feature output (Li: Pg. 37198-37199 section B-C: optical flow analysis using vector representation of video feature); providing the first feature output to a first perception network to generate a first set of … outputs indicating a temporal location of a discontinuous point associated with the primary video segment (Li: Pg. 37203-37204 section B: histogram of descriptor sequence indicates discontinuity).

However, Li does not specifically teach: feeding to a deep learning backbone network having a 3-dimensional convolution layer configured to extract; embedding, via the deep learning backbone network; perception network to generate a first set of probability distribution outputs indicating a temporal location of a discontinuous point associated with the primary video segment; generating a first loss function based on the first set of probability distribution outputs; and optimizing the deep learning backbone network by backpropagation of the first loss function.

Luo teaches: feeding to a deep learning backbone network having a 3-dimensional convolution layer configured to extract spatiotemporal representations from the video segments (Luo: Pg. 4-5 section C: fed to backbone having 3D convolution layers for feature extraction); embedding, via the deep learning backbone network (Luo: Pg. 3 section III, Pg. 4-5 section C: fed to backbone having 3D convolution layers for feature extraction); perception network to generate a first set of probability distribution outputs indicating a temporal location of a discontinuous point associated with the primary video segment (Luo: Pg. 5 col 1 [2]: probability distribution of relations; Pg. 4 section B: relations may be location of a point); generating a first loss function based on the first set of probability distribution outputs (Luo: Pg. 5 col 1 [2]: loss function based on probability); and optimizing the deep learning backbone network by backpropagation of the first loss function (Luo: Pg. 6 sections B and C: supervised training).

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Li and Luo, because the combination would enable using a backbone network for feature extraction and performing video analysis based on a probability distribution. One of ordinary skill in the art would have been motivated to combine the teachings because the combination would enable the use of backbone architectures, which are widely used in the art for feature extraction due to their proven effectiveness. The Examiner further notes that the fact that the video segments are "nonadjacent" is not functionally involved in the steps recited, as the limitation broadly claims analyzing the frames in the video. Hence, this does not distinguish the claimed invention from the prior art in terms of patentability.

Regarding claim 2, Li and Luo teach the invention as claimed in claim 1 above, and further comprising: feeding a third video segment, nonadjacent to each of the first video segment and second video segment, obtained from the video source, to the deep learning backbone network (Luo: Pg. 4-5 section C: fed to backbone having 3D convolution layers for feature extraction); embedding, via the deep learning backbone network, the third video segment into a second feature output (Luo: Pg. 3 section III, Pg. 4-5 section C: fed to backbone having 3D convolution layers for feature extraction); providing the first feature output and the second feature output to a second perception network to generate a second set of probability distribution outputs indicating one or more of a continuity probability and a discontinuity probability associated with the primary and the third video segments (Luo: Pg. 5 col 1 [2]: probability distribution of relations; Pg. 4 section B: relations may be location of a point); generating a second loss function based on the second set of probability distribution outputs (Luo: Pg. 5 col 1 [2]: loss function based on probability); and optimizing the deep learning backbone network by backpropagation of at least one of the first loss function and the second loss function (Luo: Pg. 6 sections B and C: supervised training).

Regarding claim 3, Li and Luo teach the invention as claimed in claim 2 above, and further comprising: feeding a fourth video segment, obtained from the video source and temporally adjacent to the first and the second video segments, to the deep learning backbone network (Li: Pg. 37198 Fig. 2, Pg. 37203 section VI.A.1: video segments with deleted segments in between) (Luo: Pg. 4-5 section C: fed to backbone having 3D convolution layers for feature extraction); embedding, via the deep learning backbone network, the fourth video segment into a third feature output (Luo: Pg. 3 section III, Pg. 4-5 section C: fed to backbone having 3D convolution layers for feature extraction); providing the first feature output, the second feature output, and the third feature output to a projection network to generate a set of feature embedding outputs comprising: a first feature embedding output associated with the primary video segment; a second feature embedding output associated with the third video segment; and a third feature embedding output associated with the fourth video segment (Li: Pg. 37198-37199 section B-C: optical flow analysis using vector representation of video feature) (Luo: Pg. 3 section III, Pg. 4-5 section C: fed to backbone having 3D convolution layers for feature extraction); generating a third loss function based on the set of feature embedding outputs (Luo: Pg. 5 col 1 [2]: loss function based on probability); and optimizing the deep learning backbone network by backpropagation of at least one of the first loss function, the second loss function, and the third loss function (Luo: Pg. 6 sections B and C: supervised training).

Regarding claim 4, Li and Luo teach the invention as claimed in claim 3 above, wherein each of the primary video segment and the third video segment is of length n frames, n being an integer equal to or greater than two (Li: Pg. 37198 Fig. 2, Pg. 37203 section VI.A.1: video frames) (Luo: Pg. 4 col 1 [2], Pg. 4-5 section C: video frames).

Regarding claim 5, Li and Luo teach the invention as claimed in claim 3 above, wherein the fourth video segment is of length m frames, m being an integer equal to or greater than one (Li: Pg. 37198 Fig. 2, Pg. 37203 section VI.A.1: video frames) (Luo: Pg. 4 col 1 [2], Pg. 4-5 section C: video frames).

Regarding claim 7, Li and Luo teach the invention as claimed in claim 2 above, wherein each of the first perception network and the second perception network is a multi-layer perception network (Luo: Pg. 3 section III, Pg. 4-5 section C: multilayer network).

Regarding claim 8, Li and Luo teach the invention as claimed in claim 3 above, wherein the projection network is a light-weight convolutional network comprising one or more of: a 3-dimensional convolution layer, an activation layer, and an average pooling layer (Luo: Pg. 3 section III, Pg. 4-5 section C: fed to backbone having 3D convolution layers for feature extraction). It would have been obvious to one skilled in the art that a 3D-CNN includes one or more 3D convolution layers, activation layers, and pooling layers.

Regarding claim 9, Li and Luo teach the invention as claimed in claim 1 above, wherein the video source suggests a smooth translation of content and motion across consecutive frames (Li: Pg. 37199 Fig. 3, Pg. 37203 section VI.A.1: video segments with objects in motion).

Regarding claims 11-14, 16-18, and 20, these claims are similar in scope to claims 1-4, 7-9, and 1, respectively. Therefore, these claims are rejected under the same rationale.

Allowable Subject Matter

Claims 10 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Response to Arguments

Applicant's amendments have been fully considered and overcome the 35 U.S.C. § 112(b) rejection. That rejection is respectfully withdrawn.

Applicant's prior art arguments have been fully considered but are not persuasive. Applicant argues that the cited prior art does not teach the specifics of the new limitation added by this amendment. The Examiner respectfully disagrees. Since the arguments pertain to the amended sections of the claim, Applicant is requested to refer to the cited sections and explanations in the rejection presented above.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure; see the attached PTO-892.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MANDRITA BRAHMACHARI, whose telephone number is (571) 272-9735. The examiner can normally be reached Monday to Friday, 11 am to 8 pm EST.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Tamara Kyle, can be reached at (571) 272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Mandrita Brahmachari/
Primary Examiner, Art Unit 2144
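For readers tracking the technology rather than the law: the training signal recited in claim 1 (concatenate two nonadjacent clips, predict a probability distribution over the temporal location of the splice, backpropagate a cross-entropy loss) can be sketched in a few lines. This is a toy illustration with hypothetical stand-ins for the claimed 3D-convolutional backbone and perception network, not the applicant's or the cited references' actual implementation:

```python
import math
import random

random.seed(0)

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

T = 16  # frames in the concatenated segment
# Two nonadjacent clips, each 8 frames of 4x4 "pixels" (flattened).
clip_a = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
clip_b = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
segment = clip_a + clip_b          # concatenation along time
splice_point = len(clip_a)         # ground-truth discontinuity at t=8

# Stand-in for the backbone: mean-pool each frame to one feature.
features = [sum(frame) / len(frame) for frame in segment]      # length T

# Hypothetical "perception head": a random linear layer -> T logits.
W = [[random.gauss(0, 0.1) for _ in range(T)] for _ in range(T)]
logits = [sum(w * f for w, f in zip(row, features)) for row in W]
probs = softmax(logits)            # distribution over splice locations

loss = -math.log(probs[splice_point])  # cross-entropy self-supervised loss
print(f"splice at t={splice_point}, loss={loss:.3f}")
```

In the claimed system the mean-pooling and the random linear layer would be replaced by the 3D-convolution backbone and a multi-layer perception network, with the loss backpropagated through both; the point here is only the shape of the self-supervised objective.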

Prosecution Timeline

Sep 07, 2021
Application Filed
Feb 07, 2025
Non-Final Rejection — §103
May 14, 2025
Response Filed
Sep 28, 2025
Final Rejection — §103
Dec 01, 2025
Response after Non-Final Action
Dec 29, 2025
Request for Continued Examination
Jan 17, 2026
Response after Non-Final Action
Mar 09, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596746
AUDIO PREVIEWING METHOD, APPARATUS AND STORAGE MEDIUM
2y 5m to grant · Granted Apr 07, 2026
Patent 12596469
COMBINED DATA DISPLAY WITH HISTORIC DATA ANALYSIS
2y 5m to grant · Granted Apr 07, 2026
Patent 12591358
DAMAGE DETECTION PORTAL
2y 5m to grant · Granted Mar 31, 2026
Patent 12585979
MANAGING DATA DRIFT AND OUTLIERS FOR MACHINE LEARNING MODELS TRAINED FOR IMAGE CLASSIFICATION
2y 5m to grant · Granted Mar 24, 2026
Patent 12585992
MACHINE LEARNING WITH ATTRIBUTE FEEDBACK BASED ON EXPRESS INDICATORS
2y 5m to grant · Granted Mar 24, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds
3-4
Grant Probability
76%
With Interview (+29.8%)
99%
Median Time to Grant
3y 0m
PTA Risk
High
Based on 407 resolved cases by this examiner. Grant probability derived from career allow rate.
