Prosecution Insights
Last updated: April 18, 2026
Application No. 18/780,242

FRAME SELECTION FOR STREAMING APPLICATIONS

Status: Final Rejection (§102, §103, §DP)

Filed: Jul 22, 2024
Examiner: KALAPODAS, DRAMOS
Art Unit: 2487
Tech Center: 2400 — Computer Networks
Assignee: Nvidia Corporation
OA Round: 2 (Final)

Grant Probability: 79% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 5m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 79%, above average (562 granted / 713 resolved; +20.8% vs TC avg)
Interview Lift: +28.2% among resolved cases with an interview (a strong effect)
Typical Timeline: 2y 5m average prosecution; 34 applications currently pending
Career History: 747 total applications across all art units
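The headline allow rate follows directly from the raw counts above; a quick check:

```python
# Editor's sanity check: the 79% career allow rate shown above follows
# directly from the raw counts in this report.
granted, resolved = 562, 713
print(f"{granted / resolved:.1%}")  # 78.8%, displayed as 79%
```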

Statute-Specific Performance

§101: 5.0% (-35.0% vs TC avg)
§103: 54.4% (+14.4% vs TC avg)
§102: 12.0% (-28.0% vs TC avg)
§112: 16.5% (-23.5% vs TC avg)
Tech Center averages are estimates. Based on career data from 713 resolved cases.

Office Action

Grounds: §102, §103, §DP
DETAILED ACTION

Notice of Pre-AIA or AIA Status

1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Status

2. Claims 2-21 are pending. The double patenting rejection over issued Patent No. US 12,047,595, made pursuant to 37 CFR 1.78(f) (or pre-AIA 37 CFR 1.78(b)), is withdrawn in view of the amended claims.

Response to Arguments

3. Applicant's arguments with respect to claims 2-21 have been considered but are moot in view of the new ground(s) of rejection. However, in response to Applicant's Remarks at point (III), the Examiner determines that the arguments presented for independent claims 2, 12, and 19 (that Tran and Kim fail to teach the recited limitations) rely on the Specification, citing:

"As described in the present specification, the neural network operates within the video encoding pipeline to determine whether a particular frame should be included in a set of reference frames maintained by the encoder. For example, paragraphs [0121]-[0124] of the present specification explain that a neural network (e.g., a multi-layer perceptron) analyzes features of incoming frames and features of previously cached frames and generates a Boolean output indicating whether the frame should be cached or rejected for inclusion in the set of reference frames. The determination therefore controls the composition of a reference-frame set used for subsequent predictive encoding and reconstruction of video content."

Under the broadest reasonable interpretation (BRI) standard (see MPEP 2173.01(I), addressing claim interpretation, and MPEP 2111, addressing BRI), the claims are examined based on the recited language without reading the specification into the claims; the scope of claims in patent applications is determined not solely on the basis of the claim language, but upon giving claims their broadest reasonable construction "in light of the specification as it would be interpreted by one of ordinary skill in the art." [MPEP 2111, citing In re Am. Acad. of Sci. Tech. Ctr., 367 F.3d 1359, 1364 (Fed. Cir. 2004).] The "broadest reasonable interpretation standard" means that the words of the claim must be given their plain meaning unless the plain meaning is inconsistent with the specification. [Id. at 2111.01, citing In re Zletz, 893 F.2d 319, 321, 13 USPQ2d 1320, 1322 (Fed. Cir. 1989).] Further, "[t]hough understanding the claim language may be aided by explanations contained in the written description, it is important not to import into a claim limitations that are not part of the claim. For example, a particular embodiment appearing in the written description may not be read into a claim when the claim language is broader than the embodiment." [Id. at 2111.01, citing Superguide Corp. v. DirecTV Enterprises, Inc., 69 USPQ2d 1865, 1868 (Fed. Cir. 2004).]

One of ordinary skill would have found it obvious that items with multiple attribute values may be determined from queries of result sets by using Boolean logic, as inferred from Tran, citing: "The system has textual index running in the cloud with shards, fields, suffix matching and exact phrase matching, with Boolean operator." (e.g., Tran Par.[0415]), which parallels the Boolean operator referenced above from the Specification. The Remarks at point (IV)(A) are directed to the amended claim matter, which is considered in view of a new search and determination.
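As an editorial aside, the quoted specification passage describes a concrete mechanism that may be easier to follow in code. The sketch below is a minimal illustration of that idea, not the applicant's implementation: a small MLP consumes features of an incoming frame together with a summary of the already-cached reference frames and emits a Boolean cache/reject decision. All dimensions, weights, and names are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

FRAME_FEATURES = 64  # hypothetical per-frame feature length

# Random weights stand in for trained parameters (placeholders only).
W1 = rng.normal(size=(2 * FRAME_FEATURES, 32))
b1 = np.zeros(32)
W2 = rng.normal(size=(32, 1))
b2 = np.zeros(1)

def should_cache(frame_feats, cache_feats):
    """Boolean gate: should this frame join the reference-frame set?"""
    # Summarize the already-cached reference frames by mean-pooling.
    if len(cache_feats):
        cache_summary = cache_feats.mean(axis=0)
    else:
        cache_summary = np.zeros(FRAME_FEATURES)
    x = np.concatenate([frame_feats, cache_summary])
    h = np.maximum(x @ W1 + b1, 0.0)           # ReLU hidden layer
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # sigmoid probability
    return bool(p[0] > 0.5)                    # cache (True) or reject (False)

# Example: decide on a new frame given two already-cached reference frames.
cached = rng.normal(size=(2, FRAME_FEATURES))
incoming = rng.normal(size=FRAME_FEATURES)
print(should_cache(incoming, cached))
```

In a real encoder this decision would sit in the reference-picture management step with trained weights; the random weights here only demonstrate the data flow.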
Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103, are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application does not currently name joint inventors.

4. Claims 2-21 are rejected under 35 U.S.C. 103 as being obvious over Bao Tran (hereinafter Tran) (US 2014/0300758) and Kim Peter et al. (hereinafter Kim) (WO 2020/112228 A1) in view of Carl Munkberg et al. (hereinafter Munkberg) (US 2021/0264562), as anticipative prior art under 35 U.S.C. 102(a)(1).

Re Claim 1. (Canceled)

Re Claim 2. (Currently Amended) Tran discloses a system (Fig. 5c) comprising: one or more processing units (implementing video analytics processing, Par.[0008], or H.264 by a coding processing unit 762, Fig. 5C, comprising a video analytics engine, Par.[0013-0021, 0033], etc.) to: locate, using a graphics processing unit (using graphics processing units (GPUs) attached/connected to the video source, Par.[0189, 0245-0247, 0353, 0358-0359, 0383..], Figs. 6-7), a depiction of an object in at least one image of a video stream (locating a particular object depiction in an image captured by the camera imaging unit, Par.[0198], the object being located in a specific image area, Par.[0208], of a video stream, Par.[0359-0361], where the GPUs are used to identify video objects and generate metadata, Par.[0019, 0392-0393, 0409]); determine, using at least one neural network (using neural networks, Par.[0033, 0048, 0183-0185], to process image data including a set of reference frames, Figs. 3E-F and 3G-3M, Par.[0231-0232, 0237-0238]), that the at least one image is to be included in a set of reference frames (the face recognition is a combination of a local image sample, a self-organizing map network, and a convolutional network, where the self-organizing map is trained on the vectors from the previous stage, i.e., the reference vectors of the previous stage, e.g., where "the same window as in the first step is stepped over all the images in the training and test sets," or further, "a multilayer perceptron neural network, is trained on the newly created training set," obviating that the "at least one image is to be included in a set of reference frames" and classified as suitable for image processing by a CNN, Par.[0231, 0384], as recited per Par.[0381]); and generate an encoded video stream by encoding the set of reference frames (encoding the reference frames to generate a coded video stream, Par.[0356-0359]).

While Tran discloses "an automatic object recognition which is a combination of a local image sample representation (i.e., a reference image sample), a self-organizing map network and a convolutional network for face recognition" (Par.[0381]), which implies the presence of the reference frame at the neural network layers, Tran does not determine that at least one image is to be included in a set of reference frames by using a neural network, according to the attributes of objects identified in the image(s). However, the analogous art of Kim teaches the use of a neural network to: locate, using a graphics processing unit, a depiction of an object in at least one image of a video stream (according to Fig. 3, locating by using a DNN (308/608) at least one image (306/606) from a training set of image frames (302/602), Par.[0024] and Par.[0051-0053], to identify the images that contain an object by detected features, Par.[0047]); determine, using at least one neural network (determining at a Classifier (310/610) using the neural network (308/608)), that the at least one image is to be included in a set of reference frames (by using image recognition, Par.[0028], to classify items into one of several category values of, e.g., objects, by attributes, Par.[0031-0035], further detailed at Fig. 3, as an image (306/606) included in the training set (302/602) is recognized at (314/614) to be properly classified as part of the image set of frames, Par.[0051-0054]); and generate an encoded video stream by encoding the set of reference frames (being also capable of encoding the result, i.e., generating a video stream, Par.[0070]).

However, Tran and Kim do not expressly teach the determination of a reference frame to be used in the frame/block reconstruction being based on motion information identified in the image. In an analogous art, Munkberg teaches the claimed: determine, using at least one neural network, that the at least one image is to be included in a set of reference frames based at least on motion information associated with the at least one image (determining the reference frame corresponding to an image based on motion vectors and blur radii, Par.[0012], according to Fig. 8, from a set of frames selected from a sequence of frames, Par.[0138-0147], where, according to Fig. 9A, the neural network 210, used to generate the reconstructed image 920, includes both motion blur effects and simulated anti-aliasing effects, Par.[0148-0149]).

One of ordinary skill would have found the incentive to seek the advantages offered by the use of multiple processors (GPUs) for independent slice processing, found in Tran at Par.[0360], and by the parallel GPU technique adaptable to neural networks for object/image recognition, in order to extract simple features at higher resolution, Par.[0378-0379]; to further seek the advantage of combining with known techniques for improving the performance of the neural network (i.e., DNN), identifying newer structures for the feature extraction layers along with the analysis of detected motion capable of generating more than one quality level of video data, Par.[0199], to further improve the parameter attribute identification in Kim, Par.[0033-0034, 0063] and Fig. 1 (generating an assessment 120 at the trained ML 116); and thereby to have the incentive to seek other methods of considering motion information in the determination of frame motion variations, identified in Munkberg, to determine and select the reference frame to be used in reconstruction per Fig. 8, Par.[0011-0012], and its post-processing techniques, Par.[0005], hence leading to the conclusion that the terms of the claimed matter are predictable.
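As an editorial aside, the disputed limitation ("included in a set of reference frames based at least on motion information") can be made concrete with a short sketch. The heuristic below is an assumption for illustration only; it is not Munkberg's disclosed method, which per the citation above involves motion vectors and blur radii (Par.[0012]).

```python
# Editorial illustration only: one plausible way "motion information" could
# gate a reference-frame decision. The mean-magnitude heuristic and the
# threshold value are assumptions, not any reference's disclosure.
import numpy as np

MOTION_THRESHOLD = 4.0  # hypothetical mean motion magnitude, pixels/frame

def include_as_reference(motion_vectors, threshold=MOTION_THRESHOLD):
    """Keep low-motion frames as references; fast motion makes a poor anchor."""
    magnitudes = np.linalg.norm(motion_vectors, axis=-1)  # per-block |mv|
    return float(magnitudes.mean()) < threshold

# Example: an 8x8 grid of per-block (dx, dy) motion vectors.
mvs = np.random.default_rng(1).normal(scale=2.0, size=(8, 8, 2))
print(include_as_reference(mvs))
```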
Re Claim 3. (Previously Presented) Tran, Kim and Munkberg disclose the system of claim 2. Kim teaches that the at least one neural network is to use differences of an attribute of the object in the video stream to determine the at least one image to be included in the set of reference frames (at a deep neural network (NN/DNN) calculating score attributes by machine-learning algorithms, i.e., based on differences between attributes of the identified features to find correlations among features, Par.[0034], containing objects, Par.[0047], according to the cost function returning how well the DNN maps the training examples to the correct output, Par.[0049-0050]).

Re Claim 4. (Previously Presented) Tran, Kim and Munkberg disclose the system of claim 2, wherein to locate the depiction of the object in the at least one image, the one or more processing units are further to: Kim teaches: locate a face of the object depicted in the at least one image (Fig. 3, Par.[0052]); determine at least one attribute corresponding to one or more facial expressions of the face (determining object attributes at (108) in Fig. 1, Par.[0033]); and include the at least one image in the set of reference frames based in part on the at least one attribute being greater than a threshold difference to the at least one attribute in one or more other reference frames of the set of reference frames (the training phase for image inclusion/selection includes a model satisfying the end-goal accuracy threshold, Par.[0040]).

Re Claim 5. (Previously Presented) Tran, Kim and Munkberg disclose the system of claim 2, wherein the at least one neural network is trained based at least in part on at least one of: Kim teaches: one or more past video streams, one or more past reference frames, and/or contents within the video stream (the input to the trained machine-learning program utilizes the message content, Par.[0036], where each of the neurons (208) of the NN (204) provides relational and sub-relational outputs for the current content of frames analyzed, Par.[0044]).

Re Claim 6. (Currently Amended) Tran, Kim and Munkberg disclose the system of claim 2, wherein the at least one neural network includes a Multi-Layer Perceptron (MLP) neural network, and Tran teaches that an input to the MLP neural network is the set of reference frames and prior image data corresponding to one or more other reference frames of the set of reference frames (the convolutional neural networks (CNNs) adopt the perceptron as a supervised machine-learning algorithm, Par.[0379], and the multi-layer perceptron (MLP), Par.[0381], representing reference frames corresponding to reconstructed prior images). Munkberg teaches the set of reference frames (Par.[0139-0143], Fig. 8).

Re Claim 7. (Previously Presented) Tran, Kim and Munkberg disclose the system of claim 2, wherein the at least one neural network is trained based at least in part on at least one of: Tran teaches: one or more prior reference frames or one or more initial frames of at least one specific content of the video stream (tracking movement based on previous reference frames, Par.[0357]). Kim also teaches this limitation (previous reference training set (302/602) with specific content (306/606) in Fig. 3).

Re Claim 8. (Previously Presented) Tran, Kim and Munkberg disclose the system of claim 2. Tran teaches that the at least one neural network is trained to generate or infer an appearance vector comprising one or more data values indicating one or more features contributing to inferred different attributes of the object in the at least one image (the GPU using vectors, Par.[0388], and where the CNN calculates the motion vectors in the parallelized CNNs, in face detection, etc., Par.[0384]). Kim also teaches this feature (support vector machines (SVM), Par.[0030], and Fig. 2, the vector input x, Par.[0043-0045, 0059]).
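As an editorial aside, the "appearance vector" recited in claim 8 is, in essence, a learned embedding. A minimal sketch, with hypothetical dimensions and an untrained projection standing in for a trained network (neither Tran's nor Kim's implementation):

```python
# Editorial sketch of claim 8's "appearance vector": a compact embedding whose
# entries reflect features of the depicted object. Dimensions and the random
# projection are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(2)
W_embed = rng.normal(size=(64, 16))  # hypothetical 64-d features -> 16-d vector

def appearance_vector(object_features):
    """Project raw object features to a 16-d appearance embedding."""
    return np.tanh(object_features @ W_embed)

vec = appearance_vector(rng.normal(size=64))
print(vec.shape)  # (16,)
```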
Re Claim 9. (Previously Presented) Tran, Kim and Munkberg disclose the system of claim 2. Kim teaches that the at least one neural network is trained using a training framework that is a generative adversarial network (GAN) or a StyleGAN (using a gan_driver or cnn_driver at Par.[0094], Fig. 12).

Re Claim 10. (Previously Presented) Tran, Kim and Munkberg disclose the system of claim 2. Tran teaches that the at least one neural network is trained using a training framework to infer or otherwise generate features for a specific object, subject, user, or person (a face detection CNN, Par.[0377-0379]). Kim also teaches this feature (Par.[0047]).

Re Claim 11. (Previously Presented) Tran, Kim and Munkberg disclose the system of claim 2. Kim teaches that the at least one neural network is trained using a training framework to infer or generate features for one or more objects and is configured to be further trained after deployment in the system (regenerating the NN architecture after the training, Par.[0083, 0095], Fig. 13).

Re Claim 12. (Previously Presented) This claim recites the method implementing each and every limitation of system claim 2; hence it is rejected on the same mapped evidence, mutatis mutandis.

Re Claim 13. (Previously Presented) This claim recites the method implementing each and every limitation of system claim 3; hence it is rejected on the same mapped evidence, mutatis mutandis.

Re Claim 14. (Previously Presented) This claim recites the method implementing each and every limitation of system claim 4; hence it is rejected on the same mapped evidence, mutatis mutandis.

Re Claim 15. (Previously Presented) This claim recites the method implementing each and every limitation of system claim 5; hence it is rejected on the same mapped evidence, mutatis mutandis.

Re Claim 16. (Currently Amended) This claim recites the method implementing each and every limitation of system claim 6; hence it is rejected on the same mapped evidence, mutatis mutandis.

Re Claim 17. (Previously Presented) This claim recites the method implementing each and every limitation of system claim 7; hence it is rejected on the same mapped evidence, mutatis mutandis.

Re Claim 18. (Previously Presented) This claim recites the method implementing each and every limitation of system claim 8; hence it is rejected on the same mapped evidence, mutatis mutandis.

Re Claim 19. (Currently Amended) This claim recites the system implementing each and every limitation of system claim 2, with the exception of naming the "graphics processing unit" (GPU), whose processing may in any application be performed at the CPU level, e.g., a CPU/GPU used for object recognition, as disclosed in Tran at Par.[0392-0393]; hence it is rejected on the same mapped evidence, mutatis mutandis.

Re Claim 20. (Previously Presented) This claim recites the method implementing each and every limitation of system claim 3, reciting the same limitations in haec verba; hence it is rejected on the same mapped evidence, mutatis mutandis.
Re Claim 21. (Previously Presented) Tran, Kim and Munkberg disclose the system of claim 19, wherein the system is comprised in at least one of: Tran teaches: a system for performing simulation operations; a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations; a system implemented using an edge device; a system implemented using a robot; a system for performing conversational AI operations; a system for generating synthetic data; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources (the system comprising, among others, a cloud-based parallel video search engine, Par.[0407-0410, 0414-0418, 0424]).

Conclusion

5. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DRAMOS KALAPODAS. The examiner can normally be reached 8:00-6:00 Monday-Thursday and every other Friday. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, David Czekaj, can be reached at (571) 272-7327. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DRAMOS KALAPODAS/
Primary Examiner, Art Unit 2487

Prosecution Timeline

Jul 22, 2024: Application Filed
Nov 06, 2025: Non-Final Rejection — §102, §103, §DP
Mar 02, 2026: Applicant Interview (Telephonic)
Mar 02, 2026: Examiner Interview Summary
Mar 12, 2026: Response Filed
Apr 06, 2026: Final Rejection — §102, §103, §DP (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12604039: SIGN PREDICTION FOR BLOCK-BASED VIDEO CODING (2y 5m to grant; granted Apr 14, 2026)
Patent 12598327: RESIDUAL CODING CONSTRAINT FLAG SIGNALING (2y 5m to grant; granted Apr 07, 2026)
Patent 12598301: BDPCM-BASED IMAGE CODING METHOD AND DEVICE THEREFOR (2y 5m to grant; granted Apr 07, 2026)
Patent 12593044: DEEP CONTEXTUAL VIDEO IMAGE COMPRESSION (2y 5m to grant; granted Mar 31, 2026)
Patent 12593022: STEREOSCOPIC DISPLAY SYSTEM AND LIQUID CRYSTAL SHUTTER DEVICE (2y 5m to grant; granted Mar 31, 2026)
Based on the five most recent grants; study what changed in each to get past this examiner.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 79%
With Interview: 99% (+28.2%)
Median Time to Grant: 2y 5m
PTA Risk: Moderate
Based on 713 resolved cases by this examiner; grant probability is derived from the career allow rate.
