DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the amendments dated 02/18/2026, applicant amended claims 1, 12, 16 – 18 and 27, including dependent claim 18, in which the examiner previously indicated allowable subject matter. Claims 1 and 7 – 30 remain pending in this application.
Response to Arguments
Applicant argues on pages 9 - 10 of the remarks filed 02/18/2026 that the previously cited references fail to meet the limitations of the newly amended and rearranged claims. However, as set forth more fully in the rejections below, the previously cited references (Habibian et al., U.S. Pre-Grant Publication No. 2020/0304804 A1, and Mandt et al., U.S. Pre-Grant Publication No. 2020/0090069 A1) continue to meet the limitations of the claims in part: Habibian teaches generating a video frame, encoding said video frame into a latent vector (e.g., code, representation, or space), modifying said latent vector, and decoding it back to an original frame; and Mandt teaches quantizing a latent code/representation to predict subsequent frames, decoding the quantized latent code using a decoder of a generative model to reconstruct a video frame, and predicting a difference/variable between frames using a predictive model. Habibian and Mandt therefore continue to disclose or suggest the aforementioned limitations in part, as required by claims 1, 12, 16 and 27. Accordingly, these arguments are not persuasive.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 7, 8, 11, 12, 15 – 23, 26, 27 and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Habibian et al. (U.S. Pre-Grant Publication No. 2020/0304804 A1, hereinafter 'Habibian') in view of Mandt et al. (U.S. Pre-Grant Publication No. 2020/0090069 A1, hereinafter 'Mandt') and further in view of Li (U.S. Pre-Grant Publication No. 2020/0128307 A1, hereinafter 'Li').
With respect to claim 1, Habibian teaches a method of processing video data, the method comprising: generating a video frame using received video data (e.g., generating a video frame using received video content, ¶0007 - ¶0010, ¶0055 with ¶0101, Fig. 2D);
encoding said video frame into a latent vector using an encoder part of a generative model in response to determining a reduction in generating the video frames using the received video data (e.g., encoding said video frame into a first feature/distance vector (e.g., code or representation) using an encoder part of a generative model upon determining a reduction in generating video frames using the received video content, ¶0011, ¶0056 - ¶0057, ¶0069 - ¶0072, ¶0088, Fig. 2D, Fig. 5);
modifying the latent vector (e.g., the first feature/distance vector (e.g., code or representation) is convolved into a second feature vector, ¶0057 with ¶0080); and
decoding the modified latent vector using a decoder part of the generative model to generate a new video frame (e.g., decoding a second feature vector using a decoder of the generative model to generate a reconstructed version, ¶0009, ¶0060, ¶0069, ¶0072 - ¶0073, ¶0106, ¶0148, Fig. 6); but fails to teach that
(a) said latent vector, when modified, is to represent a temporally subsequent new video frame, wherein determining said reduction comprises detecting or predicting an inter-time gap between frames displayed on a display being above a threshold using a prediction model; and
(b) said reduction is due to insufficient received video data to generate the video frames, wherein determining the reduction comprises detecting or predicting an inter-time gap between frames being above a threshold using an inter-frame delay prediction model.
However, with respect to difference (a) above, the aforementioned claimed limitations are well known in the art, as evidenced by Mandt. In particular, Mandt teaches modifying the latent vector to represent a temporally subsequent new video frame (e.g., quantizing a latent code/representation to characterize (or predict) subsequent video frames, ¶0009, ¶0056 - ¶0068, Fig. 1B); and decoding the modified latent vector using a decoder part of the generative model to generate the new video frame (e.g., decoding the quantized latent code/representation using a decoder of the generative model to reconstruct a video frame, ¶0069 - ¶0071, ¶0086, ¶0107, Fig. 1B); wherein determining said reduction comprises detecting or predicting an inter-time gap (e.g., difference) between frames displayed on a display being above a threshold using a prediction model (e.g., wherein the reduction includes predicting an interpolation between multiple frames from a target content exceeding a threshold using a predictive model, ¶0056 - ¶0060, ¶0062, ¶0065 - ¶0070 & ¶0073; note that said target content is already presented (e.g., displayed in real time) while the representation model, encoder, quantizer, or decoder is applied in an electronic device, ¶0073, ¶0079 with Figs. 3A - 3C).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Habibian as taught by Mandt, since Mandt suggests within ¶0070 - ¶0071 that such a modification would reduce the compressed file length or redundant latent variables in order to improve compression, thereby keeping image quality intact.
Habibian, modified by Mandt, fails to teach difference (b) above. However, also in the same field of endeavor of processing video data using an encoder, vector(s), and a decoder, Li teaches wherein a reduction is due to insufficient received video data to generate video frames (e.g., a reduction is due to data size or redundancy that cannot be used in real time because the entire data set is required for compression, ¶0008 - ¶0010, ¶0086, ¶0097 & ¶0217); and wherein determining the reduction comprises detecting or predicting an inter-time gap (e.g., difference) between frames being above a threshold using an inter-frame delay prediction model (e.g., detecting/predicting/determining a difference between frames being greater than a threshold using an inter-frame prediction model, ¶0091 - ¶0092 with ¶0207; it is well known in the art that an inter-frame delay prediction model is used to compensate for the delay introduced in real time for each frame).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Habibian in view of Mandt as taught by Li, since Li suggests within ¶0010, ¶0091 - ¶0092 with ¶0207 that such a modification of using the inter-frame prediction of Li instead of the intra-frame prediction of Mandt would reduce computation complexity, power consumption, and/or memory requirements by comparing frames and storing only the changes between them for smaller sizes, and would improve compression ratio(s), coding efficiency, robustness in error-prone environments, and content-based random access.
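For illustration of the mapping above, the following is a minimal sketch, assuming a toy linear generative model, of the encode, modify-latent, quantize, and decode pipeline that the combination of Habibian, Mandt, and Li is read to suggest. All identifiers are hypothetical and are not drawn from the cited references.

```python
# Illustrative sketch only: a toy encode -> modify-latent -> decode pipeline
# mirroring the claim-1 mapping above. All names are hypothetical; this is
# not code from Habibian, Mandt, or Li.
import numpy as np

rng = np.random.default_rng(0)
FRAME_DIM, LATENT_DIM = 64, 8

# Stand-in weights for a (linear) generative model's encoder and decoder.
W_enc = rng.standard_normal((LATENT_DIM, FRAME_DIM)) * 0.1
W_dec = rng.standard_normal((FRAME_DIM, LATENT_DIM)) * 0.1

def encode(frame):
    """Map a flattened video frame to a latent vector."""
    return W_enc @ frame

def decode(latent):
    """Reconstruct a frame from a (possibly modified) latent vector."""
    return W_dec @ latent

def predict_next_frame(frame, motion_step):
    """Encode, modify the latent vector to represent a temporally
    subsequent frame (cf. Mandt), quantize the latent code, and
    decode to generate the new video frame."""
    z = encode(frame)
    z_next = z + motion_step             # modify the latent vector
    z_next = np.round(z_next * 16) / 16  # toy quantization of the latent code
    return decode(z_next)

frame = rng.standard_normal(FRAME_DIM)
step = rng.standard_normal(LATENT_DIM) * 0.01
new_frame = predict_next_frame(frame, step)
```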
With respect to claim 7, Habibian in view of Mandt and further in view of Li teaches the method of claim 1, wherein modifying the latent vector comprises moving a position corresponding to the latent vector in a latent space by a step for one or more new video frames (e.g., an autoencoder 400 may take a first training video (designated x) and map the first training video to a code z in a latent code space; wherein the encoder 402 may be implemented as a three-dimensional convolutional network such that the latent code space has at each (x, y, t) position a vector describing a block of video centered at that position. Each block may be set to an a priori defined horizontal and vertical size over a number of frames. The x coordinate may represent a horizontal pixel location in the block of video, the y coordinate may represent a vertical pixel location in the block of video, and the t position may represent a timestamp in the block of video. By using the three dimensions of horizontal pixel location, vertical pixel location, and time, the vector may describe an image patch across a plurality of frames. In some aspects, however, encoder 402 may map frames of a video in a two-dimensional space using a two-dimensional convolutional network. A code model used in mapping frames of a video in a two-dimensional space may make use of redundancy between adjacent frames (e.g., same or similar information included in successive frames), ¶0071, ¶0141).
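As an illustrative aside, the (x, y, t) latent code space described above may be sketched as a grid of per-position vectors, with "moving a position corresponding to the latent vector by a step" realized as indexing an adjacent t position. The grid dimensions and names below are hypothetical assumptions, not taken from Habibian.

```python
# Hypothetical sketch of a latent code space indexed by (x, y, t), where each
# position holds a vector describing a block of video centered at that position.
import numpy as np

H, W, T, C = 4, 4, 3, 8   # assumed grid height, width, frame count, channels
latent_grid = np.random.default_rng(1).standard_normal((H, W, T, C))

def latent_at(x, y, t):
    """Vector describing the video block centered at position (x, y, t)."""
    return latent_grid[y, x, t]

def step_in_time(x, y, t, step=1):
    """Move the latent position along t so the vector represents a
    temporally later block (one or more new video frames)."""
    return latent_at(x, y, min(t + step, T - 1))

z_now = latent_at(1, 2, 0)
z_next = step_in_time(1, 2, 0)
```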
With respect to claim 8, Habibian in view of Mandt and further in view of Li teaches the method of claim 7, wherein the size and/or direction of the step is dependent on one or more of the following: the rate of change in a sequence of video frames prior to the reduction in generating the video frames using the received video data; a prediction of future video frames; an application using the video data (e.g., at least frame rates, size of feature maps, or video content (sequence) are among the factors prior to the reduction, ¶0056, ¶0068, ¶0087, ¶0096, ¶0140).
With respect to claim 11, Habibian in view of Mandt and further in view of Li teaches the method of claim 1, comprising displaying the video frames using one or more of the following applications: video on demand; real-time video; gaming; artificial reality; augmented reality (e.g., a receiving device can decompress the received compressed video for output (e.g., to a display) in video hosting service(s) or playback, ¶0003, ¶0066 - ¶0069, ¶0107 & ¶0147; Mandt also notes an augmented reality device, ¶0073).
With respect to claim 12, it is rejected for reasons similar to those described in connection with claim 1, with the addition that, after learning, a DCN may be presented with new images, and a forward pass through the network may yield an output 222 that may be considered an inference or a prediction of the DCN, as shown in ¶0042, ¶0060 and ¶0069.
With respect to claim 15, Habibian in view of Mandt and further in view of Li teaches the method of claim 12, comprising: forwarding the first generative model to a server and receiving an updated first generative model from the server; using the updated first generative model to encode the video frame and to decode the modified vector (e.g., a number of machine learning models and computational devices may be modified as desired. Further, the functionality included in any of the applications may be divided across any number of applications or other software that are stored and executed via any number of devices located in any number of physical locations, ¶0039, ¶0042, ¶0054, ¶0099; the device can be an in-between device that receives data and transmits the received data to another device).
With respect to claim 16, it is rejected for reasons similar to those described in connection with claim 1.
With respect to claim 17, Habibian in view of Mandt and further in view of Li teaches the apparatus of claim 16, wherein Li teaches the apparatus being operative to determine a reduction in generating video frames from received video data by: predicting or detecting a predetermined change in a performance metric for a connection used to receive the video (e.g., performance metric(s) such as a given compression ratio used to make the determination, ¶0202 - ¶0203).
With respect to claim 18, Habibian in view of Mandt and further in view of Li teaches the apparatus of claim 16, wherein Li generally teaches that the inter-frame delay prediction model utilizes a Long Short-Term Memory (LSTM) network or a WaveNet architecture (e.g., wherein the inter-frame prediction of Li is associated with at least a Long Short-Term Memory (LSTM) network, ¶0197).
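For illustration only, an inter-frame delay prediction model built around an LSTM network, as recited in claim 18, might be sketched as follows. PyTorch is assumed for brevity; the threshold, units, and identifiers are hypothetical and are not taken from Li.

```python
# Hypothetical sketch: an LSTM that predicts the next inter-frame time gap
# from a history of observed gaps, flagging a reduction when the predicted
# gap exceeds a threshold. Not code from any cited reference.
import torch
import torch.nn as nn

class InterFrameDelayPredictor(nn.Module):
    def __init__(self, hidden=16):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, gaps):
        # gaps: (batch, seq_len, 1) observed inter-frame gaps in milliseconds
        out, _ = self.lstm(gaps)
        return self.head(out[:, -1, :])   # predicted next gap (ms)

model = InterFrameDelayPredictor()
history = torch.tensor([[[16.7], [16.9], [25.0], [40.1]]])  # toy gap history
THRESHOLD_MS = 33.3                       # assumed threshold (~30 fps)
reduction_detected = model(history).item() > THRESHOLD_MS
```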
With respect to claim 19, Habibian in view of Mandt and further in view of Li teaches the apparatus of claim 17, wherein Li teaches that the performance metric is one or more of the following: packet delay; packet variation; bandwidth; received power (e.g., the performance metric contains at least packet-related traffic rates, power consumption, or bandwidth saving, etc., ¶0010, ¶0125, ¶0203, Table 24).
With respect to claim 20, Habibian in view of Mandt and further in view of Li teaches the apparatus of claim 16, wherein the generative model is a variational autoencoder, VAE (e.g., VAE, ¶0079).
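For context, the variational autoencoder named in claim 20 may be sketched minimally as an encoder producing a mean and log-variance, a reparameterized sample, and a decoder. The dimensions and names below are illustrative assumptions, not drawn from Habibian.

```python
# Hypothetical minimal VAE sketch (PyTorch assumed); illustrates only the
# standard encode -> reparameterize -> decode structure of a VAE.
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, frame_dim=64, latent_dim=8):
        super().__init__()
        self.enc_mu = nn.Linear(frame_dim, latent_dim)
        self.enc_logvar = nn.Linear(frame_dim, latent_dim)
        self.dec = nn.Linear(latent_dim, frame_dim)

    def forward(self, x):
        mu, logvar = self.enc_mu(x), self.enc_logvar(x)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        return self.dec(z), mu, logvar

recon, mu, logvar = TinyVAE()(torch.randn(1, 64))
```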
With respect to claim 21, Habibian in view of Mandt and further in view of Li teaches the apparatus of claim 16, operative to: receive a pretrained generative model; train the generative model using a sequence of video frames; and/or receive weight data to update the generative model (e.g., a deep convolutional network 350 may also include one or more fully connected layers, such as layer 362A and layer 362B. The deep convolutional network 350 may further include a logistic regression (LR) layer 364. Between each layer 356, 358, 360, 362, 364 of the deep convolutional network 350 are weights that are to be updated, ¶0065).
With respect to claims 22 - 23, these are rejected for reasons similar to those described in connection with claims 7 and 8, respectively.
With respect to claim 26, it is rejected for reasons similar to those described in connection with claim 11.
With respect to claim 27, it is rejected for reasons similar to those described in connection with claim 16.
With respect to claim 30, it is rejected for reasons similar to those described in connection with claim 12.
Claims 9, 10, 24 and 25 are rejected under 35 U.S.C. 103 as being unpatentable over Habibian in view of Mandt and Li, and further in view of Bushell et al. (U.S. Pre-Grant Publication No. 2014/0362918 A1, hereinafter 'Bushell').
With respect to claim 9, Habibian in view of Mandt and further in view of Li teaches the method of claim 1, but fails to teach: comprising switching from new video frames generated using the decoder part to video frames generated using received video data in response to determining an increase in generating the video frames using the received video data.
However, in the same field of endeavor of video coding, encoders, and decoders, Bushell teaches switching from new video frames generated using the decoder part to video frames generated using received video data in response to determining an increase in generating the video frames using the received video data (e.g., after determining that there is an increase in generating video frames, changing/switching video frames using the decoder, abstract, ¶0021 - ¶0024 & ¶0060).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Habibian in view of Mandt and further in view of Li as taught by Bushell, since Bushell suggests in the abstract, ¶0021 - ¶0024 & ¶0060 that such a modification of applying a filtering operation to the frame data would improve the efficiency of the coding operations applied by a coder in order to reduce the video bit rate.
With respect to claim 10, the combination of Habibian, Mandt, Li and Bushell teaches the method of claim 9, wherein Bushell teaches that the switching comprises blending the new video frames generated using the decoder part and the video frames generated using received video data, with a weight of the video frames generated using received video data in the blending increasing over time (under the examiner's interpretation: e.g., a decoder may keep adding (combining) video frames (or additional droppable frames) using the decoder in real time due to the limited resources of a device, ¶0014, ¶0021, ¶0027 - ¶0028 & ¶0034; or this could be an interpolation of frames using a decoder, ¶0056 - ¶0057 of Mandt).
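For illustration, the time-weighted blending read onto claim 10 may be sketched as below, under the assumption of a linear ramp for the weight of the received-data frames; the ramp schedule and names are the examiner's illustrative assumptions, not taken from Bushell.

```python
# Hypothetical sketch: blend decoder-generated frames with frames generated
# from received video data, with the received-data weight rising over time.
import numpy as np

def blend(decoded, received, t, ramp=10):
    """Weight w of the received-data frame increases with time step t."""
    w = min(t / ramp, 1.0)               # assumed linear ramp to full weight
    return (1.0 - w) * decoded + w * received

frames = [blend(np.zeros(4), np.ones(4), t) for t in range(12)]
```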
With respect to claims 24 - 25, these are rejected for reasons similar to those described in connection with claims 9 and 10, respectively.
Claims 13, 14, 28 and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Habibian in view of Mandt and Li, and further in view of Zhang et al. (U.S. Pre-Grant Publication No. 2021/0174072 A1, hereinafter 'Zhang').
With respect to claim 13, Habibian in view of Mandt and further in view of Li teaches the method of claim 12, wherein Habibian teaches all the limitations except for second generative models received from a plurality of devices, the second generative models having respective model weights, and aggregating the model weights to generate the first generative model.
However, Zhang teaches receiving at least a second generative model from a plurality of devices (e.g., receiving a second generative model 30c, ¶0107, Fig. 7), the second generative models having respective weights (e.g., said second generative model 30c having model weights, ¶0105 - ¶0107, Fig. 7), and aggregating the model weights to generate the first generative model (e.g., adjusting/applying weights to generate a new model, ¶0085, ¶0107 - ¶0110, Fig. 1).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Habibian in view of Mandt and further in view of Li as taught by Zhang, since Zhang suggests within ¶0105 - ¶0107 and Fig. 7 that such a modification would improve accuracy in microexpression image recognition in order to obtain the best quality output possible.
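For illustration, aggregating the model weights of second generative models received from a plurality of devices, as recited in claim 13, may be sketched as a simple element-wise (federated-averaging-style) mean; this sketch is the examiner's illustrative assumption and is not drawn from Zhang.

```python
# Hypothetical sketch: average per-device weight tensors into the weights
# of a first generative model.
import numpy as np

def aggregate_weights(device_weights):
    """Element-wise mean of the devices' model-weight arrays."""
    return np.mean(np.stack(device_weights), axis=0)

w_devices = [np.random.default_rng(i).standard_normal(8) for i in range(3)]
w_first_model = aggregate_weights(w_devices)
```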
With respect to claim 14, the combination of Habibian, Mandt, Li and Zhang teaches the method of claim 13, wherein Zhang teaches that a second generative model is received from the first device and used to generate the first generative model (Zhang: e.g., receiving a second generative model 30c, the second generative model 30c having model weights, and adjusting the weights to generate a new model, ¶0085, ¶0107, Fig. 7).
With respect to claim 28, Habibian in view of Mandt and further in view of Li teaches the apparatus of claim 27, wherein Habibian teaches all the limitations except for second generative models received from a plurality of devices, the second generative models having respective model weights, and aggregating the model weights to generate the first generative model.
However, Zhang teaches receiving at least a second generative model from a plurality of devices (e.g., receiving a second generative model 30c, ¶0107, Fig. 7), the second generative models having respective weights (e.g., said second generative model 30c having model weights, ¶0105 - ¶0107, Fig. 7), and aggregating the model weights to generate the first generative model (e.g., adjusting/applying weights to generate a new model, ¶0085, ¶0107 - ¶0110, Fig. 1).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Habibian in view of Mandt and further in view of Li as taught by Zhang, since Zhang suggests within ¶0105 - ¶0107 and Fig. 7 that such a modification would improve accuracy in microexpression image recognition in order to obtain the best quality output possible.
With respect to claim 29, it is rejected for reasons similar to those described in connection with claim 14.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JUAN M GUILLERMETY whose telephone number is (571)270-3481. The examiner can normally be reached from 9:00 AM to 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Benny Q TIEU, can be reached at 571-272-7490. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JUAN M GUILLERMETY/Primary Examiner, Art Unit 2682