DETAILED ACTION
This is in response to the amendment filed on February 19, 2026.
Response to Arguments
Applicant’s arguments, see pp. 8-10, filed 2/19/2026, with respect to the rejection of claims 1-20 under 35 U.S.C. § 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground of rejection is made in view of Yelton (US 12,439,129 B2).
Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Jen et al. (US 2020/0327864 A1) in view of Sztejna et al. (US 2022/0201061 A1), Lee et al. (US 2021/0133943 A1), and Yelton (US 12,439,129 B2).
Regarding claim 1, Jen discloses a method performed by a media device (video processing system – see Fig. 1, item 100), comprising:
a video frame from a video signal that is received by the media device (obtain video frame – see paragraphs 18-19, Figs. 1-2);
providing the video frame as input to a machine learning model that analyzes … and outputs a set of picture quality parameter values … for enhancing a picture quality of the video frame (provide video frame to AI processor which produces picture quality parameters – Figs. 1-2, abstract, paragraphs 2, 16-18, 20);
receiving the set of picture quality parameter values output by the machine learning model (picture quality engine receives the parameters from the AI – see Figs. 1-2, abstract, paragraphs 2, 16-18, 20);
modifying the video frame based on the set of picture quality parameter values to generate a modified video frame (picture quality engine modifies the video frame – Figs. 1,2, 4 and paragraphs 16, 20-23); and
providing the modified video frame to a display device for presentation thereby (display enhanced video frames – abstract, paragraphs 2-4, 18 and Figs. 1-2).
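For illustration only, the pipeline mapped above may be sketched as follows. All function names, array shapes, and parameter values here are hypothetical and are not drawn from Jen; a real system would run a trained network where the stand-in analysis function appears.

```python
# Minimal sketch of the mapped picture-quality pipeline.
# All names and values are hypothetical, not taken from Jen.
from dataclasses import dataclass

import numpy as np


@dataclass
class PictureQualityParams:
    sharpness: float  # e.g., filter strength
    contrast: float   # multiplicative contrast gain
    color: float      # saturation scale


def analyze_frame(frame: np.ndarray) -> PictureQualityParams:
    """Stand-in for the machine learning model that analyzes the frame
    and outputs a set of picture quality parameter values."""
    return PictureQualityParams(sharpness=0.5, contrast=1.1, color=1.05)


def modify_frame(frame: np.ndarray, p: PictureQualityParams) -> np.ndarray:
    """Modify the video frame based on the set of parameter values."""
    out = frame.astype(np.float32)
    out = (out - 128.0) * p.contrast + 128.0  # contrast adjustment
    # (sharpness and color adjustments would be applied similarly)
    return np.clip(out, 0.0, 255.0).astype(np.uint8)


frame = np.zeros((1080, 1920, 3), dtype=np.uint8)     # a video frame
modified = modify_frame(frame, analyze_frame(frame))  # provide to display
```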
Jen does not explicitly disclose reconstructing, by at least one computer processor of the media device, a video frame from a video signal, but this is taught by Sztejna as a media processor that can process a stream and reconstruct a video frame (paragraph 17, Fig. 1). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Jen with the frame reconstruction taught by Sztejna for the purpose of enhancing picture quality. Sztejna teaches that the reliability of network paths is not assured, so reconstructing video frames from packets is useful for video processing (paragraphs 19-20).
The combination of Jen and Sztejna does not explicitly disclose a machine learning model that analyzes visual content of the video frame [to generate a modified video frame]. However, this is taught by Lee as using models to analyze individual video frames including images (i.e., visual content) and producing enhanced video content using the machine learning model’s output (abstract, Figs. 3-5, paragraphs 19 and 150-165). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Jen and Sztejna with the technique taught by Lee so that a machine learning model analyzes video content in order to enhance video quality. Lee suggests that, by analyzing video frame content, only the particular frames that need enhancement are processed by the model, thereby saving time and resources (paragraphs 16-17 and 27-29).
The combination of Jen, Sztejna and Lee does not disclose determine a type of scene depicted in the visual content and output… a set of audio quality parameter values for enhancing … audio quality of an audio stream associated with the visual content of the video frame in a manner that is specific to the determined type of scene, wherein the set of audio quality parameter values comprise at least audio equalization settings; and
modifying the audio stream based on the set of audio quality parameter values to generate a modified audio stream for playback by the media device.
But this is taught by Yelton as a system that determines adjustments to the volume of individual audio components (e.g., equalization settings) based on a determined type of scene in a media segment/asset (abstract, Figs. 1 and 6, col. 1 ln. 34-41, col. 6 ln. 61 – col. 7 ln. 14, and col. 12 ln. 53-56; see also claim 1). Yelton also teaches using a neural network (i.e., a machine learning model) to analyze the audio and determine the audio component adjustment (see col. 9 ln. 61 – col. 10 ln. 25).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Jen, Sztejna and Lee to also adjust audio parameters based on a determined scene type as taught by Yelton for the purpose of enhancing user experience. Yelton suggests that by dynamically adjusting audio quality based on a determined scene type, a better user experience is provided because individual audio component adjustment improves upon the prior art of uniformly adjusting the volume and allows for user personalization via profiles (col. 1 ln. 20 – col. 3 ln. 25).
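For illustration only, scene-specific audio equalization of the kind claimed may be sketched as follows. The scene labels, band layout, and dB values are hypothetical and are not taken from Yelton.

```python
# Minimal sketch of scene-specific audio equalization.
# Scene labels, band layout, and dB values are hypothetical.
from typing import Dict, List

# Audio quality parameter values: per-band gains in dB, keyed by the
# determined type of scene.
EQ_PRESETS: Dict[str, List[float]] = {
    "dialogue": [0.0, 3.0, 4.0, 1.0, -1.0],  # boost mids for speech
    "action":   [4.0, 1.0, 0.0, 1.0, 3.0],   # boost lows and highs
    "music":    [2.0, 0.0, 0.0, 0.0, 2.0],
}


def audio_params_for_scene(scene_type: str) -> List[float]:
    """Output equalization settings specific to the determined scene type."""
    return EQ_PRESETS.get(scene_type, [0.0] * 5)


def db_to_linear(db: float) -> float:
    """Convert a dB gain to the linear gain used by a filter bank."""
    return 10.0 ** (db / 20.0)


# The linear gains would drive a per-band filter over the audio stream
# to generate the modified audio stream for playback.
gains = [db_to_linear(g) for g in audio_params_for_scene("dialogue")]
```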
Regarding claim 2, Jen does not explicitly disclose that reconstructing the video frame comprises one of: reconstructing the video frame from an encoded video signal received via a network. But this is taught by Sztejna, as the video is received via a network (paragraph 17, Fig. 1). The motivation to combine is the same as set forth for claim 1.
Regarding claim 3, Jen does not disclose a convolutional neural network (CNN), but this is well known in the art (see Lee paragraphs 67, 160). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Jen with the CNN taught by Lee. Lee suggests that the CNN performs the process, which allows the benefits discussed above to be realized.
Regarding claim 4, Jen discloses the machine learning model executes on the media device (model executes on device – see Figs. 1-2).
Regarding claim 5, Jen does not explicitly disclose that the CNN executes on a neural processing unit, but this is taught by Lee as explained above (Lee discloses a neural network – see abstract, paragraphs 19 and 67). The motivation to combine is the same as set forth for claim 1.
Regarding claim 6, Jen discloses providing, via a network, the video frame as input to a machine learning model (Figs. 1-2). Jen does not explicitly disclose that the model executes on a remote device. However, Sztejna teaches the well-known concept of offloading processing to a remote device (paragraphs 21, 41 and Fig. 8). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Jen with the offloading technique taught by Sztejna so that the model executes on a remote device. This is merely the combination of a well-known technique (i.e., offloading) according to its established function in order to yield a predictable result (i.e., less work for the host device).
Also, Lee explicitly discloses that the machine learning model may execute on a remote device such as a server (paragraph 125, Fig. 1). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Jen so that the remote device executes the model as taught by Lee. Lee suggests that “heavy” models requiring more power may be better executed by a server, whereas lightweight models are executed by edge/client devices (paragraph 126).
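For illustration only, providing a frame to a model executing on a remote device may be sketched as follows. The endpoint URL and the JSON response shape are hypothetical; only the Python standard library is used.

```python
# Minimal sketch of offloading model inference to a remote device.
# The endpoint URL and response shape are hypothetical.
import json
import urllib.request


def remote_picture_params(frame_bytes: bytes, url: str) -> dict:
    """Provide the frame to a model executing on a remote device and
    return the picture quality parameter values it outputs."""
    req = urllib.request.Request(
        url,
        data=frame_bytes,
        headers={"Content-Type": "application/octet-stream"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Example call (hypothetical endpoint):
# params = remote_picture_params(frame.tobytes(), "http://server.example/infer")
```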
Regarding claim 7, Jen discloses that receiving the set of picture quality parameter values comprises receiving one or more of: a sharpness parameter … a color parameter, a contrast parameter (paragraph 20).
Regarding claim 8, Jen discloses receiving one or more of: a value corresponding to a picture quality parameter to be applied to an entirety of the video frame; or a value corresponding to a picture quality parameter to be applied to a portion of the video frame (in one embodiment, picture quality enhancement is applied to the “frame,” thus teaching at least “an entirety of the video frame” – see paragraph 21, Fig. 2).
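For illustration only, the distinction between applying a parameter value to the entirety of a frame and to a portion of it may be sketched as follows; the region bounds and gain value are hypothetical.

```python
# Minimal sketch: a parameter applied to the whole frame vs. a region.
# Region bounds and gain value are hypothetical.
import numpy as np


def apply_gain(pixels: np.ndarray, gain: float) -> np.ndarray:
    """Apply a simple picture quality parameter (a brightness gain)."""
    return np.clip(pixels.astype(np.float32) * gain, 0.0, 255.0).astype(np.uint8)


frame = np.full((1080, 1920, 3), 100, dtype=np.uint8)

# Value applied to the entirety of the video frame:
whole = apply_gain(frame, 1.2)

# Value applied to a portion of the video frame (a region of interest):
partial = frame.copy()
partial[100:400, 200:800] = apply_gain(partial[100:400, 200:800], 1.2)
```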
Regarding claim 9, it is a device claim that corresponds to the method of claim 1. The corresponding limitations are rejected for the same reasons. Jen also discloses a memory and a processor to perform the operations as recited by the claim (video processing “system” – includes memory and processor/circuits – see abstract, Fig. 1).
Regarding claims 10-16, they correspond to previously presented dependent claims 2-8 respectively. Thus, they are rejected for the same reasons.
Regarding claim 17, it is a non-transitory computer-readable medium claim that corresponds to the method of claim 1; thus, it is rejected for the same reasons.
Regarding claims 18-20, they correspond to claims 3 and 7-8 respectively, and thus are rejected for the same reasons.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
F. Nasiri, W. Hamidouche, L. Morin, N. Dhollande and G. Cocherel, "A CNN-Based Prediction-Aware Quality Enhancement Framework for VVC," in IEEE Open Journal of Signal Processing, vol. 2, pp. 466-483, 2021, doi: 10.1109/OJSP.2021.3092598; discloses using a CNN to enhance a single video frame by analysis of individual pixels (Section II).
Sun et al. US 2022/0188976 A1 discloses using a neural network to improve video picture quality (abstract, paragraph 12, Fig. 1).
Wang et al. US 2022/0261960 A1 discloses reconstructing a video frame (abstract) and using a model to improve the picture quality of the reconstructed video frame (paragraph 23, Fig. 1).
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASON D RECEK whose telephone number is (571)270-1975. The examiner can normally be reached Flex M-F 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Umar Cheema can be reached at 571-270-3037. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JASON D RECEK/Primary Examiner, Art Unit 2458