DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Applicant(s) Response to Official Action
The response filed on 12/03/2025 has been entered and made of record.
Response to Arguments/Amendments
Applicant's arguments have been fully considered but are not persuasive. The examiner's response to the presented arguments follows below.
Claim Rejections - 35 USC § 103
Summary of Arguments:
Regarding claim 1, the applicant argues that Yoneda (US 2017/0186365 A1) in view of Krishnan (US 2018/0322941 A1) in further view of Davies (EP-4436048-A1) does not disclose:
“None of Yoneda, Krishnan, or Davies discloses or suggests an intermediary transcoding network element. Each reference discloses operations performed at a transmitter or receiver; there is no intermediary transcoding network element performing either diffusion-based transcoding or standard codec transcoding.” [Remarks: Page 5]
Regarding claims 2-10, the applicant argues:
“Claims 2-8 are dependent upon claim 1 and therefore should also be in a condition for allowance. Claim 9 has language consistent with amended claim 1 and therefore should also be in a condition for allowance, along with its dependent claim 10.” [Remarks: Page 5]
Examiner’s Response:
Regarding claim 1, the examiner contends:
Davies discloses an intermediary transcoding network element (Davies: Fig. 4 & Paras. [0094], [0141] disclose a transmitting edge server (TxES) 24 acting as an intermediary transcoding network element that receives scene observation (video data) from a transmitting UE [network source].). Please see the new mappings of Davies teaching the intermediary transcoding network element feature in the 35 U.S.C. 103 rejection of claim 1 below.
Regarding claims 2-10, the examiner contends:
See examiner’s response for claim 1 above.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-10 are rejected under 35 U.S.C. 103 as being unpatentable over Yoneda et al., hereinafter referred to as Yoneda (US 2017/0186365 A1), in view of Krishnan et al., hereinafter referred to as Krishnan (US 2018/0322941 A1), and in further view of Davies et al., hereinafter referred to as Davies (EP-4436048-A1).
As per claim 1, Yoneda discloses a method (Yoneda: Abstract), comprising:
receiving input frames of video information (Yoneda: Paras. [0074], [0077] disclose a device 10 including a decoder 30 that functions as a network element which receives input video information in the form of bitstream data (signal BD).);
selecting … (Yoneda: Paras. [0075], [0084], [0096], [0101] disclose a determination circuit 112 that functions as a mode selector, determining the necessity of rewriting an image and thereby selecting between an image rewriting mode [a first modality] and an image storing mode [a second modality], by the use of a standard video coding format, HEVC (H.265).);
generating video coding data (Yoneda: Paras. [0077], [0094]-[0095], [0097] disclose a video coding arrangement, comprising a decoder circuit 111 and loop filter 120, that processes input video frames to generate an output signal (SS). This signal is sent to the display panel electronics. The arrangement generates data according to the selected mode (i.e., either rewriting the image with new data or stopping the supply of signals).); and
sending the video coding data to the mobile device (Yoneda: Paras. [0077], [0094]-[0095], [0097] disclose a video coding arrangement, comprising a decoder circuit 111 and loop filter 120, that processes input video frames to generate an output signal (SS). This signal is sent to the display panel electronics, which is analogous to the claimed mobile device.).
However, Yoneda does not explicitly disclose “… receiving at an intermediary transcoding network element input frames of video information from a network source; receiving, at the intermediary transcoding network element through an uplink channel, a requirements indication from a mobile device configured to implement a diffusion model; selecting, by the intermediary transcoding network element based upon the requirements indication, … video coding modality utilizes diffusion …”.
Further, Krishnan is in the same field of endeavor and teaches an uplink channel receiver configured to receive a requirements indication from a mobile device (Krishnan: Paras. [0038], [0077] disclose a system where a mobile device's status (“battery life of the electronic device, processing power of the electronic device, ... and a user selected setting”) determines whether data analysis is performed locally on the device or remotely via a network. This status information functions as a requirements indication communicated from the mobile device to a remote system to control processing modality.);
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Yoneda and Krishnan before him or her, to modify the video coding system of Yoneda to include the requirements indication from a mobile device feature as described in Krishnan. The motivation for doing so would have been to improve power management and reduce computational load on receiving devices during video streaming by providing a more efficient video delivery system that adapts not only to content but also to the real-time resource constraints of the receiving device.
However, Yoneda-Krishnan do not explicitly disclose “… receiving at an intermediary transcoding network element input frames of video information from a network source; receiving, at the intermediary transcoding network element through an uplink channel, a requirements indication from a mobile device configured to implement a diffusion model; selecting, by the intermediary transcoding network element based upon the requirements indication, … video coding modality utilizes diffusion …”.
Furthermore, Davies is in the same field of endeavor and teaches receiving at an intermediary transcoding network element (TxES) input frames of video information from a network source (Davies: Fig. 4 & Paras. [0094], [0141] disclose a transmitting edge server (TxES) 24 acting as an intermediary transcoding network element that receives scene observation (video data) from a transmitting UE [network source].);
receiving, at the intermediary transcoding network element (TxES) through an uplink channel, a requirements indication from a mobile device configured to implement a diffusion model (Davies: Fig. 4 & Paras. [0051], [0149], [0223] disclose a mobile receiving device that signals its preferences or capabilities [requirements indication] to the transmitting device/server (TxES). It is axiomatic that this signaling occurs via a return channel, i.e., in the uplink direction.);
selecting, by the intermediary transcoding network element based upon the requirements indication (Davies: Paras. [0136], [0189], [0206] disclose a transmitting device/server that switches between generative compression based on diffusion models [a first modality] and transmission of physical inputs compressed using conventional compression techniques [a second modality], the selection being based on the negotiation/preferences of the mobile device.),
a video coding modality utilizing diffusion (Davies: Paras. [0103], [0115]-[0118], [0139] disclose a receiving mobile terminal device capable of reproducing/rendering data associated with a received updated diffusion model, thus utilizing diffusion.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Yoneda-Krishnan and Davies before him or her, to modify the video coding system of Yoneda-Krishnan to include the intermediary transcoding network element and diffusion features as described in Davies. The motivation for doing so would have been to improve spatial and temporal stability for video streaming by providing techniques that mitigate semantic loss between regenerated output and the original observations.
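For illustration only, and not as a mapping of any cited reference, the selection logic discussed above can be sketched as follows: an intermediary transcoding element receives a requirements indication over an uplink channel and selects between a diffusion-based first modality and a conventional-codec second modality. All names, fields, and thresholds below are illustrative assumptions, not teachings of Yoneda, Krishnan, or Davies.

```python
# Illustrative sketch only; not taken from any cited reference. Models an
# intermediary transcoding network element selecting a video coding modality
# from a requirements indication received over an uplink channel.
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    DIFFUSION = "diffusion"        # first modality: diffusion-based coding
    CONVENTIONAL = "conventional"  # second modality: standard codec (e.g., H.265)


@dataclass
class RequirementsIndication:
    # Hypothetical fields echoing the device-status factors discussed in
    # Krishnan (battery life, processing power, user-selected setting).
    battery_pct: float
    supports_diffusion: bool
    low_power_setting: bool


def select_modality(req: RequirementsIndication) -> Modality:
    """Select the current video coding modality based on the requirements indication."""
    if req.supports_diffusion and req.battery_pct > 30.0 and not req.low_power_setting:
        return Modality.DIFFUSION
    return Modality.CONVENTIONAL


def transcode(frames: list, req: RequirementsIndication) -> tuple:
    """Generate video coding data under the selected modality for downlink transmission."""
    modality = select_modality(req)
    if modality is Modality.DIFFUSION:
        # Placeholder: derive prompts/metadata usable by the device's diffusion model.
        payload = {"prompts": [f"frame-{i}" for i, _ in enumerate(frames)]}
    else:
        # Placeholder: compress with a conventional codec (see the claim 3 sketch below).
        payload = {"bitstream": b"".join(bytes(str(f), "utf-8") for f in frames)}
    return modality, payload
```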
As per claim 9, Yoneda discloses a transcoding network element (Yoneda: Abstract), comprising:
an input interface through which is received input frames of video information from a network source (Yoneda: Paras. [0074], [0077] disclose a device 10 including a decoder 30 that functions as a network element which receives input video information in the form of bitstream data (signal BD).);
a mode selector operative to select … (Yoneda: Paras. [0075], [0084], [0096], [0101] disclose a determination circuit 112 that functions as a mode selector, determining the necessity of rewriting an image and thereby selecting between an image rewriting mode [a first modality] and an image storing mode [a second modality], by the use of a standard video coding format, HEVC (H.265).);
a video coding arrangement for generating video coding data by processing the input frames of video information using the current video coding modality, the video coding information being sent to a mobile device configured to implement a … (Yoneda: Paras. [0077], [0094]-[0095], [0097] disclose a video coding arrangement, comprising a decoder circuit 111 and loop filter 120, that processes input video frames to generate an output signal (SS). This signal is sent to the display panel electronics, which is analogous to the claimed mobile device. The arrangement generates data according to the selected mode (i.e., either rewriting the image with new data or stopping the supply of signals).).
However, Yoneda does not explicitly disclose “… an uplink channel receiver configured to receive a requirements indication from a mobile device; … based upon the requirements indication, … utilizes diffusion …”.
Further, Krishnan is in the same field of endeavor and teaches an uplink channel receiver configured to receive a requirements indication from a mobile device (Krishnan: Paras. [0038], [0077] disclose a system where a mobile device's status (“battery life of the electronic device, processing power of the electronic device, ... and a user selected setting”) determines whether data analysis is performed locally on the device or remotely via a network. This status information functions as a requirements indication communicated from the mobile device to a remote system to control processing modality.);
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Yoneda and Krishnan before him or her, to modify the video coding system of Yoneda to include the requirements indication from a mobile device feature as described in Krishnan. The motivation for doing so would have been to improve power management and reduce computational load on receiving devices during video streaming by providing a more efficient video delivery system that adapts not only to content but also to the real-time resource constraints of the receiving device.
However, Yoneda-Krishnan do not explicitly disclose “… video coding modality … utilizes diffusion …”.
Furthermore, Davies is in the same field of endeavor and teaches a video coding modality utilizing diffusion (Davies: Paras. [0103], [0115]-[0118], [0139] disclose a receiving mobile terminal device capable of reproducing/rendering data associated with a received updated diffusion model, thus utilizing diffusion.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Yoneda-Krishnan and Davies before him or her, to modify the video coding system of Yoneda-Krishnan to include the diffusion feature as described in Davies. The motivation for doing so would have been to improve spatial and temporal stability for video streaming by providing techniques that mitigate semantic loss between regenerated output and the original observations.
As per claim 2, Yoneda-Krishnan-Davies disclose the method of claim 1 wherein the current video coding modality is the first video coding modality, the generating the video coding data including deriving metadata from the input frames of video data wherein the metadata is useable by the diffusion model on the mobile device to generate reconstructions of the input frames of video information (Davies: Paras. [0109], [0207], [0239] disclose a compression system based on diffusion models where a descriptive prompt is generated for a semantic object. This prompt acts as metadata derived from the input data and is used by a reconstruction model (identified as a diffusion model) to recreate an approximate version of the original observed data.).
As per claim 3, Yoneda-Krishnan-Davies disclose the method of claim 1 wherein the current video coding modality is the second video coding modality, the generating the video coding data including compressing the video frames using one of the following compression protocols: H.264, Motion JPEG, LL-HLS, VP9 (Yoneda: Paras. [0075], [0084], [0096], [0101] disclose a determination circuit 112 that functions as a mode selector, determining the necessity of rewriting an image and thereby selecting between an image rewriting mode [a first modality] and an image storing mode [a second modality], by the use of a standard video coding format; and Davies: Paras. [0123], [0136], [0170], [0244] disclose that the system can revert to transmitting physical inputs (i.e., non-prompt compressed data which has been compressed using conventional techniques) and that the conventional codec can be a non-generative compression codec (e.g., H.264/Advanced Video Coding (AVC), JPEG, H.265, etc.).).
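By way of illustration only, a conventional second-modality fallback of the kind mapped above could invoke a standard codec. The minimal sketch below assumes an installed FFmpeg binary on the PATH; the codec identifiers are FFmpeg encoder names corresponding to the protocols recited in claim 3 (LL-HLS is a delivery protocol rather than a codec and is therefore omitted). Nothing in this sketch is drawn from the cited references.

```python
import subprocess

# Hypothetical mapping from the protocols recited in claim 3 to FFmpeg
# encoder names; assumes FFmpeg is installed and available on the PATH.
CODEC_MAP = {
    "H.264": "libx264",
    "Motion JPEG": "mjpeg",
    "VP9": "libvpx-vp9",
}


def encode_conventional(input_path: str, output_path: str, protocol: str = "H.264") -> None:
    """Compress video using a conventional (non-generative) codec via FFmpeg."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", input_path, "-c:v", CODEC_MAP[protocol], output_path],
        check=True,
    )
```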
As per claim 4, Yoneda-Krishnan-Davies disclose the method of claim 1 wherein the selecting results in switching from the first video coding modality to the second video coding modality (Yoneda: Paras. [0075], [0084], [0096], [0101] disclose a determination circuit 112 that functions as a mode selector, determining the necessity of rewriting an image and thereby selecting between an image rewriting mode [a first modality] and an image storing mode [a second modality], by the use of a standard video coding format, HEVC (H.265).).
As per claim 5, Yoneda-Krishnan-Davies disclose the method of claim 1 wherein the selecting results in switching from the second video coding modality to the first video coding modality (Yoneda: Paras. [0075], [0084], [0096], [0101] disclose a determination circuit 112 that functions as a mode selector, determining the necessity of rewriting an image and thereby selecting between an image rewriting mode [a first modality] and an image storing mode [a second modality], by the use of a standard video coding format, HEVC (H.265).).
As per claim 6, Yoneda-Krishnan-Davies disclose the method of claim 1 further including:
generating a set of weights for the diffusion model (Davies: Paras. [0122], [0139], [0161] disclose generating model updates (equivalent to a set of weights).);
sending the set of weights to the mobile device (Davies: Paras. [0122], [0139], [0161] disclose a shared reconstruction model that is available to both the transmitter and receiver. Davies further teaches that this model is dynamically updated and/or extended to previously unseen entities by generating new learned prompts. These learned prompts, which are derived using techniques like textual inversion, function as updates or fine-tuning weights for the model. Furthermore, when an unseen entity is detected, the newly generated prompts for that unseen entity are added to the local model of the transmitting device and sent to the receiving device so that the model of the receiving device is enhanced as well. This describes generating model updates (equivalent to a set of weights) and sending them to the receiving device.).
As per claim 7, Yoneda-Krishnan-Davies disclose the method of claim 6 wherein the generating the set of weights includes training a first artificial neural network using the frames of training image data where values of the weights are adjusted during the training (Davies: Paras. [0139], [0161] disclose that a shared reconstruction model (the first artificial neural network) is used at the transmitter side. This model is updated by learning new prompts for previously unseen objects, which is a form of training.);
wherein the mobile device uses the set of weights to establish a second artificial neural network configured to substantially replicate the first artificial neural network (Davies: Paras. [0139], [0161] disclose that the new prompts are sent to the receiving device so that the model of the receiving device is enhanced as well. This describes the receiving device using the sent information (the weights/prompts) to update its own model (the second artificial neural network) to mirror the enhancement made on the transmitter's side, thereby substantially replicating the updated functionality of the first network.).
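For illustration only, the weight generation and transfer discussed for claims 6-7 can be sketched with a toy PyTorch model: a first network is trained at the network element, its weights are serialized, and a structurally identical second network on the device loads them. The architecture, loss, and data below are illustrative assumptions; the cited references describe model updates/learned prompts rather than this particular network.

```python
import torch
import torch.nn as nn

# Toy "first artificial neural network" trained at the network element; its
# state dict stands in for the claimed set of weights sent to the mobile device.
class ReconstructionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 64))

    def forward(self, x):
        return self.net(x)


# Network side: adjust weight values during training (one step shown for brevity).
server_model = ReconstructionModel()
optimizer = torch.optim.Adam(server_model.parameters(), lr=1e-3)
training_frames = torch.randn(8, 64)  # stand-in for frames of training image data
loss = nn.functional.mse_loss(server_model(training_frames), training_frames)
loss.backward()
optimizer.step()

# "Sending the set of weights to the mobile device": serialize the state dict.
weights_payload = server_model.state_dict()

# Device side: a second ANN loads the weights, substantially replicating the first.
device_model = ReconstructionModel()
device_model.load_state_dict(weights_payload)
```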
As per claim 8, Yoneda-Krishnan-Davies disclose the method of claim 1 further including:
receiving encoded video content from a network source (Yoneda: Paras. [0074], [0077] disclose a device 10 including a decoder 30 that functions as a network element which receives input video information in the form of bitstream data (signal BD).);
decoding the encoded video content into the input frames of video information (Yoneda: Paras. [0074], [0077]-[0078], [0109] disclose a process in which an encoded signal is input to the front end portion 20, the front end portion 20 performs demodulation... and generates bitstream data, and the bitstream data is input to the decoder 30.).
As per claim 10, Yoneda-Krishnan-Davies disclose the transcoding network element of claim 9 wherein the video coding arrangement includes an artificial neural network for implementing the first video coding modality and an encoder for implementing the second video coding modality (Davies: Paras. [0123], [0136], [0239], [0244] disclose a system architecture containing the components for both modalities. The first modality is implemented using a shared reconstruction model, which is identified as a machine learning model (most likely a diffusion model), a type of artificial neural network. The second modality is implemented using a conventional codec (e.g., JPEG, H.265, etc.), which is a standard encoder. The system is designed to select between these two modalities for compressing different semantic objects within the same data stream.).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure and can be viewed in the list of references.
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PEET DHILLON whose telephone number is (571)270-5647. The examiner can normally be reached M-F: 5am-1:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sath V. Perungavoor can be reached at 571-272-7455. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PEET DHILLON/Primary Examiner
Art Unit: 2488
Date: 02-21-2026