DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Specification
The disclosure is objected to because of the following informalities: Paragraphs [0110], [0122], [0126], [0159], and [0165] of the applicant’s specification include the word “reminder”; however, according to the context of the disclosure, the word should be “remainder”.
The lengthy specification has not been checked to the extent necessary to determine the presence of all possible minor errors. Applicant’s cooperation is requested in correcting any errors of which applicant may become aware in the specification.
Appropriate correction is required.
Claim Objections
Claims 8 & 14 are objected to because of the following informalities: The claims recite “reminder” and should recite “remainder”. Appropriate correction is required.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-2, 4-5, 7, 9-11, 13, 15-16, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al., hereinafter referred to as Wang (US 2018/0278964 A1), in view of Chen et al., hereinafter referred to as Chen (WO 2024/020603 A2).
As per claim 1, Wang discloses a method of processing a video bitstream (Wang: Abstract), the method comprising:
receiving the video bitstream comprising (i) one of a picture and a video and (ii) a supplemental enhancement information (SEI) message associated with the one of the picture and the video (Wang: Paras. [0019], [0081], [0091] disclose receiving an encoded video bitstream comprising VCL NAL units (picture/video) and SEI NAL units (supplemental enhancement information messages) associated with the video data.), the SEI message including text data (Wang: Paras. [0054], [0093], [0159] disclose that SEI messages can carry text data such as subtitling, captioning, or user data unregistered for private use.); and
extracting the text data from the SEI message (Wang: Para. [0247] discloses decoder can parse essential information carrying SEI messages.), wherein
the SEI message does not indicate whether the one of the picture and the video has been modified by another generative AI process (Wang: Paras. [0019], [0093], [0247], [0335] disclose parsing [extracts] syntax elements and information from the SEI messages according to the video coding standard [format rule] to determine how to process/present the video.).
However, Wang does not explicitly disclose “… text data purposed for use with a generative artificial intelligence (AI) process …”.
Further, Chen is in the same field of endeavor and teaches the text data purposed for use with a generative artificial intelligence (AI) process (Chen: Paras. [0005], [0075], [0086] disclose using commentaries [text] as an information source for a system that automatically converts the text into embedded visualizations using NLP and computer vision models (a generative AI process).).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Wang and Chen before him or her, to modify the SEI messaging of Wang to include text data utilized for generative AI processing, as described in Chen. The motivation for doing so would have been to improve the user viewing experience by providing a configuration that enables generative content to be transmitted alongside video content in a standardized bitstream format, ensuring that the receiving device has the data necessary to perform the augmentation/visualization.
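For illustration only, the following sketch models the claimed flow in simplified, hypothetical form: text data is extracted from an SEI payload and handed to a generative AI stage. The payload layout, function names, and prompt are illustrative assumptions and do not reproduce the syntax of Wang, Chen, or any coding standard.

```python
# Illustrative sketch only: a simplified, hypothetical SEI payload in which a
# UTF-8 text prompt is carried and handed to a generative AI stage. This is not
# the actual syntax of Wang, Chen, or any video coding standard.

def extract_text_from_sei(payload: bytes) -> str:
    """Decode the entire (hypothetical) SEI payload as UTF-8 text data."""
    return payload.decode("utf-8")

def apply_generative_ai(decoded_picture, prompt: str):
    """Placeholder for a generative AI process driven by the extracted text."""
    # e.g., condition an image-generation / augmentation model on `prompt`
    return decoded_picture  # returned unmodified in this sketch

sei_payload = "add a highlight overlay around the ball".encode("utf-8")
prompt = extract_text_from_sei(sei_payload)
picture = apply_generative_ai(decoded_picture=None, prompt=prompt)
```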
As per claim 2, Wang-Chen disclose the method of claim 1, wherein after the one of the picture and the video is decoded, the decoded one of the picture and the video is modified with the generative AI process based on the text data when the generative AI process is to be applied to the one of the picture and the video data (Wang: Para. [0093] discloses post-decoding operations and Chen: Paras. [0005], [0009], [0086] disclose a frame buffer for storing a video stream comprising a temporal sequence of video frames, embedding the visualizations in the frames of the video stream, and augmenting sports video clips based on text by automatically converting the text into embedded visualizations and generating the visual content based on text descriptions using natural language processing (NLP) models (e.g., large language models).).
As per claim 4, Wang-Chen disclose the method of claim 1, wherein the text data includes instructions for the generative AI process to modify the one of the picture and the video (Chen: Fig. 1 & Para. [0076] disclose obtaining raw video footage 102 and commentary text 106 of racket-based sports as input 104, and outputs an augmented video 134 [generative AI process to modify the one of the picture and the video].).
As per claim 5, Wang-Chen disclose the method of claim 1, wherein the generative AI process is implemented using a neural-network post-filtering process that includes one or more neural networks (Chen: Para. [0077] discloses the system 108 may include a video processor 116 for pre-processing and/or postprocessing the input video, which may include computer vision techniques based on deep learning, object detection, object tracking, pose estimation, and/or segmentation and Chen: Para. [0083] discloses utilizing a ViT-Adapter [a neural network] trained on COCO 164K to perform tasks.).
As per claim 7, Wang-Chen disclose the method of claim 1, wherein the text data is carried in a payload of the SEI message (Wang: Paras. [0053], [0093], [0159] disclose a payload of the SEI message carrying essential information, such as subtitling, captioning [text data].).
As per claim 9, Wang-Chen disclose the method of claim 1, wherein the SEI message does not indicate that the one of the picture and the video has been modified by the other generative AI process and the SEI message does not indicate that the one of the picture and the video has not been modified by the other generative AI process (Wang: Paras. [0054], [0093], [0159] do not disclose the SEI message indicating that the one of the picture and the video has been or has not been modified by any generative AI process; therefore, the SEI message does not indicate that the one of the picture and the video has been or has not been modified by the other generative AI process.).
As per claim 10, Wang discloses a method for generating a supplemental enhancement information (SEI) message (Wang: Abstract), the method comprising:
obtaining text data (Wang: Paras. [0054], [0093], [0159] disclose obtaining SEI messages carrying text data such as subtitling, captioning, or user data unregistered for private use.); and
encoding a video bitstream comprising (i) one of a picture and a video and (ii) the SEI message associated with the one of the picture and the video, the SEI message including the text data (Wang: Paras. [0019], [0081], [0091], [0093] disclose encoding a video bitstream comprising VCL NAL units (picture/video) and SEI NAL units (supplemental enhancement information messages) associated with the video data and the SEI message including the text data.), wherein
the SEI message does not indicate whether the one of the picture and the video has been modified by any generative AI process (Wang: Paras. [0054], [0093], [0159] do not disclose the SEI message indicating whether the one of the picture and the video has been modified by any generative AI process; therefore, the SEI message does not indicate whether the one of the picture and the video has been modified by any generative AI process.).
However, Wang does not explicitly disclose “… text data purposed for use with a generative artificial intelligence (AI) process …”.
Further, Chen is in the same field of endeavor and teaches the text data purposed for use with a generative artificial intelligence (AI) process (Chen: Paras. [0005], [0075], [0086] disclose using commentaries [text] as an information source for a system that automatically converts the text into embedded visualizations using NLP and computer vision models (a generative AI process).).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Wang and Chen before him or her, to modify the SEI messaging of Wang to include text data utilized for generative AI processing, as described in Chen. The motivation for doing so would have been to improve the user viewing experience by providing a configuration that enables generative content to be transmitted alongside video content in a standardized bitstream format, ensuring that the receiving device has the data necessary to perform the augmentation/visualization.
As per claim 11, Wang-Chen disclose the method of claim 10, wherein the generative AI process is implemented using a neural-network post-filtering process that includes one or more neural networks (Chen: Para. [0077] discloses the system 108 may include a video processor 116 for pre-processing and/or postprocessing the input video, which may include computer vision techniques based on deep learning, object detection, object tracking, pose estimation, and/or segmentation and Chen: Para. [0083] discloses utilizing a ViT-Adapter [a neural network] trained on COCO 164K to perform tasks.).
As per claim 13, Wang-Chen disclose the method of claim 10, wherein the text data is carried in a payload of the SEI message (Wang: Paras. [0053], [0093], [0159] disclose a payload of the SEI message carrying essential information, such as subtitling, captioning [text data].).
As per claim 15, Wang discloses a method of processing visual media data (Wang: Abstract), the method comprising:
processing a bitstream of the visual media data according to a format rule (Wang: Paras. [0019], [0099], [0300] disclose a method of processing (decoding) video data including receiving an encoded video bitstream and processing it according to syntax/format rules defined by video coding standards (e.g., HEVC).), wherein
the bitstream includes (i) one of a picture and a video and (ii) a supplemental enhancement information (SEI) message associated with the one of the picture and the video (Wang: Paras. [0019], [0081], [0091] disclose receiving an encoded video bitstream comprising VCL NAL units (picture/video) and SEI NAL units (supplemental enhancement information messages) associated with the video data.), the SEI message including text data (Wang: Paras. [0054], [0093], [0159] disclose that SEI messages can carry text data such as subtitling, captioning, or user data unregistered for private use.); and
the format rule specifies that the text data is extracted from the SEI message (Wang: Paras. [0019], [0093], [0247], [0335] disclose parsing [extracts] syntax elements and information from the SEI messages according to the video coding standard [format rule] to determine how to process/present the video.).
However, Wang does not explicitly disclose “… text data purposed for use with a generative artificial intelligence (AI) process …”.
Further, Chen is in the same field of endeavor and teaches the text data purposed for use with a generative artificial intelligence (AI) process (Chen: Paras. [0005], [0075], [0086] disclose using commentaries [text] as an information source for a system that automatically converts the text into embedded visualizations using NLP and computer vision models (a generative AI process).).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Wang and Chen before him or her, to modify the SEI messaging of Wang to include text data utilized for generative AI processing, as described in Chen. The motivation for doing so would have been to improve the user viewing experience by providing a configuration that enables generative content to be transmitted alongside video content in a standardized bitstream format, ensuring that the receiving device has the data necessary to perform the augmentation/visualization.
As per claim 16, Wang-Chen disclose the method of claim 15, wherein the format rule specifies that after the one of the picture and the video is decoded, the decoded one of the picture and the video is further modified with the generative AI process based on the text data (Wang: Para. [0093] discloses post-decoding operations and Chen: Paras. [0005], [0009], [0086] disclose a frame buffer for storing a video stream comprising a temporal sequence of video frames, embedding the visualizations in the frames of the video stream, and augmenting sports video clips based on text by automatically converting the text into embedded visualizations and generating the visual content based on text descriptions using natural language processing (NLP) models (e.g., large language models).).
As per claim 18, Wang-Chen disclose the method of claim 15, wherein the text data comprises instructions for the generative AI process to modify the one of the picture and the video (Chen: Fig. 1 & Para. [0076] disclose obtaining raw video footage 102 and commentary text 106 of racket-based sports as input 104, and outputs an augmented video 134 [generative AI process to modify the one of the picture and the video].).
As per claim 19, Wang-Chen disclose the method of claim 15, wherein the generative AI process is implemented using a neural-network post-filtering process that includes one or more neural networks (Chen: Para. [0077] discloses the system 108 may include a video processor 116 for pre-processing and/or postprocessing the input video, which may include computer vision techniques based on deep learning, object detection, object tracking, pose estimation, and/or segmentation and Chen: Para. [0083] discloses utilizing a ViT-Adapter [a neural network] trained on COCO 164K to perform tasks.).
As per claim 20, Wang-Chen disclose the method of claim 15, wherein the text data is carried in a payload of the SEI message (Wang: Paras. [0053], [0093], [0159] disclose a payload of the SEI message carrying essential information, such as subtitling, captioning [text data].).
Claims 6 & 12 are rejected under 35 U.S.C. 103 as being unpatentable over Wang in view of Chen, and further in view of Takada et al., hereinafter referred to as Takada (US 2024/0236312 A1).
As per claim 6, Wang-Chen disclose the method of claim 5 (Chen: Para. [0077] discloses the system 108 may include a video processor 116 for pre-processing and/or postprocessing the input video, which may include computer vision techniques based on deep learning, object detection, object tracking, pose estimation, and/or segmentation and Chen: Para. [0083] discloses utilizing a ViT-Adapter [a neural network] trained on COCO 164K to perform tasks.).
However, Wang-Chen do not explicitly disclose “… wherein the SEI message indicates neural-network post-filter information of the neural-network post-filtering process.”.
Furthermore, Takada is in the same field of endeavor and teaches wherein the SEI message indicates neural-network post-filter information of the neural-network post-filtering process (Takada: Para. [0057] discloses a neural network post-filter characteristics SEI (characteristics SEI, NNPFC_SEI, or characteristics information) and a neural network post-filter activation SEI (activation SEI, NNPFA_SEI, or activation information). An NN filter unit 611 performs filtering processing on an attribute image using the derived NN model. The characteristics SEI used for filtering processing is specified by a syntax element, for example an nnpfa_target_id, included in the activation SEI.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Wang-Chen and Takada before him or her, to modify the video encoding/decoding system of Wang-Chen to include an SEI message indicating neural-network post-filter information, as described in Takada. The motivation for doing so would have been to improve video coding performance by providing a configuration that uses deep learning techniques to adjust algorithms that can handle different image features without reducing image quality.
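For illustration only, the following sketch models, in simplified form, how an activation message might reference post-filter characteristics by an identifier, loosely analogous to the nnpfa_target_id relationship described in Takada; the data structure, identifier value, and function names are illustrative assumptions, not the syntax of Takada or any standard.

```python
# Illustrative sketch only: a simplified model in which an activation message
# selects previously signaled post-filter characteristics by an identifier.

characteristics = {
    7: {"purpose": "quality enhancement", "model": "example_nn_filter_v1"},
}

def activate_post_filter(activation_target_id: int) -> dict:
    """Look up the characteristics entry selected by the activation message."""
    info = characteristics.get(activation_target_id)
    if info is None:
        raise KeyError(f"No post-filter characteristics with id {activation_target_id}")
    return info

print(activate_post_filter(7))
```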
As per claim 12, the claim recites limitations analogous to those of claim 6 above, and is therefore rejected on the same basis.
Claims 8 & 14 are rejected under 35 U.S.C. 103 as being unpatentable over Wang in view of Chen, and further in view of Pettersson et al., hereinafter referred to as Pettersson (WO 2022/220724 A1).
As per claim 8, Wang-Chen disclose the method of claim 7, wherein the payload of the SEI message comprises a flag (Wang: Paras. [0053], [0093], [0157], [0159], [0163] disclose a payload of the SEI message carrying essential information, such as subtitling and captioning [text data], described in an SEI prefix indication. For example, prefix_sei_payload_type indicates the payloadType value of the SEI messages for which one or more SEI prefix indications are provided in the SEI prefix indication SEI message.).
However, Wang-Chen do not explicitly disclose “… a flag indicating that a remainder of the payload of the SEI message is to be processed.”.
Furthermore, Pettersson is in the same field of endeavor and teaches a flag indicating that a remainder of the payload of the SEI message is to be processed (Pettersson: Paras. [0055], [0072]-[0073] disclose a film_grain_characteristics_cancel_flag, found within the film grain characteristics SEI message syntax, which determines whether the subsequent parameters in the payload should be processed. For example, if the flag is set to a first value (e.g., 0), the remaining parameters (syntax elements) in the payload follow and are processed to enable the film grain process.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Wang-Chen and Pettersson before him or her, to modify the video encoding/decoding system of Wang-Chen to include a flag indicating that a remainder of the SEI message payload is to be processed, as described in Pettersson. The motivation for doing so would have been to improve video coding efficiency by providing a configuration that reduces the overall bit cost of the video stream when functionalities are used arbitrarily or intermittently.
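For illustration only, the following sketch models a hypothetical payload in which a leading flag controls whether the remainder of the payload is parsed, analogous in spirit to the cancel-flag behavior described in Pettersson; the one-byte layout and function name are illustrative assumptions, not Pettersson’s actual syntax.

```python
# Illustrative sketch only: a hypothetical payload whose first byte is a flag
# controlling whether the remainder of the payload is processed (flag == 0
# means the remaining parameters follow and are processed).

def parse_payload(payload: bytes) -> dict:
    """Parse a hypothetical SEI-like payload gated by a leading cancel flag."""
    cancel_flag = payload[0]
    if cancel_flag != 0:
        # Flag set: the remainder of the payload is not processed.
        return {"cancel_flag": cancel_flag, "params": None}
    # Flag cleared: process the remainder of the payload.
    return {"cancel_flag": cancel_flag, "params": payload[1:]}

print(parse_payload(bytes([0, 0x12, 0x34])))  # remainder processed
print(parse_payload(bytes([1])))              # remainder skipped
```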
As per claim 14, the claim recites limitations analogous to those of claim 8 above, and is therefore rejected on the same basis.
Allowable Subject Matter
Claims 3 & 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure and can be viewed in the list of references.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PEET DHILLON whose telephone number is (571)270-5647. The examiner can normally be reached M-F: 5am-1:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sath V. Perungavoor can be reached at 571-272-7455. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PEET DHILLON/Primary Examiner
Art Unit: 2488
Date: 01-14-2026