Prosecution Insights
Last updated: April 19, 2026
Application No. 18/702,105

VIDEO IMPLANTATION METHOD, APPARATUS, DEVICE AND COMPUTER-READABLE STORAGE MEDIUM

Status: Non-Final Office Action (§103)
Filed: Apr 17, 2024
Examiner: LIN, JASON K
Art Unit: 2425
Tech Center: 2400 (Computer Networks)
Assignee: Xingheshixiao (Beijing) Technology Co. Ltd.
OA Round: 3 (Non-Final)

Grant Probability: 49% (Moderate)
Expected OA Rounds: 3-4
Estimated Time to Grant: 3y 7m
Grant Probability with Interview: 84%

Examiner Intelligence

Career Allow Rate: 49% (221 granted / 454 resolved; -9.3% vs TC avg)
Interview Lift: +34.8% (strong), comparing resolved cases with an interview against those without
Typical Timeline: 3y 7m average prosecution; 28 applications currently pending
Career History: 482 total applications across all art units
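The arithmetic behind these headline figures can be sanity-checked with a few lines of Python. This is a minimal sketch assuming the straightforward definitions implied by the card's labels, not the dashboard's documented methodology:

```python
# Sketch of the card's arithmetic. The definitions below are assumptions
# inferred from the labels, not a published methodology.
granted, resolved = 221, 454
allow_rate = granted / resolved  # 0.4868... -> displayed as "49%"

# The "+34.8% interview lift" presumably compares allowance among resolved
# cases WITH an interview (84%) against those WITHOUT one; the implied
# no-interview baseline would then be 84% - 34.8% = 49.2%.
with_interview = 0.84
implied_without = with_interview - 0.348

print(f"Career allow rate: {allow_rate:.1%}")                    # 48.7%
print(f"Implied no-interview baseline: {implied_without:.1%}")   # 49.2%
```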

Statute-Specific Performance

Statute   Rate     vs TC Avg
§101      5.2%     -34.8%
§103      61.2%    +21.2%
§102      16.0%    -24.0%
§112      9.3%     -30.7%

Tech Center averages are estimates. Based on career data from 454 resolved cases.
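Because each row pairs the examiner's rate with a delta against the Tech Center average, the TC baseline can be recovered by simple subtraction. A short sketch (the per-statute interpretation of "Rate" is inferred from the labels, not stated by the source):

```python
# Recover the TC average implied by each row: delta = examiner rate - TC avg.
rows = {
    "§101": (5.2, -34.8),
    "§103": (61.2, +21.2),
    "§102": (16.0, -24.0),
    "§112": (9.3, -30.7),
}
for statute, (rate, delta) in rows.items():
    tc_avg = rate - delta
    print(f"{statute}: examiner {rate:.1f}% vs TC avg {tc_avg:.1f}%")
# Every row implies a 40.0% baseline, suggesting the deltas were computed
# against a single aggregate figure rather than per-statute averages.
```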

Office Action (§103)
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

This office action is responsive to application No. 18/702,105 filed on 12/18/2025. Claims 2 and 10 have been cancelled. Claims 1, 3-9, and 11-12 are pending and have been examined.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/18/2025 has been entered.

Claim Objections

Claims 1, 3, 6, and 8 are objected to because of the following informalities: in recently filed claim amendments, Applicant has amended "source video" to "low code source video" and "source video clip" to "high code source video clip".

Claim 1 recites:

"generating object description information according to the visual object and the source video clip corresponding to the one or more frames, the object description information being used to describe the implantation position of the visual object in the one or more frames and the specific information of the one or more frames, the one or more frames being obtained by the following steps: analyzing the source video and recognizing one or more frames in which the visual object can be implanted"

"performing semantic analysis and/or content analysis on the source video through a video port provided by a publisher, it being unnecessary to obtain the complete source video when the source video being analyzed"

"after the semantic analysis and/or content analysis is performed on the source video, determining one or more videos in the source video"

"sending the one or more output videos and the video description information thereof to the publisher, so that the publisher obtains the final video according to the video description information, the one or more output videos and the source video data of the source video clip"

Please amend to:

--generating object description information according to the visual object and the high code source video clip corresponding to the one or more frames, the object description information being used to describe the implantation position of the visual object in the one or more frames and the specific information of the one or more frames, the one or more frames being obtained by the following steps: analyzing the low code source video and recognizing one or more frames in which the visual object can be implanted--

--performing semantic analysis and/or content analysis on the low code source video through a video port provided by a publisher, it being unnecessary to obtain the complete low code source video when the low code source video being analyzed--

--after the semantic analysis and/or content analysis is performed on the low code source video, determining one or more videos in the low code source video--

--sending the one or more output videos and the video description information thereof to the publisher, so that the publisher obtains the final video according to the video description information, the one or more output videos and the source video data of the high code source video clip--

Claim 3 recites:

"the generating object description information according to the visual object and the source video clip corresponding to the one or more frames comprises: analyzing a region of interest suitable for implantation of the visual object in the source video clip corresponding to the one or more frames to determine the object description information"

Please amend to:

--the generating object description information according to the visual object and the high code source video clip corresponding to the one or more frames comprises: analyzing a region of interest suitable for implantation of the visual object in the high code source video clip corresponding to the one or more frames to determine the object description information--

Claim 6 recites:

"the acquiring a source video clip corresponding to the one or more frames comprises:"

"the implanting the visual object into the source video clip corresponding to the one or more frames and generating one or more output videos comprises:"

Please amend to:

--the acquiring a high code source video clip corresponding to the one or more frames comprises:--

--the implanting the visual object into the high code source video clip corresponding to the one or more frames and generating one or more output videos comprises:--

Claim 8 recites:

"the obtaining, by the publisher, the final video according to the video description information, the one or more output videos and the source video data of the source video clip comprises at least one of the following steps:"

Please amend to:

--the obtaining, by the publisher, the final video according to the video description information, the one or more output videos and the source video data of the high code source video clip comprises at least one of the following steps:--

Appropriate correction is required.

Response to Arguments

Applicant's arguments with respect to claims 1, 3-9, and 11-12 have been considered but are moot in view of the new ground(s) of rejection.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 3-9, and 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan et al. (US 2020/0372937) in view of Fauqueur et al. (US 2014/0147096).

Consider claim 1, Srinivasan teaches a video implantation method, comprising steps of:

analyzing a low code source video and recognizing one or more frames in which a visual object can be implanted (Paragraph 0038 teaches a source hub 102 comprising a video data analysis module, which performs pre-analysis in relation to source video data. Paragraph 0039 teaches pre-analysis may be fully automated in that it does not involve any human intervention. Paragraphs 0040-0050 teach the different types of pre-analysis that may be performed. Paragraph 0051 teaches the source hub analyses the source video data to find regions within the source video material which are suitable for insertion of one or more additional visual objects in the image content of the source video. Paragraph 0071 teaches in step 320, the source hub 302 analyses the low-resolution video material to identify one or more "insertion zones" that correspond to one or more regions within the image contents of the video material that are suitable for insertion of one or more additional visual objects. Paragraph 0072 teaches the remainder of the low-resolution video material is analysed to identify if the insertion zone appears in one or more other shots in the low-resolution video material);

acquiring a high code source video clip corresponding to the one or more frames (Paragraph 0074 teaches the source hub 102 obtains high-resolution video data comprising a second plurality of frames of the video material that comprise at least the selected frames of the video material. The second plurality of frames are a smaller sub-set of the first plurality of frames, such that the source hub 102 obtains only a part of the high-resolution video material. The source hub 102 does not obtain the entire high-resolution video material. In some examples, the second plurality of frames may consist of only the selected frames of the video material. In other examples, the second plurality of frames may consist of the selected frames and also some additional frames of the video material); and

implanting the visual object into the high code source video clip corresponding to the one or more frames, and generating one or more output videos and video description information thereof (Paragraph 0075 teaches in step 350, the one or more additional visual objects are embedded into the selected frames of the high-resolution source video data. Paragraph 0076 teaches in step 360, output video data is created. The output video data comprises the selected frames of the high-resolution video material with the embedded one or more additional visual objects);

or generating object description information according to the visual object and the source video clip corresponding to the one or more frames, the object description information being used to describe the implantation position of the visual object in the one or more frames and the specific information of the one or more frames, the one or more frames being obtained by the following steps: analyzing the source video and recognizing one or more frames in which the visual object can be implanted;

the analyzing the low code source video (Paragraphs 0040-0051, 0071-0072) comprises: performing semantic analysis and/or content analysis on the source video through a video port provided by a publisher, it being unnecessary to obtain the complete source video when the source video being analyzed; determining one or more frames in which the visual object can be implanted (Paragraph 0035 teaches the source hub may retrieve source video data as one or more digital files, supplied, for example, over a high-speed computer network, via the network, etc. Paragraph 0036 teaches source video data may be provided by a distributor or content owner. In the case of live video, source video would be provided on an on-going basis, so the complete source video would not be provided as it is not yet completed; thus analysis of the source video would occur on available segments/chunks of live video provided from the distributor. Paragraphs 0038-0051 teach the source hub 102 comprising a video data analysis module that may perform many different types of pre-analysis on the source video data, to find regions within the source video material which are suitable for insertion of one or more additional visual objects in the image content of the source video. Paragraph 0062 teaches video data of the distributor, where the type of content may be live, VoD, etc. Paragraph 0068 teaches the source hub 102 obtains source video data from a source video data holding entity 304, for example, the distributor or content producer. Paragraph 0071 teaches in step 320, the source hub 302 analyses the low-resolution video material to identify one or more "insertion zones" that correspond to one or more regions within the image contents of the video material that are suitable for insertion of one or more additional visual objects);

wherein the method further comprises: sending the one or more output videos and the video description information thereof to the publisher, so that the publisher obtains the final video according to the video description information, the one or more output videos and the source video data of the source video clip (Paragraphs 0076-0079, 0088; Paragraph 0068); or sending the visual object and the object description information to the publisher, so that the publisher overlays the visual object on the frame corresponding to the one or more frames in the source video data in a mask manner according to the object description information; or sending the masked visual object and the object description information to the publisher, so that the publisher implants the masked visual object into the frame corresponding to one or more frames in a rendering fusion manner according to the object description information to obtain the final video.
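To make the claim mapping easier to follow, the following is an illustrative Python sketch of the two-resolution workflow the rejection reads onto claim 1: analyze a low code (low-resolution) proxy, fetch only the high code clip covering the implantable frames, embed the object there, and return the outputs with their description information. Every name and the toy data are hypothetical placeholders, not code from the application or from Srinivasan.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    index: int
    implantable: bool  # stand-in for the "insertion zone" analysis result

def recognize_implantable_frames(low_code_video):
    """Step 1: analyze the low code proxy and pick frames with insertion zones."""
    return [f.index for f in low_code_video if f.implantable]

def run_pipeline(low_code_video, high_code_source, visual_object):
    frames = recognize_implantable_frames(low_code_video)
    # Step 2: acquire ONLY the high code clip covering those frames; the
    # complete high-resolution material is never obtained.
    clip = [high_code_source[i] for i in frames]
    # Step 3: implant the visual object and generate the output video plus
    # video description information locating it within the source data.
    outputs = [f"{frame}+{visual_object}" for frame in clip]
    description = {"start_frame": frames[0], "end_frame": frames[-1]}
    return outputs, description  # sent to the publisher for final assembly

# Toy run: a 6-frame video where frames 2-3 can take an implant.
low = [Frame(i, i in (2, 3)) for i in range(6)]
high = [f"hires-{i}" for i in range(6)]
print(run_pipeline(low, high, "logo"))
# (['hires-2+logo', 'hires-3+logo'], {'start_frame': 2, 'end_frame': 3})
```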
Srinivasan does not explicitly teach: after the semantic analysis and/or content analysis is performed on the source video, determining one or more videos in the source video that satisfy a preset requirement, the preset requirement being associated with the visual object; and analyzing the one or more videos to determine one or more frames in which the visual object can be implanted.

In an analogous art, Fauqueur teaches: after semantic analysis and/or content analysis is performed on source video, determining one or more videos in the source video that satisfy a preset requirement, the preset requirement being associated with a visual object; and analyzing the one or more videos to determine one or more frames in which the visual object can be implanted (Paragraphs 0102-0104).

Therefore, it would have been obvious to a person of ordinary skill in the art to modify the system of Srinivasan to include, after semantic analysis and/or content analysis is performed on source video, determining one or more videos in the source video that satisfy a preset requirement, the preset requirement being associated with a visual object, and analyzing the one or more videos to determine one or more frames in which the visual object can be implanted, as taught by Fauqueur, for the advantage of better identifying suitable segments, as not all of the identified segments of the source video data are, in fact, suitable for product placement, and thus not all of the identified segments are selected for digital product placement (Fauqueur, Paragraph 0102), allowing the system to best select segments that may be better suited contextually and of greater interest (Fauqueur, Paragraph 0103).

Consider claim 3, Srinivasan and Fauqueur teach wherein the generating object description information according to the visual object and the source video clip corresponding to the one or more frames comprises: analyzing a region of interest suitable for implantation of the visual object in the source video clip corresponding to the one or more frames to determine the object description information; and storing multiple versions of source videos by the publisher, each version of source video being different in code rate and/or language version; and the analyzing a source video and recognizing one or more frames in which a visual object can be implanted comprises: analyzing any version of source video among the multiple versions of source videos and recognizing one or more frames in which the visual object can be implanted.

(Claim 1 recites the following alternative limitations: Alternative limitation A - implanting the visual object into the high code source video clip corresponding to the one or more frames, and generating one or more output videos and video description information thereof; or Alternative limitation B - generating object description information according to the visual object and the source video clip corresponding to the one or more frames, the object description information being used to describe the implantation position of the visual object in the one or more frames and the specific information of the one or more frames, the one or more frames being obtained by the following steps: analyzing the source video and recognizing one or more frames in which the visual object can be implanted. Srinivasan and Fauqueur teach limitation A, see claim 1, thereby teaching one of the alternative limitations. Claim 3 is dependent on claim 1 and recites additional limitations that further modify alternative limitation B. From a claim interpretation standpoint, claim 3 includes all the limitations of claim 1 and further modifies alternative limitation B. Accordingly, claim 3 is met by Srinivasan and Fauqueur disclosing limitation A, since alternative limitation B is written in the alternative ("or") and is not required to be taught by Srinivasan and Fauqueur, given that Srinivasan and Fauqueur already teach limitation A.)

Consider claim 4, Srinivasan and Fauqueur teach further comprising: generating the video description information according to the respective time interval and/or frame interval of the one or more output videos, the video description information being used to describe the respective starting time and ending time of the one or more output videos in the source video data, and/or the video description information being used to describe the respective starting frame number and ending frame number of the one or more output videos in the source video data (Srinivasan, Paragraphs 0076-0079, 0088).

Consider claim 5, Srinivasan and Fauqueur teach wherein the analyzing the one or more videos to determine one or more frames in which the visual object can be implanted comprises: analyzing the one or more videos to determine a region of interest suitable for implantation of the visual object (Fauqueur, Paragraphs 0053-0058; Srinivasan, Paragraphs 0040-0050, 0071); and determining the frame where the region of interest is located as the one or more frames (Fauqueur, Paragraphs 0099-0104, 0123).

Consider claim 6, Srinivasan and Fauqueur teach wherein the acquiring a source video clip corresponding to the one or more frames comprises: acquiring the frame corresponding to the one or more frames in high code rate source video data (Srinivasan, Paragraph 0074, teaching the source hub 102 obtains high-resolution video data comprising a second plurality of frames of the video material that comprise at least the selected frames of the video material; the second plurality of frames are a smaller sub-set of the first plurality of frames, such that the source hub 102 obtains only a part of the high-resolution video material and does not obtain the entire high-resolution video material; in some examples, the second plurality of frames may consist of only the selected frames of the video material, and in other examples, the second plurality of frames may consist of the selected frames and also some additional frames of the video material); and the implanting the visual object into the source video clip corresponding to the one or more frames and generating one or more output videos comprises: implanting the visual object into the frame corresponding to the one or more frames in the high code rate source video data, and generating one or more output videos (Srinivasan, Paragraph 0075, teaching in step 350 the one or more additional visual objects are embedded into the selected frames of the high-resolution source video data; Paragraph 0076, teaching in step 360 output video data is created, comprising the selected frames of the high-resolution video material with the embedded one or more additional visual objects).
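Claim 4's "video description information" is essentially a locator: the start/end times or frame numbers of each output video within the source data. Below is a hypothetical sketch of such a descriptor together with the publisher-side splice that claim 8 (discussed next) performs with it; the names and data are placeholders, not the applicant's implementation.

```python
# Hypothetical publisher-side assembly: replace the corresponding segment of
# the source video data with the output video, using the start/end frame
# numbers carried in the video description information.
def assemble_final_video(source_frames, output_frames, description):
    s, e = description["start_frame"], description["end_frame"]
    return source_frames[:s] + output_frames + source_frames[e + 1:]

source = [f"src-{i}" for i in range(6)]
description = {"start_frame": 2, "end_frame": 3}
final = assemble_final_video(source, ["out-2", "out-3"], description)
print(final)  # ['src-0', 'src-1', 'out-2', 'out-3', 'src-4', 'src-5']
```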
Consider claim 7, Srinivasan and Fauqueur teach wherein the acquiring the frame corresponding to the one or more frames in high code rate source video data further comprises: acquiring the frame corresponding to the one or more frames in the high code rate source video data according to a preset security frame strategy; wherein the preset security frame strategy is used to indicate the respective number of supplementary frames of the one or more frames (Srinivasan, Paragraphs 0074, 0085-0086; Paragraphs 0080-0082).

Consider claim 8, Srinivasan and Fauqueur teach wherein the obtaining, by the publisher, the final video according to the video description information, the one or more output videos and the source video data of the source video clip comprises at least one of the following steps: replacing, by the publisher and according to the video description information, corresponding video segments in the source video data with the one or more output videos to obtain the final video (Srinivasan, Paragraphs 0076-0079, 0088; Paragraph 0068); embedding, by the publisher and according to the video description information, the one or more output videos into the corresponding position in the source video data to obtain the final video; and overlaying, by the publisher and according to the video description information, corresponding video segments in the source video data by using the one or more output videos to obtain the final video.

Consider claim 9, Srinivasan and Fauqueur teach wherein the overlaying, by the publisher and according to the video description information, corresponding video segments in the source video data by using the one or more output videos to obtain the final video comprises: overlaying, by the publisher and according to the video description information, the one or more output videos on corresponding video segments in the source video data in a floating layer manner to obtain the final video; overlaying, by the publisher and according to the video description information, the one or more rendered and masked output videos with alpha channel information on corresponding video segments in the source video data in a floating layer manner to obtain the final video; or implanting, by the publisher and according to the video description information, the one or more rendered and masked output videos with alpha channel information into corresponding video segments in the source video data in a rendering fusion manner according to the alpha channel information to obtain the final video.

(Claim 8 recites the following limitations: the obtaining, by the publisher, the final video according to the video description information, the one or more output videos and the source video data of the source video clip comprises at least one of the following steps: Limitation C - replacing, by the publisher and according to the video description information, corresponding video segments in the source video data with the one or more output videos to obtain the final video; Limitation D - embedding, by the publisher and according to the video description information, the one or more output videos into the corresponding position in the source video data to obtain the final video; and Limitation E - overlaying, by the publisher and according to the video description information, corresponding video segments in the source video data by using the one or more output videos to obtain the final video. Srinivasan and Fauqueur teach limitation C, see claim 1, thereby teaching at least one of the listed steps. Claim 9 is dependent on claim 8, which recites additional limitations that further modify limitation E. From a claim interpretation standpoint, claim 9 includes all the limitations of claim 8 and further modifies limitation E. Claim 8 does not require all the steps to be met, only requiring at least one of the listed steps to be met. Accordingly, claim 9 is met by Srinivasan and Fauqueur disclosing limitation C.)

Consider claim 11, Srinivasan and Fauqueur teach an electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor, wherein the memory has instructions stored thereon that can be executed by the at least one processor, and the instructions enable, when executed by the at least one processor, the at least one processor to execute the method according to claim 1 (Srinivasan, Paragraphs 0101-0105).

Consider claim 12, Srinivasan and Fauqueur teach a non-transient computer-readable storage medium having computer instructions stored thereon that are configured to cause a computer to execute the method according to claim 1 (Srinivasan, Paragraphs 0101-0105).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASON K LIN, whose telephone number is (571) 270-1446. The examiner can normally be reached Monday-Friday, 9 AM-5 PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Brian Pendleton, can be reached at 571-272-7527. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JASON K LIN/
Primary Examiner, Art Unit 2425

Prosecution Timeline

Apr 17, 2024: Application Filed
Feb 26, 2025: Non-Final Rejection (§103)
Jun 02, 2025: Response Filed
Jun 14, 2025: Final Rejection (§103)
Dec 18, 2025: Request for Continued Examination
Jan 08, 2026: Response after Non-Final Action
Feb 19, 2026: Non-Final Rejection (§103, current)

Precedent Cases

Applications granted by the same examiner in similar technology:

Patent 12604047: JUST IN TIME CONTENT CONDITIONING (granted Apr 14, 2026; 2y 5m to grant)
Patent 12593082: JUST IN TIME CONTENT CONDITIONING (granted Mar 31, 2026; 2y 5m to grant)
Patent 12556760: CREDITING EXPOSURE TO MEDIA IDENTIFIED USING SOURCE FILTERING (granted Feb 17, 2026; 2y 5m to grant)
Patent 12548455: GROUND-BASED CONTENT CURATION PLATFORM DISTRIBUTING GEOGRAPHICALLY-RELEVANT CONTENT TO AIRCRAFT INFLIGHT ENTERTAINMENT SYSTEMS (granted Feb 10, 2026; 2y 5m to grant)
Patent 12537993: SMART HOME AUTOMATION USING MULTI-MODAL CONTEXTUAL INFORMATION (granted Jan 27, 2026; 2y 5m to grant)

Study what changed to get these applications past this examiner. Based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 49%
With Interview: 84% (+34.8%)
Median Time to Grant: 3y 7m
PTA Risk: High

Based on 454 resolved cases by this examiner. Grant probability is derived from the career allow rate.
