Prosecution Insights
Last updated: April 19, 2026
Application No. 18/075,543

REAL-TIME VIDEO OVERLAYING AND SHARING

Non-Final OA §103
Filed: Dec 06, 2022
Examiner: NGUYEN, PHUNG HOANG JOSEPH
Art Unit: 2691
Tech Center: 2600 — Communications
Assignee: Loop Now Technologies Inc.
OA Round: 3 (Non-Final)
Grant Probability: 79% (Favorable)
OA Rounds: 3-4
To Grant: 2y 9m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 79% — above average (694 granted / 877 resolved; +17.1% vs TC avg)
Interview Lift: +32.1% across resolved cases with interview (strong lift)
Typical timeline: 2y 9m avg prosecution; 32 currently pending
Career history: 909 total applications across all art units
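The headline figures follow directly from the raw counts above. A quick check in Python (the 99% cap is an assumption made to match the displayed "With Interview" figure, since 79% plus 32.1 points would exceed 100%; the page itself notes grant probability is derived from the career allow rate):

```python
# Career counts reported on this page.
granted = 694
resolved = 877

allow_rate = granted / resolved            # 0.7913... -> displayed as 79%

# Reported interview lift: +32.1 percentage points, with the displayed
# result apparently capped at 99% (assumption, not stated on the page).
with_interview = min(allow_rate + 0.321, 0.99)

print(f"{allow_rate:.0%}")       # 79%
print(f"{with_interview:.0%}")   # 99%
```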

Statute-Specific Performance

§101: 5.6% (-34.4% vs TC avg)
§103: 56.8% (+16.8% vs TC avg)
§102: 15.2% (-24.8% vs TC avg)
§112: 8.2% (-31.8% vs TC avg)
Tech Center averages are estimates • Based on career data from 877 resolved cases

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-8, 11-14 and 16-28 are rejected under 35 U.S.C. 103 as being unpatentable over Takeda OR Kristal in view of Cote et al. (US 2011/0090351) or Sun et al. (US 2011/0187733), and further in view of Zhou et al. (US 2021/0019892), Choi et al. (US 2018/0040304) OR Guzman Suarez et al. (US 2012/0092438).

Claims 1, 27 and 28. Takeda or Kristal teaches a computer program product embodied in a non-transitory computer readable medium for machine learning, a computer system for machine learning, and a computer-implemented method for video content analysis comprising:

a) capturing video output from a first camera on a first mobile device (Takeda: input image is captured by an image capture device, such as a camera or mobile device of a user, c6, l58-60; Kristal: a user may capture a single image (or multiple images, or a video clip) of the user by way of a user application or device, such as by taking a "selfie" image of himself, [0060]);

b) recognizing a portion of an individual, in the video output that was captured, wherein the recognizing determines a user body contour (Takeda: Figs. 1A and 1B; Kristal: identification of the precise body contour of a particular user, [0119]);

c) generating a binary mask, wherein the binary mask enables real-time video processing, which includes separating the user body contour from a background of the video output from the first camera (Takeda: a binary object mask is generated which can be utilized for removing the material surrounding that object, as seen in Fig. 1B; Kristal: a computer vision algorithm, or other image processing techniques, may be used to trace around the determined shape or contour of the particular body of the particular user as shown in the image, and to cut off or remove or discard the data of the background, leaving only a "net" image of the user himself, or a masking component of the user's image by itself and sans its background, [0124]);

d) smoothing one or more edges of the binary mask (Takeda: using Gaussian functions, c2, l2; Kristal: the User Handler Module 4, via its Body completion process 4F, may cure inconsistencies or imperfections in the user's image by smoothing and/or improving and/or modifying the contour line of the user's body image, [0136, 0152], using a Gaussian method, [0347]), [by applying a low pass filter and gamma adjust process on the binary mask after generating the binary mask of user body contour = X1].

Regarding X1 in step d): Takeda and Kristal do not teach X1. Cote provides tone mapping per gamma adjustment, [0269, 0270], and uses a Gaussian filter: when a particular threshold of noise content is detected in the input image, the selection logic 716 may be adapted to select one of the low pass filtered outputs G1out or G2out from which high frequency content, which may include noise, has been reduced, [0284-0286].
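The smoothing at issue in step d) and X1, a low-pass (Gaussian) filter followed by a gamma adjust on the binary mask, can be illustrated with a short NumPy sketch. This is illustrative only: the function names are hypothetical and the recipe follows the general technique named in the action, not any cited reference's actual implementation.

```python
import numpy as np

def gaussian_kernel1d(sigma: float, radius: int) -> np.ndarray:
    """Normalized 1-D Gaussian kernel."""
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def smooth_mask(mask: np.ndarray, sigma: float = 1.0, gamma: float = 0.8) -> np.ndarray:
    """Low-pass filter (separable Gaussian blur) then gamma-adjust a mask.

    `mask` holds floats in [0, 1]; the blur softens the hard edges of the
    binary mask, and the power-law step is the gamma adjustment.
    """
    radius = max(1, int(3 * sigma))
    k = gaussian_kernel1d(sigma, radius)
    # Separable blur: filter rows, then columns.
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, mask)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)
    return np.clip(blurred, 0.0, 1.0) ** gamma

# A hard 0/1 mask with a sharp vertical edge down the middle.
hard = np.zeros((8, 8))
hard[:, 4:] = 1.0
soft = smooth_mask(hard)     # edge pixels now take fractional values
```

A gamma exponent below 1 raises the fractional edge values, biasing the softened edge toward the foreground; an exponent above 1 would bias it toward the background.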
Sun teaches: [figure omitted; reproduced in the Office action as media_image1.png, grayscale].

e) merging the binary mask with the video output from the first camera, wherein the merging produces a merged first camera video output (Takeda: segmentation is then performed 46 at a finer image resolution, taking the object mask 44 and the reduced resolution image 18; at this stage, the system already has a good mask estimated at the lower image resolution, and the method then refines the upscaled object mask using the same segmentation method, such as described in FIG. 5, c5, l15-21, selecting one or more finer resolutions as segmentation proceeds, c10, l9-21; Kristal: FIG. 37 illustrates example method operations that may be performed according to one or more embodiments to combine user selectable product images and facilitate visualization-assisted coordinated product transactions; in block 120, one or more products having one or more associated product images are identified, [0461-0464]); and

f) creating a composite video, wherein the merged first camera video output is overlaid onto a video output from a second camera, wherein the video output from the second camera comprises video output from a camera on a second mobile device, wherein the composite video is included in a livestream event.

While Takeda discusses segmentation to produce finer image resolution and Kristal discusses combining user selectable images as shown above, it is unclear that they teach f). Zhou teaches that video can include live video captured using a camera of a client device, e.g., a front-facing camera, a rear camera, and/or one or more other cameras separate from the device, [0039], to "generate a composite video that includes a foreground video segmented from a captured scene, overlaid on a background different from the original captured background", [0031, 0091, 0131] and Fig. 8. Choi provides different embodiments addressing f): Fig. 1, [0025-0027], details the composition of images from the cameras of a single terminal having a front camera and a rear camera, while Fig. 5, [0053-0059], details the composition of an image acquired from the terminal's own camera and a second image received from a counterpart terminal during a real-time video call. Guzman: the video display generator 2230 generates a video display for the UI interaction and generation module 2205; the video display may be based on both images received from a local camera through camera input module 2270 as well as images received from the remote user through the conference management module 2225; in some embodiments, the video display generator generates a picture-in-picture (PIP) display that is sent to the UI interaction and generation module 2205; in other embodiments, the videos are sent to the UI interaction and generation module 2205, which puts together the PIP display, [0234].

Therefore, it would have been obvious to the ordinary artisan before the effective filing date to incorporate the teaching of Cote or Sun into the teaching of Takeda OR Kristal for the purpose of providing image sharpening techniques that improve the enhancement of textures and edges while also reducing noise in the output image (as shown in Cote) or enhancing images based on light level or ambient light (as shown in Sun), and also to incorporate the teaching of Zhou, Choi or Guzman Suarez into the teaching of Takeda or Kristal for the purpose of enhancing the merging of images not only from one source (i.e., images from one camera) but also from a variety of sources (i.e., images from multiple different cameras), to obtain a greater selection of foreground and background images for merging and thereby enhance communication…

Claims 2-4, wherein the first camera and the second camera are included on the first mobile device; wherein the first camera and the second camera are facing in opposite directions (Zhou: front and back cameras, [0039]);
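Elements e) and f) together amount to alpha compositing: the smoothed mask weights the first-camera (selfie) frame over the second-camera frame. A minimal sketch, assuming float RGB frames in [0, 1]; the function and variable names are hypothetical illustrations, not any party's actual implementation.

```python
import numpy as np

def composite_over(fg: np.ndarray, alpha: np.ndarray, bg: np.ndarray) -> np.ndarray:
    """Standard 'over' compositing: out = alpha * fg + (1 - alpha) * bg.

    fg, bg: H x W x 3 float frames in [0, 1] (first / second camera).
    alpha:  H x W soft mask (e.g., the smoothed binary mask).
    """
    a = alpha[..., None]              # broadcast the mask over RGB channels
    return a * fg + (1.0 - a) * bg

h, w = 4, 4
selfie = np.full((h, w, 3), 0.9)      # stand-in for the first-camera output
scene = np.zeros((h, w, 3))           # stand-in for the second-camera output
mask = np.zeros((h, w))
mask[1:3, 1:3] = 1.0                  # "user body contour" region
frame = composite_over(selfie, mask, scene)
```

Inside the mask the output shows the selfie pixels; outside it shows the second-camera scene, with fractional mask values blending the two at the edges.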
wherein the video output from the first camera and the video output from the second camera are displayed on the first mobile device (Zhou: Fig. 5, [0124]).

Claim 5. The method of claim 4 further comprising rendering a picture-in-picture display, on the first mobile device, wherein the merged first camera video output is overlaid on the video output from the second camera. (Picture-in-picture display is well known in the art, as shown by Guzman Suarez, Figs. 9, 11, 13-16.)

Claims 6-8. The method of claim 3 further comprising sharing the composite video with a second mobile device; further comprising selecting, by a user, a video effect overlay from a library of video effect overlays; and further comprising merging the video effect overlay with the merged first camera video output. (Kristal: the output of the Universal Dressing Module 8 may be provided to the user for display via his or her electronic device 110 running a client application 110, and may be further shared or sent by the user to selected recipients, such as via the social networks 112, [0123]; Figs. 37 and 38 show a user selectable item among multiple items to be shared or advertised, [0461-0464].)

Claims 9-10. (Cancelled)

Claims 11-13, further comprising enabling an ecommerce purchase of at least one product for sale by a viewer, wherein the ecommerce purchase is accomplished within a livestream window; wherein the video output from the second camera includes the at least one product for sale; further comprising recognizing the at least one product for sale from a library of products. (Kristal: the output of the Universal Dressing Module 8 may be provided to the user for display via his or her electronic device 110 running a client application 110, and may be further shared or sent by the user to selected recipients, such as via the social networks 112, [0123]; Figs. 37 and 38 show a user selectable item among multiple items (mapping to a library of products) to be shared or advertised, [0461-0464], i.e., products for sale (e.g., women's dresses/clothes) from e-commerce sites, [0075, 0079, 0085], or from real-life shops, [0076, 0085, 0470].)

Claim 14. The method of claim 11 further comprising pinning a product card, using one or more processors, in the livestream window, wherein the product card represents the at least one product for sale. (Zhou: live video call, [0050]; Kristal: products from e-commerce sites, [0075, 0079, 0085], or from real-life shops, [0076, 0085, 0470].)

Claims 16-18, wherein the video output from the first camera comprises a smoothed selfie video; wherein the smoothed selfie video is shared with a second mobile device; and wherein the smoothed selfie video is overlaid with a second selfie video captured by the second mobile device. (Kristal: a user may capture … a "selfie" image of himself, [0060, 0131, 0133].)

Claims 19-22. The method of claim 1 further comprising synchronizing frame rates of depth data, face metadata, and video data of the first camera; further comprising determining a first depth between a user face and the first camera; further comprising using a cutoff depth to determine the user body contour; wherein the generating the binary mask is based on the first depth and the cutoff depth. (Zhou: performing fine segmentation to obtain a binary mask for each frame, [0005, 0017]; Fig. 5 also shows the binary mask being applied to discard the surroundings, i.e., the background of the top image, to produce the bottom image, [0124, 0134-0136]. In some implementations, the weight may be proportional to the distance. In some implementations, the global coherence weight may be used as a cutoff value for the weight, e.g., the value of the weight may be set as equal to the global coherence weight when the distance is equal to or greater than a cutoff distance value. The calculated weight may be stored in the weight map. In some implementations, the cutoff distance value may be determined experimentally, e.g., based on segmentation results obtained for a large number of videos, [0080], which also requires the depth data of a video frame, Fig. 3.)

Claim 23. The method of claim 19 wherein the depth data is determined by a depth sensor. (Zhou teaches depth data, [0047], by one or more other cameras, [0039]. Examiner maps "one or more other cameras" to a depth sensor.)

Claim 24. The method of claim 1 further comprising employing a second alpha matte on the video output from the first camera. (Zhou: Gaussian filter provides alpha matting, [0088].)

Claims 25-26. The method of claim 24 further comprising correcting an orientation of the video output from the first camera (Zhou: calculating an L1 distance between a pixel location of the pixel and a mask boundary of the initial segmentation mask, wherein the mask boundary includes locations where at least one foreground pixel is adjacent to at least one background pixel in the initial segmentation mask, [0009]); further comprising combining the video output from the first camera with the binary mask (Fig. 2: different video segments may be processed in parallel and the obtained foreground segments may be combined to form a foreground video, [0095-0099]).

Claims 15 and 29-30 are rejected under 35 U.S.C. 103 as being unpatentable over Takeda OR Kristal in view of Cote or Sun, further in view of Zhou, Choi OR Guzman Suarez, and further in view of Laska et al. (US 2019/0156126), Montage (US 9,176,653) OR several YouTube tutorials below (hereinafter Tutorials).

Claim 15. The method of claim 1 wherein the composite video is scaled in response to a user gesture.
(Laska: In addition to enhancing or optimizing the actual operation of the devices themselves with respect to their immediate functions, the extensible devices and services platform 300 may be directed to "repurpose" that data in a variety of automated, extensible, flexible, and/or scalable ways to achieve a variety of useful objectives. These objectives may be predefined or adaptively identified based on, e.g., usage patterns, device efficiency, and/or user input (e.g., requesting specific functionality), [0084], where user input can be a gesture, [0152]. Montague's claim 1: a user may press and release for a drag length of zero to zoom out, or the user may press and drag farther than a five pixel threshold then release to zoom in, without the user having to exert extra effort and time to first select either a zoom in or a zoom out function by way of an additional user action such as moving a pointer to a toolbar or special location, use of a different mouse button, activating a popup menu, a key press, click of an icon, gesture, or marking menu stroke.)

Claims 29-30. (New) The method of claim 1 wherein the smoothing includes transforming the binary mask to allow a drag-and-zoom feature; wherein the drag-and-zoom feature is based on a combination of changing a binary mask placement and binary mask scaling.

Here, examiner notes that to smooth one or more edges of the binary mask, images, or color contrast, Takeda uses Gaussian functions, c2, l2; Kristal also uses a Gaussian method, [0347], and the User Handler Module 4, via its Body completion process 4F, may cure inconsistencies or imperfections in the user's image by smoothing and/or improving and/or modifying the contour line of the user's body image, [0136, 0152]. Zhou uses a Gaussian model, i.e., GMM (Gaussian Mixture Model). Cote uses a Gaussian filter: when a particular threshold of noise content is detected in the input image, the selection logic 716 may be adapted to select one of the low pass filtered outputs G1out or G2out from which high frequency content, which may include noise, has been reduced, [0284-0286].

Similarly, Laska also uses a Gaussian mixture model, teaching "the server provides (2026) a composite video segment corresponding to the identified event of interest, the composite video segment including a plurality of composite frames each including a high-resolution portion covering the zone of interest, and a low-resolution portion covering regions outside of the zone of interest", [0418]. Laska further, via Fig. 9L, teaches "the customizable outline 947A may be adjusted by performing a dragging gesture with any corner or side of the customizable outline 947A", [0173-0174]. In more detail, Laska teaches "The electronic device detects (1606) a first user input to zoom in on a respective portion of the first video feed. In some implementations, the first user input is a mouse scroll wheel, keyboard shortcuts, or selection of a zoom-in affordance (e.g., elevator bar or other widget) in a web browser accompanied by a dragging gesture to pane the zoomed region. For example, the user of the client device 504 is able to drag the handle 919 of the elevator bar in FIG. 9B to zoom-in on the video feed. Subsequently, the user of the client device 504 may perform a dragging gesture inside of the first region 903 to pane up, down, left, right, or a combination thereof", [0350-0352]. Montage teaches a method in which, during a drag, the user can also zoom, pan, rotate, draw and/or manipulate, col. 2, line 52 to col. 3, line 46.

In addition, examiner presents the "drag and zoom" feature in a visual demonstration via the following tutorials, which were made prior to the provisional filing date (09/18/2019) and teach the claimed feature.
By Witch Doctor Studios: https://youtu.be/bHTgeNjZ2eE
By Max Novak: https://youtu.be/XNRnrsmDPJs
By Dudeinadrama: https://youtu.be/kP9LyCYhv-w
By Mobox Graphics: https://youtu.be/Axa38beTBvo

Therefore, it would have been obvious to the ordinary artisan before the effective filing date to incorporate the teaching of Laska, Montage or Tutorials into the teaching of Takeda or Kristal in view of Zhou, Choi or Guzman Suarez for the purpose of providing greater convenience in making a simple gesture, i.e., scaling, dragging and zooming, to achieve a better/smoother video presentation.

Inquiry

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHUNG-HOANG J. NGUYEN, whose telephone number is (571) 270-1949. The examiner can normally be reached on a regular schedule, 6:00-3:00. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Duc Nguyen, can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PHUNG-HOANG J NGUYEN/
Primary Examiner, Art Unit 2691
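Claims 29-30 tie the smoothing to a drag-and-zoom feature built from changing the binary mask's placement and scale. A nearest-neighbor sketch of such a transform follows; this is an illustration under assumed conventions (function name, resampling rule, and out-of-bounds handling are all assumptions), not the applicant's or any cited reference's method.

```python
import numpy as np

def drag_and_zoom(mask: np.ndarray, dx: int, dy: int, scale: float) -> np.ndarray:
    """Change a binary mask's placement (drag by dx, dy) and scale (zoom).

    Nearest-neighbor resampling: output pixel (y, x) samples the input at
    ((y - dy) / scale, (x - dx) / scale); pixels mapping outside become 0.
    """
    h, w = mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.round((ys - dy) / scale).astype(int)
    src_x = np.round((xs - dx) / scale).astype(int)
    inside = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)
    out = np.zeros_like(mask)
    out[inside] = mask[src_y[inside], src_x[inside]]
    return out

m = np.zeros((6, 6))
m[0:2, 0:2] = 1.0                                  # small mask, top-left corner
moved = drag_and_zoom(m, dx=2, dy=2, scale=2.0)    # dragged by (2, 2), zoomed 2x
```

The same index mapping, applied per frame to the mask before compositing, would let a user gesture reposition and rescale the overlaid foreground in the composite video.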

Prosecution Timeline

Dec 06, 2022
Application Filed
Dec 18, 2024
Non-Final Rejection — §103
Mar 31, 2025
Response Filed
Apr 08, 2025
Final Rejection — §103
Sep 15, 2025
Request for Continued Examination
Sep 18, 2025
Response after Non-Final Action
Sep 25, 2025
Examiner Interview (Telephonic)
Sep 29, 2025
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12598256
DISRUPTED-SPEECH MANAGEMENT ENGINE FOR A MEETING MANAGEMENT SYSTEM
Granted Apr 07, 2026 (2y 5m to grant)
Patent 12591408
DISPLAY APPARATUS AND METHOD INCORPORATING INTEGRATED SPEAKERS WITH ADJUSTMENTS
Granted Mar 31, 2026 (2y 5m to grant)
Patent 12587612
Method and Device for Invoking Public or Private Interactions during a Multiuser Communication Session
Granted Mar 24, 2026 (2y 5m to grant)
Patent 12587705
LIVESTREAMING AUDIO PROCESSING METHOD AND DEVICE
Granted Mar 24, 2026 (2y 5m to grant)
Patent 12587700
GROUPING IN A SYSTEM WITH MULTIPLE MEDIA PLAYBACK PROTOCOLS
Granted Mar 24, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 79%
With Interview: 99% (+32.1%)
Median Time to Grant: 2y 9m
PTA Risk: High
Based on 877 resolved cases by this examiner. Grant probability derived from career allow rate.
