Prosecution Insights
Last updated: April 19, 2026
Application No. 18/560,609

Systems and Methods for Identifying and Extracting Object-Related Effects in Videos

Status: Final Rejection (§103)
Filed: Nov 13, 2023
Examiner: GOEBEL, EMMA ROSE
Art Unit: 2662
Tech Center: 2600 (Communications)
Assignee: Google LLC
OA Round: 2 (Final)

Grant Probability: 53% (Moderate)
Expected OA Rounds: 3-4
Time to Grant: 3y 0m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 53% (24 granted / 45 resolved; -8.7% vs TC avg)
Interview Lift: +47.0% (among resolved cases with an interview)
Avg Prosecution: 3y 0m typical timeline (40 applications currently pending)
Total Applications: 85 (across all art units)

Statute-Specific Performance

§101: 18.2% (-21.8% vs TC avg)
§103: 60.1% (+20.1% vs TC avg)
§102: 11.8% (-28.2% vs TC avg)
§112: 8.4% (-31.6% vs TC avg)

TC averages are estimates. Based on career data from 45 resolved cases.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims

Claims 1-20 are pending.

Priority

Acknowledgement is made of Applicant's claim of priority from U.S. Provisional Application No. 63/186,971, filed May 11, 2021, and PCT Application No. PCT/US2022/028804, filed May 11, 2022.

Response to Arguments

Applicant's arguments, see p. 8, filed January 26, 2026, with respect to the Claim Objections have been fully considered and are persuasive. The amendments to the claims have overcome the previous objections, which have therefore been withdrawn.

Applicant's arguments filed January 26, 2026 with respect to the 35 USC 103 rejection of the claims have been fully considered but they are not persuasive. Applicant argues that the Li reference does not teach "one or more object layers comprises image data illustrative of the corresponding object and one or more trace effects at least partially attributable to the corresponding object, wherein the one or more object layers are separate from the background layer" and that neither Shen nor Yu cures the deficiencies. Examiner respectfully disagrees. As described in the 35 USC 103 rejections below, Li teaches that the foreground mask contains pixels corresponding to a vehicle and its cast shadow (i.e., an object layer comprising image data illustrative of the corresponding object and one or more trace effects at least partially attributable to the corresponding object). In an analogous field of endeavor, Yu teaches separating the object portrayed in the image from the background. Although Li and Yu do not explicitly teach that the separated background and foreground are an object layer and a background layer "wherein the one or more object layers are separate from the background layer", the Shen reference teaches a separate background sky layer B and foreground layer F.

Applicant argues that the references are not sufficient to teach the claimed invention of keeping the object and trace effect separate from the background layer. Applicant is reminded that the specification is not read into the claims, and Examiner asserts that the combination of references is sufficient to teach the invention as claimed by combining Li's foreground mask containing a vehicle and its cast shadow with Yu's teaching of separating the foreground from the background and Shen's teaching of separate background and foreground layers. Thus, the 35 USC 103 rejection of the claims is upheld.

Applicant further argues that there is no motivation to combine the Zhan reference with the Li reference as applied in claims 7 and 8 and that Li teaches away from Zhan's invention. Examiner respectfully disagrees. In rejecting claims 7 and 8, Zhan is relied upon to teach generating an optical flow for an object based on an input binary object mask. Applicant argues that Li's reference cannot be combined with Zhan's because Zhan requires user input, which would render Li's automated steps redundant. However, Applicant is once again reminded that the specification of the invention is not read into the claims.
Nothing in the claimed limitation of "generating, by the computing system and based at least in part on the one or more binary object masks, one or more optical flows respectively for the one or more objects" indicates that the optical flow could not be determined in the process described by Zhan of determining an optical flow of motion of an object based on a selected guidance point. In fact, the selection of a guidance point is not used in the rejection of the disclosed invention; instead, the rejection relies upon Zhan's teaching of generating an optical flow based on an input binary object mask, which is an automated process using a first neural network. One having ordinary skill in the art would be motivated to combine Li's binary object mask with Zhan's neural network for determining an optical flow because doing so would allow for accurately predicting the motion of an object. Additionally, regarding Applicant's argument that Li teaches away from Zhan, Li merely discloses an alternative form of determining an optical flow, which does not constitute a teaching away from the optical flow determination of Zhan. Therefore, the 35 USC 103 rejection of the claims is upheld, and consequently, THIS ACTION IS FINAL.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-6 and 9-15 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (US 2015/0248590 A1) in view of Yu et al. (US 2022/0262009 A1, filed February 17, 2021) further in view of Shen et al. (US 2017/0236287 A1).
Regarding claim 1, Li teaches a computer-implemented method for identifying and extracting object-related effects in videos, the computer-implemented method comprising: obtaining, by a computing system comprising one or more computing devices, video data, the video data comprising a plurality of image frames depicting one or more objects (Li, Para. [0041], processing a target image, the target image is an individual frame of a sequence of video frames); and for each of the plurality of image frames: generating, by the computing system, one or more binary object masks, wherein each of the one or more binary object masks is descriptive of a respective location of a corresponding object of the one or more objects within the image frame (Li, Para. [0047], foreground image, foreground frame, and foreground mask relate to a binary mask obtained by comparing (e.g., pixel-wise subtracting or performing goodness of fit tests) the current frame with the current background estimate, followed by thresholding. For example, pixels with a binary value of "1" after the comparison correspond to locations where foreground objects have been detected; conversely, pixels with a binary value of "0" correspond to locations where no foreground objects have been detected); wherein each of the one or more object layers comprises image data illustrative of the corresponding object and one or more trace effects at least partially attributable to the corresponding object (Li, Para. [0059], the detected foreground mask (i.e., object layer) contains pixels corresponding to both the vehicle and its cast shadow).

Although Li teaches obtaining a binary mask indicating the presence of foreground objects and a background region (Li, Para. [0047]), Li does not explicitly teach "inputting, by the computing system, the image frame and the one or more binary object masks into a machine-learned matte generation model" and "receiving, by the computing system as output from the machine-learned matte generation model, a background layer illustrative of a background of the video data and one or more object layers respectively associated with the one or more binary object masks". However, in an analogous field of endeavor, Yu teaches an act of generating a first alpha matte for a digital image via a first layer of a matting neural network utilizing the digital image (i.e., image frame) and a guidance mask corresponding to an object portrayed in the digital image (i.e., binary object mask) (Yu, Para. [0132]). Yu further teaches that the alpha matte generation system 106 performs image matting by deconstructing pixel color values for the digital image 202 (denoted as "I" in FIG. 2) into the sum of two samples that account for the alpha values ("α") of the final alpha matte 214. For example, a first sample corresponds to foreground pixel color values (denoted as "F") that are multiplied by corresponding alpha values of the final alpha matte 214. Further, a second sample corresponds to background pixel color values (denoted as "B") that are multiplied by a corresponding difference between one and the alpha values (or "1−α"). In this manner, the alpha matte generation system 106 can segregate (e.g., selects, extracts, and/or identifies) an object portrayed in the digital image 202 from a background of the digital image 202 by utilizing the final alpha matte 214 (Yu, Para. [0051]; Fig. 2).
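For readers outside the art, a minimal NumPy sketch of the thresholded background-subtraction step that Li's Para. [0047] describes; the function name and threshold value are illustrative assumptions, not taken from any reference:

```python
import numpy as np

def binary_object_mask(frame: np.ndarray, background: np.ndarray,
                       threshold: float = 30.0) -> np.ndarray:
    """Compare the current frame with the current background estimate
    (pixel-wise subtraction), then threshold: 1 marks pixels where a
    foreground object was detected, 0 marks background."""
    diff = np.abs(frame.astype(np.float32) - background.astype(np.float32))
    if diff.ndim == 3:  # reduce color channels to one difference value per pixel
        diff = diff.mean(axis=-1)
    return (diff > threshold).astype(np.uint8)
```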
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date to modify the method of Li with the teachings of Yu by including a matting neural network using the image frame (i.e., digital image) and binary object mask (i.e., guidance mask) to generate an alpha matte for separating the object portrayed in the image from the background. One having ordinary skill in the art before the effective filing date would have been motivated to combine these references because doing so would allow for efficiently and flexibly generating enhanced, refined alpha mattes for digital image matting, as recognized by Yu.

Although Li in view of Yu teaches segregating an object portrayed in the image from the background (Yu, Para. [0051]), they do not explicitly teach that the separated background and foreground are "a background layer" and "an object layer" and "wherein the one or more object layers are separate from the background layer". However, in an analogous field of endeavor, Shen teaches considering the image as composed of the background sky layer B and the foreground layer F, where an alpha matte defines the transparent/opacity areas of the background and the foreground layer (Shen, Para. [0051]). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Li in view of Yu with the teachings of Shen by including that the matting neural network determines a background layer and foreground layer (i.e., object layer), wherein the object layer is separate from the background layer. One having ordinary skill in the art before the effective filing date would have been motivated to combine these references because doing so would allow for segmenting foreground objects from a background layer, as recognized by Shen. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.

Regarding claim 2, Li in view of Yu further in view of Shen teaches the computer-implemented method of claim 1, wherein the background layer and the one or more object layers comprise one or more color channels and an opacity matte (Yu, Para. [0044], an alpha matte (as a type of guidance mask) refers to a representation of a digital image that indicates, for one or more pixels, a corresponding alpha value (e.g., an opacity value or blending amount between foreground and background color values) (i.e., an opacity matte). Para. [0051], the alpha matte generation system performs image matting by deconstructing pixel color values for the digital image into the sum of two samples that account for the alpha values ("α") of the final alpha matte. For example, a first sample corresponds to foreground pixel color values (denoted as "F") that are multiplied by corresponding alpha values of the final alpha matte. Further, a second sample corresponds to background pixel color values (denoted as "B") that are multiplied by a corresponding difference between one and the alpha values (or "1−α") (i.e., one or more color channels)). The proposed combination as well as the motivation for combining the Li, Yu and Shen references presented in the rejection of Claim 1 apply to Claim 2 and are incorporated herein by reference. Thus, the method recited in Claim 2 is met by Li in view of Yu further in view of Shen.
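Yu's decomposition and Shen's separate layers can be read together as the standard matting identity I = αF + (1−α)B. Below is a loose NumPy sketch under that reading; the RGBA layer layout is an assumption for illustration, not something the references prescribe (real matting also has to estimate F and B, which this sketch takes as given):

```python
import numpy as np

def composite(foreground: np.ndarray, background: np.ndarray,
              alpha: np.ndarray) -> np.ndarray:
    """Matting identity: I = alpha * F + (1 - alpha) * B."""
    a = alpha[..., None]  # (H, W) alpha broadcast over color channels
    return a * foreground + (1.0 - a) * background

def split_layers(foreground: np.ndarray, background: np.ndarray,
                 alpha: np.ndarray):
    """Keep the object and background as separate layers: the object layer
    carries its color channels plus the opacity (alpha) matte, while the
    background layer stays intact underneath (cf. claims 1-2)."""
    object_layer = np.concatenate([foreground, alpha[..., None]], axis=-1)  # RGB + A
    return object_layer, background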
Regarding claim 3, Li in view of Yu further in view of Shen teaches the computer-implemented method of claim 1, wherein, for each corresponding object, at least a portion of the one or more trace effects have locations which differ from the respective location of the corresponding object (Li, Para. [0059]; Fig. 13, it can be seen that the detected foreground mask contains pixels corresponding to both the vehicle and its cast shadow. An additional step is required for shadow removal, whereby a decision is made about whether each detected foreground pixel is a shadow pixel or not. The method disclosed herein obtains a clean foreground mask (see lower right block of FIG. 13) containing only vehicle pixels through a background subtraction process (i.e., a portion of the one or more trace effects has a location different from the location of the corresponding object because it can be removed while the object pixels stay)).

Regarding claim 4, Li in view of Yu further in view of Shen teaches the computer-implemented method of claim 1, wherein, for each corresponding object, at least a portion of the one or more trace effects are time-varying effects (Li, Para. [0061], without a shadow removal technique, multiple cars will be identified as a single moving object during morning hours due to the length of the cast shadows. A sample video used in conjunction with this scenario was 1.5 hours in length and captured activity during the peak breakfast hour on a sunny morning when cast shadows are the strongest (i.e., trace effects are time-varying)).

Regarding claim 5, Li in view of Yu further in view of Shen teaches the computer-implemented method of claim 1, wherein each of the one or more binary object masks is descriptive of the respective location of the corresponding object independent of and excluding the one or more trace effects (Li, Para. [0059]; Fig. 13, it can be seen that the detected foreground mask contains pixels corresponding to both the vehicle and its cast shadow. An additional step is required for shadow removal, whereby a decision is made about whether each detected foreground pixel is a shadow pixel or not. The method disclosed herein obtains a clean foreground mask (see lower right block of FIG. 13) containing only vehicle pixels through a background subtraction process. Para. [0047], foreground image, foreground frame, and foreground mask relate to a binary mask obtained by comparing (e.g., pixel-wise subtracting or performing goodness of fit tests) the current frame with the current background estimate, followed by thresholding. For example, pixels with a binary value of "1" after the comparison correspond to locations where foreground objects have been detected; conversely, pixels with a binary value of "0" correspond to locations where no foreground objects have been detected).

Regarding claim 6, Li in view of Yu further in view of Shen teaches the computer-implemented method of claim 1, wherein, for at least one of the corresponding objects, the one or more trace effects comprise a shadow, a reflection, smoke generated by the object, or a ripple (Li, Para. [0059], the detected foreground mask contains pixels corresponding to both the vehicle and its cast shadow (i.e., trace effect comprises a shadow)).
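Li's Fig. 13 distinction between the raw foreground mask (vehicle plus shadow) and the clean vehicle-only mask implies the trace-effect pixels are simply the set difference of the two masks; a short sketch, with illustrative names:

```python
import numpy as np

def trace_effect_pixels(foreground_mask: np.ndarray,
                        clean_object_mask: np.ndarray) -> np.ndarray:
    """Pixels flagged as foreground (object + cast shadow) that are not
    object pixels; their locations differ from the object's own location."""
    return (foreground_mask.astype(bool)
            & ~clean_object_mask.astype(bool)).astype(np.uint8)
```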
Regarding claim 9, Li in view of Yu further in view of Shen teaches the computer-implemented method of claim 1, wherein at least one of the corresponding objects comprises a plurality of objects treated as a collective object (Li, Para. [0061], multiple cars will be identified as a single moving object during morning hours due to the length of the cast shadows).

Regarding claim 10, Li in view of Yu further in view of Shen teaches the computer-implemented method of claim 1, wherein the machine-learned matte generation model comprises a neural network (Yu, Para. [0047], using the matting neural network, the alpha matte generation system processes the digital image and the guidance mask. A matting neural network refers to a neural network for generating an alpha matte). The proposed combination as well as the motivation for combining the Li, Yu and Shen references presented in the rejection of Claim 1 apply to Claim 10 and are incorporated herein by reference. Thus, the method recited in Claim 10 is met by Li in view of Yu further in view of Shen.

Regarding claim 11, Li in view of Yu further in view of Shen teaches the computer-implemented method of claim 1, wherein the machine-learned matte generation model has been trained based at least in part on a reconstruction loss, a flow loss, and a regularization loss (Yu, Para. [0116], the loss function comprises a summation of three loss components: a regression loss, a composition loss, and a Laplacian loss).

Claims 12-15 recite systems with elements corresponding to the steps recited in Claims 1-3 and 6, respectively. Therefore, the recited elements of these claims are mapped to the proposed combination in the same manner as the corresponding steps in their corresponding method claims. Additionally, the rationale and motivation to combine the Li, Yu and Shen references, presented in the rejection of Claim 1, apply to these claims. Finally, the combination of the Li, Yu and Shen references discloses a processor (Li, Para. [0115], one or more processors) and non-transitory computer-readable media (Li, Para. [0115], non-transitory computer-readable medium storing program instructions).

Claims 7 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (US 2015/0248590 A1) in view of Yu et al. (US 2022/0262009 A1, filed February 17, 2021) further in view of Shen et al. (US 2017/0236287 A1), as applied to claims 1-6 and 9-15 above, and further in view of Zhan et al. (US 2021/0279892 A1, with foreign priority to Application No. CN 201910086044.3, filed June 29, 2019, US PGPub used herein as a translation and for mapping purposes) and Palanisamy et al. (US 2020/0142421 A1).

Regarding claim 7, Li in view of Yu further in view of Shen teaches the computer-implemented method of claim 1, as described above. Although Li in view of Yu further in view of Shen teaches performing foreground/motion detection on a target video frame (Li, Para. [0046]), they do not explicitly teach "generating, by the computing system and based at least in part on the one or more binary object masks, one or more object optical flows respectively for the one or more objects". However, in an analogous field of endeavor, Zhan teaches that the sparse motion, the binary mask and the to-be-processed image may be input to the first neural network to perform optical flow prediction, thereby obtaining the motion of the target object in the to-be-processed image (Zhan, Para. [0073]). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Li in view of Yu further in view of Shen with the teachings of Zhan by including generating an optical flow for the object based at least in part on the binary mask. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for high accuracy in predicting the motion of the object, as recognized by Zhan.
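For orientation, a toy PyTorch stand-in for the kind of "first neural network" Zhan describes: it takes the frame concatenated with a binary object mask and predicts a dense two-channel (dx, dy) optical flow. The architecture is invented purely for illustration:

```python
import torch
import torch.nn as nn

class MaskGuidedFlowNet(nn.Module):
    """Predict per-pixel object motion from an image frame plus a binary
    object mask (cf. Zhan, Para. [0073]); layer sizes are arbitrary."""
    def __init__(self) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, kernel_size=3, padding=1),  # (dx, dy) per pixel
        )

    def forward(self, frame: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # frame: (N, 3, H, W) image; mask: (N, 1, H, W) binary object mask
        return self.net(torch.cat([frame, mask], dim=1))
```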
Although Li in view of Yu further in view of Shen and Zhan teaches determining optical flow based on sparse motion, the binary mask and a to-be-processed image (Zhan, Para. [0073]), they do not explicitly teach "wherein inputting, by the computing system, the image frame and the one or more binary object masks into the machine-learned matte generation model comprises inputting, by the computing system, the image frame, the one or more binary object masks, and the one or more object optical flows into the machine-learned matte generation model". However, in an analogous field of endeavor, Palanisamy teaches that a first concatenation unit concatenates the pre-processed image data with both the segmentation map and the optical flow map to generate the dynamic scene output (Palanisamy, Para. [0074]). The convolutional neural network processes the dynamic scene output to extract features and generate a feature map of the extracted spatial features (Palanisamy, Para. [0075]). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date to modify the method of Li in view of Yu further in view of Shen and Zhan with the teachings of Palanisamy by including that the input to the matte generation model is the pre-processed image data (i.e., image frame), the segmentation map (i.e., binary object mask), and the optical flow map (i.e., one or more object optical flows). One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for determining spatial features in an image based on an input image, a segmentation map, and an optical flow map, as recognized by Palanisamy. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.

Regarding claim 8, Li in view of Yu further in view of Shen, Zhan and Palanisamy teaches the computer-implemented method of claim 7, wherein each of the one or more object layers comprises a refined object optical flow for the corresponding object (Zhan, Para. [0073], the binary mask and the to-be-processed image may be input to the first neural network to perform optical flow prediction, thereby obtaining the motion of the target object in the to-be-processed image). The proposed combination as well as the motivation for combining the Li, Yu, Shen, Zhan and Palanisamy references presented in the rejection of Claim 7 apply to Claim 8 and are incorporated herein by reference. Thus, the method recited in Claim 8 is met by Li in view of Yu further in view of Shen, Zhan and Palanisamy.
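Palanisamy's concatenation step amounts to stacking the three inputs along the channel axis before the model sees them; a minimal sketch (the tensor shapes are assumptions, not specified by the reference):

```python
import torch

def matte_model_input(frame: torch.Tensor, masks: torch.Tensor,
                      flows: torch.Tensor) -> torch.Tensor:
    """Channel-wise concatenation of the image frame (N, 3, H, W), the
    binary object masks (N, K, H, W), and the per-object optical flows
    (N, 2K, H, W) into one input tensor for the matte generation model."""
    return torch.cat([frame, masks, flows], dim=1)
```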
Claims 16 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (US 2015/0248590 A1) in view of Yu et al. (US 2022/0262009 A1, filed February 17, 2021) further in view of Shen et al. (US 2017/0236287 A1), Labbe et al. (US 2018/0286053 A1) and Kim et al. (US 2019/0042882 A1).

Regarding claim 16, Li teaches one or more non-transitory computer-readable media that collectively store a machine-learned matte generation model (Li, Para. [0115], non-transitory computer-readable medium storing program instructions), wherein the machine-learned matte generation model has been trained by performance of operations, the operations comprising: obtaining, by a computing system comprising one or more computing devices, video data, the video data comprising a plurality of image frames depicting one or more objects (Li, Para. [0041], processing a target image, the target image is an individual frame of a sequence of video frames); and for each of the plurality of image frames: generating, by the computing system, one or more binary object masks, wherein each of the one or more binary object masks is descriptive of a respective location of a corresponding object of the one or more objects within the image frame (Li, Para. [0047], foreground image, foreground frame, and foreground mask relate to a binary mask obtained by comparing (e.g., pixel-wise subtracting or performing goodness of fit tests) the current frame with the current background estimate, followed by thresholding. For example, pixels with a binary value of "1" after the comparison correspond to locations where foreground objects have been detected; conversely, pixels with a binary value of "0" correspond to locations where no foreground objects have been detected); wherein each of the one or more object layers comprises image data illustrative of the corresponding object and one or more trace effects at least partially attributable to the corresponding object (Li, Para. [0059], the detected foreground mask (i.e., object layer) contains pixels corresponding to both the vehicle and its cast shadow).

Although Li teaches obtaining a binary mask indicating the presence of foreground objects and a background region (Li, Para. [0047]), Li does not explicitly teach "inputting, by the computing system, the image frame and the one or more binary object masks into a machine-learned matte generation model" and "receiving, by the computing system as output from the machine-learned matte generation model, a background layer illustrative of a background of the video data and one or more object layers respectively associated with the one or more binary object masks". However, in an analogous field of endeavor, Yu teaches an act of generating a first alpha matte for a digital image via a first layer of a matting neural network utilizing the digital image (i.e., image frame) and a guidance mask corresponding to an object portrayed in the digital image (i.e., binary object mask) (Yu, Para. [0132]). Yu further teaches that the alpha matte generation system 106 performs image matting by deconstructing pixel color values for the digital image 202 (denoted as "I" in FIG. 2) into the sum of two samples that account for the alpha values ("α") of the final alpha matte 214. For example, a first sample corresponds to foreground pixel color values (denoted as "F") that are multiplied by corresponding alpha values of the final alpha matte 214. Further, a second sample corresponds to background pixel color values (denoted as "B") that are multiplied by a corresponding difference between one and the alpha values (or "1−α"). In this manner, the alpha matte generation system 106 can segregate (e.g., selects, extracts, and/or identifies) an object portrayed in the digital image 202 from a background of the digital image 202 by utilizing the final alpha matte 214 (Yu, Para. [0051]; Fig. 2).
The proposed combination as well as the motivation for combining the Li and Yu references presented in the rejection of Claim 1 apply to Claim 16 and are incorporated herein by reference. Although Li in view of Yu teaches segregating an object portrayed in the image from the background (Yu, Para. [0051]), they do not explicitly teach that the separated background and foreground are "a background layer" and "an object layer" and "wherein the one or more object layers are separate from the background layer". However, in an analogous field of endeavor, Shen teaches considering the image as composed of the background sky layer B and the foreground layer F, where an alpha matte defines the transparent/opacity areas of the background and the foreground layer (Shen, Para. [0051]). The proposed combination as well as the motivation for combining the Li, Yu and Shen references presented in the rejection of Claim 1 apply to Claim 16 and are incorporated herein by reference.

Although Li in view of Yu further in view of Shen teaches segregating an object portrayed in the image from the background (Yu, Para. [0051]), they do not explicitly teach "compositing the background layer and the one or more object layers to generate a reconstructed frame". However, in an analogous field of endeavor, Labbe teaches compositing the foreground layer and the background layer into a frame (Labbe, Para. [0132]). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Li in view of Yu further in view of Shen with the teachings of Labbe by including generating a reconstructed frame by compositing the background layer and foreground layer (i.e., object layer). One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for reconstructing a video frame by compositing a foreground layer and background layer, as recognized by Labbe.

Although Li in view of Yu further in view of Shen and Labbe teaches generating a reconstructed frame by compositing the foreground and background layer (Labbe, Para. [0132]), they do not explicitly teach "evaluating a loss function that comprises a reconstruction loss term that compares the reconstructed frame with the image frame" and "modifying one or more values of one or more parameters of the machine-learned matte generation model based on the loss function". However, in an analogous field of endeavor, Kim teaches that, in order to update their parameters, the neural networks utilize a reconstruction loss that represents a difference between the reconstruction source image (i.e., reconstructed frame) and the original source image (i.e., image frame) (Kim, Para. [0056]). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Li in view of Yu further in view of Shen and Labbe with the teachings of Kim by including updating the parameters of the machine-learned matte generation model (i.e., neural network) based on the reconstruction loss representing a difference between a reconstructed frame and the original frame. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for updating a model for reconstructing images with higher accuracy, as recognized by Kim. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.
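Read together, Labbe's compositing and Kim's reconstruction loss suggest a training step like the following PyTorch sketch: composite the predicted layers back into a frame, then penalize its distance from the input frame. The model interface, alpha-over ordering, and L1 distance are assumptions for illustration, not the application's actual method:

```python
import torch
import torch.nn.functional as F

def training_step(model, optimizer, frame, masks):
    """One gradient step on a reconstruction loss. `model` is assumed to
    return a background layer (N, 3, H, W) plus a back-to-front list of
    per-object RGBA layers, each (N, 4, H, W)."""
    background, object_layers = model(frame, masks)
    reconstructed = background
    for layer in object_layers:  # alpha-over compositing, back to front
        rgb, alpha = layer[:, :3], layer[:, 3:4]
        reconstructed = alpha * rgb + (1.0 - alpha) * reconstructed
    loss = F.l1_loss(reconstructed, frame)  # reconstruction loss term
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```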
Regarding claim 17, Li in view of Yu further in view of Shen, Labbe and Kim teaches the one or more non-transitory computer-readable media of claim 16, wherein the background layer and the one or more object layers comprise one or more color channels and an opacity matte (Yu, Para. [0044], an alpha matte (as a type of guidance mask) refers to a representation of a digital image that indicates, for one or more pixels, a corresponding alpha value (e.g., an opacity value or blending amount between foreground and background color values) (i.e., an opacity matte). Para. [0051], the alpha matte generation system performs image matting by deconstructing pixel color values for the digital image into the sum of two samples that account for the alpha values ("α") of the final alpha matte. For example, a first sample corresponds to foreground pixel color values (denoted as "F") that are multiplied by corresponding alpha values of the final alpha matte. Further, a second sample corresponds to background pixel color values (denoted as "B") that are multiplied by a corresponding difference between one and the alpha values (or "1−α") (i.e., one or more color channels)). The proposed combination as well as the motivation for combining the Li, Yu, Shen, Labbe and Kim references presented in the rejection of Claim 16 apply to Claim 17 and are incorporated herein by reference. Thus, the non-transitory computer-readable media recited in Claim 17 are met by Li in view of Yu further in view of Shen, Labbe and Kim.

Claim 20 is rejected under 35 U.S.C. 103 as being unpatentable over Li et al. (US 2015/0248590 A1) in view of Yu et al. (US 2022/0262009 A1, filed February 17, 2021) further in view of Shen et al. (US 2017/0236287 A1), Labbe et al. (US 2018/0286053 A1) and Kim et al. (US 2019/0042882 A1), as applied to claims 16 and 17 above, and further in view of Urtasun (US 2020/0160117 A1).

Regarding claim 20, Li in view of Yu further in view of Shen, Labbe and Kim teaches the one or more non-transitory computer-readable media of claim 16, as described above. Although Li in view of Yu further in view of Shen, Labbe and Kim teaches a loss function based on a reconstruction loss (Kim, Para. [0056]), they do not explicitly teach "wherein the loss function further comprises a regularization loss term that encourages an opacity matte of each object layer toward sparsity". However, in an analogous field of endeavor, Urtasun teaches that the loss can be based at least in part on evaluation of a loss function based at least in part on a regularization term that is used to increase sparsity of the binarized target feature representation (i.e., opacity matte) (Urtasun, Para. [0273]). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date to modify the method of Li in view of Yu further in view of Shen, Labbe and Kim with the teachings of Urtasun by including encouraging the matte toward sparsity using a regularization loss. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for increasing sparsity in a feature representation, as recognized by Urtasun. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.
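An Urtasun-style regularization term of the kind mapped to claim 20 could be as simple as an L1 penalty on the opacity mattes, pushing alpha toward zero away from each object and its trace effects; the L1 choice and weight are assumptions for illustration:

```python
import torch

def sparsity_regularizer(alpha_mattes: torch.Tensor,
                         weight: float = 1e-3) -> torch.Tensor:
    """Regularization loss term encouraging each object layer's opacity
    matte toward sparsity; added to the reconstruction loss above."""
    return weight * alpha_mattes.abs().mean()
```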
Allowable Subject Matter

Claims 18 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. For Examiner's Statement of Reasons for Allowance, see the Non-Final Office Action mailed October 27, 2025.

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Emma Rose Goebel, whose telephone number is (703) 756-5582. The examiner can normally be reached Monday - Friday, 7:30-5. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Amandeep Saini, can be reached at (571) 272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Emma Rose Goebel/
Examiner, Art Unit 2662

/AMANDEEP SAINI/
Supervisory Patent Examiner, Art Unit 2662

Prosecution Timeline

Nov 13, 2023: Application Filed
Oct 23, 2025: Non-Final Rejection (§103)
Dec 30, 2025: Applicant Interview (Telephonic)
Dec 30, 2025: Examiner Interview Summary
Jan 26, 2026: Response Filed
Feb 24, 2026: Final Rejection (§103)
Apr 13, 2026: Applicant Interview (Telephonic)
Apr 13, 2026: Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597236: FINE-TUNING JOINT TEXT-IMAGE ENCODERS USING REPROGRAMMING. Granted Apr 07, 2026 (2y 5m to grant).
Patent 12597129: METHOD FOR ANALYZING IMMUNOHISTOCHEMISTRY IMAGES. Granted Apr 07, 2026 (2y 5m to grant).
Patent 12597093: UNDERWATER IMAGE ENHANCEMENT METHOD AND IMAGE PROCESSING SYSTEM USING THE SAME. Granted Apr 07, 2026 (2y 5m to grant).
Patent 12597124: DEBRIS DETERMINATION METHOD. Granted Apr 07, 2026 (2y 5m to grant).
Patent 12588885: FAT MASS DERIVATION DEVICE, FAT MASS DERIVATION METHOD, AND FAT MASS DERIVATION PROGRAM. Granted Mar 31, 2026 (2y 5m to grant).
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 53%
With Interview: 99% (+47.0%)
Median Time to Grant: 3y 0m
PTA Risk: Moderate

Based on 45 resolved cases by this examiner. Grant probability derived from career allow rate.
