DETAILED ACTION
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on February 18, 2026, has been entered.
Response to Amendment
Claims 1-20 were previously pending. Applicant’s amendment filed February 18, 2026, has been entered in full. Claims 1 and 10 are amended. No claims are added or cancelled. Accordingly, claims 1-20 are now pending.
Response to Arguments
Applicant argues that amendments to claim 1 have overcome the previous objection (Remarks filed February 18, 2026, hereinafter “Remarks”: Page 10). Examiner agrees. The previous objection is withdrawn.
Applicant traverses the previous rejections under 35 U.S.C. 102, arguing that the previously-cited ‘Chen’ reference does not disclose all elements of the amended claims (Remarks: Pages 10-13).
First, Applicant argues that object mask m in Chen is binary rather than “an alpha matte channel that indicates a graded level of transparency …” as recited in the amended claims (Remarks: Pages 11-12). Examiner agrees that the binary object mask m in Chen does not read on the alpha matte of amended claim 1. However, Chen does teach a light attenuation index that does fall within the scope of the claimed alpha matte. The light attenuation index is represented as scalar ρ in equation 6 in Chen. Chen further teaches predicting an array of attenuation values for each pixel in an input image, the array being identified as attenuation mask A in equation 9 of Chen. “The value of this mask is in the range of [0,1], where 0 indicates no light can pass and 1 indicates the light will not be attenuated.” Sec. 4.2, Attenuation regression loss. Thus, the attenuation mask is an alpha matte indicating a graded level of transparency for each pixel value in the transparent object, with attenuation value 0 indicating no transparency, value 1 indicating full transparency, and graded values between 0 and 1 indicating degrees of transparency. In response to Applicant’s amendment, the attenuation mask of Chen is now mapped to the claimed alpha matte.
Second, Applicant argues that “Chen fails to anticipate ‘generating a modified digital image…according to a level of displacement of pixel values of the transparent object as indicated by the refractive flow…’” (Remarks: Pages 12-13). Applicant’s basis for this argument appears to be that, as discussed above, object mask m does not fall within the scope of the alpha matte in the amended claims. As discussed above, the attenuation mask in Chen is now mapped to the claimed alpha matte. Chen generates a modified image using a matting equation (equation 6) by modifying an image according to the attenuation mask (eqn. 6, ρ) as required by the claimed invention.
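As a purely illustrative aside (not part of the record), the matting relationship discussed above can be sketched in code. The names `composite`, `m`, `rho`, and `flow` are hypothetical stand-ins for the corresponding quantities in Chen's eqn. 6; the sketch merely shows how a graded attenuation mask and a displacement field could together modify a background.

```python
import numpy as np

def composite(background, m, rho, flow):
    # Illustrative sketch (hypothetical names): outside the object
    # (m == 0) the background passes through unchanged; inside the
    # object, background pixels are displaced by the two-channel
    # refractive flow and attenuated per-pixel by rho in [0, 1].
    H, W = background.shape
    ys, xs = np.mgrid[0:H, 0:W]
    sy = np.clip(ys + flow[1], 0, H - 1).astype(int)  # vertical displacement
    sx = np.clip(xs + flow[0], 0, W - 1).astype(int)  # horizontal displacement
    warped = background[sy, sx]                       # refraction: remap pixels
    return (1 - m) * background + m * rho * warped    # attenuate and composite
```

With rho = 0 no light passes (the object region goes dark); with rho = 1 and zero flow the background is unattenuated and undisplaced.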
Applicant traverses the previous rejections under 35 U.S.C. 103, arguing that the amended claims are not obvious over Chen or other cited references (Remarks: Pages 13-15).
To the extent Applicant’s arguments rely on the issues discussed above with respect to the rejections under 35 U.S.C. 102, Examiner respectfully disagrees for the reasons provided above.
Specifically regarding claim 10, Applicant argues that the claim is allowable (Remarks: Page 14). In view of Applicant’s amendment and further consideration, Examiner agrees. Chen describes a TOM-Net architecture that includes a Coarse Stage and a Refinement Stage (Fig. 2). The Coarse Stage is “a multi-scale encoder-decoder network” (Section 4, 1st paragraph). The Coarse Stage may accept a combination of an image and an accessed trimap of that image (Sec. 4.4, TOM-Net+Trimap) and that input is processed to generate encoded feature maps (e.g., Sec. 4.1, 2nd par.), which are then passed to decoder branches (e.g., Sec. 4.1, 2nd par.). However, claim 10 requires a “dual-head” decoder, which has been interpreted to require exactly two decoder heads, while the Coarse Stage in TOM-Net includes three decoder heads. Therefore, the Coarse Stage in Chen does not fall within the scope of claim 10.
The Refinement Stage in TOM-Net does include exactly two decoder heads (e.g., Fig. 2, right). However, these Refinement Stage decoder heads do not decode feature maps of a combination of the input image and the accessed trimap of the image as recited in claim 10. Chen does not describe the trimap being input to the Refinement Stage. Furthermore, the encoded features of the combined image and trimap in the Coarse Stage are not input to the Refinement Stage. Instead, the encoded features are decoded into a coarse matte and the coarse matte is then input (along with the input image) to the Refinement Stage (e.g., Sec. 4.3, 2nd par.). As Chen explicitly describes the coarse matte as a decoding of encoded feature maps of a combination of the input image and its accessed trimap, the coarse matte would not have been considered by one of ordinary skill in the art to fall within the scope of encoded feature maps as recited in claim 10. Therefore, the Refinement Stage in Chen also does not fall within the scope of claim 10. The previous rejections of claims 10-17 are withdrawn.
Claim Interpretation
Claims are given their broadest reasonable interpretation (BRI) during examination. MPEP 2111. Under BRI, the words of a claim are given their plain meaning, unless such meaning is inconsistent with the specification. MPEP 2111.01, Subsection I. The plain meaning of a term is the ordinary and customary meaning given to the term by those of ordinary skill in the art at the relevant time. Id.
Claim 3 recites “utilizing a dual-head decoder” at the third line. The plain meaning of the adjective “dual” is of or pertaining to two and/or composed or consisting of two parts. See, for example, the cited definition from the Oxford English Dictionary. In the context of the claim as a whole, the plain meaning of the term “dual” requires a decoder with exactly two heads; that is, it excludes decoders with one head, three heads, four heads, or any other number of heads except two. This interpretation is not inconsistent with the specification, which describes dual-head decoders with exactly two decoder heads (e.g., Fig. 5). The same interpretation applies to the “dual-head decoder” of claim 10.
Claim Objections
Claim 1 is objected to because of the following informalities:
In claim 1, line 7, “properties that allows” should be “properties that allow”.
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 11-15 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 11 depends from claim 10.
Claim 10 has been amended to recite “generating a modified digital image depicting the transparent object within an additional digital image by modifying a portion of the additional digital image behind the transparent object utilizing the alpha matte and the refractive flow” (emphasis added).
Claim 11 recites “generating a modified digital image depicting the transparent object in a second digital image by modifying a background of the second digital image utilizing the refractive flow.”
Taken together, these limitations suggest that claim 11 requires generating two distinct modified digital images, one depicting the transparent object in an “additional” image and the other depicting the transparent object in a “second” image.
Examiner has reviewed the specification, but has not identified description of such an embodiment. Instead, the specification describes generating only one modified digital image (e.g., Figs. 3, 7, and 9).
Claim 11 lacks adequate written description under 35 U.S.C. 112(a) because it recites two modified digital images based on an additional image and a second image, respectively, but the specification describes generating only one modified digital image based on one additional/second image.
Claims 12-15 also lack adequate written description for substantially the same reasons.
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 11-15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claim 11 depends from claim 10 and recites several terms that are the same as, or similar to, terms in claim 10. See the summary table below. The claim is indefinite because it is unclear whether the same/similar terms refer to the same element or to separate and distinct elements. On the one hand, the claims use the articles “a” or “an” and, in some cases, different words (e.g., “additional” and “second”), which suggests that the claim elements are different. On the other hand, the words in claim 11 are either the same as or synonymous with the words used in claim 10, which suggests that they are the same.
Claims 12-15 are also indefinite for substantially the same reasons.
Summary of Unclear Same/Similar Terms

Term in claim 10 | Same/similar term in claim 11
a modified digital image | a modified digital image
an additional digital image | a second digital image
a portion of the additional digital image behind the transparent object | a background of the second digital image
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-2, 5-6, and 9 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by ‘Chen’ (“Learning Transparent Object Matting,” 2019).
Regarding claim 1, Chen discloses a method comprising:
accessing a trimap for a first digital image depicting a transparent object (While much of the description focuses on inputting only a first digital image depicting a transparent object [e.g., Figure 1, top-left], Chen also teaches inputting an additional trimap for that image – see Section 4.4, TOM-Net+Trimap);
generating, utilizing a transparency properties neural network (e.g., Fig. 2, at least the coarse stage of TOM-Net), an alpha matte (Sec. 3, equations 5-6 and supporting text, attenuation index ρ; e.g., Sec. 4.2, Attenuation regression loss, attenuation indices are predicted for each pixel, forming an attenuation mask A; An example is shown at Fig. 2, Coarse Stage, middle output; Note that this mapping is different from what was applied in the Final Rejection) and a refractive flow for the transparent object (e.g., Sec. 3, text between eqns. 5 and 6, refractive flow is predicted for transparent object; e.g., Sec. 4, 1st paragraph, CoarseNet in TOM-Net “predicts an object mask, an attenuation mask, and a refractive flow field” – emphasis added; Fig. 2, coarse stage, lower output is the refractive flow field) from the trimap and the first digital image (e.g., Sec. 4.4, TOM-Net+Trimap, “[t]he variant model, denoted as TOM-Net+Trimap, takes both the input image and trimap as input”), wherein the alpha matte comprises an alpha matte channel that indicates a graded level of transparency for each pixel value in the transparent object (e.g., Sec. 4.2, Attenuation regression loss, attenuation channel includes graded values in range of [0,1], with 0 indicating no transparency, 1 indicating full transparency, and graded values between 0 and 1 indicating levels of transparency);
wherein the transparent object comprises translucent properties that allows for a level of light to pass through the transparent object according to the graded level of transparency (e.g., Sec. 4.2, Attenuation regression loss, “The value of this mask is in the range of [0,1], where 0 indicates no light can pass and 1 indicates the light will not be attenuated”); and
generating a modified digital image depicting the transparent object within a second digital image (e.g., Fig. 1, bottom row shows examples of modified/composite images; Sec. 3, eqn. 6 describes the modification; The following mapping focuses on the second term of eqn. 6 – i.e., mρM(T, P)) by modifying a portion of a background of the second digital image behind the transparent object (i.e., modification of background image T through sampling M at locations given by P and multiplication with m and ρ) according to the alpha matte channel for each pixel value in the transparent object as indicated by the graded level of transparency in the alpha matte (i.e., ρ, which is a scalar value indicating a graded level of transparency/attenuation at a given pixel) and according to a level of displacement of pixel values of the transparent object as indicated by the refractive flow (Variable P represents the refractive flow, which indicates a level of displacement of pixel values of the transparent object – see, e.g., text between eqns. 4 and 6; P is used to provide locations for sampling M – see, e.g., eqns. 4-6 and related text).
Regarding claim 2, Chen discloses the method of claim 1, further comprising generating, utilizing an encoder of the transparency properties neural network (e.g., Fig. 2, Sec. 4.1, 2nd par., encoder of coarse stage, which includes six down-sampling convolutional blocks), feature maps of a combination of the first digital image depicting the transparent object and the trimap (e.g., Fig. 2, encoder in coarse stage generates feature maps that are passed to decoders; e.g., Sec. 4.4, TOM-Net+Trimap, “[t]he variant model, denoted as TOM-Net+Trimap, takes both the input image and trimap as input”).
Regarding claim 5, Chen discloses the method of claim 1, wherein generating the alpha matte and the refractive flow further comprises:
generating the alpha matte and the refractive flow by utilizing skip connections for encoders of the transparency properties neural network to corresponding decoders (e.g., Fig. 2, coarse stage, arrows connecting encoder blocks to corresponding decoder blocks; e.g., Sec. 4.1, 2nd par., “Features in the encoder layers are connected to the decoder layers having the same spatial dimensions through skip connections”) by:
generating the alpha matte utilizing a first activation layer of a first decoder branch (e.g., Fig. 2, coarse stage, each decoder branch includes 7 blocks, the first 6 of which are shaded orange in a color version of the reference to indicate that they include convolution [Conv], batch normalization [BN], ReLU activation [ReLU], and upsampling [Upsample] layers, and the 7th block is shaded green to indicate that it includes convolution, batch normalization, and ReLU activation layers; Any of the ReLU layers in the middle decoder branch that generates the alpha matte can be mapped to the “first activation layer”; For simplicity of explanation, the earliest [i.e., left-most] ReLU activation layer in the middle attenuation mask decoder branch is considered the “first” activation layer); and
generating the refractive flow utilizing a second activation layer of a second decoder branch (See explanation above; For simplicity of explanation, the last [i.e., right-most] ReLU activation layer in the bottom refractive flow decoder branch is considered the “second” activation layer), wherein the first activation layer is different than the second activation layer (As can be seen in Fig. 2, the first and second activation layers are “different” at least because they are distinct parts of TOM-Net and/or they have different sizes/dimensions).
Regarding claim 6, Chen discloses the method of claim 1, wherein:
generating the refractive flow comprises generating a two-channel vector (e.g., Sec. 4.2, Refractive flow regression loss, 1st sentence, “The predicted refractive flow field has a dimension of 2 × H × W, where we have one channel for the horizontal displacement and another for the vertical displacement”); and
generating the two-channel vector comprises generating a horizontal and vertical indication of displacing pixel values within the second digital image (e.g., Sec. 4.2, Refractive flow regression loss, 1st sentence, “The predicted refractive flow field has a dimension of 2 × H × W, where we have one channel for the horizontal displacement and another for the vertical displacement”).
Regarding claim 9, Chen discloses the method of claim 1, wherein generating the modified digital image depicting the transparent object within the second digital image further comprises compositing a warped background of the second digital image with a version of the transparent object modified by the alpha matte (e.g., Fig. 1, composited examples, where background is warped by transparent object; Also see, e.g., Sec. 3 and eqn. 6).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 7-8 are rejected under 35 U.S.C. 103 as being unpatentable over Chen as applied above, and further in view of ‘Pao’ (US 2020/0134834 A1).
Regarding claim 7, Chen teaches the method of claim 1.
Chen is focused on a compositing application where a transparent object is placed over a background (i.e., “second”) image (e.g., Fig. 1). As part of the compositing process, Chen teaches generating a scaled refractive flow by scaling the refractive flow (e.g., Sec. 6.4, Fig. 10c, right two images show rescaling to shrink and expand refractive flow, respectively). Nevertheless, Chen does not explicitly teach that the scaling is based on a determined dimension of the second digital image.
However, Pao does teach an example of another compositing application that includes determining a dimension of a second digital image and performing scaling based on that dimension (e.g., Fig. 2, width dimension of second/original image is greater than width of sky image [note that the sky is the object being composited onto the original image], so the sky region is upscaled to fit the second/original image’s width; also see, e.g., [0043] and [0045] et seq.).
As demonstrated by Pao, compositing may be performed using images of two different sizes/dimensions and scaling based on the dimensions of the images may be necessary to obtain a composited image with consistent scale.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the method of Chen with the second-image-dimension-based scaling of Pao in order to improve the method with the reasonable expectation that this would result in a method that could perform compositing with images of different dimensions while advantageously obtaining a composited image with consistent scale. This technique for improving the method of Chen was within the ordinary ability of one of ordinary skill in the art based on the teachings of Chen and Pao.
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Chen and Pao to obtain the invention as specified in claim 7.
Regarding claim 8, Chen in view of Pao teaches the method of claim 7, and Chen further teaches that generating the modified digital image depicting the transparent object within the second digital image further comprises utilizing the scaled refractive flow to remap pixel values in a background of the second digital image (e.g., Sec. 6.4, Fig. 10c, right two images, rescaled environment matte – which includes the refractive flow – is used to remap pixels in background based on refraction of the transparent object).
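As a purely illustrative aside, dimension-based rescaling of a flow field of the kind discussed for claims 7-8 could be sketched as follows; the function `scale_flow_to_background` and its details are hypothetical and are not taken from Chen or Pao:

```python
import numpy as np

def scale_flow_to_background(flow, bg_shape):
    # Hypothetical sketch: when compositing onto a second image whose
    # dimensions differ from the flow field's, resize the 2 x H x W flow
    # spatially (nearest neighbor, for simplicity) and rescale the
    # displacement magnitudes by the same factors, so that the refraction
    # keeps a consistent scale in the composited result.
    _, H, W = flow.shape
    H2, W2 = bg_shape
    ys = np.arange(H2) * H // H2          # nearest-neighbor row indices
    xs = np.arange(W2) * W // W2          # nearest-neighbor column indices
    resized = flow[:, ys[:, None], xs[None, :]].astype(np.float64)
    resized[0] *= W2 / W                  # horizontal displacements scale with width
    resized[1] *= H2 / H                  # vertical displacements scale with height
    return resized
```

Doubling the background dimensions, for example, doubles both the spatial extent of the field and the magnitude of each displacement.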
Allowable Subject Matter
Claims 3 and 4 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Claims 10 and 16-20 are allowed.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEOFFREY E SUMMERS whose telephone number is (571)272-9915. The examiner can normally be reached Monday-Friday, 7:00 AM to 3:30 PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park can be reached at (571) 272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/GEOFFREY E SUMMERS/Examiner, Art Unit 2669