DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claim Objections
Claims 4-6 and 14-16 are objected to because of the following informalities:
Claim 4 appears to contain a grammatical/typographical error, reciting in line 4 “rays in each batch if images” (where “rays in each batch of images” is assumed to be the intended language).
Claims 5-6 are objected to as incorporating the same language by reference.
Claim 14 appears to contain a grammatical/typographical error, reciting in line 4 “rays in each batch if images” (where “rays in each batch of images” is assumed to be the intended language).
Claims 15-16 are objected to as incorporating the same language by reference.
Appropriate correction is required.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are:
“an image capture device” (US PG-Pub. Spec. ¶18 – camera) and “a renderer” (US PG-Pub. Spec. ¶35, processing unit performing operations) in claim 1, and incorporated by reference into claims 2-10.
“an image capture device”, “a renderer” and “a data storage device” (US PG-Pub. Spec. ¶70: memory) in claim 20.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 4-8 and 14-18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Regarding claim 4, the claim recites an equation (reproduced in the claim as an image, media_image1.png), in which the expression shown in media_image2.png is unclear as to its intended meaning. The specification provides no clarification of the meaning of the term. Applicant’s specification, ¶71, discloses the same equation and clarifies the term shown in media_image3.png, but does not clarify the term shown in media_image2.png, and the remaining specification does not further clarify the meaning of the term. As such, the claim is rejected as indefinite.
Claims 5-6 depend from claim 4 and are rejected based on the same rationale.
Claims 14-16 include the same indefinite term as recited in claim 4 and are therefore rejected based on the same rationale as claim 4 set forth above.
Regarding claim 7, the claim recites “the binary cross entropy (L),” which lacks antecedent basis. Furthermore, the claim recites “recalculates,” which is unclear because no initial calculation is previously recited, and it is therefore unclear what the first calculation is intended to apply to. Claim 8 depends from claim 7 and incorporates the same indefinite language.
Regarding claim 17, the claim recites “the binary cross entropy (L),” which lacks antecedent basis. Furthermore, the claim recites “recalculates,” which is unclear because no initial calculation is previously recited, and it is therefore unclear what the first calculation is intended to apply to. Claim 18 depends from claim 17 and incorporates the same indefinite language.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claim(s) 1, 9, 11 and 19 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by
Chapman et al. (US 2020/0043227 A1).
Regarding claim 1, Chapman discloses:
An image conversion device (Chapman, Fig. 1 and ¶25 computer system for generating a 3D model of an object using a series of 2D images), comprising:
An image capture device configured to capture a plurality of two-dimensional (2D) images (Chapman, Figs. 2A-2C and ¶30: two-dimensional source images of an object, e.g. bridge; ¶33: images captured by a single camera or other suitable image capture device from multiple different angles and vantage points of a target object); and
A renderer configured to receive the 2D images and to render a three-dimensional (3D) model of an object captured in the 2D images, (Chapman, ¶34: the 3D point generation module (e.g., the 3D point generation module 124 illustrated in FIG. 1) uses the source images to generate the three-dimensional point cloud, where photogrammetry techniques can match pixels across images to identify edges and features of an object)
wherein in rendering the 3D model, the renderer first renders a binary edge map of the object (Chapman, ¶29: image pre-processing module 122 identifies pixels for exclusion in the source images 152; Fig. 2B and ¶31: mark image rejections for exclusion; ¶34: pixels marked for exclusion in the source images are ignored and not used when generating the three-dimensional point cloud; ¶37 discloses technique for identifying pixels for exclusion, such as color, etc.; ¶43: a two dimensional binary array (or table) could be initialized to represent each pixel in the array, where a “high” value (e.g., 1) could be set in the cell corresponding to pixels that should be ignored, and a “low” value (e.g., 0) could be set in the cell corresponding to the pixels that should be used.), and
next models textures for the 3D model (Chapman, ¶32: generate textures for the 3D point cloud; Fig. 3 and ¶34: pixels marked for exclusion in source images – block 304 – followed by generating 3D point cloud – block 306 – followed by texture 3D model – block 308)
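For context only, the following is a minimal sketch, not taken from Chapman, of the kind of per-pixel two-dimensional binary exclusion array described in Chapman ¶43, in which a “high” value (1) marks a pixel to be ignored and a “low” value (0) marks a pixel to be used; the color-distance criterion and all names are hypothetical stand-ins for the color-based exclusion technique Chapman ¶37 mentions:

import numpy as np

def build_exclusion_mask(image, excluded_color, tolerance=10):
    # Illustrative only: 1 = ignore pixel, 0 = use pixel (cf. Chapman ¶43).
    # The color-distance test below is an assumed example of the color-based
    # exclusion criteria referenced in Chapman ¶37, not code from the reference.
    distance = np.linalg.norm(image.astype(np.int32) - np.asarray(excluded_color), axis=-1)
    return (distance <= tolerance).astype(np.uint8)

# Example: mark near-white (e.g., sky) pixels of a dummy 4x4 image for exclusion.
image = np.full((4, 4, 3), 250, dtype=np.uint8)
image[1:3, 1:3] = (120, 90, 60)  # pixels standing in for the target object
print(build_exclusion_mask(image, excluded_color=(255, 255, 255), tolerance=20))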
Regarding claim 11, the device of claim 1 is substantially a device embodiment of the claimed method and as such claim 11 is rejected based on the same rationale as claim 1 set forth above.
Regarding claim 9, Chapman further discloses:
Wherein, in modeling the textures for the 3D model, the renderer further selects at least one Region of Interest for the object (Chapman, ¶35: At block 308, the 3D texturing module (e.g., the 3D texturing module 126 illustrated in FIG. 1) adds textures to the three-dimensional point cloud generated at block 306. These textures can be added using the source images, while still ignoring the pixels identified for exclusion at block 304. – i.e. regions of interest of the object)
Regarding claim 19, the device of claim 9 is substantially a device embodiment of the claimed method and as such claim 19 is rejected based on the same rationale as claim 9 set forth above.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 2, 10, 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over:
Chapman et al. (US 2020/0043227 A1) in view of
Poulson et al. (US 2013/0121551 A1).
Regarding claim 2, the limitations included from claim 1 are rejected based on the same rationale as the rejection of claim 1 set forth above. Further regarding claim 2, Chapman does not explicitly disclose that the binary edge map includes the contours and edges of the object (Fig. 2B appears to show as much, but is not explicit on this point).
Poulson discloses:
Wherein the binary edge map includes contours and edges of the object (Poulson, ¶30 discloses determining binary images of the marker for respective ones of the projection images, using threshold-based segmentation to determine a set of pixels in each two-dimensional projection image that are to be included in a binary image of the markers – see Fig. 3A, explained in ¶32 as binary images of the marker; Figs. 3A and 3B show binary maps having contours and edges of the object)
Both Chapman and Poulson are directed to 2D to 3D reconstruction using binary mapping of image data from different viewpoints. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success to modify the system and technique of generating 3D models using binary mapping data as provided by Chapman, incorporating the defined object region including a complete outline of the object in the binary mapping as provided by Poulson, using known electronic interfacing and programming techniques. The modification merely substitutes one known type of binary map of data for another, yielding predictable results of using a mapped outline of an object for generating the 3D images using binary maps from the 2D images. The modification also allows for an improved system for more rapidly reconstructing the object in 3D by better isolating the target object with complete boundary data, better focusing and targeting the system processing for more efficient and faster processing, while also allowing for a more targeted reconstruction of the relevant objects for study.
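For illustration of the mapped feature only, and not as part of either reference’s disclosure, the following minimal sketch applies threshold-based segmentation of the kind Poulson ¶30 describes and then marks the boundary pixels of the segmented region, so that the resulting binary map carries the object’s contours and edges; the function name and the 4-neighbor boundary test are assumptions made for illustration:

import numpy as np

def binary_edge_map(gray, threshold):
    # Threshold-based segmentation into a binary mask (cf. Poulson ¶30), then
    # mark mask pixels having at least one 4-neighbor outside the mask.
    mask = (gray >= threshold).astype(np.uint8)
    padded = np.pad(mask, 1, mode="constant")
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    edges = mask & (1 - interior)  # 1s trace the object's outline
    return mask, edges

gray = np.zeros((6, 6), dtype=np.uint8)
gray[1:5, 1:5] = 200  # bright square standing in for the object
mask, edges = binary_edge_map(gray, threshold=128)
print(edges)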
Regarding claim 12, the device of claim 2 is substantially a device embodiment of the claimed method and as such claim 12 is rejected based on the same rationale as claim 2 set forth above.
Regarding claim 10, Chapman modified by Poulson further discloses:
A data storage device including a repository to store the 3D model (Chapman, ¶23: computing resources including storage; ¶24: the modules could pre-process the images and generate the three-dimensional model and store the pre-processed images (or related data) and the three-dimensional model at a storage location in the cloud)
The only limitation not explicitly disclosed by Chapman is that the image conversion device comprises a data storage device including a repository to store the 3D model.
Poulson discloses:
the image conversion device comprises a data storage device including a repository to store the 3D model (Poulson, ¶39: the three-dimensional model 550 may be stored in a non-transitory medium for future use (e.g., processing); ¶68: computing system includes memory and storage device for storing information)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention and with a reasonable expectation of success to modify the system of generating the 3D model for storing as provided by Chapman, on computer architecture that includes a data repository for storing as provided by Poulson, using known electronic interfacing and programming techniques. The modification merely substitutes one known computer architecture for another, yielding predictable results of using a system including the storage. The modification also allows for an improved device that can be self-contained for easier or more versatile implementation or for limiting network access for privacy, security, etc.
Claim(s) 3 and 13 is/are rejected under 35 U.S.C. 103 as being unpatentable over:
Chapman et al. (US 2020/0043227 A1) in view of
Poulson et al. (US 2013/0121551 A1) and in further view of
Zhao et al. (US 10,140,544 B1).
Regarding claim 3, the limitations included from claim 2 are rejected based on the same rationale as the rejection of claim 2 set forth above. Further regarding claim 3, Zhao discloses:
Wherein in rendering the binary edge map, the renderer minimizes a binary cross entropy (L) from the images (Zhao, Abstract: image segmentation and region of interest identification using a predictive model trained based on a deep fully convolutional neural network, trained using a loss function; [5:30-42]: Each pixel of the mask may contain a value used to denote whether a particular corresponding pixel of the digital image is among any ROI, where, if there is only a single type of ROI, a binary mask is sufficient to represent all ROIs – see Fig. 1; [9:8-14]: an input image 221 may then be processed through the contraction CNN 201 and expansion CNN 203 to generate a predicted segmentation mask 264; [9:15-10:2] discloses the DCNN training process, where the output segmentation mask is a result of determining a loss function and a series of enhancements to the DCNN using intermediate loss functions; [11:10-34]: the stochastic gradient descent back-propagation may be used to minimize the sum of i) cross-entropy loss at the final stage (270 of FIG. 2) and ii) multi-instance loss L.sub.MIL (370 of FIG. 3) at intermediate stages of the DCNN, where, by minimizing Ln, activation scores of Ha from all features or kernels are minimized so that features or kernels only give positive scores for ROI patterns; [11:51-52] discloses cross-entropy loss computed as segmentation error for each individual image sample; and [11:63-66]: generator 401 may be the DCNN model for image segmentation – see Fig. 4)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success to modify the system and technique of generating 3D models using binary mapping data as provided by Chapman, incorporating the defined object region including a complete outline of the object in the binary mapping as provided by Poulson, by further utilizing common machine learning image segmentation for binary masking as provided by Zhao, using known electronic interfacing and programming techniques. The modification merely substitutes one known type of image binary segmentation technique for another yielding predictable results of using machine learning to obtain a binary image mask for use in a system that uses binary mask data from 2D images for further processing to generate 3D models. The modification results in an improved 3D reconstruction based on segmentation of 2D image data by using a more robust automated machine learning model that results in an improved binary image segmentation for more accurate data extracted from a 2D image to provide improved 3D generation.
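For background on the claimed quantity only, the binary cross entropy of a predicted per-pixel probability map against a binary ground-truth map is conventionally L = -(1/N) Σ [y_i·log(p_i) + (1 - y_i)·log(1 - p_i)], where y_i is the ground-truth label (0 or 1) for pixel i and p_i is the predicted probability. The sketch below computes that conventional expression and is not asserted to be the particular formula recited in the claims or used by Zhao:

import numpy as np

def binary_cross_entropy(p, y, eps=1e-7):
    # Conventional per-pixel binary cross entropy; clipping avoids log(0).
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

# Predicted edge-map probabilities vs. a ground-truth binary edge map.
p = np.array([0.9, 0.2, 0.7, 0.1])
y = np.array([1.0, 0.0, 1.0, 0.0])
print(binary_cross_entropy(p, y))  # small when predictions match the labels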
Regarding claim 13, the device of claim 3 is substantially a device embodiment of the claimed method and as such claim 13 is rejected based on the same rationale as claim 3 set forth above.
Claim(s) 7 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over:
Chapman et al. (US 2020/0043227 A1) in view of
Poulson et al. (US 2013/0121551 A1) and in further view of
Cheng et al. (US 2024/0029354 A1).
Regarding claim 7, the limitations included from claim 2 are rejected based on the same rationale as the rejection of claim 2 set forth above. Further regarding claim 7, Cheng discloses:
Wherein, in modeling the textures for the 3D model, the renderer further recalculates the binary cross entropy (L) (Cheng, Abstract: Systems and techniques are provided for generating a texture for a three-dimensional (3D) facial model; ¶29 discusses reconstruction of 3D object from one or more images; ¶34: 3D model generator uses input frames to generate 3D facial model, and 3D model fitting engine 206 generates/applies a texture to the underlying 3D model; ¶42: the machine learning model can be trained to generate local textures for portions of the face such as the eyes and the mouth that can be combined with a full facial texture for the 3DMM generated by the 3D model fitting engine 306; ¶45: UV texture map generated using machine learning model / neural network; ¶49: a loss function can be used to penalize errors in the output local UV textures 473A, 473B, 473C generated by the VAE (e.g., the encoder 412, latent code 414, and decoder 420) of the non-oblique branch 410, e.g. the oblique branch can be trained with a reconstruction loss function Lrec/Eye/Mouth as in equation 1; ¶81 discloses use of binary cross-loss function for GAN generating facial textures; ¶¶82-85 provide further loss functions; ¶92: the process 600 includes generating a full facial texture associated with the 3D facial model)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success to modify the system and technique of generating 3D models using binary mapping data as provided by Chapman, incorporating the defined object region including a complete outline of the object in the binary mapping as provided by Poulson, by further utilizing common machine learning for texturing a 3D model as provided by Cheng, using known electronic interfacing and programming techniques. The modification merely substitutes one known technique for applying texture to a 3D model for another yielding predictable results of using a machine learning model to perform texturing to an object in place of other algorithmic approaches. The modification results in an improved 3D reconstruction by using a more robust automated machine learning model that results in an improved and more robust texturing modification of applying textures to a 3D model, for more accurate results and less cumbersome intervention by a user.
Regarding claim 17, the device of claim 7 is substantially a device embodiment of the claimed method and as such claim 17 is rejected based on the same rationale as claim 7 set forth above.
Claim(s) 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over:
Chapman et al. (US 2020/0043227 A1) in view of
Poulson et al. (US 2013/0121551 A1) and
Zhao et al. (US 10,140,544 B1) in further view of
Cheng et al. (US 2024/0029354 A1).
Regarding claim 20, Chapman discloses:
An image conversion device (Chapman, Fig. 1 and ¶25 computer system for generating a 3D model of an object using a series of 2D images), comprising:
An image capture device configured to capture a plurality of two-dimensional (2D) images (Chapman, Figs. 2A-2C and ¶30: two-dimensional source images of an object, e.g. bridge; ¶33: images captured by a single camera or other suitable image capture device from multiple different angles and vantage points of a target object); and
A renderer configured to receive the 2D images and to render a three-dimensional (3D) model of an object captured in the 2D images, (Chapman, ¶34: the 3D point generation module (e.g., the 3D point generation module 124 illustrated in FIG. 1) uses the source images to generate the three-dimensional point cloud, where photogrammetry techniques can match pixels across images to identify edges and features of an object)
wherein in rendering the 3D model, the renderer first renders a binary edge map of the object (Chapman, ¶29: image pre-processing module 122 identifies pixels for exclusion in the source images 152; Fig. 2B and ¶31: mark image rejections for exclusion; ¶34: pixels marked for exclusion in the source images are ignored and not used when generating the three-dimensional point cloud; ¶37 discloses technique for identifying pixels for exclusion, such as color, etc.; ¶43: a two dimensional binary array (or table) could be initialized to represent each pixel in the array, where a “high” value (e.g., 1) could be set in the cell corresponding to pixels that should be ignored, and a “low” value (e.g., 0) could be set in the cell corresponding to the pixels that should be used.), and
next models textures for the 3D model (Chapman, ¶32: generate textures for the 3D point cloud; Fig. 3 and ¶34: pixels marked for exclusion in source images – block 304 – followed by generating 3D point cloud – block 306 – followed by texture 3D model – block 308)
A data storage device including a repository to store the 3D model (Chapman, ¶23: computing resources including storage; ¶24: the modules could pre-process the images and generate the three-dimensional model and store the pre-processed images (or related data) and the three-dimensional model at a storage location in the cloud)
Chapman does not explicitly disclose that the image conversion device comprises a data storage device including a repository to store the 3D model.
Poulson discloses:
the image conversion device comprises a data storage device including a repository to store the 3D model (Poulson, ¶39: the three-dimensional model 550 may be stored in a non-transitory medium for future use (e.g., processing); ¶68: computing system includes memory and storage device for storing information)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention and with a reasonable expectation of success to modify the system of generating the 3D model for storing as provided by Chapman, on computer architecture that includes a data repository for storing as provided by Poulson, using known electronic interfacing and programming techniques. The modification merely substitutes one known computer architecture for another, yielding predictable results of using a system including the storage. The modification also allows for an improved device that can be self-contained for easier or more versatile implementation or for limiting network access for privacy, security, etc.
Zhao discloses:
Wherein in rendering the binary edge map, the renderer minimizes a binary cross entropy (L) from the images (Zhao, Abstract: image segmentation and region of interest identification using a predictive model trained based on a deep fully convolutional neural network, trained using a loss function; [5:30-42]: Each pixel of the mask may contain a value used to denote whether a particular corresponding pixel of the digital image is among any ROI, where, if there is only a single type of ROI, a binary mask is sufficient to represent all ROIs – see Fig. 1; [9:8-14]: an input image 221 may then be processed through the contraction CNN 201 and expansion CNN 203 to generate a predicted segmentation mask 264; [9:15-10:2] discloses the DCNN training process, where the output segmentation mask is a result of determining a loss function and a series of enhancements to the DCNN using intermediate loss functions; [11:10-34]: the stochastic gradient descent back-propagation may be used to minimize the sum of i) cross-entropy loss at the final stage (270 of FIG. 2) and ii) multi-instance loss L.sub.MIL (370 of FIG. 3) at intermediate stages of the DCNN, where, by minimizing Ln, activation scores of Ha from all features or kernels are minimized so that features or kernels only give positive scores for ROI patterns; [11:51-52] discloses cross-entropy loss computed as segmentation error for each individual image sample; and [11:63-66]: generator 401 may be the DCNN model for image segmentation – see Fig. 4)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success to modify the system and technique of generating 3D models using binary mapping data as provided by Chapman, incorporating the defined object region including a complete outline of the object in the binary mapping as provided by Poulson, by further utilizing common machine learning image segmentation for binary masking as provided by Zhao, using known electronic interfacing and programming techniques. The modification merely substitutes one known type of image binary segmentation technique for another yielding predictable results of using machine learning to obtain a binary image mask for use in a system that uses binary mask data from 2D images for further processing to generate 3D models. The modification results in an improved 3D reconstruction based on segmentation of 2D image data by using a more robust automated machine learning model that results in an improved binary image segmentation for more accurate data extracted from a 2D image to provide improved 3D generation.
Cheng discloses:
Wherein, in modeling the textures for the 3D model, the renderer further recalculates the binary cross entropy (L) (Cheng, Abstract: Systems and techniques are provided for generating a texture for a three-dimensional (3D) facial model; ¶29 discusses reconstruction of 3D object from one or more images; ¶34: 3D model generator uses input frames to generate 3D facial model, and 3D model fitting engine 206 generates/applies a texture to the underlying 3D model; ¶42: the machine learning model can be trained to generate local textures for portions of the face such as the eyes and the mouth that can be combined with a full facial texture for the 3DMM generated by the 3D model fitting engine 306; ¶45: UV texture map generated using machine learning model / neural network; ¶49: a loss function can be used to penalize errors in the output local UV textures 473A, 473B, 473C generated by the VAE (e.g., the encoder 412, latent code 414, and decoder 420) of the non-oblique branch 410, e.g. the oblique branch can be trained with a reconstruction loss function Lrec/Eye/Mouth as in equation 1; ¶81 discloses use of binary cross-loss function for GAN generating facial textures; ¶¶82-85 provide further loss functions; ¶92: the process 600 includes generating a full facial texture associated with the 3D facial model)
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and with a reasonable expectation of success to modify the system and technique of generating 3D models using binary mapping data as provided by Chapman, incorporating the defined object region including a complete outline of the object in the binary mapping as provided by Poulson, utilizing common machine learning image segmentation for binary masking as provided by Zhao, by further utilizing common machine learning for texturing a 3D model as provided by Cheng, using known electronic interfacing and programming techniques. The modification merely substitutes one known technique for applying texture to a 3D model for another yielding predictable results of using a machine learning model to perform texturing to an object in place of other algorithmic approaches. The modification results in an improved 3D reconstruction by merely laying known machine learning calculations to include texture processing, using a more robust automated machine learning model that results in an improved and more robust texturing modification of applying textures to a 3D model, for more accurate results and less cumbersome intervention by a user.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILLIAM A BEUTEL whose telephone number is (571)272-3132. The examiner can normally be reached Monday-Friday 9:00 AM - 5:00 PM (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, DANIEL HAJNIK can be reached at 571-272-7642. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/WILLIAM A BEUTEL/Primary Examiner, Art Unit 2616