DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 03/13/2024, 09/27/2024, 11/04/2025, and 01/21/2026 are in compliance with the provisions of 37 CFR 1.97. Accordingly, they are being considered by the examiner.
Drawings
The drawings are objected to because Figure 2 appears to show a graph; however, the graph lacks axes. The graph appears to refer to depth values and the associated characterizations, but its meaning remains unclear even in view of the description in Paragraphs 36-38 of the Specification. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Specification
Applicant is reminded of the proper language and format for an abstract of the disclosure.
The abstract should be in narrative form and generally limited to a single paragraph on a separate sheet within the range of 50 to 150 words in length. The abstract should describe the disclosure sufficiently to assist readers in deciding whether there is a need for consulting the full patent text for details.
The language should be clear and concise and should not repeat information given in the title. It should avoid using phrases which can be implied, such as, “The disclosure concerns,” “The disclosure defined by this invention,” “The disclosure describes,” etc. In addition, the form and legal phraseology often used in patent claims, such as “means” and “said,” should be avoided.
Claim Objections
Claims 1, 11 and 18 are objected to because of the following informalities: these claims recite “that corresponds to the face; and;”. The “and” should be removed as it does not correspond to the final limitation of the claim. Appropriate correction is required.
Claims 5 and 15 are objected to because of the following informalities: these claims recite “an associated groundtruth blur radius image; and for each training image”. The “and” should be removed as it does not correspond to the final limitation of the claim. Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 (and similarly 11 and 18) recites “wherein the focal table includes parameters that indicate a focal range and at least one of a front slope or a back slope;”. The focal range, front slope, and back slope are insufficiently described and are therefore indefinite. The Examiner cannot determine the scope of these terms. Even in view of the description in the Specification (Paragraphs 36-38) and Figures 2a-c, the meaning of these elements remains unclear (see drawing objections above). Paragraph 62 somewhat explains the terms as indicating depth values and respective blur kernel radii, but it is unclear how these relate to a slope. The focal range, front slope, and back slope are being interpreted as ranges of distance values correlated to the region of interest, distance values correlated to the region in front of the region of interest, and distance values correlated to the region behind the region of interest, respectively.
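For illustration only, the interpretation set forth above can be expressed as a piecewise mapping from depth to blur radius. The following sketch is hypothetical; the parameter names and the piecewise-linear form are the Examiner's illustrative assumptions and are not drawn from the Specification:

```python
# Hypothetical sketch of the examiner's interpretation of the "focal table":
# depths inside the focal range map to zero blur, depths in front of the
# range ramp up along the front slope, and depths behind it ramp up along
# the back slope. All names and the linear form are illustrative assumptions.
def blur_radius(depth, focal_near, focal_far, front_slope, back_slope):
    if depth < focal_near:
        # foreground: region in front of the region of interest
        return front_slope * (focal_near - depth)
    if depth > focal_far:
        # background: region behind the region of interest
        return back_slope * (depth - focal_far)
    return 0.0  # in-focus region (the focal range)
```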
Claim 1 (and similarly 11 and 18) recites “adjusting the focal table to include each of the face bounding boxes”. The nature of the including step is unclear, and the Examiner cannot determine the scope of the limitation. The focal table is merely data that describes parameters, as indicated in the claim. It is unclear how it could include bounding boxes. This is being interpreted as including facial regions in particular focal regions (e.g., the in-focus region, background, or foreground).
Claims 2-10, 12-17, and 19 are rejected as dependent on the above claims.
Claim 7 (and similarly claim 17) recites, “weighting the loss value by image gradient of the training image”. The Examiner cannot determine the scope of this limitation. It is unclear how the weighting is done by image gradient, what the image gradient is, and how the loss value is actually weighted. Because the scope cannot be determined or reasonably assumed for these claims, they are not rejected over the prior art.
Claim 8 recites, “wherein the image does not include information about focus and depth.” Firstly, all images include information about focus and depth. All images have a resolution, which amounts to focus information. Additionally, an image may show an occluding object, lines that converge towards a vanishing point, and relative sizes of objects, all of which denote depth. Secondly, under claim 1, every image acquires information about depth and focus, because claim 1 estimates a depth map and generates a focal table for the image. This limitation is being interpreted such that the image does not include a depth map or a focal table prior to their determination in the claims.
Claim 9 recites, “an image captured using a camera that does not store focus and depth information”. As explained above, all images include some kind of information about focus and depth. All images have a resolution, which amounts to focus information. Additionally, an image may show an occluding object, lines that converge towards a vanishing point, and relative sizes of objects, all of which denote depth. Any camera that captures and saves an image therefore stores focus and depth information. This limitation is being interpreted such that the camera itself does not generate the claimed depth map or focal table prior to their generation through the steps of the claim.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claim(s) 1, 3, 4, 8-11, 13, 14, 18, and 20 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Gong (Using Depth Mapping to realize Bokeh effect with a single camera Android device).
Regarding claim 1, Gong teaches “A computer-implemented method comprising: estimating depth for an image to obtain a depth map that indicates depth for each pixel of the image;” (Gong, Figure 5 and Page 3 last paragraph and onto page 4, “So far we have discussed obtaining the defocus map. The defocus blur is directly related to the size of the circle of confusion (σ_db = k*c). Combining with eq. (1) one can obtain the depth map (distance map) of each pixel in the image provided the camera parameter values are known. Due to focal plan ambiguity, it is usually assumed that all defocussed pixel lie on one side of the focal plane during depth estimation. In this project we simply use the defocus map as a proxy for depth map when realizing.” Figure 5 shows the depth map for the image.)
“generating a focal table for the image, wherein the focal table includes parameters that indicate a focal range and at least one of a front slope or a back slope;” (Gong, Figure 6b, Page 4 first Paragraph of section 3.2.1, and Page 8 Paragraph 1, “After depth map is available, images can be segmented into different segmentations. Theoretically, the more segmentation we have, the better Bokeh effects we have. However, the resolution of the depth map is not so high that there are a lot of unsmooth transition between different segmentations. To achieve Bokeh effects, only 2 segments are applied here, resulting in a foreground and background. Then the binary masks of the foreground and background are obtained. The original RBG image is blurred using a Gaussian kernel. The binary mask of the background will be applied on the blurred RBG image and the binary mask of the foreground will be applied on the original RBG image. Finally, the two images are added together to produce the image with Bokeh effect.”; “Here, we are using Figure 6a) as the input image for Bokeh post-processing. Figure 6b) displays the foreground and background that are segmented based on the depth map. Then face detection is applied on this image, and Figure 7 a) shows the results of the face detection from Matlab. Then, the position and size of the bounding box is adjusted to include the hair, as shown in Figure 7b). Finally, the Bokeh image with the correction using face detection is shown in Figure 7c). As you can see, features on the faces like the eyebrow, jaw and part of the hair are not blurred anymore.” The introduction further recites, “A depth map (an estimate of depth at each pixel in the photo) is used to identify portions of the image that are far away and belong to the background and therefore apply a digital blur to the background.” This reference discloses a mapping that separates foreground (focal range) from background (back slope) and applies selective blurring based on depth (background is blurred). This mapping of depth to blur amounts to the claimed ‘focal table’ as interpreted under broadest reasonable interpretation. An illustrative sketch of this two-segment pipeline appears after the rejection of claim 1 below.)
“determining if one or more faces are detected in the image; if it is determined that one or more faces are detected in the image, identifying a respective face bounding box for each face of the one or more faces, wherein the respective face bounding box includes a region of the image that corresponds to the face; and adjusting the focal table to include each of the face bounding boxes;” (Gong, Figures 7a-c and Page 8 Paragraph 1, “Here, we are using Figure 6a) as the input image for Bokeh post-processing. Figure 6b) displays the foreground and background that are segmented based on the depth map. Then face detection is applied on this image, and Figure 7 a) shows the results of the face detection from Matlab. Then, the position and size of the bounding box is adjusted to include the hair, as shown in Figure 7b). Finally, the Bokeh image with the correction using face detection is shown in Figure 7c). As you can see, features on the faces like the eyebrow, jaw and part of the hair are not blurred anymore.” The focal table adjustment is mapped to the face protection wherein focal table parameters are adjusted to include face in the non-blurred foreground.)
“if it is determined that no faces are detected in the image, scaling the focal table;” (Gong, Page 1, Introduction, Paragraph 2, and Page 8 Paragraph 3, “Bokeh effect is usually achieved in high end SLR cameras using portrait lenses that are relatively large in size and have a shallow depth of field. It is extremely difficult to achieve the same effect (physically) in smart phones which have miniaturized camera lenses and sensors. However, the latest iPhone 7 has a portrait mode which can produce Bokeh effect thanks to the dual cameras configuration. To compete with iPhone 7, Google recently also announced that the latest Google Pixel Phone can take photos with Bokeh effect, which would be achieved by taking 2 photos at different depths to camera and combining then via software. There is a gap that neither of two biggest players can achieve Bokeh effect only using a single image from a single smartphone camera. In this project we seek to fill this gap. We are planning to produce a bokeh effect with photos taken using an Android device by post processing the photos within the Android device. The photos can also come from Photo Stream. The new images with Bokeh can be saved into Photo Stream. At the core of Bokeh effect production in smartphone photography is depth mapping. A depth map (an estimate of depth at each pixel in the photo) is used to identify portions of the image that are far away and belong to the background and therefore apply a digital blur to the background. As mentioned early, Bokeh effect is typically present in portrait photos. For non-portrait photos, we are seeking to apply cartoon effect on the foreground as a way of artistic enhancement. What’s more, other ways of artistic enhancement like changing the background of images can also be performed based on the depth map.”; “We apply bilateral filtering on the input image (fig. 8a) to decompose the image into a cartoon-like component (fig. 3b). Strong edges of the input images (fig. 8c) is obtained by using Canny method. Cartoon-like effect can be achieved after combining the above two images. In our case, we only apply the cartoon effect on the foreground so that the portrait in the image is cartooned while preserving the background.” Therefore, when there is no face detected (non-portrait photos), bilateral filtering is performed. Bilateral filtering includes a blurring step, which amounts to a scaling of the focal table, because the focal table amounts to a relationship between depth and blur.)
“and applying blur to the image using the focal table and the depth map to generate an output image, wherein the output image includes an in-focus region and one or more blurred regions.” (Gong, Figure 7c shows the blurred image using focal table (which is based on depth map) with in-focus person and blurred background. Note that as currently written, “the focal table” corresponds to the original focal table that is generated, rather than the adjusted or scaled focal table. Gong teaches both the use of the generated focal table and the use of the adjusted focal table for the blurring, because the adjusted focal table is created using the generated focal table.)
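For illustration only, the two-segment pipeline quoted from Gong above (binary foreground/background segmentation of the depth map, Gaussian blur of the background, and composition of the two masked images) can be sketched as follows. The threshold, the kernel width, and all names are the Examiner's illustrative assumptions and do not reproduce Gong's Matlab implementation:

```python
# Illustrative sketch of the pipeline described in the quoted passages of
# Gong; the threshold and sigma values are assumptions, not taken from Gong.
import numpy as np
from scipy.ndimage import gaussian_filter

def bokeh_two_segment(image, depth_map, fg_threshold, sigma=5.0):
    """image: HxWx3 float array; depth_map: HxW float array."""
    foreground = depth_map < fg_threshold             # binary foreground mask
    blurred = gaussian_filter(image, sigma=(sigma, sigma, 0))  # blur spatial axes only
    mask = foreground[..., np.newaxis]
    # original pixels in the foreground, blurred pixels in the background
    return np.where(mask, image, blurred)
```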
Regarding claim 3, Gong teaches “The computer-implemented method of claim 1,”
“wherein the focal table excludes the front slope when there are no foreground regions in the image that are in front of an image subject and excludes the back slope if there are no background regions in the image behind an image subject.” (Firstly, the underlined limitation refers to the non-selected additional embodiment. Secondly, the claim merely describes not setting a background area where there is no background. This is an inherent property of segmenting a foreground and background region, as is performed by Gong as recited in the rejection of claim 1.)
Regarding claim 4, Gong teaches “The computer-implemented method of claim 1,”
“wherein the in-focus region in the output image includes pixels that are associated with depth values in the depth map that correspond to a blur radius of zero.” (Gong, Figures 7a-c and Page 8 Paragraph 1, “Here, we are using Figure 6a) as the input image for Bokeh post-processing. Figure 6b) displays the foreground and background that are segmented based on the depth map. Then face detection is applied on this image, and Figure 7 a) shows the results of the face detection from Matlab. Then, the position and size of the bounding box is adjusted to include the hair, as shown in Figure 7b). Finally, the Bokeh image with the correction using face detection is shown in Figure 7c). As you can see, features on the faces like the eyebrow, jaw and part of the hair are not blurred anymore.” Note that the foreground and background are segmented based on depth, therefore each region corresponds to depth values. Additionally, figure 7c shows the unblurred in-focus foreground region. Therefore the pixels in the in-focus region are associated with depths without any blur.)
Regarding claim 8, Gong teaches “The computer-implemented method of claim 1,”
“wherein the image does not include information about focus and depth.” (The depth map is calculated and the focal table is generated by Gong as outlined in the rejection of claim 1. The image therefore does not initially have either the depth map or table. See interpretation under 35 USC 112(b).)
Regarding claim 9, Gong teaches “The computer-implemented method of claim 1,”
“wherein the image is a scanned photograph, an image stripped of metadata, an image captured using a camera that does not store focus and depth information, or a frame of a video.” (As interpreted under 35 USC 112(b), the underlined limitation amounts to an image captured by a camera wherein the camera itself does not generate the claimed depth map or focal table immediately prior to their generation through the steps of the claim. Gong discloses image capture by mobile phones and tablets, which do not automatically generate the claimed depth map and focal table prior to the disclosed image processing steps. See Page 4 Section “Information Flow”, “We built an Android app for this project in order to deploy our application on mobile phones and tablets. The app provides an interface for the user to take a picture with the mobile device camera, process the image using image processing algorithms we described above and display the processed image on the mobile device. Our image processing algorithm consists of several processing steps including linear filtering, non-linear filtering, solution to a sparse yet large linear system, morphological processing, etc. This requires a high amount computational power in order to complete the image processing in a reasonable amount of time (<1 min). So we employed a server to offload the image processing task.”)
Regarding claim 10, Gong teaches “The computer-implemented method of claim 1,”
“further comprising displaying the output image.” (Gong, Figure 7c shows the displayed output image.)
Claims 11, 13, and 14 recite a non-transitory computer-readable medium with instructions for a processor to perform steps corresponding to the steps recited in claims 1, 3, and 4. Therefore, the recited programming instructions of the claims are mapped to the analogous steps in the corresponding method claims. Finally, Gong teaches a non-transitory computer-readable medium with instructions for a processor (Gong, Page 4, Section “Setting up webserver”, “We setup a webserver on a Dell 4800 workstation with Intel Core i7 processor with 8 cores and 16 GB RAM running 64-bit Windows 7. We setup a webserver using WAMP 64-bit software which comes with Apache, MySQL, and PHP services. We setup a local host thru port 8080 instead of the default port 80 in order to have stable performance by avoiding potential conflict with other windows system services that might listen to port 80. We also had to make other modifications to both webserver configuration and Windows firewall settings in order to make the webserver functional.”)
Claims 18 and 20 recite a system with a processor and memory, with elements corresponding to the steps recited in claims 1 and 3. Therefore, the recited elements of these claims are mapped to the analogous steps in the corresponding method claims. Gong teaches a system with a processor and memory (Gong, Page 4, Section “Setting up webserver”, “We setup a webserver on a Dell 4800 workstation with Intel Core i7 processor with 8 cores and 16 GB RAM running 64-bit Windows 7. We setup a webserver using WAMP 64-bit software which comes with Apache, MySQL, and PHP services. We setup a local host thru port 8080 instead of the default port 80 in order to have stable performance by avoiding potential conflict with other windows system services that might listen to port 80. We also had to make other modifications to both webserver configuration and Windows firewall settings in order to make the webserver functional.”)
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 2, 12, and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gong in view of Liang (US20170091906A1).
Regarding claim 2, Gong teaches “The computer-implemented method of claim 1,”
While Gong teaches adjusting the focal table so that pixels of the face bounding box are in focus (Gong, Figure 7 and rejection of Claim 1), Gong does not expressly disclose that the adjusting comprises extending a range of depth values in focus.
Liang teaches adjusting a range of depth values in focus (Liang, Figure 5B shows a focus range 560 between first depth 570 and second depth 580. Paragraph 19, “The processed image may be displayed on a display device. The user may make further adjustments, such as adjusting the first focus depth and/or the second focus depth again, until the desired effect is obtained. Various user interface elements, controls, and the like may be provided to the user to facilitate determination of the first and second focus depths and/or control application of depth-based effects such as blurring. In addition to or in the alternative to application of blurring, the designated first and second focus depths may be used to control application of bokeh effects such as the application of blur effects, which may be circular, noncircular, and/or variable with depth. Further, the blurring and bokeh effects are merely exemplary; the system and method disclosed herein may be used to apply a wide variety of other effects besides blurring and bokeh effects. Such effects may include, but are not limited to, modification of exposure, contrast, saturation, and/or colorization of the image, replacement of a portion of an image with another image or portion thereof, and/or the like. Hence, it will be understood that reference to application of “blurring” in this disclosure is also disclosure of any other depth-based effect.”)
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to incorporate the focal range adjustment of Liang for the focal table adjustment of Gong.
The motivation for doing so would have been to provide a means for the ‘face protection’ of Gong. Gong discloses protecting the face from blur in Figure 7 but does not recite a means for doing so. The method of Liang, when performed in conjunction with the facial identification of Gong, would achieve the means for protecting the face. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Gong with the above teaching of Liang to fully disclose “wherein adjusting the focal table comprises extending a range of depth values in focus until pixels of each face bounding box are in the focal range.”
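For illustration only, the combination as understood above, in which the in-focus depth range is extended until the pixels of each face bounding box fall within it, can be sketched as follows. The box format and all names are the Examiner's illustrative assumptions:

```python
# Illustrative sketch: widen the focal range [focal_near, focal_far] until
# it covers the depth values inside every face bounding box. The (x0, y0,
# x1, y1) box format is an assumption for illustration.
def extend_focal_range(depth_map, boxes, focal_near, focal_far):
    for (x0, y0, x1, y1) in boxes:
        face_depths = depth_map[y0:y1, x0:x1]     # depths within the face box
        focal_near = min(focal_near, float(face_depths.min()))
        focal_far = max(focal_far, float(face_depths.max()))
    return focal_near, focal_far
```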
Claim 12 recites a non-transitory computer-readable medium with instructions for a processor to perform steps corresponding to the steps recited in claim 2. Therefore, the recited programming instructions of the claim are mapped to the analogous steps in the corresponding method claim. Additionally, the rationale and motivation to combine the Gong and Liang references apply here. Finally, Gong teaches a non-transitory computer-readable medium with instructions for a processor (Gong, Page 4, Section “Setting up webserver”, “We setup a webserver on a Dell 4800 workstation with Intel Core i7 processor with 8 cores and 16 GB RAM running 64-bit Windows 7. We setup a webserver using WAMP 64-bit software which comes with Apache, MySQL, and PHP services. We setup a local host thru port 8080 instead of the default port 80 in order to have stable performance by avoiding potential conflict with other windows system services that might listen to port 80. We also had to make other modifications to both webserver configuration and Windows firewall settings in order to make the webserver functional.”)
Claim 19 recites a system with a processor and memory, with elements corresponding to the steps recited in claim 2. Therefore, the recited elements of this claim are mapped to the analogous steps in the corresponding method claim. Additionally, the rationale and motivation to combine the Gong and Liang references apply here. Gong teaches a system with a processor and memory (Gong, Page 4, Section “Setting up webserver”, “We setup a webserver on a Dell 4800 workstation with Intel Core i7 processor with 8 cores and 16 GB RAM running 64-bit Windows 7. We setup a webserver using WAMP 64-bit software which comes with Apache, MySQL, and PHP services. We setup a local host thru port 8080 instead of the default port 80 in order to have stable performance by avoiding potential conflict with other windows system services that might listen to port 80. We also had to make other modifications to both webserver configuration and Windows firewall settings in order to make the webserver functional.”)
Claim(s) 5-6 and 15-16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Gong in view of Ignatov (Rendering Natural Camera Bokeh Effect with Deep Learning).
Regarding claim 5, Gong teaches “The computer-implemented method of claim 1,”
Gong does not disclose the generation of the focal table using a trained machine learning model.
Ignatov discloses the generation of the focal table using a trained machine learning model. (Ignatov, Sections 2 and 3.2 Paragraphs 1-2, “One of the biggest challenges in the bokeh rendering task is to get high-quality real data that can be used for training deep models. To tackle this problem, a large-scale Everything is Better with Bokeh! (EBB!) dataset containing more than 10 thousand images was collected in the wild during several months. By controlling the aperture size of the lens, images with shallow and wide depth-of-field were taken. In each photo pair, the first image was captured with a narrow aperture (f/16) that results in a normal sharp photo, whereas the second one was shot using the highest aperture (f/1.8) leading to a strong bokeh effect… An example set of collected images is presented in Figure 2. … Finally, we computed a coarse depth map for each wide depth-of-field image using the Megadepth model proposed in [12]. These maps can be stacked directly with the input images and used as an additional guidance for the trained model.”; “Our initial results demonstrated that though adding the pre-computed depth map to the input data does not change the results radically (Fig. 6), this helps to refine the shape of the blurred image area and to deal with the most complex photo scenes. Therefore, in all subsequent experiments the estimated depth map was used by default. All model levels were trained with the L1 loss except for the output level 1 where the following combination of the loss functions was used: L_Level 1 = L_L1 + (1 − L_SSIM) + 0.01·L_VGG, where L_SSIM is the structural similarity (SSIM) loss [18] and L_VGG is the perceptual VGG-based [9] loss function. The above coefficients were chosen based on the results of the preliminary experiments on the considered EBB! dataset.”)
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to use the trained machine learning model of Ignatov for the focal table generation of Gong.
The motivation for doing so would have been to improve segmentation results for identifying background for blurring and foreground for resolution preservation. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Gong with the above teaching of Ignatov to fully disclose, “wherein generating the focal table comprises using a focal table prediction model, wherein the focal table prediction model is a trained machine learning model”
Gong in view of Ignatov further discloses “and the method further comprises training the focal table prediction model wherein the training comprises: providing a plurality of training images as input to the focal table prediction model, wherein each training image has an associated depth map and an associated groundtruth blur radius image; and for each training image, generating, using the focal table prediction model, a predicted focal table; obtaining a predicted blur radius image using the predicted focal table and the depth map associated with the training image; computing a loss value based on the predicted blur radius image and the groundtruth blur radius image associated with the training image; and adjusting one or more parameters of the focal table prediction model using the loss value.” (Ignatov, Sections 2 and 3.2 Paragraphs 1-2, “One of the biggest challenges in the bokeh rendering task is to get high-quality real data that can be used for training deep models. To tackle this problem, a large-scale Everything is Better with Bokeh! (EBB!) dataset containing more than 10 thousand images was collected in the wild during several months. By controlling the aperture size of the lens, images with shallow and wide depth-of-field were taken. In each photo pair, the first image was captured with a narrow aperture (f/16) that results in a normal sharp photo, whereas the second one was shot using the highest aperture (f/1.8) leading to a strong bokeh effect… An example set of collected images is presented in Figure 2. … Finally, we computed a coarse depth map for each wide depth-of-field image using the Megadepth model proposed in [12]. These maps can be stacked directly with the input images and used as an additional guidance for the trained model.”; “Our initial results demonstrated that though adding the pre-computed depth map to the input data does not change the results radically (Fig. 6), this helps to refine the shape of the blurred image area and to deal with the most complex photo scenes. Therefore, in all subsequent experiments the estimated depth map was used by default. All model levels were trained with the L1 loss except for the output level 1 where the following combination of the loss functions was used: L_Level 1 = L_L1 + (1 − L_SSIM) + 0.01·L_VGG, where L_SSIM is the structural similarity (SSIM) loss [18] and L_VGG is the perceptual VGG-based [9] loss function. The above coefficients were chosen based on the results of the preliminary experiments on the considered EBB! dataset.” See figure 6, as referenced above, for the predicted blur radius image based on the focal table (foreground and background parameters). Further, note that the above excerpt was incorporated with rationale and motivation in the previous limitation.)
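For illustration only, the training procedure recited in claim 5 reads onto a conventional supervised training loop of the kind sketched below. The model, the blur-radius rendering helper, and the choice of an L1 loss are the Examiner's illustrative assumptions; this sketch does not purport to reproduce Ignatov's implementation, which is trained with the combined L1/SSIM/VGG loss quoted above:

```python
# Hypothetical training step illustrating the claimed sequence; the helper
# blur_radius_from_table is assumed, not taken from Ignatov.
import torch
import torch.nn.functional as F

def train_step(model, optimizer, image, depth_map, gt_blur_radius):
    pred_table = model(image)                    # generate a predicted focal table
    # obtain a predicted blur radius image from the table and the depth map
    pred_blur = blur_radius_from_table(pred_table, depth_map)  # hypothetical helper
    loss = F.l1_loss(pred_blur, gt_blur_radius)  # loss vs. groundtruth blur radius image
    optimizer.zero_grad()
    loss.backward()                              # backpropagate the loss value
    optimizer.step()                             # adjust the model parameters
    return loss.item()
```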
Regarding claim 6, Gong in view of Ignatov teaches “The computer-implemented method of claim 5,”
“wherein the depth map associated with each training image is one of: a groundtruth depth map obtained at a time of image capture or an estimated depth map obtained using a depth prediction model.” (Ignatov, Sections 2 and 3.2 Paragraphs 1-2, “One of the biggest challenges in the bokeh rendering task is to get high-quality real data that can be used for training deep models. To tackle this problem, a large-scale Everything is Better with Bokeh! (EBB!) dataset containing more than 10 thousand images was collected in the wild during several months. By controlling the aperture size of the lens, images with shallow and wide depth-of-field were taken. In each photo pair, the first image was captured with a narrow aperture (f/16) that results in a normal sharp photo, whereas the second one was shot using the highest aperture (f/1.8) leading to a strong bokeh effect… An example set of collected images is presented in Figure 2. … Finally, we computed a coarse depth map for each wide depth-of-field image using the Megadepth model proposed in [12]. These maps can be stacked directly with the input images and used as an additional guidance for the trained model.”; “Our initial results demonstrated that though adding the pre-computed depth map to the input data does not change the results radically (Fig. 6), this helps to refine the shape of the blurred image area and to deal with the most complex photo scenes. Therefore, in all subsequent experiments the estimated depth map was used by default. All model levels were trained with the L1 loss except for the output level 1 where the following combination of the loss functions was used: L_Level 1 = L_L1 + (1 − L_SSIM) + 0.01·L_VGG, where L_SSIM is the structural similarity (SSIM) loss [18] and L_VGG is the perceptual VGG-based [9] loss function. The above coefficients were chosen based on the results of the preliminary experiments on the considered EBB! dataset.” Note that the above excerpt was incorporated with rationale and motivation in the previous limitation.)
Claims 15 and 16 recite a non-transitory computer-readable medium with instructions for a processor to perform steps corresponding to the steps recited in claims 5 and 6. Therefore, the recited programming instructions of the claims are mapped to the analogous steps in the corresponding method claims. Additionally, the rationale and motivation to combine the Gong and Ignatov references apply here. Finally, Gong teaches a non-transitory computer-readable medium with instructions for a processor (Gong, Page 4, Section “Setting up webserver”, “We setup a webserver on a Dell 4800 workstation with Intel Core i7 processor with 8 cores and 16 GB RAM running 64-bit Windows 7. We setup a webserver using WAMP 64-bit software which comes with Apache, MySQL, and PHP services. We setup a local host thru port 8080 instead of the default port 80 in order to have stable performance by avoiding potential conflict with other windows system services that might listen to port 80. We also had to make other modifications to both webserver configuration and Windows firewall settings in order to make the webserver functional.”)
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Suszek (US 20200143520 A1) teaches applying blur to images based on depth data in accordance with a depth map associated with the image. Tsai (US 20210004962 A1) teaches performing image processing strategies, including bokeh effects, based on identified foreground and background regions, where the foreground is in focus and the background is blurred.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AARON JOSEPH SORRIN whose telephone number is (703)756-1565. The examiner can normally be reached Monday - Friday 9am - 5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz, can be reached at (571) 272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AARON JOSEPH SORRIN/
Examiner, Art Unit 2672
/SUMATI LEFKOWITZ/Supervisory Patent Examiner, Art Unit 2672