Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Priority
Acknowledgment is made of applicant’s claim, in U.S. Application No. 18/620,312, of priority to a U.S. Provisional Application filed on 03/28/2023.
Status of Claims
Claims 1–20 are pending in the application. Claims 1–20 are rejected.
Overview of Grounds of Rejection
Ground of Rejection | Claim(s) | Statute(s) | Reference(s)
1 | 1–7, 11–17 | § 103 | Du et al. (NPL); OpenCV (NPL)
2 | 8–10, 18–20 | § 103 | Du et al. (NPL); OpenCV (NPL); Smith et al. (US11120639B1)
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
(Please see the cited paragraphs, sections, pages, or surrounding text in the references for the paraphrased content.)
Ground of Rejection 1
Claims 1–7 and 11–17 are rejected under 35 U.S.C. § 103 as being unpatentable over Du et al. (NPL) in view of OpenCV (NPL).
As per Claim 1, Du teaches the following preamble portion of Claim 1, which recites:
“A method for superimposing a two-dimensional (2D) image onto a surface, comprising:”
Du et al. (NPL) shows a complete AR pipeline with camera and depth input, a depth mesh, and texture decals/tri-planar mapping on real surfaces, establishing the general context of superimposing imagery on sensed geometry.
Du teaches the following portion of Claim 1, which recites:
“receiving image data and depth sensor data, of a surrounding environment, from at least one sensor;”
“Our input consists of the RGB camera image, depth map from ARCore Depth API … For each frame, we update the depth array … depth mesh … depth texture from the raw depth buffer.” — Du et al. (NPL), Figure 4 ‘System architecture of DepthLab’ (Tracking and Input / Data Structures)
Du et al. clearly discloses receiving image data (RGB camera) and depth sensor data (ARCore depth or ToF/stereo) from phone sensors each frame, matching “receiving image data and depth sensor data… from at least one sensor.”
Du teaches the following portion of Claim 1, which recites:
“processing the depth sensor data to generate a three-dimensional (3D) wire mesh;”
“(c) shows the resulting depth mesh consisting of interconnected triangle surfaces… A mesh is a set of triangle surfaces that are connected to form a continuous surface…” — Du et al. (NPL), Figure 9 / ‘Real-time Depth Mesh’
“Depth mesh is a real-time triangulated mesh generated for each depth map on both CPU and GPU… We detail its generation procedure in Algorithm 2.” — Du et al. (NPL), ‘Data Structures of DepthLab’
Processing the depth data to create a triangulated (wire) mesh is exactly what Du et al. describes.
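For context, the depth-mesh generation Du et al. details in Algorithm 2 amounts to back-projecting each depth pixel to a 3D vertex and connecting the grid into triangles. A minimal NumPy sketch of that idea, offered for illustration only (it assumes a row-major depth grid and omits the depth-discontinuity filtering a production pipeline would apply), is:

```python
import numpy as np

def depth_grid_to_mesh(depth, fx, fy, cx, cy):
    """Back-project a depth map to camera-space vertices and connect
    them into a triangle mesh (two triangles per grid cell)."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Pinhole back-projection: one camera-space vertex per pixel.
    X = (xs - cx) * depth / fx
    Y = (ys - cy) * depth / fy
    vertices = np.stack([X, Y, depth], axis=-1).reshape(-1, 3)
    # Triangulate each 2x2 cell of the grid into two triangles.
    triangles = []
    for y in range(h - 1):
        for x in range(w - 1):
            i = y * w + x
            triangles.append((i, i + 1, i + w))          # upper-left triangle
            triangles.append((i + 1, i + w + 1, i + w))  # lower-right triangle
    return vertices, np.array(triangles)
```

An h × w depth map thus yields h·w vertices and 2·(h−1)·(w−1) triangles, i.e., the “set of triangle surfaces that are connected to form a continuous surface” quoted above.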
Du alone does not explicitly teach all the limitation(s) of the claim. However, when combined with OpenCV (NPL), they collectively teach all of the limitation(s).
OpenCV teaches the following portion of Claim 1, which recites: “pre-processing the 2D image to generate a pre-processed 2D image, wherein the pre-processing comprises assigning one or more pixels in the 2D image as one or more image anchor points;”
“To calculate a homography between two images, you need to know at least 4 point correspondences between the two images… in this post we are simply going to click the points by hand.” — OpenCV (NPL), “How to calculate a Homography?” (tutorial text and code)
“Four corners of the book in source image … Four corners of the book in destination image … Mat h = findHomography(pts_src, pts_dst); … warpPerspective(…) … warp image onto the other.” — OpenCV (NPL), C++/Python example (code + narrative)
The claim’s “assigning … pixels … as … image anchor points” is met where OpenCV requires a user to select specific pixels (e.g., four corners) in the 2D source image as point correspondences, i.e., anchor/control points used to compute the transform before overlay. The tutorial expressly contemplates hand-clicked points, which are assigned pixels for alignment.
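The four-correspondence computation that OpenCV’s findHomography performs can be sketched without the library; a minimal NumPy direct linear transform (DLT) for exact correspondences, written for illustration only, is:

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate the 3x3 homography H mapping src -> dst from
    >= 4 point correspondences via the direct linear transform."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # H spans the null space of A: take the smallest right singular vector.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def warp_point(H, p):
    """Apply H to a 2D point (with homogeneous normalization)."""
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]
```

Assigning the four image corners as anchor pixels and four clicked destination pixels as their correspondences fully determines H, which is then used to warp the 2D image before overlay.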
Du in combination with OpenCV teaches the following portion of Claim 1, which recites:
“determining a target placement area, on the 3D world mesh, for superimposing the 2D image, wherein the target placement area is located on a deformed surface;”
“A sub-region of the depth mesh can be used to create more localized effects, such as decals, virtual graffiti, splat effects, and more on real surfaces.” — Du et al. (NPL), ‘Decals and Splats’
“Real objects need to be represented as meshes… Conventional mobile experiences… render flat shadows… which leads to noticeable artifacts on non-planar objects.” — Du et al. (NPL), ‘Virtual Shadows’ (motivation for mesh-based surface handling)
Selecting a “sub-region of the depth mesh” corresponds to “determining a target placement area on the 3D world mesh.” The reference addresses non-planar objects—i.e., deformed surfaces—satisfying the last clause. The claimed feature is taught when Du’s superimposing is used with a pre-processed 2D image (as taught by OpenCV).
Du in combination with OpenCV teaches the following portion of Claim 1, which recites:
“superimposing the transformed image over the target placement area.”
“Tri-planar texture mapping … The appearance of real-world surfaces can be digitally altered with depth meshes … users can touch on the screen to change the look of physical surfaces… The depth mesh provides the 3D vertex position and the normal vector of surface points to compute world-space UV texture coordinates.” — Du et al. (NPL), ‘Tri-planar texture mapping’
Du et al. demonstrates superimposing transformed imagery (textures/decals) onto the selected mesh region, which is the claimed operation of overlaying the 2D image over the target placement area.
Before the effective filing date of the claimed invention, a POSITA implementing Du et al.’s mesh-based decal pipeline would reasonably draw on OpenCV (NPL)’s standard, widely-taught point-correspondence selection workflow to assign specific pixels in the 2D image as anchor/control points prior to warping. This improves placement control and registration accuracy while feeding directly into Du’s overlay stage. The combination yields predictable results: selected anchor pixels → homography/transform → superimposition on Du’s mesh sub-region (including non-planar surfaces), which is a routine integration of known CV pre-processing with AR mesh-texturing pipelines.
As per Claim 2, Du teaches the limitation(s) of Claim 2 that recites:
“The method of claim 1, wherein the deformed surface has a degree of curvature.”
Du et al. addresses overlay on non-planar world geometry and employs tri-planar texture mapping on a depth-derived mesh, i.e., placement on curved (deformed) surfaces rather than only planar ones.
Selecting a mesh sub-region for decals and applying tri-planar mapping directly targets curved/irregular surface patches, which satisfies the curvature requirement of Claim 2.
As per Claim 3, Du alone does not explicitly teach all the limitation(s) of the claim. However, when combined with OpenCV (NPL), they collectively teach all of the limitation(s).
OpenCV teaches the limitation(s) of Claim 3 that recites: “The method of claim 1, wherein the target placement area is determined by specifying one or more position selection points and the transformed image is superimposed by aligning the one or more anchor points with the one or more position selection points.”
“To calculate a homography between two images, you need to know at least 4 point correspondences … in this post we are simply going to click the points by hand.” — OpenCV (NPL), “How to calculate a Homography?”, pp. 4–5
“Four corners of the book in source image … Four corners of the book in destination image … Mat h = findHomography(pts_src, pts_dst); … warpPerspective(…) … warp image onto the other.” — OpenCV (NPL), code example, pp. 5–7
“specifying one or more position selection points” ↔ user clicks destination points by hand to define where the image should land.
“aligning the … anchor points with the … position selection points” ↔ choosing source anchor points (e.g., four corners) and destination points and then computing findHomography(pts_src, pts_dst) to warp/superimpose the image so that source anchors align with selected destination points.
Before the effective filing date, a POSITA implementing Du et al.’s depth-mesh decal workflow from Claim 1 would routinely adopt OpenCV’s point-correspondence selection and homography-based alignment so that user-specified position points on the sensed scene control where a 2D image is placed, while pre-assigned image anchor points drive the warp. This standard CV pre-processing naturally complements mesh-aware rendering and yields predictable results for accurate, user-directed superimposition.
As per Claim 4, Du alone does not explicitly teach all the limitation(s) of the claim. However, when combined with OpenCV (NPL), they collectively teach all of the limitation(s).
OpenCV teaches the limitation(s) of Claim 4 that recites:
“The method of claim 3, wherein the one or more image anchor points are designated as one or more image corners.”
“Four corners of the book in source image … Four corners of the book in destination image … Mat h = findHomography(pts_src, pts_dst); … warpPerspective(…).” — OpenCV (NPL), C++ example, example narrative/code
“Let the size of the image you want to put on the virtual billboard be w × h. The corners of the image (pts_src) are therefore to be (0,0), (w-1,0), (w-1,h-1) and (0,h-1).” — OpenCV (NPL), “Virtual Billboard” steps
The tutorial designates the image’s corners as the control points (anchor points) used to compute the transform and overlay, directly meeting “image anchor points … designated as … image corners.”
The rationale and motivation to combine the references as set forth for claim 1 are incorporated herein by reference for the present claim.
As per Claim 5, Du teaches the limitation(s) of Claim 5 that recites:
“The method of claim 1, wherein two or more 2D images are superimposed and the pre-processing step, the target placement area determination step and the image superimposition step are repeated for each additional 2D image.”
“A sub-region of the depth mesh can be used to create more localized effects, such as decals, virtual graffiti, splat effects, and more on real surfaces.” — Du et al. (NPL), Decals and Splats
“users can touch on the screen to change the look of physical surfaces …” — Du et al. (NPL), Tri-planar texture mapping
The plural “decals… virtual graffiti, splat effects” indicates placing multiple overlays on sub-regions of the mesh, and the touch-to-apply interaction reflects a repeatable select-and-apply pipeline. Thus, superimposing two or more 2D images and repeating the pre-processing/placement/superimposition steps for each additional image is taught/obvious in this system.
As per Claim 6, Du teaches the limitation(s) of Claim 6 that recites:
“The method of claim 5, further comprising specifying different position selection points for the two or more 2D images.”
“A sub-region of the depth mesh can be used to create more localized effects, such as decals, virtual graffiti, splat effects, and more on real surfaces.” — Du et al. (NPL), Decals and Splats
“In this demonstration, users can touch on the screen to change the look of physical surfaces …” — Du et al. (NPL), Tri-planar texture mapping
The plural “decals … splat effects … and more” indicates placing multiple overlays; selecting a “sub-region of the depth mesh” by user touch corresponds to position selection points. Applying multiple decals via separate touches naturally means different selection points for each additional 2D image, satisfying Claim 6.
Before the effective filing date, a POSITA using Du et al.’s touch-driven decal workflow would routinely place additional decals by touching different mesh sub-regions, yielding predictable multi-overlay behavior where each image uses its own position selection point(s).
As per Claim 7, Du alone does not explicitly teach all the limitation(s) of the claim. However, when combined with OpenCV (NPL), they collectively teach all of the limitation(s).
OpenCV teaches the limitation(s) of Claim 7 that recites:
“The method of claim 1, wherein the 2D image or each 2D image is adjusted to a desired transparency level, scaling size, and orientation.”
“We need to know the aspect ratio… we can choose the output image size to be 300×400…” — OpenCV (NPL), Perspective Correction steps, p. 9 (steps 1–4).
“Let the size of the image you want to put on the virtual billboard be w × h… The corners of the image (pts_src) are… (0,0)… (w-1,h-1).” — OpenCV (NPL), Virtual Billboard steps, p. 11 (step 2).
“Apply the homography to the source image and blend it with the destination image to obtain the image in Figure 6.” — OpenCV (NPL), Virtual Billboard steps, p. 11 (step 4).
“A Homography is a transformation (a 3×3 matrix) that maps the points in one image to the corresponding points in the other image.” — OpenCV (NPL), What is Homography?, p. 2.
The cited size selection language meets “adjusted to a desired … scaling size.”
The “blend … with the destination image” step is the standard way to control image transparency during overlay.
Homography’s 3×3 transform rotates/aligns the image to the destination quadrilateral, satisfying “desired … orientation.”
The rationale and motivation to combine the references as set forth for claim 1 are incorporated herein by reference for the present claim. A POSITA implementing mesh-aware overlays (e.g., Claim 1’s pipeline) would ordinarily use OpenCV’s homography workflow to set size, orientation, and blended transparency for each overlay, yielding predictable results for appearance-controlled superimposition.
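The blend step relied on above for the transparency limitation is ordinary alpha compositing. A minimal sketch, assuming 8-bit images of equal size and a scalar opacity value (the tutorial itself does not fix a particular blend formula), is:

```python
import numpy as np

def blend(overlay, base, alpha):
    """Alpha-composite a warped overlay onto a base image:
    out = alpha * overlay + (1 - alpha) * base, per pixel."""
    out = alpha * overlay.astype(float) + (1.0 - alpha) * base.astype(float)
    return np.clip(out, 0, 255).astype(np.uint8)
```

Varying alpha between 0 and 1 sets the “desired transparency level,” while the homography itself fixes the scaling size and orientation of the warped image.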
System Claim 11 does not include any additional limitations that would significantly distinguish it from method claim 1. Therefore, it is likewise rejected under 35 U.S.C. § 103 in view of the same references and for the same reasons set forth above.
System Claim 12 does not include any additional limitations that would significantly distinguish it from method claim 2. Therefore, it is likewise rejected under 35 U.S.C. § 103 in view of the same references and for the same reasons set forth above.
System Claim 13 does not include any additional limitations that would significantly distinguish it from method claim 3. Therefore, it is likewise rejected under 35 U.S.C. § 103 in view of the same references and for the same reasons set forth above.
System Claim 14 does not include any additional limitations that would significantly distinguish it from method claim 4. Therefore, it is likewise rejected under 35 U.S.C. § 103 in view of the same references and for the same reasons set forth above.
System Claim 15 does not include any additional limitations that would significantly distinguish it from method claim 5. Therefore, it is likewise rejected under 35 U.S.C. § 103 in view of the same references and for the same reasons set forth above.
System Claim 16 does not include any additional limitations that would significantly distinguish it from method claim 6. Therefore, it is likewise rejected under 35 U.S.C. § 103 in view of the same references and for the same reasons set forth above.
System Claim 17 does not include any additional limitations that would significantly distinguish it from method claim 7. Therefore, it is likewise rejected under 35 U.S.C. § 103 in view of the same references and for the same reasons set forth above.
Ground of Rejection 2
Claims 8–10 and 18–20 are rejected under 35 U.S.C. § 103 as being unpatentable over Du et al. (NPL) in view of OpenCV (NPL), and further in view of Smith et al. (US11120639B1).
As per Claim 8, Du and OpenCV (NPL) alone do not explicitly teach all the limitation(s) of the claim. However, when combined with Smith et al. (US11120639B1), they collectively teach all of the limitation(s).
Smith teaches the limitation(s) of Claim 8 that recites:
“The method of claim 1, wherein the target placement area is determined automatically.”
“The depth image is processed by surface reconstruction techniques to generate a geometric representation … in the form of a surface mesh.” — Smith et al., ¶[41]
“Scene data … is processed by an … Fully Convolutional Network (FCN) to produce Signed Distance Field (SDF) data which indicates, for each pixel, a distance to a nearest instance boundary and an object label …” — Smith et al., ¶[42]
“The voxels of the surface mesh may be augmented … with the object label … After augmentation, the collection of voxels … that have been tagged with a shelf object label may be extracted/extruded from the surface mesh …” — Smith et al., ¶[43]
Smith et al. describes computing an object-labeled region on the mesh via FCN-based SDF and then extracting/extruding the labeled set of voxels. That labeled collection is a two-dimensional region with boundaries on the mesh, i.e., a target placement area determined automatically (no user click is required). This directly answers the “point vs. area” gap: the area (size, extent, boundary) is produced by the segmentation-and-extraction pipeline itself.
Before the effective filing date, a POSITA implementing Du et al.’s depth-mesh decal workflow would naturally incorporate automatic mesh-region selection like Smith et al.’s FCN/SDF-labeled region extraction to choose a placement area without user input. Doing so improves repeatability and reduces manual steps, and it yields predictable results: an automatically segmented mesh sub-region that can receive decals/overlays in the standard AR texturing pipeline.
As per Claim 9, Du and OpenCV (NPL) alone do not explicitly teach all the limitation(s) of the claim. However, when combined with Smith et al. (US11120639B1), they collectively teach all of the limitation(s).
OpenCV teaches the limitation(s) of Claim 9 that recites:
“The method of claim 8, wherein the automatic determination of a target placement area comprises a pose-detection algorithm for determining at least one landmark or feature on the surface.”
“To calculate a homography between two images, you need to know at least 4 point correspondences… Usually, these point correspondences are found automatically by matching features like SIFT or SURF …” — OpenCV (NPL), How to calculate a Homography?
The tutorial describes a standard pose-estimation pipeline for planar surfaces in which automatically detected features (SIFT/SURF) serve as landmarks. From these matches, a homography is computed, which determines the surface pose used to place and warp imagery. Thus, the automatic determination in Claim 8 can comprise such a pose-detection algorithm that determines at least one landmark or feature on the surface.
Before the effective filing date, a POSITA implementing automatic region selection would readily integrate OpenCV’s automatic feature detection + homography-based pose to remove manual clicks and use detected surface landmarks to drive placement. This provides improved repeatability and predictable results: detected features/landmarks → estimated pose → automatic selection/placement of the overlay region.
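The automatic correspondence-finding the tutorial invokes (SIFT/SURF matching) reduces to nearest-neighbor descriptor matching with Lowe’s ratio test. A library-free sketch over arbitrary descriptor vectors, offered for illustration only (real SIFT descriptors are 128-dimensional), is:

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """Match each descriptor in desc_a to its nearest neighbor in desc_b,
    keeping only matches whose best distance is clearly smaller than the
    second-best (Lowe's ratio test), to reject ambiguous landmarks."""
    desc_a = np.asarray(desc_a, dtype=float)
    desc_b = np.asarray(desc_b, dtype=float)
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, best))
    return matches
```

The surviving matches are the automatically detected surface landmarks from which the homography, and hence the surface pose and placement region, is computed with no manual clicking.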
As per Claim 10, Du and OpenCV (NPL) alone do not explicitly teach all the limitation(s) of the claim. However, when combined with Smith et al. (US11120639B1), they collectively teach all of the limitation(s).
Du teaches the limitation(s) of Claim 10 that recites:
“The method of claim 9, further comprising the step of geometrically calculating the position of the at least one landmark or feature.”
“Given a screen point p, we look up its depth value… then re-project it to a camera-space vertex v using the camera intrinsic matrix K … Given the camera extrinsic matrix M … we derive the global coordinates g in the world space.” — Du et al. (NPL), “Screen-space to/from World-space Conversion”
Du et al. provides the concrete math to geometrically calculate the position of that landmark/feature: use the depth value and the camera intrinsic/extrinsic matrices to compute its world-space coordinates g. This directly satisfies “calculating the position of the at least one landmark or feature.”
Before the effective filing date, a POSITA using automatic landmark detection in an AR pipeline would routinely apply Du et al.’s screen-to-world re-projection to obtain the world-space position of each detected feature. This improves placement accuracy and yields predictable results: detected feature point → depth lookup and matrix re-projection → calculated world-space position suitable for subsequent mesh-aware overlay.
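The screen-to-world conversion described by Du et al. can be sketched numerically. The following minimal version assumes a pinhole intrinsic matrix K and a camera-to-world pose given by rotation R and translation t; the exact matrix convention in the reference may differ, so this is an illustrative sketch, not a reproduction of Du’s formula:

```python
import numpy as np

def screen_to_world(x, y, depth, K, R, t):
    """Re-project a screen point with known depth to world space:
    camera-space vertex v = depth * K^-1 @ [x, y, 1], then
    world-space position g = R @ v + t (camera-to-world pose)."""
    v = depth * (np.linalg.inv(K) @ np.array([x, y, 1.0]))
    return R @ v + t
```

A detected landmark pixel plus its depth lookup thus yields a calculated world-space position suitable for the subsequent mesh-aware overlay.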
System Claim 18 does not include any additional limitations that would significantly distinguish it from method claim 8. Therefore, it is likewise rejected under 35 U.S.C. § 103 in view of the same references and for the same reasons set forth above.
System Claim 19 does not include any additional limitations that would significantly distinguish it from method claim 9. Therefore, it is likewise rejected under 35 U.S.C. § 103 in view of the same references and for the same reasons set forth above.
System Claim 20 does not include any additional limitations that would significantly distinguish it from method claim 10. Therefore, it is likewise rejected under 35 U.S.C. § 103 in view of the same references and for the same reasons set forth above.
Conclusion
The prior art made of record and relied upon in this action is as follows:
Patent Literature:
Smith et al. (US11120639B1) — “Projecting telemetry data to visualization models”
Non-Patent Literature (NPL):
Du et al. — “DepthLab: Real-time 3D Interaction with Depth Maps for Mobile AR” (2020). Available at: [https://augmentedperception.github.io/depthlab/assets/Du_DepthLab-Real-Time3DInteractionWithDepthMapsForMobileAugmentedReality_UIST2020.pdf]
OpenCV / Mallick — “Homography examples using OpenCV (Python/C++)” (2016). Available at: [https://learnopencv.com/homography-examples-using-opencv-python-c/]
Note: A PDF copy of each NPL reference is attached with this Office Action. URLs are included for applicant convenience. If a link becomes unavailable in the future, the citation information may be used to locate the reference or access archived versions via the Wayback Machine.
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure and is listed as follows:
Patent Literature:
(none)
Non-Patent Literature (NPL):
Nuernberger et al. — “Anchoring 2D Gesture Annotations in Augmented Reality” (2016). Available at: [https://sites.cs.ucsb.edu/~mturk/pubs/vr_2016_bnuernberger.pdf]
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ADEEL BASHIR whose telephone number is (571) 270-0440. The examiner can normally be reached Monday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Hajnik, can be reached at (571) 276-7642. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ADEEL BASHIR/
Examiner, Art Unit 2616
/DANIEL F HAJNIK/Supervisory Patent Examiner, Art Unit 2616