Prosecution Insights
Last updated: April 19, 2026
Application No. 18/213,115

METHOD, APPARATUS, AND COMPUTER-READABLE MEDIUM FOR ROOM LAYOUT EXTRACTION

Status: Non-Final OA (§103)
Filed: Jun 22, 2023
Examiner: SATCHER, DION JOHN
Art Unit: 2676
Tech Center: 2600 — Communications
Assignee: Geomagical Labs Inc.
OA Round: 1 (Non-Final)
Grant Probability: 85% (Favorable)
Expected OA Rounds: 1-2
Median Time to Grant: 3y 0m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 85% (33 granted / 39 resolved; +22.6% vs Tech Center average; above average)
Interview Lift: +14.2% (moderate lift, measured over resolved cases with interview)
Avg Prosecution: 3y 0m (29 applications currently pending)
Total Applications: 68 (across all art units)
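For readers who want to reproduce these aggregates, here is a minimal sketch of the arithmetic. The per-case record format and field names are assumptions; only the totals (33 granted of 39 resolved) come from this page.

```python
from dataclasses import dataclass

@dataclass
class ResolvedCase:
    granted: bool        # True if the application issued as a patent
    had_interview: bool  # True if an examiner interview was held

def allow_rate(cases):
    # share of resolved cases that were granted
    return sum(c.granted for c in cases) / len(cases)

def interview_lift(cases):
    # difference in allow rate between interviewed and non-interviewed cases
    with_iv = [c for c in cases if c.had_interview]
    without_iv = [c for c in cases if not c.had_interview]
    return allow_rate(with_iv) - allow_rate(without_iv)

# 33 grants out of 39 resolved cases: 33 / 39 ≈ 0.846, displayed as 85%
```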

Statute-Specific Performance

§101: 14.2% (-25.8% vs TC avg)
§103: 61.9% (+21.9% vs TC avg)
§102: 15.1% (-24.9% vs TC avg)
§112: 8.3% (-31.7% vs TC avg)
Tech Center averages are estimates; based on career data from 39 resolved cases.
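Back-computing the Tech Center baseline from the reported deltas gives 40.0% for every statute, suggesting a single baseline estimate. A small sanity-check sketch (the dict layout is illustrative, not the dashboard's data model):

```python
examiner = {"§101": 14.2, "§103": 61.9, "§102": 15.1, "§112": 8.3}
tc_avg = {s: 40.0 for s in examiner}  # implied by the reported deltas

deltas = {s: round(examiner[s] - tc_avg[s], 1) for s in examiner}
print(deltas)  # {'§101': -25.8, '§103': 21.9, '§102': -24.9, '§112': -31.7}
```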

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims

This communication is in response to the application filed on 06/22/2023. Claims 1–20 are pending in this application.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 10/17/2023 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Drawings

Fig. 34 is objected to because the image filed with the application is of insufficient quality to be clearly readable. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as "amended." If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either "Replacement Sheet" or "New Sheet" pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or non-obviousness.

Claims 1–6, 10, and 13–20 are rejected under 35 U.S.C. 103 as being unpatentable over Yang et al.
(See NPL attached, "Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image", hereafter, "Yang") in view of Phalak et al. (US 20210279950 A1, hereafter, "Phalak") . Regarding claim 1 , Yang teaches a method executed by one or more computing devices for layout extraction ( See Yang, [Abstract], Single-image room layout reconstruction aims to reconstruct the enclosed 3D structure of a room from a single image ) , the method comprising: [ storing a plurality of scene priors corresponding to an image of a scene, the plurality of scene priors comprising a semantic map indicating semantic labels associated with a plurality of pixels in the image ], geometry information corresponding to the plurality of pixels in the image ( See Yang, [3.1. Plane and Line Detection]. To reconstruct the 3D room layout, we further estimate the 3D parameters for each plane. The 3D plane parameters include its surface normal n∈ S 2 and offset d . Note: Examiner is interpreting the 3D parameters as geometry information ) , and one or more line segments corresponding to the scene ( See Yang, [3.1. Plane and Line Detection], By introducing the virtual plane passing through the occlusion line, all adjacent walls in the image space are physically connected in the 3D space. Thus, we directly calculate the boundary of adjacent walls with their 3D parameters. Note: Examiner is interpreting the vertical line between walls as the border ) ; generating one or more borders based on the one or more line segments, each border representing a separation between two layout planes in a plurality of layout planes of the scene ( See Yang, [3.1. Plane and Line Detection], By introducing the virtual plane passing through the occlusion line, all adjacent walls in the image space are physically connected in the 3D space. Thus, we directly calculate the boundary of adjacent walls with their 3D parameters. Note: Examiner is interpreting the vertical line between walls as the border ) ; and generating a plurality of plane masks corresponding to the plurality of layout planes that estimate the geometry of the scene, the plurality plane masks based at least in part on at least one of the plurality of scene priors and the one or more borders ( See Yang, [3.2. 3D Layout Reconstruction], If two walls are physically connected in 3D space, we directly calculate the boundary with their 3D parameters. Furthermore, we expect to construct a geometrically consistent 3D room layout between detected planes and lines. We optimize the 3D plane parameters to align the calculated boundary with the detected intersection line. [3.1. Plane and Line Detection]. To reconstruct the 3D room layout, we further estimate the 3D parameters for each plane. The 3D plane parameters include its surface normal n∈ S 2 and offset d. See also [Figure 3 (d)], 2D layout segmentation after optimization. Note: the examiner is interpreting the room layout 2D segmentation as the plane mask. The 3D parameters are being interpreted as the geometry information ) . However, Yang fail (s) to teach storing a plurality of scene priors corresponding to an image of a scene, the plurality of scene priors comprising a semantic map indicating semantic labels associated with a plurality of pixels in the image . 
Regarding claim 2, Yang teaches the method of claim 1, wherein the plurality of layout planes comprise at least one of a planar plane or a curved plane (See Yang, [3.1. Plane and Line Detection]: then, we use different CNN-based heads to detect the planes, vertical lines between adjacent walls and regress 3D parameters of planes, respectively).

Regarding claim 3, Yang teaches the method of claim 1, wherein the semantic labels comprise at least one of a wall, a ceiling, or a floor (See Yang, [3.1. Plane and Line Detection]: each channel of the center likelihood map C represents different semantic categories, i.e., wall, floor, and ceiling).

Regarding claim 4, Yang teaches the method of claim 1, wherein the image is a red-green-blue (RGB) image (See Yang, [3. Our Method]: our goal is to reconstruct the 3D room layout from a single RGB image).

Regarding claim 5, Yang teaches the method of claim 1, wherein the plurality of scene priors further comprises one or more of: a gravity vector corresponding to the scene; an edge map corresponding to a plurality of edges in the scene; a normal map corresponding to a plurality of normals in the scene; camera parameters of a camera configured to capture the image (See Yang, [3.2. 3D Layout Reconstruction]: next, we calculate the intersection line of two adjacent walls by their 3D parameters and project it into image space with the known camera intrinsic matrix. Note: only one needs to be taught); or an orientation map corresponding to a plurality of orientation values in the scene.
Regarding claim 6, Yang in view of Phalak teaches the method of claim 1, wherein the geometry information comprises one or more of: [a depth map corresponding to the plurality of pixels]; photogrammetry points corresponding to a plurality of three-dimensional point values in the plurality of pixels; a sparse depth map corresponding to the plurality of pixels; a plurality of depth pixels storing both color information and depth information; a mesh representation corresponding to the plurality of pixels; a voxel representation corresponding to the plurality of pixels; or depth information associated with one or more polygons corresponding to the plurality of pixels. However, Yang fails to teach a depth map corresponding to the plurality of pixels. Phalak, working in the same field of endeavor, teaches: a depth map corresponding to the plurality of pixels (See Phalak, ¶[0367]: a depth map and a wall segmentation mask may be generated at 1504E by using, for example, a multi-view depth estimation network and a PSPNet-based and/or a ResNet-based segmentation module; in some embodiments, a per-frame dense depth map may be generated at 1502E with, for example, a multi-view depth estimation network. Note: only one needs to be taught). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yang to include a depth map corresponding to the plurality of pixels based on the method of Phalak. The suggestion/motivation would have been to efficiently and accurately generate the layout of an indoor scene (See Phalak, ¶[0146]). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Phalak with Yang to obtain the invention as specified in claim 6.

Regarding claim 10, Yang in view of Phalak teaches the method of claim 1, wherein the generating a plurality of plane masks corresponding to the plurality of layout planes that estimate the geometry of the scene comprises: generating a plurality of initial plane masks, the plurality of initial plane masks corresponding to the plurality of layout planes (See Yang, [1. Introduction]: to this end, we first train Convolutional Neural Networks (CNNs) to detect planes and vertical lines in the input RGB image. See also [Figure 3(c)], 2D layout segmentation before optimization); generating a plurality of plane connectivity values based at least in part on the one or more borders, each plane connectivity value indicating connectivity between two layout planes in the plurality of layout planes (See Yang, [3.2. 3D Layout Reconstruction]: next, we calculate the intersection line of two adjacent walls by their 3D parameters and project it into image space with the known camera intrinsic matrix; each potential intersection line region has at most one detected line, and we choose the one with the highest confidence when there are multiple detected lines. Note: the examiner is interpreting the intersection line confidence as the connectivity value, since the intersection line connects two adjacent walls (e.g., planes) and the confidence reflects how likely it is that a line connects the two planes); and refining the plurality of initial plane masks based at least in part on an estimated geometry of the plurality of layout planes, the estimated geometry based at least in part on at least one of the plurality of scene priors, the plurality of initial plane masks, and the plurality of connectivity values (See Yang, [3.2. 3D Layout Reconstruction]: if two walls are physically connected in 3D space, we directly calculate the boundary with their 3D parameters; furthermore, we expect to construct a geometrically consistent 3D room layout between detected planes and lines; we optimize the 3D plane parameters to align the calculated boundary with the detected intersection line. See also [Figure 3(d)], 2D layout segmentation after optimization. Note: the examiner is interpreting the optimization as refining the 2D segmentation (e.g., the plane mask)).
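The examiner's reading of the connectivity-value element (keep at most one detected line per potential intersection region, choosing the highest-confidence candidate) can be sketched as follows; the input layout, keyed by wall pair, is an assumption for illustration:

```python
def plane_connectivity(candidates):
    """candidates: {(plane_i, plane_j): [(line, confidence), ...]}.
    Returns, per adjacent plane pair, the best detected line and its
    confidence, used here as the pair's connectivity value."""
    connectivity = {}
    for pair, lines in candidates.items():
        if lines:  # no detected line -> no connectivity evidence
            connectivity[pair] = max(lines, key=lambda lc: lc[1])
    return connectivity
```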
Regarding claim 13, Yang teaches a method executed by one or more computing devices for layout extraction, the method comprising: storing a first scene prior and a second scene prior corresponding to an image of a scene, the first scene prior and the second scene prior comprising [a semantic map indicating semantic labels associated with a plurality of pixels in the image] and geometry information corresponding to the plurality of pixels in the image (See Yang, [Abstract]: single-image room layout reconstruction aims to reconstruct the enclosed 3D structure of a room from a single image); generating one or more borders based on at least one of the first scene prior or the second scene prior, each border representing a separation between two layout planes in a plurality of layout planes of the scene (See Yang, [3.1. Plane and Line Detection]: by introducing the virtual plane passing through the occlusion line, all adjacent walls in the image space are physically connected in the 3D space; thus, we directly calculate the boundary of adjacent walls with their 3D parameters. Note: the examiner is interpreting the vertical line between walls as the border); and generating a plurality of plane masks corresponding to the plurality of layout planes that estimate a geometry of the scene, the plurality of plane masks based at least in part on the one or more borders (See Yang, [3.2. 3D Layout Reconstruction]: if two walls are physically connected in 3D space, we directly calculate the boundary with their 3D parameters; furthermore, we expect to construct a geometrically consistent 3D room layout between detected planes and lines; we optimize the 3D plane parameters to align the calculated boundary with the detected intersection line. [3.1. Plane and Line Detection]: to reconstruct the 3D room layout, we further estimate the 3D parameters for each plane; the 3D plane parameters include its surface normal n ∈ S² and offset d. See also [Figure 3(d)], 2D layout segmentation after optimization. Note: the examiner is interpreting the room layout 2D segmentation as the plane mask and the 3D parameters as the geometry information). However, Yang fails to teach a semantic map indicating semantic labels associated with a plurality of pixels in the image. Phalak, working in the same field of endeavor, teaches: a semantic map indicating semantic labels associated with a plurality of pixels in the image (See Phalak, ¶[0270]: one or more local features capturing geometric structures may be extracted at 1504D at least by recursively performing semantic feature extraction on nested partitioning of the set of points; a PointNet-based module may be employed to extract local features or points). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yang to include a semantic map indicating semantic labels associated with a plurality of pixels in the image based on the method of Phalak.
The suggestion/motivation would have been to efficiently and accurately generate the layout of an indoor scene (See Phalak, ¶[0146]). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Phalak with Yang to obtain the invention as specified in claim 13.

Regarding claim 14, Yang in view of Phalak teaches the method of claim 13, [wherein the image is one of a plurality of images of the scene, wherein the first scene prior and the second scene prior correspond to the plurality of images of the scene, and wherein the semantic map indicating semantic labels is associated with a plurality of pixels in the plurality of images]. However, Yang fails to teach this limitation. Phalak, working in the same field of endeavor, teaches: wherein the image is one of a plurality of images of the scene, and wherein the first scene prior and the second scene prior correspond to the plurality of images of the scene (See Phalak, ¶[0016]: in some of these embodiments, determining the room classification and the wall classification may include identifying the input image, wherein the input image comprises one image or a sequence of images from a three-dimensional scan of the indoor scene, and determining an input point cloud for the input image), and wherein the semantic map indicating semantic labels is associated with a plurality of pixels in the plurality of images (See Phalak, ¶[0270]: one or more local features capturing geometric structures may be extracted at 1504D at least by recursively performing semantic feature extraction on nested partitioning of the set of points; a PointNet-based module may be employed to extract local features or points). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yang to include these features based on the method of Phalak. The suggestion/motivation would have been to efficiently and accurately generate the layout of an indoor scene (See Phalak, ¶[0146]). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Phalak with Yang to obtain the invention as specified in claim 14.

Regarding claim 15, claim 15 is rejected on the same basis as claim 5; the arguments presented above for claim 5 are equally applicable to claim 15, and the limitations similar to those of claim 5 are not repeated herein but are incorporated by reference.
Regarding claim 16, claim 16 is rejected on the same basis as claim 6; the arguments presented above for claim 6 are equally applicable to claim 16, and the limitations similar to those of claim 6 are not repeated herein but are incorporated by reference.

Regarding claim 17, Yang teaches the method of claim 13, wherein the image is of one or more corners in the scene (See Yang, [3.2. 3D Layout Reconstruction]: specifically, we calculate the 3D corner q ∈ R³ of the room layout by solving a system of linear equations).

Regarding claim 18, Yang teaches the method of claim 13, wherein the generating a plurality of plane masks corresponding to the plurality of layout planes that estimate the geometry of the scene comprises: applying a non-linear optimization function based at least in part on at least one of the one or more scene priors (See Yang, [3.2. 3D Layout Reconstruction]: furthermore, we expect to construct a geometrically consistent 3D room layout between detected planes and lines; we optimize the 3D plane parameters to align the calculated boundary with the detected intersection line. Note: the optimization inherently requires non-linear optimization due to the complex, non-linear relationship between a 3D scene and its 2D image projections; the optimization is based on the geometric information (normal, orientation) and the line segments).
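The geometric-consistency optimization the examiner reads into Yang (align the boundary computed from plane parameters with the detected intersection line) is indeed non-linear through the projection. A rough sketch using scipy.optimize.least_squares, reusing plane_intersection_line and project from the earlier sketch; the residual (distance of detected endpoints from the projected boundary line) and the parameterization are assumptions, not Yang's implementation:

```python
import numpy as np
from scipy.optimize import least_squares

def refine_planes(params0, pairs, detected_endpoints, K):
    """params0: (num_planes, 4) array with rows [nx, ny, nz, d].
    pairs[k]: indices (i, j) of adjacent planes; detected_endpoints[k]:
    two image points on the detected line for that pair."""

    def point_line_dist(a, b, q):
        # signed distance from point q to the 2D line through a and b
        t = b - a
        n = np.array([-t[1], t[0]]) / np.linalg.norm(t)
        return n @ (q - a)

    def residuals(flat):
        planes = flat.reshape(-1, 4)
        errs = []
        for (i, j), ends in zip(pairs, detected_endpoints):
            p, d = plane_intersection_line(planes[i, :3], planes[i, 3],
                                           planes[j, :3], planes[j, 3])
            a, b = project(K, p), project(K, p + d)  # projected boundary
            errs.extend(point_line_dist(a, b, np.asarray(q)) for q in ends)
        return np.asarray(errs)

    return least_squares(residuals, np.ravel(params0)).x.reshape(-1, 4)
```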
Regarding claim 19, Yang teaches the method of claim 13, further comprising: generating one or more three-dimensional (3D) plane parameters corresponding to the plurality of layout planes that estimate the geometry of the scene (See Yang, [3.1. Plane and Line Detection]: to reconstruct the 3D room layout, we further estimate the 3D parameters for each plane).

Regarding claim 20, Yang in view of Phalak teaches a method executed by one or more computing devices for layout extraction, the method comprising: [storing a plurality of scene priors corresponding to an image of a scene, the plurality of scene priors comprising a semantic map indicating semantic labels associated with the plurality of pixels in the image and a plurality of line segments]; detecting a plurality of borders in the scene based at least in part on one or more of the plurality of scene priors, each border representing a separation between two layout planes in a plurality of layout planes, wherein each layout plane comprises a wall plane, a ceiling plane, or a floor plane (See Yang, [3.1. Plane and Line Detection]: by introducing the virtual plane passing through the occlusion line, all adjacent walls in the image space are physically connected in the 3D space; thus, we directly calculate the boundary of adjacent walls with their 3D parameters); generating a plurality of initial plane masks (See Yang, [1. Introduction]: to this end, we first train Convolutional Neural Networks (CNNs) to detect planes and vertical lines in the input RGB image. See also [Figure 3(c)], 2D layout segmentation before optimization) and a plurality of plane connectivity values based at least in part on the plurality of borders, wherein the plurality of initial plane masks correspond to the plurality of layout planes and wherein each plane connectivity value indicates connectivity between two layout planes in the plurality of layout planes (See Yang, [3.2. 3D Layout Reconstruction]: next, we calculate the intersection line of two adjacent walls by their 3D parameters and project it into image space with the known camera intrinsic matrix; each potential intersection line region has at most one detected line, and we choose the one with the highest confidence when there are multiple detected lines. Note: the examiner is interpreting the intersection line confidence as the connectivity value, since the intersection line connects two adjacent walls (e.g., planes) and the confidence reflects how likely it is that a line connects the two planes); and generating a plurality of optimized plane masks by refining the plurality of initial plane masks based at least in part on an estimated geometry of the plurality of layout planes, wherein the estimated geometry is determined based at least in part on one or more of the plurality of scene priors, the plurality of initial plane masks, and the plurality of connectivity values (See Yang, [3.2. 3D Layout Reconstruction]: if two walls are physically connected in 3D space, we directly calculate the boundary with their 3D parameters; furthermore, we expect to construct a geometrically consistent 3D room layout between detected planes and lines; we optimize the 3D plane parameters to align the calculated boundary with the detected intersection line. See also [Figure 3(d)], 2D layout segmentation after optimization. Note: the examiner is interpreting the optimization as refining the 2D segmentation (e.g., the plane mask)). However, Yang fails to teach storing a plurality of scene priors corresponding to an image of a scene, the plurality of scene priors comprising a semantic map indicating semantic labels associated with the plurality of pixels in the image and a plurality of line segments. Phalak, working in the same field of endeavor, teaches this limitation (See Phalak, ¶[0270]: one or more local features capturing geometric structures may be extracted at 1504D at least by recursively performing semantic feature extraction on nested partitioning of the set of points; a PointNet-based module may be employed to extract local features or points). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yang to store such scene priors based on the method of Phalak. The suggestion/motivation would have been to efficiently and accurately generate the layout of an indoor scene (See Phalak, ¶[0146]). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Phalak with Yang to obtain the invention as specified in claim 20.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Yang in view of Phalak, further in view of Gallagher (US 7,583,858 B2, hereafter "Gallagher").
Regarding claim 7, Yang in view of Phalak teaches the method of claim 1, wherein generating one or more borders based on the one or more line segments comprises computing an [orientation map by: detecting a horizontal line from a pixel of the plurality of pixels in the image; calculating a vanishing point based on the horizontal line and a gravity vector associated with the pixel; and combining the vanishing point with the gravity vector to determine a plurality of normal estimates]. However, Yang and Phalak fail to teach this limitation. Gallagher, working in the same field of endeavor, teaches: an orientation map (See Gallagher, [Col. 7, ln. 17–19]: in general, the transform 60 is created by determining preferred positions for the gravity vanishing point (and possibly additional vanishing points). Note: the examiner is interpreting the transforms as the orientation map) by: detecting a horizontal line from a pixel of the plurality of pixels in the image (See Gallagher, [Col. 6, ln. 40–43]: for example, in a brick wall, the lines along rows of bricks define a horizontal vanishing point while the lines along columns of bricks are vertical scene lines defining a vertical vanishing point (coincident to the gravity vector)); calculating a vanishing point based on the horizontal line and a gravity vector associated with the pixel (See Gallagher, [Col. 6, ln. 43–50]: a set of two vanishing points related to two orthogonal sets of lines (i.e., the vertical lines parallel to gravity and the horizontal lines parallel to the scene ground plane are orthogonal) define a vanishing line for planes parallel to both sets of lines; the data processor 20 then generates the transform 60 based on the gravity vector and possibly additional vanishing points 50 found with image analysis); and combining the vanishing point with the gravity vector to determine a plurality of normal estimates (See Gallagher, [Col. 7, ln. 66 – Col. 8, ln. 8]: the transform H_IR is used to remove the tilt that is apparent in images when the camera is unintentionally rotated with respect to the scene (i.e., when the gravity vector is not orthogonal to the x-axis or y-axis of the imaging system); the angle a represents the negative of the angle of rotation of the camera from a vertical orientation, and the transform H_IR is applied by the image processor 36 to produce an enhanced digital image 120 rotated by angle a relative to the original digital image 102, thereby removing the effect of undesirable rotation of the camera from the image). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yang to compute an orientation map in this manner based on the method of Gallagher. The suggestion/motivation would have been to accurately avoid the effect of perspective distortion in the image (See Gallagher, [Col. 1, ln. 25–45 and Col. 6, ln. 50–62]). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Gallagher with Yang and Phalak to obtain the invention as specified in claim 7.
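Gallagher's tilt-removal transform is, at its core, an in-plane rotation determined by where the gravity (vertical) direction falls in the image. A compact sketch of that idea, assuming square pixels and a gravity direction already projected into the image plane; the function name and conventions are illustrative, not Gallagher's notation:

```python
import numpy as np

def tilt_removal_homography(gravity_img, K):
    """gravity_img: 2D image-plane direction of projected gravity.
    Builds a homography that rotates the image so gravity maps to
    the image y-axis, removing apparent camera roll."""
    gx, gy = gravity_img
    alpha = np.arctan2(gx, gy)             # roll angle vs. the y-axis
    c, s = np.cos(-alpha), np.sin(-alpha)  # rotate by the negative angle
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    # conjugate by K so the rotation acts about the principal point
    return K @ R @ np.linalg.inv(K)
```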
Claims 8 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Yang in view of Phalak, further in view of Elmekies (US 20170069142 A1, hereafter "Elmekies").

Regarding claim 8, Yang in view of Phalak teaches the method of claim 1, wherein generating one or more borders based on the one or more line segments comprises: [detecting a first set of borders comprising lines that form seams between two walls]; detecting a second set of borders comprising lines that separate walls in the scene (See Yang, [3.2. 3D Layout Reconstruction]: if two walls are physically connected in 3D space, we directly calculate the boundary with their 3D parameters; furthermore, we expect to construct a geometrically consistent 3D room layout between detected planes and lines); and [detecting a third set of borders comprising lines that separate walls from floors or ceilings in the scene]. However, Yang and Phalak fail to teach the first and third sets of borders. Elmekies, working in the same field of endeavor, teaches: detecting a first set of borders comprising lines that form seams between two walls (See Elmekies, ¶[0024]: 3. Edge Categorization: in this stage, the merged edges are now each ranked as likely candidates for a seam between two surfaces in the target virtual room; the "target virtual space" is our assumed arrangement of walls and ceiling, e.g., in our current implementation, we start with a simple virtual box, the interiors of which represent the walls, floor and ceiling of an idealized room); and detecting a third set of borders comprising lines that separate walls from floors or ceilings in the scene (See Elmekies, ¶[0021]: so, in the ranking stage described here we determine the likelihood, e.g., that a given edge is the seam between the ceiling and the back wall; we do this by giving each edge a "score" that represents the likelihood that it is a given seam (i.e., back-wall/ceiling, back-wall/left-wall, etc.); every edge is given a score based on a series of rules; for instance, one rule might say that edges that are recognized as being lower down in the room are more likely to be floor/wall seams than ceiling/wall seams). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yang to detect the first and third sets of borders based on the method of Elmekies. The suggestion/motivation would have been to accurately replicate an entire room or a portion of a room based on video or images (See Elmekies, ¶[0009]). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Elmekies with Yang and Phalak to obtain the invention as specified in claim 8.
Regarding claim 9, Yang in view of Phalak teaches the method of claim 8, [wherein the detecting the first set of borders comprising lines that form seams between two walls comprises, for each line segment in the one or more line segments: determining a first end and a second end of the line segment; and determining whether the line segment forms a seam between two walls based at least in part on the first end and the second end and a normal map of the scene]. However, Yang and Phalak fail to teach this limitation. Elmekies, working in the same field of endeavor, teaches: wherein the detecting the first set of borders comprising lines that form seams between two walls (See Elmekies, ¶[0024], as quoted above for claim 8) comprises, for each line segment in the one or more line segments: determining a first end and a second end of the line segment (See Elmekies, ¶[0023]: 2. Edge Merging: in the Edge Merging stage, line segments (Edges) that have been extracted from the image are compared and, if their end points and slopes are suitably close and/or have commonality, are merged to form longer segments); and determining whether the line segment forms a seam between two walls based at least in part on the first end and the second end and a normal map of the scene (See Elmekies, ¶[0024], as quoted above). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yang in this manner based on the method of Elmekies. The suggestion/motivation would have been to accurately replicate an entire room or a portion of a room based on video or images (See Elmekies, ¶[0009]). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Elmekies with Yang and Phalak to obtain the invention as specified in claim 9.
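Elmekies's edge-merging test (endpoints and slopes "suitably close", ¶[0023]) might look like the following; the tolerances and segment representation are assumptions:

```python
import math

def mergeable(s1, s2, dist_tol=5.0, angle_tol=math.radians(5.0)):
    """s1, s2: segments ((x1, y1), (x2, y2)). True when some pair of
    endpoints is close and the (undirected) slopes nearly agree."""
    (a1, a2), (b1, b2) = s1, s2
    angle = lambda p, q: math.atan2(q[1] - p[1], q[0] - p[0])
    gap = min(math.dist(p, q) for p in (a1, a2) for q in (b1, b2))
    dtheta = abs(angle(a1, a2) - angle(b1, b2)) % math.pi
    dtheta = min(dtheta, math.pi - dtheta)  # treat lines as undirected
    return gap <= dist_tol and dtheta <= angle_tol
```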
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Yang in view of Phalak, further in view of Zhang et al. (See NPL attached, "Edge-Semantic Learning Strategy for Layout Estimation in Indoor Environment", hereafter "Zhang").

Regarding claim 11, Yang in view of Phalak teaches the method of claim 10, wherein the generating the plurality of initial plane masks and the plurality of plane connectivity values based at least in part on the plurality of borders comprises: [superimposing the semantic map on the plurality of borders to select for pixels corresponding to the plurality of layout planes of the scene]. However, Yang and Phalak fail to teach this limitation. Zhang, working in the same field of endeavor, teaches: superimposing the semantic map on the plurality of borders to select for pixels corresponding to the plurality of layout planes of the scene (See Zhang, [4 LAYOUT GENERATION AND REFINEMENT]: therefore, in order to combine the two information together, the scoring function reflects the consistency of l with the estimated edge map E and segmentation map M. [1 INTRODUCTION]: the semantic labels can be converted to a single segmentation map, which is a labeling map that represents the semantic surfaces with different labels. Note: the examiner is interpreting the scoring as superimposing the edge map (e.g., the borders) and the segmentation map (e.g., the semantics)). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yang to superimpose the semantic map on the plurality of borders based on the method of Zhang. The suggestion/motivation would have been to generate accurate borders in the layout (See Zhang, [5.1 Layout estimation performance]). Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Zhang with Yang and Phalak to obtain the invention as specified in claim 11.
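Zhang's scoring function, which rates a candidate layout l for consistency with both the estimated edge map E and the segmentation map M, could be sketched as a two-term agreement score; the weighting and exact terms are assumptions, not Zhang's formula:

```python
import numpy as np

def layout_score(layout_edges, layout_labels, E, M, w_edge=0.5):
    """layout_edges: binary mask of the candidate layout's borders.
    layout_labels: per-pixel surface labels implied by the candidate.
    E: edge probability map; M: predicted segmentation map."""
    edge_term = (E * layout_edges).sum() / max(layout_edges.sum(), 1)
    seg_term = (layout_labels == M).mean()  # label agreement
    return w_edge * edge_term + (1.0 - w_edge) * seg_term
```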
Allowable Subject Matter

Claim 12 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. Claim 12 contains subject matter that is not disclosed or made obvious in the cited art. In regard to claim 12, when considering claim 12 as a whole, the prior art of record fails to disclose or render obvious, alone or in combination: "The method of claim 10, wherein the refining the plurality of initial plane masks based at least in part on an estimated geometry of the plurality of layout planes comprises: applying a non-linear optimization function based at least in part on the plurality of initial plane masks, the plurality of connectivity values, and at least one of the one or more scene priors to generate an initial estimated geometry of the plurality of layout planes, the initial estimated geometry comprising confidence values associated with the plurality of layout planes; detecting and refining one or more low confidence layout planes in the plurality of layout planes in the initial estimated geometry having confidence values below a predetermined threshold to generate a refined estimated geometry; and refining the plurality of initial plane masks based at least in part on the refined estimated geometry to generate the plurality of plane masks".

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Liu (See NPL attached, "PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image") teaches a deep neural architecture, PlaneRCNN, that detects and reconstructs piecewise planar surfaces from a single RGB image; PlaneRCNN employs a variant of Mask R-CNN to detect planes with their plane parameters and segmentation masks. Naumann (See NPL attached, "Refined Plane Segmentation for Cuboid-Shaped Objects by Leveraging Edge Detection") teaches a post-processing algorithm that aligns segmented plane masks with edges detected in the image, increasing the accuracy of state-of-the-art approaches while being limited to cuboid-shaped objects. Hutchcroft et al. (US 20230206393 A1) teaches automated operations to analyze visual data from images acquired in multiple rooms of a building to generate multiple types of building information (e.g., to include a floor plan for the building), such as by using a trained neural network to jointly generate the multiple types of building information by combining visual data from pairs of the images, and for subsequently using the generated building information in one or more further automated manners, with the building information generation further performed in some cases without having or using information from any distance-measuring devices about distances from an image's acquisition location to walls or other objects in the surrounding room. The automated operations may include generating some types of building information with respect to each image pixel column, and other types of building information by combining data from both images of a pair.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DION J SATCHER, whose telephone number is (703) 756-5849. The examiner can normally be reached Monday through Thursday, 5:30 am to 2:30 pm, and Friday, 5:30 am to 9:30 am PST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Henok Shiferaw, can be reached at (571) 272-4637. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DION J SATCHER/
Patent Examiner, Art Unit 2676
/Henok Shiferaw/
Supervisory Patent Examiner, Art Unit 2676
/HADI AKHAVANNIK/
Primary Examiner, Art Unit 2676

Prosecution Timeline

Jun 22, 2023: Application Filed
Sep 01, 2025: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12586218: MOTION ESTIMATION WITH ANATOMICAL INTEGRITY (granted Mar 24, 2026; 2y 5m to grant)
Patent 12579787: INSTRUMENT RECOGNITION METHOD BASED ON IMPROVED U2 NETWORK (granted Mar 17, 2026; 2y 5m to grant)
Patent 12573066: Depth Estimation Using a Single Near-Infrared Camera and Dot Illuminator (granted Mar 10, 2026; 2y 5m to grant)
Patent 12555263: SYSTEMS AND METHODS FOR TWO-STAGE OBJECTION DETECTION (granted Feb 17, 2026; 2y 5m to grant)
Patent 12548140: DETERMINING PROCESS DEVIATIONS THROUGH VIDEO ANALYSIS (granted Feb 10, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 85%
With Interview: 99% (+14.2%)
Median Time to Grant: 3y 0m
PTA Risk: Low
Based on 39 resolved cases by this examiner. Grant probability derived from career allow rate.
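The interview-adjusted figure is simple arithmetic on the two reported numbers (base rate plus interview lift, capped at 100%); shown here only to make the derivation explicit:

```python
base, lift = 85.0, 14.2               # career allow rate and interview lift (%)
with_interview = min(base + lift, 100.0)
print(round(with_interview))          # 99
```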
