DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 02/05/2026 has been entered.
Response to Arguments
Applicant's arguments filed 02/05/2026 have been fully considered but they are not persuasive.
Regarding claim 1, the applicant argues that the cited references fail to teach or suggest “determine, based at least in part on the semantic segmentation, that a three-dimensional representation of the window is included in the three-dimensional model; based at least in part on determining that the three-dimensional representation is included in the three-dimensional model, correct the three-dimensional model”. The arguments have been fully considered, but they are not persuasive. The examiner cannot concur with the applicant for the following reasons:
Gausebeck discloses “determine, based at least in part on the semantic segmentation, that a three-dimensional representation of the window is included in the three-dimensional model”. For example, in paragraph [0078], Gausebeck teaches that a floorplan model is a simplified representation of surfaces, e.g., walls, floors, ceilings, etc., portals, e.g., door openings, and window openings associated with an interior environment. In paragraph [0079], Gausebeck teaches that the 3D model generation component 118 employs common architectural notation to illustrate architectural features of an architectural structure, e.g., doors, windows, fireplaces, length of walls, other features of a building, etc.; Gausebeck also teaches that a floorplan model comprises a series of lines in 3D space which represent intersections of walls and/or floors, outlines of doorways and/or windows, edges of steps, and outlines of other objects of interest. In paragraph [0080], Gausebeck teaches that lines for floors, walls and ceilings are dimensioned, e.g., annotated, with an associated size. In Fig. 4 and paragraph [0086], Gausebeck teaches a 3D dollhouse view representation 400 of a model that includes windows and doors as illustrated in Fig. 4. In paragraph [0161], Gausebeck teaches that the semantic labeling component 928 employs one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images, e.g., walls, floors, ceilings, windows, doors, furniture, people, buildings, etc.; Gausebeck further teaches that the semantic labeling component 928 assigns labels to the recognized objects identifying the object; Gausebeck furthermore teaches the use of semantic label/segmentation information. In paragraph [0252], Gausebeck teaches that the training data development component 3316 extracts additional scene information associated with a 3D space model, such as semantic labels included in the indexed semantic label data 3310; Gausebeck further teaches training a 3D-from-2D neural network model to predict semantic labels, e.g., wall, ceiling, door, windows, etc. In paragraph [0255], Gausebeck teaches that the training data development component 3316 determines semantic labels for the images and synthetic 3D data for the 2D image.
Gausebeck further discloses “based at least in part on determining that the three-dimensional representation is included in the three-dimensional model, correct the three-dimensional model by at least setting a depth property of the three-dimensional representation to a value associated with window depths”. For example, in paragraph [0050], Gausebeck teaches that the final 3D reconstruction was generated using a more precise alignment process relative to an alignment process used to generate the initial 3D reconstruction. In paragraph [0082], Gausebeck teaches that the 3D model is rendered at the user device 130 and updated in real-time based on new image data; Gausebeck further teaches looking for potential alignment errors and assessing scan quality. In Fig. 2 and paragraph [0084], Gausebeck teaches that the 3D model 200 of a living room is dynamically updated and corrected based on depth data derived for the respective images. In paragraph [0085], Gausebeck teaches that the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300. In paragraph [0112], Gausebeck teaches applying pixel color data to the depth map and performing reverse-projecting of color data from colored point clouds or depth maps to create a single 2D panoramic image; Gausebeck further teaches that the stitching component 508 fills in any possible small holes in the panorama with neighboring color data, thereby unifying exposure data across the boundaries between the respective 2D images; Gausebeck furthermore teaches determining more precise 3D data. In paragraph [0130], Gausebeck teaches filling in the gaps where the derived 3D data is lacking; Gausebeck further teaches facilitating aligning the 2D image and associated derived 3D data 116 with other 2D images and associated derived 3D data sets. In paragraph [0149], Gausebeck teaches determining data about the photometric match quality between the images at various depths. In paragraph [0161], Gausebeck teaches facilitating the alignment process in association with 3D model generation. In paragraph [0171], Gausebeck teaches employing average depth measurement values for respective pixels, features, areas/regions, etc., of a 2D image. In paragraph [0198], Gausebeck teaches that the intermediate versions are generated and rendered with relatively little processing time, enabling a real-time 3D reconstruction process that provides a continually updated rough 3D version of a scene during the capture process.
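For purposes of illustration only, the following minimal sketch shows one way a per-pixel semantic segmentation could be used to detect window regions in derived depth data and set their depth to a single predefined window-depth value; the function name, class id, and numeric value are hypothetical assumptions and are not drawn from Gausebeck, Rejeb Sfar, or the claims.

import numpy as np

WINDOW_LABEL = 7        # hypothetical semantic class id for "window"
WINDOW_DEPTH_M = 1.5    # hypothetical predefined depth value for window surfaces

def correct_window_depth(depth_map: np.ndarray, seg_map: np.ndarray) -> np.ndarray:
    """Set the depth of pixels labeled as window to a predefined value.

    depth_map: HxW array of per-pixel depths derived from the 2D images.
    seg_map:   HxW array of per-pixel semantic class ids.
    """
    corrected = depth_map.copy()
    window_mask = seg_map == WINDOW_LABEL
    if window_mask.any():
        # Glass regions often yield unreliable depth estimates, so this
        # sketch overwrites them with one plausible wall-plane depth.
        corrected[window_mask] = WINDOW_DEPTH_M
    return corrected

In this sketch, overwriting the unreliable window-region depths with a single predefined value corresponds loosely to setting a depth property of the window's representation, as discussed above; it is not presented as the method of either cited reference.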
Claims 4 and 13 are not allowable for reasons similar to those discussed above.
Claims 2-3, 5-12, and 14-20 are not allowable for reasons similar to those discussed above.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claim 13 is rejected under 35 U.S.C. 101 because the claimed invention of claim 13 is not directed to one of the four statutory subject matter categories, i.e., process, machine, manufacture, or composition of matter; a computer-readable medium may be a carrier wave, i.e., a signal per se, and thus non-statutory (MPEP 2106(I), Patent Subject Matter Eligibility). Claim 13 recites “One or more computer-readable storage media,” which may encompass transitory media such as carrier waves. In paragraph [0112], the specification describes that the medium includes Random Access Memory, i.e., RAM. RAM is volatile, temporary storage that loses all data when the computer is powered off. The specification also describes that the medium may be any other medium.
Claims 14-20 are rejected under 35 U.S.C. 101 for the same reason as claim 13 since they directly or indirectly incorporate the steps of claim 13, which are not tied to a valid statutory category, and the individual claims do not add any features that are tied to any of the four statutory categories of invention.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-10 and 13-20 are rejected under 35 U.S.C. 103 as being unpatentable over Gausebeck (US 20190026956 A1) in view of Rejeb Sfar (US 20190205485 A1).
Regarding claim 1 (Currently Amended), Gausebeck discloses a computer system comprising (Fig. 1; [0061]: a system 100 facilitates deriving 3D data from 2D image data and generating reconstructed 3D models based on the 3D data and the 2D image data; [0062]: a computing device 104 receives and processes 2D image data 102; process the 2D image data 102 to derive 3D data; generate an alignment between the 2D images and the features):
one or more processors ([0063]: one processor 124; processors execute the computer-executable instructions; [0257]: the computer 3512 includes a processing unit 3514, a system memory 3516, and a system bus 3518; various processors); and
one or more memory storing instructions that, upon execution by the one or more processors, configure the computer system to ([0063]: one memory and at least one processor; one memory stores computer-executable instructions; the processors execute the computer-executable instructions; [0178]: one memory stores the computer-executable instructions executed by the at least one processor; Fig. 35; [0259]: the system memory 3516 includes volatile memory 3520 and nonvolatile memory 3522; RAM):
receive a video file generated by a camera, the video file showing a space ([0082]: receive the 2D image data; Fig. 2; [0084]: capture and receive the new images of the living room; Fig. 3; [0085]: the 2D image data of the portion of the house was captured by a camera held and operated by a user as the user walked from room to room; Fig. 6; [0117]: a processor receives a panoramic image; [0141]: the 2D image data 102 includes video data 902; the sequential frames of video are captured in association with movement of the video camera; [0151]: automatically classify respective frames of video);
receive a user input via a user interface (Fig. 3; [0085]: the 2D image data of the portion of the house was captured by a camera held and operated by a user as the user walked from room to room; Fig. 1; [0087]: the user device 130 receives user input; generate representations of the 3D model based on the user input; [0114]: receive the user input; [0120]: the request is received from a user device based on user input; Fig. 14; [0179]: user device 1402 facilitates capturing 2D images by user input), the user input indicating a request to generate a two-dimensional representation of the space (Fig. 3; [0085]: the 2D image data of the portion of the house was captured and generated by a camera held and operated by a user as the user walked from room to room; Fig. 1; [0087]: the user device 130 receives user input; generate representations of the 3D model based on the user input; the representations include 2D images associated with the 3D model; [0114]: the user input is received that identifies or indicates the desired portion for cropping; Fig. 14; [0174]: one or more cameras 1404 capture 2D images by user input;
Fig. 14; [0179]: the user device 1402 facilitates capturing 2D images; the representation of the 3D models include 2D floorplan models;
);
generate, by at least using a video file portion of the video file as a first input to a first machine learning model, a three-dimensional model of a room within the space, the video file portion showing the room ([0066]: the machine learning techniques generate the derived 3D data 116 for received 2D image data 102; [0074]: use the machine learning techniques to determine the derived 3D data 116; Fig. 2; [0084]: generate the 3D model of a room by the 3D model generation component 118; generate and present 3D model 200 to a user at the client device;
; Fig. 3; [0085]: the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300; Fig. 4; [0086]: generate an 3D dollhouse view representation 400 of a model by the 3D model generation component 118 based on image data captured of the environment; Fig. 6; [0117]: the system employs a 3D-from-2D convolutional neural network model to derive 3D data from the panoramic image; [0118]: re-project a portion of the panoramic image processed by the preceding layer in association with deriving depth data for the panoramic image; [0120]: generate a 3D model based on depth data associated with a region; Fig. 8; [0122]: generate reconstructed 3D models based on the 3D data);
generate, by at least using the video file portion as a second input to a second machine learning model, a semantic segmentation of the room (Fig. 9; [0161]: the semantic labeling component 928 employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images; the semantic labeling component 928 performs semantic segmentation and further identifies defined boundaries of recognized objects in the 2D images; the semantic label/segmentation information associated with a 2D image are used as input to one or more augmented 3D-from-2D models along with the 2D image to generate derived 3D data 116), the semantic segmentation indicating that a window is shown in a first image frame of the video file portion ([0161]: the semantic labeling component 928 employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images, e.g., walls, floors, ceilings, windows, doors, furniture, people, buildings, etc.; indicate windows by labels; the semantic labels/boundaries associated with features included in a 2D image are characterized);
determine, based at least in part on the semantic segmentation, that a three-dimensional representation of the window is included in the three-dimensional model ([0078]: a floorplan model is a simplified representation of surfaces, e.g., walls, floors, ceilings, etc., portals, e.g., door openings, and window openings associated with an interior environment; [0079]: the 3D model generation component 118 employs common architectural notation to illustrate architectural features of an architectural structure, e.g., doors, windows, fireplaces, length of walls, other features of a building, etc.; a floorplan model comprises a series of lines in 3D space which represent intersections of walls and/or floors, outlines of doorways and/or windows, edges of steps, outlines of other objects of interest; [0080]: Lines for floors, walls and ceilings are dimensioned, e.g., annotated, with an associated size; Fig. 4; [0086]: 3D dollhouse view representation 400 of a model includes windows and door as illustrated in Fig. 4; [0161]: the semantic labeling component 928 employs one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images, e.g., walls, floors, ceilings, windows, doors, furniture, people, buildings, etc.; the semantic labeling component 928 assigns labels to the recognized objects identifying the object; the semantic label/segmentation information; [0252]: the training data development component 3316 extracts additional scene information associated with a 3D space model, such as semantic labels included in the indexed semantic label data 3310; train a 3D-from-2D neural network model to predict semantic labels, e.g. wall, ceiling, door, windows, etc.; [0255]: the training data development component 3316 determines semantic labels for the images and synthetic 3D data for the 2D image);
based at least in part on determining that the three-dimensional representation is included in the three-dimensional model, correct the three-dimensional model by at least setting a depth property of the three-dimensional representation to a value associated with window depths ([0050]: the final 3D reconstruction was generated using a more precise alignment process relative to an alignment process used to generate the initial 3D reconstruction; [0082]: the 3D model is rendered at the user device 130 and updated in real-time based on new image data; look for potential alignment errors, assess scan quality; Fig. 2; [0084]: the 3D model 200 of a living room is dynamically updated and corrected based on depth data derived for the respective images; [0085]: the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300; [0112]: apply pixel color data to the depth map; perform reverse-projecting of color data from colored point clouds or depth maps to create a single 2D panoramic image; the stitching component 508 fills in any possible small holes in the panorama with neighboring color data, thereby unifying exposure data across the boundaries between the respective 2D images; determine more precise 3D data; [0130]: fill in the gaps where the derived 3D data is lacking; facilitate aligning the 2D image and associated derived 3D data 116 with other 2D images and associated derived 3D data sets; [0149]: determine data about the photometric match quality between the images at various depths; [0161]: facilitate the alignment process in association with 3D model generation; [0171]: employ average depth measurement values for respective pixels, features, areas/regions etc., of a 2D image; [0198]: the intermediate versions are generated and rendered with relatively little processing time, enabling a real-time 3D reconstruction process that provides continually updated rough 3D version of a scene during the capture process);
generate, after the three-dimensional model is corrected, a two-dimensional model of the room by at least projecting the three-dimensional model on a two-dimensional plane ([0077]: generate 2D representations of the 3D model; [0080]: a floorplan model generated by the 3D model generation component 118 is a 2D floorplan model; [0087]: the representations include 2D images associated with the 3D model; [0092]: a representation of a 3D model generated in floor plan mode can appear 2D or substantially 2D; [0094]: a visualization of the 3D model includes 2D images and mixed 2D/3D representations of the 3D model; [0112]: perform reverse-projecting of color data from colored point clouds or depth maps to create a single 2D panoramic image);
generate a two-dimensional floor plan of the room by at least determining an outer boundary of the two-dimensional model ([0078]: the floorplan model contains locations of boundary edges for each given surface; [0080]: a floorplan model generated by the 3D model generation component 118 can be a 2D floorplan model; a 2D floorplan model includes surfaces, e.g., walls, floors, ceilings, etc.,, portals, e.g., door openings, and window openings; [0089]: define a boundary of the object or feature; [0161]: the semantic labeling component 928 performs semantic segmentation and further identifies the defined boundaries of recognized objects in the 2D images); and
cause a presentation of the two-dimensional floor plan at the user interface ([0080]: a floorplan model generated by the 3D model generation component 118 can be a 3D floorplan model or a 2D floorplan model; a 2D floorplan model include surfaces, e.g., walls, floors, ceilings, etc., portals, e.g., door openings, and window openings; generate a 3D model and project the 3D model to a flat 2D surface).
Gausebeck fails to explicitly disclose:
a predefined value;
a two-dimensional map of the room and a map.
In the same field of endeavor, Rejeb Sfar teaches:
a predefined value ([0156]: wall/window/door height or width are predefined);
a two-dimensional map of the room and a map (Fig. 5; [0082]: common scanned floor plans, i.e. maps of a room; [0083]: a 2D floor plan image; structural 2D elements of the plan;
Fig. 10; [0093]: there are two doors 106 in a two-dimensional floor plan of the room);
generating a two-dimensional floor plan of the room (Rejeb Sfar; Fig. 1; [0054]: converting the semantic segmentation into a 2D model representing the layout of the building; Fig. 10; [0093]: there are two doors 106 in a two-dimensional floor plan of the room;
).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gausebeck to include a predefined value, a two-dimensional map of the room and a map, and generating a two-dimensional floor plan of the room, as taught by Rejeb Sfar. The motivation for doing so would have been to improve a solution for processing a 2D floor plan and to assign semantic information to each pixel of the 2D floor plan provided as input, as taught by Rejeb Sfar in paragraphs [0008] and [0077].
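As an illustration of the projection and outer-boundary determination discussed in the claim 1 mapping above, the following sketch, assuming a point-based room model and the SciPy convex-hull routine, projects 3D points onto a horizontal plane and returns an approximate outline; the names are hypothetical, and a convex hull is only an approximation for rooms that are not convex.

import numpy as np
from scipy.spatial import ConvexHull

def floor_plan_outline(points_3d: np.ndarray) -> np.ndarray:
    """Project Nx3 room points onto the X-Y plane and return an outer boundary.

    The projection simply drops the vertical (Z) coordinate; the outer
    boundary is approximated by the convex hull of the projected points.
    """
    points_2d = points_3d[:, :2]     # orthographic top-down projection
    hull = ConvexHull(points_2d)     # indices of the boundary polygon vertices
    return points_2d[hull.vertices]  # ordered outline vertices of the plan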
Regarding claim 2 (Original), Gausebeck in view of Rejeb Sfar discloses the computer system of claim 1, wherein the one or more memory storing instructions that, upon execution by the one or more processors, configure the computer system to (same as rejected in claim 1):
determine, by using the video file portion as a third input to a third machine learning model, that a second image frame of the video file shows a door (Gausebeck; Fig. 9; [0161]: the semantic labeling component 928 can be configured to employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images, e.g., walls, floors, ceilings, windows, doors, furniture, people, buildings, etc..);
determine a pose data set of the camera corresponding to when the second image frame was generated by the camera (Gausebeck; [0035]: the orientation information is determined based on internal measurement data associated with the 2D image generated by an IMU in association with capture of the 2D image; [0100]: identify a capture position and a capture orientation of the 2D/3D panoramic image; [0108]: information regarding the capture positions and orientations of the respective 2D images; [0123]: information regarding capture position and orientation of the 2D image, information regarding capture parameters of the capture device that generated the 2D image);
generate a cluster of pose data sets of the camera, the cluster including the pose data set (Gausebeck; [0036]: the one or more image capture parameters are selected from a group; [0100]: identify a capture position and a capture orientation of the 2D/3D panoramic image; [0108]: information regarding the capture positions and orientations of the respective 2D images; [0123]: information regarding capture position and orientation of the 2D image, information regarding capture parameters of the capture device that generated the 2D image; [0156]: a group of two or more related images);
determine, from the video file, image frames that correspond to the cluster (Gausebeck; [0108]: combine two or more 2D images into a single, larger field-of-view image; [0111]: the initial derived depth information and calibrated capture positions/orientations of the respective images; [0129]: facilitate aligning images captured at different capture positions and orientations; [0156]: a group of two or more related images; the one or more other images are related in the group); and
associate the image frames with the room, the image frames forming the video file portion (Gausebeck; Fig. 2; [0084]: as new images of the living room are captured, received and aligned with previously aligned image data based on depth data derived for the respective images, the 3D model 200 can be dynamically updated; Fig. 3; [0085]: 2D image data of the portion of the house depicted in the 3D floorplan model was captured by a camera held and operated by a user as the user walked from room to room and took pictures of the house from different perspectives within the rooms; [0108]: combine two or more 2D images into a single, larger field-of-view image; [0129]: facilitate aligning images captured at different capture positions and/or orientations relative to one another in a three-dimensional coordinate space).
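As an illustration of grouping image frames by clustered camera poses, as recited in claim 2 and mapped above, the following sketch assumes per-frame capture positions and uses k-means clustering (an assumed choice, not taken from Gausebeck) to associate frame indices with rooms; all names are hypothetical.

import numpy as np
from sklearn.cluster import KMeans

def group_frames_by_room(camera_positions: np.ndarray, n_rooms: int) -> dict[int, list[int]]:
    """Cluster Nx3 per-frame capture positions and return frame indices per cluster."""
    labels = KMeans(n_clusters=n_rooms, n_init=10).fit_predict(camera_positions)
    clusters: dict[int, list[int]] = {}
    for frame_idx, room_id in enumerate(labels):
        # Each cluster of poses is treated as one room; its frames form a video file portion.
        clusters.setdefault(int(room_id), []).append(frame_idx)
    return clusters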
Regarding claim 3 (Original), Gausebeck in view of Rejeb Sfar discloses the computer system of claim 1, wherein the room and the two-dimensional floor plan are a first room and a first two-dimensional floor plan (Gausebeck; [0078]: the floorplan model contains locations of boundary edges for each given surface; [0080]: a floorplan model generated by the 3D model generation component 118 can be a 3D floorplan model or a 2D floorplan model; a 2D floorplan model includes surfaces, e.g., walls, floors, ceilings, etc., portals, e.g., door openings, and window openings; Fig. 2; [0084]: capture and receive the new images of the living room; Fig. 3; [0085]: 2D image data of the portion of the house was captured by a camera held and operated by a user as the user walked from room to room and took pictures of the house from different perspectives within the rooms), and wherein the one or more memory storing instructions that, upon execution by the one or more processors, configure the computer system to (same as rejected in claim 1):
determine a first location of a door in the first two-dimensional floor plan (Rejeb Sfar; Fig. 10; [0093]: two doors 106;
);
determine a second location of the door in a second two-dimensional floor plan generated for a second room (Gausebeck; Fig. 9; [0161]: the semantic labeling component 928 can be configured to employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images, e.g., walls, floors, ceilings, windows, doors, furniture, people, buildings, etc.); and
align, by at least matching the first location and the second location, the first two-dimensional floor plan and the second two-dimensional floor plan (Gausebeck; [0044]: align the 2D images to one another based on the 3D data; [0048]: align the 2D images to one another based on the depth data to generate a 3D model of the object or environment; [0051]: a spatial alignment; [0062]: generate an alignment between the 2D images and the features; [0129]: facilitate aligning images captured at different capture positions and/or orientations relative to one another in a three-dimensional coordinate space).
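As an illustration of aligning two floor plans by matching a shared door location, as recited in claim 3 and mapped above, the following sketch translates one plan so that its door coincides with the corresponding door in the other plan; the door locations are assumed to be known 2D points, and all names are hypothetical.

import numpy as np

def align_by_door(plan_b_points: np.ndarray,
                  door_in_plan_a: np.ndarray,
                  door_in_plan_b: np.ndarray) -> np.ndarray:
    """Translate plan B so its door location matches the same door in plan A."""
    offset = door_in_plan_a - door_in_plan_b   # 2D translation between the two plans
    return plan_b_points + offset              # aligned copy of plan B's points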
Regarding claim 4 (Currently Amended), Gausebeck discloses a computer-implemented method (Fig. 1; [0061]: a system 100 facilitates deriving 3D data from 2D image data and generating reconstructed 3D models based on the 3D data and the 2D image data; [0062]: a computing device 104 receives and processes 2D image data 102; process the 2D image data 102 to derive 3D data; generate an alignment between the 2D images and the features) comprising:
generating, by at least using a video file portion of a video file as a first input to a first machine learning model, a three-dimensional model of a room ([0066]: the machine learning techniques generate the derived 3D data 116 for received 2D image data 102; [0074]: use the machine learning techniques to determine the derived 3D data 116; Fig. 2; [0084]: generate the 3D model of a room by the 3D model generation component 118; generate and present 3D model 200 to a user at the client device;
; Fig. 3; [0085]: the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300; Fig. 4; [0086]: generate an 3D dollhouse view representation 400 of a model by the 3D model generation component 118 based on image data captured of the environment; Fig. 6; [0117]: the system employs a 3D-from-2D convolutional neural network model to derive 3D data from the panoramic image; [0118]: re-project a portion of the panoramic image processed by the preceding layer in association with deriving depth data for the panoramic image; [0120]: generate a 3D model based on depth data associated with a region; Fig. 8; [0122]: generate reconstructed 3D models based on the 3D data);
generating, by at least using the video file portion as a second input to a second machine learning model, a semantic segmentation of the room (Fig. 9; [0161]: the semantic labeling component 928 employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images; the semantic labeling component 928 performs semantic segmentation and further identifies defined boundaries of recognized objects in the 2D images; the semantic label/segmentation information associated with a 2D image are used as input to one or more augmented 3D-from-2D models along with the 2D image to generate derived 3D data 116), the semantic segmentation indicating that an object having an object type is shown in a first image frame of the video file portion ([0161]: the semantic labeling component 928 employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images, e.g., walls, floors, ceilings, windows, doors, furniture, people, buildings, etc.; indicate windows by labels; the semantic labels/boundaries associated with features included in a 2D image are characterized);
determining, based at least in part on the semantic segmentation, a three-dimensional representation of the object in the three-dimensional model ([0078]: a floorplan model is a simplified representation of surfaces, e.g., walls, floors, ceilings, etc., portals, e.g., door openings, and window openings associated with an interior environment; [0079]: the 3D model generation component 118 employs common architectural notation to illustrate architectural features of an architectural structure, e.g., doors, windows, fireplaces, length of walls, other features of a building, etc.; a floorplan model comprises a series of lines in 3D space which represent intersections of walls and/or floors, outlines of doorways and/or windows, edges of steps, outlines of other objects of interest; [0080]: Lines for floors, walls and ceilings are dimensioned, e.g., annotated, with an associated size; Fig. 2; [0084]: generate the 3D model of a room by the 3D model generation component 118; generate and present 3D model 200 to a user at the client device;
; Fig. 3; [0085]: the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300; Fig. 4; [0086]: 3D dollhouse view representation 400 of a model includes windows and door as illustrated in Fig. 4; [0161]: the semantic labeling component 928 employs one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images, e.g., walls, floors, ceilings, windows, doors, furniture, people, buildings, etc.; the semantic labeling component 928 assigns labels to the recognized objects identifying the object; the semantic label/segmentation information; [0252]: the training data development component 3316 extracts additional scene information associated with a 3D space model, such as semantic labels included in the indexed semantic label data 3310; train a 3D-from-2D neural network model to predict semantic labels, e.g. wall, ceiling, door, windows, etc.; [0255]: the training data development component 3316 determines semantic labels for the images and synthetic 3D data for the 2D image);
based at least in part on determining that the three-dimensional representation of the object is included in the three-dimensional model, correcting the three-dimensional model by at least setting a property of the three-dimensional representation to a value ([0050]: the final 3D reconstruction was generated using a more precise alignment process relative to an alignment process used to generate the initial 3D reconstruction; [0082]: the 3D model is rendered at the user device 130 and updated in real-time based on new image data; look for potential alignment errors, assess scan quality; Fig. 2; [0084]: the 3D model 200 of a living room is dynamically updated and corrected based on depth data derived for the respective images; [0085]: the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300; [0112]: apply pixel color data to the depth map; perform reverse-projecting of color data from colored point clouds or depth maps to create a single 2D panoramic image; the stitching component 508 fills in any possible small holes in the panorama with neighboring color data, thereby unifying exposure data across the boundaries between the respective 2D images; determine more precise 3D data; [0130]: fill in the gaps where the derived 3D data is lacking; facilitate aligning the 2D image and associated derived 3D data 116 with other 2D images and associated derived 3D data sets; [0149]: determine data about the photometric match quality between the images at various depths; [0161]: facilitate the alignment process in association with 3D model generation; [0171]: employ average depth measurement values for respective pixels, features, areas/regions etc., of a 2D image; [0198]: the intermediate versions are generated and rendered with relatively little processing time, enabling a real-time 3D reconstruction process that provides continually updated rough 3D version of a scene during the capture process); and
generating, after the three-dimensional model is corrected, a two-dimensional model of the room by at least projecting the three-dimensional model on a two-dimensional plane (Gausebeck; [0077]: generate 2D representations of the 3D model; [0080]: a floorplan model generated by the 3D model generation component 118 is a 2D floorplan model; [0087]: the representations include 2D images associated with the 3D model; [0092]: a representation of a 3D model generated in floor plan mode can appear 2D or substantially 2D; [0094]: a visualization of the 3D model includes 2D images and mixed 2D/3D representations of the 3D model; [0112]: perform reverse-projecting of color data from colored point clouds or depth maps to create a single 2D panoramic image); and
generating a two-dimensional floor plan of the room based at least in part on two-dimensional model ([0077]: generate 2D representations of the 3D model; [0087]: the representations include 2D images associated with the 3D model; [0092]: a representation of a 3D model generated in floor plan mode can appear 2D or substantially 2D; [0094]: a visualization of the 3D model includes 2D images and mixed 2D/3D representations of the 3D model; [0112]: perform reverse-projecting of color data from colored point clouds or depth maps to create a single 2D panoramic image).
Gausebeck fails to explicitly disclose:
a predefined value;
a two-dimensional map of the room and a map.
In the same field of endeavor, Rejeb Sfar teaches:
a predefined value ([0156]: wall/window/door height or width are predefined);
a two-dimensional map of the room and a map (Fig. 5; [0082]: common scanned floor plans, i.e. maps of a room; [0083]: a 2D floor plan image; structural 2D elements of the plan;
Fig. 10; [0093]: there are two doors 106 in a two-dimensional floor plan of the room);
generating a two-dimensional floor plan of the room (Rejeb Sfar; Fig. 1; [0054]: converting the semantic segmentation into a 2D model representing the layout of the building; Fig. 10; [0093]: there are two doors 106 in a two-dimensional floor plan of the room;
).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gausebeck to include a predefined value, a two-dimensional map of the room and a map, and generating a two-dimensional floor plan of the room, as taught by Rejeb Sfar. The motivation for doing so would have been to improve a solution for processing a 2D floor plan and to assign semantic information to each pixel of the 2D floor plan provided as input, as taught by Rejeb Sfar in paragraphs [0008] and [0077].
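As an illustration of using a predefined value per object type, consistent with Rejeb Sfar's predefined wall/window/door height or width ([0156]), the following sketch looks up a predefined dimension for a given object type and property and returns it as the value to assign; the specific numbers and names are hypothetical assumptions, not values from either reference.

# Hypothetical table of predefined dimensions, in meters, keyed by object type.
PREDEFINED_DIMENSIONS_M = {
    "wall":   {"height": 2.4},
    "door":   {"height": 2.0, "width": 0.9},
    "window": {"height": 1.2, "width": 1.0},
}

def predefined_value(object_type: str, prop: str) -> float:
    """Return the predefined value for a property of a given object type."""
    return PREDEFINED_DIMENSIONS_M[object_type][prop]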
Regarding claim 5 (Original), Gausebeck in view of Rejeb Sfar discloses the computer-implemented method of claim 4, wherein the video file portion, the room, and the two-dimensional floor plan are a first video file portion, a first room, and a first two-dimensional floor plan (Gausebeck; [0078]: the floorplan model contains locations of boundary edges for each given surface; [0080]: a floorplan model generated by the 3D model generation component 118 can be a 2D floorplan model; a 2D floorplan model includes surfaces, e.g., walls, floors, ceilings, etc., portals, e.g., door openings, and window openings; Fig. 2; [0084]: generate the 3D model of a room by the 3D model generation component 118; generate and present 3D model 200 to a user at the client device; Fig. 3; [0085]: the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300; [0089]: define a boundary of the object or feature), and further comprising:
receiving the video file, the video file showing a space that includes the first room and a second room (Gausebeck; [0082]: receive the 2D image data; Fig. 2; [0084]: capture and receive the new images of the living room; Fig. 3; [0085]: the 2D image data of the portion of the house was captured by a camera held and operated by a user as the user walked from room to room; Fig. 6; [0117]: a processor receives a panoramic image; [0141]: the 2D image data 102 includes video data 902; the sequential frames of video are captured in association with movement of the video camera; [0151]: automatically classify respective frames of video);
receiving a user input via a user interface (Gausebeck; Fig. 3; [0085]: the 2D image data of the portion of the house was captured by a camera held and operated by a user as the user walked from room to room; Fig. 1; [0087]: the user device 130 receives user input; generate representations of the 3D model based on the user input; [0114]: receive the user input; [0120]: the request is received from a user device based on user input; Fig. 14; [0179]: user device 1402 facilitates capturing 2D images by user input), the user input indicating a request to generate a two-dimensional representation of the space (Gausebeck; Fig. 3; [0085]: the 2D image data of the portion of the house was captured and generated by a camera held and operated by a user as the user walked from room to room; Fig. 1; [0087]: the user device 130 receives user input; generate representations of the 3D model based on the user input; the representations include 2D images associated with the 3D model; [0114]: the user input is received that identifies or indicates the desired portion for cropping; Fig. 14; [0174]: one or more cameras 1404 capture 2D images by user input;
Fig. 14; [0179]: the user device 1402 facilitates capturing 2D images; the representation of the 3D models include 2D floorplan models);
generating, by at least using a second video file portion of the video file showing a second room, a second two-dimensional floor plan of the second room (Rejeb Sfar; Fig. 10; [0093]: two doors 106;
);
generating, absent additional user input related to aligning two-dimensional floor plans, the two-dimensional representation of the space based at least in part on an alignment of the first two-dimensional floor plan and the second two-dimensional floor plan (Gausebeck; [0044]: align the 2D images to one another based on the 3D data; [0048]: align the 2D images to one another based on the depth data to generate a 3D model of the object or environment; [0051]: a spatial alignment; [0062]: generate an alignment between the 2D images and the features; [0129]: facilitate aligning images captured at different capture positions and/or orientations relative to one another in a three-dimensional coordinate space); and
causing a presentation of the two-dimensional representation at the user interface (Gausebeck; [0080]: a floorplan model generated by the 3D model generation component 118 can be a 3D floorplan model or a 2D floorplan model; a 2D floorplan model include surfaces, e.g., walls, floors, ceilings, etc., portals, e.g., door openings, and window openings; generate a 3D model and project the 3D model to a flat 2D surface).
Regarding claim 6 (Original), Gausebeck in view of Rejeb Sfar discloses the computer-implemented method of claim 4, wherein the video file portion, the room, and the two-dimensional floor plan are a first video file portion, a first room, and a first two-dimensional floor plan, and further (Gausebeck; [0078]: the floorplan model contains locations of boundary edges for each given surface; [0080]: a floorplan model generated by the 3D model generation component 118 can be a 2D floorplan model; a 2D floorplan model includes surfaces, e.g., walls, floors, ceilings, etc., portals, e.g., door openings, and window openings; Fig. 2; [0084]: generate the 3D model of a room by the 3D model generation component 118; generate and present 3D model 200 to a user at the client device; Fig. 3; [0085]: the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300; [0089]: define a boundary of the object or feature) comprising:
determining, by at least using the video file as a third input to a third machine learning model and based at least in part on pose data of a camera that generated the video file, that the first video file portion corresponds to the first room and that a second video file portion corresponds to a second room (Gausebeck; Fig. 9; [0161]: the semantic labeling component 928 can be configured to employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images, e.g., walls, floors, ceilings, windows, doors, furniture, people, buildings, etc..); and
generating a second two-dimensional floor plan of the second room based at least in part on the second video file portion (Rejeb Sfar; Fig. 10; [0093]: two doors 106 and second 2d floor plan as illustrated in Fig. 10;
).
The same motivation as set forth for claim 4 applies here.
Regarding claim 7 (Original), Gausebeck in view of Rejeb Sfar discloses the computer-implemented method of claim 6, further comprising:
determining, based at least in part on a third output of the third machine learning model in response to the third input, that a door is common to the first room and the second room (Gausebeck; [0079]: the 3D model generation component 118 employs common architectural notation to illustrate architectural features of an architectural structure, e.g., doors, windows, fireplaces, length of walls, other features of a building, etc.; [0161]: employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images, e.g., walls, floors, ceilings, windows, doors, furniture, people, buildings, etc.);
determining, based at least in part on a projection of the three-dimensional model on a two-dimensional plane, door data associated with the door (Gausebeck; [0078]: a floorplan model contains locations of boundary edges for each given surface, portal, e.g., door opening, and window opening; [0161]: walls, floors, ceilings, windows, doors, furniture, people, buildings; [0220]: doors and windows); and
generating a two-dimensional representation of a space by at least aligning the first two-dimensional floor plan and the second two-dimensional floor plan based at least in part on the door data (Gausebeck; [0044]: align the 2D images to one another based on the 3D data; [0048]: align the 2D images to one another based on the depth data to generate a 3D model of the object or environment; [0051]: a spatial alignment; [0062]: generate an alignment between the 2D images and the features; [0129]: facilitate aligning images captured at different capture positions and/or orientations relative to one another in a three-dimensional coordinate space).
Regarding claim 8 (Original), Gausebeck in view of Rejeb Sfar discloses the computer-implemented method of claim 4, wherein the object and the object type are a first object and a first object type (Gausebeck; [0161]: walls, floors, ceilings, windows, doors, furniture, people, buildings), and further comprising:
determining that the semantic segmentation indicates a second object having a second object type is show in the first image frame (Gausebeck; Fig. 9; [0161]: the semantic labeling component 928 employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images; the semantic labeling component 928 performs semantic segmentation and further identifies defined boundaries of recognized objects in the 2D images; the semantic label/segmentation information associated with a 2D image are used as input to one or more augmented 3D-from-2D models along with the 2D image to generate derived 3D data 116);
determining a value of a property of the second object based at least in part on the three-dimensional model (Gausebeck; [0171]: the 3D data optimization component 1302 employ average depth measurement values for respective pixels, super pixels, features, areas/regions etc., of a 2D image that averages the corresponding depth measurement values reflected in the initial depth data and the derived 3D data 116); and
setting, based at least in part on the second object type, the predefined value to be equal to the value (Gausebeck; [0171]: the 3D data optimization component 1302 maps depth measurements).
Gausebeck in view of Rejeb Sfar further discloses:
setting, based at least in part on the second object type, the predefined value to be equal to the value (Rejeb Sfar; [0156]: wall/window/door height or width are predefined).
The same motivation as set forth for claim 4 applies here.
Regarding claim 9 (Original), Gausebeck in view of Rejeb Sfar discloses the computer-implemented method of claim 4, wherein the object and the object type are a first object and a first object type (Gausebeck; [0161]: walls, floors, ceilings, windows, doors, furniture, people, buildings), and further comprising:
determining that the semantic segmentation indicates a second object having a second object type is show in one or more image frames of the video file portion (Gausebeck; Fig. 9; [0161]: the semantic labeling component 928 employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images; the semantic labeling component 928 performs semantic segmentation and further identifies defined boundaries of recognized objects in the 2D images; the semantic label/segmentation information associated with a 2D image are used as input to one or more augmented 3D-from-2D models along with the 2D image to generate derived 3D data 116); and
determining, based at least in part on the second object type, that an update to a property of the second object is to be excluded from the correcting of the three-dimensional model (Gausebeck; [0050]: the final 3D reconstruction was generated using a more precise alignment process relative to an alignment process used to generate the initial 3D reconstruction; [0082]: the 3D model is rendered at the user device 130 and updated in real-time based on new image data; look for potential alignment errors, assess scan quality; Fig. 2; [0084]: the 3D model 200 of a living room is dynamically updated and corrected based on depth data derived for the respective images).
Regarding claim 10 (Original), Gausebeck in view of Rejeb Sfar discloses the computer-implemented method of claim 4, further comprising:
determining that the three-dimensional model includes missing data (Gausebeck; Fig. 2; [0084]: the 3D model 200 as depicted is currently under construction and includes missing image data);
determining that the missing data corresponds to at least the first image frame (Gausebeck; Fig. 4; [0084]: as new images of the living room are captured, received and aligned with previously aligned image data based on depth data derived for the respective images, the 3D model 200 is dynamically updated); and
determining that the predefined value is to be used for the property of the object based at least in part on the semantic segmentation indicating that the object has the object type and is shown in the first image frame (Gausebeck; Fig. 4; [0084]: as new images of the living room are captured, received and aligned with previously aligned image data based on depth data derived for the respective images, the 3D model 200 is dynamically updated; [0112]: the stitching component 508 fills in any possible small holes in the panorama with neighboring color data, thereby unifying exposure data across the boundaries between the respective 2D images).
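As an illustration of the missing-data handling recited in claim 10 and mapped above, the following sketch fills gaps in a depth map with a predefined value only where the semantic segmentation labels the pixels as the relevant object type; the function name and the use of NaN to mark missing data are assumptions for illustration only.

import numpy as np

def fill_missing_depth(depth_map: np.ndarray,
                       seg_map: np.ndarray,
                       object_label: int,
                       predefined_depth: float) -> np.ndarray:
    """Fill missing depth values on a labeled object with a predefined depth."""
    filled = depth_map.copy()
    # Gaps are pixels with no derived depth that the segmentation assigns to the object type.
    gaps = np.isnan(filled) & (seg_map == object_label)
    filled[gaps] = predefined_depth
    return filled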
Regarding claim 13 (Currently Amended), Gausebeck discloses one or more computer-readable storage media storing instructions that, upon execution on a system, cause the system to perform operations (Fig. 1; [0061]: a system 100 facilitates deriving 3D data from 2D image data and generating reconstructed 3D models based on the 3D data and the 2D image data; [0062]: a computing device 104 receives and processes 2D image data 102; process the 2D image data 102 to derive 3D data; generate an alignment between the 2D images and the features; [0063]: one memory and at least one processor; one memory stores computer-executable instructions; the processors execute the computer-executable instructions; [0178]: one memory stores the computer-executable instructions executed by the at least one processor; Fig. 35; [0259]: the system memory 3516 includes volatile memory 3520 and nonvolatile memory 3522; RAM) comprising:
generating, by at least using a video file portion of a video file as a first input to a first machine learning model, a three-dimensional model of a room ([0066]: the machine learning techniques generate the derived 3D data 116 for received 2D image data 102; [0074]: use the machine learning techniques to determine the derived 3D data 116; Fig. 2; [0084]: generate the 3D model of a room by the 3D model generation component 118; generate and present 3D model 200 to a user at the client device;
; Fig. 3; [0085]: the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300; Fig. 4; [0086]: generate an 3D dollhouse view representation 400 of a model by the 3D model generation component 118 based on image data captured of the environment; Fig. 6; [0117]: the system employs a 3D-from-2D convolutional neural network model to derive 3D data from the panoramic image; [0118]: re-project a portion of the panoramic image processed by the preceding layer in association with deriving depth data for the panoramic image; [0120]: generate a 3D model based on depth data associated with a region; Fig. 8; [0122]: generate reconstructed 3D models based on the 3D data);
generating, by at least using the video file portion as a second input to a second machine learning model, a semantic segmentation of the room (Fig. 9; [0161]: the semantic labeling component 928 employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images; the semantic labeling component 928 performs semantic segmentation and further identifies defined boundaries of recognized objects in the 2D images; the semantic label/segmentation information associated with a 2D image are used as input to one or more augmented 3D-from-2D models along with the 2D image to generate derived 3D data 116), the semantic segmentation indicating that an object having an object type is shown in a first image frame of the video file portion ([0161]: the semantic labeling component 928 employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images, e.g., walls, floors, ceilings, windows, doors, furniture, people, buildings, etc.; indicate windows by labels; the semantic labels/boundaries associated with features included in a 2D image are characterized);
determining, based at least in part on the semantic segmentation, a three-dimensional representation of the object in the three-dimensional model ([0078]: a floorplan model is a simplified representation of surfaces, e.g., walls, floors, ceilings, etc., portals, e.g., door openings, and window openings associated with an interior environment; [0079]: the 3D model generation component 118 employs common architectural notation to illustrate architectural features of an architectural structure, e.g., doors, windows, fireplaces, length of walls, other features of a building, etc.; a floorplan model comprises a series of lines in 3D space which represent intersections of walls and/or floors, outlines of doorways and/or windows, edges of steps, outlines of other objects of interest; [0080]: Lines for floors, walls and ceilings are dimensioned, e.g., annotated, with an associated size; Fig. 4; [0086]: 3D dollhouse view representation 400 of a model includes windows and door as illustrated in Fig. 4; [0161]: the semantic labeling component 928 employs one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images, e.g., walls, floors, ceilings, windows, doors, furniture, people, buildings, etc.; the semantic labeling component 928 assigns labels to the recognized objects identifying the object; the semantic label/segmentation information; [0252]: the training data development component 3316 extracts additional scene information associated with a 3D space model, such as semantic labels included in the indexed semantic label data 3310; train a 3D-from-2D neural network model to predict semantic labels, e.g. wall, ceiling, door, windows, etc.; [0255]: the training data development component 3316 determines semantic labels for the images and synthetic 3D data for the 2D image);
based at least in part on determining that the three-dimensional representation of the object is included in the three-dimensional model, correcting the three-dimensional model by at least setting a property of the three-dimensional representation to a value ([0050]: the final 3D reconstruction was generated using a more precise alignment process relative to an alignment process used to generate the initial 3D reconstruction; [0082]: the 3D model is rendered at the user device 130 and updated in real-time based on new image data; look for potential alignment errors, assess scan quality; Fig. 2; [0084]: the 3D model 200 of a living room is dynamically updated and corrected based on depth data derived for the respective images; [0085]: the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300; [0112]: apply pixel color data to the depth map; perform reverse-projecting of color data from colored point clouds or depth maps to create a single 2D panoramic image; the stitching component 508 fills in any possible small holes in the panorama with neighboring color data, thereby unifying exposure data across the boundaries between the respective 2D images; determine more precise 3D data; [0130]: fill in the gaps where the derived 3D data is lacking; facilitate aligning the 2D image and associated derived 3D data 116 with other 2D images and associated derived 3D data sets; [0149]: determine data about the photometric match quality between the images at various depths; [0161]: facilitate the alignment process in association with 3D model generation; [0171]: employ average depth measurement values for respective pixels, features, areas/regions etc., of a 2D image; [0198]: the intermediate versions are generated and rendered with relatively little processing time, enabling a real-time 3D reconstruction process that provides continually updated rough 3D version of a scene during the capture process);
generating, after the three-dimensional model is corrected, a two-dimensional model of the room by at least projecting the three-dimensional model on a two-dimensional plane (Gausebeck; [0077]: generate 2D representations of the 3D model; [0080]: a floorplan model generated by the 3D model generation component 118 is a 2D floorplan model; [0087]: the representations include 2D images associated with the 3D model; [0092]: a representation of a 3D model generated in floor plan mode can appear 2D or substantially 2D; [0094]: a visualization of the 3D model includes 2D images and mixed 2D/3D representations of the 3D model; [0112]: perform reverse-projecting of color data from colored point clouds or depth maps to create a single 2D panoramic image); and
generating a two-dimensional floor plan of the room based at least in part on the two-dimensional model ([0077]: generate 2D representations of the 3D model; [0087]: the representations include 2D images associated with the 3D model; [0092]: a representation of a 3D model generated in floor plan mode can appear 2D or substantially 2D; [0094]: a visualization of the 3D model includes 2D images and mixed 2D/3D representations of the 3D model; [0112]: perform reverse-projecting of color data from colored point clouds or depth maps to create a single 2D panoramic image).
Gausebeck fails to explicitly disclose:
a predefined value;
a two-dimensional map of the room and a map;
In the same field of endeavor, Rejeb Sfar teaches:
a predefined value ([0156]: wall/window/door height or width are predefined);
a two-dimensional map of the room and a map (Fig. 5; [0082]: common scanned floor plans, i.e. maps of a room; [0083]: a 2D floor plan image; structural 2D elements of the plan;
Fig. 10; [0093]: there are two doors 106 in a two-dimensional floor plan of the room);
generating a two-dimensional floor plan of the room (Rejeb Sfar; Fig. 1; [0054]: converting the semantic segmentation into a 2D model representing the layout of the building; Fig. 10; [0093]: there are two doors 106 in a two-dimensional floor plan of the room).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gausebeck to include a predefined value; a two-dimensional map of the room and a map; and generating a two-dimensional floor plan of the room, as taught by Rejeb Sfar. The motivation for doing so would have been to improve a solution for processing a 2D floor plan and to assign semantic information to each pixel of the input 2D floor plan, as taught by Rejeb Sfar in paragraphs [0008] and [0077].
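For illustration of the mapped functionality only, the following minimal sketch (in Python) shows one way the correction and projection steps could be realized: a window identified by semantic segmentation has a property set to a predefined value, and the corrected 3D model is then projected onto a two-dimensional plane. The sketch is not drawn from Gausebeck or Rejeb Sfar; all names, values, and data structures are hypothetical assumptions.

# Illustrative sketch only; hypothetical names and values, not from the cited references.
from dataclasses import dataclass, field

PREDEFINED_WINDOW_HEIGHT = 1.2  # assumed predefined value for a window property

@dataclass
class Element3D:
    label: str        # semantic label from segmentation, e.g. "wall", "window", "door"
    footprint: list   # (x, y) vertices of the element in plan view
    height: float     # vertical extent of the element

@dataclass
class Model3D:
    elements: list = field(default_factory=list)

def correct_model(model: Model3D) -> Model3D:
    # If the semantic segmentation placed a window in the 3D model,
    # set that window's height property to the predefined value.
    for elem in model.elements:
        if elem.label == "window":
            elem.height = PREDEFINED_WINDOW_HEIGHT
    return model

def project_to_plane(model: Model3D) -> list:
    # Project the corrected 3D model onto the ground plane by keeping only the
    # plan-view footprints; the result stands in for a 2D model of the room
    # from which a floor plan can be generated.
    return [{"label": e.label, "outline": e.footprint} for e in model.elements]

model = Model3D(elements=[Element3D("wall", [(0, 0), (5, 0)], 2.7),
                          Element3D("window", [(1, 0), (2, 0)], 0.9)])
floor_plan_2d = project_to_plane(correct_model(model))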
Regarding claim 14 (Previously Presented), Gausebeck in view of Rejeb Sfar discloses the one or more computer-readable storage media of claim 13, wherein the operations further comprise:
determining, by at least using the video file as a third input to a third machine learning model, that a door is shown in the video file portion (Gausebeck; Fig. 9; [0161]: the semantic labeling component 928 can be configured to employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images, e.g., walls, floors, ceilings, windows, doors, furniture, people, buildings, etc.); and
generating an updated two-dimensional model by at least updating a two-dimensional representation of the door in the two-dimensional model (Gausebeck; [0050]: the final 3D reconstruction was generated using a more precise alignment process relative to an alignment process used to generate the initial 3D reconstruction; [0082]: the 3D model is rendered at the user device 130 and updated in real-time based on new image data; look for potential alignment errors, assess scan quality; Fig. 2; [0084]: the 3D model 200 of a living room is dynamically updated and corrected based on depth data derived for the respective images), wherein the two-dimensional floor plan is generated based at least in part on the updated two-dimensional map (Gausebeck; [0078]: the floorplan model contains locations of boundary edges for each given surface; [0080]: a floorplan model generated by the 3D model generation component 118 can be a 2D floorplan model; a 2D floorplan model includes surfaces, e.g., walls, floors, ceilings, etc., portals, e.g., door openings, and window openings; [0089]: define a boundary of the object or feature; [0161]: the semantic labeling component 928 performs semantic segmentation and further identifies the defined boundaries of recognized objects in the 2D images).
Gausebeck in view of Rejeb Sfar further discloses a two-dimensional map of the room and a map (Rejeb Sfar; Fig. 5; [0082]: common scanned floor plans, i.e. maps of a room; [0083]: a 2D floor plan image; structural 2D elements of the plan;
Fig. 10; [0093]: there are two doors 106 in a two-dimensional floor plan of the room).
The same motivation as for claim 13 applies here.
Regarding claim 15 (Previously Presented), Gausebeck in view of Rejeb Sfar discloses the one or more computer-readable storage media of claim 13, wherein the operations further comprise:
determining a correction to be performed on the two-dimensional map, the correction associated with the object type indicated by the semantic segmentation (Gausebeck; [0050]: the final 3D reconstruction was generated using a more precise alignment process relative to an alignment process used to generate the initial 3D reconstruction; [0082]: the 3D model is rendered at the user device 130 and updated in real-time based on new image data; look for potential alignment errors, assess scan quality; Fig. 2; [0084]: the 3D model 200 of a living room is dynamically updated and corrected based on depth data derived for the respective images; [0085]: the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300; [0112]: apply pixel color data to the depth map; perform reverse-projecting of color data from colored point clouds or depth maps to create a single 2D panoramic image; the stitching component 508 fills in any possible small holes in the panorama with neighboring color data, thereby unifying exposure data across the boundaries between the respective 2D images; determine more precise 3D data; [0149]: determine data about the photometric match quality between the images at various depths; [0161]: facilitate the alignment process in association with 3D model generation; [0171]: employ average depth measurement values for respective pixels, features, areas/regions etc., of a 2D image);
determining a two-dimensional representation of the object in the two-dimensional model (Gausebeck; [0080]: a floorplan model generated by the 3D model generation component 118 can be a 3D floorplan model or a 2D floorplan model; a 2D floorplan model includes surfaces, e.g., walls, floors, ceilings, etc., portals, e.g., door openings, and window openings; generate a 3D model and project the 3D model to a flat 2D surface); and
generating an updated two-dimensional projection by at least updating the two-dimensional representation based at least in part on the correction, wherein the two-dimensional floor plan is generated based at least in part on the updated two-dimensional map (Gausebeck; [0050]: the final 3D reconstruction was generated using a more precise alignment process relative to an alignment process used to generate the initial 3D reconstruction; [0082]: the 3D model is rendered at the user device 130 and updated in real-time based on new image data; look for potential alignment errors, assess scan quality; Fig. 2; [0084]: the 3D model 200 of a living room is dynamically updated and corrected based on depth data derived for the respective images; [0085]: the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300; [0112]: apply pixel color data to the depth map; perform reverse-projecting of color data from colored point clouds or depth maps to create a single 2D panoramic image; the stitching component 508 fills in any possible small holes in the panorama with neighboring color data, thereby unifying exposure data across the boundaries between the respective 2D images; determine more precise 3D data; [0149]: determine data about the photometric match quality between the images at various depths; [0161]: facilitate the alignment process in association with 3D model generation; [0171]: employ average depth measurement values for respective pixels, features, areas/regions etc., of a 2D image).
Gausebeck in view of Rejeb Sfar further discloses a two-dimensional map of the room and a map (Rejeb Sfar; Fig. 5; [0082]: common scanned floor plans, i.e. maps of a room; [0083]: a 2D floor plan image; structural 2D elements of the plan;
Fig. 10; [0093]: there are two doors 106 in a two-dimensional floor plan of the room).
The same motivation as for claim 13 applies here.
Regarding claim 16 (Previously Presented), Gausebeck in view of Rejeb Sfar discloses the one or more computer-readable storage media of claim 13, wherein the operations further comprise:
associating a three-dimensional representation of the object in the three-dimensional model with a label, the label including the object type (Gausebeck; Fig. 9; [0161]: the semantic labeling component 928 employs one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images; the semantic labeling component 928 performs semantic segmentation and further identifies defined boundaries of recognized objects in the 2D images; the semantic label/segmentation information associated with a 2D image is used as input to one or more augmented 3D-from-2D models along with the 2D image to generate derived 3D data 116);
generating a two-dimensional projection of the three-dimensional model on a two-dimensional plane, the two-dimensional projection comprising the two-dimensional model and including a two-dimensional representation of the object (Gausebeck; [0077]: generate 2D representations of the 3D model; [0087]: the representations include 2D images associated with the 3D model; [0092]: a representation of a 3D model generated in floor plan mode can appear 2D or substantially 2D; [0094]: a visualization of the 3D model includes 2D images and mixed 2D/3D representations of the 3D model; [0112]: perform reverse-projecting of color data from colored point clouds or depth maps to create a single 2D panoramic image);
associating the two-dimensional representation with the label (Gausebeck; Fig. 9; [0161]: the semantic labeling component 928 employs one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images; the semantic labeling component 928 performs semantic segmentation and further identifies defined boundaries of recognized objects in the 2D images; the semantic label/segmentation information associated with a 2D image is used as input to one or more augmented 3D-from-2D models along with the 2D image to generate derived 3D data 116);
determining a correction to be performed on the two-dimensional projection based at least in part on the label (Gausebeck; [0082]: the 3D model is rendered at the user device 130 and updated in real-time based on new image data; look for potential alignment errors, assess scan quality; Fig. 2; [0084]: the 3D model 200 of a living room is dynamically updated and corrected based on depth data derived for the respective images); and
generating an updated two-dimensional projection by at least updating the two-dimensional representation based at least in part on the correction, wherein the two-dimensional floor plan is generated based at least in part on the updated two-dimensional projection (Gausebeck; [0050]: the final 3D reconstruction was generated using a more precise alignment process relative to an alignment process used to generate the initial 3D reconstruction; [0082]: the 3D model is rendered at the user device 130 and updated in real-time based on new image data; look for potential alignment errors, assess scan quality; Fig. 2; [0084]: the 3D model 200 of a living room is dynamically updated and corrected based on depth data derived for the respective images; [0085]: the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300; [0112]: apply pixel color data to the depth map; perform reverse-projecting of color data from colored point clouds or depth maps to create a single 2D panoramic image; the stitching component 508 fills in any possible small holes in the panorama with neighboring color data, thereby unifying exposure data across the boundaries between the respective 2D images; determine more precise 3D data; [0149]: determine data about the photometric match quality between the images at various depths; [0161]: facilitate the alignment process in association with 3D model generation; [0171]: employ average depth measurement values for respective pixels, features, areas/regions etc., of a 2D image).
Gausebeck in view of Rejeb Sfar further discloses a two-dimensional map of the room and a map (Rejeb Sfar; Fig. 5; [0082]: common scanned floor plans, i.e. maps of a room; [0083]: a 2D floor plan image; structural 2D elements of the plan;
Fig. 10; [0093]: there are two doors 106 in a two-dimensional floor plan of the room).
The same motivation as for claim 13 applies here.
Regarding claim 17 (Previously Presented), Gausebeck in view of Rejeb Sfar discloses the one or more computer-readable storage media of claim 13, wherein the operations further comprise:
determining that a first section of the outer boundary occupies a first grid unit of a grid by a first area value that exceeds a threshold value (Rejeb Sfar; [0140]: above a first predetermined collinearity threshold; [0141]: above a second predetermined collinearity threshold);
determining that a second section of the outer boundary occupies a second grid unit of the grid by a second area value that is smaller than the threshold value (Rejeb Sfar; [0143]: spacings between two substantially collinear walls are lower than this threshold); and
generating an updated outer boundary by retaining the first section and removing the second section, wherein the two-dimensional floor plan is generated based at least in part on the updated outer boundary (Gausebeck; [0078]: a floorplan model contains locations of boundary edges for each given surface, portal, e.g., door opening, and window opening; [0080]: Calculation of area, e.g., square footage, is determined for any identified surface or portion of a 3D model with a known boundary; [0089]: the 3D data derivation component 110 identifies features, objects, etc. included in the 2D images and associate information with the derived 3D data for the respective features, objects, etc., identifying them and defining a boundary of the object or feature; [0112]: the boundaries between the respective 2D images).
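For illustration of the grid-based boundary cleanup recited in claim 17 only, the following minimal sketch retains boundary sections whose occupied area within a grid unit exceeds a threshold and removes sections that fall below it. It is not drawn from the cited references; the data layout and threshold value are hypothetical assumptions.

# Illustrative sketch only; hypothetical data layout, not from the cited references.
def update_outer_boundary(section_areas, grid_cell_area, threshold_ratio=0.5):
    # section_areas maps a grid-unit index (row, col) to the area that a boundary
    # section occupies within that unit. Sections meeting the threshold are
    # retained in the updated outer boundary; the remainder are removed.
    threshold = threshold_ratio * grid_cell_area
    return {cell: area for cell, area in section_areas.items() if area >= threshold}

# The section in cell (0, 1) exceeds the threshold and is retained;
# the section in cell (3, 2) falls below it and is removed.
updated = update_outer_boundary({(0, 1): 0.8, (3, 2): 0.1}, grid_cell_area=1.0)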
Regarding claim 18 (Previously Presented), Gausebeck in view of Rejeb Sfar discloses the one or more computer-readable storage media of claim 13, wherein the operations further comprise:
determining that a first wall belongs to the outer boundary (Gausebeck; [0078]: representation of surfaces, e.g., walls, floors, ceilings, etc.; Fig. 4; [0086]: generate a 3D model with walls at the outer boundary as illustrated in Fig. 4); and
determining that a second wall is contained within the outer boundary, wherein the two-dimensional floor plan is generated by at least retaining the first wall and removing the second wall (Gausebeck; [0077]: remove objects photographed, e.g., walls, furniture, fixtures, etc., from the 3D model; [0091]: one or more walls are removed; [0111]: remove outlier readings from the average calculation; [0112]: perform graph cuts at the edges).
Regarding claim 19 (Original), Gausebeck in view of Rejeb Sfar discloses the one or more computer-readable storage media of claim 13, wherein the two-dimensional floor plan and the room are a first two-dimensional floor plan and a first room (Gausebeck; [0078]: the floorplan model contains locations of boundary edges for each given surface; [0080]: a floorplan model generated by the 3D model generation component 118 can be a 2D floorplan model; a 2D floorplan model includes surfaces, e.g., walls, floors, ceilings, etc., portals, e.g., door openings, and window openings; Fig. 2; [0084]: generate the 3D model of a room by the 3D model generation component 118; generate and present 3D model 200 to a user at the client device;
Fig. 3; [0085]: the 3D model generation component 118 uses depth data derived from the respective images to generate the 3D floorplan model 300; [0089]: define a boundary of the object or feature), and wherein the operations further comprise:
determining a first location of a first two-dimensional representation of a door in the first two-dimensional floor plan (Rejeb Sfar; Fig. 10; [0093]: two doors 106);
determining a second location of a second two-dimensional representation of the door in a first two-dimensional floor plan of a second room (Gausebeck; Fig. 9; [0161]: the semantic labeling component 928 can be configured to employ one or more machine learning object recognition techniques to automatically identify defined objects and features included in the 2D images, e.g., walls, floors, ceilings, windows, doors, furniture, people, buildings, etc.); and
generating a third two-dimensional representation of a space by at least aligning the first two-dimensional representation and the second two-dimensional representation based at least in part on the first location and the second location and by at least removing an overlap between the first two-dimensional representation and the second two-dimensional representation (Gausebeck; [0044]: align the 2D images to one another based on the 3D data; [0048]: align the 2D images to one another based on the depth data to generate a 3D model of the object or environment; [0051]: a spatial alignment; [0062]: generate an alignment between the 2D images and the features; [0129]: facilitate aligning images captured at different capture positions and/or orientations relative to one another in a three-dimensional coordinate space).
Regarding claim 20 (Original), Gausebeck in view of Rejeb Sfar discloses the one or more computer-readable storage media of claim 13, wherein the two-dimensional floor plan and the room are a first two-dimensional floor plan and a first room (Gausebeck; [0078]: the floorplan model contains locations of boundary edges for each given surface; [0080]: a floorplan model generated by the 3D model generation component 118 can be a 3D floorplan model or a 2D floorplan model; a 2D floorplan model includes surfaces, e.g., walls, floors, ceilings, etc., portals, e.g., door openings, and window openings; Fig. 2; [0084]: capture and receive the new images of the living room; Fig. 3; [0085]: 2D image data of the portion of the house was captured by a camera held and operated by a user as the user walked from room to room and took pictures of the house from different perspectives within the rooms), and wherein the operations further comprise:
generating a third two-dimensional representation of a space by at least aligning a first two-dimensional representation and a second two-dimensional representation of a second room (Gausebeck; [0044]: align the 2D images to one another based on the 3D data; [0048]: align the 2D images to one another based on the depth data to generate a 3D model of the object or environment; [0051]: a spatial alignment; [0062]: generate an alignment between the 2D images and the features; [0129]: facilitate aligning images captured at different capture positions and/or orientations relative to one another in a three-dimensional coordinate space), determining a gap between a first wall in the first two-dimensional representation and a second wall in the second two-dimensional representation, and re-positioning at least the first wall (Fig. 2; [0084]: a visualization of an example 3D model 200 of a living room in association with generation of the 3D model; as new images of the living room are captured, received and aligned with previously aligned image data based on depth data derived for the respective images, the 3D model 200 with walls is dynamically updated;
[0130]: fill in the gaps where the derived 3D data is lacking; facilitate aligning the 2D image and associated derived 3D data 116 with other 2D images and associated derived 3D data sets).
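For illustration of the alignment recited in claims 19 and 20 only, the following minimal sketch translates a second room's wall by the offset between the two detected door locations and re-positions the wall when the remaining gap falls within a tolerance. It is not drawn from the cited references; the geometry and tolerance value are hypothetical assumptions.

# Illustrative sketch only; hypothetical geometry, not from the cited references.
def align_and_close_gap(door_a, door_b, wall_b, wall_a_x, gap_tolerance=0.05):
    # Translate room B's wall by the offset between the matching door locations
    # (door_a in room A, door_b in room B), then snap the wall onto room A's
    # vertical wall at x = wall_a_x if the residual gap is within tolerance.
    dx, dy = door_a[0] - door_b[0], door_a[1] - door_b[1]
    (x1, y1), (x2, y2) = wall_b
    x1, y1, x2, y2 = x1 + dx, y1 + dy, x2 + dx, y2 + dy
    gap = abs(x1 - wall_a_x)
    if gap <= gap_tolerance:
        x1 = x2 = wall_a_x  # re-position the wall to remove the gap/overlap
    return (x1, y1), (x2, y2)

# Room B's door at (0, 0) is aligned to room A's door at (4.0, 2.0).
aligned_wall = align_and_close_gap((4.0, 2.0), (0.0, 0.0), ((0.03, 0.0), (0.03, 3.0)), wall_a_x=4.0)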
Claims 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Gausebeck (US 20190026956 A1) in view of Rejeb Sfar (US 20190205485 A1), and further in view of Mech (US 20240161366 A1).
Regarding claim 11 (Currently Amended), Gausebeck in view of Rejeb Sfar discloses the computer-implemented method of claim 4.
Gausebeck in view of Rejeb Sfar fails to explicitly disclose:
wherein the two-dimensional map comprises a two-dimensional density map of the room, and wherein the computer-implemented method further comprises:
determining, based at least in part on the two-dimensional density map, an outer boundary of the room, wherein the two-dimensional floor plan is generated based at least in part on the outer boundary.
In the same field of endeavor, Mech teaches:
wherein the two-dimensional map comprises a two-dimensional density map of the room, and wherein the computer-implemented method further comprises (Fig. 4; [0074]: the density map 406 includes higher density values at object boundaries of the two-dimensional image 400 and lower density values within the object boundaries); and
determining, based at least in part on the two-dimensional density map, an outer boundary of the room, wherein the two-dimensional floor plan is generated based at least in part on the outer boundary (Fig. 4; [0074]: the density map 406 includes higher density values at object boundaries of the two-dimensional image 400 and lower density values within the object boundaries; Fig. 5; [0079]: preserves the boundaries of the objects of the two-dimensional image while remaining consistent with the density map; Fig. 7B; [0088]: determine corners/edges of a room).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Gausebeck in view of Rejeb Sfar to include wherein the two-dimensional map comprises a two-dimensional density map of the room, and wherein the computer-implemented method further comprises determining, based at least in part on the two-dimensional density map, an outer boundary of the room, wherein the two-dimensional floor plan is generated based at least in part on the outer boundary, as taught by Mech. The motivation for doing so would have been to provide improved computer functionality by leveraging three-dimensional representations of two-dimensional images to apply modifications to the two-dimensional images and to determine corners/edges of a room, as taught by Mech in paragraphs [0046] and [0088].
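For illustration of the density-map step only, the following minimal sketch marks cells of a 2D density map whose values exceed a threshold as candidate outer-boundary cells, consistent with the description of higher density values at object boundaries. It is not drawn from Mech; the map layout and threshold are hypothetical assumptions.

# Illustrative sketch only; hypothetical map layout, not from the cited references.
def candidate_boundary_cells(density_map, threshold):
    # density_map is a 2D list of density values; cells at object boundaries are
    # assumed to carry higher values than interior cells, so thresholding yields
    # candidate outer-boundary cells from which a floor plan can be derived.
    return {(r, c)
            for r, row in enumerate(density_map)
            for c, value in enumerate(row)
            if value >= threshold}

density_map = [[0.1, 0.9, 0.8],
               [0.2, 0.1, 0.9],
               [0.7, 0.8, 0.9]]
boundary = candidate_boundary_cells(density_map, threshold=0.6)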
Regarding claim 12 (Original), Gausebeck in view of Rejeb Sfar and Mech discloses the computer-implemented method of claim 11, further comprising:
determining that the outer boundary includes a two-dimensional representation of a structure (Gausebeck; [0078]: a floorplan model contains locations of boundary edges for each given surface, portal, e.g., door opening, and window opening; [0080]: Calculation of area, e.g., square footage, is determined for any identified surface or portion of a 3D model with a known boundary; [0089]: the 3D data derivation component 110 identifies features, objects, etc. included in the 2D images and associate information with the derived 3D data for the respective features, objects, etc., identifying them and defining a boundary of the object or feature; [0112]: the boundaries between the respective 2D images); and
removing the two-dimensional representation from the outer boundary (Gausebeck; [0077]: remove objects photographed, e.g., walls, furniture, fixtures, etc., from the 3D model; [0091]: one or more walls are removed; [0111]: remove outlier readings from the average calculation; [0112]: perform graph cuts at the edges).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Hai Tao Sun whose telephone number is (571)272-5630. The examiner can normally be reached 9:00AM-6:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Daniel Hajnik, can be reached at 571-272-7642. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/HAI TAO SUN/Primary Examiner, Art Unit 2616