Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
The Amendment filed November 27, 2025 has been entered. Claims 1, 6-7, 9-10, 12, 17-18, and 20-22 have been amended. Claims 1-22 remain pending in the application and stand rejected. Applicant’s amendments to the specification have overcome each and every objection previously set forth in the Non-Final Office Action mailed August 27, 2025; those objections have therefore been withdrawn.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-22 are rejected under 35 U.S.C. 103 as being unpatentable over Eraker et al. (U.S. Patent No. 9,836,885 B1), hereinafter Eraker, in view of Hovden et al. (Pub. No. US 2019/0251352 A1), hereinafter Hovden, and further in view of Li et al. (Pub. No. US 2022/0114291 A1), hereinafter Li.
Regarding claim 1, Eraker discloses a method (FIG. 8 and Col. 4, Lines 46-49 teach that FIG. 8 shows a block diagram of an exemplary embodiment of a system to implement the methods of image-based rendering for real estate and other real scenes disclosed herein) comprising:
accessing interior image frames captured by a mobile device (Col. 10, Lines 4-8 teach that in step 601, input images are collected. The input images can come from various sources such as mobile devices (e.g., smartphones), point-and-shoot cameras, GoPros on a rig, and specialty camera systems such as LadyBug5 from Point Grey). However, Eraker fails to disclose that the mobile device is moved through an interior of a building.
Hovden teaches accessing interior image frames by a mobile device as the mobile device is moved through the interior of a building (Paragraph 50 teaches that the camera operator can then cause the camera to rotate 360°, capturing visual image data (e.g., one or more two-dimensional images) and three-dimensional (3D) data (e.g., depth data) over the course of rotation. The captured data can be combined to generate a 360° panoramic image of the environment with depth data at the scan position/location. The camera operator can then move (or cause the camera to move via a robotic mount) to a new position in the environment and direct the camera to capture a second panoramic image and corresponding panoramic depth data at the new position). Since Eraker teaches capturing input images of a building using a mobile device and Hovden teaches using such a device to capture images as it is moved throughout the interior of an environment/building, it would have been obvious to a person having ordinary skill in the art to combine these teachings, so that the captured images would specifically be of the interior of a building and would be captured as the camera moved throughout that interior.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eraker to incorporate the teachings of Hovden, so that the combination would give a user access to additional images captured by a camera device as it moves throughout the interior of a building.
Additionally, Eraker in view of Hovden disclose accessing exterior image frames captured by an unmanned aerial vehicle ("UAV") as the UAV navigates around an exterior of the building (Col. 10, Lines 8-11 of Eraker teach that the input images can be shot with a tripod, or by holding the device in hand or on top of other devices such as drones, robots or other kinds of automated, semi-automated, or remote controlled devices and paragraph 50 of Hovden teaches that the camera operator can then move (or cause the camera to move via a robotic mount) to a new position in the environment and direct the camera to capture a second panoramic image and corresponding panoramic depth data at the new position);
generating a 3D model representative of the building based on the accessed interior image frames and accessed exterior image frames (Col. 6, Lines 49-55 of Eraker teach that in step 302, the captured image data is processed to create panoramas, determine camera geometry, and 3D reconstruction algorithms are employed to generate dense representations of a 3D model, geometric proxies, parcel maps and floor plans. In step 303, one or more rendering algorithms are applied to the data and 3D model to render a 3D view of the real estate property. Additionally, paragraph 131 of Hovden teaches that the interior and exterior scan images 1920 can include a collection of images captured of a structure, (e.g., a house, a building, etc.), including both interior and exterior images. In various embodiments, the interior and exterior scan images include images captured in association with a scan of the structure, for example, for the purpose of generating a 3D model of the structure.);
generating an interface displaying an interior portion of the 3D model corresponding to an interior of the building in a first interface portion (FIG. 4 and Col. 6, Lines 56-60 of Eraker teach that FIG. 4 shows an embodiment of a User Interface System 400 for image-based rendering of real estate. Three different user interface elements serve the dual purposes of informing the user of his location in the model and simultaneously enabling spatial navigation);
identifying an exterior portion of the 3D model that corresponds to an exterior of the building and that corresponds to a location associated with the interior portion of the 3D model, the identified exterior portion of the 3D model associated with one or more of the accessed exterior image frames (Col. 7, Lines 3-5 of Eraker teach that the UI element 408 is a text overlay that displays one or more labels associated with the user's location within the virtual model and Col. 8, Lines 1-9 teach that the ground truth data may also include labels (outside: backyard, or groundfloor: bathroom) which are ground truth in the sense that they are directly collected at the scene and are not synthetic approximations. For example, when a user is near a position where ground truth image data is captured, very little geometry is required to render the most photorealistic view of the model. At the exact position of capture, the use of an image primitive is the most photorealistic view possible. Additionally, paragraph 160 of Hovden teaches that at 2802, a system operatively coupled to a processor, (e.g., system 1900, system 2700 or the like), can identify exterior image data (e.g., a panoramic image) comprising imagery of an exterior of a building (e.g., using identification component 1904), wherein the exterior image data is associated with location information corresponding to a capture location of the exterior image data relative to a global positioning coordinate system, and wherein the building is associated with interior scan location information corresponding to interior capture locations, relative to the global positioning coordinate system, of interior images captured inside the building.);
modifying the displayed interior portion of the 3D model to display an interface element at a location within the 3D model corresponding to the interior portion of the 3D model (FIG. 4 and Col. 7, Lines 6-25 of Eraker teach that FIG. 4 shows the three primary user interface elements of an embodiment of the navigation tool for image-based renderings of real estate. UI element 402 shows a rendering of the model from a specific viewpoint position and also serves as a means to navigate to adjacent positions and to change the viewing vector from a fixed position. Navigation can occur using various well known input/output (IO) devices, such as a keyboard, touchscreen, eye-tracking technology, gesture recognition technology, or a computer mouse. For densely sampled spherical panoramas, one example of navigation using UI element 402 with a mouse would be to click on the rendered view to translate in the XY plane to another panorama location in the model. Another example of navigation using UI element 402 would be to click and hold the mouse button, enabling rotation about the Z axis, thus “looking around” without translating in the XY plane. As the user navigates, the rendered viewpoint shown in UI element 402 changes in real time based on a new position and viewing vector associated with the new location, and Col. 13, Lines 21-28 of Eraker teach that two arrows shown between capture locations 2 and 14 and between 11 and 20 represent doors that connect the spatial boundaries in a connectivity graph. In later rendering stages, users will be able to move between the indoor and outdoor areas based on this connectivity graph);
and in response to a selection of the displayed interface element, modifying a second interface portion to display the one or more accessed exterior image frames (Col. 13, Lines 41-45 of Eraker teach that when a user is navigating panoramas, another type of view that a user may request is a transition between panoramas that is indicative of the physical experience of transitioning between the panorama capture locations in the real world. Also, Col. 13, Lines 53-57 of Eraker teach that at decision point 802, the rendering system evaluates the data at its disposal to aid the rendering of a requested viewpoint. If the viewpoint has been sampled as panorama, an image or as a video, then the rendering system can use the sampled raw and/or processed sensor-captured data to render the requested viewpoint in step 503). However, Eraker in view of Hovden fail to disclose modifying a second interface portion to display the one or more accessed exterior image frames simultaneously with the display of the interior portion of the 3D model within the first interface portion.
Li discloses modifying a second interface portion to display the one or more accessed exterior image frames simultaneously with the display of the interior portion of the 3D model within the first interface portion (Paragraph 14 teaches that the described techniques provide various benefits in various embodiments, including to use 3D models and/or 2.5D models and/or 2D floor map models of multi-room buildings and other structures (e.g., that are generated from images acquired in the buildings or other structures) to display various types of information about building interiors, such as in a coordinated and simultaneous manner with other types of related information, including to use information about the actual as-built buildings (e.g., internal structural components and/or other interior elements, nearby external buildings and/or vegetation, actual building geographical location and/or orientation, actual typical weather patterns, etc.). Additionally, FIG. 6 and paragraph 79 teach that the illustrated embodiment of the routine begins in block 605, where information or instructions are received. The routine continues to block 610 to determine whether the instructions in block 605 are to present integrated information for an indicated building, such as in a corresponding GUI. If so, the routine continues to perform blocks 615-650, and otherwise continues to block 690. In particular, if it is determined in block 610 that the instructions received in block 605 are to present integrated information for an indicated building, the routine continues to block 615 to obtain building information of multiple types for the indicated building, such as a 3D model of the building, images of the interior (and optionally, exterior) of the building, videos of the interior (and optionally, exterior) of the building, information about an interactive tour of a plurality of viewing/capture locations within the building interior (and optionally, exterior) at which image and/or other information was captured, a 2-D floor map or other floor plan, audio and/or textual descriptions of particular locations or areas (e.g., rooms, points of interest, etc.), simulated and/or actual lighting information, information about surrounding buildings and/or vegetation and/or other exterior aspects (vehicle traffic, foot traffic, noises, etc.), surfaces and/or areas available for virtual staging or otherwise for adding virtual objects, information about types of building information to use as POIs (e.g., information from automated analysis of visual data of images captured for the building), etc.). Since Eraker in view of Hovden teach a method for using an interface through which a user can view internal and external images of a 3D building model, with the ability to display and view information related to the different images being viewed, and Li teaches a method for using an interface that can display and view information related to different internal and external images of a 3D building model simultaneously, it would have been obvious to a person having ordinary skill in the art to combine these teachings, so that in addition to being able to view the different interior and exterior displayed images, a user could also view multiple images (including interior and exterior images of a 3D building model) and information related to those multiple images simultaneously, if needed.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eraker in view of Hovden to incorporate the teachings of Li, so that the combined features together would provide the user with the capability to view both internal and external images of a 3D building model simultaneously, which would help provide the user with a better overall understanding of the different spatial relationships between the inside and outside of the building.
Regarding claim 2, Eraker in view of Hovden and Li disclose everything claimed as applied above (see claim 1), in addition, Eraker in view of Hovden and Li disclose integrating depth information with the interior image frames such that both the interior image frames and the depth information provide a more comprehensive representation of the interior of the building (Paragraph 57 of Hovden teaches that in association with generating a 3D model of an environment, depth data can be captured (or determined) for respective images captured at each capture location throughout the environment. Additionally, Col. 9, Lines 9-17 of Eraker teach that in step 510, a second pass feature detection algorithm is applied to generate a dense representation of the 3D model geometry (e.g., high level features such as planes, lines, floors, etc.) or individual 3D points which can create a dense point cloud. Note that any other geometric data collected from the scene such as with a laser range scanner, infrared based depth maps (e.g., such as from a Microsoft Kinect), or other manual approaches can increase the accuracy of the geometry but are not required in the disclosed system);
generating an interior 3D model based on the interior image frames and the depth information (Col. 7 Lines 59-67 of Eraker teach that rendering displays the processed data to an end-user via an IO device. During rendering, the user's position and navigation influence which elements of geometry and image data are combined for a given Image-Based Rendering algorithm at any possible location in or around the virtual model. Ground truth image data is the captured image associated with a particular capture location, and optionally may include any metadata associated with the captured image, such as GPS coordinates, IR point clouds, etc.);
and mapping the interior 3D model to a coordinate system or a floor plan for the building (Col. 9, Lines 49-54 of Eraker teach that in step 514, the 3D data is used to generate a parcel map (e.g., the exterior and aerial views of the real estate) and floor plan. A 2D floor plan may be used to create a navigation-enabled map of the interior of the real estate. For example, a 2D floor plan of the rooms of a house can be automatically generated from the 3D data).
Regarding claim 3, Eraker in view of Hovden and Li disclose everything claimed as applied above (see claim 2), in addition, Eraker in view of Hovden and Li disclose aligning the 3D model with the image frames for consistency and accuracy in spatial representation (Col. 11, Lines 4-21 of Eraker teach that in step 611, RGB data may be registered with depth data (sometimes herein referred to as RGB+D data) by using different heuristics such as using relative directional differences in the sensors (assuming the relative positions of the sensors is insignificant) or a per-pixel registration of the RGB and depth sensor data can be achieved by warping the depth image to the perspective of the RGB sensor image and computing per-pixel depth in the RGB space after processing the warped depths. In some embodiments, registering RGB data with depth data may be used to interpret a pixel location from a particular image as a 3D ray or vector for later use in mapping texture onto a 2-D or 3-D polygon mesh or a point cloud. In step 612, a collection of RGB+D images may be registered together using an algorithm such as Iterative Closest Point (ICP) or other similar algorithms for reducing the difference between two clouds of points or geometric alignment of 3D models).
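For illustration only: none of the cited references provides source code, but the geometric-alignment step that Eraker's step 612 names, Iterative Closest Point (ICP), can be sketched in a few lines. The Python/NumPy/SciPy implementation below is an assumption of this note, not a reproduction of any reference's method.

    # Illustrative rigid ICP alignment (nearest-neighbor loop + Kabsch solve).
    import numpy as np
    from scipy.spatial import cKDTree

    def best_fit_transform(src, dst):
        """Least-squares rigid transform (R, t) mapping src onto dst."""
        src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
        H = (src - src_c).T @ (dst - dst_c)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:          # guard against a reflection solution
            Vt[-1, :] *= -1
            R = Vt.T @ U.T
        t = dst_c - R @ src_c
        return R, t

    def icp(src, dst, iterations=50, tol=1e-6):
        """Iteratively align point cloud src to dst; returns the moved copy."""
        tree = cKDTree(dst)
        cur, prev_err = src.copy(), np.inf
        for _ in range(iterations):
            dist, idx = tree.query(cur)   # nearest-neighbor correspondences
            R, t = best_fit_transform(cur, dst[idx])
            cur = cur @ R.T + t
            if abs(prev_err - dist.mean()) < tol:   # converged
                break
            prev_err = dist.mean()
        return cur

Each iteration re-matches points and solves a closed-form rigid transform, which is the "reducing the difference between two clouds of points" behavior that Eraker attributes to ICP-style algorithms.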
Regarding claim 4, Eraker in view of Hovden and Li disclose everything claimed as applied above (see claim 1), in addition, Eraker in view of Hovden and Li disclose defining a structure of the interface, wherein the interface comprises two portions such that one portion is configured to display the 3D model and another portion is configured to display the image frames (FIG. 4 and Col. 6 Line 56 through Col. 7 Line 5 of Eraker teach that FIG. 4 shows an embodiment of a User Interface System 400 for image-based rendering of real estate. Three different user interface elements serve the dual purposes of informing the user of his location in the model and simultaneously enabling spatial navigation. These three elements are shown in the user interface (400) embodiment of FIG. 4, which would normally be contained within a browser window, within a framed client application, or as the entire screen during full screen mode. User Interface element 402 is the viewpoint within a virtual model generated by combining geometry and image data using Image-Based Rendering (IBR), thus creating a 3-dimensional (3D) view. UI element 404 is a two-dimensional (2D) map overlay that displays the relative location in the virtual model of the current viewpoint 406 shown in UI element 402. UI element 408 is a text overlay that displays one or more labels associated with the user's location within the virtual model);
integrating the 3D model and the image frames into the interface structure (Col. 7, Lines 6-8 of Eraker teach that FIG. 4 shows the three primary user interface elements of an embodiment of the navigation tool for image-based renderings of real estate);
deploying the interface to a user device for visualization and interaction by the user (Col. 7, Lines 8-11 of Eraker teach that UI element 402 shows a rendering of the model from a specific viewpoint position and also serves as a means to navigate to adjacent positions and to change the viewing vector from a fixed position).
Regarding claim 5, Eraker in view of Hovden and Li disclose everything claimed as applied above (see claim 1), in addition, Eraker in view of Hovden and Li disclose extracting features from the exterior image frames (Col. 7, Lines 49-58 of Eraker teach that the image processing creates HDR spherical panoramas from input images. 3D reconstruction involves removing noise from point clouds, reconstructing a real estate environment's geometry to varying degrees of approximation, generating geometric proxies that describe the environment with simpler meta primitives, feature matching between spherical panoramas, positioning of spherical panorama data in 3D space, feature matching between the panorama and 3D space, and computing view dependent texture maps for the geometry and/or geometric proxies);
matching the exterior image frame features with corresponding features within the 3D model (Col. 13, Lines 13-20 of Eraker teach that FIG. 7 shows an example of a panorama capture map 700 having a distribution of captured spherical panoramas on a parcel 701 having a house 702, where each number 1-21 represents a Data Sampling location (e.g., a location where a spherical panorama photo was taken). Note the different sampling densities in different spatial boundaries, such as dense sampling indoors 704 and sparse sampling outdoors 706. Additionally, paragraph 140 of Hovden teaches that in other embodiments in which the exterior scan images are not labeled with metadata identifying them as exterior images of the structure, the image classification component 1906 can provide for classifying images as exterior or interior images. The identification component 1904 can then apply the classification assigned by the image classification component 1906 to identify and select the exterior images for processing);
aligning the exterior image frames with the 3D model based on the matching (Paragraph 140 of Hovden teaches that in some implementations, the image classification component 1906 can also determine and associate metadata with the exterior images indicating whether they provide a view of the front, side, or back of the structure);
and determining the displayed portion of the 3D model that corresponds with the exterior image frames based on the alignment (Paragraph 140 of Hovden teaches that for instance, interior scans have depth data available for most of the FOV. Thus, in some embodiments, the image classification component 1906 can determine whether an image included in the interior and exterior scan images 1920 is an exterior image of a structure or an interior image of the structure based on the amount and/or quality of the depth data associated therewith).
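As a purely illustrative aside (no cited reference discloses source code), the feature extraction and matching recited for claim 5 is the kind of operation commonly performed with an off-the-shelf detector. The sketch below uses OpenCV's ORB detector and a brute-force matcher as assumed stand-ins for whatever feature-matching method the references actually employ; the image paths are hypothetical.

    # Illustrative 2D feature matching between an exterior frame and a
    # rendered view of the 3D model (stand-in for the references' method).
    import cv2

    def match_features(frame_path, rendered_view_path, keep=50):
        img1 = cv2.imread(frame_path, cv2.IMREAD_GRAYSCALE)
        img2 = cv2.imread(rendered_view_path, cv2.IMREAD_GRAYSCALE)
        if img1 is None or img2 is None:
            raise ValueError("could not read one of the input images")
        orb = cv2.ORB_create()
        kp1, des1 = orb.detectAndCompute(img1, None)
        kp2, des2 = orb.detectAndCompute(img2, None)
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
        return kp1, kp2, matches[:keep]   # best correspondences for alignment

The returned correspondences could then feed a pose solver to align the exterior frame with the model, as in the alignment step discussed above.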
Regarding claim 6, Eraker in view of Hovden and Li disclose everything claimed as applied above (see claim 1), in addition, Eraker in view of Hovden and Li disclose generating the interface element, wherein the interface element indicates one or more exterior views of the identified portion of the 3D model are available for viewing (Col. 13, Lines 21-28 of Eraker teach that two arrows shown between capture locations 2 and 14 and between 11 and 20 represent doors that connect the spatial boundaries in a connectivity graph. In later rendering stages, users will be able to move between the indoor and outdoor areas based on this connectivity graph);
and placing the interface element at the location within the 3D model corresponding to the identified portion of the 3D model (Col. 8, Lines 10-20 of Eraker teach that the composited image primitives such as a spherical panorama enable 2 DOF rotational navigation. When translating directly between two spherical panorama locations, other algorithms such as optical flow may provide more photorealistic warping during the rendering of predetermined translational pathways defined in the connectivity graph. When translating between other locations within the virtual model, the use of VDTM over explicit geometric proxies combined with depth and feature matching between nearby panoramas during rendering provides a decrease in photorealism but enables fluid movement to any spatial location).
Regarding claim 7, Eraker in view of Hovden and Li disclose everything claimed as applied above (see claim 1), in addition, Eraker in view of Hovden and Li disclose listening for user interactions with the interface element and detecting when a user has selected the interface element (FIG. 2E and paragraph 32 of Li teach that FIG. 2E continues the examples of FIGS. 2A-2D, and illustrates a 360° spherical panorama image 238a similar to that of FIG. 2D, but for the hallway in the middle of the house rather than for the living room—in particular, the panorama image 238a is displayed in a first pane 250e of a GUI being displayed by the MIGM system to an MIGM system operator user (not shown), along with user-selectable GUI controls 295 to select different types of functionality);
upon detecting user interaction with the interface element, retrieving location information that corresponds to the portion of the building represented by the interface element (Paragraph 59 of Li teaches that for example, both panorama images may be displayed to a user who selects one or more common points in both images (e.g., a common plane with infinite points in both images), with the MIGM system determining the corresponding locations of the visual information of the two panorama images based on the indicated common point(s));
using the location information, identifying the exterior images corresponding to the selected interface element (Paragraph 62 of Li teaches that one or more exterior panorama images may be used to identify a shape of an exterior wall of the building, the quantity and/or locations of one or more windows in the exterior wall, identification of one or more floors of the building that are visible from that exterior, etc.);
and displaying the identified exterior images in the second interface portion (Paragraph 11 of Li teaches that, optionally with one or more additional panes showing additional panorama images of additional rooms to potentially connect to one or more of the first and second rooms, the displayed information between the panes may be coordinated in the GUI, such as to simultaneously update corresponding information in other panes as a user manipulates information in one of the panes (e.g., to change relative locations of the first and second rooms as the user adjusts location of at least one of the rooms in one of the panes). Additionally, paragraph 62 teaches that in at least some embodiments additional information may be used as part of generating a floor plan for a building that is obtained outside of the building, such as one or more panorama images acquired outside of the building (e.g., in which some or all of the building is visible), one or more panorama images acquired of outbuildings or other structures on the same property, satellite and/or drone images from overhead, images from a street adjacent to the building, information from property records or other sources about dimensions of the exterior of the building, etc. As one example, one or more exterior panorama images may be used to identify a shape of an exterior wall of the building, the quantity and/or locations of one or more windows in the exterior wall, identification of one or more floors of the building that are visible from that exterior, etc., such as from an automated analysis of the panorama images and/or based on manual annotations of the panorama images by one or more MIGM system operator users, and with such information subsequently used to eliminate/select and/or to rank possible room connections according to how they fit with the information acquired from the exterior panorama image(s). As another example, one or more exterior panorama images may be treated as being part of one or more exterior rooms that surround or are otherwise associated with the building, with the exterior rooms being modeled (e.g., with room shapes) and connected to and used with other interior rooms of the building in a floor plan and/or in other manners).
Regarding claim 8, Eraker in view of Hovden and Li disclose everything claimed as applied above (see claim 7), in addition, Eraker in view of Hovden and Li disclose retrieving the location information corresponding to the portion of the building represented by the interface element, wherein the location information comprises GPS coordinates, a coordinate system, or building floor plan coordinates (Paragraph 84 of Li teaches that in other embodiments the routine may generate other types of mapping information for the building, whether instead of or in addition to a 2D schematic floor plan as discussed for this example embodiment of routine 500—non-exclusive examples of other mapping information include a 2.5D texture map in which 360° panorama images can optionally be re-projected on the geometry of the displayed texture map, a 3D structure that illustrates accurate height information as well as width and length (and in which 360° panorama images can optionally be re-projected on the geometry of the displayed 3D structure), etc. In addition, in some embodiments additional information may be generated and used, such as to determine a geographical alignment (e.g., with respect to true north or magnetic north) and/or geographical location (e.g., with respect to latitude and longitude, or GPS coordinates) for the building and corresponding parts of the generated floor plan, and to optionally further align with other external information (e.g., satellite or other external images, including street-level images to provide a ‘street view’ of the building; neighborhood information, such as nearby street maps and/or points of interest; etc.));
comparing the interface element's location with the location information of each exterior image in the exterior image frames (Paragraph 50 of Li teaches that the modified adjacency graph further illustrates additional inter-room connections that have been made for three other wall openings of the kitchen/dining room to other rooms or areas of the house (i.e. landing 245 in the middle of the stairway, the family room 241, and the patio 246 outside the eastern exterior door 197 of the house, connected to wall openings 267a, 263c and 197 of the kitchen/dining room, respectively));
and based on the comparison of location information, identifying the exterior images corresponding to the selected interface element (Paragraph 62 of Li teaches using information from property records or other sources about dimensions of the exterior of the building, etc. As one example, one or more exterior panorama images may be used to identify a shape of an exterior wall of the building, the quantity and/or locations of one or more windows in the exterior wall, identification of one or more floors of the building that are visible from that exterior, etc., such as from an automated analysis of the panorama images and/or based on manual annotations of the panorama images by one or more MIGM system operator users, and with such information subsequently used to eliminate/select and/or to rank possible room connections according to how they fit with the information acquired from the exterior panorama image(s). As another example, one or more exterior panorama images may be treated as being part of one or more exterior rooms that surround or are otherwise associated with the building, with the exterior rooms being modeled (e.g., with room shapes) and connected to and used with other interior rooms of the building in a floor plan and/or in other manners.).
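For illustration only, the comparison step recited for claim 8 (comparing the interface element's location against the location metadata of each exterior image) can be viewed as a simple proximity query. The GPS-keyed record format, the haversine distance, and the 10-meter radius below are assumptions of this sketch, not details taken from any cited reference.

    # Illustrative proximity query: find exterior frames captured near an
    # interface element's GPS location (record format is hypothetical).
    import math

    def haversine_m(lat1, lon1, lat2, lon2):
        """Great-circle distance in meters between two lat/lon points."""
        r = 6_371_000.0
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * r * math.asin(math.sqrt(a))

    def exterior_images_near(element_loc, exterior_frames, radius_m=10.0):
        """Return the frames captured within radius_m of the element."""
        lat, lon = element_loc
        return [f for f in exterior_frames
                if haversine_m(lat, lon, f["lat"], f["lon"]) <= radius_m]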
Regarding claim 9, Eraker in view of Hovden and Li disclose everything claimed as applied above (see claim 1), in addition, Eraker in view of Hovden and Li disclose providing an additional interface element within the second interface portion, wherein selection of the additional interface element causes the second interface portion to switch between or provide control of the one or more accessed exterior image frames (Paragraph 74 of Li teaches that if it is instead determined in block 410 that the instructions or other information recited in block 405 are not to acquire images and other data representing a building interior, the routine continues instead to block 490 to perform any other indicated operations as appropriate, such as to generate and store inter-panorama image connections between panorama images for a building or other structure (e.g., for each panorama image, to determine directions within that panorama image toward one or more other acquisition locations of one or more other panorama images, such as to enable later display of an arrow or other visual representation with a panorama image for each such determined direction from the panorama image to enable an end-user to select one of the displayed visual representations to switch to a display of the other panorama image at the other acquisition location to which the selected visual representation corresponds)).
Regarding claim 10, Eraker in view of Hovden and Li disclose everything claimed as applied above (see claim 1), in addition, Eraker in view of Hovden and Li disclose providing an additional interface element within the second interface portion, wherein selection of the additional interface element causes the second interface portion to display a corresponding interior view of the building (Paragraph 82 of Li teaches that the displaying of information includes displaying multiple GUI panes that have different but related information, such as to display a room layout view of a first GUI pane that shows a partial or full floor plan with interconnected room shapes for two or more rooms, and to optionally display one or more additional GUI panes that each include one or more panorama images for at least one of the rooms shown in the floor plan pane and that optionally have additional information overlaid on the displayed panorama image(s)).
Regarding claim 11, Eraker in view of Hovden and Li disclose everything claimed as applied above (see claim 1), in addition, Eraker in view of Hovden and Li disclose identifying the displayed portion of the 3D model that corresponds to one or more of the accessed interior image frames (Col. 7, Lines 3-5 of Eraker teach that the UI element 408 is a text overlay that displays one or more labels associated with the user's location within the virtual model and Col. 8, Lines 1-9 teach that the ground truth data may also include labels (outside: backyard, or groundfloor: bathroom) which are ground truth in the sense that they are directly collected at the scene and are not synthetic approximations. For example, when a user is near a position where ground truth image data is captured, very little geometry is required to render the most photorealistic view of the model. At the exact position of capture, the use of an image primitive is the most photorealistic view possible);
modifying the first interface portion to display the interface element at the location corresponding to the identified portion of the 3D model (Col. 13, Lines 21-28 of Eraker teach that two arrows shown between capture locations 2 and 14 and between 11 and 20 represent doors that connect the spatial boundaries in a connectivity graph. In later rendering stages, users will be able to move between the indoor and outdoor areas based on this connectivity graph);
and in response to the selection of the displayed interface element, modifying a second interface portion to display the one or more accessed interior image frames that correspond to the identified portion of the 3D model (Col. 13, Lines 41-45 of Eraker teach that when a user is navigating panoramas, another type of view that a user may request is a transition between panoramas that is indicative of the physical experience of transitioning between the panorama capture locations in the real world. Also, Col. 13, Lines 53-57 of Eraker teach that at decision point 802, the rendering system evaluates the data at its disposal to aid the rendering of a requested viewpoint. If the viewpoint has been sampled as panorama, an image or as a video, then the rendering system can use the sampled raw and/or processed sensor-captured data to render the requested viewpoint in step 503).
Regarding claim 12, the system steps correspond to and are rejected similarly to the method steps of claim 1 (see claim 1 above). In addition, Eraker discloses a system (FIG. 2 and Col. 4, Lines 62-66 teach that FIG. 2 shows an exemplary computing environment 200 for implementing various aspects of the disclosed inventions. The computing environment 200 includes a computer 202, the computer 202 including a processing unit 204, a system memory 206 and a system bus 208) comprising:
a hardware processor (Col. 5, Lines 1-5 teach that the processing unit 204 may be any of various commercially available processors. Dual microprocessors and other multi-processor architectures may also be employed as the processing unit 204);
and a non-transitory computer-readable storage medium storing executable instructions that, when executed by the hardware processor, cause the hardware processor to perform steps (FIG. 2 and Col. 4, Line 62 through Col. 5, Line 1 teach that FIG. 2 shows an exemplary computing environment 200 for implementing various aspects of the disclosed inventions. The computing environment 200 includes a computer 202, the computer 202 including a processing unit 204, a system memory 206 and a system bus 208. The system bus 208 couples system components including, but not limited to, the system memory 206 to the processing unit 204).
Regarding claim 13, the system steps correspond to and are rejected similarly to the method steps of claim 2 (see claim 2 above).
Regarding claim 14, the system steps correspond to and are rejected similarly to the method steps of claim 2 (see claim 2 above).
Regarding claim 15, the system steps correspond to and are rejected similarly to the method steps of claim 4 (see claim 4 above).
Regarding claim 16, the system steps correspond to and are rejected similarly to the method steps of claim 5 (see claim 5 above).
Regarding claim 17, the system steps correspond to and are rejected similarly to the method steps of claim 6 (see claim 6 above).
Regarding claim 18, the system steps correspond to and are rejected similarly to the method steps of claim 7 (see claim 7 above).
Regarding claim 19, the system steps correspond to and are rejected similarly to the method steps of claim 8 (see claim 8 above).
Regarding claim 20, the non-transitory computer readable medium corresponds to and is rejected similarly to the system steps of claim 12 (see claim 12 above).
Regarding claim 21, Eraker discloses a method (FIG. 8 and Col. 4, Lines 46-49 teach that FIG. 8 shows a block diagram of an exemplary embodiment of a system to implement the methods of image-based rendering for real estate and other real scenes disclosed herein) comprising:
accessing interior image frames captured by a mobile device (Col. 10, Lines 4-8 teach that in step 601, input images are collected. The input images can come from various sources such as mobile devices (e.g., smartphones), point-and-shoot cameras, GoPros on a rig, and specialty camera systems such as LadyBug5 from Point Grey). However, Eraker fails to disclose that the mobile device is moved through an interior of a building.
Hovden teaches accessing interior image frames by a mobile device as the mobile device is moved through the interior of a building (Paragraph 50 teaches that the camera operator can then cause the camera to rotate 360°, capturing visual image data (e.g., one or more two-dimensional images) and three-dimensional (3D) data (e.g., depth data) over the course of rotation. The captured data can be combined to generate a 360° panoramic image of the environment with depth data at the scan position/location. The camera operator can then move (or cause the camera to move via a robotic mount) to a new position in the environment and direct the camera to capture a second panoramic image and corresponding panoramic depth data at the new position). Since Eraker teaches capturing input images of a building using a mobile device and Hovden teaches using such a device to capture images as it is moved throughout the interior of an environment/building, it would have been obvious to a person having ordinary skill in the art to combine these teachings, so that the captured images would specifically be of the interior of a building and would be captured as the camera moved throughout that interior.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eraker to incorporate the teachings of Hovden, so that the combination would give a user access to additional images captured by a camera device as it moves throughout the interior of a building.
Additionally, Eraker in view of Hovden disclose accessing exterior image frames captured by an unmanned aerial vehicle ("UAV") as the UAV navigates around an exterior of the building (Col. 10, Lines 8-11 of Eraker teach that the input images can be shot with a tripod, or by holding the device in hand or on top of other devices such as drones, robots or other kinds of automated, semi-automated, or remote controlled devices and paragraph 50 of Hovden teaches that the camera operator can then move (or cause the camera to move via a robotic mount) to a new position in the environment and direct the camera to capture a second panoramic image and corresponding panoramic depth data at the new position);
accessing a floor plan of the building (Paragraph 115 of Hovden teaches that in this regard, the floorplan model 1402 can be a schematic floorplan of a property, a schematic floorplan of an architectural structure (e.g., a building, a house, etc.), a schematic floorplan of an interior space of an architectural structure (e.g., a house), and the like);
aligning the interior image frames and the exterior image frames to the accessed floor plan (Paragraph 115 of Hovden teaches that in one or more implementations, the floorplan model 1402 can be or correspond to an aerial view of a 3D model generated based on aligned image data and associated 3D data captured of a space (e.g., during a scan) and paragraph 118 teaches that the floorplan alignment component 1406 can facilitate matching and aligning a floorplan model 1402 of a structure with a corresponding satellite image providing an aerial view of the structure included in the satellite data);
generating an interface displaying one or more interior image frames in a first interface portion (FIG. 4 and Col. 6, Lines 56-60 of Eraker teach that FIG. 4 shows an embodiment of a User Interface System 400 for image-based rendering of real estate. Three different user interface elements serve the dual purposes of informing the user of his location in the model and simultaneously enabling spatial navigation);
identifying a displayed portion of the 3D model that corresponds to one or more of the accessed exterior image frames using the floor plan (Col. 7, Lines 3-5 of Eraker teach that the UI element 408 is a text overlay that displays one or more labels associated with the user's location within the virtual model and Col. 8, Lines 1-9 teach that the ground truth data may also include labels (outside: backyard, or groundfloor: bathroom) which are ground truth in the sense that they are directly collected at the scene and are not synthetic approximations. For example, when a user is near a position where ground truth image data is captured, very little geometry is required to render the most photorealistic view of the model. At the exact position of capture, the use of an image primitive is the most photorealistic view possible);
modifying the first interface portion to display an interface element at a location corresponding to the identified displayed interior frame (Col. 13, Lines 21-28 of Eraker teach that two arrows shown between capture locations 2 and 14 and between 11 and 20 represent doors that connect the spatial boundaries in a connectivity graph. In later rendering stages, users will be able to move between the indoor and outdoor areas based on this connectivity graph);
and in response to a selection of the displayed interface element, modifying a second interface portion to display the one or more accessed exterior image frames that correspond to the identified displayed interior frame (Col. 13, Lines 41-45 of Eraker teach that when a user is navigating panoramas, another type of view that a user may request is a transition between panoramas that is indicative of the physical experience of transitioning between the panorama capture locations in the real world. Also, Col. 13, Lines 53-57 of Eraker teach that at decision point 802, the rendering system evaluates the data at its disposal to aid the rendering of a requested viewpoint. If the viewpoint has been sampled as panorama, an image or as a video, then the rendering system can use the sampled raw and/or processed sensor-captured data to render the requested viewpoint in step 503). However, Eraker and Hovden fail to disclose modifying the first interface portion to display an interface element at a location corresponding to the identified displayed interior frame simultaneously with the display of the interior image frame within the first interface portion.
Li discloses modifying the first interface portion to display an interface element at a location corresponding to the identified displayed interior frame simultaneously with the display of the interior image frame within the first interface portion (Paragraph 14 teaches that the described techniques provide various benefits in various embodiments, including to use 3D models and/or 2.5D models and/or 2D floor map models of multi-room buildings and other structures (e.g., that are generated from images acquired in the buildings or other structures) to display various types of information about building interiors, such as in a coordinated and simultaneous manner with other types of related information, including to use information about the actual as-built buildings (e.g., internal structural components and/or other interior elements, nearby external buildings and/or vegetation, actual building geographical location and/or orientation, actual typical weather patterns, etc.). Additionally, FIG. 6 and paragraph 79 teach that the illustrated embodiment of the routine begins in block 605, where information or instructions are received. The routine continues to block 610 to determine whether the instructions in block 605 are to present integrated information for an indicated building, such as in a corresponding GUI. If so, the routine continues to perform blocks 615-650, and otherwise continues to block 690. In particular, if it is determined in block 610 that the instructions received in block 605 are to present integrated information for an indicated building, the routine continues to block 615 to obtain building information of multiple types for the indicated building, such as a 3D model of the building, images of the interior (and optionally, exterior) of the building, videos of the interior (and optionally, exterior) of the building, information about an interactive tour of a plurality of viewing/capture locations within the building interior (and optionally, exterior) at which image and/or other information was captured, a 2-D floor map or other floor plan, audio and/or textual descriptions of particular locations or areas (e.g., rooms, points of interest, etc.), simulated and/or actual lighting information, information about surrounding buildings and/or vegetation and/or other exterior aspects (vehicle traffic, foot traffic, noises, etc.), surfaces and/or areas available for virtual staging or otherwise for adding virtual objects, information about types of building information to use as POIs (e.g., information from automated analysis of visual data of images captured for the building), etc.). Since Eraker in view of Hovden teach a method for using an interface through which a user can view internal and external images of a 3D building model, with the ability to display and view information related to the different images being viewed, and Li teaches a method for using an interface that can display and view information related to different internal and external images of a 3D building model simultaneously, it would have been obvious to a person having ordinary skill in the art to combine these teachings, so that in addition to being able to view the different interior and exterior displayed images, a user could also view multiple images (including interior and exterior images of a 3D building model) and information related to those multiple images simultaneously, if needed.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eraker in view of Hovden to incorporate the teachings of Li, so that the combined features together would provide the user with the capability to view both internal and external images of a 3D building model simultaneously, which would help provide the user with a better overall understanding of the different spatial relationships between the inside and outside of the building.
Regarding claim 22, Eraker discloses a method (FIG. 8 and Col. 4, Lines 46-49 teach that FIG. 8 shows a block diagram of an exemplary embodiment of a system to implement the methods of image-based rendering for real estate and other real scenes disclosed herein) comprising:
accessing interior image frames captured by a mobile device (Col. 10, Lines 4-8 teach that in step 601, input images are collected. The input images can come from various sources such as mobile devices (e.g., smartphones), point-and-shoot cameras, GoPros on a rig, and specialty camera systems such as LadyBug5 from Point Grey). However, Eraker fails to disclose that the mobile device is moved through an interior of a building.
Hovden teaches accessing interior image frames by a mobile device as the mobile device is moved through the interior of a building (Paragraph 50 teaches that the camera operator can then cause the camera to rotate 360°, capturing visual image data (e.g., one or more two-dimensional images) and three-dimensional (3D) data (e.g., depth data) over the course of rotation. The captured data can be combined to generate a 360° panoramic image of the environment with depth data at the scan position/location. The camera operator can then move (or cause the camera to move via a robotic mount) to a new position in the environment and direct the camera to capture a second panoramic image and corresponding panoramic depth data at the new position). Since Eraker teaches capturing input images of a building using a mobile device and Hovden teaches using such a device to capture images as it is moved throughout the interior of an environment/building, it would have been obvious to a person having ordinary skill in the art to combine these teachings, so that the captured images would specifically be of the interior of a building and would be captured as the camera moved throughout that interior.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eraker to incorporate the teachings of Hovden, so that the combination would give a user access to additional images captured by a camera device as it moves throughout the interior of a building.
Additionally, Eraker in view of Hovden disclose accessing exterior image frames captured by an unmanned aerial vehicle ("UAV") as the UAV navigates around an exterior of the building (Col. 10, Lines 8-11 of Eraker teach that the input images can be shot with a tripod, or by holding the device in hand or on top of other devices such as drones, robots or other kinds of automated, semi-automated, or remote controlled devices and paragraph 50 of Hovden teaches that the camera operator can then move (or cause the camera to move via a robotic mount) to a new position in the environment and direct the camera to capture a second panoramic image and corresponding panoramic depth data at the new position);
aligning the interior image frames and the exterior image frames to a coordinate system (Paragraph 59 of Hovden teaches that for example, the position data can include the information provided by the position data for scan images 122 that identifies the aligned position of the scan image relative to the local 3D coordinate system, which corresponds to the capture location of the scan image relative to the local 3D coordinate system. In this regard, in addition to aligned position data for each scan images that reflects its capture location relative to a common 3D coordinate space, each scan image can be associated with optimized GPS coordinates (e.g., that include a latitude and a longitude value) corresponding to the capture location of the scan image relative to the global positioning system);
generating an interface displaying one or more interior image frames in a first interface portion (FIG. 4 and Col. 6, Lines 56-60 of Eraker teach that FIG. 4 shows an embodiment of a User Interface System 400 for image-based rendering of real estate. Three different user interface elements serve the dual purposes of informing the user of his location in the model and simultaneously enabling spatial navigation);
identifying a displayed interior image frame that corresponds to one or more of the accessed exterior image frames using the coordinate system (Col. 7, Lines 3-5 of Eraker teach that the UI element 408 is a text overlay that displays one or more labels associated with the user's location within the virtual model and Col. 8, Lines 1-9 teach that the ground truth data may also include labels (outside: backyard, or groundfloor: bathroom) which are ground truth in the sense that they are directly collected at the scene and are not synthetic approximations. For example, when a user is near a position where ground truth image data is captured, very little geometry is required to render the most photorealistic view of the model. At the exact position of capture, the use of an image primitive is the most photorealistic view possible);
modifying the first interface portion to display an interface element at a location corresponding to the identified displayed interior frame (Col. 13, Lines 21-28 of Eraker teach that two arrows shown between capture locations 2 and 14 and between 11 and 20 represent doors that connect the spatial boundaries in a connectivity graph. In later rendering stages, users will be able to move between the indoor and outdoor areas based on this connectivity graph);
and in response to a selection of the displayed interface element, modifying a second interface portion to display the one or more accessed exterior image frames that correspond to the identified displayed interior frame (Col. 13, Lines 41-45 of Eraker teach that when a user is navigating panoramas, another type of view that a user may request is a transition between panoramas that is indicative of the physical experience of transitioning between the panorama capture locations in the real world. Also, Col. 13, Lines 53-57 of Eraker teach that at decision point 802, the rendering system evaluates the data at its disposal to aid the rendering of a requested viewpoint. If the viewpoint has been sampled as panorama, an image or as a video, then the rendering system can use the sampled raw and/or processed sensor-captured data to render the requested viewpoint in step 503). However, Eraker in view of Hovden fail to disclose modifying a second interface portion to display the one or more accessed exterior image frames that correspond to the identified displayed interior frame simultaneously with the display of the interior image frame within the first interface portion.
Li discloses modifying a second interface portion to display the one or more accessed exterior image frames that correspond to the identified displayed interior frame simultaneously with the display of the interior image frame within the first interface portion (Paragraph 14 teaches that the described techniques provide various benefits in various embodiments, including to use 3D models and/or 2.5D models and/or 2D floor map models of multi-room buildings and other structures (e.g., that are generated from images acquired in the buildings or other structures) to display various types of information about building interiors, such as in a coordinated and simultaneous manner with other types of related information, including to use information about the actual as-built buildings (e.g., internal structural components and/or other interior elements, nearby external buildings and/or vegetation, actual building geographical location and/or orientation, actual typical weather patterns, etc.). Additionally, FIG. 6 and paragraph 79 teach that the illustrated embodiment of the routine begins in block 605, where information or instructions are received. The routine continues to block 610 to determine whether the instructions in block 605 are to present integrated information for an indicated building, such as in a corresponding GUI. If so, the routine continues to perform blocks 615-650, and otherwise continues to block 690. In particular, if it is determined in block 610 that the instructions received in block 605 are to present integrated information for an indicated building, the routine continues to block 615 to obtain building information of multiple types for the indicated building, such as a 3D model of the building, images of the interior (and optionally, exterior) of the building, videos of the interior (and optionally, exterior) of the building, information about an interactive tour of a plurality of viewing/capture locations within the building interior (and optionally, exterior) at which image and/or other information was captured, a 2-D floor map or other floor plan, audio and/or textual descriptions of particular locations or areas (e.g., rooms, points of interest, etc.), simulated and/or actual lighting information, information about surrounding buildings and/or vegetation and/or other exterior aspects (vehicle traffic, foot traffic, noises, etc.), surfaces and/or areas available for virtual staging or otherwise for adding virtual objects, information about types of building information to use as POIs (e.g., information from automated analysis of visual data of images captured for the building), etc.). Since Eraker in view of Hovden teach a method for using an interface through which a user can view internal and external images of a 3D building model, with the ability to display and view information related to the different images being viewed, and Li teaches a method for using an interface that can display and view information related to different internal and external images of a 3D building model simultaneously, it would have been obvious to a person having ordinary skill in the art to combine these teachings, so that in addition to being able to view the different interior and exterior displayed images, a user could also view multiple images (including interior and exterior images of a 3D building model) and information related to those multiple images simultaneously, if needed.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Eraker in view of Hovden to incorporate the teachings of Li, so that the combined features together would provide the user with the capability to view both internal and external images of a 3D building model simultaneously, which would help provide the user with a better overall understanding of the different spatial relationships between the inside and outside of the building.
Response to Arguments
Applicant’s arguments with respect to the independent claims 1, 12 and 20-22 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. The prior art of Li has been incorporated into the rejections of the independent claims and therefore teaches the newly amended claim language (see, respectively, claims 1, 12 and 20-22 above).
The additional arguments regarding dependent claims 2-11 and 13-19 are moot by virtue of those claims’ dependency, because the independent claims are not allowable.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Vincent et al. (U.S. Patent: #10,825,247 B1) teaches operations related to presenting visual information of multiple types about a building and simultaneously presenting other types of related information about the building interior.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to George Renze whose telephone number is (703)756-5811. The examiner can normally be reached Monday-Friday 9:00am - 6:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao Wu can be reached at (571) 272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/G.R./Examiner, Art Unit 2613
/XIAO M WU/Supervisory Patent Examiner, Art Unit 2613