Prosecution Insights
Last updated: April 19, 2026
Application No. 18/684,117

FACADE BIASING FOR REFLECTION CORRECTION IN PHOTOGRAMMETRIC RECONSTRUCTION

Status: Non-Final OA (§103)
Filed: Feb 15, 2024
Examiner: GALERA, PATRICK PAUL CONTRER
Art Unit: 2617
Tech Center: 2600 — Communications
Assignee: Microsoft Technology Licensing, LLC
OA Round: 1 (Non-Final)
Grant Probability: 86% (Favorable)
Predicted OA Rounds: 1-2
Predicted Time to Grant: 2y 5m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 86% (6 granted / 7 resolved), +23.7% vs TC avg (above average)
Interview Lift: +16.7% across resolved cases with interview (strong)
Typical Timeline: 2y 5m average prosecution; 21 applications currently pending
Career History: 28 total applications across all art units

Statute-Specific Performance

§101: 2.1% (-37.9% vs TC avg)
§103: 72.9% (+32.9% vs TC avg)
§102: 18.8% (-21.2% vs TC avg)
§112: 5.2% (-34.8% vs TC avg)

Based on career data from 7 resolved cases; Tech Center averages are estimates.

Office Action (§103)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment/Restriction Requirement

The applicant elects without traverse Group 1 (claims 1-12 and 19-20) in accordance with the remarks filed on December 12, 2025. Claims 13-18 and 20-21 are cancelled, and claims 22-29 are new.

Specification

The disclosure is objected to because of the following informalities: the phrase “received height filed image” should read “received height field image” in paragraph 128. Appropriate correction is required.

Claim Objections

Claims 8 and 28 are objected to because of the following informalities: the phrase “received height filed image” should read “received height field image”. Appropriate correction is required.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-9, 11-12, 19, and 22-29 are rejected under 35 U.S.C. 103 as being unpatentable over Frahm et al. (US 20130060540 A1, hereinafter “Frahm”) in view of Hucks et al. (US 20220091259 A1, hereinafter “Hucks”).

Regarding claim 19, Frahm teaches:

A system (Frahm: ¶55, “... three dimensional models produced by systems/methods according to some embodiments...”; ¶69, “... reflective or specular surfaces can lead to gross errors for LIDAR and stereo ranging techniques. This may lead to large perturbations or spikes on the glass windows of buildings. These errors may be suppressed by systems/methods according to some embodiments...”), comprising: one or more processors; memory in electronic communication with the one or more processors; and instructions stored in the memory, the instructions executable by the one or more processors to: (Frahm: ¶107, “... includes a processor 22, ... include a volatile storage, such as a random access memory (RAM), or a non-volatile storage...”; ¶201-202, “... computer program instructions may be provided to a processor ... such that the instructions, which execute via the processor ... computer program instructions may also be stored in a computer readable memory ...”)

obtain a range-image estimation with an estimation of free-space and filled-space (Frahm: ¶75, “... Each range measurement is taken along a ray 18 originating from a range source (or reference point) RP1, RP2, etc. The space between the range source and the measured range is assumed to be empty (i.e., not occupied by a structure). The point equal to the range measurement is assumed to be full (i.e., occupied by a structure), ...”; NOTE: A free-space is an empty space, not occupied by a structure. A filled-space is the space occupied by a structure.);

identify a feature from the range-image estimation (Frahm: ¶69, “... Range information acquired from city streets or other locations can be used to generate vertical height maps of an urban environment. Because the range source is at ground level, the height map precision may be high enough to capture buildings, terrain, cars, bushes, mailboxes, lamp posts, and other urban features...”; NOTE: The features Frahm’s system identifies in an urban environment from the range information include buildings, terrain, cars, etc., as described in paragraph 69. Frahm’s range information is the range-image estimation.);

in accordance with a determination that the feature is a building or water (Frahm: ¶69, “... capture buildings, terrain, cars, bushes, mailboxes, lamp posts, and other urban features ...”; NOTE: Frahm’s system can distinguish urban features such as buildings from other urban features such as terrain, which includes bodies of water. Therefore, Frahm’s system can determine that the feature is a building or water.):

determine a footprint of the feature (Frahm: ¶74, “... referring to FIG. 1B, a reference plane 12 is illustrated. The reference plane 12, which is illustrated as lying in the x-y plane of a three dimensional Cartesian coordinate system, is divided up into an M×N array of cells, each having dimensions of Δx by Δy. The reference plane 12 may correspond, for example, to a city block, a neighborhood, or any other geographical area...”; NOTE: Figure 1B illustrates the footprint of a building 10 within a reference plane 12.);

generate an observation model with a bias towards filled-space within the footprint of the feature (Frahm: ¶169, “Methods for reconstructing scenes using an n-layer height map are illustrated in FIG. 14. These methods may include the following steps: lay out the volume of interest (Block 502); construct the probabilistic occupancy grid over the volume (Block 504); and compute the n-layer height map (Block 506). Optionally, a mesh may be extracted from the n-layer height map, and texture maps may be generated from the images”; ¶181, “The notation "P(O|Z)" means the probability of O, the occupancy of the voxel, ... The occupancy prior P(O) is used to slightly bias the volume to be empty above the camera center and full below. That is, the occupancies may be initialized to be 0 for voxels above the camera center and 1 below the camera center. This may help to prevent rooftops extending into empty space since the cameras may not observe them from the ground”; NOTE: Frahm’s occupancy values are biased towards the filled space of a building by initializing them to 1. Frahm’s system knows that the volume below the center of the camera is full, which is the filled-space where a building exists. Since the occupancy is initialized to 1 where a building fills a space, and to 0 for voxels above the camera, the bias is towards the filled-space occupied by a building, and the building is within its footprint. The constructed probabilistic occupancy grid of the building is the generated observation model.);

generate a three-dimensional (3D) reconstruction of the feature by applying a voxel reconstruction algorithm to the observation model (Frahm: ¶55, “FIG. 11 illustrates untextured and textured three dimensional models produced by systems/methods according to some embodiments...”; ¶81-83, “Votes may be accumulated by summing. Once the votes have been accumulated for all voxels in a column, the height value may be computed by choosing the height value which reduces, minimizes or otherwise satisfies a cost function. An example of a suitable cost function is the sum of the votes in voxels below the chosen height value minus the sum of the votes in voxels above the chosen height value. Such a cost function may be expressed as: c_z = Σ_{z_v > z} φ_v - Σ_{z_v < z} φ_v (2) ... A minimum of the cost function may be found by looping over all the voxels in the column and recording the height value with the lowest cost.”; ¶162, “... some other depth map fusion techniques, such as C. Zach et al., A Globally Optimal Algorithm For Robust Tv-L1 Range Image Integration, International Conference on Computer Vision (2007), use an occupancy grid for depth map fusion. Although such methods recover a general three dimensional surface from the occupancy grid, systems/methods according to some embodiments may simplify the fusion problem by recovering only a height map. This may allow the systems/methods to be more efficient in terms of processing time and/or memory utilization, potentially providing better scalability ...”; NOTE: The voxel reconstruction algorithm that Frahm uses is disclosed in paragraphs 81-83. Additionally, Frahm references the Tv-L1 voxel reconstruction algorithm in paragraph 162, which is the voxel reconstruction algorithm preferred by the applicant as described in applicant’s disclosure paragraph 70. Frahm discloses that Frahm’s algorithm is more efficient, providing better scalability.);

and store the generated 3D reconstruction of the feature in a database (Frahm: ¶67, “Some embodiments of the present invention provide a fast approach for automatic dense large scale three dimensional urban reconstruction ... the surface is represented by a height map ...”; ¶136, “The final height map may then be stored”; ¶29, “A system for generating a vertical heightmap model of an object ... a three dimensional modeling system coupled to the database and configured to assign weights to respective voxels in a three dimensional grid ...”; ¶88, “... storing and transmitting the full three dimensional polygonal mesh ...”).

Although Frahm can distinguish and classify features in an urban environment such as buildings, terrain, cars, and other urban features, and can determine footprints of a building as shown in Fig. 1B based on range information acquired from city streets, Frahm does not use a land-cover classification and fails to teach: determine whether the feature is one of a building or a water based on a land-cover classification.

The analogous art Hucks teaches: determine whether the feature is one of a building or water based on a land-cover classification (Hucks: ¶32, “... The system 30 uses image semantic segmentation to classify land-use land-cover (LULC) features ...”; ¶61, “... Further optimization may be achieved by performing supervised landmark-based image segmentation that employs game-theoretic concepts. This is done by creating a reward matrix with land cover classifications and different model solvers, as shown in the table 85 of FIG. 6. The reward matrix illustratively includes land cover classifications and different model solvers as shown... In the simulation, seven classes were used, namely: water; roads; vegetation low; vegetation medium; vegetation high; built up areas (BUAs); and bare earth...”; ¶100, “... the outputs of the processing paths are collectively classified using the above-described classifications (e.g., buildings, water, roads, vegetation, etc.) ...”; NOTE: Hucks’s system uses land-cover classification to distinguish and determine whether a feature is a building or water.)

It would have been obvious to a person having ordinary skill in the art (PHOSITA) before the effective filing date of the claimed invention to combine Frahm and Hucks to include: determine whether the feature is one of a building or water based on a land-cover classification. Hucks ¶49 acknowledges that using semantic labels can guide the 3D reconstruction and “can be more readily predicted by measuring the difference in appearance with respect to a given semantic class. The incorporation of semantic features enables better results to be achieved, with simpler models”. The semantic labeling to classify features such as building or water is taught by Hucks.
The reason for doing so is to allow “not only for the support of multi-spectral and panchromatic images, but also the use of images with and without sensor information” (Hucks: ¶40) and because “the land cover classification labels assist with decision analytics. Within this formulation, a weighted reward matrix is used for consistent labeling of height values with classification factors, resulting in higher accuracy and precision” (Hucks: ¶60).

Regarding claim 22, depending on claim 19: the combination of Frahm and Hucks teaches the system of claim 19. Frahm further teaches: wherein the observation model includes an observation volume that aggregates a plurality of range-image estimations of the free-space and the filled-space for the feature (Frahm: ¶169, “... reconstructing scenes using an n-layer height map are illustrated in FIG. 14. These methods may include the following steps: lay out the volume of interest (Block 502); construct the probabilistic occupancy grid over the volume...”; NOTE: The occupancy grid is the observation model and the observation volume is the volume of interest; ¶98, “Free and filled space votes are accumulated from the viewing rays of all depth maps”; ¶105, “... The inputs to these systems/methods are one or more video sequences along with the estimated camera poses and the intrinsic calibration for every frame, a depth map for every frame ...”; ¶139, “Each voxel v is projected into every depth map, and a vote φ_p for each depth pixel p is computed as described above.”; NOTE: The observation model including an observation volume is Frahm’s occupancy grid. The plurality of range-image estimations is the depth map for every frame. By projecting each voxel into every depth map, the system aggregates the plurality of range-image estimations of the free-space and filled-space for the feature, accumulated from the viewing rays of all depth maps.)
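The column-wise vote minimization that the rejection of claim 19 quotes from Frahm ¶81-83 can be sketched in a few lines. This is a minimal illustration of the quoted cost function (equation (2)), not code from either reference; the vote values used below are hypothetical.

```python
import numpy as np

def column_height(phi):
    """Choose the surface height index for one voxel column.

    phi[i] is the accumulated vote for voxel i (positive = evidence the
    voxel is filled, negative = evidence it is empty), a hypothetical
    stand-in for Frahm-style free/filled-space votes. Implements the
    quoted cost c_z = sum(phi above z) - sum(phi below z) and returns
    the height index with the lowest cost.
    """
    costs = [phi[z + 1:].sum() - phi[:z].sum() for z in range(len(phi))]
    return int(np.argmin(costs))
```

Looping over every candidate height, as Frahm ¶81-83 describes, makes this O(n²) per column; a prefix-sum over the votes would reduce it to O(n) without changing the result.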
Regarding claim 23, depending on claim 19: the combination of Frahm and Hucks teaches the system of claim 19. Frahm further teaches: wherein the observation model includes filled-spaces where free-spaces existed in the range-image estimation for one or more non-Lambertian features of the feature (Frahm: ¶69, “... Range information acquired ... used to generate vertical height maps of an urban environment... Note that a single height map cannot adequately capture overhanging structures, such as eves, awnings, tree branches, etc. However, errors in the range measurements which would create such overhanging structures may also be suppressed. For example, reflective or specular surfaces can lead to gross errors for LIDAR and stereo ranging techniques. This may lead to large perturbations or spikes on the glass windows of buildings. These errors may be suppressed by systems/methods according to some embodiments...”; NOTE: The non-Lambertian feature of the feature is the reflective or specular surface, such as the surface of a glass window on a building. The position where errors in the range measurements are measured is the free-space area occupied by an object with a non-Lambertian feature, such as a reflective glass window. Therefore, the free-spaces existed and are detected in the range-image estimation when acquiring range information. The filled-space corresponds to a wall which fills the space, and the free-space corresponds to the window having non-Lambertian features.)

Regarding claim 24, depending on claim 19: the combination of Frahm and Hucks teaches the system of claim 19. Frahm further teaches that the system of claim 19 further comprises receiving a height field image with the feature (Frahm: ¶138, “... a vertical height map model may be generated from image, position and depth map data by first selecting a column of interest from the reference plane...”; NOTE: The height field image received is the height map model generated from the image. The reference plane includes the building footprint as illustrated in Fig. 1B. The feature is the building 10.)

However, Frahm fails to teach: wherein the bias towards filled-space is determined by a height value of the feature. The analogous art Hucks teaches: wherein the bias towards filled-space is determined by a height value of the feature (Hucks: ¶49, “... produces pixel-based height maps ...”; ¶59, “... predict the scene heights by knowing the relationships between the features. Estimating height from image features ... image-to-height learning algorithm...”; ¶107, “... a model may be trained to recognize image features of differing heights using CNN ...”; ¶61, “Further optimization may be achieved... as shown in the table 85 of FIG. 6. The reward matrix illustratively includes land cover classifications ... seven classes were used, namely: water; roads; vegetation low; vegetation medium; vegetation high; built up areas (BUAs); and bare earth...”; ¶63, “There is a need for detailed surface representations ... for which the building is necessary... 3D land-use zoning, and allowed building volumes, usage, and density. They are the main tools that help define the image of a city and bring into focus ... each land use/land cover feature may be used for optimal decision making of which model in the ensemble should be chosen per voxel...”; NOTE: Similar to Frahm, Hucks also generates height maps of the features along with their classifications. In Fig. 6, a weighted reward matrix (MAX/MIN) is assigned to land-cover classifications. A MAX bias for optimization is assigned toward BUA and VEG_HIGH, which have heights taller than the other classifications. The BUA or VEG_HIGH both occupy or fill spaces. Therefore, the MAX bias toward filled-space is determined by a height value of the feature, the feature being the BUA or VEG_HIGH. The system assigns a MAX bias to identify the buildings necessary for detailed surface representations.)
It would have been obvious to a person having ordinary skill in the art (PHOSITA) before the effective filing date of the claimed invention to combine Frahm and Hucks and include: wherein the bias towards filled-space is determined by a height value of the feature. The reason for doing so is “for consistent labeling of height values with classification factors, resulting in higher accuracy and precision” (Hucks: ¶61), to provide “enhanced geospatial models (e.g., DSMs) for next generation mapping applications (e.g., Google Earth, NGA Virtual Earth, etc.)” (Hucks: ¶65), and to be “used for numerous commercial and civil applications, such as: 3D Data (and 3D change) for energy exploration, mining/site assessment and remediation, power/utilities facilities and corridors, infrastructure/urban planning, disaster response/mitigation, wireless modeling, etc. Other example applications may include volumetric processing, such as for EO and SAR applications” (Hucks: ¶65).

Regarding claim 25, depending on claim 24: the combination of Frahm and Hucks teaches the system of claim 24. Hucks further teaches: wherein the bias toward filled-space is proportional to the height value (Hucks: ¶62, “use a linear program to optimally guide the height prediction with feature classes from imagery”; ¶62, “The GTO module 56 may solve the reward matrix using a linear program...”; NOTE: A linear program will output a proportional value. The reward matrix outputs biases toward filled-space proportional to the height value. In reference to Figure 6, vegetation is classified into three feature types: VEG_LOW with low-height features, VEG_MED with medium-height features, and VEG_HIGH with high-height features. The rewards assigned are proportional to their heights: VEG_LOW gets a MIN bias, VEG_MED gets a MEAN bias, and VEG_HIGH gets a MAX bias.)
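The height-proportional bias reasoning for claims 24-25 (MIN/MEAN/MAX rewards per land-cover class) can be illustrated with a toy mapping. The class names mirror the seven-class list quoted from Hucks FIG. 6, but the nominal heights and the linear bias rule are illustrative assumptions, not values from Frahm, Hucks, or the application.

```python
# Hypothetical nominal heights (meters) per land-cover class; the
# seven class names follow the list quoted from Hucks FIG. 6.
CLASS_HEIGHT_M = {
    "WATER": 0.0,
    "ROADS": 0.0,
    "BARE_EARTH": 0.0,
    "VEG_LOW": 0.5,
    "VEG_MED": 3.0,
    "VEG_HIGH": 12.0,
    "BUA": 20.0,  # built up areas
}

def filled_space_bias(feature_class, max_height_m=20.0):
    # Bias in [0, 1], proportional to the class's nominal height, so
    # tall classes (BUA, VEG_HIGH) pull the occupancy model hardest
    # toward filled space, matching the MAX/MEAN/MIN ordering above.
    return CLASS_HEIGHT_M[feature_class] / max_height_m
```

Any monotone function of height would preserve the claim-25 proportionality argument; the linear form is just the simplest choice.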
Regarding claim 26, depending on claim 19: the combination of Frahm and Hucks teaches the system of claim 19. However, Frahm fails to teach: wherein the bias toward filled-space is determined by analyzing the footprint of the feature. The analogous art Hucks teaches: wherein the bias toward filled-space is determined by analyzing the footprint of the feature (Hucks: ¶34, “... Automatic extraction of image areas that represent a feature of interest involves two steps: accurate classification of pixels that represent the region, while minimizing misclassified pixels, and vectorization, which extracts a contiguous boundary along each classified region. This boundary, when paired with its geo-location, can be inserted into a feature database independent of the image ...”; ¶61-64, “Further optimization may be achieved by performing supervised landmark-based image segmentation that employs game-theoretic concepts. This is done by creating a reward matrix with land cover classifications and different model solvers, as shown in the table 85 of FIG. 6. The reward matrix illustratively includes land cover classifications and different model solvers as shown...”; ¶63, “There is a need for detailed surface representations ... for which the building is necessary... 3D land-use zoning, and allowed building volumes, usage, and density. They are the main tools that help define the image of a city and bring into focus ... each land use/land cover feature may be used for optimal decision making of which model in the ensemble should be chosen per voxel...”; NOTE: The footprint, which is a boundary of a region, is analyzed and extracted as described in paragraph 34 >> the system uses semantic segmentation and land-cover classification to classify whether the feature is a building, water, vegetation, etc., as described in paragraph 61 >> the system applies a MAX bias toward filled-space regions, such as regions with built up areas (BUA) with buildings and/or high vegetation which occupies and fills space, to be used for optimal decision making of which model should be chosen per voxel.)

It would have been obvious to a person having ordinary skill in the art (PHOSITA) before the effective filing date of the claimed invention to combine Frahm and Hucks and include: wherein the bias toward filled-space is determined by analyzing the footprint of the feature. The reason for doing so is to provide “enhanced geospatial models (e.g., DSMs) for next generation mapping applications (e.g., Google Earth, NGA Virtual Earth, etc.)” (Hucks: ¶65) and to be “used for numerous commercial and civil applications, such as: 3D Data (and 3D change) for energy exploration, mining/site assessment and remediation, power/utilities facilities and corridors, infrastructure/urban planning, disaster response/mitigation, wireless modeling, etc. Other example applications may include volumetric processing, such as for EO and SAR applications” (Hucks: ¶65).

Regarding claim 27, depending on claim 19: the combination of Frahm and Hucks teaches the system of claim 19. Frahm further teaches: further comprising determining whether the generated observation model includes one or more facades, wherein the bias toward filled-space is applied to the one or more facades (Frahm: ¶70, “Generating height maps of urban environments leads to an effective way of finding vertical surfaces, such as building facades... In a vertical height map, these vertical surfaces can be identified as discontinuities or large and abrupt changes in the height values”; ¶99, “Facades of buildings are of particular interest in urban modeling. In a height map representation according to some embodiments, facades appear as large depth gradients between the roof tops and the ground below. 
These height discontinuities may be detected with a threshold, and strictly vertical polygons may be generated to connect the ground and roof, as discussed below in connection with FIG. 3 ...”; ¶169, “Methods for reconstructing scenes using an n-layer height map are illustrated in FIG. 14. These methods may include the following steps: lay out the volume of interest (Block 502); construct the probabilistic occupancy grid over the volume (Block 504); and compute the n-layer height map (Block 506). Optionally, a mesh may be extracted from the n-layer height map, and texture maps may be generated from the images”; ¶181, “The notation "P(O|Z)" means the probability of O, the occupancy of the voxel, ... The occupancy prior P(O) is used to slightly bias the volume to be empty above the camera center and full below. That is, the occupancies may be initialized to be 0 for voxels above the camera center and 1 below the camera center”; NOTE: In reference to Figure 14, at Block 504 the observation model, which is the probabilistic occupancy grid, is generated. At Block 506, the height map, which is the 3D reconstruction, is generated. After the height map is generated, the facades are identified as discontinuities or large and abrupt changes in the height values, and they appear as large depth gradients between the roof tops and the ground below. As also discussed in the claim 19 rejection and in reference to paragraph 169, the occupancy of the voxel is based on an occupancy prior used to slightly bias the volume below the camera center; this is the volume where a building filling the space is located. Biasing towards an area where a building is located is a bias toward filled-space. Since the facades are part of the building and are physically located within the biased volume, the bias toward filled-space is inherently applied to the facades. In summary: range information >> generate the observation model, which is the probabilistic occupancy grid, with a bias towards filled-space, which is the volume below the camera center where a building is located >> generate the height map >> determine facades.)

Regarding claim 28, depending on claim 27: the combination of Frahm and Hucks teaches the system of claim 27. Frahm further teaches: wherein determining whether the generated observation model includes the one or more facades further includes analyzing a received height field image using image processing edge detection (Frahm: ¶70, “Generating height maps of urban environments leads to an effective way of finding vertical surfaces, such as building facades. In ground-based city modeling, facades are of major importance, since they fill much of the field of view from an urban viewpoint. In a vertical height map, these vertical surfaces can be identified as discontinuities or large and abrupt changes in the height values”; ¶99, “Facades of buildings are of particular interest in urban modeling. In a height map representation according to some embodiments, facades appear as large depth gradients between the roof tops and the ground below. These height discontinuities may be detected with a threshold, and strictly vertical polygons may be generated to connect the ground and roof, as discussed below in connection with FIG. 3”; NOTE: Vertical surfaces such as building facades are edges of a building. The height field image is the height map. The building facades/vertical surfaces/edges are identified in a height map as discontinuities in the height values, detected with a threshold. Therefore, the discontinuity detection from analyzing the received height map of Frahm is the analyzing of the height field image using image processing edge detection.)
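The facade-as-height-discontinuity detection attributed to Frahm ¶70/¶99 in the claim 27-28 rejections amounts to thresholding the gradient of a height field. A minimal sketch, assuming a simple absolute-difference gradient and an arbitrary threshold (neither reference specifies these particulars):

```python
import numpy as np

def facade_mask(height_map, threshold=5.0):
    # Flag cells where the height field changes abruptly between
    # neighbors: the "large depth gradients" between rooftops and
    # ground that the Office Action cites. `threshold` is an assumed
    # value in the height map's own units.
    dy = np.abs(np.diff(height_map, axis=0))
    dx = np.abs(np.diff(height_map, axis=1))
    mask = np.zeros_like(height_map, dtype=bool)
    mask[:-1, :] |= dy > threshold
    mask[:, :-1] |= dx > threshold
    return mask
```

On a flat height field the mask is empty; around a raised building block the mask lights up exactly along the block's footprint boundary, which is where Frahm's vertical facade polygons would be generated.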
Regarding claim 29, depending on claim 27: the combination of Frahm and Hucks teaches the system of claim 27. Frahm further teaches: wherein determining whether the generated observation model includes one or more facades includes analyzing the footprint of the feature (NOTE: Frahm’s workflow: range data is acquired as described in paragraph 72 (see also Fig. 1A) >> the footprint of the feature/building is analyzed as described in paragraphs 73-74 (see also Fig. 1B) >> the observation model, which is the occupancy grid, is generated by biasing the volume below the camera center to be full or filled-space as described in paragraphs 172-181 >> the height map/3D reconstruction is generated as described in paragraphs 70 and 73 and Fig. 14, Block 506 >> facades are identified using a discontinuity threshold as edge detection. Since analyzing the footprint is part of Frahm’s workflow, the determining whether the generated observation model includes one or more facades includes analyzing the footprint of the feature/building.)

Regarding claims 1-9: Claims 1-9 are drawn to the method corresponding to the instructions of using same as claimed in apparatus claims 19 and 22-29, respectively, and are rejected for the same reasons of obviousness as used above.

Regarding claim 11, depending on claim 1: the combination of Frahm and Hucks teaches the method of claim 1. However, Frahm fails to teach: wherein one or more machine learning models perform one or more of the estimation of free-space and filled-space for a feature, generating the observation model, determining whether the feature is a building or water, applying the voxel reconstruction algorithm, or generating the observation model.
The analogous art Hucks teaches: wherein one or more machine learning models perform one or more of the estimation of free-space and filled-space for a feature, generating the observation model, determining whether the feature is a building or water, applying the voxel reconstruction algorithm, or generating the observation model (Hucks: ¶32, “... Deep learning on geospatial data is performed with a convolutional neural network (CNN) trained end-to-end. The system 30 uses image semantic segmentation to classify land-use land-cover (LULC) features...”; ¶57, “Semantic segmentation uses a label for each pixel. The system 30 may use deep learning to determine a precise measurement of land-use/land-cover from high-resolution aerial imagery to differentiate classes with similar visual characteristics. To assign a classification of features over an image, supervised learning ...”; ¶61, “... seven classes were used, namely: water; roads; vegetation low; vegetation medium; vegetation high; built up areas (BUAs); and bare earth ...”; NOTE: Hucks uses machine learning, which is a CNN, for land-cover classification, which includes building and water. Therefore, Hucks teaches wherein one or more machine learning models (CNN) perform one or more of ... determining whether the feature is a building or water ...)

It would have been obvious to a person having ordinary skill in the art (PHOSITA) before the effective filing date of the claimed invention to combine Frahm and Hucks and include wherein one or more machine learning models perform one or more of the estimation of free-space and filled-space for a feature, generating the observation model, determining whether the feature is a building or water, applying the voxel reconstruction algorithm, or generating the observation model. The reason for doing so is “for consistent labeling of height values with classification factors, resulting in higher accuracy and precision” (Hucks: ¶60).
Regarding claim 12, depending on claim 1: the combination of Frahm and Hucks teaches the method of claim 1. Frahm further teaches: further comprising presenting the generated 3D reconstruction of the feature in an image (Frahm: Fig. 11; ¶163, “FIG. 11 illustrates untextured and textured three dimensional models produced by systems/methods according to some embodiments. This challenging scene features many reflective cars and glass store-front facades. Similarly, FIG. 12 illustrates untextured and textured three dimensional models of a residential scene produced by systems/methods according to some embodiments”; ¶110, “The height map model and/or the textured or untextured three dimensional mesh representation can be output to a display server 52 and rendered as a three dimensional rendering 56 on a monitor screen 54 connected to the display server 52”).

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Frahm in view of Hucks, further in view of Jin et al. (US 20060045351 A1, hereinafter “Jin”).

Regarding claim 10, depending on claim 1: the combination of Frahm and Hucks teaches the method of claim 1. Although Hucks teaches rectifying the stereo-geographic image data on the rectification surface as described in paragraph 70, and although both Frahm and Hucks teach footprint identification, the combination of Frahm and Hucks still fails to teach: wherein determining the footprint of the feature is based on an orthorectified image of the feature. The analogous art Jin teaches: wherein determining the footprint of the feature is based on an orthorectified image of the feature (Jin: Abstract, “The present invention can extract a change not only from an orthorectified image ... Foot-print information is extracted from each of images ...”; ¶9, “... In this way, it is possible to readily realize the extraction of change information from new and old (aerial, satellite) images such as a monocular vision, an orthorectified image and the like ...”; ¶23, “... it links extracted polygon lines of the building region’s boundary and extracts a boundary shape of region ...”; NOTE: Jin determines the footprint of the feature by extracting the footprint information from the orthorectified images. The satellite images from which the footprints are extracted are orthorectified images of buildings. Therefore, the footprint determination is based on an orthorectified image of the feature/building.)

It would have been obvious to a person having ordinary skill in the art (PHOSITA) before the effective filing date of the claimed invention to combine Frahm, Hucks, and Jin and include wherein determining the footprint of the feature is based on an orthorectified image of the feature. The reason for doing so is to provide the capability of detecting a change in an object by comparing object foot-prints, which are comprised of object boundary properties, inner image properties, and positional relationship properties of objects generated from an orthorectified image (Jin: ¶5).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PATRICK GALERA, whose telephone number is (571) 272-5070. The examiner can normally be reached Mon-Fri 0800-1700 ET.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, King Poon, can be reached at 571-270-0728. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users.
To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /PATRICK P GALERA/Examiner, Art Unit 2617 /KING Y POON/Supervisory Patent Examiner, Art Unit 2617
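For context on the limitation in dispute for claim 10, the footprint-from-orthoimage step that the examiner maps to Jin can be sketched roughly as follows. This is an illustrative toy sketch only (plain Python over a small height grid standing in for an orthorectified image), not Jin's or the applicant's actual algorithm; the function name, thresholds, and grid are all hypothetical.

```python
from collections import deque

def building_footprints(height_grid, ground_level=0.0, min_height=2.0):
    """Extract building footprints from a toy 'orthorectified' height grid.

    Cells taller than ground_level + min_height are treated as building
    pixels; 4-connected components are grown by BFS flood fill, and each
    footprint is summarized as an axis-aligned bounding box
    (min_row, min_col, max_row, max_col).
    """
    rows, cols = len(height_grid), len(height_grid[0])
    is_building = [[h - ground_level > min_height for h in row]
                   for row in height_grid]
    seen = [[False] * cols for _ in range(rows)]
    footprints = []
    for r in range(rows):
        for c in range(cols):
            if is_building[r][c] and not seen[r][c]:
                # BFS flood fill over 4-connected neighbours
                component, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    component.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and is_building[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                ys = [p[0] for p in component]
                xs = [p[1] for p in component]
                footprints.append((min(ys), min(xs), max(ys), max(xs)))
    return footprints

# Toy 5x5 ortho height grid with one 2x2 building of height 10 m
grid = [
    [0, 0, 0, 0, 0],
    [0, 10, 10, 0, 0],
    [0, 10, 10, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(building_footprints(grid))  # -> [(1, 1, 2, 2)]
```

Jin's disclosure extracts polygonal boundary lines rather than bounding boxes; the box here simply keeps the sketch short while preserving the claimed idea that the footprint is determined from the orthorectified image itself.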

Prosecution Timeline

Feb 15, 2024
Application Filed
Mar 06, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602567
SYSTEM AND METHOD FOR RENDERING A VIRTUAL MODEL-BASED INTERACTION
2y 5m to grant · Granted Apr 14, 2026
Patent 12597184
IMAGE PROCESSING METHOD AND APPARATUS, DEVICE AND READABLE STORAGE MEDIUM
2y 5m to grant · Granted Apr 07, 2026
Patent 12586549
Image conversion apparatus and method having timing reconstruction mechanism
2y 5m to grant · Granted Mar 24, 2026
Patent 12579921
ELECTRONIC DEVICE HAVING FLEXIBLE DISPLAY AND METHOD FOR CONTROLLING THE SAME
2y 5m to grant · Granted Mar 17, 2026
Patent 12491085
SYSTEMS AND METHODS FOR ORTHOPEDIC IMPLANT FIXATION
2y 5m to grant · Granted Dec 09, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
86%
Grant Probability
99%
With Interview (+16.7%)
2y 5m
Median Time to Grant
Low
PTA Risk
Based on 7 resolved cases by this examiner. Grant probability derived from career allow rate.
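The projection figures above follow directly from the examiner's career data (6 granted of 7 resolved ≈ 86%). One plausible way the 99% with-interview figure could be derived is the career allow rate plus the interview lift in fractional points, capped at 99%; this combination rule is an assumption for illustration, not the dashboard's documented formula.

```python
def projected_grant_probability(granted, resolved, interview_lift=0.0, cap=0.99):
    """Career allow rate from resolved cases, optionally boosted by an
    interview lift (as a fraction) and capped.

    The additive-plus-cap rule is assumed for illustration only.
    """
    base = granted / resolved
    return min(base + interview_lift, cap)

base = projected_grant_probability(6, 7)             # ~0.857, shown as 86%
with_interview = projected_grant_probability(6, 7, 0.167)  # capped at 0.99
print(round(base, 3), with_interview)  # -> 0.857 0.99
```

Note that with only 7 resolved cases, both figures carry wide uncertainty; a small-sample rate like 6/7 can shift substantially with each newly resolved application.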
