Prosecution Insights
Last updated: April 19, 2026
Application No. 18/580,038

IMAGE MATCHING APPARATUS, CONTROL METHOD, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM

Non-Final OA — §101, §103
Filed
Jan 17, 2024
Examiner
AUGUSTIN, MARCELLUS
Art Unit
2682
Tech Center
2600 — Communications
Assignee
NEC Corporation
OA Round
1 (Non-Final)
82%
Grant Probability
Favorable
1-2
OA Rounds
2y 8m
To Grant
98%
With Interview

Examiner Intelligence

Grants 82% — above average
82%
Career Allow Rate
684 granted / 838 resolved
+19.6% vs TC avg
Strong +16% interview lift
+15.9%
Interview Lift
grant rate with vs. without an interview, across resolved cases with interview data
Typical timeline
2y 8m
Avg Prosecution
31 currently pending
Career history
869
Total Applications
across all art units

Statute-Specific Performance

§101: 11.0% (-29.0% vs TC avg)
§103: 50.7% (+10.7% vs TC avg)
§102: 18.5% (-21.5% vs TC avg)
§112: 12.0% (-28.0% vs TC avg)
Tech Center averages are estimates • Based on career data from 838 resolved cases

Office Action

§101 §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA. The IDSs filed 02/13/2025 and 01/17/2024 have been entered and considered. The amendments/remarks of 01/17/2024 have been entered. Claims 3-7, 10-14, and 17-20 have been amended, and dependent claim 21 has been cancelled. Claims 1-20 remain pending. Please refer to the action below.

Examiner Notes

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. However, the claimed subject matter, not the specification, is the measure of the invention.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Claim 1 recites mental processes and software processes directed to an apparatus, a memory, a camera, and one or more processors. Claim 1 is treated as the representative claim; independent claims 8 and 15 recite similar limitations, so separate discussion of them is omitted for brevity.

Independent claim 1 includes limitations that recite an abstract idea. Claim 1 recites: "An image matching apparatus comprising: at least one memory that is configured to store instructions; and at least one processor that is configured to execute the instructions to: acquire a ground-view image, an aerial-view image, a ground depth image, and an aerial depth image, the ground depth image being an image that indicates distance from a ground camera to each location captured in the ground-view image, the aerial depth image being an image that indicates distance from a center location captured in the aerial-view image to each location captured in the aerial-view image; extract features from the ground-view image and the ground depth image to compute ground feature; extract features from the aerial-view image and the aerial depth image to compute aerial feature; and determine whether or not the ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature."

The claim recites the steps of computing a ground feature from features extracted from the ground-view image and the ground depth image, similarly computing an aerial feature from features extracted from the aerial-view image and the aerial depth image, and matching the ground-view image and the aerial-view image based on the extracted ground and aerial features. These steps fall within the mathematical concepts grouping of abstract ideas, as the matching and the feature computation are performed through mathematical calculations.
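For orientation, the pipeline recited in claim 1 can be sketched as two-branch feature extraction (each view stacked with its depth image) followed by a feature-space match decision. This is a minimal illustration under assumptions, not code from the application or the cited references: the toy global-mean extractor, cosine similarity, and threshold value are all hypothetical stand-ins.

```python
# Hypothetical sketch of the pipeline recited in claim 1; all names and
# choices are illustrative, not from the application or the cited art.
import numpy as np

def extract_feature(view_img: np.ndarray, depth_img: np.ndarray) -> np.ndarray:
    """Stand-in feature extractor: stack the view image (H, W, 3) with its
    depth image (H, W) and reduce to a fixed-length, L2-normalized descriptor."""
    stacked = np.concatenate([view_img, depth_img[..., None]], axis=-1)
    feat = stacked.mean(axis=(0, 1))                 # toy global descriptor
    return feat / (np.linalg.norm(feat) + 1e-8)      # L2-normalize

def images_match(ground_img, ground_depth, aerial_img, aerial_depth,
                 threshold: float = 0.8) -> bool:
    ground_feat = extract_feature(ground_img, ground_depth)   # "ground feature"
    aerial_feat = extract_feature(aerial_img, aerial_depth)   # "aerial feature"
    similarity = float(ground_feat @ aerial_feat)             # cosine similarity
    return similarity >= threshold                            # match decision
```

The final threshold comparison also mirrors the similarity test recited in dependent claims 3, 10, and 17.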
The claim also recites the steps of acquiring the said images, extracting features from them, and determining whether the ground-view image and the aerial-view image match each other; these steps may further be performed practically in the human mind as mental processes, by observing the images and performing an evaluation to determine whether the ground-view image and the aerial-view image match based on the extracted image features. Thus, the claim recites a mathematical distance calculation as well as a mathematical computing-and-matching calculation, both of which fall within the mathematical concepts grouping of abstract ideas.

As explained in the MPEP, when a claim recites multiple abstract ideas that fall in the same or different groupings, examiners should consider the limitations together as a single abstract idea, rather than as a plurality of separate abstract ideas to be analyzed individually. See MPEP 2106.04, subsection II.B. As the feature-extraction and matching steps fall within the same grouping of abstract ideas (i.e., mathematical concepts), these limitations are considered together as a single abstract idea for further analysis. Accordingly, the claim recites an abstract idea.

This judicial exception is not integrated into a practical application because the claims merely recite mental steps that can be performed by a person and/or software steps that can be performed by components or units of software. That is, other than reciting "extract features from the ground-view image and the ground depth image to compute ground feature; extract features from the aerial-view image and the aerial depth image to compute aerial feature; and determine whether or not the ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature," nothing in the claim precludes the steps from practically being performed in the mind and/or purely by software. These limitations do not include additional elements that are sufficient to amount to significantly more than the judicial exception. Hence, claim 1 is not subject-matter eligible.

The dependent claims 2-7, 9-14, and 16-20 do not recite any further limitations that cause the claims to be subject-matter eligible. Rather, the limitations of the dependent claims are directed toward additional aspects of the judicial exception and/or well-understood, routine, and conventional additional elements that do not integrate the judicial exception into a practical application. Based on the broadest reasonable interpretation of the claims, all of the steps recited in independent claims 1, 8, and 15 and their dependent claims 2-7, 9-14, and 16-20 correspond to concepts performed by at least software components, which may further be performed in the human mind. Additionally, a person can mentally perform the assessing of whether or not the ground-view image and the aerial-view image match each other based on the extracted ground feature and aerial feature.
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea. Concepts performed in the human mind have been identified in the 2019 PEG as an exemplar in the "Mental Processes" grouping of abstract ideas.

For the reasons above, the claims do not amount to significantly more than an abstract idea. Even when considered in combination, these additional elements represent mere instructions to apply an exception and insignificant extra-solution activity, which do not provide an inventive concept; therefore, the claims are not patent-eligible. Furthermore, these additional generic hardware elements perform no more than their basic computer functions. Generic computer implementation of a method is not a meaningful limitation that alone can amount to significantly more than an abstract idea. Moreover, when viewed as a whole with such additional elements considered as an ordered combination, claims modified by adding generic hardware elements are nothing more than a purely conventional computerized implementation of an idea in the general field of computer processing and do not provide significantly more than an abstract idea. Consequently, the identified additional generic hardware elements, considered individually and in combination, fail to amount to significantly more than the abstract idea above.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 8, 10, 15, and 17 are rejected under 35 U.S.C. 103 as obvious over Lin et al. (NPL, cited in the IDS) in view of Seitz et al. (US 8761457 A1).

Regarding claim 1, Lin teaches an image matching apparatus (Lin teaches a deep cross-view image matching network computing system in at least section 3 of page 5009, corresponding to at least Figs. 1-3, comprising said image matching apparatus to retrieve or extract ground-view and aerial-view image features and to perform cross-view pair matching of said images based on their extracted features), comprising: at least one memory that is configured to store instructions (a well-known memory is known in the art to store said deep cross-view image matching network model of at least section 3 of page 5009 and Figs. 1-3); and at least one processor (a well-known processor, as implied, is utilized to execute the deep cross-view image matching network model of at least section 3 of page 5009 and Figs. 1-3) that is configured to execute the instructions to: acquire a ground-view image, an aerial-view image, a ground depth image, and an aerial depth image, the ground depth image being an image that indicates distance from a ground camera to each location captured in the ground-view image, the aerial depth image being an image that indicates distance from a center location captured in the aerial-view image to each location captured in the aerial-view image (at least sections 1-3 of pages 5008-5009 and Figs. 1-3 further teach the acquired ground-view images, aerial-view images, ground depth data, and aerial depth data; the ground depth image data, as understood in the art, indicates distance, near or far, from a ground camera to each location captured in the ground-view image, and said aerial depth image data likewise expresses a distance from a center location captured in the aerial-view image to each location captured in the aerial-view image); extract features from the ground-view image and the ground depth image to compute ground feature; extract features from the aerial-view image and the aerial depth image to compute aerial feature; and determine whether or not the ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature (the system, at least in Fig. 2 and section 3, determines according to the cross-view image pair matching whether or not the ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature).

Lin teaches, at least in pages 5007-5008 and Figs. 1-3, the claimed invention of determining and extracting ground-view depth data and aerial-view depth information correlating to at least cross-view analysis of whether the ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature, except for specifically citing said acquired ground depth image and aerial depth image, the ground depth image being an image that indicates distance from a ground camera to each location captured in the ground-view image, the aerial depth image being an image that indicates distance from a center location captured in the aerial-view image to each location captured in the aerial-view image, and said extraction of features from the ground depth image to compute the ground feature and from the aerial depth image to compute the aerial feature.

Seitz teaches, in at least Figs. 1-2, the acquiring of street-view images 100 and, at least in Fig. 2, depth map images 110 of the street-view images 100; further, in Fig. 3, the acquiring of aerial-view images 130 and, in Fig. 4, corresponding aerial depth map images associated with the aerial-view images 130; and further, in at least Fig. 5, identifying or extracting features of the aerial images and ground-based images and performing alignment of the said images to determine at least whether or not said ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature.
It would have been obvious to one of ordinary skill in the art at the time the invention was made to combine the teachings of Lin in view of Seitz to include said acquired ground depth image and aerial depth image, the ground depth image being an image that indicates distance from a ground camera to each location captured in the ground-view image, the aerial depth image being an image that indicates distance from a center location captured in the aerial-view image to each location captured in the aerial-view image, and said extraction of features from the ground depth image to compute the ground feature and from the aerial depth image to compute the aerial feature, as discussed above. Lin and Seitz are in the same field of mapping ground-view images to aerial-view images based on detected image features. Seitz's ground depth images and aerial depth map images, corresponding to the respective ground-view and aerial-view images, complement Lin's cross-view feature matching of acquired ground-view and aerial-view images: combined with Lin's camera-angle information for the ground and aerial views, the depth map images facilitate alignment of the images for determining whether or not said ground-view image and aerial-view image match each other based on the detected image features, which ultimately helps realize, in one case, target object position detection and/or implied distance to an object of interest according to known methods, yielding predictable results. Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art; the combination is thus the adaptation of an old idea or invention using newer, commonly available and understood technology, and therefore a variation on already known art (see MPEP 2143, KSR Exemplary Rationale F).

Regarding claim 3 (depending from claim 1), Lin further teaches wherein the determination, per Fig. 2, of whether or not the ground-view image and the aerial-view image match each other includes: computing similarity between the ground feature and the aerial feature (the cross-view pairs sampled in Fig. 2 and section 3 imply at least a positive match by, evidently, computing similarity between the ground feature and the aerial feature); and determining that the ground-view image and the aerial-view image match each other when the computed similarity is larger than or equal to a predetermined threshold (the cross-view image matching of section 3 and Fig. 2 further comprises said determining that the ground-view image and the aerial-view image match each other when the implied computed similarity is larger than or equal to a predetermined threshold).

Regarding claim 8, Lin teaches a control method performed by a computer (Lin teaches a deep cross-view image matching network computing system in at least section 3 of page 5009, corresponding to at least Figs. 1-3, inherently comprising said control method performed by a processor computing means to retrieve or extract, per at least section 3 of page 5009 and Figs. 1-3, ground-view and aerial-view image features and to perform cross-view pair matching of said images based on their extracted features), comprising: acquiring a ground-view image, an aerial-view image, a ground depth image, and an aerial depth image, the ground depth image being an image that indicates distance from a ground camera to each location captured in the ground-view image, the aerial depth image being an image that indicates distance from a center location captured in the aerial-view image to each location captured in the aerial-view image (at least sections 1-3 of pages 5008-5009 and Figs. 1-3 further teach the acquired ground-view images, aerial-view images, ground depth data, and aerial depth data; the ground depth image data, as understood in the art, evidently and inherently indicates a distance, near or far, from a ground camera to each location captured in the ground-view image, and said aerial depth image data likewise expresses a distance from a center location captured in the aerial-view image to each location captured in the aerial-view image); extracting features from the ground-view image and the ground depth image to compute ground feature; extracting features from the aerial-view image and the aerial depth image to compute aerial feature; and determining whether or not the ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature (the system, at least in Fig. 2 and section 3, determines according to the cross-view image pair matching whether or not the ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature).

Lin teaches, at least in pages 5007-5008 and Figs. 1-3, the claimed invention of determining and extracting ground-view depth data and aerial-view depth information correlating to at least cross-view analysis of whether the ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature, except for specifically citing said acquiring of a ground depth image and an aerial depth image, the ground depth image being an image that indicates distance from a ground camera to each location captured in the ground-view image, the aerial depth image being an image that indicates distance from a center location captured in the aerial-view image to each location captured in the aerial-view image, and said extracting of features from the ground depth image to compute the ground feature and from the aerial depth image to compute the aerial feature.

Seitz teaches, in at least Figs. 1-2, the acquiring of street-view images 100 and, at least in Fig. 2, depth map images 110 of the street-view images 100; further, in Fig. 3, the acquiring of aerial-view images 130 and, in Fig. 4, corresponding aerial depth map images associated with the aerial-view images 130; and further, in at least Fig. 5, identifying or extracting features of the aerial images and ground-based images and performing alignment of the said images to determine at least whether or not said ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature.
It would have been obvious to one of ordinary skill in the art at the time the invention was made to combine the teachings of Lin in view of Seitz to include said acquiring of a ground depth image and an aerial depth image, the ground depth image being an image that indicates distance from a ground camera to each location captured in the ground-view image, the aerial depth image being an image that indicates distance from a center location captured in the aerial-view image to each location captured in the aerial-view image, and said extracting of features from the ground depth image to compute the ground feature and from the aerial depth image to compute the aerial feature, as discussed above. Lin and Seitz are in the same field of mapping ground-view images to aerial-view images based on detected image features. Seitz's ground depth images and aerial depth map images, corresponding to the respective ground-view and aerial-view images, complement Lin's cross-view feature matching of acquired ground-view and aerial-view images: combined with Lin's camera-angle information for the ground and aerial views, the depth map images facilitate alignment of the images for determining whether or not said ground-view image and aerial-view image match each other based on the detected image features, which ultimately helps realize, in one case, target object position detection and/or implied distance to an object of interest according to known methods, yielding predictable results. Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art; the combination is thus the adaptation of an old idea or invention using newer, commonly available and understood technology, and therefore a variation on already known art (see MPEP 2143, KSR Exemplary Rationale F).

Regarding claim 10 (depending from claim 8), Lin further teaches wherein the determination of whether or not the ground-view image and the aerial-view image match each other includes: computing similarity between the ground feature and the aerial feature (the cross-view pairs sampled in Fig. 2 and section 3 imply at least a positive match by, evidently, computing similarity between the ground feature and the aerial feature); and determining that the ground-view image and the aerial-view image match each other when the computed similarity is larger than or equal to a predetermined threshold (the cross-view image matching of section 3 and Fig. 2 further comprises said determining that the ground-view image and the aerial-view image match each other when the implied computed similarity is larger than or equal to a predetermined threshold).

Regarding claim 15, Lin implies a non-transitory computer-readable storage medium storing a program that causes a computer (Lin teaches a deep cross-view image matching network computing system in at least section 3 of page 5009, corresponding to at least Figs. 1-3, known to comprise at least a memory and at least one processor; said at least one memory may obviously comprise said computer-readable storage medium with instruction means to retrieve or extract ground-view and aerial-view image features and to perform cross-view pair matching of said images based on their extracted features) to execute: acquiring a ground-view image, an aerial-view image, a ground depth image, and an aerial depth image, the ground depth image being an image that indicates distance from a ground camera to each location captured in the ground-view image, the aerial depth image being an image that indicates distance from a center location captured in the aerial-view image to each location captured in the aerial-view image (at least sections 1-3 of pages 5008-5009 and Figs. 1-3 further teach the acquired ground-view images, aerial-view images, ground depth data, and aerial depth data; the ground depth image data, as understood in the art, evidently and inherently indicates a distance, near or far, from a ground camera to each location captured in the ground-view image, and said aerial depth image data likewise expresses a distance from a center location captured in the aerial-view image to each location captured in the aerial-view image); extracting features from the ground-view image and the ground depth image to compute ground feature; extracting features from the aerial-view image and the aerial depth image to compute aerial feature; and determining whether or not the ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature (the system, at least in Fig. 2 and section 3, determines according to the cross-view image pair matching whether or not the ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature).

Lin teaches, at least in pages 5007-5008 and Figs. 1-3, the claimed invention of determining and extracting ground-view depth data and aerial-view depth information correlating to at least cross-view analysis of whether the ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature, except for specifically citing said acquiring of a ground depth image and an aerial depth image, the ground depth image being an image that indicates distance from a ground camera to each location captured in the ground-view image, the aerial depth image being an image that indicates distance from a center location captured in the aerial-view image to each location captured in the aerial-view image, and said extracting of features from the ground depth image to compute the ground feature and from the aerial depth image to compute the aerial feature.

Seitz teaches, in at least Figs. 1-2, the acquiring of street-view images 100 and, at least in Fig. 2, depth map images 110 of the street-view images 100; further, in Fig. 3, the acquiring of aerial-view images 130 and, in Fig. 4, corresponding aerial depth map images associated with the aerial-view images 130; and further, in at least Fig. 5, identifying or extracting features of the aerial images and ground-based images and performing alignment of the said images to determine at least whether or not said ground-view image and the aerial-view image match each other based on the ground feature and the aerial feature.
It would have been obvious to one of ordinary skill in the art at the time the invention was made to combine the teachings of Lin in view of Seitz to include said acquiring of a ground depth image and an aerial depth image, the ground depth image being an image that indicates distance from a ground camera to each location captured in the ground-view image, the aerial depth image being an image that indicates distance from a center location captured in the aerial-view image to each location captured in the aerial-view image, and said extracting of features from the ground depth image to compute the ground feature and from the aerial depth image to compute the aerial feature, as discussed above. Lin and Seitz are in the same field of mapping ground-view images to aerial-view images based on detected image features. Seitz's ground depth images and aerial depth map images, corresponding to the respective ground-view and aerial-view images, complement Lin's cross-view feature matching of acquired ground-view and aerial-view images: combined with Lin's camera-angle information for the ground and aerial views, the depth map images facilitate alignment of the images for determining whether or not said ground-view image and aerial-view image match each other based on the detected image features, which ultimately helps realize, in one case, target object position detection and/or implied distance to an object of interest according to known methods, yielding predictable results. Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art; the combination is thus the adaptation of an old idea or invention using newer, commonly available and understood technology, and therefore a variation on already known art (see MPEP 2143, KSR Exemplary Rationale F).

Regarding claim 17 (depending from claim 15), Lin further teaches wherein the determination of whether or not the ground-view image and the aerial-view image match each other includes: computing similarity between the ground feature and the aerial feature (the cross-view pairs sampled in Fig. 2 and section 3 imply at least a positive match by, evidently, computing similarity between the ground feature and the aerial feature); and determining that the ground-view image and the aerial-view image match each other when the computed similarity is larger than or equal to a predetermined threshold (the cross-view image matching of section 3 and Fig. 2 further comprises said determining that the ground-view image and the aerial-view image match each other when the implied computed similarity is larger than or equal to a predetermined threshold).

Claim Objections

Dependent claims 2, 4-7, 9, 11-14, 16, and 18-20 are objected to over the prior art of record as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if all outstanding rejections are overcome.
The prior art of record does not appear to teach specifically:

[Claim 2] (Original) The image matching apparatus according to claim 1, wherein the acquisition of the aerial depth image includes: generating the aerial depth image by computing distance between each pixel of the aerial image and a center of the aerial image and setting, to each pixel of the aerial depth image, a value proportional to the computed distance between that pixel and the center of the aerial depth image; and acquiring the generated aerial depth image.

[Claim 4] (Currently amended) The image matching apparatus according to claim 1, wherein the at least one memory is configured to further store a first model and a second model, the first model being trained to extract features from the ground-view image and the ground depth image to output the ground feature, the second model being trained to extract features from the aerial-view image and the aerial depth image to output the aerial feature, and the at least one processor is further configured to execute the instructions to: acquiring a training dataset that includes the ground-view image, the ground depth image, a positive example of the aerial-view image, a negative example of the aerial-view image, and the aerial depth image; inputting the ground-view image and the ground depth image into the first model to obtain the ground feature; inputting the positive example and the aerial depth image into the second model to obtain an aerial feature of the positive example; inputting the negative example and the aerial depth image into the second model to obtain an aerial feature of the negative example; and updating trainable parameters of the first model and the second model based on the ground feature, the aerial feature of the positive example, and the aerial feature of the negative example.

[Claim 5] (Currently amended) The image matching apparatus according to claim 1, wherein the at least one memory is configured to further store a first model and a second model, the first model being trained to extract features from the ground-view image and the ground depth image to output the ground feature, the second model being trained to extract features from the aerial-view image and the aerial depth image to output the aerial feature, and the at least one processor is further configured to execute the instructions to: acquiring a training dataset that includes the aerial-view image, a positive example of the ground-view image, a negative example of the aerial-view image, the aerial depth image, a ground depth image corresponding to the positive example, and a ground depth image corresponding to the negative example; inputting the aerial-view image and the aerial depth image into the second model to obtain the aerial feature; inputting the positive example and the ground depth image corresponding to the positive example into the first model to obtain a ground feature of the positive example; inputting the negative example and the ground depth image corresponding to the negative example into the first model to obtain a ground feature of the negative example; and updating trainable parameters of the first model and the second model based on the aerial feature, the ground feature of the positive example, and the ground feature of the negative example.
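Claim 2 above (and claim 9 below) recites generating the aerial depth image by giving each pixel a value proportional to its distance from the image center. A minimal sketch of that computation follows; the function name and scale factor are illustrative, since the claim requires only proportionality.

```python
# Hypothetical illustration of the aerial-depth generation recited in
# claims 2 and 9: each pixel's value is proportional to its Euclidean
# distance from the image center. The scale factor is an assumption.
import numpy as np

def generate_aerial_depth(height: int, width: int, scale: float = 1.0) -> np.ndarray:
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0   # image center
    ys, xs = np.mgrid[0:height, 0:width]             # pixel coordinate grids
    dist = np.hypot(ys - cy, xs - cx)                # distance to center, per pixel
    return scale * dist                              # value proportional to distance
```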
[Claim 6] (Currently amended) The image matching apparatus according to claim 4, wherein the at least one processor is further configured to execute the instructions to modify the ground-view image in the training dataset by determining pixels of the ground depth image in the training dataset that indicate distance larger than a predetermined threshold, and modifying pixels of the ground-view image corresponding to the determined pixels of the ground depth image.

[Claim 7] (Currently amended) The image matching apparatus according to claim 4, wherein the at least one processor is further configured to execute the instructions to modify the aerial-view image in the training dataset by modifying pixels of the aerial-view image whose distance from a center of that aerial-view image is larger than a predetermined threshold.

[Claim 9] (Original) The control method according to claim 8, wherein the acquisition of the aerial depth image includes: generating the aerial depth image by computing distance between each pixel of the aerial image and a center of the aerial image and setting, to each pixel of the aerial depth image, a value proportional to the computed distance between that pixel and the center of the aerial depth image; and acquiring the generated aerial depth image.

[Claim 11] (Currently amended) The control method according to claim 9, wherein the computer is configured to store a first model and a second model, the first model being trained to extract features from the ground-view image and the ground depth image to output the ground feature, the second model being trained to extract features from the aerial-view image and the aerial depth image to output the aerial feature, and the control method further comprises: acquiring a training dataset that includes the ground-view image, the ground depth image, a positive example of an aerial-view image, a negative example of an aerial-view image, and the aerial depth image; inputting the ground-view image and the ground depth image to the first model to obtain the ground feature; inputting the positive example of the aerial-view image and the aerial depth image to the second model to obtain an aerial feature of the positive example; inputting the negative example of the aerial-view image and the aerial depth image to the second model to obtain an aerial feature of the negative example; and updating trainable parameters of the first model and the second model based on the ground feature, the aerial feature of the positive example, and the aerial feature of the negative example.
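Claims 6 and 7 above (and their method/medium counterparts 13, 14, and 20) recite modifying training images at far pixels: ground-view pixels whose depth exceeds a threshold, and aerial-view pixels farther than a threshold from the image center. Below is a minimal sketch, assuming the unspecified "modifying" is a zero fill; the function names, fill value, and thresholds are hypothetical.

```python
# Hypothetical sketch of the training-data modification recited in
# claims 6 and 7; far pixels are suppressed (here, zeroed) in each view.
import numpy as np

def mask_far_ground_pixels(ground_img, ground_depth, depth_threshold):
    """Claim 6 style: modify ground-view pixels whose depth exceeds a threshold."""
    out = ground_img.copy()
    out[ground_depth > depth_threshold] = 0     # mask broadcasts over channels
    return out

def mask_far_aerial_pixels(aerial_img, radius_threshold):
    """Claim 7 style: modify aerial-view pixels farther than a threshold from center."""
    h, w = aerial_img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    out = aerial_img.copy()
    out[np.hypot(ys - cy, xs - cx) > radius_threshold] = 0
    return out
```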
[Claim 12] (Currently amended) The control method according to claim 9, wherein the computer is configured to store a first model and a second model, the first model being trained to extract features from the ground-view image and the ground depth image to output the ground feature, the second model being trained to extract features from the aerial-view image and the aerial depth image to output the aerial feature, and the control method further comprises: acquiring a training dataset that includes the aerial-view image, a positive example of the ground-view image, a negative example of the aerial-view image, the aerial depth image, a ground depth image corresponding to the positive example, and a ground depth image corresponding to the negative example; inputting the aerial-view image and the aerial depth image into the second model to obtain the aerial feature; inputting the positive example and the ground depth image corresponding to the positive example into the first model to obtain a ground feature of the positive example; inputting the negative example and the ground depth image corresponding to the negative example into the first model to obtain a ground feature of the negative example; and updating trainable parameters of the first model and the second model based on the aerial feature, the ground feature of the positive example, and the ground feature of the negative example.

[Claim 13] (Currently amended) The control method according to claim 11, further comprising: modifying the ground-view image in the training dataset by determining pixels of the ground depth image in the training dataset that indicate distance larger than a predetermined threshold, and modifying pixels of the ground-view image corresponding to the determined pixels of the ground depth image.

[Claim 14] (Currently amended) The control method according to claim 11, further comprising: modifying the aerial-view image in the training dataset by modifying pixels of the aerial-view image whose distance from a center of that aerial-view image is larger than a predetermined threshold.

[Claim 19] (Currently amended) The storage medium according to claim 15, further storing a first model and a second model, the first model being trained to extract features from the ground-view image and the ground depth image to output the ground feature, the second model being trained to extract features from the aerial-view image and the aerial depth image to output the aerial feature, wherein the program further causes the computer to execute: acquiring a training dataset that includes the aerial-view image, a positive example of the ground-view image, a negative example of the aerial-view image, the aerial depth image, a ground depth image corresponding to the positive example, and a ground depth image corresponding to the negative example; inputting the aerial-view image and the aerial depth image into the second model to obtain the aerial feature; inputting the positive example and the ground depth image corresponding to the positive example into the first model to obtain a ground feature of the positive example; inputting the negative example and the ground depth image corresponding to the negative example into the first model to obtain a ground feature of the negative example; and updating trainable parameters of the first model and the second model based on the aerial feature, the ground feature of the positive example, and the ground feature of the negative example.
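Claims 4, 11, and 19 (and the mirrored claims 5 and 12) recite a triplet-style training procedure: two models produce the ground and aerial features, and trainable parameters of both models are updated based on the anchor feature and the features of a positive and a negative example. Below is a minimal sketch of the claim 4/11 variant (ground-view anchor with positive/negative aerial examples) under stated assumptions: the triplet margin loss, optimizer step, and two-argument model signatures are illustrative choices, since the claims recite only that the parameters are updated based on the three features.

```python
# Hypothetical sketch of the two-model triplet training recited in
# claims 4/5/11/12/19; loss, optimizer, and model signatures are assumed.
import torch
import torch.nn.functional as F

def training_step(first_model, second_model, optimizer,
                  ground_img, ground_depth, positive_aerial, negative_aerial,
                  aerial_depth, margin: float = 0.5) -> float:
    ground_feat = first_model(ground_img, ground_depth)      # ground feature (anchor)
    pos_feat = second_model(positive_aerial, aerial_depth)   # aerial feature, positive
    neg_feat = second_model(negative_aerial, aerial_depth)   # aerial feature, negative
    # Pull the anchor toward the positive, push it from the negative.
    loss = F.triplet_margin_loss(ground_feat, pos_feat, neg_feat, margin=margin)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()   # updates trainable parameters of both models
    return loss.item()
```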
[Claim 20] (Currently amended) The storage medium according to claim 18, wherein the program further causes the computer to execute: modifying the ground-view image in the training dataset by determining pixels of the ground depth image in the training dataset that indicate distance larger than a predetermined threshold, and modifying pixels of the ground-view image corresponding to the determined pixels of the ground depth image.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARCELLUS AUGUSTIN, whose telephone number is (571) 270-3384. The examiner can normally be reached 9 AM-5 PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, BENNY TIEU, can be reached at 571-272-7490. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MARCELLUS J AUGUSTIN/
Primary Examiner, Art Unit 2682
01/16/2026

Prosecution Timeline

Jan 17, 2024
Application Filed
Jan 20, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597126
IMAGE SETTING DEVICE, IMAGE SETTING METHOD, AND IMAGE SETTING PROGRAM
2y 5m to grant • Granted Apr 07, 2026
Patent 12586170
SYSTEM AND METHOD FOR GENERATING PREDICTIVE IMAGES FOR WAFER INSPECTION USING MACHINE LEARNING
2y 5m to grant • Granted Mar 24, 2026
Patent 12573079
System and Method for Identifying Feature in an Image of a Subject
2y 5m to grant • Granted Mar 10, 2026
Patent 12573388
BEHAVIOR DETECTION
2y 5m to grant • Granted Mar 10, 2026
Patent 12569129
ANATOMICAL LOCATION DETECTION OF FEATURES OF A GASTROINTESTINAL TRACT OF A PATIENT
2y 5m to grant • Granted Mar 10, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.
Powered by AI — typically takes 5-10 seconds

Prosecution Projections

1-2
Expected OA Rounds
82%
Grant Probability
98%
With Interview (+15.9%)
2y 8m
Median Time to Grant
Low
PTA Risk
Based on 838 resolved cases by this examiner. Grant probability derived from career allow rate.
