Prosecution Insights
Last updated: May 29, 2026
Application No. 18/289,304

OBJECT DETECTION APPARATUS, LEARNING APPARATUS, LEARNING METHOD, OBJECT DETECTION PROGRAM, AND STORAGE MEDIUM

Final Rejection: §103, §112
Filed: Nov 02, 2023
Priority: Jun 13, 2022 (JP PCT/JP2022/023572, +1 more)
Examiner: TORRES, JOSE
Art Unit: 2664
Tech Center: 2600 (Communications)
Assignee: NEC Corporation
OA Round: 2 (Final)
Grant Probability: 82% (Favorable)
OA Rounds: 3-4
Est. Remaining: 5m
With Interview: 94%

Examiner Intelligence

Grants 82% (above average)
Career Allowance Rate: 82% (524 granted / 640 resolved; +19.9% vs TC avg)
Interview Lift: +12.2% (moderate; resolved cases with interview)
Avg Prosecution: 3y 0m (typical timeline; 14 currently pending)
Total Applications: 665 (career history, across all art units)

Statute-Specific Performance

§101: 2.7% (-37.3% vs TC avg)
§103: 66.7% (+26.7% vs TC avg)
§102: 14.1% (-25.9% vs TC avg)
§112: 11.8% (-28.2% vs TC avg)
The Tech Center average is an estimate. Based on career data from 640 resolved cases.
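As a rough cross-check of the figures above, the following sketch recovers the Tech Center baseline implied by each statute row. It assumes (this is not stated on the page) that each "vs TC avg" delta is a difference in percentage points; the variable names are illustrative only.

```python
# Examiner rate and delta vs. Tech Center average, copied from the rows above.
# Assumption: the delta is expressed in percentage points (rate minus TC average).
statute_rows = {
    "101": (2.7, -37.3),
    "103": (66.7, 26.7),
    "102": (14.1, -25.9),
    "112": (11.8, -28.2),
}

# Implied baseline = examiner rate minus delta, rounded to one decimal place.
implied_tc_average = {
    statute: round(rate - delta, 1)
    for statute, (rate, delta) in statute_rows.items()
}

for statute, baseline in implied_tc_average.items():
    print(f"§{statute}: implied TC average {baseline}%")
```

Under that assumption every row resolves to the same baseline, which is consistent with the page describing the Tech Center average as an estimate rather than a per-statute measurement.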

Office Action

§103 §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Comments

The Amendment – After Non-Final Rejection filed on December 31, 2025 has been entered and made of record.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION. The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 7, 9, and 11 rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claim 7 recites the limitation “the second map being a weight map indicating a difference between the first image and the second image” in lines 13-14. There is insufficient antecedent basis for this limitation in the claim. It is unclear as to which the second image the limitation refers to (e.g., training data which includes at least one first image, at least one second image, claim 7 lines 4-5; calculating a second map from a second image, claim 7 line 13). Claim 11 depends upon claim 7.

Claim 9 recites the limitation “the second map being a weight map indicating a difference between the first image and the second image” in lines 11-12. There is insufficient antecedent basis for this limitation in the claim. Similar to claim 7 above, it is unclear as to which the second image the limitation refers to (e.g., training data which includes at least one first image, at least one second image, claim 9 line 3; calculating a second map from a second image, claim 9 line 11). Appropriate correction is required.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5-7, and 9-11 are rejected under 35 U.S.C. 103 as being unpatentable over Ogino et al. (U.S. Pub. No. 2021/0272277) in view of Iyer et al. (U.S. Pub. No. 2022/0207306) and Sawada et al. (U.S. Pub. No. 2023/0410532).

As to claims 1 and 10, Ogino et al.
teaches an object detection apparatus (i.e., “medical imaging apparatus”, Abstract)/non-transitory tangible computer-readable storage medium storing therein an object detection program causing a computer to function as (See for example, “Data and programs required for processing of the image processing unit 200 are stored in the storage device 130”, Paragraph [0048]) an apparatus comprising at least one processor (i.e., “image processing unit 200”, Paragraph [0043]), the at least one processor carrying out/the object detection program causing the computer to carry out the: an image acquisition process comprising acquiring a first image (i.e., “image processing unit 200 includes an image reconstructing unit 210 that reconstructs an image (first image) from the image signal received from the imaging unit 100”, Paragraph [0043]); a calculation process comprising using a first model to calculate a first map from the first image (See for example, “a feature quantity extraction unit 232 that extracts a first feature quantity A from first image data”, Paragraph [0044]; and Paragraph [0071]); and a detection process comprising carrying out object detection with reference to at least the first map (i.e., “an identification unit 235 that calculates a predetermined parameter value using the third feature quantity C using an identification model and performs prediction”, Paragraph [0044]; and “This model calculates a predetermined parameter value from a feature quantity after conversion, and predicts the presence or absence of a lesion site, malignancy, etc. represented by the parameter value”, Paragraph [0072]).

However, Ogino et al. does not explicitly disclose in a case where the at least one processor acquires not only the first image but also a second image in the image acquisition process, in the calculation process, the at least one processor using a second model to calculate a second map from the first image and the second image, the second map being a weight map indicating a difference between the first image and the second image, and in the detection process, the at least one processor carrying out object detection with reference to a third map obtained by multiplying the first map by the second map.

Iyer et al. teaches in a case where at least one processor (i.e., “disease diagnosis system 102 may include a processor 202”, Paragraph [0034]) acquires not only the first image but also a second image in the image acquisition process (i.e., “data acquisition module 404 may transmit the first image of the plurality of diagnostic images and the second image of the plurality of diagnostic images to the first pipeline 418a and the second pipeline 418b”, Paragraph [0057]), and in the calculation process, the at least one processor using a second model to calculate a second map from the first image and the second image (i.e., “After receiving the first image and the second image of the plurality of diagnostic images, the stacked convolutional layers 418 of the temporal CNN model 416 in the temporal CNN module 406 may be configured to extract feature maps from the first image and the second image of the plurality of diagnostic images”, Paragraph [0059]; and “the temporal CNN module 406 may be configured to sort the highly relevant activation feature maps and concatenate the highly relevant feature maps to form a union of the highly relevant activation feature maps. The union of the highly relevant activation feature maps may correspond to concatenated feature map”, Paragraph [0063]), the second map being a weight map indicating a difference between the first image and the second image (i.e., “The temporal CNN layer 420 of the temporal CNN model 416 in the temporal CNN module 406 may be configured to identify relevant feature maps extracted from the feature maps of the first image and the second image. The relevant feature maps may indicate a temporal difference among image classes from the first image and the second image”, Paragraph [0059]).

The combination of Ogino et al. and Iyer et al. does not explicitly disclose in the detection process, the at least one processor carrying out object detection with reference to a third map obtained by multiplying the first map by the second map.

Sawada et al. teaches in the detection process, at least one processor (i.e., “object detection device 200 includes a processor 41”, Paragraph [0116]) carrying out object detection with reference to a third map obtained by multiplying the first map by the second map (i.e., “generates the third feature map FM3 by performing addition or multiplication of the second feature map FM2 using the first feature map FM1 and weighting the second feature map FM2, and the object detection unit 24 that detects an object in the captured image using the third feature map FM3”, Paragraph [0206]).

Ogino et al., Iyer et al. and Sawada et al. are analogous art because they are from the field of digital image processing. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to modify Ogino et al. by incorporating in a case where the at least one processor acquires not only the first image but also a second image in the image acquisition process, in the calculation process, the at least one processor using a second model to calculate a second map from the first image and the second image, the second map being a weight map indicating a difference between the first image and the second image, as taught by Iyer et al., and in the detection process, the at least one processor carrying out object detection with reference to a third map obtained by multiplying the first map by the second map, as taught by Sawada et al. The suggestion/motivation for doing so would have been to make use of temporal differences between images to reduce memory usage for processing the images, and to cope with variations in size of individual objects to be detected in the images. Therefore, it would have been obvious to combine Iyer et al. and Sawada et al. with Ogino et al. to obtain the invention as specified in claims 1 and 10.

As to claim 5, Ogino et al.
teaches wherein the at least one processor further carries out: a training data acquisition process comprising acquiring training data which includes at least one first image (i.e., “input image”, Paragraph [0053]), at least one second image (i.e., “a different image from the input image”, Paragraph [0053]), and label information indicative of an object included in the at least one first image (i.e., “a label such as the presence or absence (benign or malignant) of a lesion or a grade of lesion malignancy as learning data”, Paragraph [0054]); and a first learning process comprising training the first model by machine learning with reference to the at least one first image and the label information which are included in the training data (i.e., “The CNN of this predictive model is learned to extract a feature quantity A 410 for accurately identifying the presence or absence of the lesion of an input image 400 by the CNN repeating the convolution calculation and pooling on input data of the input image 400 for learning divided into a plurality of patches by the patch processing unit 231”, Paragraph [0055]; and “Learning is performed until an error between an output and teacher data falls within a predetermined range”, Paragraph [0056]).

However, Ogino et al. does not explicitly disclose a second learning process comprising training the first model and the second model by machine learning with reference to the at least one first image, the at least one second image, and the label information which are included in the training data.

Iyer et al. teaches a second learning process comprising training the first model and the second model by machine learning (i.e., “The extracted feature map for each of the training images (such as the training image 504a and the training image 504b) are generated simultaneously from the first pipeline 502a and the second pipeline 502b of stacked convolutional layers”, Paragraph [0076]) with reference to the at least one first image, the at least one second image (i.e., “the sequence of diagnostic images in the training dataset includes a first image 504a and a second image 504b”, Paragraph [0072]), and the label information which are included in the training data (i.e., “the temporal convolutional layer 502c may be configured to map the concatenated feature maps to a corresponding predefined diagnostic class 516 stored in the data repository 412”, Paragraph [0083]).

Therefore, in view of Iyer et al. and Sawada et al., it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ogino et al. by incorporating the second learning process comprising training the first model and the second model by machine learning with reference to the at least one first image, the at least one second image, and the label information which are included in the training data, as taught by Iyer et al., in order to perform inferencing of the input images in a better way.

As to claim 6, Ogino et al. teaches wherein the at least one processor further carries out a presentation process comprising outputting a result of detection by the detection process (i.e., “output unit 120 may display the parameter value output from the diagnosis support processing unit 230”, Paragraph [0105]), in the detection process, the at least one processor detects an object that is a lesion which is capable of being detected from an image captured by carrying out an examination with respect to a subject (i.e., “a description will be given of the case where an image input to the diagnosis support processing unit is an image acquired the MRI apparatus. However, the invention is not limited thereto. For example, other modality images of the CT, X-rays, ultrasonic waves, etc. may be input”, Paragraph [0119]), and in the presentation process, the at least one processor outputs a result of detection of the lesion for supporting decision making by a medical worker (i.e., “processing result of the diagnosis support processing unit 230 may be output to the output unit 120 provided in the image processing apparatus 20, or may be sent to the medical imaging apparatus to which the image data is sent, a facility in which the medical imaging apparatus is placed, a database in another medical institution, etc.”, Paragraph [0128]; and Paragraph [0138]).

However, Ogino et al. does not explicitly disclose the examination is an endoscopic examination.

Iyer et al. teaches an examination that is an endoscopic examination (See for example, “the image sensor 104 may have suitable optical instruments, such as lenses and actuators for the lenses, to capture the diagnostic images. Examples of implementation of the image sensor 104 may include, but not limited to, high-definition scanners and cameras (such as, endoscope cameras)”, Paragraph [0023]).

Therefore, in view of Iyer et al. and Sawada et al., it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ogino et al. by incorporating the examination as an endoscopic examination, as taught by Iyer et al., in order to perform effective object detection and diagnosis utilizing conventional medical image acquisition.

As to claims 7, 9 and 11, as best understood, Ogino et al. teaches a learning apparatus/method (i.e., “medical imaging apparatus”, Abstract)/non-transitory tangible computer-readable storage medium (See for example, “Data and programs required for processing of the image processing unit 200 are stored in the storage device 130”, Paragraph [0048]) storing therein a learning program causing a computer to function as a learning apparatus comprising at least one processor (i.e., “image processing unit 200”, Paragraph [0043]), the at least one processor carrying out/the learning program causing the computer to carry out: a training data acquisition process comprising acquiring training data which includes at least one first image (i.e., “input image”, Paragraph [0053]), at least one second image (i.e., “a different image from the input image”, Paragraph [0053]), and label information indicative of an object included in the at least one first image (i.e., “a label such as the presence or absence (benign or malignant) of a lesion or a grade of lesion malignancy as learning data”, Paragraph [0054]); a first learning process comprising training a first model with reference to the at least one first image and the label information which are included in the training data, the first model calculating a first map from a first image (i.e., “The CNN of this predictive model is learned to extract a feature quantity A 410 for accurately identifying the presence or absence of the lesion of an input image 400 by the CNN repeating the convolution calculation and pooling on input data of the input image 400 for learning divided
into a plurality of patches by the patch processing unit 231”, Paragraph [0055]; and “Learning is performed until an error between an output and teacher data falls within a predetermined range”, Paragraph [0056]).

However, Ogino et al. does not explicitly disclose a second learning process comprising training the first model and a second model with reference to the at least one first image, the at least one second image, and the label information which are included in the training data, the second model; calculating a second map from a second image and the first image, the second map being a weight map indicating a difference between the first image and the second image; and performing object detection with reference to a third map obtained by multiplying the first map by the second map.

Iyer et al. teaches a second learning process comprising training the first model and a second model (i.e., “The extracted feature map for each of the training images (such as the training image 504a and the training image 504b) are generated simultaneously from the first pipeline 502a and the second pipeline 502b of stacked convolutional layers”, Paragraph [0076]) with reference to the at least one first image, the at least one second image (i.e., “the sequence of diagnostic images in the training dataset includes a first image 504a and a second image 504b”, Paragraph [0072]), and the label information which are included in the training data (i.e., “the temporal convolutional layer 502c may be configured to map the concatenated feature maps to a corresponding predefined diagnostic class 516 stored in the data repository 412”, Paragraph [0083]), the second model; and calculating a second map from a second image and the first image, the second map being a weight map indicating a difference between the first image and the second image (i.e., “concatenate the relevant feature map from the temporal convolutional layer 502c of the temporal CNN model 502, based on sorting of the relevant feature maps. In accordance with an embodiment, the concatenated feature map may aid in processing or reading the feature maps accurately. The concatenated feature map may also facilitate in performing inferencing of input images as to which image class the input images belong (normal, inconclusive or infected) in a better way. Further, the concatenated feature map determined in the decision image that may bring in a temporal difference to the two diagnostic classes are isolated and used for inferencing”, Paragraph [0083]).

The combination of Ogino et al. and Iyer et al. does not explicitly disclose performing object detection with reference to a third map obtained by multiplying the first map by the second map.

Sawada et al. teaches performing object detection with reference to a third map obtained by multiplying the first map by the second map (i.e., “generates the third feature map FM3 by performing addition or multiplication of the second feature map FM2 using the first feature map FM1 and weighting the second feature map FM2, and the object detection unit 24 that detects an object in the captured image using the third feature map FM3”, Paragraph [0206]).

Therefore, in view of Iyer et al. and Sawada et al., it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Ogino et al. by incorporating the second learning process comprising training the first model and a second model with reference to the at least one first image, the at least one second image, and the label information which are included in the training data, the second model, calculating a second map from a second image and the first image, the second map being a weight map indicating a difference between the first image and the second image, as taught by Iyer et al., and performing object detection with reference to a third map obtained by multiplying the first map by the second map, as taught by Sawada et al., in order to make use of temporal differences between images to reduce memory usage for processing the images, and to cope with variations in size of individual objects to be detected in the images.

Claims 3 and 4 are rejected under 35 U.S.C. 103 as being unpatentable over Ogino et al. in view of Iyer et al. and Sawada et al. as applied to claim 1 above, and further in view of Hosoya et al. (U.S. Pub. No. 2019/0313883).

The teachings of Ogino et al., Iyer et al. and Sawada et al. have been discussed above.

As to claim 3, Ogino et al., Iyer et al. and Sawada et al. do not explicitly disclose wherein the at least one processor further carries out determination process comprising determining whether the at least one processor acquires the first image or acquires the first image and the second image in the image acquisition process.

Hosoya et al.
teaches at least one processor (i.e., “management server 10”, Paragraph [0016]) that carries out determination process comprising determining whether the at least one processor acquires the first image or acquires the first image and the second image in the image acquisition process (i.e., “determining whether or not an endoscopic RAW image is similar to an endoscopic image for which an abnormal finding has been confirmed in the past when performing a compression process on the endoscopic RAW image and adding predetermined information to the image that has been compressed when the endoscopic RAW image is determined to be similar”, Paragraph [0021]).

Ogino et al., Iyer et al., Sawada et al. and Hosoya et al. are analogous art because they are from the field of digital image processing. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to further modify Ogino et al., Iyer et al. and Sawada et al. by incorporating the at least one processor further carries out determination process comprising determining whether the at least one processor acquires the first image or acquires the first image and the second image in the image acquisition process, as taught by Hosoya et al. The suggestion/motivation for doing so would have been to allow for an image observation to be efficiently performed. Therefore, it would have been obvious to combine Hosoya et al. with Ogino et al., Iyer et al. and Sawada et al. to obtain the invention as specified in claim 3.

As to claim 4, Hosoya et al. teaches wherein the at least one processor carries out the determination process with reference to a flag indicating whether the first image is acquired or whether the first image and the second image are acquired (i.e., “compression processing unit 40 adds information indicating analysis results provided from the bleeding state determination unit 34, the group identification unit 36, and the similarity determination unit 38 to the compressed image. More specifically, to the compressed image having an image ID provided from the bleeding state determination unit 34, the compression processing unit 40 adds information indicating that the image is a bleeding image. This information may be added as flag information”, Paragraph [0040]).

Response to Arguments

Drawings

With respect to the drawings, Applicant filed a Replacement Sheet for FIG. 4 on November 2, 2023 addressing the informalities. Therefore, the objections have been withdrawn. The drawings filed on November 2, 2023 are accepted by the Examiner.

Claim Rejections - 35 USC §§ 102 and 103

With respect to claims 1-7 and 9-11, Applicant’s arguments (Remarks dated December 31, 2025, pages 7-10) have been fully considered. However, they are moot in view of the new ground(s) of rejection (Refer to Claim Rejections - 35 USC § 103 Section above).

With respect to Sawada et al., Applicant respectfully submits that Sawada and Ogino cannot be relied upon to allegedly teach at least the claimed … “in the detection process, the at least one processor carrying out object detection with reference to a third map obtained by multiplying the first map by the second map” (Remarks dated December 31, 2025, pages 9-10). Examiner respectfully disagrees. With respect to Sawada, the multiplication of a first map by a second map in order to carry out object detection is disclosed in at least Paragraph [0206]. Sawada et al. teaches the generation of a feature map by performing multiplication of a second feature map using a first feature map. This generated feature map is then used to perform object detection, similar to the claimed invention. The weight map indicating a difference between a first image and a second image is a limitation taught by Iyer et al., and explained above (Refer to Claim Rejections - 35 USC § 103 Section above). Thus, as evidenced by Sawada et al., performing object detection utilizing multiplied feature maps improves the accuracy of the detection.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOSE M TORRES whose telephone number is (571)270-1356. The examiner can normally be reached Monday thru Friday; 10:00 AM to 6:00 PM EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood can be reached at 571-272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JOSE M TORRES/
Examiner, Art Unit 2664
04/13/2026

/JENNIFER MEHMOOD/
Supervisory Patent Examiner, Art Unit 2664
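For orientation, the limitation at the center of the §103 rejection of claims 1 and 10 (a third map obtained by multiplying the first map by a difference-based weight map) can be pictured with the toy sketch below. Everything in it is an illustrative assumption: `first_model`, `second_model`, `detect`, and the threshold are hypothetical stand-ins written in plain Python, not the applicant's trained models and not anything disclosed in Ogino, Iyer, or Sawada.

```python
# Toy sketch of the claimed data flow; the "models" are hypothetical stand-ins.

def first_model(image):
    """Stand-in first model: normalize pixel values into a per-pixel score map."""
    peak = max(max(row) for row in image) or 1.0
    return [[px / peak for px in row] for row in image]

def second_model(first_image, second_image):
    """Stand-in second model: weight map from the absolute image difference."""
    diff = [[abs(a - b) for a, b in zip(r1, r2)]
            for r1, r2 in zip(first_image, second_image)]
    peak = max(max(row) for row in diff) or 1.0
    return [[d / peak for d in row] for row in diff]

def detect(first_image, second_image=None, threshold=0.5):
    """Return (row, col) positions whose score reaches the threshold."""
    first_map = first_model(first_image)
    if second_image is None:
        # Only the first image was acquired: detect from the first map alone.
        scored = first_map
    else:
        # Both images acquired: third map = first map * second map, element-wise.
        second_map = second_model(first_image, second_image)
        scored = [[f * w for f, w in zip(rf, rw)]
                  for rf, rw in zip(first_map, second_map)]
    return [(y, x) for y, row in enumerate(scored)
            for x, score in enumerate(row) if score >= threshold]

first = [[0, 8], [4, 8]]
second = [[0, 8], [4, 0]]
print(detect(first))          # first map only
print(detect(first, second))  # weighted by the difference map
```

With these toy inputs, detection on the first map alone flags three positions, while the weighted third map keeps only the position where the two images differ. That is the kind of suppression a difference-based weight map is meant to provide, under the simplifications assumed here.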

Prosecution Timeline

Nov 02, 2023: Application Filed
Oct 01, 2025: Non-Final Rejection mailed (§103, §112)
Dec 31, 2025: Response Filed
Apr 22, 2026: Final Rejection mailed (§103, §112; current)
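The reply periods described in the Conclusion of the office action can be worked out from the final rejection date in the timeline. The sketch below is a rough calculation only: it assumes Apr 22, 2026 is the mailing date, uses a hypothetical `add_months` helper, and does not model weekend or holiday adjustments or the advisory-action scenario described in the action.

```python
import calendar
from datetime import date

def add_months(start, months):
    """Hypothetical helper: same day-of-month N months later, clamped to month end."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

# Assumption: the timeline's final rejection date is the mailing date.
mailing_date = date(2026, 4, 22)

two_month_mark = add_months(mailing_date, 2)        # first-reply window noted in the action
shortened_period_end = add_months(mailing_date, 3)  # shortened statutory period
statutory_limit = add_months(mailing_date, 6)       # outer limit for reply

print("Two-month mark:      ", two_month_mark.isoformat())
print("Three-month deadline:", shortened_period_end.isoformat())
print("Six-month limit:     ", statutory_limit.isoformat())
```

These dates are a convenience estimate for planning; the controlling dates are those computed from the mailed action itself.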

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12639805
METHOD TO AUTOMATE GAS LEAK DETECTION IN BATTERY MANUFACTURING USING DATA FROM OPTICAL GAS IMAGING SYSTEM
2y 6m to grant Granted May 26, 2026
Patent 12639822
DEEP LEARNING BASED IMAGE SEGMENTATION METHOD INCLUDING BIODEGRADABLE STENT IN INTRAVASCULAR OPTICAL TOMOGRAPHY IMAGE
2y 2m to grant Granted May 26, 2026
Patent 12633396
DOCUMENT CREATION SUPPORT APPARATUS, DOCUMENT CREATION SUPPORT METHOD, AND PROGRAM
2y 11m to grant Granted May 19, 2026
Patent 12632965
UNCERTAINTY ESTIMATION VIA OBJECT-SPECIFIC AND OBJECT-AGNOSTIC SEGMENTATION DISAGREEMENT
2y 11m to grant Granted May 19, 2026
Patent 12632964
SEQUENTIAL CONVOLUTIONAL NEURAL NETWORKS FOR NUCLEI SEGMENTATION
2y 11m to grant Granted May 19, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 82%
With Interview: 94% (+12.2%)
Median Time to Grant: 3y 0m (~5m remaining)
PTA Risk: Moderate
Based on 640 resolved cases by this examiner. Grant probability derived from career allowance rate.
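The headline projections can be reproduced from the career counts shown on this page. The sketch below assumes, without confirmation from the page, that the interview figure is the allowance rate plus the interview lift in percentage points; the variable names are illustrative.

```python
# Career counts reported for this examiner on this page.
granted = 524
resolved = 640

# Career allowance rate, in percent.
allowance_rate = 100 * granted / resolved

# Assumption: the interview lift is additive, in percentage points.
interview_lift = 12.2
with_interview = allowance_rate + interview_lift

print(f"Grant probability: {round(allowance_rate)}%")
print(f"With interview:    {round(with_interview)}%")
```

Both rounded values match the 82% and 94% shown above, which suggests the projections are simple transformations of the examiner's career statistics rather than case-specific estimates.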
