DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Notice to Applicants
This action is in response to the Restriction Election filed on 03/05/2026.
Claims 1-19 are pending.
Priority
The Application claims priority to EP-23187559.2 with filing date 07/25/2023, as well as to Provisional Application 63/487,961 with filing date 03/02/2023, both of which are acknowledged.
Information Disclosure Statements
The Information Disclosure Statements filed on 02/12/2024 and 07/23/2024 have both been fully considered by the examiner.
Restriction/Election
The examiner thanks Applicant for their careful consideration of the restriction requirement mailed on 01/16/2026.
Applicant’s election without traverse of Group I (Claims 1-12) in the reply filed on 03/05/2026 is acknowledged.
Claims 13-19 are withdrawn from further consideration pursuant to 37 CFR 1.142(b) as being drawn to a nonelected invention, there being no allowable generic or linking claim. Election was made without traverse in the reply filed on 03/05/2026.
Claim Rejections – 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1 and 7-12 are rejected under 35 U.S.C. 101 because the claimed invention is directed to abstract ideas without significantly more.
Analysis for claim 1 is provided in the following. Claim 1 is reproduced in the following (annotation added):
(a) A computer-implemented method of tracking a target in medical imaging data, the method comprising:
(b) determining an encoded representation of a search image of the medical imaging data in a feature space using a feature encoding network,
(c) the search image depicting a target and a surrounding of the target,
(d) determining encoded representations of one or more template images of the medical imaging data in the feature space using the feature encoding network,
(e) the one or more template images depicting the target,
(f) determining fused features by fusing the encoded representations of the one or more template images and the encoded representation of the search image using a fusion network,
(g) based on the fused features, determining a position prediction of the target in the search image,
(h) based on the fused features, determining a segmentation of a context of the target in the search image,
(i) and refining the position prediction of the target based on the segmentation of the context of the target.
Step 1: Does the claim belong to one of the statutory categories? Claim 1 is directed to a process, which is a statutory category of invention (YES).
Step 2A Prong One: Does the claim recite a judicial exception? Steps g-i can be regarded as mental processes, such as observations, evaluations, judgements, or opinions that can be practically performed in the human mind. Step g requires determining a predicted position of the target in the search image, with the only restriction being “based on the fused features”, thus a mental determination/recognition would read on this step. Likewise, the segmentation determination of step h is only restricted by “based on the fused features”, thus a mental determination/recognition would read on this step as well. Finally, in step i, the examiner argues that mentally refining or otherwise changing the mental position prediction would read on this step as well, as it is only restricted to be “based on the segmentation of the context of the target”.
The examiner notes that steps b, d, and f cannot be practically performed in the human mind, both because of the broadest reasonable interpretation of “encoded representations”, and each of these steps being performed using feature encoding and fusion networks.
Step 2A Prong Two: Does the claim recite additional elements that integrate the judicial exception into a practical application? Step a is a preamble describing the method. Steps b and d recite determining encoded representations of the search and template images using a feature encoding network, and step f recites fusing the encoded representations of steps b and d using a fusion network. In the context of the claim, steps b, d, and f are only used to obtain the fused features that are then acted on by the mental processes. Finally, steps c and e merely limit the search and template images to depict the target (NO).
Step 2B: Does the claim as a whole amount to significantly more than the recited exception? The claim as a whole first recites multiple steps (a-f) that cannot be practically performed in the human mind to generate data (fused features). However, all remaining steps of the claim (g-i) only apply this data by reciting processing that can be practically performed in the human mind (NO). Claim 1 is not eligible.
Similar analysis is applicable to independent claim 11, which additionally recites a computerized system at a high level of generality, which does not integrate the judicial exceptions into a practical application. Claim 11 is not eligible.
Claim 2 recites determining an optical flow based on the segmentation of the context and one or more previous segmentations of the context; this cannot be practically performed in the human mind, considering the broadest reasonable interpretation of the specific term “optical flow”, as known in the art. The claim further recites “wherein the position prediction of the target is refined based on the optical flow”; these limitations together apply the results of the mental processes performed in claim 1 and thus integrate the judicial exceptions into a practical application. Claim 2 is eligible.
Claim 3 narrows the refining of the position prediction to include applying a refinement network to the optical flow and position prediction, which still cannot be performed in the human mind and further integrates into a practical application. Claim 3 is eligible.
Claims 4-6 further narrow the refinement processes. Claims 4-6 are eligible based on their dependence on eligible claim 2.
Claims 7, 8, and 12 narrow the target and context to specific species of objects. Claims 7, 8, and 12 are not eligible.
Claim 9 recites that the template images have at least one of a lower resolution or size than the search image, which does not integrate the judicial exceptions into a practical application. Claim 9 is not eligible.
Claim 10 recites that the fused features are determined using a vision transformer network, which cannot be performed in the human mind. However, claim 10 is still not eligible for similar reasons to claim 1 above – the fused features are only ultimately used to perform mental processes. Claim 10 is not eligible.
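The findings above follow a fixed order of questions (Step 1, Step 2A Prong One, Step 2A Prong Two, Step 2B). As a minimal sketch of that ordering, assuming each step reduces to a yes/no finding, the following Python fragment encodes the flow and applies it to the findings recorded above for claims 1 and 2. The names ClaimFindings and is_eligible are hypothetical and are not taken from any official guidance; the sketch illustrates only the order in which the questions are asked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimFindings:
    """Yes/no findings for one claim (hypothetical container)."""
    statutory_category: bool     # Step 1
    recites_exception: bool      # Step 2A, Prong One
    practical_application: bool  # Step 2A, Prong Two
    significantly_more: bool     # Step 2B


def is_eligible(f: ClaimFindings) -> bool:
    """Walk the questions in order and stop at the first decisive answer."""
    if not f.statutory_category:
        return False   # fails Step 1
    if not f.recites_exception:
        return True    # no judicial exception recited
    if f.practical_application:
        return True    # exception integrated into a practical application
    return f.significantly_more  # Step 2B decides


# Findings as stated in this action for claims 1 and 2.
claim_1 = ClaimFindings(True, True, False, False)
claim_2 = ClaimFindings(True, True, True, False)  # Step 2B is not reached

print(is_eligible(claim_1))  # False
print(is_eligible(claim_2))  # True
```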
Claim Rejections – 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-8 and 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. ("A Temporary Transformer Network for Guide-Wire Segmentation", CISP-BMEI paper, 7 December 2021) in view of Benseghir et al. (U.S. Publ. US-2023/0083936-A1).
Regarding claim 1, Zhang discloses a computer-implemented method (see figure 1) of tracking a target in medical imaging data (see Section I, where guide-wires are to be segmented from medical imaging data; see Section III.A, where a catheter image dataset is collected and subsequently analyzed), the method comprising:
determining an encoded representation of a search image of the medical imaging data in a feature space using a feature encoding network (see figure 1, current frame and Section II.A, where the current frame/search image is input into a CNN to obtain feature representations/encoded representations),
the search image depicting a target and a surrounding of the target (see figure 1, current frame and Section II.B, which specifies that the current frame depicts guide-wire information),
determining encoded representations of one or more template images of the medical imaging data in the feature space using the feature encoding network (see figure 1, previous frame and Section II.A, where a previous frame/template image is input into another CNN to obtain feature representations/encoded representations),
the one or more template images depicting the target (see figure 1, previous frame and Section II.B, which specifies that the previous frame also depicts guide-wire information),
determining fused features by fusing the encoded representations of the one or more template images and the encoded representation of the search image using a fusion network (see figure 1, feature concatenation boxes and Section II.B, where the features of the current and previous frames are concatenated to generate a mixed feature representation),
based on the fused features, determining a position prediction of the target in the search image, based on the fused features, determining a segmentation of a context of the target in the search image (see figure 1, segmentation result; figure 3, more segmentation results, and Section II.B, where the mixed features are used to generate a segmentation mask of the guide-wire; this reads on both a position prediction of the guide-wire, as well as a segmentation of the context/entire body of the wire).
Zhang fails to disclose refining the position prediction of the target based on the segmentation of the context of the target.
Pertaining to the same field of endeavor, Benseghir discloses determining a position prediction of the target in the search image (see figure 3, step 335 and paragraph 0034, where a position of an interventional tool/target is estimated),
determining a segmentation of a context of the target in the search image (see figure 3, step 325 and paragraph 0032, where the interventional tool/target is additionally segmented),
and refining the position prediction of the target based on the segmentation of the context of the target (first see figure 3, step 330 and paragraph 0033, where an optical flow is determined between the input and previous images, including movement of both tissue and the tool; then see figure 3, step 340 and paragraph 0034, where the estimated position and segmentation of the tool are corrected/refined based on the optical flow of the tissue and tool).
Zhang and Benseghir are considered analogous art, as they are both directed to deep learning models for medical instrument segmentation. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have integrated the teachings of Benseghir into Zhang because doing so increases accuracy of instrument segmentation by accounting for patient movement (see Benseghir paragraphs 0034, 0038, and 0052).
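To illustrate how a single segmentation mask obtained from concatenated features can supply both a segmentation and a position, the following is a toy Python sketch. It is not Zhang's network: the functions encode, fuse, segment, and locate are hypothetical stand-ins, the encoder simply passes pixel intensities through, and the 0.75/0.25 weights and 0.5 threshold are arbitrary assumptions chosen for the example.

```python
def encode(frame):
    """Hypothetical stand-in for a CNN encoder: one feature per pixel."""
    return [[[float(v)] for v in row] for row in frame]


def fuse(current_feats, previous_feats):
    """Concatenate the per-pixel feature vectors of the two frames."""
    return [[c + p for c, p in zip(crow, prow)]
            for crow, prow in zip(current_feats, previous_feats)]


def segment(fused, threshold=0.5):
    """Toy mask: weighted sum of current and previous features, thresholded."""
    return [[1 if 0.75 * px[0] + 0.25 * px[1] > threshold else 0
             for px in row] for row in fused]


def locate(mask):
    """Position estimate as the (row, col) centroid of the mask pixels."""
    pts = [(r, c) for r, row in enumerate(mask)
           for c, on in enumerate(row) if on]
    if not pts:
        return None
    return (sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts))


# A vertical "wire" in column 1 of the current frame, column 0 previously.
current = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]
previous = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]

mask = segment(fuse(encode(current), encode(previous)))
position = locate(mask)
print(mask)      # [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]
print(position)  # (1.0, 1.0)
```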
Regarding claim 2, Zhang fails to disclose the limitations of claim 2.
Pertaining to the same field of endeavor, Benseghir discloses determining an optical flow based on the segmentation of the context and one or more previous segmentations of the context (see figure 3, step 330 and paragraph 0033, where an optical flow is determined between the input and previous images, including movement of both tissue and the tool),
wherein the position prediction of the target is refined based on the optical flow (see figure 3, step 340 and paragraph 0034, where the estimated position and segmentation of the tool are corrected/refined based on the optical flow of the tissue and tool).
Zhang and Benseghir are considered analogous art, as they are both directed to deep learning models for medical instrument segmentation. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have integrated the teachings of Benseghir into Zhang because doing so increases accuracy of instrument segmentation by accounting for patient movement (see Benseghir paragraphs 0034, 0038, and 0052).
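As a rough illustration of what refining a position estimate based on optical flow can look like, the following Python sketch shifts a position by the mean flow vector measured over the segmented pixels. The averaging rule is an assumption made for this example and is not asserted to be the correction used in Benseghir; the names mean_flow and refine_position are hypothetical.

```python
def mean_flow(flow, mask):
    """Average the (dy, dx) flow vectors over pixels where the mask is 1."""
    vectors = [flow[r][c]
               for r, row in enumerate(mask)
               for c, on in enumerate(row) if on]
    if not vectors:
        return (0.0, 0.0)  # nothing segmented: apply no correction
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n,
            sum(v[1] for v in vectors) / n)


def refine_position(position, flow, mask):
    """Shift a (row, col) position by the mean flow of the segmented pixels."""
    dy, dx = mean_flow(flow, mask)
    return (position[0] + dy, position[1] + dx)


# Segmented pixels sit in column 1 and move one pixel to the right.
mask = [[0, 1, 0], [0, 1, 0]]
flow = [[(0.0, 0.0), (0.0, 1.0), (0.0, 0.0)],
        [(0.0, 0.0), (0.0, 1.0), (0.0, 0.0)]]

refined = refine_position((0.5, 1.0), flow, mask)
print(refined)  # (0.5, 2.0)
```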
Regarding claim 3, Zhang fails to disclose the limitations of claim 3.
Pertaining to the same field of endeavor, Benseghir discloses wherein said refining of the position prediction comprises applying a refinement network to the optical flow and the position prediction of the target (see figure 3, step 340 and paragraph 0034, where the estimated position and segmentation of the tool are corrected/refined based on the optical flow of the tissue and tool; paragraph 0019 specifies that deep learning techniques can be used for the disclosed image processing techniques).
Zhang and Benseghir are considered analogous art, as they are both directed to deep learning models for medical instrument segmentation. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have integrated the teachings of Benseghir into Zhang because doing so increases accuracy of instrument segmentation by accounting for patient movement (see Benseghir paragraphs 0034, 0038, and 0052).
Regarding claim 4, Zhang fails to disclose the limitations of claim 4.
Pertaining to the same field of endeavor, Benseghir discloses refining the segmentation of the context of the target based on the optical flow (see figure 3, step 340 and paragraph 0034, where the estimated position and segmentation of the tool are corrected/refined based on the optical flow of the tissue and tool).
Zhang and Benseghir are considered analogous art, as they are both directed to deep learning models for medical instrument segmentation. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have integrated the teachings of Benseghir into Zhang because doing so increases accuracy of instrument segmentation by accounting for patient movement (see Benseghir paragraphs 0034, 0038, and 0052).
Regarding claim 5, Zhang fails to disclose the limitations of claim 5.
Pertaining to the same field of endeavor, Benseghir discloses wherein the segmentation of the context of the target is refined further based on a vessel segmentation (see figure 5, steps 505-560 and paragraphs 0041-0048, where the method can include identifying anatomical features such as vessels, registering the tool to a model of the anatomy, and correcting the tool segmentation based on the optical flow of the anatomy).
Zhang and Benseghir are considered analogous art, as they are both directed to deep learning models for medical instrument segmentation. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have integrated the teachings of Benseghir into Zhang because doing so increases accuracy of instrument segmentation by accounting for patient movement (see Benseghir paragraph 0047).
Regarding claim 6, Zhang fails to disclose the limitations of claim 6.
Pertaining to the same field of endeavor, Benseghir discloses wherein the segmentation of the context of the target is refined in a spatial-temporal mask refinement (the correction processes cited with regard to claim 4 above combine both spatial information from the segmentation with temporal information from the optical flow to correct/refine the segmentation mask).
Zhang and Benseghir are considered analogous art, as they are both directed to deep learning models for medical instrument segmentation. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have integrated the teachings of Benseghir into Zhang because doing so increases accuracy of instrument segmentation by accounting for patient movement (see Benseghir paragraphs 0034, 0038, and 0052).
Regarding claim 7, Zhang in view of Benseghir discloses wherein the target is a tip of an interventional medical instrument, and wherein the context is a body of the interventional medical instrument extending away from the tip (see Zhang figure 1, segmentation result and figure 3, more segmentation results, where the segmentation includes both a position prediction of a tip of the instrument as well as of the entire body of the instrument).
Regarding claim 8, Zhang fails to disclose the limitations of claim 8.
Pertaining to the same field of endeavor, Benseghir discloses wherein the context are predefined anatomical features in a surrounding of the target (see figure 5, steps 505-560 and paragraphs 0041-0048, where the method can include identifying anatomical features such as vessels, registering the tool to a model of the anatomy, and correcting the tool segmentation based on the optical flow of the anatomy).
Zhang and Benseghir are considered analogous art, as they are both directed to deep learning models for medical instrument segmentation. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have integrated the teachings of Benseghir into Zhang because doing so increases accuracy of instrument segmentation by accounting for patient movement (see Benseghir paragraph 0047).
Regarding claim 11, Zhang discloses a processing device (see figure 1, operating console 142) comprising: a processor (see figure 1, processor 181), and a memory storing program code (see figure 1, memory 182), wherein the processor is configured to load and execute the program code, upon executing the program code, the processor being configured to (see paragraph 0021).
The remainder of claim 11 recites steps identical to those of claim 1. Therefore, Zhang in view of Benseghir discloses claim 11 as applied to claim 1 above.
Regarding claim 12, Zhang in view of Benseghir discloses claim 12 as applied to claim 7 above.
Claims 9-10 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. ("A Temporary Transformer Network for Guide-Wire Segmentation", CISP-BMEI paper, 7 December 2021) in view of Benseghir et al. (U.S. Publ. US-2023/0083936-A1), and further in view of Yan et al. ("Learning Spatio-Temporal Transformer for Visual Tracking", ICCV paper, 28 February 2022).
Regarding claim 9, Zhang in view of Benseghir fails to disclose the limitations of claim 9.
Pertaining to the same field of endeavor, Yan discloses wherein each of the one or more template images has at least one of a lower resolution or a smaller size than the search image (see figures 2 and 4, where the template images are smaller than the search regions/search images, while still including the object to be tracked; Section 4.1 specifies that the search images can be 320x320 pixels, whereas template images can be 128x128 pixels).
Zhang and Yan are considered analogous art, as they are both directed to deep learning models for object segmentation. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have integrated the teachings of Yan into Zhang and Benseghir because one of ordinary skill in the art would recognize that using lower resolutions/sizes for template images improves performance speed while still including the object to be tracked.
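The size difference cited above can be made concrete with simple arithmetic. The following Python sketch compares pixel counts for the 128x128 template and 320x320 search images, and, under the assumption of a feature stride of 16 (a hypothetical value chosen for illustration, not taken from the cited passage), the number of feature tokens each image would contribute.

```python
SEARCH_SIDE = 320    # search image side length in pixels (from Section 4.1)
TEMPLATE_SIDE = 128  # template image side length in pixels (from Section 4.1)
STRIDE = 16          # assumed feature stride, for illustration only


def pixel_count(side):
    """Number of pixels in a square image with the given side length."""
    return side * side


def token_count(side, stride):
    """Number of feature tokens for a square image at a given stride."""
    return (side // stride) ** 2


ratio = pixel_count(TEMPLATE_SIDE) / pixel_count(SEARCH_SIDE)
print(pixel_count(TEMPLATE_SIDE))          # 16384
print(pixel_count(SEARCH_SIDE))            # 102400
print(ratio)                               # 0.16
print(token_count(TEMPLATE_SIDE, STRIDE))  # 64
print(token_count(SEARCH_SIDE, STRIDE))    # 400
```

Under these assumptions a template carries about 16% of the pixels of a search image, which is consistent with the stated motivation that smaller templates reduce processing cost.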
Regarding claim 10, Zhang in view of Benseghir fails to disclose the limitations of claim 10.
Pertaining to the same field of endeavor, Yan discloses wherein the fused features are determined using a vision transformer network (see figures 2, 4, and Section 3.1, where a transformer encoder fuses the features of the search and template images using a multi-head self-attention module).
Zhang and Yan are considered analogous art, as they are both directed to deep learning models for object segmentation. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have integrated the teachings of Yan into Zhang and Benseghir because the transformer encoder captures dependencies among all elements in the input sequence, improving detection accuracy (see Yan Section 3.1).
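To show what fusing template and search features by self-attention means in practice, the following Python sketch applies scaled dot-product self-attention to the concatenated token sequence. It is a simplification under stated assumptions: a single head with identity projections instead of the learned multi-head module described in Yan, and two-dimensional toy tokens. The function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention, identity projections."""
    d = len(tokens[0])
    fused = []
    for q in tokens:
        # Similarity of this token to every token in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        weights = softmax(scores)
        # Each output token is a weighted mix of all input tokens.
        fused.append([sum(w * v[i] for w, v in zip(weights, tokens))
                      for i in range(d)])
    return fused


template_tokens = [[1.0, 0.0]]
search_tokens = [[1.0, 0.0], [0.0, 1.0]]

# Concatenate template and search tokens, then let every token attend to all.
fused = self_attention(template_tokens + search_tokens)
print(len(fused))  # 3
```

Because every output token mixes information from all template and search tokens, the sketch reflects the stated rationale that the encoder captures dependencies among all elements of the input sequence.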
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICHOLAS JOHN HELCO whose telephone number is (703)756-5539. The examiner can normally be reached on Monday-Friday from 9:00 AM to 5:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matthew Bella, can be reached at telephone number 571-272-7778. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from Patent Center. Status information for published applications may be obtained from Patent Center. Status information for unpublished applications is available through Patent Center for authorized users only. Should you have questions about access to Patent Center, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) Form at https://www.uspto.gov/patents/uspto-automated-interview-request-air-form.
/NICHOLAS JOHN HELCO/Examiner, Art Unit 2667
/MATTHEW C BELLA/Supervisory Patent Examiner, Art Unit 2667