Prosecution Insights
Last updated: April 19, 2026
Application No. 18/531,536

PROCESSING MEDICAL IMAGES BASED ON MACHINE LEARNING MODELS

Status: Final Rejection (§103)
Filed: Dec 06, 2023
Examiner: WALLACE, JOHN R
Art Unit: 2682
Tech Center: 2600 — Communications
Assignee: Shanghai United Imaging Intelligence Co., Ltd.
OA Round: 2 (Final)
Grant Probability: 77% (Favorable)
Expected OA Rounds: 3-4
Median Time to Grant: 2y 9m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 77% (283 granted / 366 resolved), +15.3% vs TC avg (above average)
Interview Lift: +26.1% on resolved cases with an interview
Typical Timeline: 2y 9m average prosecution
Currently Pending: 22
Career History: 388 total applications across all art units
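The headline figures above are simple ratios of the raw counts shown; a short Python sketch (values taken from the displayed figures, not from any underlying dataset) reproduces them:

```python
# Reproduce the examiner's headline statistics from the counts above.
granted = 283        # career applications granted
resolved = 366       # career applications resolved (granted + abandoned)
total_filed = 388    # total applications across all art units

allow_rate = granted / resolved   # career allow rate
pending = total_filed - resolved  # applications still open

print(f"Career allow rate: {allow_rate:.1%}")  # 77.3%, displayed as 77%
print(f"Currently pending: {pending}")         # 22
```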

Statute-Specific Performance

§101: 6.9% (-33.1% vs TC avg)
§103: 60.1% (+20.1% vs TC avg)
§102: 12.9% (-27.1% vs TC avg)
§112: 18.0% (-22.0% vs TC avg)
Tech Center averages are estimates. Based on career data from 366 resolved cases.
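The "vs TC avg" deltas appear to be plain percentage-point differences, so the Tech Center baseline each figure implies can be backed out directly. This is a sketch under that assumption; the dashboard does not state its exact method:

```python
# Back out the Tech Center average implied by each statute-specific delta,
# assuming "vs TC avg" is a plain percentage-point difference.
rates = {
    "101": (6.9, -33.1),
    "102": (12.9, -27.1),
    "103": (60.1, +20.1),
    "112": (18.0, -22.0),
}

for statute, (rate, delta) in rates.items():
    tc_avg = rate - delta
    print(f"Section {statute}: examiner {rate}% vs implied TC avg {tc_avg:.1f}%")
```

Under this reading, all four statutes back out to the same 40.0% baseline estimate.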

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Applicant's arguments with respect to claim(s) 1-9 and 11-20 have been considered but are moot because the new ground of rejection does not rely on the combination of references applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Specifically, see the citation of the Kirillov et al. ("Segment Anything", copy attached, see PTO-892) reference in the rejection that follows.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 2, 4-7, 9, 11, 12, 14-17, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Kirillov et al. ("Segment Anything", copy attached, see PTO-892) in view of Zhao (U.S.P.G. Pub. No. 2025/0061566).

Regarding claim 1, Kirillov et al. ("Segment Anything", copy attached, see PTO-892) discloses: An apparatus, comprising: one or more processors configured to: obtain an image (see Figure 1, Abstract, segmentation on images); receive, from a user of the apparatus, a natural language based prompt to tag a target structure in the image (see Figure 1, promptable engine for segmentation; page 4, "2. Segment Anything Task", prompt from NLP to segmentation); determine one or more text features associated with the natural language based prompt (pages 2, 4, the prompt includes a variety of information indicating what to segment including, but not limited to, free-form text that describes a part of an image); identify one or more visual features corresponding to the target structure based on the one or more text features associated with the natural language based prompt and a machine-learning (ML) model (page 4, the NLP indicates what to segment in the image; see also pages 5-6 regarding annotation of structures), wherein the ML model has been pre-trained to learn a correspondence between a plurality of text embeddings and a plurality of visual embeddings in an embedding space (page 4, pre-training; see also pages 5-6 regarding assisted-manual, semi-automatic annotation), and wherein the one or more processors are configured to identify the one or more visual features corresponding to the target structure by determining, based on the correspondence between the plurality of text embeddings and the plurality of visual embeddings learned by the ML model, that the one or more visual features are correlated to the one or more text features associated with the natural language based prompt (Figures 1, 4, pages 5-6, the corresponding structures are annotated); and tag the target structure in the image based on the one or more identified visual features (Figures 1, 4, pages 5-6, the corresponding structures are annotated/tagged/segmented; see additionally the "Masks" that are output).

Kirillov does not explicitly disclose: obtain a medical image; tagging a medical image.

Zhao (U.S.P.G. Pub. No. 2025/0061566) discloses: obtain a medical image (paragraphs [0022]-[0024], medical image data); tagging a medical image (paragraphs [0023]-[0024], [0028], [0031], a DICOM file, for example, includes text strings indicating the content of the medical image data – e.g., "NERUO^HEAD" describes head region data – but other attributes can be 'Series Description' or 'Body Part Examined'; further, the EMR includes findings in medical records in text; see also paragraph [0036], the processing module can further highlight the detected objects of interest or display a text label describing an aspect of the image that can be displayed when the user selects the object).

Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the system of Zhao with the system of Kirillov such that the image obtained was a medical image and the medical image was then tagged as described in Zhao. The suggestion/motivation would have been in order to implement a system capable of "increasing efficiency" and have a "significant amount of time [be] saved" within the application of "searching for findin[g] matching objects of interest described in the EMR" (paragraph [0018] of the Zhao reference).

Regarding claim 2, the combination of Kirillov and Zhao discloses the apparatus of the parent claim (claim 1).
Kirillov additionally discloses: wherein the ML model comprises a vision transformer configured to encode image features of the image (pages 2, 4, image encoder), the ML model further comprising a text encoder configured to encode the natural language based prompt (pages 2, 4, the prompt includes a variety of information indicating what to segment including, but not limited to, free-form text that describes a part of an image).

Kirillov does not explicitly disclose: wherein the image is a medical image.

Zhao additionally discloses: wherein the ML model comprises a vision transformer configured to encode image features of the medical image (paragraph [0030], the classifier is used to identify a type of organ), the ML model further comprising a text encoder configured to encode the prompt (paragraphs [0029], [0031]-[0033], the image is segmented via machine learning/AI to parse the image into particular structures using image and textual descriptions).

Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the system of Zhao with the system of Kirillov such that the image obtained was a medical image and the medical image was then tagged as described in Zhao. The suggestion/motivation would have been in order to implement a system capable of "increasing efficiency" and have a "significant amount of time [be] saved" within the application of "searching for findin[g] matching objects of interest described in the EMR" (paragraph [0018] of the Zhao reference).

Regarding claim 4, the combination of Kirillov and Zhao discloses the apparatus of the parent claim (claim 1).
Kirillov additionally discloses: wherein the one or more processors being configured to tag the target structure in the image comprises the one or more processors being configured to generate a heatmap that indicates a location of the target structure in the image (Figure 5, page 6, a probability associated with the masks is generated; this constitutes a "heatmap" since it correlates location and probability of the masks).

As previously noted, Zhao discloses: wherein the image is a medical image (paragraphs [0022]-[0024], medical image data).

Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the system of Zhao with the system of Kirillov such that the image obtained was a medical image and the medical image was then tagged as described in Zhao. The suggestion/motivation would have been in order to implement a system capable of "increasing efficiency" and have a "significant amount of time [be] saved" within the application of "searching for findin[g] matching objects of interest described in the EMR" (paragraph [0018] of the Zhao reference).

Regarding claim 5, the combination of Kirillov and Zhao discloses the apparatus of the parent claim (claim 1).
Kirillov additionally discloses: wherein the one or more processors are further configured to: obtain multiple textual descriptions associated with a set of image classification labels (page 5, various annotated labels described by text); pair the medical image with one or more of the multiple text descriptions to obtain one or more image-text pairs (pages 6-7, the images are matched to the text labels); and determine a class of the image based on the one or more image-text pairs and the correspondence between the plurality of text embeddings and the plurality of visual embeddings learned by the ML model (pages 6-7, the appropriate label is assigned to the image in association with the generated mask).

As previously noted, Zhao discloses: wherein the image is a medical image (paragraphs [0022]-[0024], medical image data).

Zhao additionally discloses: obtain multiple textual descriptions associated with a set of image classification labels (paragraphs [0023]-[0024], [0028], [0031], a DICOM file, for example, includes text strings indicating the content of the medical image data – e.g., "NERUO^HEAD" describes head region data – but other attributes can be 'Series Description' or 'Body Part Examined'; further, the EMR includes findings in medical records in text); pair the medical image with one or more of the multiple text descriptions to obtain one or more corresponding image-text pairs (paragraphs [0029]-[0031], the image is used in conjunction with the associated text data from attributes and the EMR); classify the medical image based on a machine-learning (ML) model and the one or more image-text pairs, wherein the ML model is configured to predict respective similarities between the medical image and the corresponding text descriptions in the one or more image-text pairs (paragraphs [0029], [0031]-[0033], the image is segmented via machine learning/AI to parse the image into particular structures using both image and textual descriptions), and wherein the one or more processors are configured to determine a class of the medical image by comparing the similarities predicted by the ML model (paragraphs [0032]-[0033], [0035], the descriptions in the EMR can be used to identify WHAT – a particular abnormality or disease/pathology findings – and WHERE – the anatomical location of the abnormality).

Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the system of Zhao with the system of Kirillov such that the image obtained was a medical image and the medical image was then tagged as described above in regard to Zhao. The suggestion/motivation would have been in order to implement a system capable of "increasing efficiency" and have a "significant amount of time [be] saved" within the application of "searching for findin[g] matching objects of interest described in the EMR" (paragraph [0018] of the Zhao reference).

Regarding claim 6, the combination of Kirillov and Zhao discloses the apparatus of the parent claim (claim 5).

Kirillov does not explicitly disclose: wherein the set of image classification labels identifies multiple body parts, multiple imaging modalities, multiple image views, or multiple imaging protocols.

Zhao discloses: wherein the set of image classification labels identifies multiple body parts, multiple imaging modalities, multiple image views, or multiple imaging protocols (paragraphs [0023]-[0024], [0028], [0031], a DICOM file, for example, includes text strings indicating the content of the medical image data – e.g., "NERUO^HEAD" describes head region data – but other attributes can be 'Series Description' or 'Body Part Examined'; the DICOM schema allows for classification of a variety of different body parts).

Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the system of Zhao with the system of Kirillov such that the image obtained was a medical image and the medical image was then tagged with a set of image classification labels that identifies multiple body parts, multiple imaging modalities, multiple image views, or multiple imaging protocols, as described in Zhao. The suggestion/motivation would have been in order to implement a system capable of "increasing efficiency" and have a "significant amount of time [be] saved" within the application of "searching for findin[g] matching objects of interest described in the EMR" (paragraph [0018] of the Zhao reference).

Regarding claim 7, the combination of Kirillov and Zhao discloses the apparatus of the parent claim (claim 5).

Kirillov does not explicitly disclose: wherein at least one of the multiple textual descriptions includes a negation of an association between the medical image and one of the image classification labels.

Zhao additionally discloses: wherein at least one of the multiple textual descriptions includes a negation of an association between the medical image and one of the image classification labels (paragraphs [0035]-[0036], for example, the textual data may indicate a nodule is not malignant or non-cancerous).

Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the system of Zhao with the system of Kirillov such that the image obtained was a medical image and the medical image was then tagged such that at least one of the multiple textual descriptions includes a negation of an association between the medical image and one of the image classification labels, as described in Zhao.
The suggestion/motivation would have been in order to implement a system capable of "increasing efficiency" and have a "significant amount of time [be] saved" within the application of "searching for findin[g] matching objects of interest described in the EMR" (paragraph [0018] of the Zhao reference).

Regarding claim 9, the combination of Kirillov and Zhao discloses the apparatus of the parent claim (claim 1). Kirillov additionally discloses: wherein the one or more processors are further configured to track the target structure in one or more other images based on the natural language based prompt and the ML model (pages 4-6, the system can segment further images for the same structure based on the NL prompt and the model). As previously noted, Zhao discloses: wherein the image is a medical image (paragraphs [0022]-[0024], medical image data).

Regarding claim 11, the structural elements of apparatus claim 1 perform all of the steps of method claim 11. Thus, claim 11 is rejected for the same reasons discussed in the rejection of claim 1.

Regarding claim 12, the structural elements of apparatus claim 2 perform all of the steps of method claim 12. Thus, claim 12 is rejected for the same reasons discussed in the rejection of claim 2.

Regarding claim 14, the structural elements of apparatus claim 4 perform all of the steps of method claim 14. Thus, claim 14 is rejected for the same reasons discussed in the rejection of claim 4.

Regarding claim 15, the structural elements of apparatus claim 5 perform all of the steps of method claim 15. Thus, claim 15 is rejected for the same reasons discussed in the rejection of claim 5.

Regarding claim 16, the structural elements of apparatus claim 6 perform all of the steps of method claim 16. Thus, claim 16 is rejected for the same reasons discussed in the rejection of claim 6.

Regarding claim 17, the structural elements of apparatus claim 7 perform all of the steps of method claim 17. Thus, claim 17 is rejected for the same reasons discussed in the rejection of claim 7.

Regarding claim 19, the structural elements of apparatus claim 9 perform all of the steps of method claim 19. Thus, claim 19 is rejected for the same reasons discussed in the rejection of claim 9.

Regarding claim 20, arguments analogous to claims 1 and 11 are applicable. The computer readable medium is inherently taught as evidenced by the discussion of the computerized models throughout Kirillov and inherently taught in Zhao et al. as evidenced by the electronic computing device executing sequences of instructions designed to implement the disclosed methods (see paragraph [0012] of Zhao et al.).

Claim(s) 3 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Kirillov in view of Zhao as applied above, further in view of Reicher (U.S.P.G. Pub. No. 2023/0335261).

Regarding claim 3, the combination of Kirillov and Zhao discloses the apparatus of the parent claim (claim 1). The combination of Kirillov and Zhao does not explicitly disclose: wherein the natural language based prompt includes a voice prompt provided by the user of the apparatus.

Reicher et al. (U.S.P.G. Pub. No. 2023/0335261) discloses: wherein the natural language based prompt includes a voice prompt provided by the user of the apparatus (paragraphs [0005], [0060]-[0061], the user can prompt with voice in NL to help determine an area of interest).

Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the system of Reicher with the combination of Zhao and Kirillov such that the natural language based prompt would include a voice prompt provided by the user of the apparatus, as described in Reicher. The suggestion/motivation would have been in order to implement a system capable of "facilitat[ing] more accurate and simplified reporting" of medical images (paragraph [0005] of the Reicher reference).
Regarding claim 13, the structural elements of apparatus claim 3 perform all of the steps of method claim 13. Thus, claim 13 is rejected for the same reasons discussed in the rejection of claim 3.

Claim(s) 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Kirillov in view of Zhao as applied above, further in view of Jain (U.S.P.G. Pub. No. 2024/0028831).

Regarding claim 8, the combination of Kirillov and Zhao discloses the apparatus of the parent claim (claim 1). Kirillov does not explicitly disclose: wherein the ML model is trained using a training dataset comprising multiple training image-text pairs, wherein each training image-text pair includes a training image and a training textual description, wherein the ML model is trained to learn a similarity or dissimilarity between the training image and the training textual description in each training image-text pair based on a contrastive learning technique, and wherein the class of the medical image is not present in the training dataset.

Jain discloses: wherein the ML model is trained using a training dataset comprising multiple training image-text pairs, wherein each training image-text pair includes a training image and a training textual description, wherein the ML model is trained to learn a similarity or dissimilarity between the training image and the training textual description in each training image-text pair based on a contrastive learning technique, and wherein the medical image is not present in the training dataset (paragraphs [0041]-[0041], for example, training data for second associations).

Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine the system of Jain with the combination of Zhao and Kirillov such that the ML model is trained using a training dataset comprising multiple training image-text pairs, wherein each training image-text pair includes a training image and a training textual description, wherein the ML model is trained to learn a similarity or dissimilarity between the training image and the training textual description in each training image-text pair based on a contrastive learning technique, and wherein the class of the medical image is not present in the training dataset, as described in Jain. The suggestion/motivation would have been in order to implement a system capable of "ensur[ing] that the generated associations are based on the content similarity of image and textual data" (paragraph [0040] of the Jain reference), to thereby improve accuracy.

Regarding claim 18, the structural elements of apparatus claim 8 perform all of the steps of method claim 18. Thus, claim 18 is rejected for the same reasons discussed in the rejection of claim 8.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a).
Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHN R WALLACE, whose telephone number is (571) 270-1577. The examiner can normally be reached Monday-Friday from 8:30 AM to 5 PM. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Benny Tieu, can be reached at 571-272-7490. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JOHN R WALLACE/
Primary Examiner, Art Unit 2682

Prosecution Timeline

Dec 06, 2023: Application Filed
Oct 17, 2025: Non-Final Rejection (§103)
Jan 21, 2026: Response Filed
Feb 10, 2026: Applicant Interview (Telephonic)
Feb 10, 2026: Examiner Interview Summary
Mar 13, 2026: Final Rejection (§103) (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596509
IMAGE FORMING APPARATUS AND NON-TRANSITORY RECORDING MEDIUM STORING COMPUTER READABLE CONTROL PROGRAM
Granted Apr 07, 2026 (2y 5m to grant)
Patent 12592080
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING SYSTEM, AND RECORDING MEDIUM
Granted Mar 31, 2026 (2y 5m to grant)
Patent 12591953
Image reassembly system, method and computer-readable storage medium applied to magnetic resonance imaging
Granted Mar 31, 2026 (2y 5m to grant)
Patent 12584845
A SYSTEM AND METHOD THEREOF FOR REAL-TIME AUTOMATIC LABEL-FREE HOLOGRAPHY-ACTIVATED SORTING OF CELLS
Granted Mar 24, 2026 (2y 5m to grant)
Patent 12585414
METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM FOR ADJUSTING DOCUMENT STYLE
Granted Mar 24, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 77%
With Interview: 99% (+26.1%)
Median Time to Grant: 2y 9m
PTA Risk: Moderate
Based on 366 resolved cases by this examiner. Grant probability derived from career allow rate.
