Prosecution Insights
Last updated: April 19, 2026
Application No. 18/636,423

SYSTEMS AND METHODS FOR MULTI-CONTRAST MULTI-SCALE VISION TRANSFORMERS

Non-Final OA: §101, §102, §103, §112
Filed: Apr 16, 2024
Examiner: LY, TOMMY TAI
Art Unit: 3797
Tech Center: 3700 — Mechanical Engineering & Manufacturing
Assignee: Subtle Medical, Inc.
OA Round: 1 (Non-Final)
Grant Probability: 82% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 9m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 82% (99 granted / 121 resolved; +11.8% vs TC avg, above average)
Interview Lift: +23.4% allow-rate lift among resolved cases with an interview (strong)
Typical Timeline: 2y 9m average prosecution; 34 applications currently pending
Career History: 155 total applications across all art units

Statute-Specific Performance

§101: 2.8% (-37.2% vs TC avg)
§102: 16.8% (-23.2% vs TC avg)
§103: 51.0% (+11.0% vs TC avg)
§112: 23.3% (-16.7% vs TC avg)
Tech Center averages are estimates. Based on career data from 121 resolved cases.
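One detail worth noticing in the statute table: subtracting each reported delta from the examiner's rate recovers the same implied Tech Center average for every statute. The sketch below is a sanity check of that arithmetic, not the vendor's documented methodology:

```python
# Examiner's statute-specific rejection rates (%) and reported deltas
# vs the Tech Center average, taken from the table above.
examiner = {"101": 2.8, "102": 16.8, "103": 51.0, "112": 23.3}
delta = {"101": -37.2, "102": -23.2, "103": 11.0, "112": -16.7}

# Implied TC average per statute: examiner rate minus reported delta.
tc_avg = {s: round(examiner[s] - delta[s], 1) for s in examiner}
print(tc_avg)  # every statute implies the same 40.0% baseline
```

The uniform 40.0% result suggests the "Tech Center average estimate" is a single flat baseline applied to all four statutes rather than a per-statute figure.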

Office Action

Rejections under §101, §102, §103, and §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority

This application is a continuation of PCT/US2022/048414, filed 10/31/2022, which claims benefit from provisional application 63/331,313, filed 04/15/2022, and further claims benefit from provisional application 63/276,301, filed 11/05/2021.

Information Disclosure Statement

The information disclosure statement (IDS) was filed on 08/14/2024. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Drawings

The drawings are objected to because “database” in Figure 13 is labeled with reference character “820” in the drawings but is labeled as “1320” in ¶ [0097] of the specification filed 04/16/2024. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediately prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action.

The objection to the drawings will not be held in abeyance.

Specification

The disclosure is objected to because of the following informality: in ¶ [0061] of the specification filed 04/16/2024, two instances of reference character “257” should be changed to “247” to correctly match applicant’s Figure 2B. Appropriate correction is required.

Claim Objections

Claim 15 is objected to because of the following informality: “attention scores indicative a relevance of a region” should be corrected to “attention scores indicative of relevance of a region”. Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Claims 1 and 16 recite “wherein the multi-contrast image comprises one or more images of one or more different contrasts”. In the instance that the “multi-contrast” image comprises one image of one contrast, it is unclear whether this would qualify as a multi-contrast image when there is only one contrast, since the term “multi-contrast” implies multiple contrasts. For purposes of examination, the “multi-contrast image” will be interpreted to also include a single image of a single contrast as defined by the claims.

Claim 15 recites the limitation "the attention scores indicative a relevance of a region in the one or more images". There is insufficient antecedent basis for this limitation in the claim.

Claims 2-15 and 17-20 are further rejected by virtue of their dependency on rejected claims 1 and 16, respectively.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Independent claims 1 and 16 recite receiving a multi-contrast image, generating an input to a transformer model based on the multi-contrast image, and generating, by the transformer model, a synthesized image having a target contrast that is different from the contrast(s) of the multi-contrast image. The recited transformer model amounts to a mathematical concept comprising a set of mathematical operations, i.e. a mathematical algorithm/formula, for transforming one (or more) image(s) into another. This judicial exception is not integrated into a practical application because these recited steps amount to applying mathematical concepts (i.e. the transformer model) to data (images) and therefore comprise data manipulation. MPEP § 2106.04(a)(2) states: “iv. organizing information and manipulating information through mathematical correlations, Digitech Image Techs., LLC v. Electronics for Imaging, Inc., 758 F.3d 1344, 1350, 111 USPQ2d 1717, 1721 (Fed. Cir. 2014). The patentee in Digitech claimed methods of generating first and second data by taking existing information, manipulating the data using mathematical functions, and organizing this information into a new form.

The court explained that such claims were directed to an abstract idea because they described a process of organizing information through mathematical correlations, like Flook's method of calculating using a mathematical formula. 758 F.3d at 1350, 111 USPQ2d at 1721.”

Independent claims 1 and 16 include additional elements comprising, respectively, the method being computer-implemented and a non-transitory computer-readable storage medium executable by a processor to perform the method. However, these additional elements amount to no more than mere instructions to apply the exception using generic computer components. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Claims 1 and 16 therefore do not include additional elements that are sufficient to amount to significantly more than the judicial exception. Dependent claims 2-15 and 17-20 further recite similar processes that, under their broadest reasonable interpretation, cover mathematical concepts and their use for data manipulation. These dependent claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception, i.e. that integrate the abstract idea into a practical application. Accordingly, claims 1-20 are not patent eligible.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-6, 9, and 11-12 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Dalmaz (“ResViT: Residual vision transformers for multi-modal medical image synthesis”).

Regarding claim 1, Dalmaz teaches a computer-implemented method for synthesizing a contrast-weighted image (Abstract, wherein employing a residual vision transformer, i.e. ResViT, comprises a computer-implemented method; Pages 9-10, “Multi-Contrast MRI Synthesis”) comprising: (a) receiving a multi-contrast image of a subject, wherein the multi-contrast image comprises one or more images of one or more different contrasts (Page 5, Figure 2, wherein input images are shown to be of different contrasts); (b) generating an input to a transformer model based at least in part on the multi-contrast image (Page 4, Figure 1; Page 5, “During training, ResViT takes as input the entire set of images within the multi-modal protocol, including both source and target modalities…Receiving as input the jth layer feature maps…where z0 ∈ ℝNP, ND denotes patch embeddings that the transformer encoder takes as input”; wherein the feature maps and patch embeddings are based on the multi-contrast images received earlier); and (c) generating, by the transformer model, a synthesized image having a target contrast that is different from the one or more different contrasts of the one or more images (Page 10, Figure 3, “ResViT was demonstrated on the IXI dataset for two representative many-to-one synthesis tasks: a) T1, T2 → PD, b) T2, PD → T1”), wherein the target contrast is specified in a query received by the transformer model (Pages 10-11, Tables 1 & 2, wherein the task-specific ResViT model tasked with generating specific contrast-weighted image(s), i.e. T1, T2 → PD, T1, PD → T2, and T2, PD → T1, implies specifying said task in a query that is received by the model).

Regarding claim 2, Dalmaz teaches the invention as claimed above in claim 1.

Dalmaz further teaches wherein the multi-contrast image is acquired using a magnetic resonance (MR) device (Pages 7-8, “We demonstrated the proposed ResViT model on two multi-contrast brain MRI datasets”, wherein the IXI and BRATS datasets being MR images comprises the multi-contrast image being acquired by an MR device; Pages 9-10, “Multi-Contrast MRI Synthesis Experiments were conducted on the IXI and BRATS datasets to demonstrate synthesis performance in multi-modal MRI…many-to-one tasks of T1, T2 → PD; T1, PD → T2; T2, PD → T1…many-to-one tasks of T1, T2 → FLAIR; T1, FLAIR → T2; T2, FLAIR → T1 were considered”).

Regarding claim 3, Dalmaz teaches the invention as claimed above in claim 1. Dalmaz further teaches wherein the input to the transformer model comprises an image encoding generated by a convolutional neural network (CNN) model (Page 4, Figure 1; Page 6, “Finally, the feature maps are processed via a residual CNN (ResCNN) to distill learned structural and contextual representations…The decoder receives as input the feature maps distilled by the information bottleneck and produces multi-modality images in separate channels”).

Regarding claim 4, Dalmaz teaches the invention as claimed above in claim 3. Dalmaz further teaches wherein the image encoding is partitioned into image patches (Page 4, Figure 1; Page 5, “Accordingly, f’j is first split into non-overlapping patches of size (P, P), and the patches are then flattened”).

Regarding claim 5, Dalmaz teaches the invention as claimed above in claim 3. Dalmaz further teaches wherein the input to the transformer model comprises a combination of the image encoding and a contrast encoding (Page 4, Figure 1, wherein image patches being input and feature maps being output comprises image encoding; Pages 10-11, Figure 3 & Tables 1 & 2, wherein transforming an image of one contrast, i.e. T1, T2, PD, into a synthesized image of a different contrast necessarily comprises contrast encoding).

Regarding claim 6, Dalmaz teaches the invention as claimed above in claim 1. Dalmaz further teaches wherein the transformer model comprises: i) an encoder model receiving the input and outputting multiple representations of the input having multiple scales, ii) a decoder model receiving the query and the multiple representations of the input having the multiple scales and outputting the synthesized image (Page 4, Figure 1, wherein Figure 1 shows steps of down-sampling and up-sampling, resulting in outputting of multiple representations of the input having multiple scales, and wherein Figure 1 showing the Decoder stage coming after the down-sampling and up-sampling steps comprises the decoder model receiving the multiple representations of the input having multiple scales; Page 6, “The decoder receives as input the feature maps distilled by the information bottleneck”; Page 10, wherein task-specific synthesis implies a given task or query).

Regarding claim 9, Dalmaz teaches the invention as claimed above in claim 1. Dalmaz further teaches wherein the transformer model is trained utilizing a combination of synthesis loss, reconstruction loss, and adversarial loss (Page 7, “Loss Function”, wherein the pixel-wise Lpix loss that is defined between acquired and synthesized target modalities comprises synthesis loss, Lrec is reconstruction loss, and Ladv is adversarial loss).

Regarding claim 11, Dalmaz teaches the invention as claimed above in claim 1. Dalmaz further teaches wherein the transformer model is capable of taking an arbitrary number of contrasts as input (Page 10, last paragraph, wherein the ResViT model being capable of performing many-to-one and one-to-one tasks comprises it being capable of taking an arbitrary number of contrasts as input, i.e. one or many contrasts).

Regarding claim 12, Dalmaz teaches the invention as claimed above in claim 1.

Dalmaz further teaches displaying an interpretation of the transformer model generating the synthesized image (Page 10, Figure 3, wherein Figure 3 displaying the synthesized images of the transformer model ResViT comprises displaying an interpretation of the transformer model generating the synthesized image).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Dalmaz (“ResViT: Residual vision transformers for multi-modal medical image synthesis”) in view of Liu (“Swin Transformer: Hierarchical Vision Transformer using Shifted Windows”). Liu is cited in the IDS filed 08/14/2024.

Regarding claim 7, Dalmaz teaches the invention as claimed above in claim 6. However, Dalmaz fails to teach wherein the encoder model comprises a multi-contrast shifted window-based attention block.

In an analogous vision transformer field of endeavor, Liu teaches such a feature. Liu teaches a Swin Transformer which uses shifted windows (Title, Abstract). Liu teaches transformer blocks with modified self-attention computation (Page 3, “3.1 Overall Architecture”). Liu teaches wherein the Swin Transformer may comprise an encoder model (Page 4, Figure 3, wherein Figure 3 depicts images being encoded) and wherein the Swin Transformer blocks include a shifted window multi-head self-attention module or block (SW-MSA) (Page 4, Figure 3, “Swin Transformer block”).

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the invention of Dalmaz to have the encoder model include a shifted window-based attention module as taught by Liu (Page 4, Figure 3, “Swin Transformer block”). The shifted window scheme may have greater efficiency and flexibility as recognized by Liu (Abstract). Because Dalmaz teaches wherein the inputs/outputs to their model are multi-contrast images, Dalmaz modified by the teachings of Liu to incorporate shifted window-based attention blocks into the encoder would predictably result in the encoder comprising a multi-contrast shifted window-based attention block.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Dalmaz (“ResViT: Residual vision transformers for multi-modal medical image synthesis”) in view of Zhu (US20230100413).

Regarding claim 8, Dalmaz teaches the invention as claimed above in claim 6. However, Dalmaz fails to teach wherein the decoder model comprises a multi-contrast shifted window-based attention block. In an analogous transformer model architecture field of endeavor, Zhu teaches such a feature. Zhu teaches using transformer layers with shifted self-attention windows (Abstract, [0001]). Zhu teaches a transformer model including an encoder sub-network and a decoder sub-network ([0070]).

Zhu teaches the decoder is symmetric to the encoder ([0071]). Zhu teaches the encoder and decoder utilize modified self-attention computation with a shifted windows approach (Figs. 5A & 5B, [0072], [0116-0119]). Zhu teaches the shifted window transformer blocks of the decoder model can compute self-attention (Figs. 5B & 6A, [0119]) and thus comprise shifted window-based attention blocks.

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the invention of Dalmaz to have the decoder model comprise shifted window-based attention blocks as taught by Zhu (Figs. 5B & 6A, [0119]). By having the decoder model include shifted window blocks which compute self-attention, greater computation flexibility and efficiency/performance may be provided as recognized by Zhu ([0069], [0073]). Because Dalmaz teaches wherein the inputs/outputs to their model are multi-contrast images, Dalmaz modified by the teachings of Zhu to incorporate shifted window-based attention blocks in the decoder would predictably result in the decoder comprising a multi-contrast shifted window-based attention block.

Claims 10 and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Dalmaz (“ResViT: Residual vision transformers for multi-modal medical image synthesis”) in view of Hagi (US20220180527).

Regarding claim 10, Dalmaz teaches the invention as claimed above in claim 1. However, Dalmaz fails to explicitly teach wherein the transformer model is trained utilizing multi-scale discriminators. In an analogous transformer model field of endeavor, Hagi teaches such a feature. Hagi teaches image synthesis by use of a transformer model (Abstract, [0061]). Hagi teaches wherein the trained model comprises a multi-scale discriminator comprising a plurality of single-scale discriminators which operate at different image scales comprising different resolutions ([0013], [0071]).

Hagi teaches that by using a multi-scale discriminator, samples/images that are indistinguishable from natural images may be generated ([0071]); multi-scale discriminators may help improve realism. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the invention of Dalmaz to train the transformer model by using a multi-scale discriminator as taught by Hagi ([0013], [0071]). Multi-scale discriminators may help improve realism of synthesized images as recognized by Hagi ([0071]).

Regarding claim 16, Dalmaz teaches: (a) receiving a multi-contrast image of a subject, wherein the multi-contrast image comprises one or more images of one or more different contrasts (Page 5, Figure 2, wherein input images are shown to be of different contrasts); (b) generating an input to a transformer model based at least in part on the multi-contrast image (Page 4, Figure 1; Page 5, “During training, ResViT takes as input the entire set of images within the multi-modal protocol, including both source and target modalities…Receiving as input the jth layer feature maps…where z0 ∈ ℝNP, ND denotes patch embeddings that the transformer encoder takes as input”; wherein the feature maps and patch embeddings are based on the multi-contrast images received earlier); and (c) generating, by the transformer model, a synthesized image having a target contrast that is different from the one or more different contrasts of the one or more images (Page 10, Figure 3, “ResViT was demonstrated on the IXI dataset for two representative many-to-one synthesis tasks: a) T1, T2 → PD, b) T2, PD → T1”), wherein the target contrast is specified in a query received by the transformer model (Pages 10-11, Tables 1 & 2, wherein the task-specific ResViT model tasked with generating specific contrast-weighted image(s), i.e. T1, T2 → PD, T1, PD → T2, and T2, PD → T1, implies specifying said task in a query that is received by the model).

However, Dalmaz fails to teach a non-transitory computer-readable storage medium including instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising: (a), (b), and (c). In an analogous transformer model field of endeavor, Hagi teaches such a feature. Hagi teaches image synthesis by use of a transformer model, wherein the input to the model comprises an image (Fig. 6, Abstract, [0061-0062], [0068]). Hagi teaches wherein the image synthesis or image transformation may be performed by a non-transitory computer-readable medium storing instructions which, when executed by a processor, cause the processor to perform the method/invention disclosed therein ([0026]).

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the invention of Dalmaz to have the method of image synthesis and/or transformation be stored as instructions in a non-transitory computer-readable storage medium that, when executed, cause a processor to perform the invention as taught by Hagi (Fig. 6, [0026], [0061-0062], [0068]). Having the method be stored as instructions for a computer to perform may predictably allow the method to be more automated, thereby minimizing manual work performed by a user.

Regarding claim 17, Dalmaz in view of Hagi teaches the invention as claimed above in claim 16.

Dalmaz further teaches wherein the multi-contrast image is acquired using a magnetic resonance (MR) device (Pages 7-8, “We demonstrated the proposed ResViT model on two multi-contrast brain MRI datasets”, wherein the IXI and BRATS datasets being MR images comprises the multi-contrast image being acquired by an MR device; Pages 9-10, “Multi-Contrast MRI Synthesis Experiments were conducted on the IXI and BRATS datasets to demonstrate synthesis performance in multi-modal MRI…many-to-one tasks of T1, T2 → PD; T1, PD → T2; T2, PD → T1…many-to-one tasks of T1, T2 → FLAIR; T1, FLAIR → T2; T2, FLAIR → T1 were considered”).

Regarding claim 18, Dalmaz in view of Hagi teaches the invention as claimed above in claim 16. Dalmaz further teaches wherein the input to the transformer model comprises an image encoding generated by a convolutional neural network (CNN) model (Page 4, Figure 1; Page 6, “Finally, the feature maps are processed via a residual CNN (ResCNN) to distill learned structural and contextual representations…The decoder receives as input the feature maps distilled by the information bottleneck and produces multi-modality images in separate channels”).

Regarding claim 19, Dalmaz in view of Hagi teaches the invention as claimed above in claim 18. Dalmaz further teaches wherein the image encoding is partitioned into image patches (Page 4, Figure 1; Page 5, “Accordingly, f’j is first split into non-overlapping patches of size (P, P), and the patches are then flattened”).

Regarding claim 20, Dalmaz in view of Hagi teaches the invention as claimed above in claim 18. Dalmaz further teaches wherein the input to the transformer model comprises a combination of the image encoding and a contrast encoding (Page 4, Figure 1, wherein image patches being input and feature maps being output comprises image encoding; Pages 10-11, Figure 3 & Tables 1 & 2, wherein transforming an image of one contrast, i.e. T1, T2, PD, into a synthesized image of a different contrast necessarily comprises contrast encoding).

Claims 13 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Dalmaz (“ResViT: Residual vision transformers for multi-modal medical image synthesis”) in view of Rubin (US20240371500).

Regarding claim 13, Dalmaz teaches the invention as claimed above in claim 12. However, Dalmaz fails to teach wherein the interpretation is generated based at least in part on attention scores outputted by a decoder of the transformer model. In an analogous vision transformer field of endeavor, Rubin teaches such a feature. Rubin teaches vision transformer models which may be based on an encoder-decoder architecture ([0008], [0047], [0053], [0057]). Rubin teaches the models receive input data comprising images ([0045-0046]). Rubin teaches the models output an attention map ([0075-0076]). Rubin teaches wherein the attention map (322) may be generated, i.e. outputted, by a decoder ([0015], [0075], [0100]). Rubin teaches wherein the attention maps are synthesized images based on an input image (Fig. 4, [0084]) and are thus an interpretation of the model generating a synthesized image. Rubin teaches the pixels depicted by the attention map depict parts of an image to pay the highest ‘attention’ to ([0084]) and are thus based on attention scores. Moreover, Rubin teaches displaying the generated attention map ([0016], [0104], [0140]). Since the attention map is a map of attention scores and is generated by a decoder, Rubin therefore teaches generating an interpretation based at least in part on attention scores outputted by a decoder of a transformer model.

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the invention of Dalmaz to generate and display an attention map as taught by Rubin (Fig. 4, [0007], [0015-0016], [0084]).

Attention maps may support human visual clinical interpretation, provide transparency to black-box AI models, and/or provide a mechanism for clinical review as recognized by Rubin ([0104]).

Regarding claim 15, Dalmaz teaches the invention as claimed above in claim 12. However, Dalmaz fails to teach wherein the interpretation comprises a visual representation of the attention scores indicative a relevance of a region in the one or more images or a contrast from the one or more different contrasts to the synthesized image. In an analogous vision transformer field of endeavor, Rubin teaches such a feature. Rubin teaches vision transformer models which may be based on an encoder-decoder architecture ([0008], [0047], [0053], [0057]). Rubin teaches the models receive input data comprising images ([0045-0046]). Rubin teaches the models output an attention map ([0075-0076]). Rubin teaches wherein the attention maps are synthesized images based on an input image (Fig. 4, [0084]). Rubin teaches the pixels depicted by the attention map depict parts of an image to pay the highest ‘attention’ to, i.e. the ‘higher intensity’ pixels represent the pixels to pay the highest attention to (Fig. 4, [0084]), and thus the higher intensity pixels comprise a visual representation of attention scores indicative of relevance of a region in the image.

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the invention of Dalmaz to generate and display an attention map as taught by Rubin (Fig. 4, [0007], [0015-0016], [0084]). Attention maps may support human visual clinical interpretation, provide transparency to black-box AI models, and/or provide a mechanism for clinical review as recognized by Rubin ([0104]).

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Dalmaz (“ResViT: Residual vision transformers for multi-modal medical image synthesis”) in view of Lu (“GAMER MRI: Gated-Attention Mechanism Ranking of Multi-Contrast MRI in Brain Pathology”).

Regarding claim 14, Dalmaz teaches the invention as claimed above in claim 12. However, Dalmaz fails to teach wherein the interpretation comprises quantitative analysis of a contribution or importance of each of the one or more different contrasts. In an analogous multi-contrast imaging field of endeavor, Lu teaches such a feature. Lu teaches GAMER MRI, gated-attention mechanism ranking of multi-contrast MRI, which computes attention weights (AWs) as proxies of importance of features in image classification (Abstract). Lu teaches there is a need to address the selection of the most informative MR contrast since acquiring multiple MR contrasts requires significant time (Page 2, 1. Introduction, first paragraph). Lu teaches wherein the model receives as input multiple contrasts consisting of combinations of ADC, FLAIR, and Trace (Page 8, Table 4, wherein rmAWs comprises reported mean attention weights). Moreover, Lu teaches computing and displaying attention weights for each contrast, thereby ranking each contrast (Page 8, Tables 4 & 5). Lu therefore teaches displaying an interpretation comprising a quantitative analysis of importance of each of the one or more different contrasts.

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the invention of Dalmaz to calculate and display the attention weights of each inputted contrast as taught by Lu (Abstract, Page 8, Tables 4 & 5). By calculating the attention weight of each contrast, a clinician may know which contrast is the most important for classifying/diagnosing a certain pathology such as infarct strokes as recognized by Lu (Abstract, Page 10, 4.5 Conclusion).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TOMMY T LY whose telephone number is (571) 272-6404. The examiner can normally be reached M-F 12:00pm-8:00pm eastern time.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Anhtuan Nguyen, can be reached at 571-272-4963. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/TOMMY T LY/
Examiner, Art Unit 3797

/SERKAN AKAR/
Primary Examiner, Art Unit 3797
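The §103 rejections of claims 7 and 8 both turn on shifted window-based attention (Liu's SW-MSA). As illustration only, and not the applicant's claimed block, the mechanical difference between regular and shifted windows, a half-window cyclic shift before partitioning, can be sketched in a few lines of NumPy:

```python
import numpy as np

def window_partition(x, ws):
    # x: (H, W, C) feature map -> (num_windows, ws*ws, C) token groups,
    # each group attended over independently in window-based attention.
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws * ws, C)

def shifted_window_partition(x, ws):
    # Cyclically shift by ws//2 before partitioning (the "SW" in SW-MSA),
    # so successive blocks mix tokens across the previous window borders.
    shifted = np.roll(x, shift=(-(ws // 2), -(ws // 2)), axis=(0, 1))
    return window_partition(shifted, ws)

feat = np.arange(8 * 8 * 1, dtype=float).reshape(8, 8, 1)
regular = window_partition(feat, 4)          # 4 windows of 16 tokens each
shifted = shifted_window_partition(feat, 4)  # same count, offset by 2 pixels
```

The cyclic shift rearranges which tokens share a window without dropping any (both partitions contain exactly the same values), which is why Liu can claim the same computation cost with added cross-window connectivity.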

Prosecution Timeline

Apr 16, 2024
Application Filed
Dec 30, 2025
Non-Final Rejection — §101, §102, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12599786
ULTRASOUND DEVICE WITH ATTACHABLE COMPONENTS
2y 5m to grant; granted Apr 14, 2026
Patent 12588898
ULTRASOUND IMAGING TECHNIQUES FOR SHEAR-WAVE ELASTOGRAPHY
2y 5m to grant; granted Mar 31, 2026
Patent 12564379
INTRACAVITARY INSERTION TYPE ULTRASOUND PROBE
2y 5m to grant; granted Mar 03, 2026
Patent 12558032
Wearable Computing Device having a Control Circuit to Detect Aggressors Affecting Photoplethysmography (PPG) Data
2y 5m to grant; granted Feb 24, 2026
Patent 12551180
METHODS, SYSTEMS, AND DEVICES FOR ANALYZING LUNG IMAGING DATA
2y 5m to grant; granted Feb 17, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 82%
With Interview: 99% (+23.4%)
Median Time to Grant: 2y 9m
PTA Risk: Low
Based on 121 resolved cases by this examiner. Grant probability derived from career allow rate.
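The headline figures can be reproduced from the stated inputs (99 granted of 121 resolved; +23.4-point interview lift). The compounding step for the with-interview figure is an assumption here, not the dashboard's documented formula:

```python
granted, resolved = 99, 121

# Career allow rate: 99 / 121 ≈ 81.8%, displayed as 82%.
allow_rate = granted / resolved
print(round(allow_rate * 100))  # 82

# "With Interview: 99%" is consistent with adding the +23.4-point
# interview lift to the ~81.8% base and capping at 99% -- an assumed
# reconstruction of the dashboard's method, not a documented formula.
with_interview = min(allow_rate * 100 + 23.4, 99.0)
print(round(with_interview))  # 99
```

Since 81.8 + 23.4 exceeds 100, the displayed 99% is plausibly a cap; treat the with-interview number as an upper-bound estimate rather than a literal probability.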
