DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
This application does not claim priority to any other patent document and is thus afforded an effective filing date of 5/24/2024, its actual filing date.
Status of the Claims
Claims 1-26 are currently pending and have been considered below.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 10/18/2024 is in accordance with the provisions of 37 CFR 1.97 and has been considered by the Examiner.
Claim Objections
Claims 1-21 are objected to because of the following informalities:
Claims 1 and 14 each recite determining “a matrix of features for each of the plurality of data modalities” (emphasis added). However, claim 1 only previously introduces “at least three datasets from differing modalities” while claim 14 previously introduces “a plurality of datasets from different modalities,” rendering the nomenclature of the modalities inconsistent within each claim. Claims 6 and 19 also each reference “the plurality of data modalities” and are thus similarly inconsistent. Claims 2-13 and 15-21 are also objected to on this basis because they inherit the objectionable language due to their dependence on claims 1 and 14, respectively.
Claims 6 and 19 each recite “the omic-histology-directed data” to reference “the omic-histology-directed radiology data” of parent claims 5 and 18; the nomenclature of this type of data should be uniform between claims. Claims 7 and 20 are also objected to on this basis because they inherit the objectionable language due to their dependence on claims 6 and 19, respectively.
Claim 13 recites “the signature,” which is inconsistent with the previously introduced nomenclature of “the integrated prognostic signature” of parent claim 1.
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 7, 11, 20, and 24 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 7 and 20 each recite “concatenate the one or more features and into a plurality of fully connected layers” which is confusingly worded such that the scope of each claim is unclear. For instance, this language could be read as “concatenate the one or more features into a plurality of fully connected layers,” implying that the concatenated features form the fully connected layers, or it could be read as “concatenate the one or more features and input the one or more concatenated features into a plurality of fully connected layers,” implying that the concatenated features are utilized as input into a separate entity comprising a plurality of fully connected layers. For purposes of examination, Examiner will utilize the latter interpretation, which appears to be in line with the disclosure of para. [0031] of Applicant’s specification. Examiner separately notes that the scope of this claim appears to include concatenating a single feature (“concatenate the one or more features”), and it is unclear how “one” feature would be concatenated because concatenation is understood as the process of joining or merging at least two inputs.
Claim 11 recites “the imaging scans.” There is insufficient antecedent basis for this limitation in the claim because parent claim 10 only introduces “imaging slices,” not “imaging scans.” Because imaging “slices” versus “scans” may be slightly different in scope (e.g. a “scan” could include multiple slices), it is unclear whether “the imaging scans” is referencing only the previously-introduced imaging slices of claim 10, or whether this element is broader in scope. For purposes of examination, “the imaging scans” will be interpreted as referencing “the imaging slices” as introduced in claim 10.
Claim 24 recites “wherein the report that includes an integrated marker that is diagnostic of a disease or prognostic of disease outcome” which is confusingly worded such that the scope of the claim is unclear. For instance, this language could be read as “wherein the report includes an integrated marker that is diagnostic of a disease or prognostic of disease outcome,” implying that the claim is merely further describing the content of the report as including an integrated marker with diagnostic or prognostic implications, or it could be read as “wherein the report that includes an integrated marker is diagnostic of a disease or prognostic of disease outcome,” implying that the report itself is the element that is diagnostic of a disease or prognostic of disease outcome. For purposes of examination, Examiner will utilize the former interpretation, which appears to be in line with the disclosure of paras. [0007] & [0033] of Applicant’s specification.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-26 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1
In the instant case, claims 1-13 and 22-26 are directed to systems (i.e. machines) and claims 14-21 are directed to a method (i.e. a process). Thus, each of the claims falls within one of the four statutory categories. Nevertheless, the claims fall within the judicial exception of an abstract idea.
Step 2A – Prong 1
Independent claims 1, 14, and 22 recite steps that, under their broadest reasonable interpretations, cover certain mathematical concepts as well as methods of organizing human activity, e.g. managing personal behavior, relationships, or interactions between people. Specifically, claim 1 (as representative) recites:
a computing device comprising: a memory storing instructions; a processor configured to access the memory to execute the instructions and, thereby, be caused to:
receive at least three datasets from differing modalities associated with a disease of a subject;
determine a matrix of features for each of the plurality of data modalities;
integrate the matrix of features for each of the data modalities through a sequential, hierarchical structure to create an integrated prognostic signature for the subject;
generate a report using the integrated prognostic signature indicating at least one of a diagnosis of the disease or a response of the subject to a treatment; and
a display configured to display the report.
But for the recitation of generic computer components like a computing device and a display, the italicized functions, when considered as a whole, describe a mathematical procedure for data fusion as well as a clinical report generation operation that could be achieved by a human actor such as a clinician or other medical professional managing their personal behavior and/or interactions with others. For example, a clinician could obtain datasets from at least three different clinical modalities associated with a disease of a subject (e.g. by referencing a patient’s chart, ordering laboratory or imaging tests and receiving the results, etc.), determine a matrix of features for each modality via mathematical analysis, integrate the matrix through a mathematically-based sequential, hierarchical structure/framework (e.g. by performing dot product or other mathematical data fusion techniques) to come up with an integrated prognostic signature for the subject, and generate a visual report based on the prognostic signature (e.g. by writing out a hard copy of a report of diagnostic findings).
“Unless it is clear that a claim recites distinct exceptions, such as a law of nature and an abstract idea, care should be taken not to parse the claim into multiple exceptions, particularly in claims involving abstract ideas.” MPEP 2106.04, subsection II.B. Accordingly, if possible, the examiner should consider the limitations together as a single abstract idea rather than as a plurality of separate abstract ideas to be analyzed individually. “For example, in a claim that includes a series of steps that recite mental steps as well as a mathematical calculation, an examiner should identify the claim as reciting both a mental process and a mathematical concept for Step 2A, Prong One to make the analysis clear on the record.” MPEP 2106.04, subsection II.B. Under such circumstances, however, the Supreme Court has treated such claims in the same manner as claims reciting a single judicial exception. Id. (discussing Bilski v. Kappos, 561 U.S. 593 (2010)). Here, the matrix-related steps fall within the mathematical concepts grouping of abstract ideas, and the remaining steps directed to gathering clinical data and generating a report based on a prognostic indicator fall within the certain methods of organizing human activity grouping of abstract ideas. These limitations are considered together as a single abstract idea for further analysis. Accordingly, claim 1 recites an abstract idea in the form of mathematical concepts and a certain method of organizing human activity. Claims 14 and 22 recite substantially similar subject matter as claim 1 and are found to recite an abstract idea under the same analysis.
Dependent claims 2-13, 15-21, and 23-26 inherit the limitations that recite an abstract idea from their dependence on claims 1, 14, and 22, respectively, and thus these claims also recite an abstract idea under the Step 2A – Prong 1 analysis. In addition, claims 2-13, 15-21, and 23-26 recite additional limitations that further describe the abstract idea identified in the independent claims. Specifically, claims 2, 8-12, 15, 21, 23, and 26 merely describe types of modality data that are integrated, each of which is a type of dataset that a clinician would be capable of ordering and receiving results for in support of further diagnostic analysis. Claims 3-7, 16-20, and 25 recite additional details about the mathematical operations of the matrix-related data integration, including using an attention model, performing first and second co-attention mechanisms using different types of data as query, key, and value, using transformers and global attention pooling to generate features, and concatenating features and using fully connected layers. These limitations merely further describe the mathematical concepts utilized by the claims to integrate and evaluate the clinical data. Claims 13 and 24 specify example types of signature/prognostic marker, each of which is a type of diagnostic indicator that a clinician would be capable of determining and evaluating from clinical data as a basis for a patient report.
However, recitation of an abstract idea is not the end of the analysis. Each of the claims must be analyzed for additional elements that indicate the abstract idea is integrated into a practical application to determine whether the claim is considered to be “directed to” an abstract idea.
Step 2A – Prong 2
The judicial exception is not integrated into a practical application. In particular, independent claims 1, 14, and 22 do not include additional elements that integrate the abstract idea into a practical application. The additional elements of claim 1 include a computing device comprising a memory storing instructions and a processor configured to access the memory to execute the instructions and thereby be caused to perform the functions of the invention, as well as a display configured to display the report. Claim 14 includes the additional elements of a system comprising a processor for performing the functions of the method. Claim 22 includes the additional elements of a plurality of modality data sources that store the modality data, a computing system configured to perform the functions of the invention, and a display configured to display the report. These additional elements, when considered in the context of each claim as a whole, merely serve to automate/digitize the abstract idea and thus amount to instructions to “apply” the abstract idea using generic computer components (see MPEP 2106.05(f)). For example, a clinician could order and receive the results of various clinical modality tests, and use of electronic data sources and a processor to facilitate the receipt of datasets from a plurality of modalities merely digitizes this otherwise-abstract data communication function. Further, various matrix-based mathematical operations may be performed on the datasets to integrate them, and use of a computer processor to perform these otherwise-abstract functions merely invokes the computer as a tool with which to digitize and/or automate the abstract idea. 
Finally, a clinician is capable of generating and visualizing a clinical report based on a prognostic signature representing mathematically integrated clinical modalities, and use of a processor and display to perform these steps merely digitizes and/or automates these otherwise-abstract report generation and outputting steps such that they occur in a digital environment. Accordingly, claims 1, 14, and 22 as a whole are each directed to an abstract idea without integration into a practical application.
The judicial exception recited in dependent claims 2-13, 15-21, and 23-26 is also not integrated into a practical application under a similar analysis as above. The functions of these dependent claims are performed with the same additional elements introduced in the independent claims, without introducing any new additional elements of their own, and accordingly also amount to mere instructions to “apply” the abstract idea using these same additional elements.
Accordingly, the additional elements of claims 1-26 do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Claims 1-26 are directed to an abstract idea.
Step 2B
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of data sources, a computing device with a processor executing instructions stored in memory, and a display for performing the receiving, determining, integrating, generating, displaying, etc. steps of the invention amount to mere instructions to apply the exception using generic computer components. As evidence of the generic nature of the above recited additional elements, Examiner notes Fig. 1 and para. [0025] of Applicant’s specification, where the computing device is disclosed in terms of unspecified memory, processor, and display components, leaving one of ordinary skill in the art to understand that any known computing device with such generic features may be utilized to implement the invention. Examiner notes that the hardware configuration of the modality data sources is not specified anywhere in the specification, leaving one of ordinary skill in the art to understand that any known electronic means of data storage could be utilized as the data sources. Examiner further notes that receiving or transmitting data over a network and storing and retrieving information in memory (i.e. receiving data from modality data sources and outputting a report to a display), as well as performing repetitive calculations (i.e. determining matrices and hierarchically integrating the matrices) are recognized as well-understood, routine, and conventional computing functions previously known to the industry, as outlined in MPEP 2106.05(d)(II).
Further, the combination of these additional hardware elements is not expanded upon in the specification as a unique arrangement; instead, the specification relies on the knowledge of one of ordinary skill in the art, who would understand the combination of components within a computer system to be a well-known and generic combination for automating/digitizing an abstract idea. The combination thus does not provide an inventive concept. Additionally, the combination of clinical modality data sources and a processor-based computing device with a display used for multimodal data integration and diagnostic reporting is well-understood, routine, and conventional, as evidenced by at least Mahmood et al. (US 20220367053 A1) abstract & Figs. 1-4; Braman et al. (US 20220292674 A1) abstract & Figs. 1-2 & 10; Gao et al. (US 20250308663 A1) Figs. 22-25; and Klaiman et al. (US 20250046454 A1) paras. [0017]-[0023] & [0113].
Regarding the functional additional elements, as noted above, the steps of receiving datasets from the modality data sources and outputting the report to a display amount to insignificant extra-solution activity. These activities are also nothing more than those recognized as well-understood, routine, and conventional computer functions performed using generic computer components; for example, receiving or transmitting data over a network (i.e. receiving the modality datasets as in claims 1, 14, and 22), storing and retrieving information in memory, and performing repetitive calculations (i.e. determining the matrices of features and hierarchically integrating them) are each recognized as well-understood, routine, and conventional functions previously known to the industry, as outlined in MPEP 2106.05(d)(II). Thus, when considered as a whole and in combination, claims 1-26 are not patent eligible.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claim 14 is rejected under 35 U.S.C. 102(a)(1) as being anticipated by Mahmood et al. (US 20220367053 A1).
Claim 14
Mahmood teaches a method comprising:
receiving, by a system comprising a processor, a plurality of datasets from different modalities associated with a disease of a subject (Mahmood Fig. 4, [0005], [0052], [0067]-[0068], noting a processor-based system receives histopathology information and omics data associated with a subject’s disease from two different datastores);
determining, by the processor, a matrix of features for each of the plurality of data modalities (Mahmood Fig. 4, [0052], [0067]-[0068], noting determination of a matrix of features for each modality dataset);
integrating, by the processor, the matrix of features through a sequential, hierarchical structure to create an integrated prognostic signature for the subject (Mahmood Fig. 4, [0052], [0069], [0071], [0086]-[0087], noting the determined matrices are integrated into a multimodal tensor representing bimodal and trimodal interactions between each modality (i.e. an integrated prognostic signature for the subject) using Kronecker product and gating-based attention mechanisms (i.e. a sequential, hierarchical structure));
creating, by the processor, a report using the integrated prognostic signature indicating at least one of a diagnosis of the disease or a response of the subject to a treatment; and displaying, by the system, the report (Mahmood Fig. 4, [0005], [0052], [0070], noting the fused matrix is used as a basis for determining a prognosis and/or therapeutic response profile (i.e. a report) for the subject which may be displayed at a display device).
Claim Rejections - 35 USC § 103
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Mahmood.
Claim 1
Mahmood teaches a system comprising: a computing device comprising: a memory storing instructions; a processor configured to access the memory to execute the instructions and, thereby, be caused to (Mahmood [0005], [0046]):
receive at least two datasets from differing modalities associated with a disease of a subject (Mahmood Fig. 4, [0052], [0067]-[0068], noting the system receives histopathology information and omics data associated with a subject’s disease from two different datastores);
determine a matrix of features for each of the plurality of data modalities (Mahmood Fig. 4, [0052], [0067]-[0068], [0073], [0086], noting determination of a matrix of features for each modality dataset, resulting in at least three matrices of features);
integrate the matrix of features for each of the data modalities through a sequential, hierarchical structure to create an integrated prognostic signature for the subject (Mahmood Fig. 4, [0052], [0069], [0071], [0086]-[0087], noting the determined matrices are integrated into a multimodal tensor representing bimodal and trimodal interactions between each modality (i.e. an integrated prognostic signature for the subject) using Kronecker product and gating-based attention mechanisms (i.e. a sequential, hierarchical framework/structure));
generate a report using the integrated prognostic signature indicating at least one of a diagnosis of the disease or a response of the subject to a treatment; and a display configured to display the report (Mahmood Fig. 4, [0005], [0052], [0070], noting the fused matrix is used as a basis for determining a prognosis and/or therapeutic response profile (i.e. a report) for the subject which may be displayed at a display device).
In summary, Mahmood teaches a system for multimodal early fusion of feature matrices extracted from datasets received from differing clinical modalities for the purpose of outputting a prognosis or therapeutic response profile for a subject. Although Mahmood contemplates fusion of three feature matrices, the example embodiment only shows that the three feature matrices are extracted from two datasets received from two modalities. Accordingly, this reference fails to explicitly disclose receiving at least three datasets from differing modalities. However, Mahmood does further contemplate fusing additional data types (see [0039]: “a system 100 that can quantify the tumor microenvironment by fusing different data types (e.g., morphological information from histology and molecular information from omics, but can include more and/or alternate data types) using an algorithm that harnesses deep learning”). It therefore would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the exemplary two datasets received from two modalities to encompass three datasets received from three modalities in order to incorporate more and/or alternate data types into the data fusion framework (as suggested by Mahmood [0039]).
Claim 13
Mahmood teaches the system of claim 1, and further teaches wherein the signature includes at least one of a hazard score, survival prediction score, or a therapeutic response score (Mahmood [0070], [0090], [0103], noting prediction values related to risk (i.e. hazard), survival, and therapeutic response).
Claims 2-4, 8-12, 15-17, and 21-26 are rejected under 35 U.S.C. 103 as being unpatentable over Mahmood in view of Waqas et al. (Reference V on Pg 2 of the accompanying PTO-892).
Claim 2
Mahmood teaches the system of claim 1, and further teaches wherein the at least three datasets from differing modalities comprises a first modality including omics data, a second modality including histology embeddings, (Mahmood Fig. 4, [0041], noting omics and histology modalities).
Though Mahmood contemplates fusing different data types including histology and omics data, as well as “more and/or alternate data types” (see [0039]), it fails to explicitly disclose a third modality including radiology data. However, Waqas teaches fusing omics and histology modality datasets with at least a third modality dataset including radiology data for the purpose of making multi-modal clinical predictions (Waqas Pg 2, sections 1.1-1.1.2 & 1.2.3). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the multi-modal data fusion process of Mahmood to include use of radiology data as a third modality as in Waqas in order to incorporate “more and/or alternate data types” that are known to be relevant in cancer diagnosis, thereby offering a richer understanding of the complex prediction problem and enabling a predictive model to use complementary information provided by each modality to improve its learning (as suggested by Mahmood [0039] & Waqas Pg 2, sections 1.1 & 1.2.3).
Claim 3
Mahmood in view of Waqas teaches the system of claim 2, and the combination further teaches wherein the processor is further caused to integrate the first modality and the second modality using a multi-scale attention model (Mahmood [0024], [0073], [0087], noting a gating-based attention mechanism controls the expressiveness of features of each modality when integrating the modality matrices together, considered equivalent to a multi-scale attention model).
Claim 4
Mahmood in view of Waqas teaches the system of claim 3, and the combination further teaches use of attention mechanisms in the multimodal data fusion process (Mahmood [0024], [0073], [0087]). However, the present combination fails to explicitly disclose wherein the processor is further caused to perform a first co-attention mechanism using the omics data as a query and the histology data as a key and a value of the attention model to produce omic-directed histology embeddings. Waqas further teaches that a common method of fusing multimodal data includes use of transformers with cross-attention mechanisms where one modality is used as a query and the other modality is used as a key and value to produce a fused dataset (Waqas Fig. 11, sections 3.1-3.3). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the attention-based multimodal data fusion method of the combination to include cross-attention mechanisms between first and second modalities as in Waqas because cross-attention fusion is a relatively more flexible approach that allows the model to selectively attend to different modalities based on their relevance to the task and capture complex interactions between the modalities as compared to other attention-based fusion methods (as suggested by Waqas section 3.2.2). The result of such a combination would include use of a first modality of the combination (e.g. omics data) as a query and a second modality of the combination (e.g. histology data) as a key and value in a cross-attention (i.e. co-attention) fusion process, thereby producing omic-directed histology embeddings.
Claim 8
Mahmood in view of Waqas teaches the system of claim 2, and the combination further teaches wherein the histology embeddings include whole slide image patch embeddings of a disease sample (Mahmood [0026], [0052], [0067], noting the histology data includes whole-slide images of a part of a diseased portion of a subject (e.g. a tumor as in [0002]); see also Waqas section 1.1.2).
Claim 9
Mahmood in view of Waqas teaches the system of claim 2, and the combination further teaches wherein the omics data includes at least one of genome sequencing data, gene-expression data, and epigenomics data (Mahmood [0027]-[0030]; see also Waqas section 1.1.1).
Claim 10
Mahmood in view of Waqas teaches the system of claim 2, and the combination further teaches wherein the radiology data includes imaging slices (Waqas section 1.1.2).
Claim 11
Mahmood in view of Waqas teaches the system of claim 10, and the combination further teaches wherein the imaging scans include at least one of computed tomography (CT) or magnetic resonance (MR) images (Waqas section 1.1.2).
Claim 12
Mahmood in view of Waqas teaches the system of claim 2, but the present combination fails to explicitly disclose wherein the at least three datasets from differing modalities further comprises a fourth modality including patient data. However, Mahmood contemplates fusing “more and/or alternate data types” (see [0039]), and Waqas further teaches fusing omics, histology, and radiology modality datasets with at least a fourth modality source including patient clinical data for the purpose of making multi-modal clinical predictions (Waqas Pg 2, sections 1.1-1.1.3 & 1.2.3). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the multi-modal data fusion process of the combination to further include use of patient data from a fourth modality as in Waqas in order to incorporate “more and/or alternate data types” that are known to be relevant in cancer diagnosis, thereby offering a richer understanding of the complex prediction problem and enabling a predictive model to use complementary information provided by each modality to improve its learning (as suggested by Mahmood [0039] & Waqas Pg 2, sections 1.1 & 1.2.3).
Claim 15
Mahmood teaches the method of claim 14, and further teaches wherein the plurality of datasets comprises a first modality dataset including omics data, a second modality dataset including histology embeddings (Mahmood Fig. 4, [0041], noting omics and histology modalities).
Though Mahmood contemplates fusing different data types including histology and omics data, as well as “more and/or alternate data types” (see [0039]), it fails to explicitly disclose a third modality dataset including radiology data. However, Waqas teaches fusing omics and histology modality datasets with at least a third modality dataset including radiology data for the purpose of making multi-modal clinical predictions (Waqas Pg 2, sections 1.1-1.1.2 & 1.2.3). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the multi-modal data fusion process of Mahmood to include use of radiology data as a third modality dataset as in Waqas in order to incorporate “more and/or alternate data types” that are known to be relevant in cancer diagnosis, thereby offering a richer understanding of the complex prediction problem and enabling a predictive model to use complementary information provided by each modality to improve its learning (as suggested by Mahmood [0039] & Waqas Pg 2, sections 1.1 & 1.2.3).
Claim 16
Mahmood in view of Waqas teaches the method of claim 15, and the combination further teaches integrating, by the processor, the first modality dataset and the second modality dataset using a multi-scale attention model (Mahmood [0024], [0073], [0087], noting a gating-based attention mechanism controls the expressiveness of features of each modality when integrating the modality matrices together, considered equivalent to a multi-scale attention model).
Claim 17
Mahmood in view of Waqas teaches the method of claim 16, and the combination further teaches use of attention mechanisms in the multimodal data fusion process (Mahmood [0024], [0073], [0087]). The present combination nonetheless fails to explicitly disclose performing, by the processor, a first co-attention mechanism using the omics data as a query and the histology data as a key and a value of the attention model to produce omic-directed histology embeddings. However, Waqas further teaches that a common method of fusing multimodal data includes use of transformers with cross-attention mechanisms where one modality is used as a query and the other modality is used as a key and value to produce a fused dataset (Waqas Fig. 11, sections 3.1-3.3). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the attention-based multimodal data fusion method of the combination to include cross-attention mechanisms between first and second modalities as in Waqas because cross-attention fusion is a relatively more flexible approach that allows the model to selectively attend to different modalities based on their relevance to the task and capture complex interactions between the modalities as compared to other attention-based fusion methods (as suggested by Waqas section 3.2.2). The result of such a combination would include use of a first modality of the combination (e.g. omics data) as a query and a second modality of the combination (e.g. histology data) as a key and value in a cross-attention (i.e. co-attention) fusion process, thereby producing omic-directed histology embeddings.
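For context, the query/key/value cross-attention operation discussed in the mapping above can be sketched as follows. This is a purely illustrative sketch and forms no part of the legal analysis: the function names, token counts, and feature dimensions are assumptions of the sketch, not disclosures of either cited reference.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, kv_feats):
    """Attend from one modality (the query) onto another (the key/value)."""
    d_k = kv_feats.shape[-1]
    scores = query_feats @ kv_feats.T / np.sqrt(d_k)  # (Nq, Nk) affinity matrix
    weights = softmax(scores, axis=-1)                # each query row sums to 1
    return weights @ kv_feats                         # (Nq, d) fused output

rng = np.random.default_rng(0)
omics = rng.normal(size=(4, 8))       # e.g. 4 omics tokens of dimension 8
histology = rng.normal(size=(16, 8))  # e.g. 16 histology patch embeddings
omic_directed_histology = cross_attention(omics, histology)
print(omic_directed_histology.shape)  # (4, 8): one histology summary per omics token
```

In this sketch the omics tokens act as the query and the histology patch embeddings as the key and value, yielding what the claim terms omic-directed histology embeddings.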
Claim 21
Mahmood in view of Waqas teaches the method of claim 15, but the present combination fails to explicitly disclose wherein the plurality of datasets from differing modalities further comprises a fourth modality including patient data. However, Mahmood contemplates fusing “more and/or alternate data types” (see [0039]), and Waqas further teaches fusing omics, histology, and radiology modality datasets with at least a fourth modality including patient clinical data for the purpose of making multi-modal clinical predictions (Waqas Pg 2, sections 1.1-1.1.3 & 1.2.3). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the multi-modal data fusion process of the combination to further include use of patient data from a fourth modality as in Waqas in order to incorporate “more and/or alternate data types” that are known to be relevant in cancer diagnosis, thereby offering a richer understanding of the complex prediction problem and enabling a predictive model to use complementary information provided by each modality to improve its learning (as suggested by Mahmood [0039] & Waqas Pg 2, sections 1.1 & 1.2.3).
Claim 22
Mahmood teaches a system for multimodality fusion of medical data sources, comprising:
a plurality of modality data sources including a first modality data source including omics data, a second modality data source including histology embeddings (Mahmood Fig. 1, [0041], noting omics and histology modality data sources);
a computing system configured to (Mahmood [0005], [0046]):
receive datasets from the plurality of modality data sources (Mahmood Fig. 4, [0005], [0052], [0067]-[0068], noting the system receives data from the data sources);
determine a matrix of features for each of the plurality of modality data sources (Mahmood Fig. 4, [0052], [0067]-[0068], noting determination of a matrix of features for each modality dataset); and
integrate the matrix of features of each of the data modalities in a hierarchal learning network to generate a report related to a disease of a subject (Mahmood Fig. 4, [0005], [0052], [0069]-[0071], [0086]-[0087], noting the determined matrices are integrated into a multimodal tensor representing bimodal and trimodal interactions between each modality using Kronecker product and gating-based attention mechanisms (i.e. a hierarchical learning network), and the fused matrix is then used as a basis for determining a prognosis and/or therapeutic response profile (i.e. a report) for the subject); and
a display configured to display the report (Mahmood Fig. 1, [0005], [0052], noting the determined prognosis and/or therapeutic response profile (i.e. report) may be displayed at a display device).
In summary, Mahmood teaches a system for multimodal early fusion of feature matrices extracted from datasets received from differing clinical modalities for the purpose of outputting a prognosis or therapeutic response profile for a subject. Though Mahmood contemplates fusing different data types including histology and omics data, as well as “more and/or alternate data types” (see [0039]), it fails to explicitly disclose a third modality data source including radiology data. However, Waqas teaches fusing omics and histology modality datasets with at least a third modality source dataset including radiology data for the purpose of making multi-modal clinical predictions (Waqas Pg 2, sections 1.1-1.1.2 & 1.2.3). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the multi-modal data fusion process of Mahmood to include use of radiology data from a third modality data source as in Waqas in order to incorporate “more and/or alternate data types” that are known to be relevant in cancer diagnosis, thereby offering a richer understanding of the complex prediction problem and enabling a predictive model to use complementary information provided by each modality to improve its learning (as suggested by Mahmood [0039] & Waqas Pg 2, sections 1.1 & 1.2.3).
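For context, the Kronecker-product fusion with gating-based attention referenced in the claim 22 mapping can be sketched as follows. This is a purely illustrative sketch: the sigmoid gating form, the appended constant term, and all dimensions are assumptions of the sketch rather than disclosures of Mahmood.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_kronecker_fusion(h_omics, h_hist, h_rad, gates):
    """Gate each unimodal feature vector, append a 1 so lower-order terms
    survive, then take the Kronecker product so the fused tensor captures
    unimodal, bimodal, and trimodal interactions."""
    vecs = []
    for h, g in zip((h_omics, h_hist, h_rad), gates):
        gated = sigmoid(g) * h              # gate controls modality expressiveness
        vecs.append(np.append(gated, 1.0))  # trailing 1 preserves lower-order terms
    fused = vecs[0]
    for v in vecs[1:]:
        fused = np.kron(fused, v)           # Kronecker product across modalities
    return fused

rng = np.random.default_rng(1)
h_o, h_h, h_r = rng.normal(size=4), rng.normal(size=4), rng.normal(size=4)
gates = rng.normal(size=(3, 4))
fused = gated_kronecker_fusion(h_o, h_h, h_r, gates)
print(fused.shape)  # (125,) = 5*5*5 entries spanning uni-, bi-, and trimodal terms
```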
Claim 23
Mahmood in view of Waqas teaches the system of claim 22, and the combination further teaches wherein: the omics data comprises at least one of genome sequencing data, gene-expression data, and epigenomics data (Mahmood [0027]-[0030]; see also Waqas section 1.1.1); the histology data comprises hematoxylin and eosin-stained resected tumor whole slide images (Mahmood [0026], [0052], [0067], noting the histology data includes whole-slide images of a part of a diseased portion of a subject (e.g. a tumor as in [0002]), including H&E slides as in the title of section “A” above [0074] & para. [0093]; see also Waqas section 1.1.2); and the radiology data comprises at least one of computed tomography images and magnetic resonance images (Waqas section 1.1.2).
Claim 24
Mahmood in view of Waqas teaches the system of claim 22, and the combination further teaches wherein the report includes an integrated marker that is diagnostic of a disease or prognostic of disease outcome (Mahmood Fig. 4, [0052], [0070], [0090], [0103], noting the determined prediction is a resulting integrated indicator of diagnosis or prognostic outcome like risk stratification, survival, and therapeutic response).
Claim 25
Mahmood in view of Waqas teaches the system of claim 22, and the combination further teaches wherein the computing system is further configured to integrate the matrix of features of each dataset received from the plurality of modality data sources in a hierarchal fashion following a micro-to-macro view of a condition or disease (Mahmood [0082], [0087], noting matrix features representing a hierarchical topology of the tumor micro-environment from fine-grained to coarser-grained views may be integrated as part of the data fusion; see also Waqas sections 2.5 & 3.2.3, noting hierarchical learning frameworks for data fusion and analysis so that low-level and high-level features are learned sequentially across data scales, e.g. as depicted in Figs. 2-4).
Claim 26
Mahmood in view of Waqas teaches the system of claim 22, but the present combination fails to explicitly disclose wherein the computing system is further configured to determine a matrix of features for a fourth modality data source comprising patient data, and integrate the matrix of features of the fourth modality data source with the matrix integrations of the first, second, and third modality data sources. However, Mahmood contemplates fusing “more and/or alternate data types” (see [0039]), and Waqas further teaches fusing omics, histology, and radiology modality datasets with at least a fourth modality source dataset including patient clinical data for the purpose of making multi-modal clinical predictions (Waqas Pg 2, sections 1.1-1.1.3 & 1.2.3). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the multi-modal data fusion process of the combination to further include use of patient data from a fourth modality data source as in Waqas in order to incorporate “more and/or alternate data types” that are known to be relevant in cancer diagnosis, thereby offering a richer understanding of the complex prediction problem and enabling a predictive model to use complementary information provided by each modality to improve its learning (as suggested by Mahmood [0039] & Waqas Pg 2, sections 1.1 & 1.2.3).
Claims 5-7 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Mahmood and Waqas as applied to claims 1-4 or 14-17 above, and further in view of Zhang et al. (Reference U on Pg 2 of the accompanying PTO-892).
Claim 5
Mahmood in view of Waqas teaches the system of claim 4, showing a multimodal data fusion process utilizing cross-attention mechanisms for two modalities, and the combination further notes that such cross-attention fusion processes “can be extended to multiple modalities” (Waqas section 3.2). Waqas further contemplates hierarchical learning frameworks for data fusion and analysis so that low-level and high-level features are learned sequentially across data scales (Waqas sections 2.5 & 3.2.3). However, the present combination does not appear to specify that the cross-attention fusion includes using a first fused dataset as a query for a second cross-attention mechanism with a third modality as the key and value, and thus the present combination fails to explicitly disclose wherein the processor is further caused to perform a second co-attention mechanism using the omic-directed histology embeddings as a query and the radiology data as a key and a value of the attention model to produce omic-histology-directed radiology data.
However, Zhang teaches a hierarchical cascading framework for integrating multiple modalities using cross-attention data fusion mechanisms such that the results of a first cross-attention fusion of two modalities can be utilized in a second cross-attention fusion step with a third modality (Zhang Fig. 3, section III.C). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the scalable cross-attention data fusion methods of the combination to include the hierarchical cascading fusion framework as in Zhang in order to gradually embed representative information from the deeper feature layers into the shallower features, thereby enhancing the role of jump connections to increase information use and filter out redundant information of the shallow features, reducing the pressure of removing noisy information at the end of the network (as suggested by Zhang section III.C). The result of such a combination would include a second cross-attention fusion step (as in Fig. 3 of Zhang) using queries, keys, and values (as in Fig. 11 of Waqas) being implemented with the first fused data set (i.e. the omic-directed histology embeddings of the combination) and a third modality (e.g. radiology data) to produce omic-histology-directed radiology data reflective of deep-to-shallow embeddings (as in Fig. 3 of Zhang, and as depicted in Figs. 2-4 of Waqas in the context of oncology-related modalities).
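For context, the cascaded cross-attention fusion described in the combination above can be sketched as follows. This is a purely illustrative sketch and forms no part of the legal analysis: the shapes and token counts are assumptions of the sketch, not disclosures of Zhang or Waqas.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q, kv):
    """One modality (q) attends onto another (kv) via query/key/value."""
    scores = q @ kv.T / np.sqrt(kv.shape[-1])
    return softmax(scores, axis=-1) @ kv

rng = np.random.default_rng(2)
omics = rng.normal(size=(4, 8))
histology = rng.normal(size=(16, 8))
radiology = rng.normal(size=(10, 8))

# Step 1: omics queries histology -> omic-directed histology embeddings.
omic_hist = cross_attention(omics, histology)
# Step 2: the step-1 output queries radiology (key/value) ->
# omic-histology-directed radiology data, as in the cascaded framework.
omic_hist_rad = cross_attention(omic_hist, radiology)
print(omic_hist_rad.shape)  # (4, 8)
```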
Claim 6
Mahmood in view of Waqas and Zhang teaches the system of claim 5, and the combination further teaches wherein the processor is further caused to separately aggregate, using a transformer for each, the omic data, omic-directed histology embeddings, and the omic-histology-directed data via global attention pooling to produce one or more features for each of the plurality of data modalities (Zhang Figs. 2-3, sections III.B-C, noting each level of hierarchy is separately passed through its own transformer and global pooling operations are performed to result in an overall output).
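For context, the global attention pooling referenced in the claim 6 mapping can be sketched as follows. This is a purely illustrative sketch: the learned scoring vector and all dimensions are assumptions of the sketch, not disclosures of the cited references.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def global_attention_pool(tokens, score_vec):
    """Weight each token by a learned scalar score, then sum to one vector."""
    weights = softmax(tokens @ score_vec)  # (N,) attention weights over tokens
    return weights @ tokens                # (d,) single pooled feature vector

rng = np.random.default_rng(3)
score_vec = rng.normal(size=8)  # stand-in for a learned scoring vector
# One stream per modality-level representation, each pooled separately.
streams = {name: rng.normal(size=(n, 8))
           for name, n in [("omics", 4), ("omic_hist", 4), ("omic_hist_rad", 4)]}
pooled = {name: global_attention_pool(t, score_vec) for name, t in streams.items()}
print({k: v.shape for k, v in pooled.items()})  # each stream -> one (8,) vector
```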
Claim 7
Mahmood in view of Waqas and Zhang teaches the system of claim 6, and the combination further teaches wherein the processor is further caused to concatenate the one or more features and into a plurality of fully connected layers to produce the report (Zhang Figs. 2-3, sections III.B-C, noting results of each level of hierarchy are combined (i.e. concatenated) into various layers (i.e. fully connected layers) to result in an overall output (i.e. the report when considered in the context of the combination)).
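For context, the final concatenation-into-fully-connected-layers stage referenced in the claim 7 mapping can be sketched as follows. This is a purely illustrative sketch: the layer sizes, ReLU nonlinearity, and scalar output are assumptions of the sketch, not disclosures of the cited references.

```python
import numpy as np

rng = np.random.default_rng(4)
pooled = [rng.normal(size=8) for _ in range(3)]   # one feature vector per stream
x = np.concatenate(pooled)                        # (24,) concatenated features

W1, b1 = rng.normal(size=(16, 24)), np.zeros(16)  # first fully connected layer
W2, b2 = rng.normal(size=(1, 16)), np.zeros(1)    # output layer (e.g. a risk score)

h = np.maximum(0.0, W1 @ x + b1)                  # ReLU hidden activation
report_score = (W2 @ h + b2).item()               # scalar output ("report")
print(type(report_score))  # <class 'float'>
```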
Claim 18
Mahmood in view of Waqas teaches the method of claim 17, showing a multimodal data fusion process utilizing cross-attention mechanisms for two modalities, and the combination further notes that such cross-attention fusion processes “can be extended to multiple modalities” (Waqas section 3.2). Waqas further contemplates hierarchical learning frameworks for data fusion and analysis so that low-level and high-level features are learned sequentially across data scales (Waqas sections 2.5 & 3.2.3). However, the present combination does not appear to specify that the cross-attention fusion includes using a first fused dataset as a query for a second cross-attention mechanism with a third modality as the key and value, and thus the present combination fails to explicitly disclose performing, by the processor, a second co-attention mechanism using the omic-directed histology embeddings as a query and the radiology data as a key and a value of the attention model to produce omic-histology-directed radiology data.
However, Zhang teaches a hierarchical cascading framework for integrating multiple modalities using cross-attention data fusion mechanisms such that the results of a first cross-attention fusion of two modalities can be utilized in a second cross-attention fusion step with a third modality (Zhang Fig. 3, section III.C). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the scalable cross-attention data fusion methods of the combination to include the hierarchical cascading fusion framework as in Zhang in order to gradually embed representative information from the deeper feature layers into the shallower features, thereby enhancing the role of jump connections to increase information use and filter out redundant information of the shallow features, reducing the pressure of removing noisy information at the end of the network (as suggested by Zhang end of section III.C). The result of such a combination would include a second cross-attention fusion step (as in Fig. 3 of Zhang) using queries, keys, and values (as in Fig. 11 of Waqas) being implemented with the first fused data set (i.e. the omic-directed histology embeddings of the combination) and a third modality (e.g. radiology data) to produce omic-histology-directed radiology data reflective of deep-to-shallow embeddings (as in Fig. 3 of Zhang, and as depicted in Figs. 2-4 of Waqas in the context of oncology-related modalities).
Claim 19
Mahmood in view of Waqas and Zhang teaches the method of claim 18, and the combination further teaches by the processor, separately aggregating, using a transformer for each, the omic data, omic-directed histology embeddings, and the omic-histology-directed data via global attention pooling to produce one or more features for each of the plurality of data modalities (Zhang Figs. 2-3, sections III.B-C, noting each level of hierarchy is separately passed through its own transformer and global pooling operations are performed to result in an overall output).
Claim 20
Mahmood in view of Waqas and Zhang teaches the method of claim 19, and the combination further teaches by the processor, concatenating the one or more features and into a plurality of fully connected layers to produce the integrated prognostic signature (Zhang Figs. 2-3, sections III.B-C, noting results of each level of hierarchy are combined (i.e. concatenated) into various layers (i.e. fully connected layers) to result in an overall output (i.e. the integrated prognostic signature when considered in the context of the combination)).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Gil et al. (WO 2024146849 A1), Li et al. (CN 117422704 A), Yu et al. (CN 117952966 A), Braman et al. (Reference U on the accompanying PTO-892), Chen et al. (Reference V on the accompanying PTO-892), Wu et al. (Reference W on the accompanying PTO-892), and Huo et al. (Reference X on the accompanying PTO-892) each describe systems and/or methods for integrating attributes from various clinical modalities at different levels of granularity.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KAREN A HRANEK whose telephone number is (571)272-1679. The examiner can normally be reached M-F 8:00-4:00 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Shahid Merchant, can be reached at 571-270-1360. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/KAREN A HRANEK/ Primary Examiner, Art Unit 3684