Last updated: May 29, 2026
Application No. 18/550,267
METHOD OF PROCESSING MEDICAL DATA, METHOD OF ANALYZING MEDICAL DATA, ELECTRONIC DEVICE, AND MEDIUM

Non-Final OA §103
Filed
Sep 12, 2023
Priority
Nov 11, 2022, nonprovisional of PCT/CN2022/131414
Examiner
ZAK, JACQUELINE ROSE
Art Unit
2666
Tech Center
2600 — Communications
Assignee
BOE TECHNOLOGY GROUP CO., LTD.
OA Round
2 (Non-Final)
Interview Optional

Interview lift (-4.5%) is below the 15.0% threshold. A written response is recommended.
Based on 17 resolved cases, 2023–2026
Examiner Intelligence

ZAK, JACQUELINE ROSE
Grants 53% of resolved cases
Career Allowance Rate
9 granted / 17 resolved
-9.1% vs TC avg
Interview Lift: -4.5% (minimal) across resolved cases with interview
Typical timeline
3y 1m
Avg Prosecution
31 currently pending
Statute-Specific Performance

§103
94.2%
+54.2% vs TC avg
§102
5.1%
-34.9% vs TC avg
§112
0.7%
-39.3% vs TC avg
Based on career data from 17 resolved cases
Office Action

§103
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Status
Claims 1-6, 8-10, 12-15, 18-19, 22, 25, and 28-29 are pending for examination in the application filed 12/05/2025. Claim 7 has been newly cancelled and claims 1, 8, 14-15, 18-19, 22, and 25 have been newly amended. 

Priority
Acknowledgement is made of the present application as a national stage entry of PCT/CN2022/131414, international filing date: 11/11/2022. 

Response to Arguments and Amendments
The 35 U.S.C. 112(b) rejections of claims 14-15, 18-19, 22, 25, and 28 have been withdrawn in view of the amendments. Applicant's arguments filed 12/05/2025 have been fully considered but they are not persuasive. 
Applicant argues on pages 17-20 of the Remarks filed 12/05/2025 that Wang fails to disclose claim 1 as amended, which incorporates the limitations of previous dependent claim 7. Applicant specifically argues on page 19 of the Remarks that “Wang does not disclose or suggest that the feature(s) of the plurality of images themselves can characterize the correlation relationships/ information between these images, nor does Wang disclose or suggest that the feature(s) of the plurality of images are obtained according to an image weight matrix that characterizes these images”. 
As stated on page 5 of the non-final rejection filed 09/16/2025, Wang teaches:
determine a first image weight matrix according to the first image query matrix and the first image key matrix, wherein the first image weight matrix represents a correlation information between each two first medical images in the first medical image data ([0094] In at least one embodiment, a linear 210, a linear 212, and/or a linear 214 perform various processes to project inputs to one or more dimensions. In at least one embodiment, a linear 210, a linear 212, and/or a linear 214 utilize various weight matrices, also referred to as parameter matrices. In at least one embodiment, one or more systems determine values for weight matrices through one or more training processes. In at least one embodiment, one or more systems determine values for weight matrices through any suitable process, such as using various training processes, functions, random number generation processes, pre-defined values, logic, rules, heuristics, and/or variations thereof. In at least one embodiment, a linear 210 multiplies a Q 204 by a first weight matrix, a linear 212 multiplies a K 206 by a second weight matrix, and a linear 214 multiplies a V 208 by a third weight matrix. [0100] FIG. 3 illustrates an example 300 of a cross correlation module. [0107] In at least one embodiment, a pair matching 320 is a collection of one or more hardware and/or software computing resources with instructions that, when executed, performs one or more image and/or text correlation processes); 
and determine the first image feature according to the first image weight matrix and the first medical image data ([0084] In at least one embodiment, a cross correlation module 110 outputs text features 112, denoted by F.sub.txt, to a text decoder 114 and image features 128, denoted by F.sub.img, to a multi-scale image decoder 130…In at least one embodiment, an image features 128 is a collection of data indicating various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. In at least one embodiment, an image features 128 comprises one or more vectors that represent various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. [0070] image features 128, which is processed by a multi-scale image decoder 130 to calculate decoded image features 132).

Thus, Wang teaches “feature(s) of the plurality of images themselves can characterize the correlation relationships/ information between these images” because the correlation determined is the similarity between the learned feature representations. These feature representations are obtained by applying weight matrices that project image features into a learned feature space. Thus, Wang further teaches “feature(s) of the plurality of images are obtained according to an image weight matrix that characterizes these images”. Clarification has been added in the updated 35 USC § 103 rejection to distinguish between these preliminary image features and learned feature representations. 
Applicant further argues on pages 19-20 of the Remarks that Wang teaches correlation between text and an image, however, as cited in the non-final office action, [Wang 0107] In at least one embodiment, a pair matching 320 is a collection of one or more hardware and/or software computing resources with instructions that, when executed, performs one or more image and/or text correlation processes. Thus, Wang teaches correlation processes between images. 
Applicant argues on page 20 of the Remarks that Wang does not teach a plurality of temporally sequential images. Wang was not cited to teach this limitation in the non-final office action. Please see the updated 35 USC § 103 rejection which incorporates the newly added amendments. Applicant further argues on page 21 of the Remarks that Avinash does not teach inputting the plurality of first medical images into the second feature extraction module to obtain a first temporal image feature, wherein the first temporal image feature represents a temporal relationship between the plurality of first medical images, as previously stated in dependent claim 7 and as stated now in amended claim 1. 
As stated on page 15 of the non-final rejection filed 09/16/2025:
Avinash teaches inputting the plurality of first medical images into the second feature extraction module to obtain a first temporal image feature, wherein the first temporal image feature represents a temporal relationship between the plurality of first medical images ([0270] A single-type, multi-modality medical system, in the present context, may consist of any of the columns of the FIG. 8. In FIG. 7, a diagrammatical representation a single-type, multi-modality system with the temporal attributes is illustrated, considering M modalities at N different time points…The temporal aspects of a medical event are also considered in the context, such as to modify acquisition, processing and analysis modules based on the temporal attributes of the data. [0280] The acquisition/storage module contains acquired medical data. For temporal change analysis, means are provided to access the data from storage corresponding to an earlier time point. To simplify notation in the subsequent discussion we describe only two time points t1 and t2, even though the general approach can be extended for any type of medical data in the acquisition and temporal sequence. The segmentation module provides automated or manual means for isolating features, volumes, regions, lines, and/or points of interest. In many cases of practical interest, the entire data can be the output of the segmentation module).

Applicant agrees that Avinash considers M modalities of information/ data at N different time points, but argues that it is unrelated to calculating the temporal image features of the plurality of images and obtaining image features based on these temporal features so as to accurately and comprehensively characterize these images. As stated above, Avinash specifically discusses temporal change analysis of extracted features. Furthermore, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., Inc., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). Where a rejection of a claim is based on two or more references, a reply that is limited to what a subset of the applied references teaches or fails to teach, or that fails to address the combined teaching of the applied references may be considered to be an argument that attacks the reference(s) individually. Please see below for the updated 35 USC § 103 rejections facilitated by the newly added amendments. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-6, 8-9, 12-13 and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Wang (US20230019211A1) in view of Decuyper (Decuyper, Milan, et al. "Automated MRI based pipeline for segmentation and prediction of grade, IDH mutation and 1p19q co-deletion in glioma." Computerized Medical Imaging and Graphics 88 (2021): 101831) and Avinash (US20070118399A1). 

Regarding claim 1, Wang teaches a method of processing medical data (Fig. 10), comprising: acquiring first medical image data ([0070] In at least one embodiment, a pre-training framework obtains or otherwise receives as input an image 120. [0073] In at least one embodiment, an image 120 is a medical image and depicts various structures, such as skeletal structures, organs, tissues, anomalies, and/or variations thereof), wherein the first medical image data comprises a plurality of first medical images (Fig. 10. 1002: obtain text and one or more images); 
inputting the first medical image data into a first feature extraction network to obtain a first image feature ([0070] In at least one embodiment, a pre-training framework obtains or otherwise receives as input an image 120, calculates an image embedding 122, which is input to a multi-scale image encoder 124 to calculate an image embedding 126, which is processed by a cross correlation module 110 to calculate image features 128); 
and obtaining a first gene mutation information according to the first image feature ([0080] In at least one embodiment, an image encoding 126, also referred to as image features, encoded features, and/or variations thereof, is a collection of data indicating various characteristics, features, aspects, and/or variations thereof, of an image embedding 122. In at least one embodiment, an image encoding 126 comprises one or more vectors that represent various characteristics, features, aspects, and/or variations thereof, of an image embedding 122. [0562] Examples of genomic analyses that may be performed using systems and processes described herein include, without limitation, variant calling, mutation detection, and gene expression quantification),
wherein the first feature extraction network comprises a first feature extraction module ([0092] FIG. 2 illustrates an example 200 of a self-attention module, according to at least one embodiment. In at least one embodiment, a self-attention module (SAM) 202 is in accordance with those described elsewhere in this disclosure) configured to: determine a first image query matrix and a first image key matrix according to the first medical image data ([0093] In at least one embodiment, inputs to a SAM 202 include a Q 204, also referred to as a query, a K 206, also referred to as a key, and a V 208, also referred to as a value. In at least one embodiment, a Q 204, a K 206, and a V 208 are each a matrix); 
determine a first image weight matrix according to the first image query matrix and the first image key matrix, wherein the first image weight matrix represents a correlation information between each two first medical images in the first medical image data ([0094] In at least one embodiment, a linear 210, a linear 212, and/or a linear 214 perform various processes to project inputs to one or more dimensions. In at least one embodiment, a linear 210, a linear 212, and/or a linear 214 utilize various weight matrices, also referred to as parameter matrices. In at least one embodiment, one or more systems determine values for weight matrices through one or more training processes. In at least one embodiment, one or more systems determine values for weight matrices through any suitable process, such as using various training processes, functions, random number generation processes, pre-defined values, logic, rules, heuristics, and/or variations thereof. In at least one embodiment, a linear 210 multiplies a Q 204 by a first weight matrix, a linear 212 multiplies a K 206 by a second weight matrix, and a linear 214 multiplies a V 208 by a third weight matrix. [0100] FIG. 3 illustrates an example 300 of a cross correlation module. [0107] In at least one embodiment, a pair matching 320 is a collection of one or more hardware and/or software computing resources with instructions that, when executed, performs one or more image and/or text correlation processes); 
and determine the first image feature according to the first image weight matrix and the first medical image data ([0084] In at least one embodiment, a cross correlation module 110 outputs text features 112, denoted by F.sub.txt, to a text decoder 114 and image features 128, denoted by F.sub.img, to a multi-scale image decoder 130…In at least one embodiment, an image features 128 is a collection of data indicating various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. In at least one embodiment, an image features 128 comprises one or more vectors that represent various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. [0070] image features 128, which is processed by a multi-scale image decoder 130 to calculate decoded image features 132);
wherein the obtaining a first gene mutation information according to the first image feature comprises inputting the first image feature to a classification network ([0126] In at least one embodiment, one or more systems utilize one or more neural network models trained in connection with a pre-training framework for tasks such as disease classification, similarity search (e.g., patient study retrieval), and image regeneration. In at least one embodiment, disease classification refers to one or more tasks that classify potential diseases from image and/or text data. [0562] Examples of genomic analyses that may be performed using systems and processes described herein include, without limitation, variant calling, mutation detection, and gene expression quantification);
wherein the first feature extraction network further comprises a second feature extraction module (Fig. 10. 1002: obtain text and one or more images, 1004: calculate features of text and one or more images. [0183] In at least one embodiment, one or more systems depicted in FIG. 13 are utilized to use one or more neural networks to indicate an extent to which text corresponds to one or more images. [0070] In at least one embodiment, a pre-training framework obtains or otherwise receives as input an image 120, calculates an image embedding 122, which is input to a multi-scale image encoder 124 to calculate an image embedding 126, which is processed by a cross correlation module 110 to calculate image features 128); 
and wherein determining the first image query matrix and the first image key matrix according to the first medical image data comprises: inputting the first preliminary image feature into the first feature extraction module to obtain the first image query matrix and the first image key matrix ([0094] In at least one embodiment, a linear 210 multiplies a Q 204 by a first weight matrix, a linear 212 multiplies a K 206 by a second weight matrix, and a linear 214 multiplies a V 208 by a third weight matrix. [0100] FIG. 3 illustrates an example 300 of a cross correlation module. [0107] In at least one embodiment, a pair matching 320 is a collection of one or more hardware and/or software computing resources with instructions that, when executed, performs one or more image and/or text correlation processes. [0084] In at least one embodiment, a cross correlation module 110 outputs text features 112, denoted by F.sub.txt, to a text decoder 114 and image features 128, denoted by F.sub.img, to a multi-scale image decoder 130…In at least one embodiment, an image features 128 is a collection of data indicating various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. In at least one embodiment, an image features 128 comprises one or more vectors that represent various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. [0070] image features 128, which is processed by a multi-scale image decoder 130 to calculate decoded image features 132).
Wang does not teach wherein the first medical image data comprises a brain glioma image; to obtain a brain glioma gene mutation type.
Decuyper, in the same field of endeavor of glioma image analysis, teaches wherein the first medical image data comprises a brain glioma image ([pg. 5 para. 1] To acquire a large dataset, we collected data from multiple public databases: the TCGA-GBM (Scarpace et al., 2016), TCGA-LGG (Pedano et al., 2016) and LGG-1p19qDeletion (Erickson et al., 2017) collections on The Cancer Imaging Archive (TCIA) (Clark et al., 2013) and the BraTS 2019 dataset. Inclusion criteria were: a histologically proven glioma of WHO grade II, III or IV, the availability of pre-operative T1ce MRI together with a T2 and/or FLAIR sequence of sufficient quality and information on WHO grade, IDH mutation and 1p19q co-deletion status);
to obtain a brain glioma gene mutation type ([pg. 6 para 2] 3.2. Glioma classification. In Table 4 the results are presented of the multi-task classification network. For each task (WHO grade, IDH mutation and 1p19q co-deletion status) the AUC, Matthews Correlation Coefficient (MCC), accuracy, sensitivity and specificity scores are included). 
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Decuyper to determine a glioma mutation type because "Glioblastoma multiforme, WHO grade IV, is the most aggressive type and has a very poor prognosis with a 5-year survival rate of only 5.6%. In contrast, lower-grade glioma (WHO grade II and III) have more favorable survival rates up to 81.6% and 57.6% respectively" [Decuyper pg. 1 para. 1]. 
Wang does not teach inputting the plurality of first medical images into the second feature extraction module to obtain a first temporal image feature, wherein the first temporal image feature represents a temporal relationship between the plurality of first medical images, wherein the first preliminary image feature is a temporal image feature.
Avinash teaches inputting the plurality of first medical images into the second feature extraction module to obtain a first temporal image feature, wherein the first temporal image feature represents a temporal relationship between the plurality of first medical images, wherein the first preliminary image feature is a temporal image feature ([0270] A single-type, multi-modality medical system, in the present context, may consist of any of the columns of the FIG. 8. In FIG. 7, a diagrammatical representation a single-type, multi-modality system with the temporal attributes is illustrated, considering M modalities at N different time points…The temporal aspects of a medical event are also considered in the context, such as to modify acquisition, processing and analysis modules based on the temporal attributes of the data. [0280] The acquisition/storage module contains acquired medical data. For temporal change analysis, means are provided to access the data from storage corresponding to an earlier time point. To simplify notation in the subsequent discussion we describe only two time points t1 and t2, even though the general approach can be extended for any type of medical data in the acquisition and temporal sequence. The segmentation module provides automated or manual means for isolating features, volumes, regions, lines, and/or points of interest. In many cases of practical interest, the entire data can be the output of the segmentation module).
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Avinash to extract temporal features for "performing temporal analysis on multiple-type data to fully characterize the medical condition in question" [Avinash 0278].
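For illustration only, the query matrix, key matrix, and image weight matrix limitations discussed for claim 1 can be sketched as scaled dot-product self-attention over per-image feature vectors. This is a minimal sketch under stated assumptions, not the implementation of Wang, Avinash, or the applicant: the projection matrices `w_q` and `w_k`, the softmax normalization, and the example feature values are all hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a matrix."""
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Normalize one row of scores so that it sums to 1."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def image_attention(features, w_q, w_k):
    """features holds one row per image (N x d).

    Returns the N x N weight matrix, whose entry (i, j) scores how strongly
    image i relates to image j, and the N x d attended image features.
    """
    q = matmul(features, w_q)            # query matrix
    k = matmul(features, w_k)            # key matrix
    scale = math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))     # pairwise image-to-image scores
    weights = [softmax([s / scale for s in row]) for row in scores]
    attended = matmul(weights, features)  # feature from weight matrix and image data
    return weights, attended


# Hypothetical preliminary features for three images in a sequence.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]      # hypothetical projection matrices
weights, attended = image_attention(features, identity, identity)
```

In this sketch the weight matrix is computed between images only, which is the point of dispute in the Response to Arguments: whether the cited correlation is image-to-image or image-to-text.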

Regarding claim 2, Wang, Decuyper, and Avinash teach the method of claim 1. Wang further teaches wherein the first medical image data comprises a plurality of kinds of mono-modality medical image data, and the first feature extraction network comprises first feature extraction sub-networks respectively corresponding to the plurality of kinds of mono-modality medical image data ([0610] In at least one embodiment, and with reference to FIGS. 43A-43B, deployment system 4006 may be implemented as one or more virtual instruments to perform different functionalities (such as image processing, segmentation, enhancement, AI, visualization, and inferencing) with imaging devices (e.g., CT scanners, X-ray machines, MRI machines, etc.), sequencing devices, genomics devices, and/or other device types);
wherein the inputting the first medical image data into a first feature extraction network to obtain a first image feature comprises: inputting the plurality of kinds of mono-modality medical image data respectively into the first feature extraction sub-networks corresponding to the plurality of kinds of mono-modality medical image data, so as to obtain a plurality of mono-modality image features ([0056] FIG. 43A includes an example data flow diagram of a virtual instrument supporting an ultrasound device, in accordance with at least one embodiment. [0057] FIG. 43B includes an example data flow diagram of a virtual instrument supporting an CT scanner, in accordance with at least one embodiment);
and wherein the obtaining a first gene mutation information according to the first image feature comprises: performing feature concatenating on the plurality of mono-modality image features to obtain a concatenated image feature ([0131] FIG. 5 illustrates an example 500 of results using a pre-training framework, according to at least one embodiment…In at least one embodiment, ResNet50 indicates a CNN based disease classifier. In at least one embodiment, for classification using both image and text reports, denoted as “img&txt,” one or more systems directly extract a textual feature via a pre-trained model such as a Biobert model and concatenate it together with an output of a layer such as a pool5 in ResNet-50, although any suitable layer of any suitable neural network model can be utilized, for a final classification); and inputting the concatenated image feature into a first classification network to obtain the first gene mutation information ([0126] In at least one embodiment, one or more systems utilize one or more neural network models trained in connection with a pre-training framework for tasks such as disease classification, similarity search (e.g., patient study retrieval), and image regeneration. In at least one embodiment, disease classification refers to one or more tasks that classify potential diseases from image and/or text data. [0562] Examples of genomic analyses that may be performed using systems and processes described herein include, without limitation, variant calling, mutation detection, and gene expression quantification).
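For illustration only, the feature concatenating step recited in claim 2 can be sketched as joining one feature vector per modality and passing the result to a single classifier. This is a minimal sketch under stated assumptions: the modality names, feature values, classifier weights, and two-class output are hypothetical and are not taken from Wang or from the application.

```python
import math


def concatenate(features_by_modality, order):
    """Join per-modality feature vectors end to end in a fixed order."""
    joined = []
    for name in order:
        joined.extend(features_by_modality[name])
    return joined


def classify(x, weights, bias):
    """Linear layer followed by softmax; returns one probability per class."""
    logits = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical mono-modality image features, one vector per modality.
features_by_modality = {"T1ce": [0.2, 0.8], "FLAIR": [0.5]}
concatenated = concatenate(features_by_modality, ["T1ce", "FLAIR"])

# Hypothetical classifier parameters for two classes (e.g. wild-type, mutant).
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
bias = [0.0, 0.0]
probabilities = classify(concatenated, weights, bias)
```

The fixed `order` argument matters because the classifier weights are positional: the same modality must always occupy the same slice of the concatenated vector.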

Regarding claim 3, Wang, Decuyper, and Avinash teach the method of claim 1. Wang further teaches the first feature extraction network comprises a plurality of first feature extraction sub-networks; wherein the inputting the first medical image data into a first feature extraction network to obtain a first image feature comprises: inputting the medical image data into the plurality of first feature extraction sub-networks to obtain a plurality of image features (Fig. 10. 1002: obtain text and one or more images, 1004: calculate features of text and one or more images. [0183] In at least one embodiment, one or more systems depicted in FIG. 13 are utilized to use one or more neural networks to indicate an extent to which text corresponds to one or more images. [0070] In at least one embodiment, a pre-training framework obtains or otherwise receives as input an image 120, calculates an image embedding 122, which is input to a multi-scale image encoder 124 to calculate an image embedding 126, which is processed by a cross correlation module 110 to calculate image features 128); 
and wherein the obtaining a first gene mutation information according to the first image feature comprises: inputting the plurality of image features into a second classification network to obtain a plurality of predictions for single gene mutation types; and combining the plurality of predictions for single gene mutation types to obtain the first gene mutation information ([0131] FIG. 5 illustrates an example 500 of results using a pre-training framework, according to at least one embodiment…In at least one embodiment, ResNet50 indicates a CNN based disease classifier. In at least one embodiment, for classification using both image and text reports, denoted as “img&txt,” one or more systems directly extract a textual feature via a pre-trained model such as a Biobert model and concatenate it together with an output of a layer such as a pool5 in ResNet-50, although any suitable layer of any suitable neural network model can be utilized, for a final classification) ([0126] In at least one embodiment, one or more systems utilize one or more neural network models trained in connection with a pre-training framework for tasks such as disease classification, similarity search (e.g., patient study retrieval), and image regeneration. In at least one embodiment, disease classification refers to one or more tasks that classify potential diseases from image and/or text data. [0562] Examples of genomic analyses that may be performed using systems and processes described herein include, without limitation, variant calling, mutation detection, and gene expression quantification).
Wang does not teach wherein the first medical image data comprises multi-modality medical image data; a plurality of multi-modality image features. 
Avinash, in the same field of endeavor of medical image feature extraction, teaches wherein the first medical image data comprises multi-modality medical image data ([0270] A single-type, multi-modality medical system, in the present context, may consist of any of the columns of the FIG. 8. In FIG. 7, a diagrammatical representation a single-type, multi-modality system with the temporal attributes is illustrated, considering M modalities at N different time points); a plurality of multi-modality image features ([0399] As described above, the medical practitioner derives information regarding a medical condition from a variety of sources. The present technique provides computer-assisted algorithms and techniques calling upon these sources from multi-modal and multi-dimensional perspectives for the detection and classification of a range of medical conditions in clinically relevant areas including (but not limited to) oncology, radiology, pathology, neurology, cardiology, orthopedics, and surgery. [0402] The feature extraction process involves performing computations on the data sources. For example, in image-based data and for a region of interest, statistics such as shape, size, density, curvature can be computed. On acquisition-based and patient-based data, the data themselves may serve as the features. Once the features are computed, a pre-trained classification algorithm can be used to classify the regions of interest as benign or malignant nodules).
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Avinash to extract features from multi-modality medical data because "interaction within each type is also evident, such as to optimize acquisition, processing and analysis of data" [Avinash 0270]. 
 
Regarding claim 4, Wang, Decuyper, and Avinash teach the method of claim 3. Wang further teaches wherein the second classification network comprises a plurality of second classification sub-networks respectively corresponding to the plurality of image features ([0056] FIG. 43A includes an example data flow diagram of a virtual instrument supporting an ultrasound device, in accordance with at least one embodiment. [0057] FIG. 43B includes an example data flow diagram of a virtual instrument supporting an CT scanner, in accordance with at least one embodiment); 
and wherein the inputting the plurality of image features into a second classification network to obtain a plurality of predictions for single gene mutation types comprises: inputting the plurality of image features respectively into the second classification sub-networks corresponding to the plurality of image features, so as to obtain the plurality of predictions for single gene mutation types ([0126] In at least one embodiment, one or more systems utilize one or more neural network models trained in connection with a pre-training framework for tasks such as disease classification, similarity search (e.g., patient study retrieval), and image regeneration. In at least one embodiment, disease classification refers to one or more tasks that classify potential diseases from image and/or text data. [0562] Examples of genomic analyses that may be performed using systems and processes described herein include, without limitation, variant calling, mutation detection, and gene expression quantification).
Wang does not teach multi-modality image features. 
Avinash teaches multi-modality image features ([0399] As described above, the medical practitioner derives information regarding a medical condition from a variety of sources. The present technique provides computer-assisted algorithms and techniques calling upon these sources from multi-modal and multi-dimensional perspectives for the detection and classification of a range of medical conditions in clinically relevant areas including (but not limited to) oncology, radiology, pathology, neurology, cardiology, orthopedics, and surgery. [0402] The feature extraction process involves performing computations on the data sources. For example, in image-based data and for a region of interest, statistics such as shape, size, density, curvature can be computed. On acquisition-based and patient-based data, the data themselves may serve as the features. Once the features are computed, a pre-trained classification algorithm can be used to classify the regions of interest as benign or malignant nodules).
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Avinash to extract features from multi-modality medical data because "interaction within each type is also evident, such as to optimize acquisition, processing and analysis of data" [Avinash 0270]. 
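For illustration only, the arrangement recited in claim 4 (one second classification sub-network per image feature, each yielding a prediction for a single gene mutation type) can be sketched as below. This is a reading of the claim language under stated assumptions, not an implementation taken from Wang or Avinash: the names `make_subnetwork` and `predict_single_gene_mutations`, the logistic form of each sub-network, the dictionary keys, and all weights are hypothetical.

```python
import math
from typing import Callable, Dict, List

Feature = List[float]
SubNetwork = Callable[[Feature], float]


def make_subnetwork(weights: List[float], bias: float) -> SubNetwork:
    """Hypothetical stand-in for one classification sub-network (a logistic unit)."""
    def classify(feature: Feature) -> float:
        score = sum(w * x for w, x in zip(weights, feature)) + bias
        return 1.0 / (1.0 + math.exp(-score))  # probability-like output in (0, 1)
    return classify


def predict_single_gene_mutations(
    features: Dict[str, Feature],
    subnetworks: Dict[str, SubNetwork],
) -> Dict[str, float]:
    """Send each image feature to the sub-network registered under the same key."""
    return {name: subnetworks[name](feature) for name, feature in features.items()}


# Keys and weights are illustrative assumptions only.
subnetworks = {
    "IDH": make_subnetwork([0.5, -0.25], 0.0),
    "1p/19q": make_subnetwork([1.0, 1.0], -1.0),
}
features = {"IDH": [2.0, 4.0], "1p/19q": [0.5, 0.5]}
predictions = predict_single_gene_mutations(features, subnetworks)
```

Each feature is evaluated only by its own sub-network, so the result holds one prediction per feature.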
 
Regarding claim 5, Wang, Decuyper, and Avinash teach the method of claim 4. Decuyper teaches wherein the second classification sub-network comprises at least one selected from: an isocitrate dehydrogenase mutation classification network, a chromosome 1p/19q classification network, a telomerase reverse transcriptase promoter classification network, or an O6-methylguanine-DNA methyltransferase classification network (see Figure 2 below).

[Figure: Decuyper, Figure 2 (media_image1.png, greyscale, 475 × 696)]


Regarding claim 6, Wang, Decuyper, and Avinash teach the method of claim 3. Wang further teaches wherein the plurality of first feature extraction sub-networks share model parameters with each other ([0180] In at least one embodiment, trained machine learning models corresponding to one or more neural networks may be used to infer or predict information using resources described above with respect to data center 1300 by using weight parameters calculated through one or more training techniques described herein). 
 
Regarding claim 8, Wang, Decuyper, and Avinash teach the method of claim 1. Wang further teaches wherein the first feature extraction network further comprises a third feature extraction module ([0086] In at least one embodiment, a text decoder 114 and/or a multi-scale image decoder 130 each comprise or otherwise implement one or more neural networks for processing of determined vectors. In at least one embodiment, a text decoder 114 and/or a multi-scale image decoder 130 comprise or otherwise implement one or more multi-layer perceptrons (MLP). In at least one embodiment, a multi-layer perceptron refers to a class of feedforward artificial neural networks that comprise at least an input layer, a hidden layer, and an output layer), and the third feature extraction module comprises a first maximum pooling layer, a first residual unit, a first down-sampling unit, and a first average pooling layer (all within ResNet-50. [0131] ResNet50 indicates a CNN based disease classifier. In at least one embodiment, for classification using both image and text reports, denoted as “img&txt,” one or more systems directly extract a textual feature via a pre-trained model such as a Biobert model and concatenate it together with an output of a layer such as a pool5 in ResNet-50);
wherein the method further comprises: inputting the plurality of first medical images into the third feature extraction module to obtain a plurality of first intermediate image features ([0084] In at least one embodiment, a cross correlation module 110 outputs text features 112, denoted by F.sub.txt, to a text decoder 114 and image features 128, denoted by F.sub.img, to a multi-scale image decoder 130…In at least one embodiment, an image features 128 is a collection of data indicating various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. In at least one embodiment, an image features 128 comprises one or more vectors that represent various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. [0070] image features 128, which is processed by a multi-scale image decoder 130 to calculate decoded image features 132).
Wang does not teach wherein the inputting the plurality of first medical images into the second feature extraction module to obtain a first temporal image feature comprises: inputting the plurality of first intermediate image features into the second feature extraction module to obtain the first temporal image feature.
Avinash teaches wherein the inputting the plurality of first medical images into the second feature extraction module to obtain a first temporal image feature comprises: inputting the plurality of first intermediate image features (dataset D1 or dataset D2) into the second feature extraction module to obtain the first temporal image feature ([0280] The acquisition/storage module contains acquired medical data. For temporal change analysis, means are provided to access the data from storage corresponding to an earlier time point. To simplify notation in the subsequent discussion we describe only two time points t1 and t2, even though the general approach can be extended for any type of medical data in the acquisition and temporal sequence. The segmentation module provides automated or manual means for isolating features, volumes, regions, lines, and/or points of interest. In many cases of practical interest, the entire data can be the output of the segmentation module. See Fig. 26. [0366] The process 394 may be considered to begin at a step 400 where an expert or medical professional performs feature detection and classification…The expert will typically draw the data from the IKB 12 or from the various resources 18 and may draw upon additional data from such resources to support the "reading" process of feature detection and classification. The expert then produces a dataset labeled D1, and referred to in FIG. 26 by reference numeral 402. [0367] In parallel with the expert feature detection and classification functions, an algorithm, in the example a CAD algorithm, performs similar feature detection and classification functions at step 404. As noted above, various programs are available for such functions, typically drawing upon raw or processed image data… As a result of step 404, a second dataset D2, referred to in FIG. 26 by reference numeral 406, is produced, which may be similarly annotated for display. [0370]  Block 418 in FIG. 26 represents a reconciler).
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Avinash to use intermediate temporal image features because "The purpose of the reconciler 418 is to resolve conflicts between detection and classification by the algorithm and the expert" [Avinash 0370]. 
 
Regarding claim 9, Wang, Decuyper, and Avinash teach the method of claim 1. Wang further teaches wherein determining the first image feature according to the first image weight matrix and the first medical image data comprises: obtaining a first image value matrix according to the first medical image data ([0093] In at least one embodiment, inputs to a SAM 202 include a Q 204, also referred to as a query, a K 206, also referred to as a key, and a V 208, also referred to as a value. In at least one embodiment, a Q 204, a K 206, and a V 208 are each a matrix); and obtaining the first image feature according to the first image weight matrix and the first image value matrix ([0094] In at least one embodiment, a linear 210, a linear 212, and/or a linear 214 perform various processes to project inputs to one or more dimensions. In at least one embodiment, a linear 210, a linear 212, and/or a linear 214 utilize various weight matrices, also referred to as parameter matrices. In at least one embodiment, one or more systems determine values for weight matrices through one or more training processes. In at least one embodiment, one or more systems determine values for weight matrices through any suitable process, such as using various training processes, functions, random number generation processes, pre-defined values, logic, rules, heuristics, and/or variations thereof. In at least one embodiment, a linear 210 multiplies a Q 204 by a first weight matrix, a linear 212 multiplies a K 206 by a second weight matrix, and a linear 214 multiplies a V 208 by a third weight matrix. [0100] FIG. 3 illustrates an example 300 of a cross correlation module. [0107] In at least one embodiment, a pair matching 320 is a collection of one or more hardware and/or software computing resources with instructions that, when executed, performs one or more image and/or text correlation processes. 
[0084] In at least one embodiment, a cross correlation module 110 outputs text features 112, denoted by F.sub.txt, to a text decoder 114 and image features 128, denoted by F.sub.img, to a multi-scale image decoder 130…In at least one embodiment, an image features 128 is a collection of data indicating various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. In at least one embodiment, an image features 128 comprises one or more vectors that represent various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. [0070] image features 128, which is processed by a multi-scale image decoder 130 to calculate decoded image features 132).
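The query, key, and value mapping relied on for claim 9 follows the general attention pattern, which can be sketched as below under stated assumptions. The sketch uses standard scaled dot-product attention. The passages of Wang quoted above ([0093]-[0094]) describe Q, K, and V matrices and their linear projections but do not spell out the softmax normalization or the scaling factor, so those steps, along with the function names, are assumptions of this sketch. With one row per image, entry (i, j) of the weight matrix relates image i to image j.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x m) and b (m x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def softmax_rows(a: Matrix) -> Matrix:
    """Normalize each row so that its entries are positive and sum to 1."""
    out = []
    for row in a:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def image_attention(prelim: Matrix, w_q: Matrix, w_k: Matrix, w_v: Matrix) -> Tuple[Matrix, Matrix]:
    """Return (image weight matrix, image feature) from preliminary per-image features."""
    q = matmul(prelim, w_q)  # image query matrix
    k = matmul(prelim, w_k)  # image key matrix
    v = matmul(prelim, w_v)  # image value matrix
    scale = math.sqrt(len(q[0]))
    scores = [[s / scale for s in row] for row in matmul(q, transpose(k))]
    weights = softmax_rows(scores)       # one row and one column per image
    return weights, matmul(weights, v)   # weighted combination of value rows


# Three images with two-dimensional preliminary features; identity projections
# are used purely to keep the example easy to follow.
identity = [[1.0, 0.0], [0.0, 1.0]]
prelim = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, image_feature = image_attention(prelim, identity, identity, identity)
```

In this example the weight matrix is 3 × 3 (images by images) and the resulting image feature keeps the 3 × 2 shape of the value matrix.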

Regarding claim 12, Wang, Decuyper, and Avinash teach the method of claim 2. Wang further teaches wherein the plurality of mono-modality medical image data comprises at least one selected from the group consisting of: mono-modality medical image data corresponding to an anatomical structure, mono-modality medical image data corresponding to a lesion site, mono-modality medical image data corresponding to an edema region, and mono-modality medical image data corresponding to a contrast enhancement ([0061] In at least one embodiment, for example, for an image and a text that correspond to each other, said text comprises description of various medical features (e.g., particular structures, anomalies, conditions, and/or variations thereof), and said image depicts said various medical features). 
 
Regarding claim 13, Wang, Decuyper, and Avinash teach the method of claim 1. Decuyper teaches acquiring first sample data, wherein the first sample data comprises first sample medical image data and a first sample gene mutation label information corresponding to the first sample medical image data ([pg. 5 para. 6] The 628 patients are split into a training set of 458 (264 GBM vs. 194 LGG, 123 IDH mutant vs. 87 IDH wildtype and 83 1p19q co-deleted vs. 100 1p19q intact), a validation set of 70 (27 GBM vs. 43 LGG, 41 IDH mutant vs. 29 IDH wildtype and 20 1p19q co-deleted vs. 23 1p19q intact) and a test set of 100 (46 GBM vs. 54 LGG, 48 IDH mutant vs. 52 wildtype and 30 1p19q co-deleted vs. 24 1p19q intact) patients. For patients in the validation and test set, all ground truth labels were available);
inputting the first sample medical image data into the first feature extraction network and a classification network to obtain a first sample gene mutation prediction information corresponding to the first sample medical image data ([pg. 5 para. 3] Using the segmentation mask, a tumor region of interest (ROI) is extracted from the MRI and subsequently fed into the classification network as illustrated in Fig. 2); 
inputting the first sample gene mutation prediction information and the first sample gene mutation label information into a first loss function to obtain a first loss function value ([pg. 5 para. 5] The loss is calculated for each task separately on all samples in the batch with known ground truth labels and averaged to a global loss which is backpropagated through the network); 
and adjusting a model parameter of the first feature extraction network and a model parameter of the classification network according to the first loss function value ([pg. 5 para. 5] If the validation loss did not improve in the last 10 epochs the learning rate is halved and early stopping occurs after no improvement for 30 epochs. In the last fully connected layer, dropout is applied with probability of 10%. Different hyperparameters of the network were tuned based on the validation set). 
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Decuyper to adjust feature extraction and classification parameters based on loss because "MRI features describing enhancing regions and tumor margins are important to predict grade, IDH and 1p19q status" [Decuyper pg. 5 para. 3]. 
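The training behavior quoted from Decuyper (a per-task loss over the samples with known labels, averaged into a global loss, with the learning rate halved after 10 epochs without improvement and early stopping after 30) can be sketched as below. This is a minimal sketch under assumptions: binary cross-entropy is assumed because the quoted passage does not name the loss, halving is assumed to repeat every 10 stale epochs, and `global_loss` and `PlateauSchedule` are hypothetical names. The backpropagation step itself is not shown.

```python
import math
from typing import Dict, List, Optional


def binary_cross_entropy(p: float, y: int) -> float:
    """Assumed per-sample loss; p is clipped to avoid log(0)."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def global_loss(predictions: Dict[str, List[float]],
                labels: Dict[str, List[Optional[int]]]) -> float:
    """Average the per-task losses, skipping samples whose label is None."""
    task_losses = []
    for task, preds in predictions.items():
        known = [(p, y) for p, y in zip(preds, labels[task]) if y is not None]
        if known:
            task_losses.append(sum(binary_cross_entropy(p, y) for p, y in known) / len(known))
    if not task_losses:
        raise ValueError("batch contains no labeled samples")
    return sum(task_losses) / len(task_losses)


class PlateauSchedule:
    """Halve the learning rate every `halve_after` stale epochs; stop at `stop_after`."""

    def __init__(self, lr: float, halve_after: int = 10, stop_after: int = 30) -> None:
        self.lr = lr
        self.halve_after = halve_after
        self.stop_after = stop_after
        self.best = math.inf
        self.stale = 0

    def step(self, validation_loss: float) -> bool:
        """Record one epoch; return True when training should stop."""
        if validation_loss < self.best:
            self.best = validation_loss
            self.stale = 0
        else:
            self.stale += 1
            if self.stale % self.halve_after == 0:
                self.lr /= 2.0
        return self.stale >= self.stop_after


# One sample lacks an IDH label and is ignored for that task.
loss = global_loss(
    {"IDH": [0.5, 0.5, 0.9], "1p19q": [0.5, 0.5, 0.5]},
    {"IDH": [1, 0, None], "1p19q": [1, 1, 0]},
)
```

Missing labels are masked per task rather than per sample, so a patient with only some ground truth labels still contributes to the tasks for which a label exists.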
 
Regarding claim 29, Wang, Decuyper, and Avinash teach the method of claim 1. Wang further teaches a non-transitory computer readable storage medium having executable instructions stored therein, wherein the instructions are configured to, when executed by a processor, cause the processor to implement the method ([0137] In at least one embodiment, code is stored on a computer-readable storage medium in form of a computer program comprising a plurality of computer-readable instructions executable by one or more processors).

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Wang in view of Decuyper, Avinash and Backhaus (US20140142983A1). 

Regarding claim 10, Wang, Decuyper, and Avinash teach the method of claim 1. Wang further teaches obtaining an image type information of the first medical image data according to device type conversion standard data corresponding to the first medical image data ([0606] In at least one embodiment, once DICOM data is processed through DICOM adapter 4102B, pipeline manager 4112 may route data through to deployment pipeline 4110A. In at least one embodiment, DICOM reader 4206 may extract image files and any associated metadata from DICOM data (e.g., raw sinogram data, as illustrated in visualization 4216A)).
Wang does not teach wherein the method further comprises: determining the image type information of the first medical image data according to medical image metadata corresponding to the first medical image data, in response to failing to determine the image type information of the first medical image data according to the device type conversion standard data corresponding to the first medical image data; and obtaining the image type information according to the first medical image data and the medical image metadata, in response to failing to determine the image type information according to the device type conversion standard data and failing to determine the image type information according to the medical image metadata.
Backhaus, in the same field of endeavor of medical image analysis, teaches wherein the method further comprises: determining the image type information of the first medical image data according to medical image metadata corresponding to the first medical image data, in response to failing to determine the image type information of the first medical image data according to the device type conversion standard data corresponding to the first medical image data; and obtaining the image type information according to the first medical image data and the medical image metadata, in response to failing to determine the image type information according to the device type conversion standard data and failing to determine the image type information according to the medical image metadata ([0007] For example, the information may be provided at the imaging modality before the radiology technician conducts the imaging of the patient; the information may be provided within the radiological images (such as in the header of a DICOM-standard image); the information may be provided within the PACS storing the images. [0009] At least one of the series of radiology image data files is processed using the image order processing server, resulting in the extraction of the metadata stored in the data files. In a further embodiment, metadata is retrieved from the first image data file received at the image order processing server. The data provided by one or more data fields within the extracted metadata is used to create and populate a radiology order. [0028] As a further example of metadata data values, the metadata within each imaging data file may include identification information such as patient identifier and an identifier of the series of images, in addition to information about the type of modality and the techniques used to obtain the images. 
Further, for images formatted according to the DICOM standard, data fields such as a unique image identifier, a unique study identifier, the patient's name, and the facility from which the image originates may be included. [0012] In a further embodiment, the radiology image data files may be converted to another format or modified by the image order processing system prior to electronic transmission to the radiologist). 
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Backhaus to determine the image type using the metadata when it cannot be determined from the device type conversion standard data, and to determine the image type using the image data when it cannot be determined from either the device type conversion standard data or the metadata, because "data values such as patient identifier, a medical facility identifier, sex of the patient, age of the patient, a type of modality used to produce the series of medical images, a type of radiological procedure, an indication of whether the study is a preliminary or final read, a type of medical condition, and a type of scan for which the series of medical images relates to might be used to select the radiologist" [Backhaus 0011].
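The three-tier determination recited in claim 10 amounts to a fallback chain, sketched below under stated assumptions. `Modality` and `SeriesDescription` are standard DICOM attribute keywords, but treating them as the first and second tiers is an assumption of this sketch, as are the keyword list and the function names. The third tier is passed in as a caller-supplied function because neither Wang nor Backhaus, as quoted, describes one.

```python
from typing import Callable, Mapping, Optional, Sequence

Classifier = Callable[[Sequence[float], Mapping[str, str]], str]


def from_dicom_header(header: Mapping[str, str]) -> Optional[str]:
    """Tier 1 (assumed): read the image type from the DICOM Modality attribute."""
    modality = header.get("Modality", "").strip()
    return modality or None


def from_metadata(metadata: Mapping[str, str]) -> Optional[str]:
    """Tier 2 (assumed): look for a known keyword in free-text metadata."""
    description = metadata.get("SeriesDescription", "").upper()
    for keyword in ("FLAIR", "T1", "T2"):  # illustrative keywords only
        if keyword in description:
            return keyword
    return None


def determine_image_type(header: Mapping[str, str],
                         metadata: Mapping[str, str],
                         pixels: Sequence[float],
                         classify: Classifier) -> str:
    """Try each tier in order, falling through only when the previous tier fails."""
    image_type = from_dicom_header(header)
    if image_type is None:
        image_type = from_metadata(metadata)
    if image_type is None:
        # Tier 3: use the image data together with the metadata.
        image_type = classify(pixels, metadata)
    return image_type


def placeholder_classifier(pixels: Sequence[float], metadata: Mapping[str, str]) -> str:
    """Hypothetical stand-in for an image-based classifier."""
    return "unknown-from-pixels"


first = determine_image_type({"Modality": "MR"}, {}, [], placeholder_classifier)
second = determine_image_type({}, {"SeriesDescription": "ax t2 flair"}, [], placeholder_classifier)
third = determine_image_type({}, {}, [0.1, 0.9], placeholder_classifier)
```

Each later tier runs only in response to the failure of every earlier tier, which mirrors the "in response to failing to determine" conditions of the claim.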

Claims 14-15, 18-19, 22, and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Wang in view of Wu (Wu, Yujiao, et al. "Multimodal learning for non-small cell lung cancer prognosis." arXiv.org. 7 Nov 2022) and Avinash.

Regarding claim 14, Wang teaches a method of processing medical data, comprising: acquiring first medical text data and second medical image data; obtaining a second image feature according to the second medical image data, wherein the second medical image data comprises a plurality of second medical images (Fig. 10. 1002: obtain text and one or more images, 1004: calculate features of text and one or more images. [0139] In at least one embodiment, text is a medical text that comprises description of various features, characteristics, aspects, analysis, and/or variations thereof, of one or more medical images. In at least one embodiment, one or more images include one or more medical images that depict various medical structures, such as skeletal structures, organs, tissues, anomalies, and/or variations thereof. [0113] In at least one embodiment, one or more systems determine an image data 408, an image data 410, and an image data 412 from at least an image 402, an image 404, and an image 406, respectively. In at least one embodiment, an image data 408, an image data 410, and an image data 412 each comprise one or more image patches and/or image data features determined from at least an image 402, an image 404, and an image 406, respectively); 
inputting the first medical text data into a feature extraction network to obtain a first text feature ([0070] In at least one embodiment, a pre-training framework obtains or otherwise receives as input a text 102, calculates a text embedding 104, which is input to a text encoder 106 to calculate a text encoding 108, which is processed by a cross correlation module 110 to calculate text features 112, which is processed by a text decoder 114 to calculate decoded text features 116);  
fusing the second image feature with the first text feature to obtain a first fusion feature ([0082] In at least one embodiment, a cross correlation module 110 comprises a UNIT and a UWOX module. In at least one embodiment, a UNIT module fuses image and text features using a SAM); 
wherein the acquiring first medical text data and second medical image data comprises: obtaining a first image query matrix and a first image key matrix according to the first preliminary image feature; determining a first image weight matrix according to the first image query matrix and the first image key matrix, wherein the first image weight matrix represents a correlation information between each two first medical images in the first medical image data; and determining the first image feature according to the first image weight matrix and the first medical image data ([0093] In at least one embodiment, inputs to a SAM 202 include a Q 204, also referred to as a query, a K 206, also referred to as a key, and a V 208, also referred to as a value. In at least one embodiment, a Q 204, a K 206, and a V 208 are each a matrix. [0094] In at least one embodiment, a linear 210, a linear 212, and/or a linear 214 perform various processes to project inputs to one or more dimensions. In at least one embodiment, a linear 210, a linear 212, and/or a linear 214 utilize various weight matrices, also referred to as parameter matrices. In at least one embodiment, one or more systems determine values for weight matrices through one or more training processes. In at least one embodiment, one or more systems determine values for weight matrices through any suitable process, such as using various training processes, functions, random number generation processes, pre-defined values, logic, rules, heuristics, and/or variations thereof. In at least one embodiment, a linear 210 multiplies a Q 204 by a first weight matrix, a linear 212 multiplies a K 206 by a second weight matrix, and a linear 214 multiplies a V 208 by a third weight matrix. [0100] FIG. 3 illustrates an example 300 of a cross correlation module. 
[0107] In at least one embodiment, a pair matching 320 is a collection of one or more hardware and/or software computing resources with instructions that, when executed, performs one or more image and/or text correlation processes. [0084] In at least one embodiment, a cross correlation module 110 outputs text features 112, denoted by F.sub.txt, to a text decoder 114 and image features 128, denoted by F.sub.img, to a multi-scale image decoder 130…In at least one embodiment, an image features 128 is a collection of data indicating various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. In at least one embodiment, an image features 128 comprises one or more vectors that represent various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. [0070] image features 128, which is processed by a multi-scale image decoder 130 to calculate decoded image features 132);
Wang does not teach obtaining a first survival information according to the first fusion feature.
Wu, in the same field of endeavor of multimodal learning, teaches obtaining a first survival information according to the first fusion feature.

[Figure: Wu (media_image2.png, greyscale, 365 × 624)]

Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Wu to obtain survival information based on the fusion because "The survival model is an estimate of how lung cancer will develop, and it can reveal the relationship between prognostic factors and the disease" [Wu pg. 1 para. 1].
Wang does not teach obtaining a first temporal image feature according to the plurality of first medical images, wherein the first temporal image feature represents a temporal relationship between the plurality of first medical images, wherein the first preliminary image feature is a first temporal image feature. 
Avinash teaches obtaining a first temporal image feature according to the plurality of first medical images, wherein the first temporal image feature represents a temporal relationship between the plurality of first medical images, wherein the first preliminary image feature is a first temporal image feature ([0270] A single-type, multi-modality medical system, in the present context, may consist of any of the columns of the FIG. 8. In FIG. 7, a diagrammatical representation a single-type, multi-modality system with the temporal attributes is illustrated, considering M modalities at N different time points…The temporal aspects of a medical event are also considered in the context, such as to modify acquisition, processing and analysis modules based on the temporal attributes of the data. [0280] The acquisition/storage module contains acquired medical data. For temporal change analysis, means are provided to access the data from storage corresponding to an earlier time point. To simplify notation in the subsequent discussion we describe only two time points t1 and t2, even though the general approach can be extended for any type of medical data in the acquisition and temporal sequence. The segmentation module provides automated or manual means for isolating features, volumes, regions, lines, and/or points of interest. In many cases of practical interest, the entire data can be the output of the segmentation module).
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Avinash to extract temporal features for "performing temporal analysis on multiple-type data to fully characterize the medical condition in question" [Avinash 0278].
 
Regarding claim 15, Wang, Wu, and Avinash teach the method of claim 14. Wang further teaches wherein the inputting the first medical text data into the feature extraction network to obtain a first text feature comprises: encoding the first medical text data to obtain a first medical text vector ([0075] In at least one embodiment, a text embedding 104 comprises one or more vectors that represent a text 102); 
inputting the first medical text vector into a first encoder (text encoder 106) to obtain a first hidden vector; inputting the first hidden vector into a first decoder (text decoder 114) to obtain a first decoded vector; and obtaining the first text feature according to the first hidden vector and the first decoded vector ([0086] In at least one embodiment, a text decoder 114 and/or a multi-scale image decoder 130 each comprise or otherwise implement one or more SAMs. In at least one embodiment, a text decoder 114 and/or a multi-scale image decoder 130 each process an input (e.g., a text features 112 and/or an image features 128, respectively) to determine one or more vectors (e.g., a decoded text features 116 and/or a decoded image features 132, respectively) that encode various information of said input…In at least one embodiment, a text decoder 114 and/or a multi-scale image decoder 130 comprise or otherwise implement one or more multi-layer perceptrons (MLP). In at least one embodiment, a multi-layer perceptron refers to a class of feedforward artificial neural networks that comprise at least an input layer, a hidden layer, and an output layer).
wherein the fusing the second image feature with the first text feature to obtain a first fusion feature comprises: determining a second image query matrix and a second image key matrix according to the second medical image data ([0093] In at least one embodiment, inputs to a SAM 202 include a Q 204, also referred to as a query, a K 206, also referred to as a key, and a V 208, also referred to as a value. In at least one embodiment, a Q 204, a K 206, and a V 208 are each a matrix); 
determining a text query matrix and a text key matrix according to the first medical text data ([0093] In at least one embodiment, a Q 204, a K 206, and/or a V 208 are each an embedding such as a text embedding, an image embedding, and/or variations thereof);
determining a first fusion weight matrix according to the second image query matrix and the text key matrix; determining a second fusion weight matrix according to the text query matrix and the second image key matrix ([0094] In at least one embodiment, a linear 210, a linear 212, and/or a linear 214 perform various processes to project inputs to one or more dimensions. In at least one embodiment, a linear 210, a linear 212, and/or a linear 214 utilize various weight matrices, also referred to as parameter matrices. In at least one embodiment, one or more systems determine values for weight matrices through one or more training processes. In at least one embodiment, one or more systems determine values for weight matrices through any suitable process, such as using various training processes, functions, random number generation processes, pre-defined values, logic, rules, heuristics, and/or variations thereof. In at least one embodiment, a linear 210 multiplies a Q 204 by a first weight matrix, a linear 212 multiplies a K 206 by a second weight matrix, and a linear 214 multiplies a V 208 by a third weight matrix. [0100] FIG. 3 illustrates an example 300 of a cross correlation module. [0107] In at least one embodiment, a pair matching 320 is a collection of one or more hardware and/or software computing resources with instructions that, when executed, performs one or more image and/or text correlation processes); 
obtaining a first output feature vector according to the first fusion weight matrix and a text value matrix, wherein the text value matrix is obtained according to the first medical text data; obtaining a second output feature vector according to the second fusion weight matrix and a second image value matrix, wherein the second image value matrix is obtained according to the second medical image data ([0084] In at least one embodiment, a cross correlation module 110 outputs text features 112, denoted by F.sub.txt, to a text decoder 114 and image features 128, denoted by F.sub.img, to a multi-scale image decoder 130…In at least one embodiment, an image features 128 is a collection of data indicating various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. In at least one embodiment, an image features 128 comprises one or more vectors that represent various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. [0070] text 102, calculates a text embedding 104, which is input to a text encoder 106 to calculate a text encoding 108, which is processed by a cross correlation module 110 to calculate text features 112, which is processed by a text decoder 114 to calculate decoded text features 116…image features 128, which is processed by a multi-scale image decoder 130 to calculate decoded image features 132);
and obtaining the first fusion feature according to the first output feature vector and the second output feature vector ([0082] In at least one embodiment, a cross correlation module 110 comprises a UNIT and a UWOX module. In at least one embodiment, a UNIT module fuses image and text features using a SAM); 
wherein a first fusion deep learning model comprises the feature extraction network ([0041] FIG. 28 illustrates a deep learning application processor, according to at least one embodiment). 
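The fusion steps recited in claim 15 (an image query against a text key, a text query against an image key, each weight matrix applied to the other modality's value matrix, then a combination of the two outputs) can be sketched as a bidirectional cross-attention. This is a sketch under assumptions rather than Wang's SAM or UNIT module: identity projections are used so that the query, key, and value matrices equal the input tokens, and the softmax, the mean pooling, the concatenation, and all function names are hypothetical choices.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def cross_attend(query: Matrix, key: Matrix, value: Matrix) -> Tuple[Matrix, Matrix]:
    """Return (fusion weight matrix, output) for one attention direction."""
    scale = math.sqrt(len(query[0]))
    weights = []
    for q in query:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in key]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    width = len(value[0])
    output = [[sum(w * v[j] for w, v in zip(row, value)) for j in range(width)]
              for row in weights]
    return weights, output


def mean_pool(rows: Matrix) -> List[float]:
    """Collapse a token matrix into one vector by averaging over rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fuse(image_tokens: Matrix, text_tokens: Matrix) -> Tuple[Matrix, Matrix, List[float]]:
    # First fusion weight matrix: image query x text key, applied to the text values.
    w_first, first_output = cross_attend(image_tokens, text_tokens, text_tokens)
    # Second fusion weight matrix: text query x image key, applied to the image values.
    w_second, second_output = cross_attend(text_tokens, image_tokens, image_tokens)
    # Fusion feature: concatenation of the two pooled output vectors (an assumption).
    return w_first, w_second, mean_pool(first_output) + mean_pool(second_output)


image_tokens = [[1.0, 0.0], [0.0, 1.0]]             # two images
text_tokens = [[1.0, 1.0], [0.5, 0.0], [0.0, 2.0]]  # three text tokens
w_first, w_second, fusion_feature = fuse(image_tokens, text_tokens)
```

The two weight matrices have transposed shapes (2 × 3 and 3 × 2 here) because the roles of query and key are swapped between the two directions.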
Wang does not teach acquiring second sample data, wherein the second sample data comprises first sample medical data and a first sample survival label information corresponding to the first sample medical data, and the first sample medical data comprises first sample medical text data and second sample medical image data; inputting the first sample medical data into the first fusion deep learning network to obtain a first sample survival prediction information corresponding to the first sample medical data; inputting the first sample survival prediction information and the first sample survival label information into a second loss function to obtain a second loss function value; and adjusting a model parameter of the first fusion deep learning network according to the second loss function value.
Wu teaches acquiring second sample data, wherein the second sample data comprises first sample medical data and a first sample survival label information corresponding to the first sample medical data, and the first sample medical data comprises first sample medical text data and second sample medical image data; inputting the first sample medical data into the first fusion deep learning network to obtain a first sample survival prediction information corresponding to the first sample medical data ([pg. 2 para. 5] we develop the first multimodal network for NSCLC survival analysis, which takes Deep Learning-based NSCLC survival analysis one step forward by simultaneously considering the textual clinical data and the visual CT clues. As shown in Fig.1, our network is a two-tower paradigm, i.e., clinical tower and a visual tower. The clinical tower is responsible for encoding the clinical data, while the visual tower aims to extract the visual representation from the CT images. Finally, the prediction head fuses the cross-modality features and provides the time prediction);
inputting the first sample survival prediction information and the first sample survival label information into a second loss function to obtain a second loss function value; and adjusting a model parameter of the first fusion deep learning network according to the second loss function value (See Figure 6. [pg. 3 para. 8] the parameters of two towers are jointly optimized by minimizing the distance between the survival time prediction t and ground-truth one t_hat. [pg. 10 para. 2] When training the network, we employ the popular parameter normalization strategy to avoid overfitting, i.e., the second term in Eq. 16, and introduce a hyperparameter λ to balance the main loss and the parameter normalization). 
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Wu to compare the predicted survival to the ground truth survival "to finetune and get the optimized hyperparameters" [Wu pg. 7 para. 1].
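
For illustration only, the training limitation mapped to Wu above (prediction and label fed to a loss function, then a parameter adjustment based on the loss value) can be sketched as follows. This is a minimal sketch under stated assumptions, not Wu's implementation: the squared-error distance, the linear stand-in model, and the names `predict`, `loss_value`, `training_step`, `lam`, and `lr` are all hypothetical. Wu's Eq. 16 is not reproduced here. Only the general shape (a main loss plus a λ-weighted parameter normalization term) follows the cited passage.

```python
def predict(weights, features):
    """Linear stand-in for the fusion network: one survival-time estimate."""
    return sum(w * x for w, x in zip(weights, features))


def loss_value(weights, features, label, lam):
    """Assumed main loss (squared error) plus lam times the squared weight norm."""
    error = predict(weights, features) - label
    return error ** 2 + lam * sum(w * w for w in weights)


def training_step(weights, features, label, lam=0.01, lr=0.05):
    """One gradient-descent update of the weights on a single sample."""
    error = predict(weights, features) - label
    grads = [2 * error * x + 2 * lam * w for w, x in zip(weights, features)]
    return [w - lr * g for w, g in zip(weights, grads)]


weights = [0.0, 0.0, 0.0]
sample_features = [1.0, 0.5, -0.5]  # hypothetical fused text+image feature
sample_label = 2.0                  # hypothetical survival label

before = loss_value(weights, sample_features, sample_label, lam=0.01)
for _ in range(50):
    weights = training_step(weights, sample_features, sample_label)
after = loss_value(weights, sample_features, sample_label, lam=0.01)
```

In this toy setting the loss value after the updates is lower than before them, which is the behavior the "adjusting a model parameter ... according to the second loss function value" limitation describes.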

Regarding claim 18, Wang teaches a method of analyzing medical data (Fig. 10), comprising: acquiring second medical text data and third medical image data ([0139] In at least one embodiment, a system performing at least a part of process 1000 includes executable code to at least obtain 1002 text and one or more images. In at least one embodiment, a system obtains data, such as training data, comprising text and one or more images. In at least one embodiment, a system obtains data from one or more datasets. In at least one embodiment, text and one or more images are from paired data, unpaired data, and/or variations thereof. In at least one embodiment, text and one or more images may or may not correspond to or otherwise be associated with each other. In at least one embodiment, text is a medical text that comprises description of various features, characteristics, aspects, analysis, and/or variations thereof, of one or more medical images. In at least one embodiment, one or more images include one or more medical images that depict various medical structures, such as skeletal structures, organs, tissues, anomalies, and/or variations thereof); 
inputting the third medical image data into a first feature extraction network to obtain a third image feature ([0140] In at least one embodiment, a system calculates features using one or more encoders based at least in part on an image embedding. In at least one embodiment, a system calculates image features using one or more image encoders, such as a multi-scale image encoder); 
determining a second gene mutation information according to the third image feature ([0080] In at least one embodiment, an image encoding 126, also referred to as image features, encoded features, and/or variations thereof, is a collection of data indicating various characteristics, features, aspects, and/or variations thereof, of an image embedding 122. In at least one embodiment, an image encoding 126 comprises one or more vectors that represent various characteristics, features, aspects, and/or variations thereof, of an image embedding 122. [0562] Examples of genomic analyses that may be performed using systems and processes described herein include, without limitation, variant calling, mutation detection, and gene expression quantification); 
inputting the second medical text data into a second feature extraction network to obtain a second text feature ([0078] In at least one embodiment, a text encoder 106 and/or a multi-scale image encoder 124 each process an input embedding to determine one or more vectors that encode various information of said input embedding. In at least one embodiment, a text encoder 106 and/or a multi-scale image encoder 124 comprise or otherwise implement one or more multi-head Self-Attention Modules (SAM), as described in connection with FIG. 2. In at least one embodiment, a text encoder 106 and/or a multi-scale image encoder 124 each comprise or otherwise implement a number, denoted by N.sub.e, of SAMs. In at least one embodiment, a text encoder 106 and/or a multi-scale image encoder 124 each comprise or otherwise implement various neural network models for processing of outputs); 
wherein the first feature extraction network comprises a first feature extraction module ([0092] FIG. 2 illustrates an example 200 of a self-attention module, according to at least one embodiment. In at least one embodiment, a self-attention module (SAM) 202 is in accordance with those described elsewhere in this disclosure) configured to: determine a third image query matrix and a third image key matrix according to the third medical image data ([0093] In at least one embodiment, inputs to a SAM 202 include a Q 204, also referred to as a query, a K 206, also referred to as a key, and a V 208, also referred to as a value. In at least one embodiment, a Q 204, a K 206, and a V 208 are each a matrix); 
determine a third image weight matrix according to the third image query matrix and the third image key matrix, wherein the third image weight matrix represents a correlation information between each two third medical images in the third medical image data ([0094] In at least one embodiment, a linear 210, a linear 212, and/or a linear 214 perform various processes to project inputs to one or more dimensions. In at least one embodiment, a linear 210, a linear 212, and/or a linear 214 utilize various weight matrices, also referred to as parameter matrices. In at least one embodiment, one or more systems determine values for weight matrices through one or more training processes. In at least one embodiment, one or more systems determine values for weight matrices through any suitable process, such as using various training processes, functions, random number generation processes, pre-defined values, logic, rules, heuristics, and/or variations thereof. In at least one embodiment, a linear 210 multiplies a Q 204 by a first weight matrix, a linear 212 multiplies a K 206 by a second weight matrix, and a linear 214 multiplies a V 208 by a third weight matrix. [0100] FIG. 3 illustrates an example 300 of a cross correlation module. [0107] In at least one embodiment, a pair matching 320 is a collection of one or more hardware and/or software computing resources with instructions that, when executed, performs one or more image and/or text correlation processes); 
and determine the third image feature according to the third image weight matrix and the third medical image data ([0084] In at least one embodiment, a cross correlation module 110 outputs text features 112, denoted by F.sub.txt, to a text decoder 114 and image features 128, denoted by F.sub.img, to a multi-scale image decoder 130…In at least one embodiment, an image features 128 is a collection of data indicating various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. In at least one embodiment, an image features 128 comprises one or more vectors that represent various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. [0070] image features 128, which is processed by a multi-scale image decoder 130 to calculate decoded image features 132);
wherein the third medical image data comprises a plurality of third medical images and the first feature extraction network further comprises a second feature extraction module (Fig. 10. 1002: obtain text and one or more images, 1004: calculate features of text and one or more images. [0183] In at least one embodiment, one or more systems depicted in FIG. 13 are utilized to use one or more neural networks to indicate an extent to which text corresponds to one or more images. [0070] In at least one embodiment, a pre-training framework obtains or otherwise receives as input an image 120, calculates an image embedding 122, which is input to a multi-scale image encoder 124 to calculate an image embedding 126, which is processed by a cross correlation module 110 to calculate image features 128); 
and wherein determining the third image query matrix and the third image key matrix according to the third medical image data comprises: inputting the second preliminary image feature into the first feature extraction module to obtain the third image query matrix and the third image key matrix ([0094] In at least one embodiment, a linear 210 multiplies a Q 204 by a first weight matrix, a linear 212 multiplies a K 206 by a second weight matrix, and a linear 214 multiplies a V 208 by a third weight matrix. [0100] FIG. 3 illustrates an example 300 of a cross correlation module. [0107] In at least one embodiment, a pair matching 320 is a collection of one or more hardware and/or software computing resources with instructions that, when executed, performs one or more image and/or text correlation processes. [0084] In at least one embodiment, a cross correlation module 110 outputs text features 112, denoted by F.sub.txt, to a text decoder 114 and image features 128, denoted by F.sub.img, to a multi-scale image decoder 130…In at least one embodiment, an image features 128 is a collection of data indicating various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. In at least one embodiment, an image features 128 comprises one or more vectors that represent various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. [0070] image features 128, which is processed by a multi-scale image decoder 130 to calculate decoded image features 132).
Wang does not teach determining a second survival information according to a fusion feature obtained from the third image feature and the second text feature.
Wu teaches determining a second survival information according to a fusion feature obtained from the third image feature and the second text feature ([pg. 6 para. 4] In this work, we considered 422 NSCLC patients from TCIA to assess the proposed framework. For these patients pretreatment CT scans, manual delineation by a radiation oncologist of the 3D volume of the gross tumor volume and clinical outcome data are available. The corresponding clinical data are also available in the same collection. The patients who had neither survival time nor event status were excluded from this work. See Figure 2 above).
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Wu to determine additional survival information from additional medical text and image data to "develop the first multimodal network for NSCLC survival analysis" [Wu pg. 2 para. 5]. 
Wang does not teach inputting the plurality of third medical images into the second feature extraction module to obtain a second temporal image feature, wherein the second temporal image feature represents a temporal relationship between the plurality of third medical images, wherein the second preliminary image feature is the second temporal image feature.
Avinash teaches inputting the plurality of third medical images into the second feature extraction module to obtain a second temporal image feature, wherein the second temporal image feature represents a temporal relationship between the plurality of third medical images, wherein the second preliminary image feature is the second temporal image feature ([0270] A single-type, multi-modality medical system, in the present context, may consist of any of the columns of the FIG. 8. In FIG. 7, a diagrammatical representation a single-type, multi-modality system with the temporal attributes is illustrated, considering M modalities at N different time points…The temporal aspects of a medical event are also considered in the context, such as to modify acquisition, processing and analysis modules based on the temporal attributes of the data. [0280] The acquisition/storage module contains acquired medical data. For temporal change analysis, means are provided to access the data from storage corresponding to an earlier time point. To simplify notation in the subsequent discussion we describe only two time points t1 and t2, even though the general approach can be extended for any type of medical data in the acquisition and temporal sequence. The segmentation module provides automated or manual means for isolating features, volumes, regions, lines, and/or points of interest. In many cases of practical interest, the entire data can be the output of the segmentation module).
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Avinash to extract temporal features for "performing temporal analysis on multiple-type data to fully characterize the medical condition in question" [Avinash 0278].
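
For illustration only, the query/key/weight-matrix limitations of claim 18 can be sketched as a scaled dot-product self-attention over per-image features. This is a sketch under stated assumptions, not the claimed method and not Wang's SAM 202: the softmax normalization, the scaling by the square root of the feature dimension, the identity projection matrices, and the names `self_attention`, `w_q`, `w_k`, and `w_v` are hypothetical choices made for readability.

```python
import math


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features, w_q, w_k, w_v):
    """features holds one row per image; returns (weight matrix, image feature)."""
    q = matmul(features, w_q)  # image query matrix
    k = matmul(features, w_k)  # image key matrix
    v = matmul(features, w_v)  # image value matrix
    scale = math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))
    # Entry [i][j] weighs image j when building the feature of image i.
    weights = [softmax([s / scale for s in row]) for row in scores]
    return weights, matmul(weights, v)


# Three hypothetical images with 2-dimensional preliminary features.
image_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
weight_matrix, image_feature = self_attention(
    image_features, identity, identity, identity
)
```

Each row of `weight_matrix` sums to one, and the entry pairing the first and second images is larger than the entry pairing the first and third, reflecting that the first two toy feature vectors are more alike. This is one way a weight matrix can "represent a correlation information between each two" images.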

Regarding claim 19, Wang, Wu, and Avinash teach the method of claim 18. Wang further teaches generating a test report according to the second gene mutation information ([0131] FIG. 5 depicts AUCs for each disease class and Average (AVG) AUCs for all models. [0562] Examples of genomic analyses that may be performed using systems and processes described herein include, without limitation, variant calling, mutation detection, and gene expression quantification);
wherein the first feature extraction network further comprises a third feature extraction module, and the third feature extraction module comprises a maximum pooling layer, a residual unit, a down-sampling unit, and an average pooling layer (all within ResNet-50. [0131] ResNet50 indicates a CNN based disease classifier. In at least one embodiment, for classification using both image and text reports, denoted as “img&txt,” one or more systems directly extract a textual feature via a pre-trained model such as a Biobert model and concatenate it together with an output of a layer such as a pool5 in ResNet-50);
wherein the method further comprises: inputting the plurality of third medical images into the third feature extraction module to obtain a second intermediate image feature ([0084] In at least one embodiment, a cross correlation module 110 outputs text features 112, denoted by F.sub.txt, to a text decoder 114 and image features 128, denoted by F.sub.img, to a multi-scale image decoder 130…In at least one embodiment, an image features 128 is a collection of data indicating various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. In at least one embodiment, an image features 128 comprises one or more vectors that represent various characteristics, features, aspects, and/or variations thereof, of an image encoding 126. [0070] image features 128, which is processed by a multi-scale image decoder 130 to calculate decoded image features 132).
Wang does not teach generating a test report according to second survival information.
Wu teaches generating a test report according to second survival information ([pg. 10 para. 4] This work contributes a powerful multimodal network for more accurate prediction of NSCLC survival, with the purpose of helping clinicians to develop timely treatment plans and improve patients’ quality of life). 
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Wu to report survival information because "The survival model is an estimate of how lung cancer will develop, and it can reveal the relationship between prognostic factors and the disease" [Wu pg. 1 para. 1].
Wang does not teach and wherein the inputting the plurality of third medical images into the second feature extraction module to obtain a second temporal image feature comprises: inputting the second intermediate image feature into the second feature extraction module to obtain the second temporal image feature.
Avinash teaches and wherein the inputting the plurality of third medical images into the second feature extraction module to obtain a second temporal image feature comprises: inputting the second intermediate image feature into the second feature extraction module to obtain the second temporal image feature ([0280] The acquisition/storage module contains acquired medical data. For temporal change analysis, means are provided to access the data from storage corresponding to an earlier time point. To simplify notation in the subsequent discussion we describe only two time points t1 and t2, even though the general approach can be extended for any type of medical data in the acquisition and temporal sequence. The segmentation module provides automated or manual means for isolating features, volumes, regions, lines, and/or points of interest. In many cases of practical interest, the entire data can be the output of the segmentation module. See Fig. 26. [0366] The process 394 may be considered to begin at a step 400 where an expert or medical professional performs feature detection and classification…The expert will typically draw the data from the IKB 12 or from the various resources 18 and may draw upon additional data from such resources to support the "reading" process of feature detection and classification. The expert then produces a dataset labeled D1, and referred to in FIG. 26 by reference numeral 402. [0367] In parallel with the expert feature detection and classification functions, an algorithm, in the example a CAD algorithm, performs similar feature detection and classification functions at step 404. As noted above, various programs are available for such functions, typically drawing upon raw or processed image data… As a result of step 404, a second dataset D2, referred to in FIG. 26 by reference numeral 406, is produced, which may be similarly annotated for display. [0370]  Block 418 in FIG. 26 represents a reconciler).
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Avinash to use intermediate temporal image features because "The purpose of the reconciler 418 is to resolve conflicts between detection and classification by the algorithm and the expert" [Avinash 0370]. 

Regarding claim 22, Wang, Wu, and Avinash teach the method of claim 18. Wang further teaches wherein determining the third image feature according to the third image weight matrix and the third medical image data comprises: obtaining a third image value matrix according to the third medical image data; and obtaining the third image feature according to the third image weight matrix and the third image value matrix ([0139] In at least one embodiment, a system performing at least a part of process 1000 includes executable code to at least obtain 1002 text and one or more images. In at least one embodiment, a system obtains data, such as training data, comprising text and one or more images. In at least one embodiment, a system obtains data from one or more datasets. In at least one embodiment, text and one or more images are from paired data, unpaired data, and/or variations thereof. In at least one embodiment, text and one or more images may or may not correspond to or otherwise be associated with each other. In at least one embodiment, text is a medical text that comprises description of various features, characteristics, aspects, analysis, and/or variations thereof, of one or more medical images. In at least one embodiment, one or more images include one or more medical images that depict various medical structures, such as skeletal structures, organs, tissues, anomalies, and/or variations thereof. [0084] In at least one embodiment, a cross correlation module 110 outputs text features 112, denoted by F.sub.txt, to a text decoder 114 and image features 128, denoted by F.sub.img, to a multi-scale image decoder 130);
wherein the inputting the second medical text data into a second feature extraction network to obtain a second text feature comprises: encoding the second medical text data to obtain a second medical text vector ([0075] In at least one embodiment, a text embedding 104 comprises one or more vectors that represent a text 102); 
inputting the second medical text vector into a second encoder to obtain a second hidden vector; inputting the second hidden vector into a second decoder to obtain a second decoded vector; and obtaining the second text feature according to the second hidden vector and the second decoded vector ([0086] In at least one embodiment, a text decoder 114 and/or a multi-scale image decoder 130 each comprise or otherwise implement one or more SAMs. In at least one embodiment, a text decoder 114 and/or a multi-scale image decoder 130 each process an input (e.g., a text features 112 and/or an image features 128, respectively) to determine one or more vectors (e.g., a decoded text features 116 and/or a decoded image features 132, respectively) that encode various information of said input…In at least one embodiment, a text decoder 114 and/or a multi-scale image decoder 130 comprise or otherwise implement one or more multi-layer perceptrons (MLP). In at least one embodiment, a multi-layer perceptron refers to a class of feedforward artificial neural networks that comprise at least an input layer, a hidden layer, and an output layer);
wherein the determining a second gene mutation information according to the third image feature comprises: inputting the third image feature into a third classification network to obtain the second gene mutation information; inputting the fusion feature obtained from the third image feature and the second text feature into a fourth classification network ([0082] In at least one embodiment, a cross correlation module 110 comprises a UNIT and a UWOX module. In at least one embodiment, a UNIT module fuses image and text features using a SAM. [0126] In at least one embodiment, one or more systems utilize one or more neural network models trained in connection with a pre-training framework for tasks such as disease classification, similarity search (e.g., patient study retrieval), and image regeneration. In at least one embodiment, disease classification refers to one or more tasks that classify potential diseases from image and/or text data. [0562] Examples of genomic analyses that may be performed using systems and processes described herein include, without limitation, variant calling, mutation detection, and gene expression quantification); 
and wherein a model parameter of the second feature extraction network, a model parameter of the third classification network and a model parameter of the fourth classification network are obtained by joint training ([0065] In at least one embodiment, a pre-training framework is transformer-based and utilizes various data inputs for joint representation learning of image and text in a self-supervised manner).
Wang does not teach wherein the determining a second survival information according to a fusion feature obtained from the third image feature and the second text feature comprises: inputting the fusion feature to obtain the second survival information.
Wu teaches wherein the determining a second survival information according to a fusion feature obtained from the third image feature and the second text feature comprises: inputting the fusion feature to obtain the second survival information ([pg. 2 para. 5] we develop the first multimodal network for NSCLC survival analysis, which takes Deep Learning-based NSCLC survival analysis one step forward by simultaneously considering the textual clinical data and the visual CT clues. As shown in Fig.1, our network is a two-tower paradigm, i.e., clinical tower and a visual tower. The clinical tower is responsible for encoding the clinical data, while the visual tower aims to extract the visual representation from the CT images. Finally, the prediction head fuses the cross-modality features and provides the time prediction).
Therefore, it would have been obvious to a person of ordinary skill in the art at the time that the invention was made to modify the method of Wang with the teachings of Wu to obtain survival information based on the fusion because "The survival model is an estimate of how lung cancer will develop, and it can reveal the relationship between prognostic factors and the disease" [Wu pg. 1 para. 1].
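
For illustration only, the step of obtaining survival information from a fusion feature can be sketched as below. This is a sketch under stated assumptions: concatenation is used as a stand-in fusion and a single linear layer as a stand-in prediction head. Neither is asserted to be the fusion used in the claims, in Wang's UNIT module, or in Wu's prediction head, and the names `fuse` and `prediction_head` and all numeric values are hypothetical.

```python
def fuse(image_feature, text_feature):
    """Concatenation as an assumed stand-in for cross-modality fusion."""
    return list(image_feature) + list(text_feature)


def prediction_head(fused, weights, bias):
    """Single linear layer mapping the fusion feature to one survival estimate."""
    return sum(w * x for w, x in zip(weights, fused)) + bias


image_feature = [0.2, 0.8]      # hypothetical third image feature
text_feature = [1.0, 0.0, 0.5]  # hypothetical second text feature

fused = fuse(image_feature, text_feature)
survival_estimate = prediction_head(fused, [0.5, 0.5, 1.0, 1.0, 2.0], bias=0.1)
```

The point of the sketch is only the data flow: both modality features enter one fusion feature, and the survival output depends on that fusion feature rather than on either modality alone.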

Regarding claim 28, Wang, Wu, and Avinash teach the method of claim 18. Wang further teaches an electronic device, comprising: one or more processors; and a memory configured to store one or more programs, wherein the one or more programs are configured to, when executed by the one or more processors, cause the one or more processors to implement the method ([0137] FIG. 10 illustrates an example of a process 1000 of a pre-training framework, according to at least one embodiment. In at least one embodiment, some or all of process 1000 (or any other processes described herein, or variations and/or combinations thereof) is performed under control of one or more computer systems configured with computer-executable instructions and is implemented as code (e.g., computer-executable instructions, one or more computer programs, or one or more applications) executing collectively on one or more processors, by hardware, software, or combinations thereof. In at least one embodiment, code is stored on a computer-readable storage medium in form of a computer program comprising a plurality of computer-readable instructions executable by one or more processors).

Allowable Subject Matter
Claim 25 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims once all rejections are overcome.
Regarding claim 25, Wang teaches inputting the third sample medical image data into the first feature extraction network and the third classification network to obtain a second sample gene mutation prediction information corresponding to the third sample medical data; inputting the second sample medical data into the second feature extraction network and the fourth classification network (Fig. 1; Fig. 10. 1002: obtain text and one or more images, 1004: calculate features of text and one or more images. [0183] In at least one embodiment, one or more systems depicted in FIG. 13 are utilized to use one or more neural networks to indicate an extent to which text corresponds to one or more images. [0126] In at least one embodiment, one or more systems utilize one or more neural network models trained in connection with a pre-training framework for tasks such as disease classification, similarity search (e.g., patient study retrieval), and image regeneration. In at least one embodiment, disease classification refers to one or more tasks that classify potential diseases from image and/or text data. [0562] Examples of genomic analyses that may be performed using systems and processes described herein include, without limitation, variant calling, mutation detection, and gene expression quantification);
wherein the adjusting the model parameter of the first feature extraction network, the model parameter of the second feature extraction network, the model parameter of the third classification network, and the model parameter of the fourth classification network according to the third loss function value and the fourth loss function value comprises: adjusting the model parameter of the first feature extraction network and the model parameter of the third classification network according to the third loss function value; and adjusting the model parameter of the second feature extraction network and the model parameter of the fourth classification network according to the fourth loss function value, while maintaining the model parameter of the first feature extraction network and the model parameter of the third classification network unchanged; or determining a total loss function value according to the third loss function value and the fourth loss function value; and adjusting the model parameter of the first feature extraction network, the model parameter of the second feature extraction network, the model parameter of the third classification network, and the model parameter of the fourth classification network according to the total loss function value. 
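
For illustration only, the two alternative parameter-adjustment schemes recited above (a staged update with the first group held unchanged, or a single update from a total loss) can be contrasted in a toy setting. This is a sketch under stated assumptions: each network is reduced to one scalar parameter, gradients are taken by finite differences, the total loss is assumed to be a plain sum, and the names `feat1`, `cls3`, `feat2`, `cls4`, `staged_step`, and `total_loss_step` are hypothetical.

```python
def numeric_grad(loss_fn, params, name, eps=1e-6):
    """Central finite-difference gradient of loss_fn with respect to one parameter."""
    up = dict(params)
    up[name] += eps
    down = dict(params)
    down[name] -= eps
    return (loss_fn(up) - loss_fn(down)) / (2 * eps)


def update(params, loss_fn, names, lr):
    """Gradient step on the named parameters only; all others stay unchanged."""
    grads = {n: numeric_grad(loss_fn, params, n) for n in names}
    new = dict(params)
    for n in names:
        new[n] -= lr * grads[n]
    return new


GENE_GROUP = ("feat1", "cls3")      # first extraction net + third classifier
SURVIVAL_GROUP = ("feat2", "cls4")  # second extraction net + fourth classifier


def third_loss(p):
    """Toy gene-mutation branch loss."""
    return (p["feat1"] * p["cls3"] - 1.0) ** 2


def fourth_loss(p):
    """Toy survival branch loss; it also reads feat1, which may be frozen."""
    return ((p["feat1"] + p["feat2"]) * p["cls4"] - 2.0) ** 2


def staged_step(p, lr=0.05):
    """Alternative 1: third loss updates the gene group, then the fourth loss
    updates the survival group while the gene group is held fixed."""
    p = update(p, third_loss, GENE_GROUP, lr)
    return update(p, fourth_loss, SURVIVAL_GROUP, lr)


def total_loss_step(p, lr=0.05):
    """Alternative 2: all four parameters follow the summed loss."""
    def total(q):
        return third_loss(q) + fourth_loss(q)
    return update(p, total, GENE_GROUP + SURVIVAL_GROUP, lr)


start = {"feat1": 0.5, "cls3": 0.5, "feat2": 0.5, "cls4": 0.5}
after_stage_one = update(start, third_loss, GENE_GROUP, 0.05)
staged = staged_step(start)
joint = total_loss_step(start)
```

In the staged scheme, `feat1` and `cls3` keep the values they had after the first stage even though the fourth loss depends on `feat1`. In the total-loss scheme, `feat1` also receives gradient from the survival branch, so the two schemes generally end at different parameter values.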

[Image: media_image3.png]

Decuyper teaches inputting the second sample gene mutation prediction information and the second sample gene mutation label information into a third loss function to obtain a third loss function value ([pg. 5 para. 5] The loss is calculated for each task separately on all samples in the batch with known ground truth labels and averaged to a global loss which is backpropagated through the network).
Wu teaches inputting the second sample medical data into the second feature extraction network to obtain a second sample survival prediction information corresponding to the second sample medical data ([pg. 2 para. 5] we develop the first multimodal network for NSCLC survival analysis, which takes Deep Learning-based NSCLC survival analysis one step forward by simultaneously considering the textual clinical data and the visual CT clues. As shown in Fig.1, our network is a two-tower paradigm, i.e., clinical tower and a visual tower. The clinical tower is responsible for encoding the clinical data, while the visual tower aims to extract the visual representation from the CT images. Finally, the prediction head fuses the cross-modality features and provides the time prediction);
 inputting the second sample survival prediction information and the second sample survival label information into a fourth loss function to obtain a fourth loss function value (See Figure 6. [pg. 3 para. 8] the parameters of two towers are jointly optimized by minimizing the distance between the survival time prediction t and ground-truth one t_hat. [pg. 10 para. 2] When training the network, we employ the popular parameter normalization strategy to avoid overfitting, i.e., the second term in Eq. 16, and introduce a hyperparameter λ to balance the main loss and the parameter normalization).
 The following limitations were not found to be taught in the art: wherein obtaining the model parameter of the second feature extraction network, the model parameter of the third classification network and the model parameter of the fourth classification network by joint training comprises: acquiring third sample data, wherein the third sample data comprises third sample medical image data, a second sample gene mutation label information corresponding to the third sample medical image data, second sample medical data, and a second sample survival label information corresponding to the second sample medical data, and the second sample medical data comprises second sample medical text data and third sample medical image data. 

Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jacqueline R Zak whose telephone number is (571)272-4077. The examiner can normally be reached M-F 9-5. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emily Terrell, can be reached at (571) 270-3717. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JACQUELINE R ZAK/Examiner, Art Unit 2666                                                                                                                                                                                                        
/EMILY C TERRELL/Supervisory Patent Examiner, Art Unit 2666
Prosecution Timeline

Sep 12, 2023: Application Filed
Sep 16, 2025: Non-Final Rejection mailed (§103)
Dec 05, 2025: Response Filed
Feb 17, 2026: Final Rejection mailed (§103)
Apr 13, 2026: Response after Non-Final Action
May 13, 2026: Request for Continued Examination
May 18, 2026: Response after Non-Final Action
Precedent Cases

Applications granted by this same examiner with similar technology

17/956,679, Patent 12632957: METHODS AND SYSTEMS FOR USE IN PROCESSING IMAGES RELATED TO CROPS (3y 7m to grant; granted May 19, 2026)
17/987,574, Patent 12632932: IMAGE PROCESSING DEVICE AND OPERATION METHOD THEREOF (3y 6m to grant; granted May 19, 2026)
18/175,738, Patent 12586340: PIXEL PERSPECTIVE ESTIMATION AND REFINEMENT IN AN IMAGE (3y 0m to grant; granted Mar 24, 2026)
18/012,667, Patent 12462343: MEDICAL DIAGNOSTIC APPARATUS AND METHOD FOR EVALUATION OF PATHOLOGICAL CONDITIONS USING 3D OPTICAL COHERENCE TOMOGRAPHY DATA AND IMAGES (2y 10m to grant; granted Nov 04, 2025)
17/924,432, Patent 12373946: ASSAY READING METHOD (2y 8m to grant; granted Jul 29, 2025)
Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 2-3
Grant Probability: 53%
With Interview (-4.5%): 48%
Median Time to Grant: 3y 1m (~5m remaining)
PTA Risk: Moderate
Based on 17 resolved cases by this examiner. Grant probability derived from career allowance rate.
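The arithmetic behind these two percentages can be reproduced from the figures shown elsewhere on this page (9 granted out of 17 resolved cases, and an interview lift of -4.5 percentage points). The sketch below assumes the lift is simply added to the career allowance rate and the result rounded to a whole percent; the page does not state its exact method, and the function name `projected_grant_percentages` is hypothetical.

```python
def projected_grant_percentages(granted: int, resolved: int, interview_lift_pts: float):
    """Return (grant %, grant % with interview), each rounded to a whole percent.

    Assumption: the interview lift is added in percentage points to the
    career allowance rate; the page's actual method is not documented here.
    """
    if resolved <= 0:
        raise ValueError("resolved must be positive")
    grant_pct = 100 * granted / resolved  # career allowance rate in percent
    with_interview_pct = grant_pct + interview_lift_pts
    return round(grant_pct), round(with_interview_pct)


base_pct, interview_pct = projected_grant_percentages(9, 17, -4.5)
print(base_pct, interview_pct)  # prints "53 48"
```

With only 17 resolved cases behind the rate, a single additional grant or abandonment would move the figure by several points, so both numbers are best read as rough estimates.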