Last updated: May 29, 2026
Application No. 18/089,767
DIAGNOSTIC ASSISTANCE METHOD AND DEVICE

Final Rejection §103
Filed: Dec 28, 2022
Priority: Jun 29, 2020 (RE 10-2020-0079142, +1 more)
Examiner: GEBRESLASSIE, WINTA
Art Unit: 2677
Tech Center: 2600 (Communications)
Assignee: Medi Whale Inc.
OA Round: 2 (Final)

Interview optional: the examiner has a relatively high allowance rate (76%) and a +25.0% interview lift. A written response may suffice.
Based on 135 resolved cases, 2023–2026
Examiner Intelligence

GEBRESLASSIE, WINTA
Career Allowance Rate: 76% (102 granted / 135 resolved), above average, +13.6% vs TC avg
Interview Lift: +25.0% (resolved cases with interview)
Typical timeline: 2y 6m average prosecution; 25 currently pending
Career history: 189 total applications across all art units
Statute-Specific Performance

§103: 94.4% (+54.4% vs TC avg)
§102: 3.3% (-36.7% vs TC avg)
§112: 1.1% (-38.9% vs TC avg)
Tech Center averages are estimates. Based on career data from 135 resolved cases.
Office Action

§103
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “obtaining unit,” “processing unit,” and “pre-processing unit” in claims 1, 7, 12, and 18.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
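The three-prong test quoted above amounts to a small decision procedure, which can be sketched in Python. The sketch is illustrative only: the `Limitation` record, its boolean fields, and the `invokes_112f` function are hypothetical names introduced here, and each flag stands in for a judgment the examiner makes by reading the claim and the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limitation:
    """Hypothetical record of an examiner's findings for one claim limitation."""
    text: str
    uses_means_or_step: bool            # literal "means" or "step"
    uses_generic_placeholder: bool      # nonce term such as "unit" or "module"
    has_functional_language: bool       # e.g. "for", "configured to", "so that"
    recites_sufficient_structure: bool  # structure, material, or acts for the function


def invokes_112f(lim: Limitation) -> bool:
    """Return True when prongs (A), (B), and (C) are all met."""
    prong_a = lim.uses_means_or_step or lim.uses_generic_placeholder
    prong_b = lim.has_functional_language
    prong_c = not lim.recites_sufficient_structure
    return prong_a and prong_b and prong_c


# Assumed findings for a limitation like the ones listed in this action.
obtaining_unit = Limitation(
    text="eye image obtaining unit configured to obtain a target eye image",
    uses_means_or_step=False,
    uses_generic_placeholder=True,
    has_functional_language=True,
    recites_sufficient_structure=False,
)
print(invokes_112f(obtaining_unit))  # True
```

Setting `recites_sufficient_structure=True`, which corresponds to either option offered to the applicant above, makes prong (C) fail, so the function returns `False`.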

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-8 and 10-20 are rejected under 35 U.S.C. 103 as being unpatentable over Peng et al. (US 20190180441 A1) in view of Zhou et al. (US 20180225822 A1).
Regarding claim 1, Peng et al. teaches a diagnosis assistance apparatus which uses a neural network model comprising at least one neural network layer and is configured to obtain diagnosis assistance information based on an eye image (see Abstract; “Methods, systems, and apparatus…..for processing fundus images using fundus image processing machine learning models. One of the methods includes obtaining a model input comprising one or more fundus images, each fundus image being an image of a fundus of an eye of a patient; processing the model input using a fundus image processing machine learning model, wherein the fundus image processing machine learning model is configured to process the model input comprising the one or more fundus image to generate a model output; and processing the model output to generate health analysis data” Note: health analysis data implies diagnosis assistance), the diagnosis assistance apparatus comprising: an eye image obtaining unit configured to obtain a target eye image which is obtained from eyes of a subject (see para [0038]; “the fundus image data includes multiple fundus images that capture the current state of the patient's fundus. For example, the fundus image data can include one or more images of the fundus in the patient's left eye and one or more images of the fundus in the patient's right eye”); and a processing unit configured to use a neural network model trained to obtain diagnosis assistance information based on the eye image, and obtain the diagnosis assistance information based on the target eye image (see para [0043]; “machine learning model that has been configured by being trained on appropriately labeled training data to process the fundus image data and, optionally, the other patient data to generate a model output that characterizes a particular aspect of the patient's health. For example, the fundus image processing machine learning model may be a deep convolutional neural network. 
An example of a deep convolutional neural network that can be trained to process a fundus image to generate the model outputs”, see also para [0102]; “the other patient data using a fundus image processing machine learning model to generate a predicted fundus image (step 704)….The predicted fundus image is an image of the fundus of the eye of the patient”), wherein the neural network model comprises: first diagnosis assistance neural network model configured to obtain first diagnosis assistance information based on the target eye image (see para [0048]; “the model output is a prediction of the risk of a particular health event occurring in the future. A model output that is a prediction of the risk of a particular event occurring is described in more detail below with reference to FIG. 8”); and second diagnosis assistance neural network model configured to obtain second diagnosis assistance information which is different from the first diagnosis assistance information, based on the target eye image (see also para [0050]; “the model output is a prediction of values of factors that contribute to a particular kind of health-related risk. A model output that is a prediction of values of risk factors is described in more detail below with reference to FIG. 10”, and para [0069]; “in the case of glaucoma, the single score may represent a likelihood that the patient has glaucoma”, Note: Fig. 8 risk of a particular health event and Fig. 10 values of risk factors (i.e., glaucoma) are different diagnosis assistance information from the same fundus images).
However, Peng et al. does not teach wherein the first diagnosis assistance neural network model comprises: first common portion configured to obtain first feature set based on the target eye image; and first individual portion configured to obtain the first diagnosis assistance information based on the first feature set, wherein the second diagnosis assistance neural network model comprises: the first common portion configured to obtain the first feature set based on the target eye image; and second individual portion configured to obtain the second diagnosis assistance information based on the first feature set, wherein the first individual portion is trained based on first training data, and the second individual portion is trained based on second training data which is different from the first training data at least in part.
In the same field of endeavor, Zhou et al. teaches wherein the first diagnosis assistance neural network model comprises: first common portion configured to obtain first feature set based on the target eye image (see para [0043]; “As shown in FIG. 4, common module F1 is shared for determining each of the outputs. Module F2 is an analysis-specific module (e.g., for landmark detection) to determine output image J. Multi-task learning framework 400 also include module G2, a self-reconstruction module for reconstructing input image I as output image I, such that input image I becomes the label” Note: the F1 is the shared feature extractor (first common portion)); and first individual portion configured to obtain the first diagnosis assistance information based on the first feature set (see para [0043]; “Module F2 is an analysis-specific module (e.g., for landmark detection) to determine output image J. Multi-task learning framework 400 also include module G2, a self-reconstruction module for reconstructing input image I as output image I, such that input image I becomes the label” Note: the produced per task output implies the “first individual portion”) wherein the second diagnosis assistance neural network model comprises: the first common portion configured to obtain the first feature set based on the target eye image, and second individual portion configured to obtain the second diagnosis assistance information based on the first feature set (see para [0043]; “As shown in FIG. 4, common module F1 is shared for determining each of the outputs. Module F2 is an analysis-specific module (e.g., for landmark detection) to determine output image J. 
Multi-task learning framework 400 also include module G2, a self-reconstruction module for reconstructing input image I as output image I, such that input image I becomes the label” Note: the same shared F1 feeds another analysis-specific head (another F2) for a different output); wherein the first individual portion is trained based on first training data, and the second individual portion is trained based on second training data which is different from the first training data at least in part (see para [0047]; “each dataset of training medical imaging data is denoted as Data Set. Modality. Anatomy. Task, or Data Set. MAT, to indicate the medical imaging analysis that the dataset is to be used for training the target network, Net. MAT. Accordingly, such annotation defines the nodes of the target network Net. MAT that the dataset is used to train” Note: different datasets train the different F2 branches).
Accordingly, it would have been obvious to one of ordinary skill in the art at the time of the invention to modify the apparatus of Peng et al. for processing fundus images using fundus image processing machine learning models in view of Zhou et al.'s use of a neural network trained to perform a plurality of medical image analyses, in order to improve efficiency and generalization across related medical imaging tasks while enabling multiple outputs from a single image input (see para [0043]).
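The structure at issue in claim 1, as the Office action maps it onto Zhou's FIG. 4, is one shared feature extractor feeding two task-specific heads that are trained on different data. A minimal sketch in plain Python follows. It is not taken from the application, Peng, or Zhou: the class `SharedHeadsModel`, the function `train_head`, the fixed weights, and the four-value "images" are all hypothetical, chosen so the arithmetic is easy to follow.

```python
def linear(weights, x):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    """Clamp negative values to zero."""
    return [max(0.0, v) for v in x]


class SharedHeadsModel:
    """One common portion feeding two individual portions (heads)."""

    def __init__(self):
        # Common portion: 4 input values -> 2 shared features (assumed weights).
        self.common = [[0.5, 0.5, 0.0, 0.0],
                       [0.0, 0.0, 0.5, 0.5]]
        # Individual portions: each maps the shared features to one score.
        self.head_a = [[0.0, 0.0]]
        self.head_b = [[0.0, 0.0]]

    def features(self, image):
        """First feature set, computed by the common portion."""
        return relu(linear(self.common, image))

    def predict(self, image):
        """Both outputs are derived from the same feature set."""
        f = self.features(image)
        return linear(self.head_a, f)[0], linear(self.head_b, f)[0]


def train_head(model, head, dataset, lr=0.5, epochs=20):
    """Squared-error SGD that updates one head only; the common portion is not touched."""
    for _ in range(epochs):
        for image, target in dataset:
            f = model.features(image)
            err = linear(head, f)[0] - target
            head[0] = [w - lr * err * v for w, v in zip(head[0], f)]


model = SharedHeadsModel()
data_a = [([1.0, 1.0, 0.0, 0.0], 1.0)]  # first training data (made up)
data_b = [([0.0, 0.0, 1.0, 1.0], 2.0)]  # second, different training data (made up)
train_head(model, model.head_a, data_a)
train_head(model, model.head_b, data_b)
first_info, second_info = model.predict([1.0, 1.0, 0.0, 0.0])
```

Freezing the common portion during head training is a simplification made here. The passage of Zhou quoted above describes the self-reconstruction module G2 as helping to learn the common process F1, so a faithful multi-task setup would likely update the shared weights as well.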
Regarding claim 2, the rejection of claim 1 is incorporated herein.
Zhou et al. in the combination further teach wherein the first feature set comprises a plurality of feature values which are associated with the first diagnosis assistance information and the second diagnosis assistance information, wherein the first individual portion is configured to obtain the first diagnosis assistance information based on at least one feature value included in the first feature set, and wherein the second individual portion is configured to obtain the second diagnosis assistance information based on at least one feature value included in the first feature set (see para [0043]; “Multi-task learning aims to learn a network that produces multiple outputs, one for each medical imaging analysis, from an input image I. As shown in FIG. 4, common module F1 is shared for determining each of the outputs. Module F2 is an analysis-specific module (e.g., for landmark detection) to determine output image J. Multi-task learning framework 400 also include module G2, a self-reconstruction module for reconstructing input image I as output image I, such that input image I becomes the label. Accordingly, the self-reconstruction module G2 is leveraged to better learn common process F1, thereby improving the mapping between input image I and output image J” Note: F1 maps to the “first common portion,” which produces a shared feature set used for multiple outputs. F2 maps to a first individual portion that receives the F1 features and outputs the first diagnosis assistance information. G2 maps to a second individual portion that receives the same F1 features and outputs second, different information. The different heads (the first and second individual portions) thus each select the feature values they need from a single “first feature set,” as multi-task CNNs do).
Regarding claim 3, the rejection of claim 2 is incorporated herein.
Peng et al. in the combination further teach wherein the first diagnosis assistance information comprises first information and second information (see para [0048]; “the model output is a prediction of the risk of a particular health event occurring in the future. A model output that is a prediction of the risk of a particular event occurring is described in more detail below with reference to FIG. 8”, see also para [0050]; “the model output is a prediction of values of factors that contribute to a particular kind of health-related risk. A model output that is a prediction of values of risk factors is described in more detail below with reference to FIG. 10” Note: Fig. 8 risk of a particular health event and Fig. 10 values of risk factors (i.e., glaucoma) are different diagnosis assistance information from the same fundus images). 
  Zhou et al. in the combination further teach and wherein the first individual portion comprises: second common portion configured to obtain second feature set which comprises a plurality of feature values associated with the first information and the second information, based at least in part on the first feature set (see para [0031]; “the hierarchical structure is denoted, from narrowest (bottom level) to broadest (top level), as: the target network Net.MAT, SuperNet, UltraNet, and HyperNet”, see also para [0032]; “A SuperNet is denoted by the terminology SuperNet.Modality.Anatomy, or SuperNet.MA. A SuperNet constitutes the common portion among different tasks T for a same modality M and a same anatomy A. For example, the SuperNet.CT.Liver is shared by Net.CT.Liver.Det and Net.CT.Liver.Seg” Note: SNet.MA implies second common portion inside one branch (same modality/anatomy) producing a new shared feature set for multiple task-heads); first sub-portion configured to obtain the first information based at least in part on the second feature set; and second sub-portion configured to obtain the second information based at least in part on the second feature set (see para [0035]; “The target network, Net. MAT, represents a path or branch in neural network 200 associated with a particular modality, anatomy, and task for performing the particular medical imaging analysis. Mathematically, the target network Net.MAT for a medical procedure takes the form: Net(x; W.sub.H, W.sub.M, W.sub.A, W.sub.MAW.sub.MAT)”, see also para [0032]; “A SuperNet is denoted by the terminology SuperNet.Modality.Anatomy, or SuperNet.MA. A SuperNet constitutes the common portion among different tasks T for a same modality M and a same anatomy A. For example, the SuperNet.CT.Liver is shared by Net.CT.Liver.Det and Net.CT.Liver.Seg” Note:  the branch splits into task specific heads i.e., detection vs segmentation which implies first/second sub-portions).  
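The naming convention in the quoted passages of Zhou can be read as a lookup from a task-specific target network to the portion it shares with sibling tasks. The helper below is a hypothetical illustration, not Zhou's code: `supernet_for` is an invented name, and it assumes target names always have the four dot-separated parts `Net.Modality.Anatomy.Task`.

```python
def supernet_for(target_net: str) -> str:
    """Map 'Net.Modality.Anatomy.Task' to its shared 'SuperNet.Modality.Anatomy'."""
    parts = target_net.split(".")
    if len(parts) != 4 or parts[0] != "Net":
        raise ValueError(f"expected Net.Modality.Anatomy.Task, got {target_net!r}")
    _, modality, anatomy, _task = parts
    return f"SuperNet.{modality}.{anatomy}"


# The two target networks named in the quoted paragraph [0032].
det = supernet_for("Net.CT.Liver.Det")
seg = supernet_for("Net.CT.Liver.Seg")
print(det, det == seg)  # SuperNet.CT.Liver True
```

Under the examiner's mapping, the shared SuperNet plays the role of the second common portion, and the detection and segmentation branches play the roles of the first and second sub-portions.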
Regarding claim 4, the rejection of claim 1 is incorporated herein.
  Peng et al. in the combination further teach wherein the first diagnosis assistance information comprises at least one piece of diagnosis assistance information related to an eye disease (see para [0025]; “For a given patient, the fundus image analysis system 100 receives fundus image data 122 that includes one or more fundus images of the patient's eye and generates health analysis data 142 that characterizes the health of the patient”) and the second diagnosis assistance information comprises at least one piece of diagnosis assistance information related to a cerebral cardiovascular disease (see para [0124]; “the system processes the input fundus image data using a fundus image processing machine learning model to generate a respective predicted value for each of one or more risk factors (step 1004)… Each of the risk factors is a factor that contributes to the risk of one of a particular set of health-related events happening to the patient. For example, when the risk is cardiovascular risk”).  
Regarding claim 5, the rejection of claim 1 is incorporated herein.
Peng et al. in the combination further teach wherein the first diagnosis assistance information comprises at least one piece of diagnosis assistance information related to first eye disease, and the second diagnosis assistance information comprises at least one piece of diagnosis assistance information related to second eye disease which is different from the first eye disease (see para [0061]-[0064]; “Generally, the set of condition state scores are specific to a particular medical condition that the system has been configured to analyze…the medical condition is a particular eye-related condition. For example, the particular eye-related condition may be glaucoma. Generally, glaucoma is a condition in which the optic nerve is damaged, which can result in blindness. As another example, the particular eye-related condition may be age-related macular degeneration. Generally, age-related macular degeneration is a condition in which the macula, an area near the center of the retina, has deteriorated, which may cause partial or total vision loss” Note: glaucoma and macular degeneration are different eye diseases).
Regarding claim 6, the rejection of claim 1 is incorporated herein.
  Peng et al. in the combination further teach wherein the first diagnosis assistance information comprises diagnosis assistance information related to glaucoma (see para [0063]; “the particular eye-related condition may be glaucoma. Generally, glaucoma is a condition in which the optic nerve is damaged, which can result in blindness”), and the second diagnosis assistance information comprises diagnosis assistance information related to a coronary artery disease (see para [0124]; “Each of the risk factors is a factor that contributes to the risk of one of a particular set of health-related events happening to the patient. For example, when the risk is cardiovascular risk, the particular set of health-related events can be a health event that is classified as a major cardiovascular health event, e.g., myocardial infarction, heart failure, percutaneous cardiac intervention, coronary artery bypass grafting”).  
Regarding claim 7, the rejection of claim 1 is incorporated herein.
Peng et al. in the combination further teach wherein the processing unit further comprises a pre-processing unit configured to perform pre-processing for emphasizing a blood vessel included in the target eye image, and to obtain a blood vessel-emphasized eye image (see para [0042]; “prior to processing the fundus image data using the machine learning model, the system can pre-process the fundus images. For example, for a given image, the system can apply any of a variety of conventional image processing techniques to the image to improve the quality of the output generated by the machine learning model. As an example, the system may crop, scale, deskew or re-center the image. As another example, the system can remove distortion from the image, e.g., to remove blurring or to re-focus the image, using conventional image processing techniques”, see also para [0066]; “the particular eye-related condition may be ocular occlusions. Generally, an ocular occlusion is the blockage or closing of a blood vessel that carries blood to or from some portion of the eye, e.g., to or from the retina”).
Zhou et al. in the combination further teach and wherein the first common portion is configured to obtain the first feature set based on the blood vessel-emphasized eye image (see para [0043]; “As shown in FIG. 4, common module F1 is shared for determining each of the outputs. Module F2 is an analysis-specific module”).  
Regarding claim 8, the rejection of claim 3 is incorporated herein.
  Peng et al. in the combination further teach wherein the first information and the second information are diagnosis assistance information related to a disease related to first part of a human body (see para [0063]; “For example, the particular eye-related condition may be glaucoma. Generally, glaucoma is a condition in which the optic nerve is damaged, which can result in blindness…. As another example, the particular eye-related condition may be age-related macular degeneration. Generally, age-related macular degeneration is a condition in which the macula, an area near the center of the retina, has deteriorated, which may cause partial or total vision loss”), and the second diagnosis assistance information is diagnosis assistance information related to a disease related to second part of the human body, the second part being different from the first part (see para [0067]; “the specific condition is not an eye-related condition but is instead a neurodegenerative condition, e.g., Parkinson's or Alzheimer's, or another condition that can effectively be analyzed using fundus imagery”, see also para [0109]; “the other patient data using a fundus image processing machine learning model to generate a set of risk scores (step 804)…the set of risk scores includes a single score that measures a particular kind of risk. For example, the score may measure a predicted cardiovascular risk of the patient”).  
Regarding claim 10, the rejection of claim 1 is incorporated herein.
  Peng et al. in the combination further teach wherein the first feature set comprises at least one feature map (see para [0055]; “a set of convolutional neural network layers 162, followed by a set of fully connected layers 164, and an output layer 166”, see also para [0132]; “The initial convolutional layers process each fundus image in the fundus image data to extract a respective feature vector for each of multiple regions in the fundus image” Note: initial convolution layers inherently produce feature maps).  
Regarding claim 11, the rejection of claim 3 is incorporated herein.
  Peng et al. in the combination further teach wherein the first feature set comprises at least one feature map, and the second feature set comprises at least one feature value (see para [0132]; “the initial convolutional layers process each fundus image in the fundus image data to extract a respective feature vector for each of multiple regions in the fundus image. The attention mechanism determines an attention weight for each of the regions in the fundus image and then attends to the feature vectors in accordance with the corresponding attention weights to generate an attention output” see also para [0056]; “the model output is a set of scores 170, with each score being generated by a corresponding node in the output layer 166. As will be described in more detail below, in some cases, the set of scores 170 are specific to particular medical condition” Note: initial convolution layers inherently produce feature maps and output feature score “feature values”).  
Regarding claim 12, Peng et al. teaches a method for assisting a diagnosis by using a diagnosis assistance apparatus, the diagnosis assistance apparatus comprising an eye image obtaining unit configured to obtain an eye image (see Abstract; “Method for processing fundus images using fundus image processing machine learning models. One of the methods includes obtaining a model input comprising one or more fundus images, each fundus image being an image of a fundus of an eye of a patient; processing the model input …. to generate a model output; and processing the model output to generate health analysis data” Note: health analysis data implies diagnosis assistance), and a processing unit configured to obtain diagnosis assistance information based on the eye image by using a neural network model, the neural network model comprising at least one neural network layer and being trained to obtain the diagnosis assistance information based on the eye image (see para [0043]; “the fundus image processing machine learning model is a feedforward machine learning model that has been configured by being trained on appropriately labeled training data to process the fundus image data and, optionally, the other patient data to generate a model output that characterizes a particular aspect of the patient's health. For example, the fundus image processing machine learning model may be a deep convolutional neural network”), wherein the neural network model comprises: first diagnosis assistance neural network model configured to obtain first diagnosis assistance information based on the eye image (see para [0048]; “the model output is a prediction of the risk of a particular health event occurring in the future. A model output that is a prediction of the risk of a particular event occurring is described in more detail below with reference to FIG. 8”); and second diagnosis assistance neural network model configured to obtain second diagnosis assistance information based on the eye image (see para [0050]; “the model output is a prediction of values of factors that contribute to a particular kind of health-related risk. A model output that is a prediction of values of risk factors is described in more detail below with reference to FIG. 10”, see also para [0069]; “For example, in the case of glaucoma, the single score may represent a likelihood that the patient has glaucoma” Note: Fig. 8 risk of a particular health event and Fig. 10 values of risk factors (i.e., glaucoma) are different diagnosis assistance information from the same fundus images), wherein the diagnosis assistance method comprises: obtaining, by the eye image obtaining unit, a target eye image which is obtained from eyes of a subject (see para [0038]; “the fundus image data includes multiple fundus images that capture the current state of the patient's fundus. For example, the fundus image data can include one or more images of the fundus in the patient's left eye and one or more images of the fundus in the patient's right eye”).
However, Peng et al. does not teach wherein the first diagnosis assistance neural network model comprises first common portion and first individual portion, and the second diagnosis assistance neural network model comprises the first common portion and second individual portion, wherein the diagnosis assistance method comprises: obtaining, by the processing unit, a first feature set based on the target eye image through the first common portion; obtaining, by the processing unit, the first diagnosis assistance information based at least in part on the first feature set through the first individual portion; and obtaining, by the processing unit, the second diagnosis assistance information based at least in part on the first feature set through the second individual portion, wherein the first individual portion is trained based on first training data, and the second individual portion is trained based on second training data which is different from the first training data at least in part.
In the same field of endeavor Zhou et al. teaches wherein the first diagnosis assistance neural network model comprises first common portion and first individual portion (see para [0043]; “FIG. 4 shows a multi-task learning framework 400, in accordance with one or more embodiments. Multi-task learning aims to learn a network that produces multiple outputs, one for each medical imaging analysis, from an input image I. As shown in FIG. 4, common module F1 is shared for determining each of the outputs. Module F2 is an analysis-specific module (e.g., for landmark detection) to determine output image J. Multi-task learning framework 400 also include module G2, a self-reconstruction module for reconstructing input image I as output image I, such that input image I becomes the label. Accordingly, the self-reconstruction module G2 is leveraged to better learn common process F1, thereby improving the mapping between input image I and output image J”), and the second diagnosis assistance neural network model comprises the first common portion and second individual portion (see para [0043]; “As shown in FIG. 4, common module F1 is shared for determining each of the outputs. Module F2 is an analysis-specific module (e.g., for landmark detection) to determine output image J. Multi-task learning framework 400 also include module G2, a self-reconstruction module for reconstructing input image I as output image I, such that input image I becomes the label. 
Accordingly, the self-reconstruction module G2 is leveraged to better learn common process F1, thereby improving the mapping between input image I and output image J”, Note: the same shared F1 feeds another analysis-specific head (another F2) for a different output), obtaining, by the processing unit, a first feature set based on the target eye image through the first common portion (see para [0043]; “Multi-task learning aims to learn a network that produces multiple outputs, one for each medical imaging analysis, from an input image I. As shown in FIG. 4, common module F1 is shared for determining each of the outputs” Note: common module F1 generates shared features for multiple outputs); obtaining, by the processing unit, the first diagnosis assistance information based at least in part on the first feature set through the first individual portion (see para [0043]; “Module F2 is an analysis-specific module (e.g., for landmark detection) to determine output image J. Multi-task learning framework 400 also include module G2, a self-reconstruction module for reconstructing input image I as output image I, such that input image I becomes the label”); and obtaining, by the processing unit, the second diagnosis assistance information based at least in part on the first feature set through the second individual portion (see para [0043]; “As shown in FIG. 4, common module F1 is shared for determining each of the outputs. Module F2 is an analysis-specific module (e.g., for landmark detection) to determine output image J. Multi-task learning framework 400 also include module G2, a self-reconstruction module for reconstructing input image I as output image I, such that input image I becomes the label. 
Accordingly, the self-reconstruction module G2 is leveraged to better learn common process F1, thereby improving the mapping between input image I and output image J”, Note: the same shared F1 feeds another analysis-specific head (another F2) for a different output), wherein the first individual portion is trained based on first training data, and the second individual portion is trained based on second training data which is different from the first training data at least in part (see para [0041]; “At step 304, datasets of input training medical imaging data are received. Each of the datasets are associated with one of a plurality of medical imaging analyses, and are therefore associated with a particular modality, anatomy, and task. It should be understood that at least some of the input training medical imaging data in the datasets may be associated with multiple datasets”, see also para [0047]; “each dataset of training medical imaging data is denoted as Data Set. Modality. Anatomy. Task, or Data Set. MAT, to indicate the medical imaging analysis that the dataset is to be used for training the target network, Net. MAT….a neural network is trained to perform the plurality of medical imaging analyses based on the datasets of input training medical imaging data and the corresponding output training medical imaging data”).  Accordingly, it would have been obvious to one of ordinary skill in the art at the time of the invention to modify an apparatus for processing fundus images using fundus image processing machine learning models of Peng et al. in view of the use of a neural network trained to perform the plurality of medical image analyses of Zhou et al. in order to improve efficiency and generalization  across related medical imaging tasks while enabling multiple outputs from a single image input (see para [0043]).
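For illustration only, the arrangement the examiner reads onto claim 12 (one shared common portion, in the manner of Zhou's common module F1, feeding two separate individual portions) can be sketched as follows. This is a minimal sketch and is not taken from Peng, Zhou, or the application: the names `common_portion` and `make_individual_portion`, the toy row and column mean features, and the hand-picked weights are all assumptions standing in for trained networks.

```python
from typing import Callable, List

Image = List[List[float]]
Features = List[float]


def common_portion(image: Image) -> Features:
    """Hypothetical shared extractor (stands in for a common module such as F1).

    Returns the mean of each row followed by the mean of each column.
    """
    row_means = [sum(row) / len(row) for row in image]
    col_means = [sum(col) / len(col) for col in zip(*image)]
    return row_means + col_means


def make_individual_portion(weights: Features, bias: float) -> Callable[[Features], float]:
    """Build a hypothetical task-specific head: a weighted sum of the shared features."""
    def head(features: Features) -> float:
        return sum(w * f for w, f in zip(weights, features)) + bias
    return head


# Assumed parameters. In practice each head would be trained on its own
# (at least partly different) training data.
first_individual = make_individual_portion([1.0, 0.0, 0.0, 0.0], bias=0.0)
second_individual = make_individual_portion([0.0, 1.0, 0.0, 1.0], bias=-1.0)

target_image = [[0.0, 1.0], [1.0, 1.0]]  # toy 2x2 stand-in for an eye image

first_feature_set = common_portion(target_image)    # computed once
first_info = first_individual(first_feature_set)    # first diagnosis output
second_info = second_individual(first_feature_set)  # second diagnosis output

print(first_feature_set, first_info, second_info)
```

The sketch computes the first feature set once and reuses it in both individual portions, which differ only in parameters that would come from different training data.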
Regarding claim 13, the rejection of claim 12 is incorporated herein.
  Peng et al. in the combination further teach wherein the first diagnosis assistance information comprises first information and second information (see para [0060]-[0061]; “the other patient data using a fundus image processing machine learning model to generate a set of condition state scores (step 304)..the set of condition state scores are specific to a particular medical condition that the system has been configured to analyze”, see also para [0084]-[0085]; “The set of follow-up scores includes a respective score for each of multiple possible follow-up actions that can be taken by the patient to treat a particular medical condition.. The system generates health analysis data from the follow-up scores (step 406). For example, the system can generate health analysis data that recommends that the patient take the follow-up action that has the highest follow-up score”).
  Zhou et al. in the combination further teach and the first individual portion comprises second common portion, first sub-portion and second sub-portion (see para [0032]; “A Super Net is denoted by the terminology Super Net. Modality. Anatomy, or SuperNet.MA. A Super Net constitutes the common portion among different tasks T for a same modality M and a same anatomy A”, see also para [0035]; “The target network, Net. MAT, represents a path or branch in neural network 200 associated with a particular modality, anatomy, and task for performing the particular medical imaging analysis”), wherein obtaining the first diagnosis assistance information comprises: obtaining, by the second common portion, second feature set which is associated with the first information and the second information, based at least in part on the first feature set (see para [0043]; “As shown in FIG. 4, common module F1 is shared for determining each of the outputs. Module F2 is an analysis-specific module (e.g., for landmark detection) to determine output image J”, see also para [0056]; “FIG. 6 shows an exemplary cascade sharing mechanism 600.. HNet weight, W.sub.H(1), is also cascaded down to learn UNet weights W.sub.M(1) and W.sub.A(1) for layer 1. One or more of UNet weights W.sub.M(1) and W.sub.A(1) are cascaded down to learn SNet weight W.sub.MA(1), which is cascaded down to learn target network Net.MAT weight W.sub.MAT.sub.1(1), . . . , W.sub.MAT.sub.k(1) for tasks 1 through k”); obtaining, by the first sub-portion, the first information based at least in part on the second feature set (see para [0035]; “The target network, Net. MAT, represents a path or branch in neural network 200 associated with a particular modality, anatomy, and task for performing the particular medical imaging analysis”, see also para [0043]; “Module F2 is an analysis-specific module (e.g., for landmark detection) to determine output image J”); and obtaining, by the second sub-portion, the second information based at least in part on the second feature set (see para [0043]; “As shown in FIG. 4, common module F1 is shared for determining each of the outputs. Module F2 is an analysis-specific module (e.g., for landmark detection) to determine output image J” Note: second branch/head).
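The nested structure recited in claim 13 (a first common portion, then a second common portion inside the first individual portion, then two sub-portions) can be pictured with the hedged sketch below. It is an illustration under stated assumptions, not an implementation from either reference: every function name is hypothetical, and the simple statistics (mean, maximum, minimum) merely stand in for learned layers.

```python
from typing import List, Tuple

Image = List[List[float]]


def first_common(image: Image) -> List[float]:
    """Hypothetical first common portion: flattens the image into a first feature set."""
    return [pixel for row in image for pixel in row]


def second_common(first_feature_set: List[float]) -> List[float]:
    """Hypothetical second common portion: reduces the first feature set to
    a second feature set of two values (mean and maximum)."""
    mean = sum(first_feature_set) / len(first_feature_set)
    return [mean, max(first_feature_set)]


def first_sub_portion(second_feature_set: List[float]) -> float:
    """Hypothetical first sub-portion: yields the first information."""
    return second_feature_set[0]


def second_sub_portion(second_feature_set: List[float]) -> float:
    """Hypothetical second sub-portion: yields the second information."""
    return second_feature_set[1] - second_feature_set[0]


def first_individual(first_feature_set: List[float]) -> Tuple[float, float]:
    """First individual portion = second common portion + two sub-portions."""
    second_feature_set = second_common(first_feature_set)
    return first_sub_portion(second_feature_set), second_sub_portion(second_feature_set)


def second_individual(first_feature_set: List[float]) -> float:
    """Hypothetical second individual portion, fed directly by the first feature set."""
    return min(first_feature_set)


target_image = [[0.25, 0.75], [0.5, 0.5]]  # toy values chosen to be exact in binary
features = first_common(target_image)
first_information, second_information = first_individual(features)
second_diagnosis = second_individual(features)

print(first_information, second_information, second_diagnosis)
```

Both sub-portions read the same second feature set, while the second individual portion branches off earlier, at the first feature set.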
Regarding claim 14, the rejection of claim 13 is incorporated herein.
  Peng et al. in the combination further teach wherein the first feature set comprises at least one feature map, and the second feature set comprises at least one feature value (see para [0131]; “the machine learning model is a model that includes one or more initial convolutional layers followed by an attention mechanism, which in turn is followed by one or more additional neural network layers… the initial convolutional layers process each fundus image in the fundus image data to extract a respective feature vector for each of multiple regions in the fundus image. The attention mechanism determines an attention weight for each of the regions in the fundus image and then attends to the feature vectors in accordance with the corresponding attention weights to generate an attention output”, see also para [0056]; “the model output is a set of scores 170, with each score being generated by a corresponding node in the output layer 166. As will be described in more detail below, in some cases, the set of scores 170 are specific to particular medical condition” Note: the initial convolutional layers inherently produce feature maps, and the output scores correspond to the claimed “feature values”).
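The passage quoted from Peng describes per-region feature vectors combined according to attention weights. The sketch below shows one common way such a step can be written. It is an assumption-laden illustration: the quoted text does not say how the attention weights are computed, so the softmax normalization, the function names, and the toy inputs are all hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Turn raw per-region scores into weights that sum to 1 (assumed normalization)."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features: List[List[float]], region_scores: List[float]) -> List[float]:
    """Weighted sum of per-region feature vectors using the attention weights."""
    weights = softmax(region_scores)
    dim = len(region_features[0])
    return [
        sum(w * feat[i] for w, feat in zip(weights, region_features))
        for i in range(dim)
    ]


# Two hypothetical image regions, each with a 2-element feature vector.
region_features = [[1.0, 0.0], [0.0, 1.0]]
region_scores = [0.0, 0.0]  # equal scores give equal attention weights

attention_weights = softmax(region_scores)
attention_output = attend(region_features, region_scores)

print(attention_weights, attention_output)
```

With equal scores the attention output reduces to the plain average of the region feature vectors. Unequal scores would shift the output toward the higher-scored regions.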
Regarding claim 15, the rejection of claim 13 is incorporated herein.
  Peng et al. in the combination further teach wherein the first information and the second information are diagnosis assistance information related to a disease related to first part of a human body, and the second diagnosis assistance information is diagnosis assistance information related to a disease related to second part of the human body, the second part being different from the first part (see para [0067]; “the specific condition is not an eye-related condition but is instead a neurodegenerative condition, e.g., Parkinson's or Alzheimer's, or another condition that can effectively be analyzed using fundus imagery”, see also para [0072]; “in the case of ocular occlusions, the single score may represent a likelihood that the patient has one or more ocular occlusions”).  
Regarding claim 16, the rejection of claim 12 is incorporated herein.
  Peng et al. in the combination further teach wherein the first diagnosis assistance information comprises at least one piece of diagnosis assistance information related to first eye disease, and the second diagnosis assistance information comprises at least one piece of diagnosis assistance information related to second eye disease which is different from the first eye disease (see para [0069]; “in the case of glaucoma, the single score may represent a likelihood that the patient has glaucoma”, see also para [0072]; “in the case of ocular occlusions, the single score may represent a likelihood that the patient has one or more ocular occlusions”).
Regarding claim 17, the rejection of claim 12 is incorporated herein.
  Peng et al. in the combination further teach wherein the first feature set comprises at least one feature map (see para [0131]; “the machine learning model is a model that includes one or more initial convolutional layers followed by an attention mechanism, which in turn is followed by one or more additional neural network layers. The initial convolutional layers process each fundus image in the fundus image data to extract a respective feature vector for each of multiple regions in the fundus image”).
Regarding claim 18, the rejection of claim 12 is incorporated herein.
  Peng et al. in the combination further teach wherein the processing unit further comprises a pre-processing unit configured to perform pre-processing for emphasizing a blood vessel included in the target eye image and to obtain a blood vessel-emphasized eye image (see para [0042]; “prior to processing the fundus image data using the machine learning model, the system can pre-process the fundus images. For example, for a given image, the system can apply any of a variety of conventional image processing techniques to the image to improve the quality of the output generated by the machine learning model. As an example, the system may crop, scale, deskew or re-center the image. As another example, the system can remove distortion from the image, e.g., to remove blurring or to re-focus the image, using conventional image processing techniques”, see also para [0066]; “the particular eye-related condition may be ocular occlusions. Generally, an ocular occlusion is the blockage or closing of a blood vessel that carries blood to or from some portion of the eye, e.g., to or from the retina”).
Zhou et al. in the combination further teach and wherein obtaining the first feature set comprises obtaining the first feature set based on the blood vessel-emphasized eye image through the first common portion (see para [0043]; “As shown in FIG. 4, common module F1 is shared for determining each of the outputs. Module F2 is an analysis-specific module”).  
Regarding claim 19, the rejection of claim 12 is incorporated herein.
  Peng et al. in the combination further teach wherein the first diagnosis assistance information comprises at least one piece of diagnosis assistance information related to an eye disease (see para [0069]; “in the case of glaucoma, the single score may represent a likelihood that the patient has glaucoma”), and the second diagnosis assistance information comprises at least one piece of diagnosis assistance information related to a cerebral cardiovascular disease (see para [0110]; “the set of risk scores includes a single score that measures a particular kind of risk. For example, the score may measure a predicted cardiovascular risk of the patient, e.g., may be a predicted Framingham risk score that measures the 10-year cardiovascular risk of the patient”).  
Regarding claim 20, the rejection of claim 12 is incorporated herein.
  Peng et al. in the combination further teach computer-readable recording medium having a program recorded thereon to perform the method (see para [0145]; “Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices”).
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Peng et al. in view of Zhou et al. as applied to claims 1 and 3 above, and further in view of Zee et al. (US 20120257164 A1).
Regarding claim 9, the rejection of claim 3 is incorporated herein.
  Peng et al. in the combination further teach wherein the first information is diagnosis assistance information indicating whether the eyes of the subject correspond to glaucoma (see para [0063]; “the particular eye-related condition may be glaucoma. Generally, glaucoma is a condition in which the optic nerve is damaged, which can result in blindness”), and wherein the second diagnosis assistance information is diagnosis assistance information indicating a degree of calcification of a coronary artery of the subject (see para [0124]; “when the risk is cardiovascular risk, the particular set of health-related events can be a health event that is classified as a major cardiovascular health event, e.g., myocardial infarction, heart failure, percutaneous cardiac intervention, coronary artery bypass grafting”).  However, the combination of Peng et al. and Zhou et al. as a whole does not teach and the second information is diagnosis assistance information indicating whether the eyes of the subject correspond to diabetic retinopathy.  
In the same field of endeavor Zee et al. teaches and the second information is diagnosis assistance information indicating whether the eyes of the subject correspond to diabetic retinopathy (see para [0058]; “detecting and analyzing abnormal patterns related to diabetic retinopathy in the preprocessed image”). Accordingly, it would have been obvious to one of ordinary skill in the art at the time of the invention to modify an apparatus for processing fundus images using fundus image processing machine learning models of Peng et al. in view of the use of a neural network trained to perform the plurality of medical image analyses of Zhou et al. and devices for diagnosing and/or predicting the presence, progression and/or treatment effect of a disease characterized by retinal pathological changes in a subject of Zee et al. in order to provide disease risk prediction based on their complexity of characteristics (see para [0058]).

Conclusion

	
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WINTA GEBRESLASSIE whose telephone number is (571)272-3475. The examiner can normally be reached Monday-Friday, 9:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Bee, can be reached at 571-270-5180. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/WINTA GEBRESLASSIE/
Examiner, Art Unit 2677
Prosecution Timeline

Dec 28, 2022: Application Filed
Oct 01, 2025: Non-Final Rejection mailed (§103)
Jan 28, 2026: Response Filed
May 27, 2026: Final Rejection mailed (§103, current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/710,872 (Patent 12579683): IMAGE VIEW ADJUSTMENT. 3y 11m to grant; granted Mar 17, 2026
17/876,145 (Patent 12573238): BIOMETRIC FACIAL RECOGNITION AND LIVENESS DETECTOR USING AI COMPUTER VISION. 3y 7m to grant; granted Mar 10, 2026
18/177,769 (Patent 12530768): SYSTEMS AND METHODS FOR IMAGE STORAGE. 2y 10m to grant; granted Jan 20, 2026
17/923,954 (Patent 12524932): MACHINE LEARNING IMAGE RECONSTRUCTION. 3y 2m to grant; granted Jan 13, 2026
18/196,332 (Patent 12511861): DETECTION OF ANNOTATED REGIONS OF INTEREST IN IMAGES. 2y 7m to grant; granted Dec 30, 2025
Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 76%
With Interview (+25.0%): 99%
Median Time to Grant: 2y 6m (~0m remaining)
PTA Risk: Moderate
Based on 135 resolved cases by this examiner. Grant probability derived from career allowance rate.