Prosecution Insights
Last updated: April 19, 2026
Application No. 18/476,355

SELF-SUPERVISED TRAINING AT SCALE WITH WEAKLY-SUPERVISED LATENT SPACE STRUCTURE

Status: Final Rejection (§103)
Filed: Sep 28, 2023
Examiner: WELLS, HEATH E
Art Unit: 2664
Tech Center: 2600 (Communications)
Assignee: Siemens Healthineers AG
OA Round: 2 (Final)

Predictions:
Grant Probability: 75% (Favorable)
Expected OA Rounds: 3-4
Estimated Time to Grant: 3y 5m
Grant Probability with Interview: 93%

Examiner Intelligence

Career Allowance Rate: 75% (58 granted / 77 resolved; +13.3% vs. TC average; above average)
Interview Lift: +18.1% allowance among resolved cases with an interview
Average Prosecution: 3y 5m
Currently Pending: 46 applications
Career History: 123 total applications across all art units

Statute-Specific Performance

§101: 17.8% (-22.2% vs. TC avg)
§103: 62.8% (+22.8% vs. TC avg)
§102: 2.4% (-37.6% vs. TC avg)
§112: 13.8% (-26.2% vs. TC avg)

Tech Center averages are estimates. Based on career data from 77 resolved cases.

Office Action (§103)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

The reply filed on 8 January 2026 has been entered. Applicant's arguments with respect to claims 1, 3-4, 6-9, 11-12, 14 and 16-20 have been considered but are moot in view of new ground(s) of rejection caused by the amendments. Claims 1, 3-9, 11-14 and 16-20 are pending in this application and have been considered below. Claims 2, 10 and 15 are canceled by the applicant. Claims 5 and 13, having been reconsidered in light of the remarks and amendments, are objected to.

Information Disclosure Statement

The IDS dated 28 September 2025 that has been previously considered remains of record in the application file.

Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):

(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) is invoked. As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f): (A) the claim limitation uses the term "means" or "step" or a term used as a substitute for "means" that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; (B) the term "means" or "step" or the generic placeholder is modified by functional language, typically, but not always, linked by the transition word "for" (e.g., "means for") or another linking word or phrase, such as "configured to" or "so that"; and (C) the term "means" or "step" or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.

Use of the word "means" (or "step") in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f). The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. Claim limitations in this application that use the word "means" (or "step") are being interpreted under 35 U.S.C. 112(f), except as otherwise indicated. Conversely, claim limitations in this application that do not use the word "means" (or "step") are not being interpreted under 35 U.S.C. 112(f), except as otherwise indicated in this Office action. Such claim limitations are: "means for receiving input medical data" in claim 9; "means for performing a medical analysis task" in claim 9; and "means for outputting results" in claim 9. Because these claim limitations are being interpreted under 35 U.S.C. 112(f), they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f), applicant may: (1) amend the claim limitations to avoid them being interpreted under 35 U.S.C. 112(f) (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid them being interpreted under 35 U.S.C. 112(f).

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors.
In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 3-4, 6-9, 11-12, 14, 16-20 (all claims not canceled or objected to) are rejected under 35 U.S.C. 103 as obvious over US Patent Publication 2022/0208355 A1 (Li) in view of US Patent Publication 2023/0281805 A1 (Haghighi et al.). The references are listed in a PTO-892 from the Office Action in which they are first used.

Claim 1

Regarding Claim 1, Li teaches a computer-implemented method comprising: receiving input medical data ("receiving a medical image acquired by a medical scanner," paragraph [0012]); performing a medical analysis task using a trained machine learning based task network based on the input medical data ("medical diagnostic image analysis task with the machine learning model," paragraph [0014]); and outputting results of the medical analysis task ("obtaining a diagnosis-related tissue segmented image," paragraph [0024]), wherein the trained machine learning based task network is trained by: receiving unannotated training medical data ("a data set with no pre-existing labels and with a minimum of human supervision," paragraph [0270]); generating weakly-supervised labels for the unannotated training medical data using one or more trained machine learning based supervised learning networks ("Weakly-supervised learning uses input having a corresponding weak label.
The weak label means it provides less information compared with the label that would be used in supervised learning," paragraph [0272]); training the machine learning based task network for performing the medical analysis task based on 1) the unannotated training medical data ("a data set with no pre-existing labels and with a minimum of human supervision," paragraph [0270]), 2) self-supervised labels for the unannotated training medical data learned via self-supervised learning ("unsupervised learning, also known as self-organization allows for modeling of probability densities over inputs without referencing corresponding labels for the inputs. Unsupervised learning algorithms are suitable for tasks where the data has distinguishable inherent patterns," paragraph [0270]), and 3) the generated weakly-supervised labels for the unannotated training medical data ("Weakly-supervised learning uses input having a corresponding weak label. The weak label means it provides less information compared with the label that would be used in supervised learning," paragraph [0272]), wherein training the machine learning based task network for performing the medical analysis task comprises: assigning the generated weakly-supervised labels to the extracted latent features ("By taking the tumor prediction as a pseudo label (additional to the manual pixel-level-label), the tumor segmentation in the non-enhanced image is thus improved," paragraph [0279] where a pseudo label is a latent space), and training the machine learning based task network based on the generated weakly-supervised labels assigned to the extracted latent features ("To predict the tumor mask for the box-level-labeled data, WSTS includes an Uncertainty-Sifting Self-Ensembling (USSE). 
The USSE utilizes the limited pixel-level-labeled data and additional box-level-labeled data to predict the tumor accurately by evaluating the prediction reliability with a Multi-scale Uncertainty-estimation," paragraph [0279]); and outputting the trained machine learning based task network ("Considering a non-enhanced MRI image as the initial current state, in the training phase, the state-behavior network estimates pixel-level candidate actions of the current state by observing the current state, while the state evaluator network predicts a pixel-level average action as an empirical baseline that would have been taken at the current state. With the dual-level complementary reward measuring the improvement in two kinds of image synthesis actions, the advantage function computes the extra rewards by comparing the real rewards of the candidate actions with the expected rewards of the average action. It finds whether the candidate actions have resulted in better or worse results than the baseline action and takes the optimal action that has the most extra rewards to update the current state to the next state. Meanwhile, the advantage function feeds back to optimize both networks, namely, the advantage function enables the state-behavior network to estimate better candidate actions and enables the state-evaluator network to predict more accurate average actions, thereby computing an accurate advantage function to find an optimal action at the next state," paragraph [0285]). Li is not relied upon to explicitly teach all of latent features. However, Haghighi et al. teach extracting latent features from the unannotated training medical data using an encoder of the machine learning based task network ("The discriminative learning branch generates "discriminative latent features" from input images. 
Specifically, the discriminative learning branch 661 performs operations including: (i) receiving the two cropped patches 639, (ii) augmenting each of the two cropped patches via the image augmentation algorithms 650 to generate two augmented patches 641, and (iii) generating latent features from the two augmented patches 641 by training an encoder of the discriminative learning branch 661 to maximize agreement between instances of same classes in latent space via a discriminative loss function," paragraph [0130]).

Therefore, taking the teachings of Li and Haghighi et al. as a whole, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention of the instant application to modify "Contrast-Agent-Free Medical Diagnostic Imaging" as taught by Li to use "Systems, Methods and Apparatuses for Implementing Discriminative, Restorative and Adversarial (DiRA) Learning for Self-supervised Medical Image Analysis" as taught by Haghighi et al. The suggestion/motivation for doing so would have been that, "The present state of the art may therefore benefit from the systems, methods, and apparatuses for implementing Discriminative, Restorative, and Adversarial (DiRA) learning for self-supervised medical image analysis, as is described herein," as noted by the Haghighi et al. disclosure in paragraph [0008], which also motivates the combination because the combination would predictably have greater recognition efficiency with less training data, as there is a reasonable expectation that there is never enough training data and the data is not precise enough; and/or because doing so merely combines prior art elements according to known methods to yield predictable results. The rejection of method claim 1 above applies mutatis mutandis to the corresponding limitations of apparatus claim 9 and manufacture claim 14, while noting that the rejection above cites to both device and method disclosures.
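For orientation, the training scheme recited in claim 1 (self-supervised labels plus weakly-supervised pseudo-labels, the latter assigned to latent features extracted by an encoder) can be sketched in miniature. The linear "encoder," the two heads, and all shapes below are illustrative stand-ins, not taken from Li, Haghighi et al., or the application:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins only: the claim recites generic networks.
W_enc = rng.normal(size=(16, 8)) * 0.1    # "encoder" of the task network
W_self = rng.normal(size=(8, 4)) * 0.1    # head for self-supervised labels
W_weak = rng.normal(size=(8, 3)) * 0.1    # head for weakly-supervised labels

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def xent(p, y):
    return -np.log(p[np.arange(len(y)), y] + 1e-12).mean()

x = rng.normal(size=(32, 16))             # unannotated training data
y_self = rng.integers(0, 4, size=32)      # labels learned via a pretext task
y_weak = rng.integers(0, 3, size=32)      # pseudo-labels from teacher networks

losses = []
for _ in range(200):
    z = x @ W_enc                                     # extract latent features
    p_self, p_weak = softmax(z @ W_self), softmax(z @ W_weak)
    losses.append(xent(p_self, y_self) + xent(p_weak, y_weak))
    g_self = (p_self - np.eye(4)[y_self]) / len(x)    # dL/dlogits, self head
    g_weak = (p_weak - np.eye(3)[y_weak]) / len(x)    # dL/dlogits, weak head
    g_z = g_self @ W_self.T + g_weak @ W_weak.T       # backprop into latents
    W_self -= 0.5 * (z.T @ g_self)
    W_weak -= 0.5 * (z.T @ g_weak)
    W_enc -= 0.5 * (x.T @ g_z)
```

The single cost function summing both cross-entropy terms is also the shape of limitation recited in claims 3, 11, and 16.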
Claims 9 and 14 are mapped below for clarity of the record and to specify any new limitations not included in claim 1.

Claim 3

Regarding claim 3, Li teaches the computer-implemented method of claim 1, wherein training the machine learning based task network for performing the medical analysis task comprises: training the machine learning based task network using a cost function that incorporates the self-supervised labels and the generated weakly-supervised labels ("Reinforcement learning can involve providing a machine learning model with a reward signal for a correct output inference; for example, the reward signal can be a numerical value," paragraph [0274] where a reward signal is a cost function).

Claim 4

Regarding claim 4, Li teaches the computer-implemented method of claim 1, wherein training the machine learning based task network for performing the medical analysis task comprises: fitting a probability distribution model ("unsupervised learning, also known as self-organization allows for modeling of probability densities over inputs without referencing corresponding labels for the inputs. Unsupervised learning algorithms are suitable for tasks where the data has distinguishable inherent patterns," paragraph [0270]) to a latent space of the machine learning based task network to capture an uncertainty ("By taking the tumor prediction as a pseudo label (additional to the manual pixel-level-label), the tumor segmentation in the non-enhanced image is thus improved," paragraph [0279]); and training the machine learning based task network based on the uncertainty ("Also, the USSE improves the tumor prediction reliability in the contrast enhanced image by integrating uncertainty estimation with Self-Ensembling, which prevents error magnifying in the non-enhanced image segmentation.
Moreover, the USSE introduces multi-scale attentions into the uncertainty-estimation; multi-scale attentions increase the observational uncertainty and thus improve the estimation effectiveness to the uncertainty," paragraph [0281]).

Claim 6

Regarding claim 6, Li teaches the computer-implemented method of claim 1, as noted above. Li is not relied upon to explicitly teach all of a plurality of decoders. However, Haghighi et al. teach wherein the machine learning based task network further comprises a plurality of decoders and training the machine learning based task network for performing the medical analysis task comprises: training the machine learning based task network to perform a plurality of tasks performed respectively using one of the plurality of decoders ("the restorative learning branch includes: an encoder fe and decoder ge configured for mapping the augmented patches distorted by the augmentation function back to an original image via fe, ge : (x,T) x; wherein the encoder fe of the restorative learning branch is a shared encoder, shared with the discriminative learning branch; and wherein the encoder fe and decoder ge comprise an encoder/decoder network," paragraph [0154]). Li and Haghighi et al. are combined as per claim 1.

Claim 7

Regarding claim 7, Li teaches the computer-implemented method of claim 1, as noted above. Li is not relied upon to explicitly teach all of a plurality of decoders. However, Haghighi et al.
teach wherein the machine learning based task network further comprises a plurality of decoders and training the machine learning based task network for performing the medical analysis task comprises: training an initial decoder of the plurality of decoders to generate a reconstructed image based on features generated by the encoder ("the restorative learning branch includes: an encoder fe and decoder ge configured for mapping the augmented patches distorted by the augmentation function back to an original image," paragraph [0154]); and training one or more additional decoders of the plurality of decoders to respectively perform one or more tasks based on the reconstructed image ("According to the described embodiments, the DiRA platform is a general framework that allows various choices of discrimination tasks without any constraint. As such, the declaration of class might range from considering every single image as a class (instance discrimination) to clustering images based on a similarity metric (cluster discrimination)," paragraph [0056]). Li and Haghighi et al. are combined as per claim 1.

Claim 8

Regarding claim 8, Li teaches the computer-implemented method of claim 1, as noted above. Li is not relied upon to explicitly teach all of a plurality of decoders. However, Haghighi et al. teach wherein the machine learning based task network further comprises a plurality of decoders and training the machine learning based task network for performing the medical analysis task comprises: fine tuning the machine learning based task network using one or more of the plurality of decoders for domain adaptation ("All pre-trained models were fine-tuned for 4 distinct medical applications ranging from target tasks on the source dataset to the tasks with comparatively significant domain-shifts in terms of data distribution and disease/object of interest," paragraph [0082]). Li and Haghighi et al. are combined as per claim 1.
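Claims 6-8 recite a shared encoder feeding a plurality of decoders, with selected decoders fine-tuned for domain adaptation. A structural sketch under assumed toy dimensions; every name, size, and branch choice below is hypothetical rather than drawn from the references:

```python
import numpy as np

rng = np.random.default_rng(1)

class Linear:
    """Toy stand-in for a network branch (illustrative only)."""
    def __init__(self, n_in, n_out):
        self.W = rng.normal(size=(n_in, n_out)) * 0.1
    def __call__(self, x):
        return x @ self.W

encoder = Linear(64, 16)                  # shared across all branches
decoders = {
    "reconstruction": Linear(16, 64),     # restorative branch: image back out
    "classification": Linear(16, 5),      # task branch: e.g. a class score
    "segmentation": Linear(16, 64),       # task branch: a per-pixel map
}

x = rng.normal(size=(8, 64))              # batch of flattened patches
z = encoder(x)                            # shared latent features
outputs = {name: dec(z) for name, dec in decoders.items()}

# Fine-tuning for domain adaptation (as in claims 8 and 19) would keep the
# shared encoder and update only the decoder(s) relevant to the target task.
```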
Claim 9

Regarding claim 9, Li teaches an apparatus ("including for example: MRI scanner, power source, processor (CPU alone OR CPU+GPU OR GPU alone OR any kind of device that can process large volumes of images), memory, connectivity (Wifi, bluetooth, SIM Card), and the like," paragraph [0265]) comprising: means for receiving input medical data ("receiving a medical image acquired by a medical scanner," paragraph [0012]); means for performing a medical analysis task using a trained machine learning based task network based on the input medical data ("medical diagnostic image analysis task with the machine learning model," paragraph [0014]); and means for outputting results of the medical analysis task ("obtaining a diagnosis-related tissue segmented image," paragraph [0024]), wherein the trained machine learning based task network is trained by: receiving unannotated training medical data ("a data set with no pre-existing labels and with a minimum of human supervision," paragraph [0270]); generating weakly-supervised labels for the unannotated training medical data using one or more trained machine learning based supervised learning networks ("Weakly-supervised learning uses input having a corresponding weak label. The weak label means it provides less information compared with the label that would be used in supervised learning," paragraph [0272]); training the machine learning based task network for performing the medical analysis task based on 1) the unannotated training medical data ("a data set with no pre-existing labels and with a minimum of human supervision," paragraph [0270]), 2) self-supervised labels for the unannotated training medical data learned via self-supervised learning ("unsupervised learning, also known as self-organization allows for modeling of probability densities over inputs without referencing corresponding labels for the inputs.
Unsupervised learning algorithms are suitable for tasks where the data has distinguishable inherent patterns," paragraph [0270]), and 3) the generated weakly-supervised labels for the unannotated training medical data ("Weakly-supervised learning uses input having a corresponding weak label. The weak label means it provides less information compared with the label that would be used in supervised learning," paragraph [0272]), wherein training the machine learning based task network for performing the medical analysis task comprises: assigning the generated weakly-supervised labels to the extracted latent features ("By taking the tumor prediction as a pseudo label (additional to the manual pixel-level-label), the tumor segmentation in the non-enhanced image is thus improved," paragraph [0279] where a pseudo label is a latent space), and training the machine learning based task network based on the generated weakly-supervised labels assigned to the extracted latent features ("To predict the tumor mask for the box-level-labeled data, WSTS includes an Uncertainty-Sifting Self-Ensembling (USSE). The USSE utilizes the limited pixel-level-labeled data and additional box-level-labeled data to predict the tumor accurately by evaluating the prediction reliability with a Multi-scale Uncertainty-estimation," paragraph [0279]); and outputting the trained machine learning based task network ("Considering a non-enhanced MRI image as the initial current state, in the training phase, the state-behavior network estimates pixel-level candidate actions of the current state by observing the current state, while the state evaluator network predicts a pixel-level average action as an empirical baseline that would have been taken at the current state.
With the dual-level complementary reward measuring the improvement in two kinds of image synthesis actions, the advantage function computes the extra rewards by comparing the real rewards of the candidate actions with the expected rewards of the average action. It finds whether the candidate actions have resulted in better or worse results than the baseline action and takes the optimal action that has the most extra rewards to update the current state to the next state. Meanwhile, the advantage function feeds back to optimize both networks, namely, the advantage function enables the state-behavior network to estimate better candidate actions and enables the state-evaluator network to predict more accurate average actions, thereby computing an accurate advantage function to find an optimal action at the next state," paragraph [0285]). Li is not relied upon to explicitly teach all of latent features. However, Haghighi et al. teach extracting latent features from the unannotated training medical data using an encoder of the machine learning based task network ("The discriminative learning branch generates "discriminative latent features" from input images. Specifically, the discriminative learning branch 661 performs operations including: (i) receiving the two cropped patches 639, (ii) augmenting each of the two cropped patches via the image augmentation algorithms 650 to generate two augmented patches 641, and (iii) generating latent features from the two augmented patches 641 by training an encoder of the discriminative learning branch 661 to maximize agreement between instances of same classes in latent space via a discriminative loss function," paragraph [0130]). Li and Haghighi et al. are combined as per claim 1. 
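The Haghighi et al. operation quoted just above, training an encoder "to maximize agreement between instances of same classes in latent space via a discriminative loss function," resembles a standard contrastive setup. A minimal sketch, assuming unit-normalized latents and an InfoNCE-style loss; the linear encoder and the noise-based "augmentations" are hypothetical simplifications:

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(size=(32, 8)) * 0.1        # toy "encoder" (illustrative)

def encode(x):
    z = x @ W
    return z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-norm latents

patches = rng.normal(size=(4, 32))                        # cropped patches
view_a = patches + 0.05 * rng.normal(size=patches.shape)  # augmentation 1
view_b = patches + 0.05 * rng.normal(size=patches.shape)  # augmentation 2

za, zb = encode(view_a), encode(view_b)
sim = za @ zb.T                            # cosine similarity matrix

# A discriminative loss rewards agreement on the diagonal (two views of
# the same patch) over the off-diagonal (views of different patches).
logits = sim / 0.1                         # temperature-scaled
log_p = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -np.diag(log_p).mean()
```

Minimizing `loss` pulls matching views together in latent space while pushing non-matching views apart, which is the "maximize agreement" behavior the rejection relies on.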
Claim 11

Regarding claim 11, Li teaches the apparatus of claim 9, wherein training the machine learning based task network for performing the medical analysis task comprises: training the machine learning based task network using a cost function that incorporates the self-supervised labels and the generated weakly-supervised labels ("Reinforcement learning can involve providing a machine learning model with a reward signal for a correct output inference; for example, the reward signal can be a numerical value," paragraph [0274] where a reward signal is a cost function).

Claim 12

Regarding claim 12, Li teaches the apparatus of claim 9, wherein training the machine learning based task network for performing the medical analysis task comprises: fitting a probability distribution model ("unsupervised learning, also known as self-organization allows for modeling of probability densities over inputs without referencing corresponding labels for the inputs. Unsupervised learning algorithms are suitable for tasks where the data has distinguishable inherent patterns," paragraph [0270]) to a latent space of the machine learning based task network to capture an uncertainty ("By taking the tumor prediction as a pseudo label (additional to the manual pixel-level-label), the tumor segmentation in the non-enhanced image is thus improved," paragraph [0279]); and training the machine learning based task network based on the uncertainty ("Also, the USSE improves the tumor prediction reliability in the contrast enhanced image by integrating uncertainty estimation with Self-Ensembling, which prevents error magnifying in the non-enhanced image segmentation. Moreover, the USSE introduces multi-scale attentions into the uncertainty-estimation; multi-scale attentions increase the observational uncertainty and thus improve the estimation effectiveness to the uncertainty," paragraph [0281]).
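Fitting a probability distribution model to the latent space to capture an uncertainty, as recited in claims 4 and 12, might look like the following sketch. It fits a single Gaussian and uses Mahalanobis distance as the uncertainty signal; this is a deliberate simplification (the cited references instead describe self-ensembling-based uncertainty estimation), and all names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.normal(size=(200, 8))              # latent features (toy stand-in)

# Fit a probability distribution model (here a single Gaussian) to the
# latent space of the task network.
mu = z.mean(axis=0)
cov = np.cov(z, rowvar=False) + 1e-6 * np.eye(8)
prec = np.linalg.inv(cov)

def uncertainty(latents):
    """Mahalanobis distance: far from the fitted density = more uncertain."""
    d = latents - mu
    return np.sqrt(np.einsum("bi,ij,bj->b", d, prec, d))

u = uncertainty(z)
weights = 1.0 / (1.0 + u)                  # down-weight uncertain samples
# Training "based on the uncertainty" could then average per-sample losses
# with these weights: (weights * per_sample_loss).sum() / weights.sum()
```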
Claim 14

Regarding claim 14, Li teaches a non-transitory computer readable medium storing computer program instructions, the computer program instructions when executed by a processor cause the processor to perform operations ("including for example: MRI scanner, power source, processor (CPU alone OR CPU+GPU OR GPU alone OR any kind of device that can process large volumes of images), memory, connectivity (Wifi, bluetooth, SIM Card), and the like," paragraph [0265]) comprising: receiving input medical data ("receiving a medical image acquired by a medical scanner," paragraph [0012]); performing a medical analysis task using a trained machine learning based task network based on the input medical data ("medical diagnostic image analysis task with the machine learning model," paragraph [0014]); and outputting results of the medical analysis task ("obtaining a diagnosis-related tissue segmented image," paragraph [0024]), wherein the trained machine learning based task network is trained by: receiving unannotated training medical data ("a data set with no pre-existing labels and with a minimum of human supervision," paragraph [0270]); generating weakly-supervised labels for the unannotated training medical data using one or more trained machine learning based supervised learning networks ("Weakly-supervised learning uses input having a corresponding weak label.
The weak label means it provides less information compared with the label that would be used in supervised learning," paragraph [0272]); training the machine learning based task network for performing the medical analysis task based on 1) the unannotated training medical data ("a data set with no pre-existing labels and with a minimum of human supervision," paragraph [0270]), 2) self-supervised labels for the unannotated training medical data learned via self-supervised learning ("unsupervised learning, also known as self-organization allows for modeling of probability densities over inputs without referencing corresponding labels for the inputs. Unsupervised learning algorithms are suitable for tasks where the data has distinguishable inherent patterns," paragraph [0270]), and 3) the generated weakly-supervised labels for the unannotated training medical data ("Weakly-supervised learning uses input having a corresponding weak label. The weak label means it provides less information compared with the label that would be used in supervised learning," paragraph [0272]) wherein training the machine learning based task network for performing the medical analysis task comprises: assigning the generated weakly-supervised labels to the extracted latent features ("By taking the tumor prediction as a pseudo label (additional to the manual pixel-level-label), the tumor segmentation in the non-enhanced image is thus improved," paragraph [0279] where a pseudo label is a latent space), and training the machine learning based task network based on the generated weakly-supervised labels assigned to the extracted latent features ("To predict the tumor mask for the box-level-labeled data, WSTS includes an Uncertainty-Sifting Self-Ensembling (USSE). 
The USSE utilizes the limited pixel-level-labeled data and additional box-level-labeled data to predict the tumor accurately by evaluating the prediction reliability with a Multi-scale Uncertainty-estimation," paragraph [0279]); and outputting the trained machine learning based task network ("Considering a non-enhanced MRI image as the initial current state, in the training phase, the state-behavior network estimates pixel-level candidate actions of the current state by observing the current state, while the state evaluator network predicts a pixel-level average action as an empirical baseline that would have been taken at the current state. With the dual-level complementary reward measuring the improvement in two kinds of image synthesis actions, the advantage function computes the extra rewards by comparing the real rewards of the candidate actions with the expected rewards of the average action. It finds whether the candidate actions have resulted in better or worse results than the baseline action and takes the optimal action that has the most extra rewards to update the current state to the next state. Meanwhile, the advantage function feeds back to optimize both networks, namely, the advantage function enables the state-behavior network to estimate better candidate actions and enables the state-evaluator network to predict more accurate average actions, thereby computing an accurate advantage function to find an optimal action at the next state," paragraph [0285]). Li is not relied upon to explicitly teach all of latent features. However, Haghighi et al. teach extracting latent features from the unannotated training medical data using an encoder of the machine learning based task network ("The discriminative learning branch generates "discriminative latent features" from input images. 
Specifically, the discriminative learning branch 661 performs operations including: (i) receiving the two cropped patches 639, (ii) augmenting each of the two cropped patches via the image augmentation algorithms 650 to generate two augmented patches 641, and (iii) generating latent features from the two augmented patches 641 by training an encoder of the discriminative learning branch 661 to maximize agreement between instances of same classes in latent space via a discriminative loss function," paragraph [0130]). Li and Haghighi et al. are combined as per claim 1.

Claim 16

Regarding claim 16, Li teaches the non-transitory computer readable medium of claim 14, wherein training the machine learning based task network for performing the medical analysis task comprises: training the machine learning based task network using a cost function that incorporates the self-supervised labels and the generated weakly-supervised labels ("Reinforcement learning can involve providing a machine learning model with a reward signal for a correct output inference; for example, the reward signal can be a numerical value," paragraph [0274] where a reward signal is a cost function).

Claim 17

Regarding claim 17, Li teaches the non-transitory computer readable medium of claim 14, as noted above. Li is not relied upon to explicitly teach all of a plurality of decoders. However, Haghighi et al.
teach wherein the machine learning based task network further comprises a plurality of decoders and training the machine learning based task network for performing the medical analysis task comprises: training the machine learning based task network to perform a plurality of tasks performed respectively using one of the plurality of decoders ("the restorative learning branch includes: an encoder fe and decoder ge configured for mapping the augmented patches distorted by the augmentation function back to an original image via fe, ge : (x,T) x; wherein the encoder fe of the restorative learning branch is a shared encoder, shared with the discriminative learning branch; and wherein the encoder fe and decoder ge comprise an encoder/decoder network," paragraph [0154]). Li and Haghighi et al. are combined as per claim 1. Claim 18 Regarding claim 18, Li teaches the non-transitory computer readable medium of claim 14, as noted above. Li is not relied upon to explicitly teach all of a plurality of decoders. However, Haghighi et al. teach wherein the machine learning based task network further comprises a plurality of decoders and training the machine learning based task network for performing the medical analysis task comprises: training an initial decoder of the plurality of decoders to generate a reconstructed image based on features generated by the encoder ("the restorative learning branch includes: an encoder fe and decoder ge configured for mapping the augmented patches distorted by the augmentation function back to an original image," paragraph [0154]); and training one or more additional decoders of the plurality of decoders to respectively perform one or more tasks based on the reconstructed image ("According to the described embodiments, the DiRA platform is a general framework that allows various choices of discrimination tasks without any constraint. 
As such, the declaration of class might range from considering every single image as a class (instance discrimination) to clustering images based on a similarity metric (cluster discrimination)," paragraph [0056]). Li and Haghighi et al. are combined as per claim 1.

Claim 19

Regarding claim 19, Li teaches the non-transitory computer readable medium of claim 14, as noted above. Li is not relied upon to explicitly teach all of a plurality of decoders. However, Haghighi et al. teach wherein the machine learning based task network further comprises a plurality of decoders and training the machine learning based task network for performing the medical analysis task comprises: fine tuning the machine learning based task network using one or more of the plurality of decoders for domain adaptation ("All pre-trained models were fine-tuned for 4 distinct medical applications ranging from target tasks on the source dataset to the tasks with comparatively significant domain-shifts in terms of data distribution and disease/object of interest," paragraph [0082]). Li and Haghighi et al. are combined as per claim 1.

Claim 20

Regarding claim 20, Li teaches a computer-implemented method comprising: receiving unannotated training medical data ("a data set with no pre-existing labels and with a minimum of human supervision," paragraph [0270]); generating weakly-supervised labels for the unannotated training medical data using one or more trained machine learning based supervised learning networks ("Weakly-supervised learning uses input having a corresponding weak label. The weak label means it provides less information compared with the label that would be used in supervised learning," paragraph [0272]); training a machine learning based task network for performing a medical analysis task based on 1) the unannotated training medical data ("a data set with no pre-existing labels and with a minimum of human supervision," paragraph [0270]), 2) self-supervised labels for the unannotated training medical data learned via self-supervised learning ("unsupervised learning, also known as self-organization allows for modeling of probability densities over inputs without referencing corresponding labels for the inputs. Unsupervised learning algorithms are suitable for tasks where the data has distinguishable inherent patterns," paragraph [0270]), and 3) the generated weakly-supervised labels for the unannotated training medical data ("Weakly-supervised learning uses input having a corresponding weak label. The weak label means it provides less information compared with the label that would be used in supervised learning," paragraph [0272]), wherein training the machine learning based task network for performing the medical analysis task comprises: assigning the generated weakly-supervised labels to the extracted latent features ("By taking the tumor prediction as a pseudo label (additional to the manual pixel-level-label), the tumor segmentation in the non-enhanced image is thus improved," paragraph [0279] where a pseudo label is a latent space), and training the machine learning based task network based on the generated weakly-supervised labels assigned to the extracted latent features ("To predict the tumor mask for the box-level-labeled data, WSTS includes an Uncertainty-Sifting Self-Ensembling (USSE). The USSE utilizes the limited pixel-level-labeled data and additional box-level-labeled data to predict the tumor accurately by evaluating the prediction reliability with a Multi-scale Uncertainty-estimation," paragraph [0279]); and outputting the trained machine learning based task network ("Considering a non-enhanced MRI image as the initial current state, in the training phase, the state-behavior network estimates pixel-level candidate actions of the current state by observing the current state, while the state evaluator network predicts a pixel-level average action as an empirical baseline that would have been taken at the current state. With the dual-level complementary reward measuring the improvement in two kinds of image synthesis actions, the advantage function computes the extra rewards by comparing the real rewards of the candidate actions with the expected rewards of the average action. It finds whether the candidate actions have resulted in better or worse results than the baseline action and takes the optimal action that has the most extra rewards to update the current state to the next state. Meanwhile, the advantage function feeds back to optimize both networks, namely, the advantage function enables the state-behavior network to estimate better candidate actions and enables the state-evaluator network to predict more accurate average actions, thereby computing an accurate advantage function to find an optimal action at the next state," paragraph [0285]).

Li is not relied upon to explicitly teach all of latent features. However, Haghighi et al. teach extracting latent features from the unannotated training medical data using an encoder of the machine learning based task network ("The discriminative learning branch generates "discriminative latent features" from input images. Specifically, the discriminative learning branch 661 performs operations including: (i) receiving the two cropped patches 639, (ii) augmenting each of the two cropped patches via the image augmentation algorithms 650 to generate two augmented patches 641, and (iii) generating latent features from the two augmented patches 641 by training an encoder of the discriminative learning branch 661 to maximize agreement between instances of same classes in latent space via a discriminative loss function," paragraph [0130]). Li and Haghighi et al. are combined as per claim 1.

Allowable Subject Matter

Claims 5 and 13 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

References Cited

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. US Patent Publication 2024/0402705 A1 to Xie et al. discloses obtaining an image of an environment in front of an unmanned surface vehicle; and inputting the image of the environment into an environment perception and decision-making model of the unmanned surface vehicle, and outputting an action instruction, where the environment perception and decision-making model of the unmanned surface vehicle includes an image feature extractor, a Bidirectional Encoder Representations from Transformers (BERT) model, a fully connected layer, a short-term scene memory module, and a long-term memory module.

Non-Patent Publication "Self-supervised learning methods and applications in medical imaging analysis: a survey" to Shurrab et al. discloses a set of the most recent self-supervised learning methods from the computer vision field as they are applicable to medical imaging analysis, and categorizes them as predictive, generative, and contrastive approaches.
Moreover, the article covers 40 of the most recent research papers in the field of self-supervised learning in medical imaging analysis, aiming to shed light on recent innovation in the field.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HEATH E WELLS, whose telephone number is (703) 756-4696. The examiner can normally be reached Monday-Friday, 8:00-4:00. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Ms. Jennifer Mehmood, can be reached at 571-272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/H.E.W/
Examiner, Art Unit 2664
Date: 16 March 2026

/JENNIFER MEHMOOD/
Supervisory Patent Examiner, Art Unit 2664
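The Haghighi et al. passage cited for the latent-feature limitations (paragraph [0130]) describes training an encoder to "maximize agreement between instances of same classes in latent space via a discriminative loss function." For context on what that kind of objective computes, here is a minimal, generic InfoNCE-style contrastive loss over two augmented views of the same instances, written in NumPy. This is an illustrative sketch of the general technique only, not code from the cited reference; the function and variable names are our own.

```python
import numpy as np

def contrastive_agreement_loss(z1, z2, temperature=0.5):
    """Generic InfoNCE-style loss: pulls paired (augmented-view) embeddings
    together and pushes mismatched pairs apart in latent space.

    z1, z2: (n, d) arrays; row i of z1 and row i of z2 are two augmented
    views of the same instance (the "positive" pair).
    """
    # L2-normalize so the dot product is cosine similarity.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature  # (n, n) pairwise similarities
    # Row i's positive is column i; every other column acts as a negative.
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_softmax)))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
# Views that agree (small perturbation) versus views that do not.
aligned = contrastive_agreement_loss(z, z + 0.01 * rng.normal(size=(8, 16)))
shuffled = contrastive_agreement_loss(z, rng.normal(size=(8, 16)))
print(aligned < shuffled)  # agreement between paired views lowers the loss
```

Minimizing such a loss is one common way an encoder can be driven to produce the kind of discriminative latent features the cited passage refers to; the actual reference may use a different formulation.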
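The Li passage cited for the "outputting the trained machine learning based task network" limitation (paragraph [0285]) describes an advantage function that compares the real rewards of candidate actions against the expected reward of a baseline "average" action predicted by the state-evaluator network. Stripped of the networks themselves, that bookkeeping reduces to a simple computation, sketched below in plain Python. The names here are hypothetical illustrations of the quoted description, not code from the cited reference.

```python
def advantage(candidate_rewards, baseline_reward):
    """Extra reward of each candidate action relative to the baseline
    (average-action) reward predicted by a state evaluator."""
    return [r - baseline_reward for r in candidate_rewards]

def best_action(candidate_rewards, baseline_reward):
    """Index of the candidate action with the largest advantage, i.e. the
    'optimal action that has the most extra rewards' in the quoted text."""
    adv = advantage(candidate_rewards, baseline_reward)
    return max(range(len(adv)), key=adv.__getitem__)

rewards = [0.2, 0.9, 0.5]  # observed ("real") rewards of candidate actions
baseline = 0.4             # evaluator's expected reward for the average action
print(best_action(rewards, baseline))  # action 1 yields the most extra reward
```

In an actor-critic setup like the one the passage describes, the same advantage values would also be fed back as the training signal for both the action-proposing and the baseline-predicting networks.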

Prosecution Timeline

Sep 28, 2023
Application Filed
Oct 29, 2025
Non-Final Rejection — §103
Jan 08, 2026
Response Filed
Mar 16, 2026
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602755
DEEP LEARNING-BASED HIGH RESOLUTION IMAGE INPAINTING
Granted Apr 14, 2026 (2y 5m to grant)

Patent 12597226
METHOD AND SYSTEM FOR AUTOMATED PLANT IMAGE LABELING
Granted Apr 07, 2026 (2y 5m to grant)

Patent 12591979
IMAGE GENERATION METHOD AND DEVICE
Granted Mar 31, 2026 (2y 5m to grant)

Patent 12588876
TARGET AREA DETERMINATION METHOD AND MEDICAL IMAGING SYSTEM
Granted Mar 31, 2026 (2y 5m to grant)

Patent 12586363
GENERATION OF PLURAL IMAGES HAVING M-BIT DEPTH PER PIXEL BY CLIPPING M-BIT SEGMENTS FROM MUTUALLY DIFFERENT POSITIONS IN IMAGE HAVING N-BIT DEPTH PER PIXEL
Granted Mar 24, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 75%
With Interview: 93% (+18.1%)
Median Time to Grant: 3y 5m
PTA Risk: Moderate
Based on 77 resolved cases by this examiner. Grant probability derived from career allow rate.
