DETAILED ACTION
Response to Amendment
Claims 1-20 are pending. Claims 1-17 are amended directly or by dependency on an amended claim. Claims 18-20 are new.
Response to Arguments
Applicant’s arguments received 23 March 2026, pages 5-6, with respect to the 35 U.S.C. 102 and 35 U.S.C. 103 rejections of claims 1-17 have been considered but are moot because the new ground of rejection does not rely on the combination of references applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. It is noted that while claims 18-20 are more substantive, claims 1 and 15 remain broad, and the examiner could also have rejected them over Shoshan et al. (US 12475193 B1), which is cited in the Conclusion section of this action. The examiner further notes that the angle in claim 19 is currently held obvious under KSR as a preferred value; however, if the specification supports a specific reason why one of ordinary skill in the art would not have thought to use this angle, arguing or claiming that reason could improve the conditions for allowance.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claim 18 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. The term “a first image of the images” is unclear because it is uncertain whether “the images” refers to the “training pathology slide images” or to the “at least two polarization images”.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 14-16, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Jesudason et al. (US 20250246010 A1) in view of Bhattacharya et al. (US 20220058803 A1).
Regarding claim 1, Jesudason et al. disclose a method comprising: using a machine learning (ML) model to obtain annotations of a pathology slide image obtained in a first imaging modality (The identification and phenotyping of tissue cells may use one visualization modality. The one visualization modality may be utilized as verifiable ground truth data for training one or more machine-learning models to classify and phenotype tissue cells captured utilizing another visualization modality, [0007], one or more machine-learning models to classify and phenotype tissue cells in one visualization modality image, [0021], [0022], [0060], FIGS. 4A and 4B illustrate annotations of immune cells on a whole slide image for five or more classes of immune cells, [0025], “In certain embodiments, the one or more machine-learning models 312 (e.g., a deep neural network (DNN), a convolutional neural network (CNN), a fully-connected neural network (FCNN), and so forth) may be then trained (e.g., by way of supervised machine-learning). For example, during the training phase, the one or more machine-learning models 312 may be provided a data set of training image(s) 314. The training image(s) 314 may be an image having a first visualization modality or a second visualization modality (e.g., thousands of H&E training images). For example, the training image(s) 314 may be H&E training image(s). In some embodiments, the data set of H&E training image(s) 314 may be annotated for training the one or more machine-learning models 312”, the one or more machine-learning models 312 may generate one or more predictions of class labels for the output image(s) 316, [0064]); wherein the ML model is trained based in part on images obtained from a second imaging modality different from the first imaging modality (That is, in accordance with the presently disclosed embodiments, ground truth data for training one or more machine-learning models to classify and phenotype tissue cells for a particular visualization modality (e.g., in which tissue cells cannot be readily ascertained by observation) may be produced by relying on, for example, spatial features that may be readily ascertainable by observation of human experts (e.g., pathologists, scientists, clinicians, or other medical and scientific experts) with respect to a different visualization modality, [0007], training one or more machine-learning models to classify and phenotype tissue cells in one visualization modality image utilizing class labels from a different visualization modality image as ground truth, [0021], [0022], [0060], “identify and classify tissue cells based on the class labels determined and known from the mxIF image 308 (a different visualization modality)”, [0064]).
Jesudason et al. do not disclose the second imaging modality comprising polarization imaging and the images being obtained from at least two polarization images.
Bhattacharya et al. teach using a machine learning (ML) model to obtain annotations of a pathology slide image obtained in a first imaging modality; wherein the ML model is trained based in part on images obtained from a second imaging modality different from the first imaging modality (The present invention seeks to improve the quality of an ophthalmic image by use of a trained neural network. As is explained below, this requires multiple training pair sets, e.g., a training input image paired with a corresponding ground truth, target training output image. A difficulty with using deep learning is obtaining ground truth outputs for use in the training set. The quality (e.g. vessel continuity, noise level) of true averaged images is generally far superior to that of an individual scan. Thus one approached in the present invention is to translate a single ophthalmic input image to an average-simulating image, e.g., an image that has characteristics of a true averaged image. In this approach, true averaged images are used as ground truth images (e.g., as training output images) in the training of the present neural network. Another approach is to translate an ophthalmic input image of a first modality to an output ophthalmic image simulating a different modality that typically has a higher quality. For example, AO-OCT images may be used as ground truth, training output images to train a neural network to produce AO-simulating image. For ease of discussion, much of the following discussion describes the use of true averaged images as ground truth, target output images in a training set, but unless otherwise stated, it is be understood that a similar description applies to using AO-OCT images (or other higher quality and/or higher resolution images) as ground truth target outputs in a training set, [0058]) the second imaging modality comprising polarization imaging and the images being obtained from at least two polarization images (“Optionally, an OCT system may be made to generate scans/images of varying quality (e.g., the SNR may be lowered and/or select image processing may be omitted and/or motion tracking may be reduced or eliminated), such that the training pairs include a mixture of images of differing quality. Additionally, the present training method may include recording scans/images of exactly the same object structure but with different speckle (e.g. by changing the light polarization or angle)”, [0088]).
Jesudason et al. and Bhattacharya et al. are in the same art of biological images from different modalities and machine learning (Jesudason et al., abstract, [0021]; Bhattacharya et al., [0058]). The combination of Bhattacharya et al. with Jesudason et al. would enable the use of polarization images. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the type of images of Bhattacharya et al. with the invention of Jesudason et al. because the technique was known at the time of filing and the combination would have had predictable results. Moreover, Bhattacharya et al. state “Individual OCT/OCTA scans suffer from jitter, drop outs, and speckle noise, among other issues. These issues can affect the quality of en face images both qualitatively and quantitatively, as they are used in the quantification of vasculature density. The present invention seeks to improve the quality of an ophthalmic image by use of a trained neural network” ([0058]), indicating an image quality improvement, and “Thus, neural network 97 learns to denoise images using only raw images and no special priors (e.g., without spatial averaging) to denoise B-scans and/or enface images” ([0088]), indicating an improvement in computational time when the inventions are combined.
Regarding claim 2, Jesudason et al. and Bhattacharya et al. disclose the method of claim 1. Jesudason et al. further disclose the first imaging modality is configured to image a slide based on a light source of visible wavelengths and absorption of light by tissue (“Digital pathology typically involves visualizing and analyzing digitized slides to ascertain whether variations occurring in tissue cells are due to disease, toxicity, and/or natural processes. The visualizations of tissue cells may generally include modalities consisting of dye-based visualization modalities, molecular-based visualization modalities, or probe-based visualization modalities. For example, dye-based visualizations, such as hematoxylin and eosin (H&E), may utilize the chemical properties of the dye molecules to bind to specific tissue cells, such as mucin, fats, and proteins. Similarly, molecular-based visualizations, such as immunohistochemistry (IHC), may utilize antibodies to bind to protein epitope targets with a high degree of specificity, and may be visualized, for example, utilizing 3,3′-diaminobenzidine or other similar organic compound to produce a brownish stain. IHC may also include other visualization colors, which may be combined, for example, to visualize multiple and different biomarkers. Other molecular-based visualizations may include, for example, immunofluorescence (IF), which may utilize fluorescent dyes to multiplex and highlight a large number of different target cells (e.g., antibodies). Likewise, other molecular-based visualizations may include probe-based techniques that may be utilized to visualize, for example, messenger RNA (mRNA), micro RNA (miRNA), and DNA in one or more tissue cells utilizing bright-field or IF visualization modalities. Still, more recent visualization modalities may include, for example, hyperspectral imaging, which may be utilized to distinguish between dozens of different biomarkers labeled with different heavy metals and visualized utilizing imaging mass cytometry (IMC) or mass spectrometry (MS). Thus, as may be appreciated from the foregoing, often different visualization modalities may be deployed to visualize and analyze the same target tissue cells based on the information pathologists, scientists, or clinicians are attempting to ascertain. For example, H&E stains may be well-suited for broadly visualizing specific tissue cells, such as cancer cells and proteins, while multiplex IHC (mxIHC) or multiplex IF (mxIF) may be well-suited to visualize and identify specific proteins and tissue cells. Various tissue cells (or assorted biological features) appear differently in different visualization modalities. For example, often a tissue cell image captured using one visualization modality may appear markedly different from an image of the same tissue cell captured using another visualization modality, and thus it may be challenging, or even counterintuitive, for either humans or computational-based models to recognize that the images portray the same tissue cell,” [0003]-[0005]) [H&E imaging uses visible wavelengths].
Regarding claim 3, Jesudason et al. and Bhattacharya et al. disclose the method of claim 2. Jesudason et al. further disclose the second imaging modality comprises one or more of multispectral imaging (MSI), quantitative phase imaging, or a combination thereof (Still, more recent visualization modalities may include, for example, hyperspectral imaging, which may be utilized to distinguish between dozens of different biomarkers labeled with different heavy metals and visualized utilizing imaging mass cytometry (IMC) or mass spectrometry (MS), [0004]).
Regarding claim 14, Jesudason et al. and Bhattacharya et al. disclose the method of claim 1. Jesudason et al. further disclose the annotations of the pathology slide image comprise heatmaps or labels of tissues/cells in the pathology slide image (In some embodiments, classifying the first tissue cell into the phenotype may include classifying, based on one or more molecular annotations, the first tissue cell as a cancer cell, a macrophage, a regulatory T-cell (Treg), a CD8 cell, a B lymphocyte, a natural killer (NK) cell, or a fibroblast, [0012], In certain embodiments, during a training phase, the one or more computing devices may determine a phenotype class label for the first tissue cell based on one or more spatial features associated with the first region of pixels corresponding to the first tissue cell, determine a correspondence between the first region of pixels corresponding to the first tissue cell and the second region of pixels corresponding to the first tissue cell based on the cell-to-cell registration process, [0013], FIGS. 4A and 4B illustrate annotations of immune cells on a whole slide image for five or more classes of immune cells, [0025], pathologist, scientist, or clinician (e.g., oncologist) to manually label one or more tissue cells to be matched, [0045], phenotyping table 118 may include immune cell phenotype, the immune cell activation state, the image identification or label, the shape, and location of the tissue cells of interest, [0049]).
Regarding claim 15, Jesudason et al. disclose a method comprising: using a machine learning (ML) model to obtain annotations of a pathology slide image of a first type (The identification and phenotyping of tissue cells may use one visualization modality. The one visualization modality may be utilized as verifiable ground truth data for training one or more machine-learning models to classify and phenotype tissue cells captured utilizing another visualization modality, [0007], one or more machine-learning models to classify and phenotype tissue cells in one visualization modality image, [0021], [0022], [0060], “In certain embodiments, the one or more machine-learning models 312 (e.g., a deep neural network (DNN), a convolutional neural network (CNN), a fully-connected neural network (FCNN), and so forth) may be then trained (e.g., by way of supervised machine-learning). For example, during the training phase, the one or more machine-learning models 312 may be provided a data set of training image(s) 314. The training image(s) 314 may be an image having a first visualization modality or a second visualization modality (e.g., thousands of H&E training images). For example, the training image(s) 314 may be H&E training image(s). In some embodiments, the data set of H&E training image(s) 314 may be annotated for training the one or more machine-learning models 312”, the one or more machine-learning models 312 may generate one or more predictions of class labels for the output image(s) 316, [0064]); wherein the ML model is trained based in part on training pathology slide images of a second type different from the first type (That is, in accordance with the presently disclosed embodiments, ground truth data for training one or more machine-learning models to classify and phenotype tissue cells for a particular visualization modality (e.g., in which tissue cells cannot be readily ascertained by observation) may be produced by relying on, for example, spatial features that may be readily ascertainable by observation of human experts (e.g., pathologists, scientists, clinicians, or other medical and scientific experts) with respect to a different visualization modality, [0007], training one or more machine-learning models to classify and phenotype tissue cells in one visualization modality image utilizing class labels from a different visualization modality image as ground truth, [0021], [0022], [0060], “identify and classify tissue cells based on the class labels determined and known from the mxIF image 308 (a different visualization modality)”, [0064]).
Jesudason et al. do not disclose the second type of image being captured by polarization imaging and being obtained from at least two polarization images.
Bhattacharya et al. teach using a machine learning (ML) model to obtain annotations of a pathology slide image of a first type; wherein the ML model is trained based in part on training pathology slide images of a second type different from the first type (The present invention seeks to improve the quality of an ophthalmic image by use of a trained neural network. As is explained below, this requires multiple training pair sets, e.g., a training input image paired with a corresponding ground truth, target training output image. A difficulty with using deep learning is obtaining ground truth outputs for use in the training set. The quality (e.g. vessel continuity, noise level) of true averaged images is generally far superior to that of an individual scan. Thus one approached in the present invention is to translate a single ophthalmic input image to an average-simulating image, e.g., an image that has characteristics of a true averaged image. In this approach, true averaged images are used as ground truth images (e.g., as training output images) in the training of the present neural network. Another approach is to translate an ophthalmic input image of a first modality to an output ophthalmic image simulating a different modality that typically has a higher quality. For example, AO-OCT images may be used as ground truth, training output images to train a neural network to produce AO-simulating image. For ease of discussion, much of the following discussion describes the use of true averaged images as ground truth, target output images in a training set, but unless otherwise stated, it is be understood that a similar description applies to using AO-OCT images (or other higher quality and/or higher resolution images) as ground truth target outputs in a training set, [0058]) the second type of image being captured by polarization imaging and being obtained from at least two polarization images (“Optionally, an OCT system may be made to generate scans/images of varying quality (e.g., the SNR may be lowered and/or select image processing may be omitted and/or motion tracking may be reduced or eliminated), such that the training pairs include a mixture of images of differing quality. Additionally, the present training method may include recording scans/images of exactly the same object structure but with different speckle (e.g. by changing the light polarization or angle)”, [0088]).
Jesudason et al. and Bhattacharya et al. are in the same art of biological images from different modalities and machine learning (Jesudason et al., abstract, [0021]; Bhattacharya et al., [0058]). The combination of Bhattacharya et al. with Jesudason et al. would enable the use of polarization images. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the type of images of Bhattacharya et al. with the invention of Jesudason et al. because the technique was known at the time of filing and the combination would have had predictable results. Moreover, Bhattacharya et al. state “Individual OCT/OCTA scans suffer from jitter, drop outs, and speckle noise, among other issues. These issues can affect the quality of en face images both qualitatively and quantitatively, as they are used in the quantification of vasculature density. The present invention seeks to improve the quality of an ophthalmic image by use of a trained neural network” ([0058]), indicating an image quality improvement, and “Thus, neural network 97 learns to denoise images using only raw images and no special priors (e.g., without spatial averaging) to denoise B-scans and/or enface images” ([0088]), indicating an improvement in computational time when the inventions are combined.
Regarding claim 16, Jesudason et al. and Bhattacharya et al. disclose the method of claim 15. Jesudason et al. further indicate the first type of image is obtained from a stained slide (For example, H&E stains may be well-suited for broadly visualizing specific tissue cells, [0005], a hematoxylin and eosin (H&E) image, [0042], [0043]); and the second type of image is a stain-invariant image obtained from a triplex slide (multiplex IHC (mxIHC) or multiplex IF (mxIF) may be well-suited to visualize and identify specific proteins and tissue cells, [0005], “For example, in certain embodiments, the first visualization modality and the second visualization modality may be independently acquired by a whole slide imaging modality, microscopy modality, non-optical imaging modality, or spatial transcriptomics (ST) imaging modality. The whole slide imaging modality may be selected from bright-field or fluorescence imaging. The microscopy modality may be selected from bright-field microscopy, fluorescence microscopy, confocal microscopy, or high-content screening (HCS) microscopy, or synthetic image generation. The non-optical imaging modality may be selected from imaging mass cytometry (IMC) or myocardial perfusion imaging (MIBI)”, [0009], multiplex immunofluorescence (mxIF) image, [0042]-[0044]).
Jesudason et al. do not specify triplex in particular. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention that a triplex slide is a type of multiplex slide, as triplex is one of a limited number of options that the “multi-” prefix can refer to.
Jesudason et al. do not specify stain-invariant in particular. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention that the image may be stain-invariant, as Jesudason et al. distinguish between stain/dye-based imaging and non-stain/dye-based imaging in paragraph [0009], quoted above.
Regarding claim 18, Jesudason et al. and Bhattacharya et al. disclose the method of claim 1. Bhattacharya et al. further indicate a first image of the images is obtained from a first polarization image of a sample at a first angle and a second polarization image of the sample at a second angle (Additionally, the present training method may include recording scans/images of exactly the same object structure but with different speckle (e.g. by changing the light polarization or angle), [0088]) [implies at least two angles].
Regarding claim 19, Jesudason et al. and Bhattacharya et al. disclose the method of claim 18. Jesudason et al. and Bhattacharya et al. do not disclose the second angle is rotated 45 degrees with respect to the first angle. Bhattacharya et al. disclose the claimed invention except for the exact angle of 45 degrees. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to operate at 45 degrees, since it has been held that discovering an optimum value of a result-effective variable involves only routine skill in the art. In re Boesch, 617 F.2d 272, 205 USPQ 215 (CCPA 1980). Further, Bhattacharya et al. state “Additionally, the present training method may include recording scans/images of exactly the same object structure but with different speckle (e.g. by changing the light polarization or angle)” ([0088]), and a 45-degree difference would be one of a limited number of angle increments that could be used.
Regarding claim 20, Jesudason et al. and Bhattacharya et al. disclose the method of claim 15. Bhattacharya et al. further indicate a first image of the second type of image is obtained from a first polarization image of a sample at a first angle and a second polarization image of the sample at a second angle (Additionally, the present training method may include recording scans/images of exactly the same object structure but with different speckle (e.g. by changing the light polarization or angle), [0088]) [implies at least two angles].
Claims 4-13 are rejected under 35 U.S.C. 103 as being unpatentable over Jesudason et al. (US 20250246010 A1) and Bhattacharya et al. (US 20220058803 A1) as applied to claims 1 and 3 above, further in view of Ozcan et al. (US 20230030424 A1).
Regarding claim 4, Jesudason et al. and Bhattacharya et al. disclose the method of claim 3. Jesudason et al. partly disclose training the ML model, using a plurality of pairs of first image and second images; wherein: the first image in the pair is obtained from the first modality imaging of a first pathology slide; and the second image in the pair is generated based on a second modality imaging of a second pathology slide corresponding to the first pathology slide (the phenotyping table 118 may include, for example, a record of the matched and identified tissue cells determined based on the cross-modality cell-to-cell registration process 114, [0049], “FIG. 3B illustrates a running example 300B of training one or more machine-learning models to classify and phenotype tissue cells (e.g., immune cells) in one visualization modality image utilizing class labels from a different visualization modality image as ground truth, in accordance with the presently disclosed embodiments. FIG. 3B is a running example of the process illustrated and described above with respect to FIG. 3A. As depicted, in one embodiment, the running example 300B of a cross-modality cell-to-cell registration process may be described with respect to an mxIF image 308 and an H&E image 310”, [0060]); however, another reference is added to make this more explicit.
Ozcan et al. teach training the ML model, using a plurality of pairs of first image and second images; wherein: the first image in the pair is obtained from the first modality imaging of a first pathology slide; and the second image in the pair is generated based on a second modality imaging of a second pathology slide corresponding to the first pathology slide (“In another embodiment, a method of generating a virtually stained microscopic image of a sample includes providing a trained, deep neural network that is executed by image processing software using one or more processors of a computing device, wherein the trained, deep neural network is trained with a plurality of pairs of stained microscopy images or image patches that are either virtually stained by at least one algorithm or chemically stained to have a first stain type, and are all matched with the corresponding stained microscopy images or image patches of the same sample(s) that are either virtually stained by at least one algorithm or chemically stained to have another different stain type, which constitute ground truth images for training of the deep neural network to transform input images histochemically or virtually stained with the first stain type into output images that are virtually stained with the second stain type. A histochemically or virtually stained input image of the sample stained with the first stain type is obtained. The histochemically or virtually stained input image of the sample is input to the trained, deep neural network that transforms input images stained with the first stain type into output images virtually stained with the second stain type. The trained, deep neural network outputs an output image of the sample having virtual staining to substantially resemble and match a chemically stained image of the same sample stained with the second stain type obtained by an incoherent microscope after the chemical staining process”, [0009], obtain whole slide images (WSI) of the sample 22, [0046], In order to use four different stains, (H&E, Masson trichrome, PAS and Jones), image pre-processing and alignment was implemented for each input image and target image pair (training pairs) from those four different staining datasets respectively, [0065], “At the end of these registration steps 60, 64, 68, the auto-fluorescence image patches 20b and their corresponding brightfield tissue image patches 48f are accurately matched to each other and can be used as input and label pairs for the training of the deep neural network 10, allowing the network to solely focus on and learn the problem of virtual histological staining”, [0105]).
Jesudason et al. and Ozcan et al. are in the same art of registering different types of histological slide images (Jesudason et al., abstract, [0007], [0025]; Ozcan et al., abstract, [0034], [0046]). The combination of Ozcan et al. with Jesudason et al. and Bhattacharya et al. would enable generating training pairs. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the pairs of Ozcan et al. with the invention of Jesudason et al. and Bhattacharya et al. because the technique was known at the time of filing and the combination would have had predictable results. Moreover, Ozcan et al. indicate “At the end of these registration steps 60, 64, 68, the auto-fluorescence image patches 20b and their corresponding brightfield tissue image patches 48f are accurately matched to each other and can be used as input and label pairs for the training of the deep neural network 10, allowing the network to solely focus on and learn the problem of virtual histological staining” ([0105]) and “For example, there is a need in the industry to transform chemical stains one type to another type. An example includes chemical H&E stains and the need to create specialized stains such as PAS, MT or JMS. For instance, non-neoplastic kidney disease relies on these “special stains” to provide the standard of care pathologic evaluation. In many clinical practices, H&E stains are available before special stains and pathologists may provide a “preliminary diagnosis” to enable the patient's nephrologist to begin treatment. This is especially useful in the setting of some diseases such as crescentic glomerulonephritis or transplant rejection, where quick diagnosis, followed by rapid initiation or treatment, may lead to significant improvements in clinical outcomes” ([0180]), demonstrating an efficiency improvement in training the neural network and a clinical-outcome improvement by allowing medical interventions to begin sooner without waiting for more expensive imaging results.
Regarding claim 5, Jesudason et al. and Bhattacharya et al. and Ozcan et al. disclose the method of claim 4. Jesudason et al. and Ozcan et al. further indicate the second pathology slide and the first pathology slide are a same physical slide (Jesudason et al., “FIGS. 5A-5C illustrate one or more graphical or implementation examples of a tissue cell matching example, in accordance with the presently disclosed embodiments. For example, original images 500A of an mxIF slide 502A, an mxIF slide 506A, and an H&E slide 510A may include one or more tissue cells. It should be appreciated that the tissue cells 504A, the tissue cells 508A, and the tissue cells 512A may be the same exact tissue cells captured by different visualization modalities. For example, as depicted by the mxIF slide 502A, the one or more tissue cells 504A may be at different orientations, alignments, proximities, and so forth on one or more of the mxIF slide 502A, the mxIF slide 506A, and the H&E slide 510A”, [0070]; Ozcan et al., “As the deep neural network 10 aims to learn the transformation from autofluorescence images 20 of the unlabeled tissue specimens 22 to those of a stained specimen (i.e., gold standard), it is crucial to accurately align the FOVs. Furthermore, when more than one autofluorescence channel is used as the network's 10 input, the various filter channels must be aligned. In order to use four different stains, (H&E, Masson trichrome, PAS and Jones), image pre-processing and alignment was implemented for each input image and target image pair (training pairs) from those four different staining datasets respectively. Image pre-processing and alignment follows the global and local registration process as described herein and illustrated in FIG. 8. However, one major difference is that when using multiple autofluorescence channels as the network input (i.e., DAPI and TxRed as shown here), they must be aligned. Even though the images from the two channels are captured using the same microscope it was determined that the corresponding FOVs from the two channels are not precisely aligned, particularly on the edges of the FOVs. Therefore, an elastic registration algorithm as described herein was used to accurately align the multiple autofluorescence channels. The elastic registration algorithm matches the local features of both channels of images (e.g., DAPI and TxRed) by hierarchically breaking the images into smaller and smaller blocks while matching the corresponding blocks. The calculated transformation map was then applied to the TxRed images to ensure that they are aligned to the corresponding images from DAPI channel. Finally, the aligned images from the two channels and get aligned whole slide images which contain both the DAPI and TxRed channels”, [0065]).
Regarding claim 6, Jesudason et al. and Bhattacharya et al. and Ozcan et al. disclose the method of claim 4. Jesudason et al. and Ozcan et al. further indicate training further includes registering the first image and the second image in each of the pairs of first image and second image (Jesudason et al., Embodiments of the present disclosure are directed toward one or more computing devices, methods, and non-transitory computer-readable media that may perform a cross-modality cell-to-cell registration process to identify, phenotype, and keep track of tissue cells captured utilizing various visualization modalities. Such a registration process may allow improved visualizations of various cells. The disclosed process may further permit the identification and phenotyping of tissue cells using one visualization modality may be utilized as verifiable ground truth data for training one or more machine-learning models to classify and phenotype tissue cells captured utilizing another visualization modality. That is, in accordance with the presently disclosed embodiments, ground truth data for training one or more machine-learning models to classify and phenotype tissue cells for a particular visualization modality (e.g., in which tissue cells cannot be readily ascertained by observation) may be produced by relying on, for example, spatial features that may be readily ascertainable by observation of human experts (e.g., pathologists, scientists, clinicians, or other medical and scientific experts) with respect to a different visualization modality, [0007], [0031]; Ozcan et al., “As the deep neural network 10 aims to learn the transformation from autofluorescence images 20 of the unlabeled tissue specimens 22 to those of a stained specimen (i.e., gold standard), it is crucial to accurately align the FOVs. Furthermore, when more than one autofluorescence channel is used as the network's 10 input, the various filter channels must be aligned. In order to use four different stains, (H&E, Masson trichrome, PAS and Jones), image pre-processing and alignment was implemented for each input image and target image pair (training pairs) from those four different staining datasets respectively. Image pre-processing and alignment follows the global and local registration process as described herein and illustrated in FIG. 8. However, one major difference is that when using multiple autofluorescence channels as the network input (i.e., DAPI and TxRed as shown here), they must be aligned. Even though the images from the two channels are captured using the same microscope it was determined that the corresponding FOVs from the two channels are not precisely aligned, particularly on the edges of the FOVs. Therefore, an elastic registration algorithm as described herein was used to accurately align the multiple autofluorescence channels. The elastic registration algorithm matches the local features of both channels of images (e.g., DAPI and TxRed) by hierarchically breaking the images into smaller and smaller blocks while matching the corresponding blocks. The calculated transformation map was then applied to the TxRed images to ensure that they are aligned to the corresponding images from DAPI channel. Finally, the aligned images from the two channels and get aligned whole slide images which contain both the DAPI and TxRed channels”, [0065]).
Regarding claim 7, Jesudason et al. and Bhattacharya et al. and Ozcan et al. disclose the method of claim 6. Jesudason et al. and Ozcan et al. further indicate the registering includes aligning the first image and the second image in each of the pairs (Jesudason et al., This application relates generally to molecular annotation, and, more particularly, to resolving disparities in molecular annotation utilizing cross-modality pixel alignment and cell-to-cell registration across various imaging modalities, [0002], FIG. 6 illustrate one or more graphical or implementation examples 600 of a cross-modality cell-to-cell registration process, in accordance with the presently disclosed embodiments. Specifically, the cross-modality cell-to-cell registration process as illustrated by the one or more graphical or implementation examples 600 correspond to the cell-to-cell registration process 114 as discussed above with respect to FIG. 1A. For example, the cross-modality cell-to-cell registration process includes performing (602) a scale-invariant Fourier transform (SIFT) alignment of cross-modality visualization images (e.g., SIFT-based aligned with a down-sampled pyramid layer), and performing (604) a tile-level alignment of the cross-modality visualization images, including a matrix transformation of the SIFT alignment of the cross-modality visualization images (e.g., a matrix transformation from coarse image alignment is used to transform tiles from a full resolution layer, which is aligned again with SIFT), [0072]; Ozcan et al., “As the deep neural network 10 aims to learn the transformation from autofluorescence images 20 of the unlabeled tissue specimens 22 to those of a stained specimen (i.e., gold standard), it is crucial to accurately align the FOVs. Furthermore, when more than one autofluorescence channel is used as the network's 10 input, the various filter channels must be aligned. In order to use four different stains, (H&E, Masson trichrome, PAS and Jones), image pre-processing and alignment was implemented for each input image and target image pair (training pairs) from those four different staining datasets respectively. Image pre-processing and alignment follows the global and local registration process as described herein and illustrated in FIG. 8. However, one major difference is that when using multiple autofluorescence channels as the network input (i.e., DAPI and TxRed as shown here), they must be aligned. Even though the images from the two channels are captured using the same microscope it was determined that the corresponding FOVs from the two channels are not precisely aligned, particularly on the edges of the FOVs. Therefore, an elastic registration algorithm as described herein was used to accurately align the multiple autofluorescence channels. The elastic registration algorithm matches the local features of both channels of images (e.g., DAPI and TxRed) by hierarchically breaking the images into smaller and smaller blocks while matching the corresponding blocks. The calculated transformation map was then applied to the TxRed images to ensure that they are aligned to the corresponding images from DAPI channel. Finally, the aligned images from the two channels and get aligned whole slide images which contain both the DAPI and TxRed channels. 
At the end of the co-registration process, images 20 from the single or multiple autofluorescence channels of the unlabeled tissue sections are well aligned to the corresponding brightfield images 48 of the histologically stained tissue sections 22. Before feeding those aligned pairs into deep neural network 10 for training, normalization is implemented on the whole slide images of the DAPI and TxRed, respectively. This whole slide normalization is performed by subtracting the mean value of the entire tissue sample and dividing it by the standard deviation between pixel values. Following the training procedure, using the class condition, multiple virtual stains can be applied to the images 20 with a single algorithm on the same input image 20. In other words, an additional network is not required for each individual stain. A single, trained neural network can be used to apply one or more digital/virtual stains to an input image 20”, [0065]-[0066], “For the trained, deep neural network 10, a conditional GAN architecture was used to learn the transformation from a label-free unstained autofluorescence lifetime input image 20L to the corresponding bright-field image 48 in three different stains (HER2, PR, and ER). Following the registration of the autofluorescence lifetime images 20L to the bright-field images 48, these accurately aligned FOVs were randomly partitioned into overlapping patches of 256×256 pixels, which were then used to train the GAN-based deep neural network 10”, [0129]).
Regarding claim 8, Jesudason et al. and Bhattacharya et al. and Ozcan et al. disclose the method of claim 6. Jesudason et al. further indicate the second image in the pair is an annotation image comprising a plurality of objects each associated with a respective portion of the second image (The network 100A may be used in a variety of contexts where scanning and evaluation of histopathology images, such as whole slide images, are an essential component of the work. For example, whole slide image processing system 103 may process histopathology images, including whole slide images, to classify the digital pathology images and generate annotations for the digital pathology images and related output. A tile generating module 111 may define a set of tiles or patches for each digital pathology image. To define the set of tiles or patches, the tile generating module 111 may segment the digital pathology image into the set of tiles or patches. The tile generating module 111 may further define a tile or patch size depending on the type of abnormality being detected. For example, the tile generating module 111 may be configured with awareness of the type(s) of tissue abnormalities that the whole slide image processing system 103 will be searching for and may customize the tile or patch size according to the tissue abnormalities to optimize detection, [0040], Thus, in one embodiment, the mxIF image 102 (e.g., “IF1”) may be representative of an mxIF image captured in a first cycle of the cycling process and the mxIF image 104 (e.g., “IF2”) may be representative of an mxIF image captured in a second cycle of the cycling process. Similarly, in one embodiment, the H&E image 106 may include a dye-based visualization, such as hematoxylin and eosin (H&E), which may utilize the chemical properties of the dye molecules to bind to one or more specific tissue cells. In certain embodiments, as will be further appreciated below with respect to FIG. 3A-3D, the mxIF image 102 (e.g., “IF1”) and the mxIF image 104 (e.g., “IF2”) may include one or more spatial features (e.g., any features suitable for informing or ascertaining the phenotype of a cell by its spatial organization with respect to neighboring cells or its position within a tissue or region of a tissue) suitable for allowing a pathologist, scientist, or clinician (e.g., oncologist) to manually label one or more tissue cells to be matched utilizing a cell-to-cell registration process to the corresponding one or more tissue cells in the H&E image 106, which may not include spatial features, [0045], In certain embodiments, based on the cross-modality cell-to-cell registration process 114, the system workflow 100B may then proceed with performing phenotyping process 116 of the one or more matched tissue cells. For example, in some embodiments, the phenotyping process 116 may include classifying the one or more matched tissue cells, for example, into one or more classes of immune cells, such as macrophages, regulatory T-cells (Tregs), CD8 cells, B lymphocytes, natural killer (NK) cells, and so forth. In certain embodiments, the system workflow 100B may then proceed with storing the aforementioned data into a phenotyping table 118. For example, in some embodiments, the phenotyping table 118 may include, for example, a record of the matched and identified tissue cells determined based on the cross-modality cell-to-cell registration process 114. 
In certain embodiments, the phenotyping table 118 may be then utilized to label an image 120 (e.g., H&E image), which may be then utilized in downstream tasks to train one or more machine-learning models to classify various immune cells (e.g., macrophages, Tregs, CD8 cells, B lymphocytes, NK cells, and so forth) in accordance with the present embodiments. For example, in one embodiment, the phenotyping table 118 may include immune cell phenotype, the immune cell activation state, the image identification or label, the shape, and location of the tissue cells of interest, [0049], “As illustrated, one or more tissue cells in the mxIF image 308 may be annotated based on one or more spatial features (e.g., any features suitable for informing or ascertaining the phenotype of a cell by its spatial organization with respect to neighboring cells or its position within a tissue or region of a tissue), and then matched to the corresponding one or more tissue cells in the H&E image 310 (e.g., as illustrated by the lines with open circles extending between the mxIF image 308 and the H&E image 310). For example, as generally depicted by FIG. 3B, a human annotator (e.g., scientist, pathologist, clinician, or other medical or scientific expert) may observe the mxIF image 308 and classify and label one or more tissue cells or populations of tissue cells (e.g., a cancer cell, a plasma cell, a lymphocyte, a macrophage, a fibroblast, and so forth) based on, for example, one or more spatial features (e.g., varying colors of pixels of tissue cells, tissue cell count and density, tissue cell proportion and proximity, tissue cell area and multiplicity, and/or other features suitable for informing the phenotype of a cell by its spatial organization with respect to neighboring cells or its position within a tissue or region of a tissue)”, [0061]) [tissue abnormalities, spatial features, macrophages, regulatory T-cells (Tregs), CD8 cells, B lymphocytes, natural killer (NK) cells = “plurality of objects”; the first image can be H&E and the second image mxIF] [processing as pairs taught in previous claims by Ozcan et al.].
Regarding claim 9, Jesudason et al. and Bhattacharya et al. and Ozcan et al. disclose the method of claim 8. Jesudason et al. and Ozcan et al. further indicate generating the annotation image by processing an image captured by the second modality imaging over a physical slide (Jesudason et al., FIG. 3A illustrates a flow diagram of a method for training one or more machine-learning models to classify and phenotype tissue cells in one visualization modality image utilizing class labels from a different visualization modality image as ground truth, [0021], In certain embodiments, a whole slide image generation system 101 may generate one or more whole slide images or histopathology images, corresponding to a particular sample. For example, an image generated by whole slide image generation system 101 may include a stained section of a biopsy sample. As another example, an image generated by whole slide image generation system 101 may include a slide image (e.g., a blood film) of a liquid sample. As another example, an image generated by whole slide image generation system 101 may include fluorescence microscopy such as a slide image depicting fluorescence in situ hybridization (FISH) after a fluorescent probe has been bound to a target DNA or RNA sequence, [0035], Specifically, as will be further appreciated with respect to FIGS. 3A-3D and 4A-4D, the one or more populations of immune cells identified in the mxIF image 122 may be classified and phenotyped based on one or more spatial features (e.g., any features suitable for informing or ascertaining the phenotype of a cell by its spatial organization with respect to neighboring cells or its position within a tissue or region of a tissue), for example, by a human annotator (e.g., scientist, pathologist, clinician, or other medical or scientific expert). Thus, by then matching the one or more populations of immune cells to the corresponding immune cells in the H&E image 124, the classification and phenotyping can be verifiably trusted (e.g., 90%-100% confidence score) as ground truth data for accurately training the one or more machine-learning models to classify immune cells or other tissue cells in H&E image 124, [0051]; Ozcan et al., A diagnostician can manually label sections of the unstained tissue. These labels are used by the network to stain different areas of the tissue with the desired stains. A co-registered image of the histochemically stained H&E tissue is shown for comparison, [0034], The sample 22 may include, in some embodiments, a portion of tissue that is disposed on or in a substrate 23. The substrate 23 may include an optically transparent substrate in some embodiments (e.g., a glass or plastic slide or the like), [0047]).
Regarding claim 10, Jesudason et al. and Bhattacharya et al. and Ozcan et al. disclose the method of claim 8. Jesudason et al. and Ozcan et al. further indicate generating the annotation image based on a plurality of images captured by the second modality imaging over a physical slide (Jesudason et al., FIG. 3A illustrates a flow diagram of a method for training one or more machine-learning models to classify and phenotype tissue cells in one visualization modality image utilizing class labels from a different visualization modality image as ground truth, [0021], In certain embodiments, a whole slide image generation system 101 may generate one or more whole slide images or histopathology images, corresponding to a particular sample. For example, an image generated by whole slide image generation system 101 may include a stained section of a biopsy sample. As another example, an image generated by whole slide image generation system 101 may include a slide image (e.g., a blood film) of a liquid sample. As another example, an image generated by whole slide image generation system 101 may include fluorescence microscopy such as a slide image depicting fluorescence in situ hybridization (FISH) after a fluorescent probe has been bound to a target DNA or RNA sequence, [0035], Specifically, as will be further appreciated with respect to FIGS. 3A-3D and 4A-4D, the one or more populations of immune cells identified in the mxIF image 122 may be classified and phenotyped based on one or more spatial features (e.g., any features suitable for informing or ascertaining the phenotype of a cell by its spatial organization with respect to neighboring cells or its position within a tissue or region of a tissue), for example, by a human annotator (e.g., scientist, pathologist, clinician, or other medical or scientific expert). Thus, by then matching the one or more populations of immune cells to the corresponding immune cells in the H&E image 124, the classification and phenotyping can be verifiably trusted (e.g., 90%-100% confidence score) as ground truth data for accurately training the one or more machine-learning models to classify immune cells or other tissue cells in H&E image 124, [0051]; Ozcan et al., A diagnostician can manually label sections of the unstained tissue. These labels are used by the network to stain different areas of the tissue with the desired stains. A co-registered image of the histochemically stained H&E tissue is shown for comparison, [0034], The sample 22 may include, in some embodiments, a portion of tissue that is disposed on or in a substrate 23. The substrate 23 may include an optically transparent substrate in some embodiments (e.g., a glass or plastic slide or the like), [0047]).
Regarding claim 11, Jesudason et al. and Bhattacharya et al. disclose the method of claim 1. Jesudason et al. partly disclose generating HIFs from the annotations (Jesudason et al., “by utilizing the cross-modality cell-to-cell registration process, one or more tissue cells or populations of tissue cells may be identified, classified, and phenotyped across various visualization modalities. In this way, pathologists, scientists, clinicians (e.g., oncologists), or other medical and scientific experts may more readily classify and phenotype immune cells (e.g., macrophages, regulatory T-cells (Tregs), CD8 cells, B lymphocytes, natural killer (NK) cells, and so forth). This may engender, for example, earlier detection and diagnosis of non-Hodgkin's lymphoma (NHL) in patients”, [0057]); however, another reference is added to make this more explicit.
Ozcan et al. teach generating HIFs from the annotations (The digitally/virtually-stained output images 40 from the trained, deep neural network 10 were compared to the standard histochemical staining images 48 for diagnosing multiple types of conditions on multiple types of tissues, which were either Formalin-Fixed Paraffin-Embedded (FFPE) or frozen sections. The results are summarized in Table 1 below. The analysis of fifteen (15) tissue sections by four board certified pathologists (who were not aware of the virtual staining technique) demonstrated 100% non-major discordance, defined as no clinically significant difference in diagnosis among professional observers. The “time to diagnosis” varied considerably among observers, from an average of 10 seconds-per-image for observer 2 to 276 seconds-per-image for observer 3. However, the intra-observer variability was very minor and tended towards shorter time to diagnosis with the virtually-stained slide images 40 for all the observers except observer 2 which was equal, i.e., ˜10 seconds-per-image for both the virtual slide image 40 and the histology stained slide image 48. These indicate very similar diagnostic utility between the two image modalities, [0079], This potential staining standardization using deep learning can remedy the negative effects of human-to-human variations at different stages of the sample preparation, create a common ground among different clinical laboratories, enhance the diagnostic workflow for clinicians as well as assist the development of new algorithms such as automatic tissue metastasis detection or grading of different types of cancer, among others, [0086], In many clinical practices, H&E stains are available before special stains and pathologists may provide a “preliminary diagnosis” to enable the patient's nephrologist to begin treatment, [0180]) [features used for human diagnosis interpreted as HIFs].
Regarding claim 12, Jesudason et al. and Bhattacharya et al. and Ozcan et al. disclose the method of claim 11. Ozcan et al. further indicate using a second ML to predict cell/tissue from the pathology slide image; and generating the HIFs based additionally on the predicted cell/tissue (An example includes chemical H&E stains and the need to create specialized stains such as PAS, MT or JMS. For instance, non-neoplastic kidney disease relies on these “special stains” to provide the standard of care pathologic evaluation. In many clinical practices, H&E stains are available before special stains and pathologists may provide a “preliminary diagnosis” to enable the patient's nephrologist to begin treatment. This is especially useful in the setting of some diseases such as crescentic glomerulonephritis or transplant rejection, where quick diagnosis, followed by rapid initiation or treatment, may lead to significant improvements in clinical outcomes. In the setting when only H&E slides are initially available, the preliminary diagnosis is followed by the final diagnosis which is usually provided the next working day. As explained herein, an improved deep neural network 10.sub.stainTN was developed to improve the preliminary diagnosis by generating three additional special stains: PAS, MT and Jones methenamine silver (JMS) using the H&E stained slides, that can be reviewed by the pathologist concurrently with the histochemically stained H&E stain, [0180], A deep neural network 10.sub.stainTN was used to perform the transformation between the H&E stained tissue and the special stains. To train this neural network, a set of additional deep neural networks 10 were used in conjunction with one another. This workflow relies upon the ability of virtual staining to generate images of multiple different stains using a single unlabeled tissue section, [0182]).
Regarding claim 13, Jesudason et al., Bhattacharya et al., and Ozcan et al. disclose the method of claim 11. Ozcan et al. further indicate predicting a disease based on the HIFs, using a statistical model (“A significant advantage of the system 2 is that it is quite flexible. It can accommodate feedback to statistically mend its performance if a diagnostic failure is detected through a clinical comparison, by accordingly penalizing such failures as they are caught. This iterative training and transfer learning cycle, based on clinical evaluations of the performance of the network output, will help optimize the robustness and clinical impact of the presented approach. Finally, this method and system 2 may be used for micro-guiding molecular analysis at the unstained tissue level, by locally identifying regions of interest based on virtual staining, and using this information to guide subsequent analysis of the tissue for e.g., micro-immunohistochemistry or sequencing. This type of virtual micro-guidance on an unlabeled tissue sample can facilitate high-throughput identification of sub-types of diseases, also helping the development of customized therapies for patients”, [0096]).
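For illustration, a minimal sketch of a statistical model mapping per-slide HIF vectors to a disease label, here logistic regression over synthetic data; the features, labels, and task are assumptions and do not correspond to any cited reference.

    # Hypothetical sketch: a statistical model (logistic regression) that
    # maps per-slide HIF vectors to a binary disease label. Data are synthetic.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    # Rows: slides; columns: HIFs (e.g., densities and class fractions).
    X = rng.normal(size=(200, 5))
    # Synthetic labels correlated with the first two features.
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200)) > 0

    model = LogisticRegression().fit(X, y)
    print("disease probability, first slide:", model.predict_proba(X[:1])[0, 1])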
Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Jesudason et al. (US 20250246010 A1) and Bhattacharya et al. (US 20220058803 A1) as applied to claim 16 above, and further in view of Ozcan et al. (US 20230030424 A1).
Regarding claim 17, Jesudason et al. and Bhattacharya et al. disclose the method of claim 16. Jesudason et al. and Bhattacharya et al. do not disclose that the second type of image further comprises a phase image.
Ozcan et al. teach the second type of image further comprises a phase image (“In this particular embodiment, post-imaging computational autofocusing is performed using a trained, deep neural network 10a for incoherent imaging modalities. Thus, it may be used in connection with images obtained by fluorescence microscopy (e.g., fluorescent microscope) as well as other imaging modalities. Examples include a fluorescence microscope, a widefield microscope, a super-resolution microscope, a confocal microscope, a confocal microscope with single photon or multi-photon excited fluorescence, a second harmonic or high harmonic generation fluorescence microscope, a light-sheet microscope, a FLIM microscope, a brightfield microscope, a darkfield microscope, a structured illumination microscope, a total internal reflection microscope, a computational microscope, a ptychographic microscope, a synthetic aperture-based microscope, or a phase contrast microscope”, [0055], “To train the deep neural network 10a, a Generative Adversarial Network (GAN) may be used to perform the virtual focusing. The training dataset is composed of autofluorescence (endogenous fluorophores) images of multiple tissue sections, for multiple excitation and emission wavelengths. In another embodiment, the training images can be other microscope modalities (e.g., brightfield microscope, a super-resolution microscope, a confocal microscope, a light-sheet microscope, a FLIM microscope, a widefield microscope, a darkfield microscope, a structured illumination microscope, a computational microscope, a ptychographic microscope, a synthetic aperture-based microscope, or a total internal reflection microscope, and a phase contrast microscope)”, [0058], “The output images 20f of the first trained, deep neural network 10a is then input the virtual staining trained, neural network 10. In an alternative embodiment, one can train, using a similar process as outlined above, a single neural network 10 that can directly take an out-of-focus image 20d of an incoherent microscope 110 such as a fluorescence, brightfield, darkfield, or phase microscope, to directly output a virtually stained image 40 of the label-free sample 22, where the raw image 20d was out-of-focus (at the input of the same neural network)”, [0060]).
Jesudason et al. and Ozcan et al. are in the same art of registering different types of histological slide images (Jesudason et al., abstract, [0007], [0025]; Ozcan et al., abstract, [0034], [0046]). The combination of Ozcan et al. with Jesudason et al. and Bhattacharya et al. would enable the use of a phase image. It would have been obvious to one of ordinary skill in the art at the time of filing to combine the phase image of Ozcan et al. with the invention of Jesudason et al. and Bhattacharya et al.: the phase image was known at the time of filing; the combination would have had predictable results; a phase image is one of a limited number of types of slide images available, so the combination would have been obvious to try and a matter of design choice; and Ozcan et al. indicate “At the end of these registration steps 60, 64, 68, the auto-fluorescence image patches 20b and their corresponding brightfield tissue image patches 48f are accurately matched to each other and can be used as input and label pairs for the training of the deep neural network 10, allowing the network to solely focus on and learn the problem of virtual histological staining” ([0105]) and “For example, there is a need in the industry to transform chemical stains one type to another type. An example includes chemical H&E stains and the need to create specialized stains such as PAS, MT or JMS. For instance, non-neoplastic kidney disease relies on these “special stains” to provide the standard of care pathologic evaluation. In many clinical practices, H&E stains are available before special stains and pathologists may provide a “preliminary diagnosis” to enable the patient's nephrologist to begin treatment. This is especially useful in the setting of some diseases such as crescentic glomerulonephritis or transplant rejection, where quick diagnosis, followed by rapid initiation or treatment, may lead to significant improvements in clinical outcomes” ([0180]), demonstrating an efficiency improvement in training the neural network and a clinical-outcome improvement by allowing medical interventions to begin sooner, without waiting for more expensive imaging results.
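For illustration, a minimal sketch of assembling registered (label-free phase patch, stained patch) pairs as input/label training data, in the spirit of the registration passage quoted above; the dataset class, patch sizes, and toy tensors are assumptions, not an implementation of any cited reference.

    # Hypothetical sketch: pairing pre-registered label-free phase patches
    # with stained patches as input/label training data for an image-to-image
    # network. Shapes and the toy data are illustrative assumptions.
    import torch
    from torch.utils.data import Dataset

    class PairedPatchDataset(Dataset):
        def __init__(self, phase_patches, stained_patches):
            # Patches are assumed already registered: index i in one list
            # corresponds spatially to index i in the other.
            assert len(phase_patches) == len(stained_patches)
            self.phase = phase_patches
            self.stained = stained_patches

        def __len__(self):
            return len(self.phase)

        def __getitem__(self, i):
            # Input: unstained phase patch; label: stained patch.
            return self.phase[i], self.stained[i]

    # Toy data: 8 pairs of 1-channel phase / 3-channel stained patches.
    phase = [torch.rand(1, 256, 256) for _ in range(8)]
    stained = [torch.rand(3, 256, 256) for _ in range(8)]
    ds = PairedPatchDataset(phase, stained)
    print(len(ds), ds[0][0].shape, ds[0][1].shape)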
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US 12475193 B1 (Shoshan et al.) teaches using a machine learning (ML) model to obtain annotations of a pathology slide image obtained in a first imaging modality; wherein the ML model is trained based in part on images obtained from a second imaging modality different from the first imaging modality (The label data 240 may comprise information such as a sample identifier (ID) 242, modality label 244, model label 246, and so forth. The sample ID 242 indicates a particular training identity. The modality label 244 indicates whether the associated input data is representative of a first modality, second modality, and so forth. The model label 246, discussed later may indicate the embedding model 130 used to determine the training embedding data 220, col. 9, lines 52-59, The transformer network module 310 may comprise a neural network, col. 10, lines 55-60, the aggregation module 502 may comprise a machine learning system that utilizes one or more machine learning techniques, col. 14, lines 15-20); the second imaging modality comprising polarization imaging and the images being obtained from at least two polarization images (Depending upon the polarization used, the images produced by the scanner 104 may be of first modality features or second modality features. The first modality may utilize images in which the hand 102 is illuminated with light having a first polarization and obtained by the camera 108 with a polarizer passing light to the camera 108 that also has the first polarization, Second modality features comprise those features that are below the epidermis. The second modality may utilize images in which the hand 102 is illuminated with light having a second polarization and obtained by the camera 108 with the polarizer passing light to the camera 108 with the first polarization. For example, the second modality features may include subcutaneous anatomical structures such as veins, bones, soft tissue, and so forth, col. 5, line 45 – col. 6, line 5).
US 20130094733 A1: a component feature vector computing section configured to obtain a weighted sum of the higher-order local auto-correlation features by adding the higher-order local auto-correlation features belonging to the same invariant feature group which consists of a plurality of local pattern masks which can be considered to be equivalent to each other when they are flipped and/or rotated by 45 degrees or multiples of 45 degrees.
US 20220125280 A1: As previously described, the different modalities can include visible polarized light, NIR polarized light, and, optionally, shear wave ultrasound, which have different advantages over the other, To generate the visible light image frame, the computing device can cause or control the filter (or another filter) to selectively pass the visible light, As a specific example, twelve different polarization angles are used with varied intervals, such as angles of −10, 0, 10, 35, 45, 55, 80, 90, 100, 125, 135, and 140 degrees. As another example, seven different polarization angles are used with varied intervals, such as angles of 0, 30, 60, 45, 90, 120, and 150 degrees. However, embodiments are not so limited and different numbers of polarization angles and different varied intervals (e.g., degrees and spacing between respective angles) can be used by the imaging apparatus.
US 20220414928 A1: As another example, the electro-optic modulator may be configured to transmit light of different linear polarizations when capturing different frames, e.g., so that the camera captures images with the entirety of the polarization mask set, sequentially, to different linear polarizer angles (e.g., sequentially set to: 0 degrees; 45 degrees; 90 degrees; or 135 degrees). As such, a neural network trained according to embodiments of the present disclosure would disambiguate between the different noisy or inconsistent surface normals N computed from the different views, modalities, and spectral information captured (e.g., where the surface normals maps N may differ in accordance with the noise or variability of the polarization signature due to differences in albedo, wavelength, and texture copy artifacts due to viewing direction, as well as other noise in ambiguity in the image capture process). Comparing the shape estimates from the model 960 against the shape estimates computed directly from the images 980 in a comparison module 990 show that a properly-trained computer vision model 940 produces smoother and more accurate shape estimates than comparative techniques.
US 20220343537 A1: For example, a polarization camera such as those described above with respect to FIGS. 1B, 1C, 1D, and 1E captures polarization raw frames with four different polarization angles ϕ.sub.pol, e.g., 0 degrees, 45 degrees, 90 degrees, or 135 degrees, thereby producing four polarization raw frames I.sub.ϕ.sub.pol, denoted herein as I.sub.0, I.sub.45, I.sub.90, and I.sub.135. This process of rendering images may include placing a virtual camera at one or more poses with respect to the scene and rendering a 2-D image of the scene from those virtual viewpoints. In addition to rendering visible light images (e.g., color images), a synthetic data generator may also render images in other imaging spectra such as infrared and ultraviolet light, and using other imaging modalities such as polarization. Systems and methods for synthetic data generation are described in more detail in International Patent Application No. PCT/US21/12073 “SYSTEMS AND METHODS FOR SYNTHESIZING DATA FOR TRAINING STATISTICAL MODELS ON DIFFERENT IMAGING MODALITIES INCLUDING POLARIZED IMAGES”.
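For illustration, a minimal sketch of combining four polarization raw frames at 0, 45, 90, and 135 degrees, as referenced in the excerpts above, into the standard Stokes parameters, degree of linear polarization, and angle of linear polarization; the frames and image size are synthetic assumptions, not data from any cited reference.

    # Hypothetical sketch: combining four polarization raw frames (0, 45,
    # 90, 135 degrees) into Stokes parameters, degree of linear polarization
    # (DoLP), and angle of linear polarization (AoLP). Frames are synthetic.
    import numpy as np

    rng = np.random.default_rng(0)
    I0, I45, I90, I135 = (rng.random((64, 64)) for _ in range(4))

    S0 = 0.5 * (I0 + I45 + I90 + I135)  # total intensity estimate
    S1 = I0 - I90                       # horizontal vs. vertical component
    S2 = I45 - I135                     # +45 vs. -45 degree component

    dolp = np.sqrt(S1**2 + S2**2) / np.maximum(S0, 1e-8)
    aolp = 0.5 * np.arctan2(S2, S1)     # radians

    print("mean DoLP:", dolp.mean(), "mean AoLP:", aolp.mean())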
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE M ENTEZARI HAUSMANN whose telephone number is (571)270-5084. The examiner can normally be reached 10-7 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent M Rudolph can be reached at (571) 272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MICHELLE M ENTEZARI HAUSMANN/Primary Examiner, Art Unit 2671