DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Receipt of Applicant’s Amendment filed January 24, 2025, is acknowledged.
Response to Amendment
Claims 1-14 have been amended. Claim 15 is canceled. Claims 16 and 17 are new. Claims 1-14, 16, and 17 are pending and have been examined on their merits.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-14, 16, and 17 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claims 1-14, 16, and 17 are directed to a system or method, which are statutory categories of invention. (Step 1: YES).
The Examiner has identified method Claim 14 as representative of the claimed invention for purposes of this analysis; system Claim 1 recites similar limitations.
Claim 14 recites the limitations of:
A computer-implemented patient-specific training method to be used with a patient-specific training system, comprising the following steps:
acquiring image data from a patient;
generating a first classification signal for each image of the acquired image data;
separating the image data into at least a first dataset and a second dataset based on the first classification signal and according to a first reliability criterion;
initializing an artificial neural network; and
training the artificial neural network with the entire first dataset or a part thereof.
The above limitations, under their broadest reasonable interpretation, cover performance of the limitations as mental processes. The claim recites elements, set forth above, that cover concepts which can be performed in the mind of a person or with pen and paper. A person can acquire image data from a patient by looking at the images, generate a first classification signal for each image mentally or with pen and paper, and separate the image data into a first and a second dataset based on the classification signal and according to a first reliability criterion. Initializing an artificial neural network could be something as simple as providing data to the neural network. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation as a mental process, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea. Claim 1 is abstract for similar reasons. (Step 2A-Prong 1: YES. The claims are abstract.)
Giving the claim its broadest reasonable interpretation, the claim is also abstract as certain methods of organizing human activity: the claim is directed to using patient data to train an artificial neural network in order to diagnose a patient, and is therefore abstract as managing personal behavior and relationships or interactions between people by diagnosing a patient (regarding the patient’s health); see paras. [0002], [0008], and [0071] of the specification.
This judicial exception is not integrated into a practical application. In particular, the claims only recite: an artificial neural network, a computing device, and an output interface (Claim 1); a computer and an artificial neural network (Claim 14). The computer hardware is recited at a high level of generality (i.e., as a generic processor performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using a generic computer component (see para. [0017] and the use of an entity to process data). The initializing and training of the neural network appear to use existing neural network technology at a high level, and there is no detail as to how the network is initialized or trained. Further, there appears to be no improvement to the neural network technology itself (see para. [0016], where the artificial neural network can be any of various types of existing networks). Accordingly, these additional elements, when considered separately and as an ordered combination, do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Therefore, claims 1 and 14 are directed to an abstract idea without a practical application. (Step 2A-Prong 2: NO. The additional claimed elements are not integrated into a practical application.)
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because, when considered separately and as an ordered combination, they do not add significantly more (also known as an “inventive concept”) to the exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using computer hardware amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept; see MPEP 2106.05(f), where applying a computer as a tool is not indicative of significantly more. Accordingly, these additional elements, when considered separately and as an ordered combination, do not amount to significantly more because they do not impose any meaningful limits on practicing the abstract idea. Steps such as acquiring (receiving) data are insignificant extra-solution activity and mere instructions to apply the exception using general computer components (see MPEP 2106.05(d), II). Thus, claims 1 and 14 are not patent eligible. (Step 2B: NO. The claims do not provide significantly more.)
Dependent claims 2-13, 16, and 17 further define the abstract idea that is present in their respective independent claims 1 and 14, and thus correspond to Mental Processes and Certain Methods of Organizing Human Activity and are abstract for the reasons presented above. The dependent claims do not include any additional elements that integrate the abstract idea into a practical application or that are sufficient to amount to significantly more than the judicial exception when considered both individually and as an ordered combination. The claims themselves are abstract or further limit abstract elements. Claims 2, 6, and 10-12 recite modules, which are computer software recited at a high level of generality, and claims 2, 11, and 12 recite a neural network at a high level of generality. Therefore, claims 2-13, 16, and 17 are directed to an abstract idea. Thus, claims 1-14, 16, and 17 are not patent eligible.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claim 5 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claim 5 recites “the first dataset comprises all images above a relative threshold probability value between the classes and the second dataset comprises all images at or below the same relative threshold probability value between the classes,” where “between the classes” is indefinite. “Between the classes” is a relative term, and the classes could be anything, as no value or basis is provided. For examination purposes, this is interpreted as any class.
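For purposes of illustration only, and without characterizing the claims or the cited art, the ambiguity can be sketched in hypothetical Python: one reading treats the “relative threshold … between the classes” as a margin between the two highest class probabilities, while another reading treats it as an ordinary per-image probability threshold. The function names, the data layout, and the 0.2 and 0.7 values are assumptions made solely for illustration.

    # Hypothetical sketch only; not Applicant's or the cited art's implementation.
    def split_by_margin(images, margin=0.2):
        # Reading 1: "relative ... between the classes" as the gap between the
        # top two class probabilities for each image.
        first, second = [], []
        for img in images:
            probs = sorted(img["probs"].values(), reverse=True)
            gap = probs[0] - (probs[1] if len(probs) > 1 else 0.0)
            (first if gap > margin else second).append(img)
        return first, second

    def split_by_absolute(images, threshold=0.7):
        # Reading 2: an ordinary absolute threshold on the highest class
        # probability, akin to the absolute threshold recited in claim 4.
        first = [i for i in images if max(i["probs"].values()) > threshold]
        second = [i for i in images if max(i["probs"].values()) <= threshold]
        return first, second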
Claim 7 recites “the selector unit further comprises a threshold discriminating subunit, which is configured to select a class of the first set of classes as a class of the second set of classes, provided that the number of images classified in the class of the first set of classes is above a certain threshold value,” where “select a class of the first set of classes as a class of the second set of classes” is indefinite. It is unclear how a class of the second set of classes is selected from the first set of classes and how that selection relates to the threshold. For examination purposes, this is interpreted as the second class having the same threshold requirement (number of images) as the first class.
Claim 14 recites “initializing an artificial neural network,” where initializing a neural network is indefinite. Initializing could be anything, including installing software on a computer, providing data to the network, etc. For examination purposes, this is interpreted as some type of interaction with a neural network on a computer.
Examiner Request
The Applicant is requested to indicate where in the specification there is support for any amendments to the claims, should the Applicant amend. The purpose of this request is to reduce potential 35 U.S.C. § 112(a) or § 112, first paragraph, issues that can arise when claims are amended without support in the specification. The Examiner thanks the Applicant in advance.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-10, 12-14, 16 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Pub. No. US 2022/0237788 to Shaul et al. in view of Pub. No. US 2021/0090694 to Colley et al.
Regarding claim 1
A patient-specific artificial neural network training system, comprising:
an input interface configured to receive image data from a patient;
Shaul et al. teaches:
User interface to select or provide (receive) a particular image…
“The computer system comprises a user interface that enables a user 984 to select or provide a particular image or image tile that is to be used as search image 986. The trained feature-extraction MLL 950 is adapted to extract a feature vector 988 (“search feature vector”) from the input image. a search engine 990 receives the search feature vector 988 from the feature output MLL 950 and performs a vector-based similarity search in the image database. The similarity search comprises comparing the search feature vector which each of the feature vectors of the images in the database in order to compute a similarity score as a function of the two compared feature vectors. The similarity score is indicative of the degree of similarity of the search feature vector with the feature vector of the image in the database and hence indicates the similarity of the tissue patterns depicted in the two compared images. The search engine 990 is adapted to return and output a search result 994 to the user. The search result can be, for example, one or more images of the database for which the highest similarity score was computed.” [0385]
Receiving (acquiring) images, where the images depict a tissue sample of a patient…
“In one aspect, the invention relates to a method for classifying tissue images. The method comprises: receiving, by an image analysis system, a plurality of digital images; each of the digital images depicts a tissue sample of a patient; splitting, by the image analysis system, each received image into a set of image tiles; for each of the tiles, computing, by the image analysis system, a feature vector comprising image features extracted selectively from the tile; providing a Multiple-Instance-Learning (MIL) program configured to use a model for classifying any input image as a member of one out of at least two different classes based on the feature vectors extracted from all tiles of the said input image; for each of the tiles, computing a certainty value (referred herein according to embodiments of the invention as “c”); the certainty value is indicative of the certainty of the model regarding the contribution of the tile's feature vector on the classification of the image from which the tile was derived; for each of the images: using, by the MIL-program, a certainty-value-based pooling function for aggregating the feature vectors extracted from the image into a global feature vector as a function of the certainty values of the tiles of the image, and computing an aggregated predictive value (referred herein according to embodiments of the invention as “ah”) from the global feature vector; or computing, by the MIL program, predictive values from respective ones of the feature vectors of the image and using, by the MIL-program, a certainty-value-based pooling function for aggregating the predictive values of the image into an aggregated predictive value (referred herein according to embodiments of the invention as “ah”) as a function of the certainty values of the tiles of the image; and classifying, by the MIL-program, each of the images as a member of one out of the at least two different classes based on the aggregated predictive value.” [0008] – [0017]
a computing device comprising a classification module, a separator module, and a training module; and
MIL-program for classifying (classification module)…
“…Hence, using a MIL-program for classifying digital tissue images may have the advantage that weakly annotated training data is sufficient for training a MIL-program capable of accurately classifying digital tissue images. Furthermore, the trained MIL-program will be able to accurately classifying digital tissue images even in case a human annotator, e.g. a pathologist, does not know the tissue structures which are highly predictive of the class membership of the tissue and hence is not able to select training images having an unbiased ratio of tissue regions with and without this tissue structure.” [0019]
Example of image splitting (separator) module…
“The image analysis system further comprises an image splitting module being executable by the at least one processor and being configured to split each of the images into a set of image tiles.” [0149]
A training process (module)…
“A “machine learning logic (MLL)” as used herein is a program logic, e.g. a piece of software like a neuronal network or a support vector machine or the like, that has been trained or that can be trained in a training process and that comprises a predictive model that—as a result of the training phase—has learned to perform some predictive and/or data processing tasks (e.g. image classification) based on the provided training data. Thus, an MLL can be a program code that is at least partially not explicitly specified by a programmer, but that is implicitly learned and modified in a data-driven learning process that builds one or more implicit or explicit models from sample inputs. Machine learning may employ supervised or unsupervised learning.” [0220]
See Training module below.
an output interface configured to output a diagnosis signal;
Example of outputting an image tile with the image class label of a cancer patient (diagnosis signal)…
“This may be advantageous as this may enable a pathologist to identify tiles and tissue structures depicted therein which have the highest predictive value in respect to the membership of an image in a particular class. Thereby, new biomedical knowledge can be generated that has not yet been described in the biomedical literature before. For example, by outputting and highlighting the one out of 1000 image tiles depicting a particular tissue structure that is highly predictive of the image class label “tissue of a cancer patient who will benefit from treatment X” may reveal a correlation or causal relationship between the tissue structure depicted this tile and the particular image class that may not have been discovered and published before. Furthermore, even in case the tissue structure being predictive for a particular image class is as such known, the heat map or any other form of graphically representing the tile having the highest predictive value wh may have the advantage that a pathologist is enabled to identify the one or more tiles comprising the most interesting tissue structures in respect to a particular biomedical question faster and more accurately.” [0117]
wherein the classification module is configured to acquire as input the received image data and generate a first classification signal for each image of the image data,
Receiving (acquiring) images, where the images depict a tissue sample of a patient…
“In one aspect, the invention relates to a method for classifying tissue images. The method comprises: receiving, by an image analysis system, a plurality of digital images; each of the digital images depicts a tissue sample of a patient; splitting, by the image analysis system, each received image into a set of image tiles; for each of the tiles, computing, by the image analysis system, a feature vector comprising image features extracted selectively from the tile; providing a Multiple-Instance-Learning (MIL) program configured to use a model for classifying any input image as a member of one out of at least two different classes based on the feature vectors extracted from all tiles of the said input image; for each of the tiles, computing a certainty value (referred herein according to embodiments of the invention as “c”); the certainty value is indicative of the certainty of the model regarding the contribution of the tile's feature vector on the classification of the image from which the tile was derived; for each of the images: using, by the MIL-program, a certainty-value-based pooling function for aggregating the feature vectors extracted from the image into a global feature vector as a function of the certainty values of the tiles of the image, and computing an aggregated predictive value (referred herein according to embodiments of the invention as “ah”) from the global feature vector; or computing, by the MIL program, predictive values from respective ones of the feature vectors of the image and using, by the MIL-program, a certainty-value-based pooling function for aggregating the predictive values of the image into an aggregated predictive value (referred herein according to embodiments of the invention as “ah”) as a function of the certainty values of the tiles of the image; and classifying, by the MIL-program, each of the images as a member of one out of the at least two different classes based on the aggregated predictive value.” [0008] – [0017]
Using the MIL-program (generating) to classify tissue structures with high predictive value (first classification signal) for each image…
“A Multiple instance learning (MIL) program is a form of weakly supervised learning program configured to learn from a training set wherein training instances are arranged in sets called bags, and wherein a label is provided for the entire bag while the labels of the individual instances in the bag are not known. Hence, a MIL-program requires only weakly annotated training data. This type of data is especially common in medical imaging because the annotation of individual image regions to provide richly annotated training data is highly time consuming and hence expensive. Furthermore, the tissue structures which imply (have high predictive value for . . . ) a digital image to be member of a particular class (e.g. image depicting healthy tissue/image depicting a primary tumor/image depicting a metastase) are sometimes not known or are not perceivable by a pathologist. Hence, using a MIL-program for classifying digital tissue images may have the advantage that weakly annotated training data is sufficient for training a MIL-program capable of accurately classifying digital tissue images. Furthermore, the trained MIL-program will be able to accurately classifying digital tissue images even in case a human annotator, e.g. a pathologist, does not know the tissue structures which are highly predictive of the class membership of the tissue and hence is not able to select training images having an unbiased ratio of tissue regions with and without this tissue structure.” [0019]
wherein the separator module is configured to, based on the first classification signal, separate the received image data into at least a first dataset and a second dataset according to a first reliability criterion, and
Example of image splitting (separator) module…
“The image analysis system further comprises an image splitting module being executable by the at least one processor and being configured to split each of the images into a set of image tiles.” [0149]
Classifying images into at least two classes and certainty value (reliability criterion)…
“The image analysis system further comprises a Multiple-Instance-Learning (MIL) program. The MIL-program is executable by the at least one processor and is configured to use a model for classifying any input image as a member of one out of at least two different classes based on the feature vectors extracted from all tiles of the said input image. The MIL-program is further configured for: for each of the tiles, computing a certainty value, the certainty value being indicative of the certainty of the model regarding the contribution of the tile's feature vector on the classification of the image from which the tile was derived; for each of the images: using, by the MIL-program, a certainty-value-based pooling function for aggregating the feature vectors extracted from the image into a global feature vector as a function of the certainty values of the tiles of the image, and computing an aggregated predictive value (referred herein according to embodiments of the invention as “ah”) from the global feature vector; or computing, by the MIL program, predictive values from respective ones of the feature vectors of the image and using, by the MIL-program, a certainty-value-based pooling function for aggregating the predictive values of the image into an aggregated predictive value (“ah”) as a function of the certainty values of the tiles of the image; and classifying each of the images as a member of one out of the at least two different classes based on the aggregated predictive value.” [0151] – [0156]
Example of training set (first dataset) based on highest predictive power…
“According to embodiments, the providing of the MIL-program comprises training the MIL-program on a training image set, whereby during the training phase a certainty-value-based-max-pooling function is used as pooling function. This may have the advantage that the predictive model generated during the training the MIL strongly reflects the tissue pattern depicted in the tile having the feature vector with the highest predictive power in respect to the bag's label. The model is not negatively affected by tissue regions/tiles which are irrelevant for the label. However, the maximum operation will neglect all the information contained in all tiles except the highest scoring tile. Hence, the predictive power of tiles/tissue patterns which may also be of relevance may be missed.” [0046]
wherein the training module is configured to use the first dataset or only a part thereof as a training dataset to train an artificial neural network.
“According to embodiments, the MIL-program is a neural network. The certainty-value is computed using a dropout technique at training and/or test time of the model of the neural network.” [0074]
Training images (dataset or part thereof) that are labeled by class (first dataset or part thereof) using MIL-program (neural network)…
“According to embodiments, the providing of the MIL-program comprises training the model of the MIL-program. The training comprises: providing a set of digital training images of tissue samples, each digital training image having assigned a class label being indicative of one of the at least two classes; splitting each training image into training image tiles, each training tile having assigned the same class label as the digital training image from which the training tile was derived; for each of the tiles, computing, by the image analysis system, a training feature vector comprising image features extracted selectively from the said tile; and/or repeatedly adapting the model of the MIL-program such that an error of a loss function is minimized; the error of the loss function indicates a difference of predicted class labels of the training tiles and the class labels actually assigned to the training tiles; the predicted class labels have been computed by the model based on the feature vector of the training tiles.” [0106] - [0132]
Training Module
Shaul et al. teaches a training process but does not specifically teach a training module.
Colley et al., also in the business of training, teaches:
“Referring still to FIGS. 11b and 11c, the input micro-services 1167 may also run a variant classification engine 1360 on the variant files utilizing a knowledge database of variant information 1175 to calculate many different types of variant criteria, further classification and addition database insertion. The variant micro-service 1167 may publish an alert 1183 when a key event occurs, to which other services 1179 can subscribe in order to react. After a variant call text file is parsed, the variant micro-service may insert variant analysis data into the expert treatment system database 160 including criteria, classifications, variants, findings, and sample information.” [1055]
Training engine (module)…
“…The user may correct any errors by selecting the data field in the application corresponding to the information. These errors may be stored and/or sent to a training engine to improve upon the extraction algorithms and techniques. The training engine may generate a new extraction algorithm to use in future extractions based from detected errors. More information is disclosed below with respect to FIGS. 165 and 168, below…”
It would have been obvious to one of ordinary skill in the art before the effective filing date to include in the method and system of the combined references the ability to use a training engine (module) as taught by Colley et al., since the claimed invention is merely a combination of old elements, in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. Further motivation is provided by Shaul et al., who teaches a training process.
Regarding claim 2
The patient-specific training system according to claim 1, wherein the classification module comprises a pre-trained artificial neural network structure.
Shaul et al. teaches:
Example of pre-trained MIL program (neural network)…
“A MIL-program in the form of a Resnet50 (Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 770-778, June 2016) pre-trained on ImageNet was provided, and feature vectors (embeddings) of size 2048 were extracted for each tile from the second to last layer of the network.” [0289]
Regarding claim 3
The patient-specific training system according to claim 1, wherein the first classification signal indicates, for each image, probability values associated with a first set of classes.
Shaul et al. teaches:
Aggregated predictive value associated with “image depicting a tumor” (first set of classes)…
“The aggregated predictive value is then used for classifying the input image. For example, in case wh.sub.MAX exceeds a threshold determined during training, a digital image may be classified as “image depicting a tumor”. Otherwise, the image is classified as “image depicting healthy tissue”.” [0044]
Regarding claim 4
The patient-specific training system according to claim 1, wherein the first dataset comprises all images above a threshold probability value and the second dataset comprises all images at or below the same threshold probability value.
Shaul et al. teaches:
Exceeds a threshold, and otherwise (at or below the same threshold, second dataset)…
“The aggregated predictive value is then used for classifying the input image. For example, in case wh.sub.MAX exceeds a threshold determined during training, a digital image may be classified as “image depicting a tumor”. Otherwise, the image is classified as “image depicting healthy tissue”.” [0044]
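For purposes of illustration only, the separation recited in claim 4 can be sketched in hypothetical Python as a split of images by their highest class probability. The function name, the data layout, and the 0.7 threshold are assumptions and do not represent Applicant’s or Shaul’s implementation.

    # Hypothetical sketch only: images whose top class probability exceeds the
    # threshold form the first dataset; all remaining images (at or below the
    # same threshold) form the second dataset.
    def separate(image_probs, threshold=0.7):
        first_dataset, second_dataset = [], []
        for image, probs in image_probs.items():
            if max(probs.values()) > threshold:
                first_dataset.append(image)   # reliably classified images
            else:
                second_dataset.append(image)  # uncertain classifications
        return first_dataset, second_dataset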
Regarding claim 5
The patient-specific training system according to claim 1, wherein the first dataset comprises all images above a relative threshold probability value between the classes and the second dataset comprises all images at or below the same relative threshold probability value between the classes.
Shaul et al. teaches:
Exceeds a threshold, and otherwise (at or below the same threshold, second dataset) for classes…
“The aggregated predictive value is then used for classifying the input image. For example, in case wh.sub.MAX exceeds a threshold determined during training, a digital image may be classified as “image depicting a tumor”. Otherwise, the image is classified as “image depicting healthy tissue”.” [0044]
Regarding claim 6
The patient-specific training system according to claim 1, wherein the training module further comprises a selector unit, which is configured to determine a second set of classes and assemble the training dataset based on information from the first dataset or the second dataset.
{
From Applicant’s specification on selector unit…
“According to some of the embodiments, refinements, or variants of embodiments, the training module further comprises a selector unit, which is configured to determine a second set of classes and assemble the training dataset, based on information from the first dataset and/or the second dataset. While the first set of classes is determined by the pre-trained artificial neural network of the classification unit, the second set of classes can be determined prior to the training of the artificial neural network. This depends on the outcome of the classification performed by the classification unit. If, for instance, the representatives of the first dataset only populate significantly a subset of classes, it might be preferable to train the artificial neural network with only those well-represented classes. On the other hand, the second dataset might show that the unclear cases mostly arise from the confusion between two classes. In this case, it might be advantageous to concentrate the refined classification on these two classes, since this is where the classification can most likely be improved. The selection of the specific training dataset, e.g., the proportion of images from each of the selected classes, is also a parameter that can be adjusted depending on the class-distribution of the first dataset. In certain preferred embodiments, the second set of classes might be a subset of the first set of classes. In certain other embodiments, the second set of classes might contain classes that were not present in the first set of classes.” [0033]
The second set of classes can be a subset of the first set or can contain different classes not present in the first set of classes.
}
Shaul et al. teaches:
Images with class label output…
“The GUI is configured to display the classification result 207 generated by the MIL-program for each of the one or more received images 212. For example, the GUI can display images or image tiles with a class label output by the MIL-program.” [0281]
Select subsets of images (first or second dataset)…
“Optionally, the image analysis system can comprise a sampling module (not shown) adapted to select samples (subsets) of the images for training and test the trained MIL on the rest of the image tiles. The sampling module may perform a clustering of the tiles based on their feature vectors first before performing the sampling.” [0284]
Regarding claim 7
The patient-specific training system according to claim 6, wherein the selector unit further comprises a threshold discriminating subunit, which is configured to select a class of the first set of classes as a class of the second set of classes, provided that the number of images classified in the class of the first set of classes is above a certain threshold value.
{
From Applicant’s specification on threshold discriminating subunit…
“According to some of the embodiments, refinements, or variants of embodiments, the selector unit further comprises a threshold discriminating subunit, which is configured to select a class of the first set of classes as a class of the second set of classes, provided that the number of images classified in the class of the first set of classes is above a certain threshold value. In other words, the configuration of the second set of classes can take place based on a quantitative criterion, which only selects those classes that are statistically significant by defining a lower threshold of representatives per class.” [0034]
This looks at the number (count) of images in a class of the first set that are above a threshold (e.g., 20 images above a threshold value of 70%) and selects that configuration for the second class. As claimed, this appears to require only a second class equal to a first class in which some number of images is above a threshold (so the second class needs images above a threshold value).
}
Shaul et al. teaches:
An image is classified into a particular class when a certainty value exceeds a predefined threshold; therefore, a second class would have the same threshold requirement as a first class…
“For example, a maximum pooling function may classify an image to be member of a particular class only in case an attention-based predictive value which was weighted by the certainty value exceeds a predefined threshold.” [0145]
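For purposes of illustration only, the Examiner’s reading of the threshold discriminating subunit can be sketched in hypothetical Python: a class of the first set is carried into the second set only if the count of images classified into it exceeds a threshold. The names and the count of 20 are assumptions made solely for illustration.

    # Hypothetical sketch only; not Applicant's or the cited art's code.
    from collections import Counter

    def select_second_set(first_set_labels, min_count=20):
        # first_set_labels: one class label per image of the first dataset.
        counts = Counter(first_set_labels)
        # Keep only classes whose number of classified images is above the
        # threshold value, per the interpretation applied above.
        return {cls for cls, n in counts.items() if n > min_count}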
Regarding claim 8
The patient-specific training system according to claim 6, wherein the selector unit further comprises a sampling subunit, which is configured to sample a subset of data from the first dataset or augment the first dataset, such that each class of the second set of classes has a statistically significant number of images in the training dataset.
{
From Applicant’s specification on sampling subunit…
“According to some of the embodiments, refinements, or variants of embodiments, the selector unit further comprises a sampling subunit, which is configured to sample a subset of data from the first dataset and/or augment the first dataset, such that each class of the second set of classes has a statistically significant number of images in the training dataset. Even with the introduction of a threshold, some classes might be underrepresented with respect to other selected classes. This might lead to an unwanted bias in the training of the artificial neural network. In order to avoid this possible bias it is advantageous to balance the amount of representatives per class. In general, the sampling subunit will take a random selection of representatives of the overrepresented classes to have a balanced distribution of the training images according to the second set of classes. The sampling subunit can also apply augmentation techniques to generate further training data for less well-represented or rare classes, e.g., by rotating, mirroring, flipping, scaling and/or resampling image data. In general, increasing the amount of qualified training data by augmentation will lead to a better performance of the artificial neural network.” [0035]
Therefore, a sample of the dataset is used for the second dataset (the second set of classes is a subset of the first set of classes).
}
Shaul et al. teaches:
Images with class label output…
“The GUI is configured to display the classification result 207 generated by the MIL-program for each of the one or more received images 212. For example, the GUI can display images or image tiles with a class label output by the MIL-program.” [0281]
Select subsets of images (first or second dataset)…
“Optionally, the image analysis system can comprise a sampling module (not shown) adapted to select samples (subsets) of the images for training and test the trained MIL on the rest of the image tiles. The sampling module may perform a clustering of the tiles based on their feature vectors first before performing the sampling.” [0284]
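For purposes of illustration only, the balancing performed by the claimed sampling subunit can be sketched in hypothetical Python as subsampling overrepresented classes so that each class of the second set contributes a comparable number of images to the training dataset; augmentation (e.g., rotating or mirroring images) would be the corresponding option for underrepresented classes. The function name and the target of 100 images per class are assumptions.

    # Hypothetical sketch only; not Applicant's or the cited art's code.
    import random

    def balance_training_set(first_dataset, target_per_class=100, seed=0):
        # first_dataset: list of (image, class_label) pairs.
        random.seed(seed)
        by_class = {}
        for image, cls in first_dataset:
            by_class.setdefault(cls, []).append(image)
        balanced = []
        for cls, images in by_class.items():
            # Subsample overrepresented classes; rare classes keep all images
            # (augmentation would be applied to them instead, if needed).
            sample = random.sample(images, min(target_per_class, len(images)))
            balanced.extend((img, cls) for img in sample)
        return balanced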
Regarding claim 9
The patient-specific training system according to claim 6, wherein the selector unit further comprises an artificial intelligence subunit, pre-trained to recognize a specific diagnosis, which is configured to select the training dataset based on a second reliability criterion.
{
From Applicant’s specification on specific diagnosis…
“According to some of the embodiments, refinements, or variants of embodiments, the selector unit further comprises an artificial intelligence subunit pre-trained to recognize a specific diagnosis, which is configured to select the training images based on a second reliability criterion. In some cases where a specific disease or abnormality is the target of the analysis of the patient images, the second set of classes should be related to the markers of the specific disease or abnormality. In this case, using an artificial neural network pre-trained to identify the specific disease or abnormality can select the training dataset more efficiently. Depending on the data available, the artificial neural network can be trained with generic data or patient-specific data. The reliability criterion may be a probability distribution among the different classes with which the artificial neural network has been pretrained.” [0036]
“The artificial intelligence subunit 423 is configured to perform a selection of the second set of classes Cl'-C3' based on a specific diagnosis. The artificial intelligence subunit 423 comprises at least an artificial neural network (not shown in the figure) pre-trained with data in order to recognize a specific disease. Since in some of the embodiments the training system 100 is meant to be part of a computer assisted diagnostic system, fixing the second set of classes Cl'-C3' by using pre-trained artificial neural networks can be certainly advantageous. For example, one might want to use the training system 100 to ascertain whether a patient which was diagnosed in the past with blood cancer suffers it again. The artificial intelligence subunit 423 can be trained with the blood samples of the patient when he/she was first diagnosed, and use this trained artificial neural network to determine the classes Cl'-C3' of the artificial neural network 41. The amount of available data for the training is in this case clearly the limiting factor, but one could also complement patient blood samples with blood samples of other patients diagnosed with the same type of blood cancer.” [0077]
Therefore, an artificial intelligence subunit pre-trained to recognize a diagnosis or disease selects the dataset based on a second reliability criterion, i.e., a pre-trained AI model for diagnosing a disease is run on the image samples to see whether they meet a threshold.
}
Shaul et al. teaches:
Using a trained (therefore, pre-trained) MIL-program (AI model) for the classification of digital images that takes into account uncertainties (reliability criterion)…
“FIG. 1 depicts a flowchart of a method according to an embodiment of the invention. The method can be used for classifying tissue images of a patient. The classification can be performed e.g. for predicting an attribute value of a patient such as, for example, a biomarker status, diagnosis, treatment outcome, microsatellite status (MSS) of a particular cancer such as colorectal cancer or breast cancer, micrometastases in Lymph nodes and Pathologic Complete Response (pCR) in diagnostic biopsies. The prediction is based on a classification of digital images of histology slides using a trained MIL-program that takes into account model uncertainties. In the following description of FIG. 1, reference will also be made to elements of FIGS. 2, 15 and 16.” [0257]
Regarding claim 10
The patient-specific training system according to claim 6, wherein the training module is further configured to generate a second classification signal associated with each image of the training dataset, wherein the second classification signal indicates, for each image of the training dataset, probability values associated to the second set of classes.
Shaul et al. teaches:
Example of distance between two tiles (second classification signal) and probability of non-tumor (second set of classes)…
“When a training image comprising many different tissue patterns (e.g. “non-tumor” and “tumor”) is split into many different tiles, the smaller the distance between two tiles, the higher the probability that both compared tiles depict the same tissue pattern, e.g. “non-tumor”. There will, however, be some tile pairs next to the border of two different patterns that depict different tissue pattern (e.g. the first tile “tumor”, the other tile “non-tumor”). These tile pairs generate noise, because they depict different tissue patterns although they lie in close spatial proximity to each other.” [0174]
Regarding claim 12
The patient-specific training system according to claim 11, wherein the training-scenario subunit is further configured to add patient-specific non-image data to the training dataset used to train the artificial neural network of the training module.
{
From Applicant’s specification on non-image data…
“According to some of the embodiments, refinements, or variants of embodiments, the patient-specific non-image data comprises information on at least a previous diagnosed disease. This existing diagnosis of the patient selects the kind of algorithm to be used, in particular one that is pre-trained to identify the specific diagnosed disease. The artificial neural network could be trained on generic data or data from the patient itself, depending on the data availability.” [0040]
Therefore, the added data is diagnostic data of the patient.
}
Shaul et al. teaches:
Provide additional quantitative information…
“An “image analysis system” as used herein is a system, e.g. a computer system, adapted to evaluate and process digital images, in particular images of tissue samples, in order to assist a user in evaluating or interpreting, e.g. classifying, an image and/or in order to extract biomedical information that is implicitly or explicitly contained in the image. For example, the computer system can be a standard desktop computer system or a distributed computer system, e.g. a cloud system. Generally, a computerized histopathology image analysis system takes as its input a single- or multi-channel image captured by a camera and attempts to provide additional quantitative information to aid in the diagnosis or treatment. The image can be received directly from the camera or can be read from a local or remote storage medium.” [0211]
Non-Image Data
The combined references teach images and training. They do not teach non-image data.
Colley et al., also in the business of images and training, teaches:
Use available data for diagnosis including clinical and molecular details (non-image data) of a patient…
“Analytics module 3236 can, in general, use available data to indicate a diagnosis, predict progression, predict treatment outcomes, and/or suggest an optimized treatment plan (such as a medication type, an available clinical trial) based on the specific disease state of each patient. Exemplary analytics may include machine learning algorithms or neural networks. A machine learning algorithm (MLA) or a neural network (NN) may be trained from a training data set. For a disease state, an exemplary training data set may include the clinical and molecular details of a patient such as those curated from the Electronic Health Record or genetic sequencing reports. MLAs include supervised algorithms (such as algorithms where the features/classifications in the data set are annotated) using linear regression, logistic regression, decision trees, classification and regression trees, Naïve Bayes, nearest neighbor clustering; unsupervised algorithms (such as algorithms where no features/classification in the data set are annotated) using Apriori, means clustering, principal component analysis, random forest, adaptive boosting; and semi-supervised algorithms (such as algorithms where certain features/classifications in the data set are annotated) using generative approach (such as mixture of Gaussian distributions, mixture of multinomial distributions, hidden Markov models), low density separation, graph-based approaches (such as mincut, harmonic function, manifold regularization), heuristic approaches, or support vector machines. NNs include conditional random fields, convolutional neural networks, attention based neural networks, long short term memory networks, or other neural models where the training data set includes a plurality of samples and RNA expression data for each sample. While MLA and neural networks identify distinct approaches to machine learning, the terms may be used interchangeably herein. Thus, a mention of MLA may include a corresponding NN or a mention of NN may include a corresponding MLA.” [1296]
It would have been obvious to one of ordinary skill in the art before the effective filing date to include in the method and system of the combined references the ability to use non-image data as taught by Colley et al., since the claimed invention is merely a combination of old elements, in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. Further motivation is provided by Colley et al., who teaches the importance of using available data to indicate a diagnosis. The combined references benefit as they also are directed to providing a diagnosis.
Regarding claim 13
The patient-specific training system according to claim 12, where the patient-specific non-image data comprises information of at least a previous diagnosed disease.
Non-Image Data
The combined references teach images and training. They do not teach non-image data.
Colley et al., also in the business of images and training, teaches:
Maintains collection of training data (previous) that includes diagnoses…
“With regard to both the post-processing of scanned documents discussed above, as well as the analysis and structuring of other aspects of patient EHRs or EMRs, such as next-generation sequencing reports, a system that identifies and processes information in clinical documents or other records is disclosed herein. The system may use a combination of text extraction techniques, text cleaning techniques, natural language processing techniques, machine learning algorithms, and medical concept (Entity) identification, normalization, and structuring techniques. The system also maintains and utilizes a continuous collection of training data across clinical use cases (such as diagnoses, therapies, outcomes, genetic markers, etc.) that help to increase both accuracy and reliability of predictions specific to a patient record. The system accelerates a structuring of clinical data in a patient's record. The system may execute subroutines that highlight, suggest, and pre-populate an electronic medical record (“EHR” or “EMR”). The system may provide other formats of structured clinical data, with relevant medical concepts extracted from the text and documents of record.” [1542]
It would have been obvious to one of ordinary skill in the art before the effective filing date to include in the method and system of the combined references the ability to use non-image data as taught by Colley et al., since the claimed invention is merely a combination of old elements, in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. Further motivation is provided by Colley et al., who teaches the importance of using available data to indicate a diagnosis. The combined references benefit as they also are directed to providing a diagnosis.
Regarding claim 14
A computer-implemented patient-specific training method to be used with a patient-specific training system, comprising the following steps:
acquiring image data from a patient;
Shaul et al. teaches:
Receiving (acquiring) images, where the images depict a tissue sample of a patient…
“In one aspect, the invention relates to a method for classifying tissue images. The method comprises: receiving, by an image analysis system, a plurality of digital images; each of the digital images depicts a tissue sample of a patient; splitting, by the image analysis system, each received image into a set of image tiles; for each of the tiles, computing, by the image analysis system, a feature vector comprising image features extracted selectively from the tile; providing a Multiple-Instance-Learning (MIL) program configured to use a model for classifying any input image as a member of one out of at least two different classes based on the feature vectors extracted from all tiles of the said input image; for each of the tiles, computing a certainty value (referred herein according to embodiments of the invention as “c”); the certainty value is indicative of the certainty of the model regarding the contribution of the tile's feature vector on the classification of the image from which the tile was derived; for each of the images: using, by the MIL-program, a certainty-value-based pooling function for aggregating the feature vectors extracted from the image into a global feature vector as a function of the certainty values of the tiles of the image, and computing an aggregated predictive value (referred herein according to embodiments of the invention as “ah”) from the global feature vector; or computing, by the MIL program, predictive values from respective ones of the feature vectors of the image and using, by the MIL-program, a certainty-value-based pooling function for aggregating the predictive values of the image into an aggregated predictive value (referred herein according to embodiments of the invention as “ah”) as a function of the certainty values of the tiles of the image; and classifying, by the MIL-program, each of the images as a member of one out of the at least two different classes based on the aggregated predictive value.” [0008] – [0017]
generating a first classification signal for each image of the acquired image data;
Using the MIL-program (generating) to classify tissue structures with high predictive value (first classification signal) for each image…
“A Multiple instance learning (MIL) program is a form of weakly supervised learning program configured to learn from a training set wherein training instances are arranged in sets called bags, and wherein a label is provided for the entire bag while the labels of the individual instances in the bag are not known. Hence, a MIL-program requires only weakly annotated training data. This type of data is especially common in medical imaging because the annotation of individual image regions to provide richly annotated training data is highly time consuming and hence expensive. Furthermore, the tissue structures which imply (have high predictive value for . . . ) a digital image to be member of a particular class (e.g. image depicting healthy tissue/image depicting a primary tumor/image depicting a metastase) are sometimes not known or are not perceivable by a pathologist. Hence, using a MIL-program for classifying digital tissue images may have the advantage that weakly annotated training data is sufficient for training a MIL-program capable of accurately classifying digital tissue images. Furthermore, the trained MIL-program will be able to accurately classifying digital tissue images even in case a human annotator, e.g. a pathologist, does not know the tissue structures which are highly predictive of the class membership of the tissue and hence is not able to select training images having an unbiased ratio of tissue regions with and without this tissue structure.” [0019]
separating the image data into at least a first dataset and a second dataset based on the first classification signal and according to a first reliability criterion;
Example of image splitting (separator) module…
“The image analysis system further comprises an image splitting module being executable by the at least one processor and being configured to split each of the images into a set of image tiles.” [0149]
Classifying images into at least two classes and certainty value (reliability criterion)…
“The image analysis system further comprises a Multiple-Instance-Learning (MIL) program. The MIL-program is executable by the at least one processor and is configured to use a model for classifying any input image as a member of one out of at least two different classes based on the feature vectors extracted from all tiles of the said input image. The MIL-program is further configured for: for each of the tiles, computing a certainty value, the certainty value being indicative of the certainty of the model regarding the contribution of the tile's feature vector on the classification of the image from which the tile was derived; for each of the images: using, by the MIL-program, a certainty-value-based pooling function for aggregating the feature vectors extracted from the image into a global feature vector as a function of the certainty values of the tiles of the image, and computing an aggregated predictive value (referred herein according to embodiments of the invention as “ah”) from the global feature vector; or computing, by the MIL program, predictive values from respective ones of the feature vectors of the image and using, by the MIL-program, a certainty-value-based pooling function for aggregating the predictive values of the image into an aggregated predictive value (“ah”) as a function of the certainty values of the tiles of the image; and classifying each of the images as a member of one out of the at least two different classes based on the aggregated predictive value.” [0151] – [0156]
Example of training set (first dataset) based on highest predictive power…
“According to embodiments, the providing of the MIL-program comprises training the MIL-program on a training image set, whereby during the training phase a certainty-value-based-max-pooling function is used as pooling function. This may have the advantage that the predictive model generated during the training the MIL strongly reflects the tissue pattern depicted in the tile having the feature vector with the highest predictive power in respect to the bag's label. The model is not negatively affected by tissue regions/tiles which are irrelevant for the label. However, the maximum operation will neglect all the information contained in all tiles except the highest scoring tile. Hence, the predictive power of tiles/tissue patterns which may also be of relevance may be missed.” [0046]
initializing an artificial neural network; and
“According to embodiments, the MIL-program is a neural network. The certainty-value is computed using a dropout technique at training and/or test time of the model of the neural network.” [0074]
training the artificial neural network with the entire first dataset or a part thereof.
Training images labeled by class (first dataset or a part thereof) used to train the MIL-program (neural network)…
“According to embodiments, the providing of the MIL-program comprises training the model of the MIL-program. The training comprises: providing a set of digital training images of tissue samples, each digital training image having assigned a class label being indicative of one of the at least two classes; splitting each training image into training image tiles, each training tile having assigned the same class label as the digital training image from which the training tile was derived; for each of the tiles, computing, by the image analysis system, a training feature vector comprising image features extracted selectively from the said tile; and/or repeatedly adapting the model of the MIL-program such that an error of a loss function is minimized; the error of the loss function indicates a difference of predicted class labels of the training tiles and the class labels actually assigned to the training tiles; the predicted class labels have been computed by the model based on the feature vector of the training tiles.” [0106] - [0132]
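For illustration of the quoted training procedure, a minimal Python sketch (a toy logistic model standing in for the MIL model; all shapes and values are hypothetical) in which tiles inherit the image-level class label and the model is repeatedly adapted to reduce a loss between the predicted and the assigned tile labels:

import numpy as np

rng = np.random.default_rng(2)

# Training images are split into tiles; each tile inherits its image's class label (0 or 1).
tile_features = rng.normal(size=(200, 16))                    # toy per-tile feature vectors
tile_labels = (tile_features[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(float)

w = np.zeros(16)
lr = 0.1
for step in range(500):
    # Predicted class probability per tile, followed by a binary cross-entropy loss.
    p = 1.0 / (1.0 + np.exp(-tile_features @ w))
    loss = -np.mean(tile_labels * np.log(p + 1e-9) + (1 - tile_labels) * np.log(1 - p + 1e-9))
    grad = tile_features.T @ (p - tile_labels) / len(tile_labels)
    w -= lr * grad                                             # repeatedly adapt the model to reduce the loss
print(round(float(loss), 4))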
Regarding claim 16
The method according to claim 14, wherein the first classification signal indicates, for each image, probability values associated with a first set of classes.
Shaul et al. teaches:
Aggregated predictive value associated with “image depicting a tumor” (first set of classes)…
“The aggregated predictive value is then used for classifying the input image. For example, in case wh.sub.MAX exceeds a threshold determined during training, a digital image may be classified as “image depicting a tumor”. Otherwise, the image is classified as “image depicting healthy tissue”.” [0044]
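For illustration of the quoted thresholding, a minimal Python sketch (the threshold value is hypothetical):

def classify_from_aggregated_value(ah, threshold):
    """Label an image by comparing its aggregated predictive value to a trained threshold,
    mirroring the quoted example ('tumor' above the threshold, 'healthy tissue' otherwise)."""
    return "image depicting a tumor" if ah > threshold else "image depicting healthy tissue"

print(classify_from_aggregated_value(0.73, threshold=0.6))
print(classify_from_aggregated_value(0.41, threshold=0.6))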
Regarding claim 17
The method according to claim 14, wherein the first dataset comprises all images above a threshold probability value and the second dataset comprises all images at or below the same threshold probability value.
Shaul et al. teaches:
Exceeds a threshold (above the threshold, first dataset) and otherwise (at or below the same threshold, second dataset)…
“The aggregated predictive value is then used for classifying the input image. For example, in case wh.sub.MAX exceeds a threshold determined during training, a digital image may be classified as “image depicting a tumor”. Otherwise, the image is classified as “image depicting healthy tissue”.” [0044]
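For illustration of the claimed separation, a minimal Python sketch (image names and probability values are hypothetical) that places images above a threshold probability in a first dataset and the remaining images in a second dataset:

def split_by_threshold(images_with_probs, threshold):
    """Separate images into a first dataset (probability above the threshold) and a
    second dataset (probability at or below the same threshold)."""
    first = [img for img, p in images_with_probs if p > threshold]
    second = [img for img, p in images_with_probs if p <= threshold]
    return first, second

data = [("img_a", 0.9), ("img_b", 0.5), ("img_c", 0.2)]
first_dataset, second_dataset = split_by_threshold(data, threshold=0.5)
print(first_dataset, second_dataset)  # ['img_a'] ['img_b', 'img_c']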
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over the combined references in section (9) above in further view of Pub. No. US 2022/0237449 to Chung.
Regarding claim 11
The patient-specific training system according to claim 6, wherein the selector unit further comprises a training-scenario subunit, which is configured to select the artificial neural network of the training module.
Shaul et al. teaches:
Training phase…
“According to embodiments, applying dropout during the training phase means creating many different dropout layers respectively comprising a randomly selected sub-set of nodes of a fully connected layer, whereby each dropout layer acts as a masks (comprising “zero”-nodes and “one”-nodes). The masks are created during forward propagation, are respectively applied to the layer outputs during training and cached for future use on back-propagation. The dropout mask applied on a layer is saved to allow identifying the neurons that were activated during the backward propagation step. Now with those identified neurons selected, the output of the neurons is back-propagated. Typically, the dropout layers are created and used only during the training phase and are saved in the form of deactivated, additional layers in the trained network.” [0077]
Randomly using (select) different subnetworks of a neural network…
“The key idea behind Monte-Carlo Dropout (MC Dropout) is assessing model uncertainty by using dropout. MC Dropout calculates the model uncertainty (and hence, implicitly, also model certainty) using a dropout technique, i.e. by randomly using different subnetworks of a neural network architecture to get multiple different results from the network for the same predictive task and assess the “certainty” as the “consistency” of the result. MC is referring to Monte Carlo as the dropout process is similar to sampling the neurons.” [0082]
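For illustration of Monte-Carlo Dropout as described in the quoted passage, a minimal NumPy sketch (weights, dropout rate, and sample count are hypothetical) in which repeated stochastic forward passes through randomly selected subnetworks yield a spread of predictions whose consistency stands in for the certainty value:

import numpy as np

rng = np.random.default_rng(3)
W1, W2 = rng.normal(size=(16, 32)), rng.normal(size=32)   # toy two-layer network weights

def forward_with_dropout(x, drop_rate=0.5):
    """One stochastic forward pass: a random dropout mask zeroes hidden units,
    so each pass effectively uses a different subnetwork."""
    h = np.maximum(0.0, x @ W1)
    mask = (rng.random(h.shape) > drop_rate) / (1.0 - drop_rate)
    return float(1.0 / (1.0 + np.exp(-(h * mask) @ W2)))

def mc_dropout_certainty(x, n_samples=50):
    """Repeat the stochastic pass; the spread of the predictions measures model
    uncertainty, and a low spread corresponds to a high certainty value."""
    preds = np.array([forward_with_dropout(x) for _ in range(n_samples)])
    return preds.mean(), preds.std()

mean_pred, uncertainty = mc_dropout_certainty(rng.normal(size=16))
print(round(float(mean_pred), 3), round(float(uncertainty), 3))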
The combined references teach a neural network. They do not teach selecting the artificial neural network of the training module.
Chung, also in the field of neural networks, teaches:
Selecting neural network with lower loss function…
“In an embodiment of the disclosure, the plurality pieces of label data include a plurality pieces of training data and a plurality pieces of test data, and the training module generates the first neural network model and a second neural network model according to the plurality pieces of training data, wherein the second neural network model includes a plurality of second cluster centers, and a first number of the plurality of cluster centers is different from a second number of the plurality of second cluster centers, and the computing module calculates a first loss function value of the first neural network model and a second loss function value of the second neural network model according to the plurality pieces of test data, and the computing module selects the first neural network model from the first neural network model and the second neural network model to generate the reference configuration in response to the first loss function value being less than the second loss function value.” [0015]
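For illustration of the model selection Chung describes, a minimal Python sketch (toy constant-output "models" and a mean-squared-error loss are illustrative stand-ins) that keeps the candidate with the lower loss function value on the test data:

def select_model_by_test_loss(candidates, loss_fn, test_data):
    """Pick the candidate model whose loss on the test data is lowest,
    mirroring Chung's selection of the model with the smaller loss function value."""
    losses = {name: loss_fn(model, test_data) for name, model in candidates.items()}
    best = min(losses, key=losses.get)
    return best, losses

# Toy stand-ins: each "model" is a constant predictor; loss is mean squared error.
test = [(None, 0.0), (None, 1.0), (None, 1.0)]
models = {"first_nn": lambda _x: 0.8, "second_nn": lambda _x: 0.2}
mse = lambda m, data: sum((m(x) - y) ** 2 for x, y in data) / len(data)
print(select_model_by_test_loss(models, mse, test))  # selects "first_nn" (lower loss)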
It would have been obvious to one of ordinary skill in the art before the effective filing date to include, in the method and system of the combined references, the ability to select a neural network as taught by Chung, since the claimed invention is merely a combination of old elements, in the combination each element merely would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable. Further motivation is provided by Chung, who teaches the advantages of selecting a model with a lower loss function.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
The following prior art teaches at least neural network and training:
US-12243644-B2; US-12008751-B2; US-11790523-B2; US-20230025181-A1; US-20190355114-A1; US-20240173012-A1; US-20240315664-A1; US-20220328189-A1; US-20250308019-A1; US-20220133260-A1; US-20230177682-A1; US-20180315182-A1; US-20190130566-A1; US-20240215945-A1; US-20250036941-A1; US-20240257293-A1; US-20170357844-A1
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KENNETH BARTLEY whose telephone number is (571)272-5230. The examiner can normally be reached Mon-Fri: 7:30 - 4:00 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, SHAHID MERCHANT can be reached at (571) 270-1360. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/KENNETH BARTLEY/Primary Examiner, Art Unit 3684