Prosecution Insights
Last updated: April 19, 2026
Application No. 18/316,987

MULTI-VARIATE COUNTERFACTUAL DIFFUSION PROCESS

Non-Final OA: §101, §103
Filed: May 12, 2023
Examiner: ILES, TYLER EDWARD
Art Unit: 2122
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Fair Isaac Corporation
OA Round: 1 (Non-Final)
Grant Probability: 67% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 3y 3m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 67% (2 granted / 3 resolved; +11.7% vs TC avg; above average)
Interview Lift: +50.0% (strong; based on resolved cases with interview)
Avg Prosecution: 3y 3m (typical timeline)
Total Applications: 24 (21 currently pending, across all art units)

Statute-Specific Performance

§101: 29.5% (-10.5% vs TC avg)
§103: 42.6% (+2.6% vs TC avg)
§102: 15.6% (-24.4% vs TC avg)
§112: 12.3% (-27.7% vs TC avg)
Deltas are relative to the Tech Center average estimate. Based on career data from 3 resolved cases.
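The per-statute deltas above are simple differences from a single Tech Center baseline. A minimal sketch (with the panel's numbers hard-coded) shows that every listed delta implies the same TC average estimate of 40.0%:

```python
# Reconstructing the "vs TC avg" deltas shown in the panel above.
# Examiner per-statute allow rates and their listed deltas (panel values).
examiner = {"101": 29.5, "103": 42.6, "102": 15.6, "112": 12.3}
delta = {"101": -10.5, "103": +2.6, "102": -24.4, "112": -27.7}

# Implied Tech Center average = examiner rate minus delta.
tc_avg = {s: round(examiner[s] - delta[s], 1) for s in examiner}
print(tc_avg)  # {'101': 40.0, '103': 40.0, '102': 40.0, '112': 40.0}
```

Every statute resolves to the same 40.0% baseline, consistent with the single "Tech Center average estimate" legend above.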

Office Action

§101 §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is in response to a patent application filed on May 12, 2023. Claims 1-20 are pending in the current application.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding claim 1, under Step 1 of the Subject Matter Eligibility Test for Products and Processes, the claim is directed towards a machine, which is one of the four statutory categories. Next, under the Step 2A, Prong 1 analysis, the claim recites the following limitations, which are interpreted, under the broadest reasonable interpretation, to be abstract ideas:

- generating a plurality of synthetic vectors for each input vector of a plurality of input vectors used to train a first machine learning model, wherein the plurality of synthetic vectors represent potential counterfactuals associated with the corresponding input vector (mental process);
- filtering the plurality of synthetic vectors based at least on a comparison between a first score… based on a first input vector of the plurality of input vectors and a second score… based on a first synthetic vector of the plurality of synthetic vectors corresponding to the first input vector (mental process); and
- predicting… based on the plurality of input vectors and the filtered plurality of counterfactual synthetic vectors, a classification of at least one input vector of the plurality of input vectors (mental process).

The claim is therefore examined under Step 2A, Prong 2, which considers the additional elements within the claim. The claim's additional elements are:

- a data processor;
- at least one memory storing instructions;
- a first machine learning model; and
- a second machine learning model.

The data processor, memory, and second machine learning model are limitations considered to be mere instructions to apply a judicial exception, as they instruct the use of a processor, memory, and model as tools to perform the abstract idea (see MPEP 2106.05(f)). The first machine learning model merely indicates the field of technology or field of use and "generally links" the first machine learning model to the judicial exception (see MPEP 2106.05(h)). Therefore, these additional elements do not integrate the abstract idea into a practical application, and the claim is directed to an abstract idea. Under the Step 2B analysis, the claim's additional elements do not amount to significantly more than the judicial exception, as explained above under Step 2A, Prong 2. Therefore, the claim is ineligible.

Regarding claim 10, under Step 1 of the Subject Matter Eligibility Test for Products and Processes, the claim is directed towards a process, which is one of the four statutory categories.
Next, under the Step 2A, Prong 1 analysis, the claim recites the following limitations, which are interpreted, under the broadest reasonable interpretation, to be abstract ideas:

- generating a plurality of synthetic vectors for each input vector of a plurality of input vectors used to train a first machine learning model, wherein the plurality of synthetic vectors represent potential counterfactuals associated with the corresponding input vector (mental process);
- filtering the plurality of synthetic vectors based at least on a comparison between a first score… based on a first input vector of the plurality of input vectors and a second score… based on a first synthetic vector of the plurality of synthetic vectors corresponding to the first input vector (mental process); and
- predicting… based on the plurality of input vectors and the filtered plurality of counterfactual synthetic vectors, a classification of at least one input vector of the plurality of input vectors (mental process).

The claim is therefore examined under Step 2A, Prong 2, which considers the additional elements within the claim. The claim's additional elements are:

- a first machine learning model; and
- a second machine learning model.

The "second machine learning model" is a limitation considered to be mere instructions to apply a judicial exception, as it instructs the use of the model as a tool to perform the abstract idea (see MPEP 2106.05(f)). The "first machine learning model" merely indicates the field of technology or field of use and "generally links" the first machine learning model to the judicial exception (see MPEP 2106.05(h)). Therefore, these additional elements do not integrate the abstract idea into a practical application, and the claim is directed to an abstract idea. Under the Step 2B analysis, the claim's additional elements do not amount to significantly more than the judicial exception, as explained above under Step 2A, Prong 2. Therefore, the claim is ineligible.
Regarding claim 19, under Step 1 of the Subject Matter Eligibility Test for Products and Processes, the claim is directed towards a manufacture, which is one of the four statutory categories. Next, under the Step 2A, Prong 1 analysis, the claim recites the following limitations, which are interpreted, under the broadest reasonable interpretation, to be abstract ideas:

- generating a plurality of synthetic vectors for each input vector of a plurality of input vectors used to train a first machine learning model, wherein the plurality of synthetic vectors represent potential counterfactuals associated with the corresponding input vector (mental process);
- filtering the plurality of synthetic vectors based at least on a comparison between a first score… based on a first input vector of the plurality of input vectors and a second score… based on a first synthetic vector of the plurality of synthetic vectors corresponding to the first input vector (mental process); and
- predicting… based on the plurality of input vectors and the filtered plurality of counterfactual synthetic vectors, a classification of at least one input vector of the plurality of input vectors (mental process).

The claim is therefore examined under Step 2A, Prong 2, which considers the additional elements within the claim. The claim's additional elements are:

- a non-transitory computer-readable medium;
- a data processor;
- a first machine learning model; and
- a second machine learning model.

The data processor, non-transitory computer-readable medium, and second machine learning model are limitations considered to be mere instructions to apply a judicial exception, as they instruct the use of a processor, medium, and second ML model as tools to perform the abstract idea (see MPEP 2106.05(f)). The first machine learning model merely indicates the field of technology or field of use and "generally links" the first machine learning model to the judicial exception (see MPEP 2106.05(h)). Therefore, these additional elements do not integrate the abstract idea into a practical application, and the claim is directed to an abstract idea. Under the Step 2B analysis, the claim's additional elements do not amount to significantly more than the judicial exception, as explained above under Step 2A, Prong 2. Therefore, the claim is ineligible.

Regarding claims 2, 11, and 20, the claims recite "the plurality of synthetic vectors are generated based on a Gaussian distribution associated with each input vector." The limitation, as drafted, is interpreted, under the broadest reasonable interpretation, to be a "mathematical concept", which is a grouping of abstract ideas. Therefore, the claims are rejected on the same basis as claims 1, 10, and 19.

Regarding claims 3 and 12, the claims recite "the filtering comprises: generating, by the first machine learning model and based at least on the first input vector, the first score; generating, by the first machine learning model and based at least on the first synthetic vector, the second score; determining a difference between the first score and the second score; and determining to include the first synthetic vector in the filtered plurality of synthetic vectors based at least on the difference between the first score and the second score meeting a threshold difference." The "generating… based at least on the first input vector, the first score", "generating… based at least on the first synthetic vector, the second score", "determining a difference between the first score and the second score", and "determining to include the first synthetic vector in the filtered plurality of synthetic vectors based at least on the difference between the first score and the second score meeting a threshold difference" are limitations considered to be, under the broadest reasonable interpretation, "mental processes", which is a grouping of abstract ideas.
The limitation "the first machine learning model" merely indicates the field of use and particular technology, and "generally links" a machine learning model to the abstract idea (see MPEP 2106.05(h)). Therefore, the claims are rejected on the same basis as claims 1 and 10.

Regarding claims 4 and 13, the claims recite "the filtering further comprises: identifying a synthetic vector of the plurality of synthetic vectors having a highest absolute residual value among the plurality of synthetic vectors for each input vector, wherein the synthetic vector having the highest absolute residual value indicates a boundary of a data manifold associated with each input vector." The limitation, as drafted, is interpreted, under the broadest reasonable interpretation, to be a "mental process", which is a grouping of abstract ideas. Therefore, the claims are rejected on the same basis as claims 3 and 12.

Regarding claims 5 and 14, the claims recite "determining to include the first synthetic vector in the filtered plurality of counterfactual synthetic vectors is further based on an angle between the first synthetic vector and the synthetic vector having the highest absolute residual value meeting a threshold angle." The limitation, as drafted, is interpreted, under the broadest reasonable interpretation, to be a "mathematical concept", which is a grouping of abstract ideas. Therefore, the claims are rejected on the same basis as claims 4 and 13.

Regarding claims 6 and 15, the claims recite "the angle is a cosine distance, and wherein the threshold angle is a threshold cosine distance." The limitation, as drafted, is interpreted, under the broadest reasonable interpretation, to be a "mathematical concept", which is a grouping of abstract ideas. Therefore, the claims are rejected on the same basis as claims 5 and 14.
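Read together, dependent claims 2 through 7 (and their counterparts 11 through 16) spell out a concrete filtering pipeline: Gaussian synthetics around each input, a score-difference threshold, a highest-absolute-residual "boundary" vector, a cosine-distance condition against that boundary, and a cap on how many vectors are kept. A minimal sketch of that pipeline, with a trivial linear scorer standing in for the first machine learning model and all thresholds chosen purely for illustration (none of these values come from the application):

```python
import math
import random

random.seed(0)

def score(w, x):
    # Stand-in for the first ML model's score: a simple linear score.
    return sum(wi * xi for wi, xi in zip(w, x))

def gaussian_synthetics(x, n, sigma=0.5):
    # Claims 2/11/20 pattern: synthetic vectors drawn from a Gaussian around x.
    return [[xi + random.gauss(0.0, sigma) for xi in x] for _ in range(n)]

def cosine(a, b):
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    return sum(ai * bi for ai, bi in zip(a, b)) / (na * nb)

def filter_counterfactuals(x, w, n_synth=50, score_thresh=0.5,
                           cos_thresh=0.9, max_keep=10):
    synth = gaussian_synthetics(x, n_synth)
    base = score(w, x)
    residuals = [score(w, s) - base for s in synth]
    # Claims 4/13 pattern: highest-|residual| synthetic marks the manifold boundary.
    boundary = synth[max(range(n_synth), key=lambda i: abs(residuals[i]))]
    kept = []
    for s, r in zip(synth, residuals):
        if abs(r) < score_thresh:            # claims 3/12: score-difference threshold
            continue
        if cosine(s, boundary) < cos_thresh:  # claims 5/14, 6/15: cosine condition
            continue
        kept.append(s)
        if len(kept) >= max_keep:            # claims 7/16: threshold quantity cap
            break
    return kept

kept = filter_counterfactuals([1.0, -0.5, 2.0, 0.3], [1.0, 1.0, 1.0, 1.0])
```

Every kept vector, by construction, differs from the input's score by at least the threshold and points in roughly the same direction as the boundary synthetic; this is only a sketch of the claim language, not the applicant's implementation.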
Regarding claims 7 and 16, the claims recite "the filtering comprises: iteratively determining to include a synthetic vector of the plurality of synthetic vectors for each input vector in the filtered plurality of counterfactual synthetic vectors until a threshold quantity of synthetic vectors is included in the filtered plurality of counterfactual synthetic vectors." The limitation, as drafted, is interpreted, under the broadest reasonable interpretation, to be a "mental process", which is a grouping of abstract ideas. Therefore, the claims are rejected on the same basis as claims 1 and 10.

Regarding claims 8 and 17, the claims recite "the operations further comprise training the first machine learning model based on the plurality of input vectors; and training the second machine learning model based on the plurality of input vectors and the filtered plurality of counterfactual synthetic vectors." The limitations, as drafted, merely recite instructions to apply a judicial exception, as they instruct training the machine learning models and applying them to the abstract idea (see MPEP 2106.05(f)). Therefore, the claims are rejected on the same basis as claims 1 and 10.

Regarding claims 9 and 18, the claims recite "the first machine learning model is a first neural network, and wherein the second machine learning model is a second neural network." The limitation, as drafted, merely indicates the particular technology or field of use, and "generally links" a particular type of machine learning model (neural networks) to the judicial exception (see MPEP 2106.05(h)). Therefore, the claims are rejected on the same basis as claims 1 and 10.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-10, and 12-19 are rejected under 35 U.S.C. 103 as being unpatentable over Matsui et al. (herein referred to as Matsui) ("Counterfactual Explanation of Brain Activity Classifiers using Image-to-Image Transfer by Generative Adversarial Network") in view of Chengliang Dai et al. (herein referred to as Dai) ("Suggestive Annotation of Brain Tumour Images with Gradient-guided Sampling").

Regarding claim 1, Matsui teaches a system comprising at least one data processor and at least one memory storing instructions (although the processor and memory are not explicitly stated in the disclosure, one would implicitly need those components to run the method of Matsui)
which, when executed by the at least one processor, result in operations comprising: generating a plurality of synthetic vectors for each input vector of a plurality of input vectors used to train a first machine learning model, wherein the plurality of synthetic vectors represent potential counterfactuals associated with the corresponding input vector ("Generator takes a combination of an image of brain activation and a one-hot label indicating the target class as an input. Generator outputs a counterfactual brain activation that is a minimal transform of the input brain activation toward the target class. Discriminator takes an activation map output by Generator and outputs a one-hot label. Discriminator was co-trained with Generator, as in STAR-GAN.", pg. 8, Figure 3) (Images are processed by ML models as vectors or embeddings. The output of the generator is a synthetic image, which, under the broadest reasonable interpretation, is a synthetic vector. Combined with the training data in the disclosure, the limitation is fully taught.)

and predicting, using a second machine learning model trained based on the plurality of input vectors and the filtered plurality of counterfactual synthetic vectors, a classification of at least one input vector of the plurality of input vectors ("The architecture includes two trainable modules, Discriminator (D) and CAG. Note that DNN-classifier is not trainable. (a) D learns to discriminate between real and fake activations and to classify the real activation to its corresponding domain… Discriminator takes an activation map output by Generator and outputs a one-hot label.", pg. 25, Figure S1; pg. 8, Figure 3) (The Discriminator is trained on synthetic and real data to learn how to differentiate between real and fake data. The label output by the Discriminator corresponds to a prediction of a classification of at least one input vector (image).)
However, Matsui does not explicitly teach filtering the plurality of synthetic vectors based at least on a comparison between a first score generated by the first machine learning model based on a first input vector of the plurality of input vectors and a second score generated by the first machine learning model based on a first synthetic vector of the plurality of synthetic vectors corresponding to the first input vector.

Dai teaches filtering the plurality of synthetic vectors based at least on a comparison between a first score generated by the first machine learning model based on a first input vector of the plurality of input vectors and a second score generated by the first machine learning model based on a first synthetic vector of the plurality of synthetic vectors corresponding to the first input vector. ("images can be synthesized from zi' via the VAE decoder fθ(·) and suggested to the expert for annotation. However, the synthetic image may not be of high quality, which would prevent the expert from producing reliable annotation. To mitigate this issue, we propose to sample in the real image space, searching for existing real images that are most similar to the synthesized image… Fig. 2 illustrates the sampling process in the latent space. The red dots are the latent representation of S that are randomly selected initially. After training the base model with these samples, for any xi from S, the loss gradient is backpropagated and integrated to xi (Eq. 3), then projected to zi' in the latent space (Eq. 4 and red cross in Fig. 2). One of two criteria we use to find an existing real image is to find a zj ∈ Z (yellow dot) that has the shortest Euclidean distance to zi'.", pgs. 4-5) (The samples are projected into real space via an autoencoder, which corresponds to the first machine learning model; both GANs and VAEs are generative models. The selection of samples, which corresponds to a filtering step, is done based on the shortest distance to zi' (a sample, which corresponds to a synthetic vector) from zj ∈ Z (which corresponds to an input vector, as it represents a real image). The distance is calculated from "scores" corresponding to the placements of zi' and zj in the latent space. With the synthetic vectors of Matsui configured to work with the autoencoder of Dai, to find a hard sample and filter samples based on the hard sample, the limitation is fully taught.)

Therefore, it would have been considered obvious to one of ordinary skill in the art, prior to the filing date of the current application, to combine the models, synthetic data, and counterfactuals of Matsui with the filtering process using a VAE, as described by Dai. One of ordinary skill in the art would have been motivated to combine the two teachings, prior to the application's filing date, as this leads to producing images for professional annotation and better training of a network, as disclosed in Dai. ("we select m informative samples and suggest them to the expert for manual annotation. A new training dataset with m more samples 0 = {(x1, y1),(x2, y2), ...,(x2m, y2m)} can be constructed, which consists of initial samples and new samples suggested by the proposed method. The new training set can be used for further training the base network.", pg. 4, bottom paragraph; pg. 5, above "3 Experiments")

Regarding claim 10, Matsui teaches a method comprising: generating a plurality of synthetic vectors for each input vector of a plurality of input vectors used to train a first machine learning model, wherein the plurality of synthetic vectors represent potential counterfactuals associated with the corresponding input vector ("Generator takes a combination of an image of brain activation and a one-hot label indicating the target class as an input.
Generator outputs a counterfactual brain activation that is a minimal transform of the input brain activation toward the target class. Discriminator takes an activation map output by Generator and outputs a one-hot label. Discriminator was co-trained with Generator, as in STAR-GAN.", pg. 8, Figure 3) (Images are processed by ML models as vectors or embeddings. The output of the generator is a synthetic image, which, under the broadest reasonable interpretation, is a synthetic vector. Combined with the training data in the disclosure, the limitation is fully taught.)

and predicting, using a second machine learning model trained based on the plurality of input vectors and the filtered plurality of counterfactual synthetic vectors, a classification of at least one input vector of the plurality of input vectors ("Discriminator takes an activation map output by Generator and outputs a one-hot label.", pg. 8, Figure 3) (The label output by the Discriminator corresponds to a prediction of a classification of at least one input vector (image).)

However, Matsui does not explicitly teach filtering the plurality of synthetic vectors based at least on a comparison between a first score generated by the first machine learning model based on a first input vector of the plurality of input vectors and a second score generated by the first machine learning model based on a first synthetic vector of the plurality of synthetic vectors corresponding to the first input vector.

Dai teaches filtering the plurality of synthetic vectors based at least on a comparison between a first score generated by the first machine learning model based on a first input vector of the plurality of input vectors and a second score generated by the first machine learning model based on a first synthetic vector of the plurality of synthetic vectors corresponding to the first input vector. ("images can be synthesized from zi' via the VAE decoder fθ(·) and suggested to the expert for annotation. However, the synthetic image may not be of high quality, which would prevent the expert from producing reliable annotation. To mitigate this issue, we propose to sample in the real image space, searching for existing real images that are most similar to the synthesized image… Fig. 2 illustrates the sampling process in the latent space. The red dots are the latent representation of S that are randomly selected initially. After training the base model with these samples, for any xi from S, the loss gradient is backpropagated and integrated to xi (Eq. 3), then projected to zi' in the latent space (Eq. 4 and red cross in Fig. 2). One of two criteria we use to find an existing real image is to find a zj ∈ Z (yellow dot) that has the shortest Euclidean distance to zi'.", pgs. 4-5) (The samples are projected into real space via an autoencoder, which corresponds to the first machine learning model; both GANs and VAEs are generative models. The selection of samples, which corresponds to a filtering step, is done based on the shortest distance to zi' (a sample, which corresponds to a synthetic vector) from zj ∈ Z (which corresponds to an input vector, as it represents a real image). The distance is calculated from "scores" corresponding to the placements of zi' and zj in the latent space. With the synthetic vectors of Matsui configured to work with the autoencoder of Dai, to find a hard sample and filter samples based on the hard sample, the limitation is fully taught.)

Therefore, it would have been considered obvious to one of ordinary skill in the art, prior to the filing date of the current application, to combine the models, synthetic data, and counterfactuals of Matsui with the filtering process using a VAE, as described by Dai. One of ordinary skill in the art would have been motivated to combine the two teachings, prior to the application's filing date, as this leads to producing images for professional annotation and better training of a network, as disclosed in Dai. ("we select m informative samples and suggest them to the expert for manual annotation. A new training dataset with m more samples 0 = {(x1, y1),(x2, y2), ...,(x2m, y2m)} can be constructed, which consists of initial samples and new samples suggested by the proposed method. The new training set can be used for further training the base network.", pg. 4, bottom paragraph; pg. 5, above "3 Experiments")

Regarding claim 19, Matsui teaches a non-transitory computer-readable medium storing instructions (while Matsui does not explicitly disclose a non-transitory computer-readable medium, one would implicitly need one to distribute the method of Matsui), which, when executed by at least one data processor, result in operations comprising: generating a plurality of synthetic vectors for each input vector of a plurality of input vectors used to train a first machine learning model, wherein the plurality of synthetic vectors represent potential counterfactuals associated with the corresponding input vector ("Generator takes a combination of an image of brain activation and a one-hot label indicating the target class as an input. Generator outputs a counterfactual brain activation that is a minimal transform of the input brain activation toward the target class. Discriminator takes an activation map output by Generator and outputs a one-hot label. Discriminator was co-trained with Generator, as in STAR-GAN.", pg. 8, Figure 3) (Images are processed by ML models as vectors or embeddings. The output of the generator is a synthetic image, which, under the broadest reasonable interpretation, is a synthetic vector. Combined with the training data in the disclosure, the limitation is fully taught.)
and predicting, using a second machine learning model trained based on the plurality of input vectors and the filtered plurality of counterfactual synthetic vectors, a classification of at least one input vector of the plurality of input vectors ("Discriminator takes an activation map output by Generator and outputs a one-hot label.", pg. 8, Figure 3) (The label output by the Discriminator corresponds to a prediction of a classification of at least one input vector (image).)

However, Matsui does not explicitly teach filtering the plurality of synthetic vectors based at least on a comparison between a first score generated by the first machine learning model based on a first input vector of the plurality of input vectors and a second score generated by the first machine learning model based on a first synthetic vector of the plurality of synthetic vectors corresponding to the first input vector.

Dai teaches filtering the plurality of synthetic vectors based at least on a comparison between a first score generated by the first machine learning model based on a first input vector of the plurality of input vectors and a second score generated by the first machine learning model based on a first synthetic vector of the plurality of synthetic vectors corresponding to the first input vector. ("images can be synthesized from zi' via the VAE decoder fθ(·) and suggested to the expert for annotation. However, the synthetic image may not be of high quality, which would prevent the expert from producing reliable annotation. To mitigate this issue, we propose to sample in the real image space, searching for existing real images that are most similar to the synthesized image… Fig. 2 illustrates the sampling process in the latent space. The red dots are the latent representation of S that are randomly selected initially. After training the base model with these samples, for any xi from S, the loss gradient is backpropagated and integrated to xi (Eq. 3), then projected to zi' in the latent space (Eq. 4 and red cross in Fig. 2). One of two criteria we use to find an existing real image is to find a zj ∈ Z (yellow dot) that has the shortest Euclidean distance to zi'.", pgs. 4-5) (The samples are projected into real space via an autoencoder, which corresponds to the first machine learning model; both GANs and VAEs are generative models. The selection of samples, which corresponds to a filtering step, is done based on the shortest distance to zi' (a sample, which corresponds to a synthetic vector) from zj ∈ Z (which corresponds to an input vector, as it represents a real image). The distance is calculated from "scores" corresponding to the placements of zi' and zj in the latent space. With the synthetic vectors of Matsui configured to work with the autoencoder of Dai, to find a hard sample and filter samples based on the hard sample, the limitation is fully taught.)

Therefore, it would have been considered obvious to one of ordinary skill in the art, prior to the filing date of the current application, to combine the models, synthetic data, and counterfactuals of Matsui with the filtering process using a VAE, as described by Dai. One of ordinary skill in the art would have been motivated to combine the two teachings, prior to the application's filing date, as this leads to producing images for professional annotation and better training of a network, as disclosed in Dai. ("we select m informative samples and suggest them to the expert for manual annotation. A new training dataset with m more samples 0 = {(x1, y1),(x2, y2), ...,(x2m, y2m)} can be constructed, which consists of initial samples and new samples suggested by the proposed method. The new training set can be used for further training the base network.", pg. 4, bottom paragraph; pg. 5, above "3 Experiments")

Regarding claims 3 and 12, Matsui, as modified by Dai, teaches the system and method of claims 1 and 10, respectively, wherein the filtering comprises: generating, by the first machine learning model and based at least on the first input vector, the first score (the first score corresponds to zj in Dai); generating, by the first machine learning model and based at least on the first synthetic vector, the second score (the second score corresponds to zi' in Dai); determining a difference between the first score and the second score (the distance between zi' and zj corresponds to a difference between the first and second scores); and determining to include the first synthetic vector in the filtered plurality of synthetic vectors based at least on the difference between the first score and the second score meeting a threshold difference. ("One of two criteria we use to find an existing real image is to find a zj ∈ Z (yellow dot) that has the shortest Euclidean distance to zi'. However, we have found that using the Euclidean distance alone sometimes may fail to find an existing image that is similar to the synthesized image. To mitigate this issue, we introduce an angular condition to limit the search angle (black angle in Fig. 2)… By using the gradient-guided sampling method, one or more samples can be found given each zi'. To simplify the process, we only select one sample zj ∈ Z for each zi' in our work, which corresponds to image xj ∈ X.", pg. 5, second and third paragraphs (Dai)) (The synthetic vector is included if an existing image is found within the shortest Euclidean distance and given the angular condition, and corresponds to xj ∈ X. The determination is based, in part, on the difference between the first score and the second score meeting a threshold difference, with the threshold distance being the next shortest.)
Regarding claims 4 and 13, Matsui, as modified by Dai, teaches the system and method of claims 3 and 12, respectively, wherein the filtering further comprises: identifying a synthetic vector of the plurality of synthetic vectors having a highest absolute residual value among the plurality of synthetic vectors for each input vector, wherein the synthetic vector having the highest absolute residual value indicates a boundary of a data manifold associated with each input vector. (“A VAE is trained on X with the loss function… MSE denotes the mean square error function and DKL denotes the KL-divergence, which regularizes the optimization problem by minimizing the distance between the latent variable distribution and a Gaussian distribution [12]. Once trained, the VAE can be used to obtain the latent representations Z = {z1, z2, ..., zn} given X. It will be used for sampling purpose in the later stage.”, pgs. 3-4; see also Figure 2 on pg. 5 of Dai.) (The dotted circle represents a manifold whose boundary is determined via a trained VAE (variational autoencoder), which regularizes the optimization problem by minimizing the distance between the latent variable distribution and a Gaussian distribution. The minimum distance between the latent variable distribution and a Gaussian distribution corresponds to a residual with the highest absolute value, teaching the limitation.)

Regarding claims 5 and 14, Matsui, as modified by Dai, teaches the system and method of claims 4 and 13, respectively, wherein determining to include the first synthetic vector in the filtered plurality of counterfactual synthetic vectors is further based on an angle between the first synthetic vector and the synthetic vector having the highest absolute residual value meeting a threshold angle. (“The angular condition is similar to the cosine distance widely used in machine learning, which in our case was used to constrain the search in the VAE manifold with respect to similar cases”, pg.
5, second paragraph (Dai)) (The angular condition corresponds to an angle, and the hard sample within the angle corresponds to a synthetic vector having the highest absolute residual value meeting a threshold angle.)

Regarding claims 6 and 15, Matsui, as modified by Dai, teaches the system and method of claims 5 and 14, respectively, wherein the angle is a cosine distance, and wherein the threshold angle is a threshold cosine distance. (“The angular condition is similar to the cosine distance widely used in machine learning, which in our case was used to constrain the search in the VAE manifold with respect to similar cases”, pg. 5, second paragraph (Dai))

Regarding claims 7 and 16, Matsui, as modified by Dai, teaches the system and method of claims 1 and 10, respectively, wherein the filtering comprises: iteratively determining to include a synthetic vector of the plurality of synthetic vectors for each input vector in the filtered plurality of counterfactual synthetic vectors until a threshold quantity of synthetic vectors is included in the filtered plurality of counterfactual synthetic vectors. (“By using the gradient-guided sampling method, one or more samples can be found given each z′i. To simplify the process, we only select one sample zj ∈ Z for each z′i in our work, which corresponds to image xj ∈ X. In this way, we select m informative samples and suggest them to the expert for manual annotation.”, pg. 5, third paragraph (Dai)) (One sample is chosen for each z′i, up to m samples, indicating a threshold quantity.)

Regarding claims 8 and 17, Matsui, as modified by Dai, teaches the system and method of claims 1 and 10, respectively, wherein the operations further comprise training the first machine learning model based on the plurality of input vectors; and training the second machine learning model based on the plurality of input vectors and the filtered plurality of counterfactual synthetic vectors.
(“Discriminator was co-trained with Generator, as in STAR-GAN… By including the classification loss by the DNN classifier, CAG was trained to simultaneously fool both the discriminator and the DNN classifier (Fig. S1; see Methods for details). Throughout the training, the Generator Loss, which is a good indicator of the quality of the generated image (Arjovsky et al., 2017), consistently decreased toward zero and plateaued around 10,000 epochs of training (data for five replicates are shown in Fig. 3b; see also Fig. S2 for time courses of all the loss terms).”, pg. 12, under “CAG generated counterfactual activations were realistic and fooled the classifiers”; see also Figure 3b on pg. 8 (Matsui))

Regarding claims 9 and 18, Matsui, as modified by Dai, teaches the system and method of claims 1 and 10, respectively, wherein the first machine learning model is a first neural network, and wherein the second machine learning model is a second neural network. (“Two DNNs, generator (CAG) and discriminator, were simultaneously trained (Fig. 3a; Fig. S1).”, pg. 12, under “CAG generated counterfactual activations were realistic and fooled the classifiers” (Matsui))

Claims 2, 11, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Matsui et al. (herein referred to as Matsui) (“Counterfactual Explanation of Brain Activity Classifiers using Image-to-Image Transfer by Generative Adversarial Network”) in view of Chengliang Dai et al. (herein referred to as Dai) (“Suggestive Annotation of Brain Tumour Images with Gradient-guided Sampling”), and further in view of Hansoo Lee et al.
(herein referred to as Lee) (“Gaussian-Based SMOTE Algorithm for Solving Skewed Class Distributions”)

Regarding claims 2, 11, and 20, Matsui, as modified by Dai, teaches the system, method, and non-transitory computer-readable medium of claims 1, 10, and 19, respectively, but does not explicitly teach that the plurality of synthetic vectors are generated based on a Gaussian distribution associated with each input vector. Lee teaches that the plurality of synthetic vectors are generated based on a Gaussian distribution associated with each input vector. (“By including the Gaussian probability distribution in the process, it is possible to expand the place, where the synthetic sample is generated, from the line between minorities. Also, the Gaussian distribution allows the synthetic data located near the line.”, pg. 3, left column, bottom paragraph)

Therefore, it would have been obvious to one of ordinary skill in the art, prior to the filing date of the current application, to combine the teachings of Matsui, as modified by Dai, with the Gaussian distribution and synthetic data of Lee. One of ordinary skill in the art would have been motivated to combine the teachings, prior to the application’s filing date, because the distribution solves the class-imbalance problem, as disclosed in Lee. (“Sufficient amount of learning data is an essential condition to implement a classifier with excellent performance. However, the obtained data usually follow a significantly biased distribution of classes. It is called a class imbalance problem, which is one of the frequently occurred issues in the real-world applications. This problem causes a considerable performance drop because most of the machine learning methods assume given data follow a balanced distribution of classes. The implemented classifier will derive false classification results if the problem is not solved.
Therefore, this paper proposes a novel method, named as Gaussian-based SMOTE, to solve the problem by combining Gaussian distribution in a synthetic data generation process. It is confirmed that the proposed method could solve the class imbalance problem by conducting experiments with actual cases.”, pg. 1, Abstract)

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Tyler E. Iles, whose telephone number is (571) 272-5442. The examiner can normally be reached 9:00 am - 5:00 pm.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/T.E.I./
Patent Examiner, Art Unit 2122

/KAKALI CHAKI/
Supervisory Patent Examiner, Art Unit 2122
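Note on the Lee reference cited in the §103 rejection of claims 2, 11, and 20: a Gaussian-based SMOTE sampler of the kind the examiner describes (synthetic minority samples generated near, not only on, the line between minority pairs, via a Gaussian offset) might be sketched as below. This is an illustrative sketch under stated assumptions, not Lee's implementation; the function name, `sigma`, and the pair-selection details are assumptions.

```python
import numpy as np

def gaussian_smote(minority, n_new, sigma=0.1, rng=None):
    """Illustrative sketch (not Lee's code): classic SMOTE interpolation
    between random pairs of minority-class samples, plus a Gaussian
    offset so synthetic points can fall near the interpolation line
    rather than exactly on it."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = minority.shape
    out = np.empty((n_new, d))
    for k in range(n_new):
        i, j = rng.choice(n, size=2, replace=False)  # random minority pair
        t = rng.random()                             # interpolation factor
        base = minority[i] + t * (minority[j] - minority[i])
        out[k] = base + rng.normal(0.0, sigma, size=d)  # Gaussian offset
    return out
```

The Gaussian offset is what distinguishes this variant from plain SMOTE, matching the quoted passage about expanding "the place, where the synthetic sample is generated, from the line between minorities."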
Prosecution Timeline

- May 12, 2023: Application Filed
- Mar 09, 2026: Non-Final Rejection under §101 and §103 (current)


Prosecution Projections

- Expected OA Rounds: 1-2
- Grant Probability: 67% (99% with interview, a +50.0% lift)
- Median Time to Grant: 3y 3m
- PTA Risk: Low

Based on 3 resolved cases by this examiner. Grant probability derived from career allow rate.
