Prosecution Insights
Last updated: May 29, 2026
Application No. 17/957,891

DOMAIN-BASED LEARNING FOR AUTOENCODER MODELS

Final Rejection §103
Filed
Sep 30, 2022
Examiner
BOSTWICK, SIDNEY VINCENT
Art Unit
2124
Tech Center
2100 — Computer Architecture & Software
Assignee
SAP SE
OA Round
5 (Final)
Grant Probability
51% (Moderate)
OA Rounds
6-7
Est. Remaining
9m
With Interview
89%

Examiner Intelligence

Career Allowance Rate
51% (71 granted / 138 resolved; -3.6% vs TC avg)
Interview Lift
+38.0% (strong; resolved cases with interview)
Avg Prosecution
4y 5m (45 currently pending)
Total Applications
207 (across all art units)
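
The headline figures above can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes the "+38.0%" interview lift is expressed in percentage points and is added to the rounded allowance rate, which is consistent with the 51% and 89% shown on the page but is not stated by the source.

```python
# Figures copied from the examiner statistics above.
granted = 71
resolved = 138
interview_lift_points = 38.0  # assumption: "+38.0%" means percentage points

# Career allowance rate in percent.
allowance_rate = 100 * granted / resolved

# Assumption: the "With Interview" figure is the rounded rate plus the lift.
with_interview = round(allowance_rate) + interview_lift_points

print(f"Allowance rate: {allowance_rate:.1f}%")   # Allowance rate: 51.4%
print(f"With interview: {with_interview:.0f}%")   # With interview: 89%
```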

Statute-Specific Performance

§101
2.5% (-37.5% vs TC avg)
§103
93.4% (+53.4% vs TC avg)
§102
1.4% (-38.6% vs TC avg)
§112
2.6% (-37.4% vs TC avg)
Tech Center average is an estimate. Based on career data from 138 resolved cases.
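
A quick consistency check on the statute figures: if each "vs TC avg" value is read as a simple difference in percentage points (an assumption, since the page does not define it), the implied Tech Center baseline can be recovered by subtraction.

```python
# (statute, reported value in percent, reported difference vs TC average)
rows = [
    ("§101", 2.5, -37.5),
    ("§103", 93.4, 53.4),
    ("§102", 1.4, -38.6),
    ("§112", 2.6, -37.4),
]

# Assumption: difference = value - baseline, so baseline = value - difference.
implied_baseline = {
    statute: round(value - delta, 1) for statute, value, delta in rows
}

for statute, baseline in implied_baseline.items():
    print(f"{statute}: implied TC average {baseline}%")
```

Under that reading every statute implies the same 40.0% baseline, which suggests the comparison uses one estimated Tech Center figure rather than a per-statute average.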

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Remarks

This Office Action is responsive to Applicants' Amendment filed on March 13, 2026, in which claims 1, 10, and 19 are currently amended. Claims 1-20 are currently pending.

Response to Arguments

The rejections to claims 1-20 under 35 U.S.C. § 112(a) are hereby withdrawn, as necessitated by applicant's amendments and remarks made to the rejections. Applicant’s arguments with respect to rejection of claims 1-20 under 35 U.S.C. 103 based on amendment have been considered and are persuasive. The argument is moot in view of a new ground of rejection set forth below.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 2, 8, 9, 10, 11, 17, 18, 19, and 20 are rejected under U.S.C.
§103 as being unpatentable over the combination of Bousmalis (US11361531B2) and Zhu (“Domain Adaptation Using Convolutional Autoencoder and Gradient Boosting for Adverse Events Prediction in the Intensive Care Unit”, 2022).

[Figure: FIG. 2 of US11361531B2]

[Figure: Markup of FIG. 2 of US11361531B2 with shared layer highlighted]

Regarding claim 1, Bousmalis teaches A system comprising: at least one hardware processor; and a computer-readable medium storing instructions that, when executed by the at least one hardware processor, cause the at least one hardware processor to perform operations comprising:([Abstract] "Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using an image processing neural network system. One of the system includes a shared encoder neural network implemented by one or more computers") accessing training data, the training data including data from a first domain and data from a second domain,([Abstract] "process the input image to generate a shared feature representation of features of the input image that are shared between images from the target domain and images from a source domain different from the target domain;" [Col. 3 l. 4-12] "The shared encoder neural network 110 is a neural network, e.g., a convolutional neural network, that has been configured through training to receive the target domain image 102 and to process the target domain image 102 to generate a shared feature representation 112 for the target domain image 102. The shared feature representation 112 is a vector of numeric values and is a representation of the features of the target domain image 102 that are shared between images from the target domain and images from a source domain.") passing the training data to an input layer of a domain autoencoder neural network;([Col. 1 l. 
31-43] "Neural networks are machine learning models that employ one or more layers of nonlinear units to predict an output for a received input. Some neural networks include one or more hidden layers in addition to an output layer. The output of each hidden layer is used as input to the next layer in the network, i.e., the next hidden layer or the output layer. Each layer of the network generates an output from a received input in accordance with current values of a respective set of parameters." [Col. 4 l. 11-24] " the neural network training system 200 trains the shared encoder neural network 110 to (i) generate shared feature representations for input images from the target domain that are similar to shared feature representations for input images from the source domain while (ii) generating shared feature representations for input images from the target domain that are different from private feature representations for the same input images from the target domain generated by the private target encoder neural network 210 and (iii) generating shared feature representations for input images from the source domain that are different from private feature representations for the same input images from the source domain generated by the private source encoder neural network 220." See also FIG. 2) receiving, at a shared layer of the domain autoencoder neural network, output from the input layer, the shared layer passing its output to both a classifier portion of the domain autoencoder neural network and an autoencoder portion of the domain autoencoder neural network, the autoencoder portion at least one convolutional [upscaling] layer and at least one convolutional [downscaling] layer; the shared layer being separate from the autoencoder portion([Col. 4 l. 
24-30] "To train the shared encoder neural network 110 and the classifier neural network 120, the neural network training system 200 also includes a private target encoder neural network 210, a private source encoder neural network 220, and a shared decoder neural network 230" [Col. 7 l. 37-45] “the private target encoder neural network 210, the private source encoder neural network 220, and the shared encoder neural network 110 have the same neural network architecture” [Col. 8 l. 10-20] "H_c^s is a matrix having rows that are the shared feature representations of the training source domain images" See FIG. 2. Layers in shared decoder 230 interpreted as downscaling layers, layers in private target and private source encoders interpreted as upscaling layers. FIG. 2 clearly shows an input layer of shared encoder which outputs to a shared layer which outputs a feature representation to both the classifier and autoencoder) training the classifier portion to learn a first set of parameters for classifying input data into either the first domain or the second domain; and([Col. 8 l. 25-60] "the similarity loss may be a domain adversarial similarity loss that trains the shared encoder neural network to generate the shared representations such that a domain classifier neural network cannot reliably predict the domain of the encoded representation" Bousmalis explicitly discloses that the domain classifier's job is to predict the domain of the input encoded representation) training, using the first set of parameters, the autoencoder portion to learn a second set of parameters for generating synthetic data for both the first domain and the second domain based on input data.([Col. 5 l. 
21-35] " the neural network training system 200 trains the shared encoder neural network 110 to generate a shared feature representation for an input image from the target domain that, when combined with a private feature representation for the same input image generated by the private target encoder neural network 210, can be used to accurately reconstruct the input image by the shared decoder neural network 230 and to generate a shared feature representation for an input image from the source domain that, when combined with a private feature representation for the same input image generated by the private source encoder neural network 220, can be used to accurately reconstruct the input image by the shared decoder neural network 230." [Col. 7 l. 10-33] "The classification loss trains the classifier neural network and, by virtue of backpropagation, the shared encoder neural network to generate accurate network outputs for source domain images, i.e., to generate network outputs that match the known network outputs for the training source domain images [...] The reconstruction loss trains the shared decoder neural network and, by virtue of backpropagation, each of the encoder neural networks" Reconstruction image interpreted as synthetic data. Bousmalis explicitly trains the classifier parameters and backpropagates the classification loss to jointly train the classifier, encoders, and decoder. The classification loss being influenced by the classifier output/predicted domain) wherein the synthetic data generated by the autoencoder portion for the first domain is influenced by a predicted domain of the input data such that the synthetic data differs depending on whether the input data is from the first domain or the second domain([Col. 6 l. 50-55] "The system generates a respective combined representation for each training source domain image and each training target domain image" [Col. 7 l. 
10-33] "The classification loss trains the classifier neural network and, by virtue of backpropagation, the shared encoder neural network to generate accurate network outputs for source domain images, i.e., to generate network outputs that match the known network outputs for the training source domain images [...] The reconstruction loss trains the shared decoder neural network and, by virtue of backpropagation, each of the encoder neural networks" Reconstruction image interpreted as synthetic data. Bousmalis explicitly trains the classifier parameters and backpropagates the classification loss to jointly train the classifier, encoders, and decoder. The classification loss being influenced by the classifier output/predicted domain). While one of ordinary skill in the art would recognize that the encoder in an autoencoder typically downscales and the decoder typically upscales, Bousmalis does not explicitly teach the first domain containing data from a separate computer system than the second domain; the autoencoder portion at least one convolutional upscaling layer and at least one convolutional downscaling layer.

[Figure: FIG. 1 of Zhu]

Zhu, in the same field of endeavor, teaches the first domain containing data from a separate computer system than the second domain;([p. 1] "We demonstrate our results from a retrospective data analysis using patient records from a publicly available database called Multi-parameter Intelligent Monitoring in Intensive Care-II (MIMIC-II) and a local database from Children’s Healthcare of Atlanta (CHOA)") the autoencoder portion at least one convolutional upscaling layer and at least one convolutional downscaling layer([p. 5] "Each block of the encoder includes a 1D convolutional layer, a ReLu activation layer, and a max pooling layer. Similarly, each block of the decoder includes a 1D convolutional layer, a ReLu activation layer, and an upsampling layer." See FIG. 
1 of Zhu which shows the autoencoder architecture having downsampling and upsampling layers in an autoencoder). Bousmalis as well as Zhu are directed towards machine learning domain adaptation and specifically towards learning domain-invariant information to better generalize a target and source domain. Therefore, Bousmalis as well as Zhu are reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Bousmalis with the teachings of Zhu by using the downscaling and upscaling autoencoder layers in Zhu as the respective autoencoder layers in Bousmalis. Zhu provides as additional motivation for combination ([p. 9] “We designed three different experimental settings, implemented CAE to learn latent feature representation, and applied multiple classifiers, such as gradient boosting and random forest for classification. We demonstrated the effectiveness of gradient boosting in both mortality prediction and ICU readmission prediction tasks. In addition, we showed that domain adaptation using CAE across two datasets can significantly improve results against using CAE and classifiers without domain adaptation”). This motivation for combination also applies to the remaining claims which depend on this combination. Regarding claim 2, the combination of Bousmalis and Zhu teaches The system of claim 1, wherein the shared layer is a one-dimensional convolutional layer that takes the output from the input layer and performs one or more convolutions on the output from the input layer to transform the output from the input layer to a different format using one or more filters.(Zhu [p. 5] "Each block of the encoder includes a 1D convolutional layer, a ReLu activation layer, and a max pooling layer. Similarly, each block of the decoder includes a 1D convolutional layer, a ReLu activation layer, and an upsampling layer." See FIG. 1 of Zhu). 
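
To make the claim 2 language concrete, the following is a minimal, dependency-free sketch of a one-dimensional convolution with several filters followed by ReLU, the first two steps of the encoder block quoted from Zhu. The signal, the filter values, and the function names are hypothetical and chosen only for illustration; nothing here is taken from the application, Bousmalis, or Zhu.

```python
def conv1d(signal, filters):
    """Valid 1D convolution (cross-correlation, as deep learning libraries compute it).

    signal:  list of floats of length n
    filters: list of kernels, each a list of floats of length k
    returns: n - k + 1 positions, each holding one value per filter
    """
    k = len(filters[0])
    out = []
    for start in range(len(signal) - k + 1):
        window = signal[start:start + k]
        out.append([sum(w * x for w, x in zip(kernel, window)) for kernel in filters])
    return out


def relu(feature_map):
    """Element-wise ReLU over a list of rows."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Hypothetical input and filters.
signal = [1.0, 2.0, 4.0, 3.0, 0.0, 1.0]
filters = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]

features = relu(conv1d(signal, filters))
print(len(signal), "->", len(features), "x", len(features[0]))  # 6 -> 4 x 2
print(features)  # [[0.0, 3.5], [0.0, 4.5], [4.0, 3.5], [2.0, 2.0]]
```

The output has a different shape from the input (six scalars become four positions with two channels), which is one plain reading of "transform the output from the input layer to a different format using one or more filters".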
Regarding claim 8, the combination of Bousmalis and Zhu teaches The system of claim 1, wherein the operations further comprise: generating synthetic data similar to data from a first domain by passing the data from the first domain to the trained domain autoencoder neural network; and(Bousmalis [Col. 5 l. 21-35] " the neural network training system 200 trains the shared encoder neural network 110 to generate a shared feature representation for an input image from the target domain that, when combined with a private feature representation for the same input image generated by the private target encoder neural network 210, can be used to accurately reconstruct the input image by the shared decoder neural network 230 and to generate a shared feature representation for an input image from the source domain that, when combined with a private feature representation for the same input image generated by the private source encoder neural network 220, can be used to accurately reconstruct the input image by the shared decoder neural network 230." [Col. 7 l. 10-33] "The classification loss trains the classifier neural network and, by virtue of backpropagation, the shared encoder neural network to generate accurate network outputs for source domain images, i.e., to generate network outputs that match the known network outputs for the training source domain images [...] The reconstruction loss trains the shared decoder neural network and, by virtue of backpropagation, each of the encoder neural networks" Reconstruction image interpreted as synthetic data. Bousmalis explicitly trains the classifier parameters and backpropagates the classification loss to jointly train the classifier, encoders, and decoder. The classification loss being influenced by the classifier output/predicted domain) using the generated synthetic data as training data using a machine learning algorithm to train a machine-learned model.(Bousmalis [Col. 6 l. 
50-55] "The system generates a respective combined representation for each training source domain image and each training target domain image" [Col. 7 l. 10-33] "The classification loss trains the classifier neural network and, by virtue of backpropagation, the shared encoder neural network to generate accurate network outputs for source domain images, i.e., to generate network outputs that match the known network outputs for the training source domain images [...] The reconstruction loss trains the shared decoder neural network and, by virtue of backpropagation, each of the encoder neural networks" Reconstruction image interpreted as synthetic data. Bousmalis explicitly trains the classifier parameters and backpropagates the classification loss to jointly train the classifier, encoders, and decoder. The classification loss being influenced by the classifier output/predicted domain). Regarding claim 9, the combination of Bousmalis and Zhu teaches The system of claim 8, wherein the machine learning algorithm is a linear regression model.(Zhu [p. 4] "A Linear regression classifier of Ordinary Least Squares fits a linear model that aims to minimize the residual sum of squares between the prediction (linear approximation) and the ground truth observations" Zhu explicitly uses a linear regression model classifier. See also FIG. 2). Regarding claim 10, Bousmalis teaches A method comprising: accessing training data, the training data including data from a first domain and data from a second domain([Abstract] "process the input image to generate a shared feature representation of features of the input image that are shared between images from the target domain and images from a source domain different from the target domain;" [Col. 3 l. 
4-12] "The shared encoder neural network 110 is a neural network, e.g., a convolutional neural network, that has been configured through training to receive the target domain image 102 and to process the target domain image 102 to generate a shared feature representation 112 for the target domain image 102. The shared feature representation 112 is a vector of numeric values and is a representation of the features of the target domain image 102 that are shared between images from the target domain and images from a source domain.") passing the training data to an input layer of a domain autoencoder neural network;([Col. 1 l. 31-43] "Neural networks are machine learning models that employ one or more layers of nonlinear units to predict an output for a received input. Some neural networks include one or more hidden layers in addition to an output layer. The output of each hidden layer is used as input to the next layer in the network, i.e., the next hidden layer or the output layer. Each layer of the network generates an output from a received input in accordance with current values of a respective set of parameters." [Col. 4 l. 11-24] " the neural network training system 200 trains the shared encoder neural network 110 to (i) generate shared feature representations for input images from the target domain that are similar to shared feature representations for input images from the source domain while (ii) generating shared feature representations for input images from the target domain that are different from private feature representations for the same input images from the target domain generated by the private target encoder neural network 210 and (iii) generating shared feature representations for input images from the source domain that are different from private feature representations for the same input images from the source domain generated by the private source encoder neural network 220." See also FIG. 
2) receiving, at a shared layer of the domain autoencoder neural network, output from the input layer, the shared layer passing its output to both a classifier portion of the domain autoencoder neural network and an autoencoder portion of the domain autoencoder neural network, the autoencoder portion comprising at least one convolutional [upscaling] layer and at least one convolutional [downscaling] layer; the shared layer being separate from the autoencoder portion([Col. 4 l. 24-30] "To train the shared encoder neural network 110 and the classifier neural network 120, the neural network training system 200 also includes a private target encoder neural network 210, a private source encoder neural network 220, and a shared decoder neural network 230" [Col. 8 l. 10-20] "H_c^s is a matrix having rows that are the shared feature representations of the training source domain images" See FIG. 2. Layers in shared decoder 230 interpreted as downscaling layers, layers in private target and private source encoders interpreted as upscaling layers. FIG. 2 clearly shows an input layer of shared encoder which outputs to a shared layer which outputs a feature representation to both the classifier and autoencoder) training the classifier portion to learn a first set of parameters for classifying input data into either the first domain or the second domain; and([Col. 8 l. 25-60] "the similarity loss may be a domain adversarial similarity loss that trains the shared encoder neural network to generate the shared representations such that a domain classifier neural network cannot reliably predict the domain of the encoded representation" Bousmalis explicitly discloses that the domain classifier's job is to predict the domain of the input encoded representation) training, using the first set of parameters, the autoencoder portion to learn a second set of parameters for generating synthetic data for the first domain and the second domain based on input data.([Col. 5 l. 
21-35] " the neural network training system 200 trains the shared encoder neural network 110 to generate a shared feature representation for an input image from the target domain that, when combined with a private feature representation for the same input image generated by the private target encoder neural network 210, can be used to accurately reconstruct the input image by the shared decoder neural network 230 and to generate a shared feature representation for an input image from the source domain that, when combined with a private feature representation for the same input image generated by the private source encoder neural network 220, can be used to accurately reconstruct the input image by the shared decoder neural network 230." [Col. 7 l. 10-33] "The classification loss trains the classifier neural network and, by virtue of backpropagation, the shared encoder neural network to generate accurate network outputs for source domain images, i.e., to generate network outputs that match the known network outputs for the training source domain images [...] The reconstruction loss trains the shared decoder neural network and, by virtue of backpropagation, each of the encoder neural networks" Reconstruction image interpreted as synthetic data. Bousmalis explicitly trains the classifier parameters and backpropagates the classification loss to jointly train the classifier, encoders, and decoder. The classification loss being influenced by the classifier output/predicted domain) wherein the synthetic data generated by the autoencoder portion for the first domain is influenced by a predicted domain of the input data such that the synthetic data differs depending on whether the input data is from the first domain or the second domain.([Col. 6 l. 50-55] "The system generates a respective combined representation for each training source domain image and each training target domain image" [Col. 7 l. 
10-33] "The classification loss trains the classifier neural network and, by virtue of backpropagation, the shared encoder neural network to generate accurate network outputs for source domain images, i.e., to generate network outputs that match the known network outputs for the training source domain images [...] The reconstruction loss trains the shared decoder neural network and, by virtue of backpropagation, each of the encoder neural networks" Reconstruction image interpreted as synthetic data. Bousmalis explicitly trains the classifier parameters and backpropagates the classification loss to jointly train the classifier, encoders, and decoder. The classification loss being influenced by the classifier output/predicted domain). However, Bousmalis does not explicitly teach the first domain containing data from a separate computer system than the second domain; the autoencoder portion at least one convolutional upscaling layer and at least one convolutional downscaling layer. Zhu, in the same field of endeavor, teaches the first domain containing data from a separate computer system than the second domain;([p. 1] "We demonstrate our results from a retrospective data analysis using patient records from a publicly available database called Multi-parameter Intelligent Monitoring in Intensive Care-II (MIMIC-II) and a local database from Children’s Healthcare of Atlanta (CHOA)") the autoencoder portion at least one convolutional upscaling layer and at least one convolutional downscaling layer([p. 5] "Each block of the encoder includes a 1D convolutional layer, a ReLu activation layer, and a max pooling layer. Similarly, each block of the decoder includes a 1D convolutional layer, a ReLu activation layer, and an upsampling layer." See FIG. 1 of Zhu which shows the autoencoder architecture having downsampling and upsampling layers in an autoencoder). 
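
The downscaling and upscaling steps in the Zhu passage quoted above (max pooling in the encoder, upsampling in the decoder) can be sketched as follows. The pool size, the upsampling factor, and the nearest-neighbour repetition are assumptions made for illustration; the quoted text does not specify them.

```python
def max_pool1d(values, size=2):
    """Downscale: keep the maximum of each non-overlapping window of `size` values."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def upsample1d(values, factor=2):
    """Upscale: repeat each value `factor` times (nearest-neighbour upsampling)."""
    return [v for v in values for _ in range(factor)]


# Hypothetical activations coming out of a convolution + ReLU step.
x = [0.1, 0.9, 0.4, 0.3, 0.8, 0.2, 0.5, 0.7]

code = max_pool1d(x)          # length 8 -> length 4
restored = upsample1d(code)   # length 4 -> length 8

print(code)      # [0.9, 0.4, 0.8, 0.7]
print(restored)  # [0.9, 0.9, 0.4, 0.4, 0.8, 0.8, 0.7, 0.7]
```

Pooling halves the sequence length and upsampling restores it, so the two together give the bottleneck shape of an autoencoder; in a real model the convolutions around them carry the learned parameters.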
Bousmalis as well as Zhu are directed towards machine learning domain adaptation and specifically towards learning domain-invariant information to better generalize a target and source domain. Therefore, Bousmalis as well as Zhu are reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Bousmalis with the teachings of Zhu by using the downscaling and upscaling autoencoder layers in Zhu as the respective autoencoder layers in Bousmalis. Zhu provides as additional motivation for combination ([p. 9] “We designed three different experimental settings, implemented CAE to learn latent feature representation, and applied multiple classifiers, such as gradient boosting and random forest for classification. We demonstrated the effectiveness of gradient boosting in both mortality prediction and ICU readmission prediction tasks. In addition, we showed that domain adaptation using CAE across two datasets can significantly improve results against using CAE and classifiers without domain adaptation”). This motivation for combination also applies to the remaining claims which depend on this combination. Regarding claim 11, the combination of Bousmalis and Zhu teaches The method of claim 10, wherein the shared layer is a one-dimensional convolutional layer that takes the output from the input layer and performs one or more convolutions on the output from the input layer to transform the output from the input layer to a different format using one or more filters.(Zhu [p. 5] "Each block of the encoder includes a 1D convolutional layer, a ReLu activation layer, and a max pooling layer. Similarly, each block of the decoder includes a 1D convolutional layer, a ReLu activation layer, and an upsampling layer." See FIG. 1 of Zhu). 
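
The branching recited in claims 10 and 11, where a shared layer feeds both a classifier portion and an autoencoder portion, can be illustrated with a toy forward pass. This is a sketch under stated assumptions only: the kernel, the classifier weight and bias, and all function names are hypothetical, no training is performed, and it is not the architecture of the application, Bousmalis, or Zhu.

```python
import math


def shared_layer(x, kernel):
    """Stand-in for the shared 1D convolutional layer (single filter, valid padding)."""
    k = len(kernel)
    return [sum(w * v for w, v in zip(kernel, x[i:i + k])) for i in range(len(x) - k + 1)]


def classifier_branch(features, weight, bias):
    """Toy domain classifier: global max pooling followed by one sigmoid unit."""
    pooled = max(features)
    return 1.0 / (1.0 + math.exp(-(weight * pooled + bias)))


def autoencoder_branch(features):
    """Toy autoencoder: pool by 2 (downscale), then repeat by 2 (upscale)."""
    pooled = [max(features[i:i + 2]) for i in range(0, len(features) - 1, 2)]
    return [v for v in pooled for _ in range(2)]


x = [1.0, 3.0, 2.0, 5.0, 4.0]                        # hypothetical input record
features = shared_layer(x, kernel=[0.5, 0.5])        # computed once
p_second_domain = classifier_branch(features, weight=1.0, bias=-4.0)
reconstruction = autoencoder_branch(features)        # second consumer of the same features

print(features)        # [2.0, 2.5, 3.5, 4.5]
print(reconstruction)  # [2.5, 2.5, 4.5, 4.5]
print(round(p_second_domain, 3))
```

The point of the sketch is structural: `features` is produced by one layer that sits outside both branches, and each branch consumes the same values.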
Regarding claim 17, the combination of Bousmalis and Zhu teaches The method of claim 10, further comprising: generating synthetic data similar to data from a first domain by passing the data from the first domain to the trained domain autoencoder neural network; and(Bousmalis [Col. 5 l. 21-35] " the neural network training system 200 trains the shared encoder neural network 110 to generate a shared feature representation for an input image from the target domain that, when combined with a private feature representation for the same input image generated by the private target encoder neural network 210, can be used to accurately reconstruct the input image by the shared decoder neural network 230 and to generate a shared feature representation for an input image from the source domain that, when combined with a private feature representation for the same input image generated by the private source encoder neural network 220, can be used to accurately reconstruct the input image by the shared decoder neural network 230." [Col. 7 l. 10-33] "The classification loss trains the classifier neural network and, by virtue of backpropagation, the shared encoder neural network to generate accurate network outputs for source domain images, i.e., to generate network outputs that match the known network outputs for the training source domain images [...] The reconstruction loss trains the shared decoder neural network and, by virtue of backpropagation, each of the encoder neural networks" Reconstruction image interpreted as synthetic data. Bousmalis explicitly trains the classifier parameters and backpropagates the classification loss to jointly train the classifier, encoders, and decoder. The classification loss being influenced by the classifier output/predicted domain) using the generated synthetic data as training data using a machine learning algorithm to train a machine-learned model.(Bousmalis [Col. 6 l. 
50-55] "The system generates a respective combined representation for each training source domain image and each training target domain image" [Col. 7 l. 10-33] "The classification loss trains the classifier neural network and, by virtue of backpropagation, the shared encoder neural network to generate accurate network outputs for source domain images, i.e., to generate network outputs that match the known network outputs for the training source domain images [...] The reconstruction loss trains the shared decoder neural network and, by virtue of backpropagation, each of the encoder neural networks" Reconstruction image interpreted as synthetic data. Bousmalis explicitly trains the classifier parameters and backpropagates the classification loss to jointly train the classifier, encoders, and decoder. The classification loss being influenced by the classifier output/predicted domain). Regarding claim 18, the combination of Bousmalis and Zhu teaches The method of claim 17, wherein the machine learning algorithm is a linear regression model.(Zhu [p. 4] "A Linear regression classifier of Ordinary Least Squares fits a linear model that aims to minimize the residual sum of squares between the prediction (linear approximation) and the ground truth observations" Zhu explicitly uses a linear regression model classifier. See also FIG. 2). Regarding claims 19 and 20, claims 19 and 20 are substantially similar to claims 1 and 2, respectively. Therefore, the rejections applied to claims 1 and 2 also apply to claims 19 and 20. Claims 3, 4, 5, 6, 7, 12, 13, 14, 15, and 16 are rejected under U.S.C. §103 as being unpatentable over the combination of Bousmalis and Zhu and in further view of Kim (“Convolutional Neural Networks for Sentence Classification”, 2014). Regarding claim 3, the combination of Bousmalis and Zhu teaches The system of claim 1. However, the combination of Bousmalis and Zhu doesn't explicitly teach wherein the classifier portion includes a reshaping layer. 
Kim, in the same field of endeavor, teaches the classifier portion includes a reshaping layer. ([Abstract] "convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. We show that a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks" [p. 5 §5] "we have described a series of experiments with convolutional neural networks built on top of word2vec […] a simple CNN with one layer of convolution performs remarkably well" FIG. 1 shows that the CNN classifier has multiple layers which reshape the input to an output of different shape). The combination of Bousmalis and Zhu as well as Kim are directed towards convolutional neural networks for classification. Therefore, the combination of Bousmalis and Zhu as well as Kim are reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Bousmalis and Zhu with the teachings of Kim by substituting the classifier in Bousmalis/Zhu with the 1D convolutional feature extractor model of Kim. Kim provides as additional motivation for combination ([p. 5 §5] "a simple CNN with one layer of convolution performs remarkably well." See also Table 2 which compares results with other known models). Regarding claim 4, the combination of Bousmalis, Zhu, and Kim teaches The system of claim 3, wherein the classifier portion further includes at least one one-dimensional convolutional layer.(Kim [p. 5 §5] "we have described a series of experiments with convolutional neural networks built on top of word2vec […] a simple CNN with one layer of convolution performs remarkably well" See FIG. 1 of Kim which shows 1D convolution). Regarding claim 5, the combination of Bousmalis, Zhu, and Kim teaches The system of claim 4, wherein the classifier portion further includes at least one dropout layer.(Kim [p. 
2 §2.1] "For regularization we employ dropout on the penultimate layer with a constraint on l2-norms of the weight vectors (Hinton et al., 2012). Dropout prevents co-adaptation of hidden units by randomly dropping out—i.e., setting to zero—a proportion p of the hidden units during forward backpropagation" See also FIG. 1 of Kim).

Regarding claim 6, the combination of Bousmalis, Zhu, and Kim teaches The system of claim 5, wherein the classifier portion further includes a global max pooling layer. (Kim [p. 2] "We then apply a max-over-time pooling operation (Collobert et al., 2011) over the feature map and take the maximum value").

Regarding claim 7, the combination of Bousmalis, Zhu, and Kim teaches The system of claim 6, wherein the classifier portion further includes at least one dense layer. (Kim [p. 2] "Fully connected layer with dropout and softmax output" [p. 5 §5] "we have described a series of experiments with convolutional neural networks built on top of word2vec […] a simple CNN with one layer of convolution performs remarkably well" fully connected layer interpreted as dense layer).

Regarding claim 12, the combination of Bousmalis and Zhu teaches The method of claim 10. However, the combination of Bousmalis and Zhu does not explicitly teach wherein the classifier portion includes a reshaping layer. Kim, in the same field of endeavor, teaches the classifier portion includes a reshaping layer. ([Abstract] "convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. We show that a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks" [p. 5 §5] "we have described a series of experiments with convolutional neural networks built on top of word2vec […] a simple CNN with one layer of convolution performs remarkably well" FIG. 1 shows that the CNN classifier has multiple layers which reshape the input to an output of different shape).
The combination of Bousmalis and Zhu as well as Kim are directed towards convolutional neural networks for classification. Therefore, the combination of Bousmalis and Zhu as well as Kim are reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Bousmalis and Zhu with the teachings of Kim by substituting the classifier in Bousmalis/Zhu with the 1D convolutional feature extractor model of Kim. Kim provides as additional motivation for combination ([p. 5 §5] "a simple CNN with one layer of convolution performs remarkably well." See also Table 2 which compares results with other known models).

Regarding claim 13, the combination of Bousmalis, Zhu, and Kim teaches The method of claim 12, wherein the classifier portion further includes at least one one-dimensional convolutional layer. (Kim [p. 5 §5] "we have described a series of experiments with convolutional neural networks built on top of word2vec […] a simple CNN with one layer of convolution performs remarkably well" See FIG. 1 of Kim which shows 1D convolution).

Regarding claim 14, the combination of Bousmalis, Zhu, and Kim teaches The method of claim 13, wherein the classifier portion further includes at least one dropout layer. (Kim [p. 2 §2.1] "For regularization we employ dropout on the penultimate layer with a constraint on l2-norms of the weight vectors (Hinton et al., 2012). Dropout prevents co-adaptation of hidden units by randomly dropping out—i.e., setting to zero—a proportion p of the hidden units during forward backpropagation" See also FIG. 1 of Kim).

Regarding claim 15, the combination of Bousmalis, Zhu, and Kim teaches The method of claim 14, wherein the classifier portion further includes a global max pooling layer. (Kim [p. 2] "We then apply a max-over-time pooling operation (Collobert et al., 2011) over the feature map and take the maximum value").
Regarding claim 16, the combination of Bousmalis, Zhu, and Kim teaches The method of claim 15, wherein the classifier portion further includes at least one dense layer. (Kim [p. 2] "Fully connected layer with dropout and softmax output" [p. 5 §5] "we have described a series of experiments with convolutional neural networks built on top of word2vec […] a simple CNN with one layer of convolution performs remarkably well" fully connected layer interpreted as dense layer).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Tsai (“Adversarial domain separation and adaptation”, 2017) is directed towards a multi-domain adaptation network with a shared encoder outputting to a classifier and autoencoder.

[Figure: FIG. 1 of Tsai]

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720. The examiner can normally be reached M-F 7:30am-5:00pm EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. 
To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached on (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /SIDNEY VINCENT BOSTWICK/Examiner, Art Unit 2124 /MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124
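
The classifier layers recited across claims 3-7 and 12-16 (reshaping layer, one-dimensional convolution, dropout, global max pooling, dense layer) can be pictured as a single forward pass. The following is a minimal sketch in plain Python, not the applicant's implementation and not Kim's model: the layer order simply follows the claim dependency chain, and every function name, shape, and weight value is a hypothetical chosen for illustration.

```python
import random

def reshape(vec, channels):
    """Reshaping layer: flat vector -> list of rows, each with `channels` values."""
    assert len(vec) % channels == 0
    return [vec[i:i + channels] for i in range(0, len(vec), channels)]

def conv1d(seq, kernels):
    """1D convolution (no padding) followed by ReLU.
    seq is [length][channels]; kernels is [filters][kernel_size][channels]."""
    k = len(kernels[0])
    out = []
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        row = []
        for kern in kernels:
            s = sum(w * x
                    for krow, xrow in zip(kern, window)
                    for w, x in zip(krow, xrow))
            row.append(max(0.0, s))
        out.append(row)
    return out

def dropout(seq, rate, rng, training):
    """Dropout: zero each unit with probability `rate` during training only."""
    if not training:
        return seq
    keep = 1.0 - rate
    return [[x / keep if rng.random() < keep else 0.0 for x in row] for row in seq]

def global_max_pool(seq):
    """Global max pooling: maximum of each filter's feature map over positions."""
    return [max(col) for col in zip(*seq)]

def dense(vec, weights, bias):
    """Dense (fully connected) layer: one output per weight row."""
    return [sum(w * x for w, x in zip(wrow, vec)) + b
            for wrow, b in zip(weights, bias)]

def classify(latent, kernels, weights, bias, rng, training=False):
    """Hypothetical classifier portion: reshape -> conv1d -> dropout -> pool -> dense."""
    x = reshape(latent, channels=1)
    x = conv1d(x, kernels)
    x = dropout(x, 0.5, rng, training)
    x = global_max_pool(x)
    return dense(x, weights, bias)

# Illustrative values only (assumed, not taken from any cited reference).
latent = [1.0, -2.0, 3.0, 0.5]
kernels = [[[1.0], [1.0]], [[-1.0], [1.0]]]   # 2 filters, kernel size 2, 1 channel
weights, bias = [[1.0, 1.0]], [0.5]
logits = classify(latent, kernels, weights, bias, random.Random(0))
print(logits)
```

Note that Kim describes dropout on the penultimate layer, that is, after pooling; placing it before pooling here is an assumption made only to mirror the order in which the dependent claims add layers.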

Prosecution Timeline

(13 earlier events not shown)
Dec 19, 2025: Response Filed
Feb 18, 2026: Non-Final Rejection mailed (§103)
Mar 10, 2026: Applicant Interview (Telephonic)
Mar 10, 2026: Examiner Interview Summary
Mar 13, 2026: Response Filed
May 06, 2026: Final Rejection mailed (§103)
May 14, 2026: Examiner Interview Summary
May 14, 2026: Applicant Interview (Telephonic)
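
The reply periods stated in the Final Rejection can be estimated from the May 06, 2026 mailing date listed in the timeline. The sketch below is only a calendar calculation under stated assumptions: it treats the timeline date as the official mailing date, clamps to the last day of a shorter month, and ignores weekend or holiday roll-over and any adjustment tied to a later advisory action. The helper `add_months` is a hypothetical name, not part of any USPTO tool.

```python
from datetime import date
import calendar

def add_months(d, months):
    """Return the same day-of-month `months` later, clamped to the month's last day
    (the clamping rule is an assumption for this sketch)."""
    idx = d.month - 1 + months
    year = d.year + idx // 12
    month = idx % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

mailed = date(2026, 5, 6)            # "Final Rejection mailed" entry in the timeline
two_month = add_months(mailed, 2)    # first reply within TWO MONTHS
shortened = add_months(mailed, 3)    # THREE-MONTH shortened statutory period
maximum = add_months(mailed, 6)      # SIX-MONTH outer limit

for label, d in [("two-month date", two_month),
                 ("shortened period ends", shortened),
                 ("six-month limit", maximum)]:
    print(f"{label}: {d.isoformat()}")
```

Any date produced this way should be checked against the mailing date printed on the action itself before it is relied on for docketing.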

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12626139
SECRET SOFTMAX FUNCTION CALCULATION SYSTEM, SECRET SOFTMAX FUNCTION CALCULATION APPARATUS, SECRET SOFTMAX FUNCTION CALCULATION METHOD, SECRET NEURAL NETWORK CALCULATION SYSTEM, SECRET NEURAL NETWORK LEARNING SYSTEM, AND PROGRAM
4y 3m to grant Granted May 12, 2026
Patent 12619815
Magnitude Invariant Multimodal Agent for Efficient Image-Text Interface Automation
1y 6m to grant Granted May 05, 2026
Patent 12561604
SYSTEM AND METHOD FOR ITERATIVE DATA CLUSTERING USING MACHINE LEARNING
4y 7m to grant Granted Feb 24, 2026
Patent 12547878
Highly Efficient Convolutional Neural Networks
2y 4m to grant Granted Feb 10, 2026
Patent 12536426
Smooth Continuous Piecewise Constructed Activation Functions
5y 7m to grant Granted Jan 27, 2026
Based on the 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 6-7
Grant Probability: 51%
With Interview (+38.0%): 89%
Median Time to Grant: 4y 5m (~9m remaining)
PTA Risk: High
Based on 138 resolved cases by this examiner. Grant probability derived from career allowance rate.
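
The projection figures are consistent with simple arithmetic on the examiner's career numbers reported on this page (71 granted out of 138 resolved). The sketch below reproduces that arithmetic; treating the interview lift as an additive number of percentage points is an assumption inferred from the displayed values, not a documented formula.

```python
granted = 71          # career grants reported for this examiner
resolved = 138        # resolved cases reported for this examiner
interview_lift = 38   # percentage points, as displayed (+38.0%)

allowance_rate = round(100 * granted / resolved)     # career allowance rate, in percent
with_interview = allowance_rate + interview_lift     # assumed additive lift

print(f"Grant probability: {allowance_rate}%")
print(f"With interview: {with_interview}%")
```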
