DETAILED ACTION
Claims 1-20 are presented for examination.
This Office action is in response to the application filed on 02/22/2023.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 4 and 14 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 4 and 14 recite the limitation “wherein the encrypted bytes are represented as a neural embedding matrix.” A neural embedding matrix is a parameter of a model which defines the mapping between the model’s input and its embedding. It is unclear what it would mean for the encrypted bytes (i.e. the model’s input itself) to be represented as a neural embedding matrix. For examination purposes, this limitation will be interpreted as specifying that the encrypted bytes are represented as an embedding derived from a neural embedding matrix.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claim 1:
Step 1: The claim is directed to a method, which falls within the statutory category of a process.
Step 2A Prong 1: The claim is directed to an abstract idea. Specifically, the claim recites:
generating a private key and a public key for the user based on the user information; (Abstract idea – mental process. Generating private and public keys based on user information can practically be performed in the human mind or with the aid of pen and paper, for example, by setting the private key equal to the user’s numerical passcode and setting the public key equal to the result of a simple deterministic function applied to the private key (e.g. add 1 to each character). The courts have recognized that claims can recite a mental process even if they are claimed as being performed on a computer. See MPEP 2106.04(a)(2)(III).)
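For illustration only, the pen-and-paper example above can be written out as a short Python sketch; the passcode value and the shift-by-one rule are the examiner's hypothetical, not the claimed method:

def toy_keygen(passcode: str) -> tuple[str, str]:
    # Private key: the user's numerical passcode, taken as-is.
    private_key = passcode
    # Public key: a simple deterministic function of the private key
    # (here, add 1 to each character, per the example above).
    public_key = "".join(chr(ord(c) + 1) for c in private_key)
    return private_key, public_key

priv, pub = toy_keygen("1234")
print(priv, pub)  # 1234 2345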
Step 2A Prong 2: The additional elements recited in the claim do not integrate the abstract idea into a practical application, individually or in combination. Specifically, the claim recites the additional elements:
A computer-implemented method of training a machine learning model (This amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
receiving user information for a user; (Receiving user information amounts to adding insignificant extra-solution activity (necessary data gathering) to the judicial exception – see MPEP 2106.05(g).)
receiving input bytes containing user-specific features; (Receiving input bytes amounts to adding insignificant extra-solution activity (necessary data gathering) to the judicial exception – see MPEP 2106.05(g).)
feeding the input bytes, the private key, and the public key into a machine learning model; (Feeding data into a machine learning model amounts to adding insignificant extra-solution activity (necessary data gathering) to the judicial exception – see MPEP 2106.05(g).)
training the machine learning model based on the received input bytes, the private key, and the public key; and (Generic training of a machine learning model based on received data is standard in the field of machine learning, and thus amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
generating a personalized machine learning model for the user based on the training of the machine learning model. (Generating a personalized machine learning model based on model training amounts to adding insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). Further, generating personalized models is well-understood, routine, and conventional in the field of machine learning, per Atrey: “Recent work in context prediction has focused on ML model personalization where a personalized model is learned for each individual user in order to tailor predictions or recommendations to a user’s mobile behavior” (Atrey et al., “Preserving Privacy in Personalized Models for Distributed Mobile Services”, pg. 1, Abstract). See MPEP 2106.05(d).)
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Specifically, the claim recites the additional elements:
A computer-implemented method of training a machine learning model (This amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
receiving user information for a user; (Receiving user information amounts to adding insignificant extra-solution activity (necessary data gathering) to the judicial exception – see MPEP 2106.05(g). Further, the limitation is directed to receiving or transmitting data over a network, which the courts have found to be well-understood, routine, and conventional in the computer arts – see MPEP 2106.05(d).)
receiving input bytes containing user-specific features; (Receiving input bytes amounts to adding insignificant extra-solution activity (necessary data gathering) to the judicial exception – see MPEP 2106.05(g). Further, the limitation is directed to receiving or transmitting data over a network, which the courts have found to be well-understood, routine, and conventional in the computer arts – see MPEP 2106.05(d).)
feeding the input bytes, the private key, and the public key into a machine learning model; (Feeding data into a machine learning model amounts to adding insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). Further, the limitation is directed to receiving or transmitting data over a network, which the courts have found to be well-understood, routine, and conventional in the computer arts – see MPEP 2106.05(d).)
training the machine learning model based on the received input bytes, the private key, and the public key; and (Generic training of a machine learning model based on received data is standard in the field of machine learning, and thus amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
generating a personalized machine learning model for the user based on the training of the machine learning model. (Generating a personalized machine learning model based on model training amounts to adding insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). Further, generating personalized models is well-understood, routine, and conventional in the field of machine learning, per Atrey: “Recent work in context prediction has focused on ML model personalization where a personalized model is learned for each individual user in order to tailor predictions or recommendations to a user’s mobile behavior” (Atrey et al., “Preserving Privacy in Personalized Models for Distributed Mobile Services”, pg. 1, Abstract). See MPEP 2106.05(d).)
Claims 2-20:
Claim 2 recites The computer-implemented method of claim 1, wherein the user information comprises user identification information and password associated with an application. This limitation merely qualifies the data received in claim 1 as user identification information and password data, and thus amounts to adding insignificant extra-solution activity (necessary data gathering) to the judicial exception – see MPEP 2106.05(g). Further, the limitation is directed to receiving or transmitting data over a network, which the courts have found to be well-understood, routine, and conventional in the computer arts – see MPEP 2106.05(d). Therefore, the claim does not recite additional elements that are sufficient to amount to significantly more than the abstract idea.
Claim 3 recites The computer-implemented method of claim 1, prior to training the machine learning model, the method further comprises: converting the input bytes into encrypted bytes based on the private key and the public key. Converting input bytes to encrypted bytes based on the private and public keys can practically be performed in the human mind or with the aid of pen and paper (i.e. mental process), for example, by using a simple encryption scheme such as adding the value of each character of the public key to each corresponding value in the input data. See MPEP 2106.04(a)(2)(III). Therefore, the claim merges with the abstract idea recited in claim 1, and does not recite additional elements that are sufficient to amount to significantly more than the abstract idea.
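For illustration only, the simple encryption scheme described above can be sketched in a few lines of Python; the key value and the modulo-256 wrapping are the examiner's hypothetical, not the claimed encryption:

def toy_encrypt(input_bytes: bytes, public_key: str) -> bytes:
    # Add the value of each public-key character to the corresponding
    # input byte, cycling through the key and wrapping modulo 256.
    return bytes((b + ord(public_key[i % len(public_key)])) % 256
                 for i, b in enumerate(input_bytes))

print(toy_encrypt(b"user-specific features", "2345").hex())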
Claim 4 recites The computer-implemented method of claim 3, wherein the encrypted bytes are represented as a neural embedding matrix. Representing encrypted bytes as an embedding derived from a neural embedding matrix can practically be performed in the human mind or with the aid of pen and paper (i.e. mental process), for example, by viewing the neural embedding matrix on a display and writing out the corresponding embedding for each encrypted input. See MPEP 2106.04(a)(2)(III). Therefore, the claim merges with the abstract idea recited in claim 3, and does not recite additional elements that are sufficient to amount to significantly more than the abstract idea.
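For illustration only, a minimal Python sketch of the interpreted limitation, in which each encrypted byte selects a row of a hypothetical embedding matrix E (the matrix contents and dimensions are arbitrary):

import numpy as np

rng = np.random.default_rng(0)
E = rng.normal(size=(256, 4))          # hypothetical embedding matrix: one row per byte value

encrypted_bytes = bytes([7, 42, 7])    # hypothetical encrypted input
embeddings = E[list(encrypted_bytes)]  # look up the row for each encrypted byte
print(embeddings.shape)                # (3, 4): one 4-dim embedding per byte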
Claim 5 recites The computer-implemented method of claim 3, wherein training the machine learning model comprises dynamic embedding of the encrypted bytes and performing one or more layers of linear or nonlinear neural applications. Performing layers of generic linear or nonlinear neural applications is standard in the field of machine learning, and thus amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea – see MPEP 2106.05(f). Dynamic embedding of the encrypted bytes amounts to adding insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). Further, dynamic embedding is well-understood, routine, and conventional in the field of machine learning, per Hofmann: “The meaning of a word can also vary across extralinguistic contexts such as time…and social space… To capture these phenomena, various types of dynamic word embeddings have been proposed…” (Hofmann et al., “Dynamic Contextualized Word Embeddings”, pg. 2, section 2.2). Therefore, the claim does not recite additional elements that are sufficient to amount to significantly more than the abstract idea.
Claim 6 recites The computer-implemented method of claim 1, wherein training the machine learning model comprises optimizing parameters of the machine learning model to reflect the user-specific features. Optimizing model parameters based on the input data is generic training which is standard in the field of machine learning, and thus amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea – see MPEP 2106.05(f). Therefore, the claim does not recite additional elements that are sufficient to amount to significantly more than the abstract idea.
Claim 7 recites The computer-implemented method of claim 1, wherein obtaining a personalized machine learning model for the user comprises obtaining a first personalized machine learning model for a first user and obtaining a second personalized machine learning model for a second user. Generating personalized machine learning models for multiple users amounts to adding insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). Further, generating personalized models is well-understood, routine, and conventional in the field of machine learning, per Atrey: “Recent work in context prediction has focused on ML model personalization where a personalized model is learned for each individual user in order to tailor predictions or recommendations to a user’s mobile behavior” (Atrey et al., “Preserving Privacy in Personalized Models for Distributed Mobile Services”, pg. 1, Abstract). See MPEP 2106.05(d). Therefore, the claim does not recite additional elements that are sufficient to amount to significantly more than the abstract idea.
Claim 8 recites The computer-implemented method of claim 7, wherein the first personalized machine learning model includes a first set of model parameters optimized for the first user and the second personalized machine learning model includes a second set of model parameters optimized for the second user. Generating personalized machine learning models for multiple users by learning user-optimized parameters amounts to adding insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). Further, generating personalized models with user-optimized parameters is well-understood, routine, and conventional in the field of machine learning, per Atrey: “Recent work in context prediction has focused on ML model personalization where a personalized model is learned for each individual user in order to tailor predictions or recommendations to a user’s mobile behavior” (Atrey, pg. 1, Abstract). See MPEP 2106.05(d). Therefore, the claim does not recite additional elements that are sufficient to amount to significantly more than the abstract idea.
Claim 9 recites The computer-implemented method of claim 7, wherein, when the first user accesses the second personalized machine learning model without providing user information associated with the second user, the second personalized machine learning model generates an output with an accuracy below a threshold or does not generate an output. Limiting user access to machine learning model inference amounts to adding insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). Further, homomorphically encrypted inference, where the output of a model is encrypted and can only be generated via decryption with the intended user’s private key, is well-understood, routine, and conventional in the field of machine learning, per Chillotti: “Earlier works attempted to evaluate neural networks using fully homomorphic encryption. Cryptonets [12] was the first initiative towards this goal. They were able to perform a homomorphic inference… A number of subsequent works have adopted a similar approach and improved it in various directions.” (Chillotti et al., “Programmable Bootstrapping Enables Efficient Homomorphic Inference of Deep Neural Networks”, pg. 2, section 1). Therefore, the claim does not recite additional elements that are sufficient to amount to significantly more than the abstract idea.
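For illustration only, the access-limiting behavior recited in claim 9 can be sketched with a toy stream cipher in Python; this is not the homomorphic machinery described by Chillotti, and the key names and output string are hypothetical:

import hashlib

def keystream(key: bytes, n: int) -> bytes:
    # Expand the key into n pseudo-random bytes (toy construction only).
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def toy_crypt(data: bytes, key: bytes) -> bytes:
    # XOR with the keystream; the same call encrypts and decrypts.
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

output = b"class=7 score=0.93"              # hypothetical inference result
ct = toy_crypt(output, b"second-user-key")  # encrypted for the second user

print(toy_crypt(ct, b"second-user-key"))    # intended user recovers the output
print(toy_crypt(ct, b"first-user-key"))     # wrong key: unintelligible bytes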
Claim 10 recites The computer-implemented method of claim 1, prior to training the machine learning model, the method further comprises: receiving device information of a device intended to run the personalized machine learning model for the user; and training the machine learning model based on the input bytes, the private key, the public key, and the device information of the device. Receiving device information amounts to adding insignificant extra-solution activity (mere data gathering) to the judicial exception – see MPEP 2106.05(g). Further, the limitation is directed to receiving or transmitting data over a network, which the courts have found to be well-understood, routine, and conventional in the computer arts – see MPEP 2106.05(d). Generic training of a machine learning model based on received data is standard in the field of machine learning, and thus amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea – see MPEP 2106.05(f). Therefore, the claim does not recite additional elements that are sufficient to amount to significantly more than the abstract idea.
Claims 11-20 are system claims containing substantially the same elements as method claims 1-10, respectively, and are rejected on the same grounds under 35 U.S.C. 101 as claims 1-10, respectively, mutatis mutandis. The additional components of “A system for training a machine learning model, comprising: a processor; and a memory, coupled to the processor, configured to store executable instructions that, when executed by the processor, cause the processor to perform operations” are interpreted as a general-purpose computer and mere instructions to apply the judicial exception on the computer. Therefore, the claims do not recite additional elements that are sufficient to amount to significantly more than the abstract idea.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 6, 11-14, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Badawi et al. (hereinafter Badawi), “PrivFT: Private and Fast Text Classification with Homomorphic Encryption” in view of Khaleghi, United States Patent US 10735191 B1.
Regarding Claim 1,
Badawi teaches A computer-implemented method of training a machine learning model, comprising: (Pg. 9, section 5.2: “PrivFT is implemented in two libraries on two different hardware platforms. The first implementation uses Microsoft SEAL v3.3.0 and runs on CPU. The second implementation utilizes our GPU implementation of the CKKS scheme - described in the previous section and runs on NVIDIA-enabled GPUs.”)
generating a private key and a public key for the user [based on the user information]; (Pg. 5, section 3.4: “Given input data represented as real or complex numbers, we use a modified version of CKKS [14] for encryption (ENC) and decryption (DEC) as follows… KEYGEN: generate: (1) secret key s ← X_key ∈ R_{q_L}, and (2) public key (a, b) ∈ R_{q_L}^2, where a ← X_{q_L} and b = −a·s + e with e ← X_err.” A secret key is a private key.)
receiving input bytes containing user-specific features; (Pg. 2, section 1.1: “To give motivational use-cases for PrivFT, consider the inference and training as a service shown in Figure 1… In Figure 1b, Alice has a private dataset and wants to train a model on the cloud.” The dataset used for training (i.e. input bytes) is private to a user (i.e. contains user-specific features).)
feeding the input bytes, the private key, and the public key into a machine learning model; (Pg. 2, section 1.1: “She encrypts her dataset and sends to the cloud. The cloud runs the training algorithm and generates an encrypted model that is communicated back to Alice.” The training dataset (i.e. input bytes) is encrypted (based on the private and public keys) and sent to the cloud for model training (i.e. fed into a machine learning model).)
training the machine learning model based on the received input bytes, the private key, and the public key; and (Pg. 2, section 1.1: “We demonstrate how to train an effective model using an encrypted dataset with FHE [Fully Homomorphic Encryption].” The machine learning model is trained using the training dataset (i.e. input bytes), which is encrypted based on the private and public keys.)
generating a personalized machine learning model for the user based on the training of the machine learning model. (Pg. 2, section 1.1: “The cloud runs the training algorithm and generates an encrypted model that is communicated back to Alice. Alice can decrypt the model and use it for local inference.” The training algorithm generates a model based on the user’s data (i.e. a personalized model) and sends it to the user for decryption and use by the user.)
Badawi does not appear to explicitly disclose
receiving user information for a user;
generating keys based on the user information;
However, Khaleghi teaches receiving user information for a user; (Col. 15, lines 46-49: “At block 602, the user device may access local user authentication data (e.g., a password received from the user, biometric data of the user, and/or the like).” User authentication data is user information.)
generating keys based on the user information; (Col. 15, lines 49-55: “At block 604, a key is generated using the authentication data. At block 606, an authentication token is generated. The authentication token may be different than the authentication data and different than the key. For example the authentication [token] may be generated using a key derivation function, such as Password-Based Key Derivation Function…” A cryptographic key is generated based on the authentication data (i.e. user information).)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Badawi and Khaleghi. Badawi teaches homomorphically encrypted machine learning inference and training for text classification. Khaleghi teaches controlling access to sensitive data using cryptographic keys generated using a password-based key derivation function. One of ordinary skill would have motivation to combine Badawi and Khaleghi in order to “make it extremely difficult for an adverse party (e.g., a hacker) to determine the authentication token using brute force guessing. For example, a 90 bits password processed using 4500 PBKDF2 [Password-Based Key Derivation Function 2], may take trillions of years to guess, even with a high powered graphical processing unit,” (Khaleghi, col. 15, lines 58-63) thereby “enabl[ing] communication of data and information, including multimedia data, among disparate systems in a secure and controlled manner” (Khaleghi, col. 17, lines 6-8).
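For illustration only, a minimal sketch of password-based key derivation of the kind Khaleghi describes, using the Python standard library's PBKDF2; the password, salt handling, and iteration count are assumptions, not Khaleghi's parameters:

import hashlib, os

password = b"correct horse battery staple"  # stands in for the user's authentication data
salt = os.urandom(16)                       # stored with the derived key, unique per user

# A large iteration count makes brute-force guessing expensive, which is the
# motivation quoted from Khaleghi above.
key = hashlib.pbkdf2_hmac("sha256", password, salt, 600_000)
print(key.hex())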
Regarding Claim 2, Badawi and Khaleghi teach The computer-implemented method of claim 1, as shown above.
Khaleghi also teaches wherein the user information comprises user identification information and password associated with an application. (Col. 6, lines 63-67: “[T]he user may be authenticated a first time using a username and a password (e.g., within a session of limited duration), a time-limited token may be generated and provided in return by the protected resource, and the token may [be used] for further authentication…” A username is user identification information.)
Regarding Claim 3, Badawi and Khaleghi teach The computer-implemented method of claim 1, as shown above.
Badawi also teaches prior to training the machine learning model, the method further comprises: converting the input bytes into encrypted bytes based on the private key and the public key. (Pg. 2, section 1.1: “In Figure 1b, Alice has a private dataset and wants to train a model on the cloud. She encrypts her dataset and sends to the cloud. The cloud runs the training algorithm and generates an encrypted model that is communicated back to Alice.” Pg. 5, section 3.4: “Given input data represented as real or complex numbers, we use a modified version of CKKS [14] for encryption (ENC) and decryption (DEC) as follows… KEYGEN: generate: (1) secret key s ← X_key ∈ R_{q_L}, and (2) public key (a, b) ∈ R_{q_L}^2, where a ← X_{q_L} and b = −a·s + e with e ← X_err… ENC(µ): given a plaintext message µ, sample v ← X_{q_L} and e_0, e_1 ← X_err. Return ciphertext ct = (c_0, c_1) = (a·v + µ + e_0, b·v + e_1) ∈ R_{q_L}^2. DEC(ct): given a ciphertext ct ∈ R_{q_l}^2, return µ = c_0 + s·c_1 ∈ R_{q_l}.” The training dataset (i.e. input bytes) is encrypted (i.e. converted to encrypted bytes) prior to being sent to the cloud for training. Encryption ENC returns ciphertext computed based on the public key (a, b), which can be decrypted using the secret (private) key s (i.e. the encryption is based on the private and public keys).)
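For illustration only, a toy RLWE-style KEYGEN/ENC/DEC in Python mirroring the structure of the quoted primitives; it uses the textbook convention c0 = b·v + µ + e0, c1 = a·v + e1, the parameters are deliberately tiny and insecure, and this is not Badawi's implementation:

import numpy as np

n, q, delta = 16, 2**20, 2**10   # toy ring degree, modulus, message scaling factor
rng = np.random.default_rng(0)

def ring_mul(a, b):
    # Negacyclic convolution: multiplication in Z_q[x] / (x^n + 1).
    c = np.zeros(n, dtype=np.int64)
    for i in range(n):
        for j in range(n):
            if i + j < n:
                c[i + j] += a[i] * b[j]
            else:
                c[i + j - n] -= a[i] * b[j]
    return c % q

def small():
    # "Small" polynomial with coefficients in {-1, 0, 1} (key/error distribution).
    return rng.integers(-1, 2, n)

# KEYGEN: secret key s; public key (a, b) with b = -a*s + e (mod q).
s = small()
a = rng.integers(0, q, n)
b = (-ring_mul(a, s) + small()) % q

# ENC: hide the scaled message mu under the public key.
mu = rng.integers(0, 10, n) * delta
v, e0, e1 = small(), small(), small()
c0 = (ring_mul(b, v) + mu + e0) % q
c1 = (ring_mul(a, v) + e1) % q

# DEC: c0 + s*c1 = mu + small noise (mod q); rounding removes the noise.
raw = (c0 + ring_mul(s, c1)) % q
centered = np.where(raw > q // 2, raw - q, raw)
recovered = (np.rint(centered / delta).astype(np.int64) * delta) % q
assert np.array_equal(recovered, mu)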
Regarding Claim 4, Badawi and Khaleghi teach The computer-implemented method of claim 3, as shown above.
Badawi also teaches wherein the encrypted bytes are represented as a neural embedding matrix. (Pg. 6, section 4.1: “we require the client to encode her text r_1, …, r_w into a 1-hot vector (v)… The client encrypts v and sends it along with the number of words w (in plaintext) to the server. By doing so, the server can compute the embedded representation h (Step 2) by a vector-matrix multiplication (v · H) and a plaintext multiplication (HMulPlain) of the factor 1/w. We assume here that the embedding vectors in H correspond to the words in the same order as indexed in the dictionary.” Matrix H is a neural embedding matrix, and the embedded representation h of the encrypted input v (i.e. encrypted bytes) is derived from the embedding matrix H.)
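For illustration only, a plaintext analogue in Python of the quoted Step 2 computation: a summed 1-hot vector times the embedding matrix H, scaled by 1/w, averages the embedding rows (in PrivFT the vector v is encrypted and the products are homomorphic; the toy sizes below are arbitrary):

import numpy as np

vocab_size, dim = 8, 4
H = np.arange(vocab_size * dim, dtype=float).reshape(vocab_size, dim)  # toy embedding matrix

words = [2, 5, 2]            # text r_1, ..., r_w as dictionary indices
w = len(words)
v = np.zeros(vocab_size)     # summed 1-hot encoding of the text
for r in words:
    v[r] += 1

h = (v @ H) * (1.0 / w)      # Step 2: vector-matrix product scaled by 1/w
print(h)                     # average of the embedding rows for words 2, 5, 2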
Regarding Claim 6, Badawi and Khaleghi teach The computer-implemented method of claim 1, as shown above.
Badawi also teaches wherein training the machine learning model comprises optimizing parameters of the machine learning model to reflect the user-specific features. (Pg. 2, section 1.1: “In Figure 1b, Alice has a private dataset and wants to train a model on the cloud. She encrypts her dataset and sends to the cloud. The cloud runs the training algorithm and generates an encrypted model that is communicated back to Alice.” Pg. 7, section 4.2: “The training procedure is used to find the weights of the hidden and output layers H and O.” The model is trained to find the optimal weights of the hidden and output layers (i.e. optimize parameters of the machine learning model). Since the model is trained on the user’s private dataset, its learned parameters will necessarily reflect the user-specific features.)
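For illustration only, a minimal gradient-descent sketch in Python showing why parameters fit to a user's data come to reflect that user's features; the linear model and data are hypothetical, not Badawi's architecture:

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(32, 4))              # user-specific input features
y = X @ np.array([1.0, -2.0, 0.5, 3.0])   # hypothetical targets
w, lr = np.zeros(4), 0.05

# Gradient descent drives w toward the parameters that best fit this user's
# data, so the trained parameters reflect the user-specific features.
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(X)
    w -= lr * grad
print(np.round(w, 2))                     # close to [1.0, -2.0, 0.5, 3.0]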
Claims 11-14 and 16 are system claims containing substantially the same elements as method claims 1-4 and 6, respectively. Badawi and Khaleghi teach the elements of claims 1-4 and 6, as shown above.
Badawi also teaches A system for training a machine learning model, comprising: a processor; and a memory, coupled to the processor, configured to store executable instructions that, when executed by the processor, cause the processor to perform operations including: (Examiner notes that this limitation is interpreted as implementation of the disclosed method in a generic computing environment. Pg. 9, section 5.2: “PrivFT is implemented in two libraries on two different hardware platforms. The first implementation uses Microsoft SEAL v3.3.0 and runs on CPU. The second implementation utilizes our GPU implementation of the CKKS scheme - described in the previous section and runs on NVIDIA-enabled GPUs.”)
Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Badawi in view of Khaleghi and further in view of Yao et al. (hereinafter Yao), “Dynamic Word Embeddings for Evolving Semantic Discovery”.
Regarding Claim 5, Badawi and Khaleghi teach The computer-implemented method of claim 3, as shown above.
Badawi also teaches wherein training the machine learning model comprises [dynamic] embedding of the encrypted bytes and performing one or more layers of linear or nonlinear neural applications. (Pg. 6, section 4.1: “we require the client to encode her text r_1, …, r_w into a 1-hot vector (v)… The client encrypts v and sends it along with the number of words w (in plaintext) to the server. By doing so, the server can compute the embedded representation h (Step 2) by a vector-matrix multiplication (v · H) and a plaintext multiplication (HMulPlain) of the factor 1/w. We assume here that the embedding vectors in H correspond to the words in the same order as indexed in the dictionary… To compute the class scores s, a slightly different approach is used to multiply h with the output matrix O. Since h is non-packed, we multiply each component h_i by the rows of O to generate n packed ciphertexts. This step requires n HMulPlain operations. The resultant ciphertexts are summed to generate a packed ciphertext encrypting s, which is communicated back to the client who can decrypt and find the best class by evaluating the softmax function in plaintext.” Multiplying the encrypted input v by the embedding matrix H to derive the embedded representation h amounts to embedding of the encrypted bytes, and multiplying the embedded representation h by the output layer matrix O amounts to performing a layer of linear neural applications.)
Badawi and Khaleghi do not appear to explicitly disclose wherein training the machine learning model comprises dynamic embedding.
However, Yao teaches wherein training the machine learning model comprises dynamic embedding (Pg. 1, abstract: “In this paper, we develop a dynamic statistical model to learn time-aware word vector representation. We propose a model that simultaneously learns time-aware embeddings and solves the resulting ‘alignment problem’.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Badawi, Khaleghi, and Yao. Badawi teaches homomorphically encrypted machine learning inference and training for text classification. Khaleghi teaches controlling access to sensitive data using cryptographic keys generated using a password-based key derivation function. Yao teaches learning dynamic, time-aware word embeddings to model the evolution of word associations and meanings over time. One of ordinary skill would have motivation to combine Badawi, Khaleghi, and Yao because “understanding and tracking word evolution is useful for time-aware knowledge extraction tasks (e.g., public sentiment analysis), and other applications in text mining” (Yao, pg. 1, section 1).
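For illustration only, the simplest form of time-aware embedding keeps one embedding matrix per time slice, so a word's vector may drift across slices; the Python sketch below shows the general idea only, as Yao's model additionally learns the slices jointly and aligns them:

import numpy as np

vocab_size, dim, num_slices = 100, 8, 3
rng = np.random.default_rng(0)

# One embedding matrix per time slice; each slice drifts slightly from the last.
U = [rng.normal(size=(vocab_size, dim))]
for _ in range(num_slices - 1):
    U.append(U[-1] + 0.1 * rng.normal(size=(vocab_size, dim)))

word = 42
for t in range(num_slices):
    print(f"slice {t}:", np.round(U[t][word][:3], 2))  # the same word's vector shifts over time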