Prosecution Insights
Last updated: April 19, 2026
Application No. 18/194,336

METHOD AND APPARATUS FOR JOINT TRAINING LOGISTIC REGRESSION MODEL

Status: Non-Final OA (§101, §102, §103)
Filed: Mar 31, 2023
Examiner: AHMED, SYED RAYHAN
Art Unit: 2126
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Alipay (Hangzhou) Information Technology Co., Ltd.
OA Round: 1 (Non-Final)
Grant Probability: 71% (Favorable)
OA Rounds: 1-2
To Grant: 4y 4m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 71%, above average (5 granted / 7 resolved; +16.4% vs TC avg)
Interview Lift: +50.0% among resolved cases with interview (strong)
Typical Timeline: 4y 4m avg prosecution; 32 applications currently pending
Career History: 39 total applications across all art units

Statute-Specific Performance

§101: 32.6% (-7.4% vs TC avg)
§102: 6.7% (-33.3% vs TC avg)
§103: 50.0% (+10.0% vs TC avg)
§112: 9.4% (-30.6% vs TC avg)
Based on career data from 7 resolved cases; TC averages are estimates.

Office Action

Rejections under §101, §102, and §103
DETAILED ACTION

This Office Action is sent in response to the Applicant’s Communication received on 03/31/2023 for application number 18/194,336. The Office hereby acknowledges receipt of the following, which have been placed of record in the file: Specification, Drawings, Abstract, Oath/Declaration, IDS, and Claims. Claims 1-20 are amended. Claims 1-20 are pending.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Claims 1-9 are directed to a method. Claims 10-16 are directed to a non-transitory computer-readable medium. Claims 17-20 are directed to a system. Therefore, all claims are directed to one of the four statutory categories of patent eligible subject matter.

Claim 1

Step 2A Prong 1: Claim 1 recites: “performing, by a first party of two parties, masking on three first-party fragments corresponding to three types of training data for a logistic regression model joint training by, respectively, using first fragments of three random numbers in a first fragment of a random array to obtain three first mask fragments;” Performing masking on three first-party fragments by, respectively, using first fragments of three random numbers in a first fragment of a random array to obtain three first mask fragments is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.
“sending, by the first party of two parties, the three first mask fragments to a second party;” Sending the three first mask fragments to a second party is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

“constructing, by the first party of two parties, three pieces of mask data corresponding to the three types of training data by using the three first mask fragments and three second mask fragments received from the second party;” Constructing three pieces of mask data by using the three first mask fragments and three second mask fragments received from the second party is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

“performing, by the first party of two parties, a first calculation based on the three pieces of mask data and the first fragment of the random array to obtain a first gradient fragment for updating the first-party fragment of the model parameter, wherein the first calculation is determined based on a Taylor expansion of a gradient calculation of a logistic regression model;” Performing a first calculation based on the three pieces of mask data and the first fragment of the random array to obtain a first gradient fragment for updating the first-party fragment of the model parameter merely uses textual replacements for particular equations, and is therefore a mathematical concept.

Step 2A Prong Two: This judicial exception is not integrated into a practical application because the additional elements are as follows:

“A computer-implemented method;” This limitation amounts to adding the words “apply it” (or an equivalent) to the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“wherein the logistic regression model joint training comprises the three types of training data: a sample characteristic, a sample label, and a model parameter, and wherein each of the three types of training data is split into fragments that are distributed between the two parties;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)).

“wherein the first fragment of the random array is a fragment, sent by a third party to the first party, of two-party fragments that are obtained by splitting values in the random array generated by the third party;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)).

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements are as follows:

“A computer-implemented method;” This limitation amounts to adding the words “apply it” (or an equivalent) to the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer as a tool to perform an abstract idea (MPEP 2106.05(f)), which cannot provide an inventive concept.

“wherein the logistic regression model joint training comprises the three types of training data: a sample characteristic, a sample label, and a model parameter, and wherein each of the three types of training data is split into fragments that are distributed between the two parties;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)) and cannot provide an inventive concept.
“wherein the first fragment of the random array is a fragment, sent by a third party to the first party, of two-party fragments that are obtained by splitting values in the random array generated by the third party;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)) and cannot provide an inventive concept.

Even when considered in combination, these additional elements represent mere instructions to apply an exception and therefore do not provide an inventive concept. The claim is ineligible.

Claim 2

Step 2A Prong 1: Claim 2 recites: “before obtaining the three first mask fragments: splitting the sample characteristic into a corresponding first-party fragment and a corresponding second-party fragment;” Splitting the sample characteristic into a corresponding first-party fragment and a corresponding second-party fragment is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

“sending the corresponding second-party fragment to the second party;” Sending the corresponding second-party fragment to the second party is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

Step 2A Prong Two: This judicial exception is not integrated into a practical application because the additional elements are as follows:

“the first party holds the sample characteristic and the second party holds the sample label;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)).
“using secret sharing technology;” This limitation amounts to adding the words “apply it” (or an equivalent) to the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).

“receiving, from the second party, a first-party fragment obtained by splitting the sample label;” This is mere data gathering recited at a high level of generality, and thus insignificant extra-solution activity (MPEP 2106.05(g)).

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements are as follows:

“the first party holds the sample characteristic and the second party holds the sample label;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)) and cannot provide an inventive concept.

“using secret sharing technology;” This limitation amounts to adding the words “apply it” (or an equivalent) to the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer as a tool to perform an abstract idea (MPEP 2106.05(f)), which cannot provide an inventive concept.

“receiving, from the second party, a first-party fragment obtained by splitting the sample label;” This is mere data gathering recited at a high level of generality, and thus insignificant extra-solution activity. See MPEP 2106.05(g). The additional element of “receiving” does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. As discussed above with respect to integration of the abstract idea into a practical application, the receiving step amounts to no more than mere data gathering.
This element amounts to receiving data over a network, which is well-understood, routine, conventional activity. See MPEP 2106.05(d), subsection II(i). This cannot provide an inventive concept.

Even when considered in combination, these additional elements represent mere instructions to apply an exception and therefore do not provide an inventive concept. The claim is ineligible.

Claim 3

Step 2A Prong 1: Claim 3 recites: “before obtaining the three first mask fragments: after initializing, as an initialized model parameter, the model parameter: splitting the model parameter into a corresponding first-party fragment and a corresponding second-party fragment;” Splitting the model parameter into a corresponding first-party fragment and a corresponding second-party fragment is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

“sending the corresponding second-party fragment to the second party;” Sending the corresponding second-party fragment to the second party is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

Step 2A Prong Two: This judicial exception is not integrated into a practical application because the additional elements are as follows:

“receiving, from the second party, a first-party fragment obtained by splitting the initialized model parameter by using the secret sharing technology;” This is mere data gathering recited at a high level of generality, and thus insignificant extra-solution activity (MPEP 2106.05(g)).
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements are as follows:

“receiving, from the second party, a first-party fragment obtained by splitting the initialized model parameter by using the secret sharing technology;” This is mere data gathering recited at a high level of generality, and thus insignificant extra-solution activity. See MPEP 2106.05(g). The additional element of “receiving” does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. As discussed above with respect to integration of the abstract idea into a practical application, the receiving step amounts to no more than mere data gathering. This element amounts to receiving data over a network, which is well-understood, routine, conventional activity. See MPEP 2106.05(d), subsection II(i). This cannot provide an inventive concept.

Even when considered in combination, these additional elements represent mere instructions to apply an exception and therefore do not provide an inventive concept. The claim is ineligible.

Claim 4

Step 2A Prong 1: Claim 4 recites: “for any type of training data, performing masking on a first-party fragment of the type of training data by using a first fragment of a random number having a same dimension as the type of training data to obtain a corresponding first mask fragment;” Performing masking on a first-party fragment of the type of training data by using a first fragment of a random number is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

Step 2A Prong Two and Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d)(I)), failing step 2A prong 2. The claim is ineligible.

Claim 5

Step 2A Prong 1: Claim 5 recites: “for any type of training data, constructing corresponding mask data by using a first mask fragment and a second mask fragment of the type of training data;” Constructing corresponding mask data by using a first mask fragment and a second mask fragment is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

Step 2A Prong Two and Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B. Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d)(I)), failing step 2A prong 2. The claim is ineligible.

Claim 6

Step 2A Prong 1: Claim 6 recites: “after constructing the three pieces of mask data corresponding to the three types of training data and before obtaining the first gradient fragment: determining a first product mask fragment corresponding to a product result of the second random number and the characteristic mask data based on a first fragment of the second random number, the characteristic mask data, and a first fragment of the fourth random number;” Determining a first product mask fragment is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

“sending the first product mask fragment to the second party;” Sending the first product mask fragment to the second party is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.
“constructing product mask data corresponding to the product result by using the first product mask fragment and a second product mask fragment corresponding to the product result received from the second party;” Constructing product mask data corresponding to the product result by using the first product mask fragment and a second product mask fragment is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

“performing, by the first party of two parties, a first calculation based on the three pieces of mask data and the first fragment of the random array comprises: further performing the first calculation based on the product mask data;” Performing the first calculation based on the product mask data merely uses textual replacements for particular equations, and is therefore a mathematical concept.

Step 2A Prong Two: This judicial exception is not integrated into a practical application because the additional elements are as follows:

“the random array further comprises a fourth random number; the three random numbers comprise a second random number corresponding to the model parameter; the three pieces of mask data comprise characteristic mask data corresponding to the sample characteristic;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)).
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements are as follows:

“the random array further comprises a fourth random number; the three random numbers comprise a second random number corresponding to the model parameter; the three pieces of mask data comprise characteristic mask data corresponding to the sample characteristic;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)) and cannot provide an inventive concept.

Even when considered in combination, these additional elements represent mere instructions to apply an exception and therefore do not provide an inventive concept. The claim is ineligible.

Claim 7

Step 2A Prong 1: Claim 7 recites: “the plurality of additional values are values obtained by the third party by performing an operation based on the three random numbers;” Performing an operation based on the three random numbers merely uses textual replacements for particular equations, and is therefore a mathematical concept.

“performing a first calculation based on the three pieces of mask data and the first fragment of the random array to obtain a first gradient fragment comprises: calculating gradient mask data corresponding to a training gradient based on the three pieces of mask data;” Calculating gradient mask data corresponding to a training gradient based on the three pieces of mask data merely uses textual replacements for particular equations, and is therefore a mathematical concept.
“calculating a first removal fragment for a mask in the gradient mask data based on the three pieces of mask data, the first fragments of three random numbers, and a first fragment of the plurality of additional values;” Calculating a first removal fragment for a mask in the gradient mask data based on the three pieces of mask data, the first fragments of three random numbers, and a first fragment of the plurality of additional values merely uses textual replacements for particular equations, and is therefore a mathematical concept.

“determining the first removal fragment as the first gradient fragment;” Determining the first removal fragment as the first gradient fragment is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

Step 2A Prong Two: This judicial exception is not integrated into a practical application because the additional elements are as follows:

“the random array further comprises a plurality of additional values;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)).

“performing de-masking on the gradient mask data by using the first removal fragment to obtain the first gradient fragment;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)).

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements are as follows:

“the random array further comprises a plurality of additional values;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception.
This does not amount to significantly more than the exception itself (MPEP 2106.05(h)) and cannot provide an inventive concept.

“performing de-masking on the gradient mask data by using the first removal fragment to obtain the first gradient fragment;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)) and cannot provide an inventive concept.

Even when considered in combination, these additional elements represent mere instructions to apply an exception and therefore do not provide an inventive concept. The claim is ineligible.

Claim 8

Step 2A Prong 1: Claim 8 recites: “after obtaining the first gradient fragment: subtracting a product of a predetermined learning rate and the first gradient fragment from the first-party fragment of the model parameter as an updated first-party fragment of the model parameter;” Subtracting a product of a predetermined learning rate and the first gradient fragment from the first-party fragment of the model parameter as an updated first-party fragment of the model parameter merely uses textual replacements for particular equations, and is therefore a mathematical concept.

Step 2A Prong Two and Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B. Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d)(I)), failing step 2A prong 2. The claim is ineligible.
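For orientation, the fragment, masking, and mask-data operations recited in claims 1-8 follow the usual additive-secret-sharing pattern: a value is split into two random fragments that sum to it, and a party masks its fragment by adding its fragment of a shared random number. A minimal Python sketch of that pattern (toy ring, hypothetical helper names; this is an illustration of the general technique, not the applicant's implementation):

```python
import secrets

MOD = 2**64  # toy ring Z_{2^64}; a real protocol fixes the ring size

def split(value):
    """Split a value into two additive fragments that sum to value mod MOD."""
    first = secrets.randbelow(MOD)
    second = (value - first) % MOD
    return first, second

def mask(fragment, random_fragment):
    """Mask a party's fragment by adding its fragment of a random number."""
    return (fragment + random_fragment) % MOD

# A value x is shared between the two parties; a random number r from the
# random array is likewise shared.  Each party masks its own fragment,
# the mask fragments are exchanged, and their sum reconstructs the masked
# value x + r without revealing either party's underlying fragment.
x = 1234
x1, x2 = split(x)        # first-party and second-party fragments of x
r = 999
r1, r2 = split(r)        # fragments of a random number from the array
m1 = mask(x1, r1)        # first mask fragment (sent to the second party)
m2 = mask(x2, r2)        # second mask fragment (received from the second party)
mask_data = (m1 + m2) % MOD
assert mask_data == (x + r) % MOD  # mask data equals value plus random mask
```

Note that `mask_data` reveals only `x + r`; because `r` is uniformly random, neither party learns `x` itself, which is the point of the masking steps the claims recite.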
Claim 9

Step 2A Prong 1: Claim 9 recites: “performing, by a first party of two parties, masking on three first-party fragments corresponding to three types of training data for a logistic regression model joint training by, respectively, using first fragments of three random numbers in a first fragment of a random array to obtain three first mask fragments;” Performing masking on three first-party fragments by, respectively, using first fragments of three random numbers in a first fragment of a random array to obtain three first mask fragments is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

“sending, by the first party of two parties, the three first mask fragments to a second party;” Sending the three first mask fragments to a second party is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

“constructing, by the first party of two parties, three pieces of mask data corresponding to the three types of training data by using the three first mask fragments and three second mask fragments received from the second party;” Constructing three pieces of mask data by using the three first mask fragments and three second mask fragments received from the second party is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.
“performing, by the first party of two parties, a first calculation based on the three pieces of mask data and the first fragment of the random array to obtain a first gradient fragment for updating the first-party fragment of the model parameter, wherein the first calculation is determined based on a Taylor expansion of a gradient calculation of a logistic regression model;” Performing a first calculation based on the three pieces of mask data and the first fragment of the random array to obtain a first gradient fragment for updating the first-party fragment of the model parameter merely uses textual replacements for particular equations, and is therefore a mathematical concept.

Step 2A Prong Two: This judicial exception is not integrated into a practical application because the additional elements are as follows:

“A non-transitory, computer-readable medium storing one or more instructions executable by a computer system to perform operations;” This limitation amounts to adding the words “apply it” (or an equivalent) to the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).

“wherein the logistic regression model joint training comprises the three types of training data: a sample characteristic, a sample label, and a model parameter, and wherein each of the three types of training data is split into fragments that are distributed between the two parties;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)).
“wherein the first fragment of the random array is a fragment, sent by a third party to the first party, of two-party fragments that are obtained by splitting values in the random array generated by the third party;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)).

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements are as follows:

“A non-transitory, computer-readable medium storing one or more instructions executable by a computer system to perform operations;” This limitation amounts to adding the words “apply it” (or an equivalent) to the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer as a tool to perform an abstract idea (MPEP 2106.05(f)), which cannot provide an inventive concept.

“wherein the logistic regression model joint training comprises the three types of training data: a sample characteristic, a sample label, and a model parameter, and wherein each of the three types of training data is split into fragments that are distributed between the two parties;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)) and cannot provide an inventive concept.

“wherein the first fragment of the random array is a fragment, sent by a third party to the first party, of two-party fragments that are obtained by splitting values in the random array generated by the third party;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception.
This does not amount to significantly more than the exception itself (MPEP 2106.05(h)) and cannot provide an inventive concept.

Even when considered in combination, these additional elements represent mere instructions to apply an exception and therefore do not provide an inventive concept. The claim is ineligible.

Claims 10-16 are non-transitory computer-readable medium claims that recite limitations similar to those of method claims 2-8, respectively. Therefore, the rejection of claims 10-16 follows the same rationale as the rejection of claims 2-8, respectively.

Claim 17

Step 2A Prong 1: Claim 17 recites: “performing, by a first party of two parties, masking on three first-party fragments corresponding to three types of training data for a logistic regression model joint training by, respectively, using first fragments of three random numbers in a first fragment of a random array to obtain three first mask fragments;” Performing masking on three first-party fragments by, respectively, using first fragments of three random numbers in a first fragment of a random array to obtain three first mask fragments is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

“sending, by the first party of two parties, the three first mask fragments to a second party;” Sending the three first mask fragments to a second party is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.

“constructing, by the first party of two parties, three pieces of mask data corresponding to the three types of training data by using the three first mask fragments and three second mask fragments received from the second party;” Constructing three pieces of mask data by using the three first mask fragments and three second mask fragments received from the second party is an action that can be performed mentally with the aid of pen and paper, and is therefore a mental process.
“performing, by the first party of two parties, a first calculation based on the three pieces of mask data and the first fragment of the random array to obtain a first gradient fragment for updating the first-party fragment of the model parameter, wherein the first calculation is determined based on a Taylor expansion of a gradient calculation of a logistic regression model;” Performing a first calculation based on the three pieces of mask data and the first fragment of the random array to obtain a first gradient fragment for updating the first-party fragment of the model parameter merely uses textual replacements for particular equations, and is therefore a mathematical concept.

Step 2A Prong Two: This judicial exception is not integrated into a practical application because the additional elements are as follows:

“A computer-implemented system, comprising: one or more computers; and one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions that, when executed by the one or more computers, perform one or more operations;” This limitation amounts to adding the words “apply it” (or an equivalent) to the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).

“wherein the logistic regression model joint training comprises the three types of training data: a sample characteristic, a sample label, and a model parameter, and wherein each of the three types of training data is split into fragments that are distributed between the two parties;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)).
“wherein the first fragment of the random array is a fragment, sent by a third party to the first party, of two-party fragments that are obtained by splitting values in the random array generated by the third party;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)).

Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements are as follows:

“A computer-implemented system, comprising: one or more computers; and one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions that, when executed by the one or more computers, perform one or more operations;” This limitation amounts to adding the words “apply it” (or an equivalent) to the judicial exception, mere instructions to implement an abstract idea on a computer, or mere use of a computer as a tool to perform an abstract idea (MPEP 2106.05(f)), which cannot provide an inventive concept.

“wherein the logistic regression model joint training comprises the three types of training data: a sample characteristic, a sample label, and a model parameter, and wherein each of the three types of training data is split into fragments that are distributed between the two parties;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)) and cannot provide an inventive concept.
“wherein the first fragment of the random array is a fragment, sent by a third party to the first party, of two-party fragments that are obtained by splitting values in the random array generated by the third party;” The limitation amounts to merely indicating a field of use or technological environment in which to apply a judicial exception. This does not amount to significantly more than the exception itself (MPEP 2106.05(h)) and cannot provide an inventive concept. Even when considered in combination, these additional elements represent mere instructions to apply an exception and therefore do not provide an inventive concept. The claim is ineligible. Claims 18-20 are system claims that recite similar limitations to method claims 2-4, respectively. Therefore, the rejections of claims 18-20 follow the same rationale as the rejections of claims 2-4, respectively. Claim Rejections - 35 USC § 102 The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention. (a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention. Claim(s) 1, 4-9, 12-17 and 20 are rejected under 35 U.S.C. 102(a)(1) and 102(a)(2) as being anticipated by Mohassel et al. (WO 2018174873 A1), hereinafter M1.
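The claimed arrangement in which a third party generates a random array and splits its values into two-party fragments is, in substance, additive secret sharing. A minimal sketch under that reading, with all names, the ring size, and the seed chosen purely for illustration (not taken from the application or from M1):

```python
import random

def third_party_fragments(length, l=32, seed=7):
    """Trusted third party: draw a random array and additively split each
    value into two-party fragments mod 2^l (illustrative parameters)."""
    Q = 2 ** l
    rng = random.Random(seed)
    array = [rng.randrange(Q) for _ in range(length)]
    frag1 = [rng.randrange(Q) for _ in range(length)]      # sent to the first party
    frag2 = [(a - f) % Q for a, f in zip(array, frag1)]    # sent to the second party
    return array, frag1, frag2

array, f1, f2 = third_party_fragments(3)
# Neither fragment alone determines the array; summed mod 2^l they reconstruct it.
assert [(a + b) % 2**32 for a, b in zip(f1, f2)] == array
```

Each party thus holds a uniformly random fragment, which is what lets the later masking steps operate on fragments without revealing the underlying training data.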
Regarding claim 1, M1 teaches, A computer-implemented method [Para 0013, Embodiments of the present invention provide methods, apparatuses, and systems for implementing privacy-preserving machine learning], comprising: performing, by a first party of two parties [Para 0046, a set of clients C1, ..., Cm want to train various models on their joint data. No assumptions are made on how the data is distributed among the clients. In particular, the data can be horizontally or vertically partitioned, or be secret-shared among the clients, e.g., as part of a previous computation. Thus, a database table can be distributed among clients before any training of a machine learning model starts. For example, some client may have some rows of the database table while another client has other rows], masking on three first-party fragments corresponding to three types of training data for a logistic regression model joint training by, respectively, using first fragments of three random numbers in a first fragment of a random array to obtain three first mask fragments [Para 0123, Given two shared matrices (A) and (B)… we take shared matrices (U), (V), (Z), where each element in U and V is uniformly random in Z_{2^l}, U has the same dimension as A, V has the same dimension as B… The idea of this generalization is that each element in matrix A is always masked by the same random element in U, while it is multiplied by different elements in B in the matrix multiplication], wherein the logistic regression model joint training comprises the three types of training data: a sample characteristic (Para 0078, x), a sample label (Para 0078, y), and a model parameter (Para 0078, w; Para 0125, U) [Para 0078, The mini-batch SGD algorithm for logistic regression updates the coefficients in each iteration as follows: w := w − (1/|B|)·α·X_B^T × (f(X_B × w) − Y_B)], and wherein each of the three types of training data is split into fragments that are distributed between the two parties [Para 0046, a database table can be distributed among clients before any training of a machine learning model starts. For example, some client may have some rows of the database table while another client has other rows. This is called horizontal partitioning. Or, one client may have some columns while others may have other columns. This is referred to as vertical partitioning]; sending, by the first party of two parties, the three first mask fragments to a second party, wherein the first fragment of the random array is a fragment, sent by a third party to the first party, of two-party fragments that are obtained by splitting values in the random array generated by the third party [Para 0123, Si computes (E)i = (A)i − (U)i and (F)i = (B)i + (V)i and sends them to the other server. The idea of this generalization is that each element in matrix A is always masked by the same random element in U, while it is multiplied by different elements in B in the matrix multiplication; Para 0124, Applying the technique to linear regression, in each iteration, we assume the set of mini-batch indices B is public, and perform the update: (w) := (w) − (1/|B|)·α·MulA(X_B^T, MulA(X_B, w) − Y_B). We further observe that one data sample will be used several times in different epochs, yet it suffices to mask it by the same random multiplication triplets; Para 0156, masked multiple times by different random vectors for each inner product. These masked values are transferred between the two parties in the secure computation protocol. In particular, the overhead compared to the protocols in Section IV is for linear and logistic regressions; Para 0059, a target ideal functionality Fml for machine learning protocols involving a trusted 3rd party can be defined for a system comprising clients C1, ..., Cm and servers S0, S1. For uploading data, input xi from Ci can be stored internally at the trusted third party.
For computation, after input of function F from S0 or S1, (y1, ..., ym) = f(x1, ..., xm) can be computed, and yi sent to Ci. This step can be repeated multiple times with different functions]; constructing, by the first party of two parties, three pieces of mask data corresponding to the three types of training data by using the three first mask fragments and three second mask fragments received from the second party [Para 0125, the multiplication triplets (U), (V), (Z), (V’), (Z') are precomputed with the following property; U is an n × d matrix to mask the data X, V is a d × t matrix, each column of which is used to mask w in one iteration (forward propagation), and V’ is a B × t matrix wherein each column is used to mask the difference vector Y*-Y in one iteration (backward propagation). We then let Z[i] = U[B_i] × V[i] and Z'[i] = U[B_i]^T × V'[i] for i = 1, ..., t, where M[i] denotes the ith column of the matrix M… One will notice the re-use of U, and thus the two sets of triplets are not independent of each other, but instead share a matrix; Para 0158, privacy preserving machine learning with client-aided multiplication triplets generation]; and performing, by the first party of two parties, a first calculation based on the three pieces of mask data and the first fragment of the random array to obtain a first gradient fragment for updating the first-party fragment of the model parameter, wherein the first calculation is determined based on a Taylor expansion (Para 0163, polynomial approximation with different degrees… we select as many points on the logistic function as the degree of the polynomial) of a gradient calculation of a logistic regression model [Para 0123, we generalize the addition and multiplication operations on share values to shared matrices. Matrices are shared by applying ShrA to every element. Given two shared matrices (A) and (B), matrix addition can be computed non-interactively by letting (C)i = (A)i + (B)i for i ∈ {0,1}. To multiply two shared matrices, instead of using independent multiplication triplets (e.g., just multiplying two numbers), we take shared matrices (U), (V), (Z), where each element in U and V is uniformly random in Z_{2^l}, U has the same dimension as A, V has the same dimension as B… The idea of this generalization is that each element in matrix A is always masked by the same random element in U, while it is multiplied by different elements in B in the matrix multiplication; Para 0161, The polynomial might be close to the logistic function in certain intervals, but the tails are unbounded. If a data sample yields a very large input u to the activation function, f(u) will be far beyond the [0, 1] interval which affects accuracy of the model significantly in the backward propagation; Para 0162, computing the backward propagation can be performed in a variety of ways. For example, embodiments can use the same update function as the logistic function; Para 0163, compare the accuracy of the produced model using the new activation function and polynomial approximation with different degrees… we select as many points on the logistic function as the degree of the polynomial; Para 0197, The error term can include a gradient, which can correspond to all or part of a backward propagation term]. Regarding claim 4, M1 teaches the limitations of claim 1 including performing masking on three first-party fragments corresponding to the three types of training data by, respectively, using first fragments of three random numbers to obtain three first mask fragments (para 0123, as shown above).
M1 further teaches, for any type of training data, performing masking on a first-party fragment of the type of training data by using a first fragment of a random number (Para 0123, U and V is uniformly random in Z_{2^l}) having a same dimension as the type of training data (Para 0123, U has the same dimension as A, V has the same dimension as B) to obtain a corresponding first mask fragment [Para 0123, Embodiments can use the mini-batch and vectorization techniques discussed in Section II-A (see Equation 2). To achieve this, we generalize the addition and multiplication operations on share values to shared matrices. Matrices are shared by applying ShrA to every element. Given two shared matrices (A) and (B), matrix addition can be computed non-interactively by letting (C)i = (A)i + (B)i for i ∈ {0,1}. To multiply two shared matrices, instead of using independent multiplication triplets (e.g., just multiplying two numbers), we take shared matrices (U), (V), (Z), where each element in U and V is uniformly random in Z_{2^l}, U has the same dimension as A, V has the same dimension as B… The idea of this generalization is that each element in matrix A is always masked by the same random element in U, while it is multiplied by different elements in B in the matrix multiplication]. Regarding claim 5, M1 teaches the limitations of claim 1 including constructing three pieces of mask data corresponding to the three types of training data by using the three first mask fragments and three second mask fragments received from the second party (paras 0125 and 0158, as shown above). M1 further teaches, for any type of training data (Para 0123, shared matrices (A) and (B)), constructing corresponding mask data (Para 0123, Both servers reconstruct E and F) by using a first mask fragment and a second mask fragment of the type of training data [Para 0123, Embodiments can use the mini-batch and vectorization techniques discussed in Section II-A (see Equation 2). To achieve this, we generalize the addition and multiplication operations on share values to shared matrices. Matrices are shared by applying ShrA to every element. Given two shared matrices (A) and (B), matrix addition can be computed non-interactively by letting (C)i = (A)i + (B)i for i ∈ {0,1}. To multiply two shared matrices, instead of using independent multiplication triplets (e.g., just multiplying two numbers), we take shared matrices (U), (V), (Z), where each element in U and V is uniformly random in Z_{2^l}, U has the same dimension as A, V has the same dimension as B, and Z = U × V mod 2^l. Si computes (E)i = (A)i − (U)i and (F)i = (B)i + (V)i and sends them to the other server. Both servers reconstruct E and F]. Regarding claim 6, M1 teaches the limitations of claim 1 including performing, by the first party of two parties, a first calculation based on the three pieces of mask data and the first fragment of the random array (paras 0123 and 0163, as shown above). M1 further teaches, the random array further comprises a fourth random number (Para 0124, random matrix (U); Para 0125, re-use of U); the three random numbers comprise a second random number corresponding to the model parameter; the three pieces of mask data comprise characteristic mask data (Para 0125, V’) corresponding to the sample characteristic (Para 0124, data samples (X)) [Para 0123, we take shared matrices (U), (V), (Z), where each element in U and V is uniformly random in Z_{2^l}, U has the same dimension as A, V has the same dimension as B, and Z = U × V mod 2^l; Para 0124, Therefore, in the offline phase, one shared n × d random matrix (U) is generated to mask the data samples (X); Para 0125, the multiplication triplets (U), (V), (Z), (V’), (Z') are precomputed with the following property; U is an n × d matrix to mask the data X, V is a d × t matrix, each column of which is used to mask w in one iteration (forward propagation), and V’ is a B × t matrix wherein each column is used to
mask the difference vector Y*-Y in one iteration (backward propagation). We then let Z[i] = U[B_i] × V[i] and Z'[i] = U[B_i]^T × V'[i] for i = 1, ..., t, where M[i] denotes the ith column of the matrix M… One will notice the re-use of U, and thus the two sets of triplets are not independent of each other, but instead share a matrix]; and after constructing the three pieces of mask data (Para 0125, U, V, and V’) corresponding to the three types of training data and before obtaining the first gradient fragment: determining a first product mask fragment corresponding to a product result of the second random number and the characteristic mask data based on a first fragment of the second random number, the characteristic mask data, and a first fragment of the fourth random number (Z[i] = U[B_i] × V[i] and Z'[i] = U[B_i]^T × V'[i] for i = 1, ..., t) [Para 0125, the multiplication triplets (U), (V), (Z), (V’), (Z') are precomputed with the following property; U is an n × d matrix to mask the data X, V is a d × t matrix, each column of which is used to mask w in one iteration (forward propagation), and V’ is a B × t matrix wherein each column is used to mask the difference vector Y*-Y in one iteration (backward propagation). We then let Z[i] = U[B_i] × V[i] and Z'[i] = U[B_i]^T × V'[i] for i = 1, ..., t, where M[i] denotes the ith column of the matrix M… One will notice the re-use of U, and thus the two sets of triplets are not independent of each other, but instead share a matrix; Para 0197, The error term can include a gradient, which can correspond to all or part of a backward propagation term], and sending the first product mask fragment to the second party [Para 0123, we generalize the addition and multiplication operations on share values to shared matrices. Matrices are shared by applying ShrA to every element. Given two shared matrices (A) and (B), matrix addition can be computed non-interactively by letting (C)i = (A)i + (B)i for i ∈ {0,1}. To multiply two shared matrices, instead of using independent multiplication triplets (e.g., just multiplying two numbers), we take shared matrices (U), (V), (Z), where each element in U and V is uniformly random in Z_{2^l}, U has the same dimension as A, V has the same dimension as B, and Z = U × V mod 2^l. Si computes (E)i = (A)i − (U)i and (F)i = (B)i + (V)i and sends them to the other server]; constructing product mask data corresponding to the product result by using the first product mask fragment and a second product mask fragment corresponding to the product result received from the second party [Para 0125, the multiplication triplets (U), (V), (Z), (V’), (Z') are precomputed with the following property; U is an n × d matrix to mask the data X, V is a d × t matrix, each column of which is used to mask w in one iteration (forward propagation), and V’ is a B × t matrix wherein each column is used to mask the difference vector Y*-Y in one iteration (backward propagation). We then let Z[i] = U[B_i] × V[i] and Z'[i] = U[B_i]^T × V'[i] for i = 1, ..., t, where M[i] denotes the ith column of the matrix M… One will notice the re-use of U, and thus the two sets of triplets are not independent of each other, but instead share a matrix; Para 0156, now it is masked multiple times by different random vectors for each inner product. These masked values are transferred between the two parties in the secure computation protocol]; and further performing the first calculation based on the product mask data [Para 0156, now it is masked multiple times by different random vectors for each inner product. These masked values are transferred between the two parties in the secure computation protocol; Para 0163, compare the accuracy of the produced model using the new activation function and polynomial approximation with different degrees… we select as many points on the logistic function as the degree of the polynomial].
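The masked matrix multiplication M1 repeatedly describes in para 0123 (each server Si exchanges (E)i = (A)i − (U)i and (F)i = (B)i + (V)i, both reconstruct E and F, and the precomputed Z = U × V removes the mask) can be sketched as follows. The sign convention follows the quoted passage; the ring size and all variable names are illustrative assumptions, not taken from M1's actual protocol code:

```python
import numpy as np

l = 16
Q = 2 ** l                      # the ring Z_{2^l} from para 0123 (l chosen small here)
rng = np.random.default_rng(0)

def share(M):
    """Additively split a matrix into two server fragments mod Q."""
    M0 = rng.integers(0, Q, size=M.shape)
    return M0, (M - M0) % Q

# Shared inputs A, B and a precomputed triplet (U, V, Z) with Z = U x V mod Q
A = rng.integers(0, Q, size=(2, 3)); B = rng.integers(0, Q, size=(3, 2))
U = rng.integers(0, Q, size=A.shape); V = rng.integers(0, Q, size=B.shape)
Z = (U @ V) % Q
(A0, A1), (B0, B1) = share(A), share(B)
(U0, U1), (V0, V1), (Z0, Z1) = share(U), share(V), share(Z)

# Each server Si sends E_i = A_i - U_i and F_i = B_i + V_i; both reconstruct:
E = (A0 - U0 + A1 - U1) % Q     # E = A - U  (public, masks A)
F = (B0 + V0 + B1 + V1) % Q     # F = B + V  (public, masks B)
# Local shares of C = A x B = E.F - E.V + U.F - Z (only server 0 adds E.F):
C0 = (E @ F - E @ V0 + U0 @ F - Z0) % Q
C1 = (-(E @ V1) + U1 @ F - Z1) % Q

# Reconstructing the two shares yields the true product mod Q.
assert np.array_equal((C0 + C1) % Q, (A @ B) % Q)
```

The identity being exercised is A × B = (E + U)(F − V) = E·F − E·V + U·F − U·V, with U·V = Z supplied by the offline phase, so each server's work stays local once E and F are exchanged.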
Regarding claim 7, M1 teaches the limitations of claim 1, including performing a first calculation based on the three pieces of mask data and the first fragment of the random array to obtain a first gradient fragment (paras 0123 and 0163, as shown above). M1 further teaches, the random array further comprises a plurality of additional values, and the plurality of additional values are values obtained by the third party by performing an operation based on the three random numbers [Para 0046, a set of clients C1, ..., Cm want to train various models on their joint data. No assumptions are made on how the data is distributed among the clients. In particular, the data can be horizontally or vertically partitioned, or be secret-shared among the clients, e.g., as part of a previous computation. Thus, a database table can be distributed among clients before any training of a machine learning model starts. For example, some client may have some rows of the database table while another client has other rows; Para 0051, Each of data clients 210-212 store private data that they do not wish to share with the other data clients. In a setup phase, data clients 210-212 secret-share their private data among servers 230 and 240. Examples of secret-sharing include additive sharing, Boolean sharing, and Yao sharing, and may involve encryption. Each client can generate shares of its own private data and then send each share to one of the servers. Thus, servers 230 and 240 can collectively store all of the private data; Para 0123, Given two shared matrices (A) and (B)… we take shared matrices (U), (V), (Z), where each element in U and V is uniformly random in Z_{2^l}, U has the same dimension as A, V has the same dimension as B… The idea of this generalization is that each element in matrix A is always masked by the same random element in U, while it is multiplied by different elements in B in the matrix multiplication]; and calculating gradient mask data corresponding to a training gradient based on the three pieces of mask data [Para 0125, the multiplication triplets (U), (V), (Z), (V’), (Z') are precomputed with the following property; U is an n × d matrix to mask the data X, V is a d × t matrix, each column of which is used to mask w in one iteration (forward propagation), and V’ is a B × t matrix wherein each column is used to mask the difference vector Y*-Y in one iteration (backward propagation); Para 0197, The error term can include a gradient, which can correspond to all or part of a backward propagation term]; calculating a first removal fragment for a mask in the gradient mask data based on the three pieces of mask data, the first fragments of three random numbers, and a first fragment of the plurality of additional values [Para 0132, The protocol assumes that the data-independent shared matrices (U), (V), (Z), (V’), (Z') are already generated in an offline phase. Besides multiplication and addition of shared decimal numbers, the protocol also multiplies the coefficient vector by α/|B| in each iteration. To make this operation efficient, we set α/|B| to be a power of 2, i.e., α/|B| = 2^(−k). Then the multiplication with α/|B| can be replaced by having the parties truncate k additional bits from their shares of the coefficients]; and performing de-masking on the gradient mask data by using the first removal fragment to obtain the first gradient fragment (alternate limitation); or determining the first removal fragment as the first gradient fragment [Para 0132, The protocol assumes that the data-independent shared matrices (U), (V), (Z), (V’), (Z') are already generated in an offline phase. Besides multiplication and addition of shared decimal numbers, the protocol also multiplies the coefficient vector by α/|B| in each iteration. To make this operation efficient, we set α/|B| to be a power of 2, i.e., α/|B| = 2^(−k). Then the multiplication with α/|B| can be replaced by having the parties truncate k additional bits from their shares of the coefficients; Para 0125, the multiplication triplets (U), (V), (Z), (V’), (Z') are precomputed with the following property; U is an n × d matrix to mask the data X, V is a d × t matrix, each column of which is used to mask w in one iteration (forward propagation), and V’ is a B × t matrix wherein each column is used to mask the difference vector Y*-Y in one iteration (backward propagation); Para 0197, The error term can include a gradient, which can correspond to all or part of a backward propagation term]. Regarding claim 8, M1 teaches the limitations of claim 1. M1 further teaches, after obtaining the first gradient fragment: subtracting a product of a predetermined learning rate and the first gradient fragment from the first-party fragment of the model parameter as an updated first-party fragment of the model parameter [Para 0072, α is a learning rate… The phase to calculate the predicted output y_i* = x_i · w is called forward propagation, and the phase to calculate the change α(y_i* − y_i)x_ij is called backward propagation.
In some embodiments, all values of coefficient vector w can be updated together in a single vectorized operation; Para 0073, With a mini-batch, the update function can be expressed in a vectorized form: w := w − (1/|B|)·α·X_B^T × (X_B × w − Y_B); Para 0197, The error term can include a gradient, which can correspond to all or part of a backward propagation term]. Regarding claim 9, M1 teaches, A non-transitory, computer-readable medium storing one or more instructions executable by a computer system to perform operations [Para 0286, The software code may be stored as a series of instructions or commands on a computer readable medium for storage and/or transmission. A suitable non-transitory computer readable medium can include random access memory (RAM), a read only memory (ROM)], comprising: performing, by a first party of two parties [Para 0046, a set of clients C1, ..., Cm want to train various models on their joint data. No assumptions are made on how the data is distributed among the clients. In particular, the data can be horizontally or vertically partitioned, or be secret-shared among the clients, e.g., as part of a previous computation. Thus, a database table can be distributed among clients before any training of a machine learning model starts. For example, some client may have some rows of the database table while another client has other rows], masking on three first-party fragments corresponding to three types of training data for a logistic regression model joint training by, respectively, using first fragments of three random numbers in a first fragment of a random array to obtain three first mask fragments [Para 0123, Given two shared matrices (A) and (B)… we take shared matrices (U), (V), (Z), where each element in U and V is uniformly random in Z_{2^l}, U has the same dimension as A, V has the same dimension as B… The idea of this generalization is that each element in matrix A is always masked by the same random element in U, while it is multiplied by different elements in B in the matrix multiplication], wherein the logistic regression model joint training comprises the three types of training data: a sample characteristic (Para 0078, x), a sample label (Para 0078, y), and a model parameter (Para 0078, w; Para 0125, U) [Para 0078, The mini-batch SGD algorithm for logistic regression updates the coefficients in each iteration as follows: w := w − (1/|B|)·α·X_B^T × (f(X_B × w) − Y_B)], and wherein each of the three types of training data is split into fragments that are distributed between the two parties [Para 0046, a database table can be distributed among clients before any training of a machine learning model starts. For example, some client may have some rows of the database table while another client has other rows. This is called horizontal partitioning. Or, one client may have some columns while others may have other columns. This is referred to as vertical partitioning]; sending, by the first party of two parties, the three first mask fragments to a second party, wherein the first fragment of the random array is a fragment, sent by a third party to the first party, of two-party fragments that are obtained by splitting values in the random array generated by the third party [Para 0123, Si computes (E)i = (A)i − (U)i and (F)i = (B)i + (V)i and sends them to the other server. The idea of this generalization is that each element in matrix A is always masked by the same random element in U, while it is multiplied by different elements in B in the matrix multiplication; Para 0124, Applying the technique to linear regression, in each iteration, we assume the set of mini-batch indices B is public, and perform the update: (w) := (w) − (1/|B|)·α·MulA(X_B^T, MulA(X_B, w) − Y_B). We further observe that one data sample will be used several times in different epochs, yet it suffices to mask it by the same random multiplication triplets; Para 0156, masked multiple times by different random vectors for each inner product. These masked values are transferred between the two parties in the secure computation protocol. In particular, the overhead compared to the protocols in Section IV is for linear and logistic regressions; Para 0059, a target ideal functionality Fml for machine learning protocols involving a trusted 3rd party can be defined for a system comprising clients C1, ..., Cm and servers S0, S1. For uploading data, input xi from Ci can be stored internally at the trusted third party. For computation, after input of function F from S0 or S1, (y1, ..., ym) = f(x1, ..., xm) can be computed, and yi sent to Ci.
This step can be repeated multiple times with different functions]; constructing, by the first party of two parties, three pieces of mask data corresponding to the three types of training data by using the three first mask fragments and three second mask fragments received from the second party [Para 0125, the multiplication triplets (U), (V), (Z), (V’), (Z') are precomputed with the following property; U is an n × d matrix to mask the data X, V is a d × t matrix, each column of which is used to mask w in one iteration (forward propagation), and V’ is a B × t matrix wherein each column is used to mask the difference vector Y*-Y in one iteration (backward propagation). We then let Z[i] = U[B_i] × V[i] and Z'[i] = U[B_i]^T × V'[i] for i = 1, ..., t, where M[i] denotes the ith column of the matrix M… One will notice the re-use of U, and thus the two sets of triplets are not independent of each other, but instead share a matrix; Para 0158, privacy preserving machine learning with client-aided multiplication triplets generation]; and performing, by the first party of two parties, a first calculation based on the three pieces of mask data and the first fragment of the random array to obtain a first gradient fragment for updating the first-party fragment of the model parameter, wherein the first calculation is determined based on a Taylor expansion (Para 0163, polynomial approximation with different degrees… we select as many points on the logistic function as the degree of the polynomial) of a gradient calculation of a logistic regression model [Para 0123, we generalize the addition and multiplication operations on share values to shared matrices. Matrices are shared by applying ShrA to every element. Given two shared matrices (A) and (B), matrix addition can be computed non-interactively by letting (C)i = (A)i + (B)i for i ∈ {0,1}. To multiply two shared matrices, instead of using independent multiplication triplets (e.g., just multiplying two numbers), we take shared matrices (U), (V), (Z), where each element in U and V is uniformly random in Z_{2^l}, U has the same dimension as A, V has the same dimension as B… The idea of this generalization is that each element in matrix A is always masked by the same random element in U, while it is multiplied by different elements in B in the matrix multiplication; Para 0161, The polynomial might be close to the logistic function in certain intervals, but the tails are unbounded. If a data sample yields a very large input u to the activation function, f(u) will be far beyond the [0, 1] interval which affects accuracy of the model significantly in the backward propagation; Para 0162, computing the backward propagation can be performed in a variety of ways. For example, embodiments can use the same update function as the logistic function; Para 0163, compare the accuracy of the produced model using the new activation function and polynomial approximation with different degrees… we select as many points on the logistic function as the degree of the polynomial; Para 0197, The error term can include a gradient, which can correspond to all or part of a backward propagation term]. Claims 12-16 are non-transitory computer-readable medium claims that recite similar limitations to method claims 4-8, respectively. Therefore, the rejections of claims 12-16 follow the same rationale as the rejections of claims 4-8, respectively.
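The truncation technique quoted from M1's para 0132 (set the learning-rate factor α/|B| = 2^(−k) and have each party truncate k bits of its own share locally, instead of running a shared multiplication) can be sketched as follows. The values are fixed for reproducibility, all names are illustrative, and the reconstruction is only guaranteed to be within 1 of the true truncation (with high probability when the secret is small relative to 2^l), which matches how M1 describes the trick:

```python
def truncate_shares(z0, z1, l, k):
    """Each party locally drops k bits of its additive share of z,
    approximating multiplication by 2^(-k) on the shared value."""
    Q = 2 ** l
    t0 = z0 >> k                    # party 0 shifts its share directly
    t1 = Q - ((Q - z1) >> k)        # party 1 shifts the complement of its share
    return t0, t1

l, k = 32, 4
Q = 2 ** l
z = 197527                          # fixed-point secret, small relative to Q
z0 = 305419896                      # one share (random in the protocol; fixed here)
z1 = (z - z0) % Q                   # the other share
t0, t1 = truncate_shares(z0, z1, l, k)
# Reconstruction lands within 1 of the true truncation z >> k (here, 12345).
assert abs((t0 + t1) % Q - (z >> k)) <= 1
```

No communication is needed: each party shifts locally, which is why M1 can replace the multiplication by α/|B| with truncation of k additional bits.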
Regarding claim 17, M1 teaches, A computer-implemented system [Para 0043, a privacy-preserving system for training neural networks], comprising: one or more computers [Para 0039, The present disclosure provides techniques for efficient implementation that allows multiple client computers]; and one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions [Para 0287, The software code may be stored as a series of instructions or commands on a computer readable medium for storage and-' or transmission. A suitable non-transitory computer readable medium ca include random access memory (RAM)] that, when executed by the one or more computers, perform one or more operations, comprising: performing, by a first party of two parties [Para 0046, a set of clients C.sub.t, ..., C.sub.w want to train various models on their joint data. No assumptions are made on how the data is distributed among the clients. In particular, the data can be horizontally or vertically partitioned, or be secret-shared among the clients, e.g., as part of a previous computation. Thus, a database table can be distributed among clients before any training of a machine learning model starts. 
For example, some client may have some rows of the database table while another client has other rows], masking on three first-party fragments corresponding to three types of training data for a logistic regression model joint training by, respectively, using first fragments of three random numbers in a first fragment of a random array to obtain three first mask fragments [Para 0123, Given two shared matrices (A) and (B)… we take shared matrices (U), (V), (Z), where each element in U and V is uniformly random in Z2l, U has the same dimension as A, V has the same dimension as B… The idea of this generalization is that each element in matrix A is always masked by the same random element in U, while it is multiplied by different elements in B in the matrix multiplication], wherein the logistic regression model joint training comprises the three types of training data: a sample characteristic (Para 0078, x), a sample label (Para 0078, y), and a model parameter (Para 0078, w; Para 0125, U) [Para 0078, The mini-batch SGD algorithm for logistic regression updates the coefficients in each iteration as follows: w ≔ w - 1 B a X B T × f X B × w - Y B ], and wherein each of the three types of training data is split into fragments that are distributed between the two parties [Para 0046, a database table can be distributed among clients before any training of a machine learning model starts. For example, some client may have some rows of the database table while another client has other rows. This is called horizontal partitioning. Or, one client may have some columns while others may have other columns. 
This is referred to as vertical partitioning]; sending, by the first party of two parties, the three first mask fragments to a second party, wherein the first fragment of the random array is a fragment, sent by a third party to the first party, of two-party fragments that are obtained by splitting values in the random array generated by the third party [Para 0123, S_i computes (E)_i = (A)_i − (U)_i and (F)_i = (B)_i + (V)_i and sends them to the other server. The idea of this generalization is that each element in matrix A is always masked by the same random element in U, while it is multiplied by different elements in B in the matrix multiplication; Para 0124, Applying the technique to linear regression, in each iteration, we assume the set of mini-batch indices B is public, and perform the update: (w) := (w) − (α/|B|) · MulA((X_B)^T, MulA((X_B), (w)) − (Y_B)). We further observe that one data sample will be used several times in different epochs, yet it suffices to mask it by the same random multiplication triplets; Para 0156, masked multiple times by different random vectors for each inner product. These masked values are transferred between the two parties in the secure computation protocol. In particular, the overhead compared to the protocols in Section IV is for linear and logistic regressions; Para 0059, a target ideal functionality F_ml for machine learning protocols involving a trusted 3rd party can be defined for a system comprising clients C_1, ..., C_m and servers S_0, S_1. For uploading data, input x_i from C_i can be stored internally at the trusted third party. For computation, after input of function f from S_0 or S_1, (y_1, ..., y_m) = f(x_1, ..., x_m) can be computed, and y_i sent to C_i.
This step can be repeated multiple times with different functions]; constructing, by the first party of two parties, three pieces of mask data corresponding to the three types of training data by using the three first mask fragments and three second mask fragments received from the second party [Para 0125, the multiplication triplets (U), (V), (Z), (V'), (Z') are precomputed with the following property: U is an n × d matrix to mask the data X, V is a d × t matrix, each column of which is used to mask w in one iteration (forward propagation), and V' is a |B| × t matrix wherein each column is used to mask the difference vector Y* − Y in one iteration (backward propagation). We then let Z[i] = U[B_i] × V[i] and Z'[i] = U[B_i]^T × V'[i] for i = 1, ..., t, where M[i] denotes the ith column of the matrix M… One will notice the re-use of U, and thus the two sets of triplets are not independent of each other, but instead share a matrix; Para 0158, privacy preserving machine learning with client-aided multiplication triplets generation]; and performing, by the first party of two parties, a first calculation based on the three pieces of mask data and the first fragment of the random array to obtain a first gradient fragment for updating the first-party fragment of the model parameter, wherein the first calculation is determined based on a Taylor expansion (Para 0163, polynomial approximation with different degrees… we select as many points on the logistic function as the degree of the polynomial) of a gradient calculation of a logistic regression model [Para 0123, we generalize the addition and multiplication operations on share values to shared matrices. Matrices are shared by applying ShrA to every element. Given two shared matrices (A) and (B), matrix addition can be computed non-interactively by letting (C)_i = (A)_i + (B)_i for i ∈ {0, 1}.
To multiply two shared matrices, instead of using independent multiplication triplets (e.g., just multiplying two numbers), we take shared matrices (U), (V), (Z), where each element in U and V is uniformly random in Z_{2^l}, U has the same dimension as A, V has the same dimension as B… The idea of this generalization is that each element in matrix A is always masked by the same random element in U, while it is multiplied by different elements in B in the matrix multiplication; Para 0161, The polynomial might be close to the logistic function in certain intervals, but the tails are unbounded. If a data sample yields a very large input u to the activation function, f(u) will be far beyond the [0, 1] interval, which affects accuracy of the model significantly in the backward propagation; Para 0162, computing the backward propagation can be performed in a variety of ways. For example, embodiments can use the same update function as the logistic function; Para 0163, compare the accuracy of the produced model using the new activation function and polynomial approximation with different degrees… we select as many points on the logistic function as the degree of the polynomial; Para 0197, The error term can include a gradient, which can correspond to all or part of a backward propagation term]. Claim 20 is a system claim that recites similar limitations to method claim 4. Therefore, the rejection of claim 20 follows the same rationale as the rejection for claim 4. Claim Rejections - 35 USC § 103 The following is a quotation of 35 U.S.C.
103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows: 1. Determining the scope and contents of the prior art. 2. Ascertaining the differences between the prior art and the claims at issue. 3. Resolving the level of ordinary skill in the pertinent art. 4. Considering objective evidence present in the application indicating obviousness or nonobviousness. Claims 2, 3, 10, 11, 18, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over M1 in view of Mohassel et al. (SecureML: A System for Scalable Privacy-Preserving Machine Learning, published 2017), hereinafter M2. Regarding claim 2, M1 teaches the limitations of claim 1. M1 does not teach: the first party holds the sample characteristic and the second party holds the sample label; and, before obtaining the three first mask fragments: splitting the sample characteristic into a corresponding first-party fragment and a corresponding second-party fragment by using a secret sharing technology, and sending the corresponding second-party fragment to the second party; and receiving, from the second party, a first-party fragment obtained by splitting the sample label by using the secret sharing technology.
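The "splitting … by using a secret sharing technology" limitation describes the standard additive secret-sharing primitive. As a minimal illustrative sketch only (not the applicant's claimed scheme or M2's exact construction, which in the passage quoted below is a garbling scheme; additive sharing over Z_{2^l} is the form assumed by M1's shared-matrix arithmetic in Para 0123):

```python
import secrets

MOD = 2**64  # the ring Z_{2^l} with l = 64, as in the cited shared-matrix arithmetic

def split(value: int) -> tuple[int, int]:
    """Split a value into two additive fragments: value = f1 + f2 (mod 2^l).
    Each fragment alone is uniformly random and reveals nothing about the value."""
    f1 = secrets.randbelow(MOD)
    f2 = (value - f1) % MOD
    return f1, f2

def reconstruct(f1: int, f2: int) -> int:
    """Recombine the two fragments held by the two parties."""
    return (f1 + f2) % MOD

# The first party splits a sample characteristic and sends one fragment away;
# the second party does the same with the sample label.
frag1, frag2 = split(42)
assert reconstruct(frag1, frag2) == 42
```

Because the sharing is linear, sums of shared values can be computed by each party locally on its own fragments; only multiplication requires interaction (see the triplet-based protocol quoted for claim 3).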
M2 teaches, first party (Sect II (B), para 4, Alice) holds sample characteristic (Sect II (B), para 4, x) and second party (Sect II (B), para 4, Bob) holds sample label (Sect II (B), para 4, y); and before obtaining three first mask fragments: splitting the sample characteristic into a corresponding first-party fragment (Sect II (B), para 4, x̂) and a corresponding second-party fragment (Sect II (B), para 4, ŷ) by using a secret sharing technology (Sect II (B), para 4, garbling scheme), and sending the corresponding second-party fragment to the second party (Sect II (B), para 4, Bob obtains his encoded (garbled) input ŷ); and receiving, from the second party, a first-party fragment obtained by splitting the sample label by using the secret sharing technology (Sect II (B), para 4, We can have Alice, Bob, or both learn an output by communicating the decoding table accordingly) [Sect II (B), para 4, Given such a garbling scheme, it is possible to design a secure two-party computation protocol as follows: Alice generates a random seed σ and runs the garbling algorithm for function f to obtain a garbled circuit GC. She also encodes her input x̂ using σ and x as inputs to the encoding algorithm. Alice sends GC and x̂ to Bob. Bob obtains his encoded (garbled) input ŷ using an oblivious transfer for each bit of y. He then runs the evaluation algorithm on GC, x̂, ŷ to obtain the garbled output ẑ. We can have Alice, Bob, or both learn an output by communicating the decoding table accordingly]. M2 is analogous to the claimed invention, as both relate to privacy-preserving machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified M1's teachings to incorporate the teachings of M2 and provide a garbling scheme in order to satisfy standard security properties [M2, Sect II (B), para 3]. Regarding claim 3, M1-M2 teach the limitations of claim 2.
M1 further teaches, wherein, before obtaining the three first mask fragments: after initializing, as an initialized model parameter, the model parameter [Para 0072, The SGD algorithm works as follows: w is initialized as a vector of random values or all 0s]: splitting the model parameter (Para 0123, share values) into a corresponding first-party fragment (Para 0123, U) and a corresponding second-party fragment (Para 0123, V); and sending the corresponding second-party fragment to the second party (Para 0123, sends them to the other server) [Para 0123, we generalize the addition and multiplication operations on share values to shared matrices. Matrices are shared by applying ShrA to every element. Given two shared matrices (A) and (B), matrix addition can be computed non-interactively by letting (C)_i = (A)_i + (B)_i for i ∈ {0, 1}. To multiply two shared matrices, instead of using independent multiplication triplets (e.g., just multiplying two numbers), we take shared matrices (U), (V), (Z), where each element in U and V is uniformly random in Z_{2^l}, U has the same dimension as A, V has the same dimension as B, and Z = U × V mod 2^l. S_i computes (E)_i = (A)_i − (U)_i and (F)_i = (B)_i + (V)_i and sends them to the other server]; or (alternate limitation) receiving, from the second party, a first-party fragment obtained by splitting the initialized model parameter by using the secret sharing technology. Claims 10 and 11 are non-transitory computer-readable medium claims that recite similar limitations to method claims 2 and 3, respectively. Therefore, the rejections of claims 10 and 11 follow the same rationale as the rejections for claims 2 and 3, respectively. Claims 18 and 19 are system claims that recite similar limitations to method claims 2 and 3, respectively. Therefore, the rejections of claims 18 and 19 follow the same rationale as the rejections for claims 2 and 3, respectively.
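The masked multiplication quoted from Para 0123 is the standard Beaver multiplication-triplet protocol: each server opens its share of the masked values E = A − U and F = B − V, which reveal nothing because U and V are uniformly random, and the servers then recombine locally into shares of the product. A scalar sketch under assumed additive shares modulo 2^l (function names are illustrative, not from M1; the quoted passage masks B with +V, an equivalent sign-convention variant of the −V form used here; the matrix case applies the same idea element-wise with a shared mask U):

```python
import secrets

MOD = 2**64  # ring Z_{2^l}, l = 64

def share(x):
    """Additively share x between servers S0 and S1."""
    x0 = secrets.randbelow(MOD)
    return x0, (x - x0) % MOD

def beaver_mul(a_sh, b_sh):
    """Multiply secret-shared a and b using a precomputed triplet (u, v, z), z = u*v.
    In deployment the triplet shares come from a third party or an offline phase;
    here they are generated inline for illustration."""
    u, v = secrets.randbelow(MOD), secrets.randbelow(MOD)
    z = (u * v) % MOD
    u_sh, v_sh, z_sh = share(u), share(v), share(z)

    # Each server computes and exchanges its masked shares; e and f become public.
    e = sum((a_sh[i] - u_sh[i]) % MOD for i in (0, 1)) % MOD  # e = a - u
    f = sum((b_sh[i] - v_sh[i]) % MOD for i in (0, 1)) % MOD  # f = b - v

    # Local recombination into shares of a*b: c = z + e*v + f*u + e*f
    # (the public e*f term is added by server 0 only).
    c_sh = []
    for i in (0, 1):
        c_i = (z_sh[i] + e * v_sh[i] + f * u_sh[i] + (e * f if i == 0 else 0)) % MOD
        c_sh.append(c_i)
    return tuple(c_sh)

a_sh, b_sh = share(7), share(6)
c_sh = beaver_mul(a_sh, b_sh)
assert sum(c_sh) % MOD == 42  # shares of a*b reconstruct to 7*6
```

Correctness follows by expanding z + (a − u)v + (b − v)u + (a − u)(b − v) = ab; this secure multiplication is the building block inside the gradient update (w) := (w) − (α/|B|) · MulA((X_B)^T, MulA((X_B), (w)) − (Y_B)) quoted from Para 0124.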
Conclusion Any inquiry concerning this communication or earlier communications from the examiner should be directed to SYED RAYHAN AHMED whose telephone number is (571)270-0286. The examiner can normally be reached Mon-Fri ET. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Yi can be reached at (571) 270-7519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /SYED RAYHAN AHMED/Examiner, Art Unit 2126 /VAN C MANG/Primary Examiner, Art Unit 2126

Prosecution Timeline

Mar 31, 2023
Application Filed
Feb 06, 2026
Non-Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12450891
IMAGE CLASSIFIER COMPRISING A NON-INJECTIVE TRANSFORMATION
2y 5m to grant Granted Oct 21, 2025
Study what changed to get past this examiner. Based on the 1 most recent grant.


Prosecution Projections

1-2
Expected OA Rounds
71%
Grant Probability
99%
With Interview (+50.0%)
4y 4m
Median Time to Grant
Low
PTA Risk
Based on 7 resolved cases by this examiner. Grant probability derived from career allow rate.
