Last updated: May 29, 2026
Application No. 17/937,825
Leveraging Public Data in Training Neural Networks with Private Mirror Descent

Final Rejection: §101, §103
Filed: Oct 04, 2022
Priority: Oct 05, 2021 (provisional 63/262,129)
Examiner: PAULA, CESAR B
Art Unit: 2145
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Google LLC
OA Round: 2 (Final)
Interview: Optional
Interview lift (+7.3%) is below the 15.0% threshold. A written response is recommended. Based on 172 resolved cases, 2023–2026.
Examiner Intelligence

PAULA, CESAR B
Career Allowance Rate: 34% (58 granted / 172 resolved), -21.3% vs TC avg
Interview Lift: +7.3% (moderate), resolved cases with interview
Avg Prosecution: 4y 6m (typical timeline), 3 currently pending
Total Applications: 195 (career history, across all art units)
Statute-Specific Performance

§101: 2.0% (-38.0% vs TC avg)
§103: 83.9% (+43.9% vs TC avg)
§102: 9.8% (-30.2% vs TC avg)
§112: 2.2% (-37.8% vs TC avg)
Based on career data from 172 resolved cases
Office Action

DETAILED ACTION
This action is in response to the amendment filed on October 13, 2025. Claims 1-24 are pending and have been considered below. Claims 1 and 13 are independent claims.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Acknowledgment is made of applicant’s claim for domestic priority based on provisional application 63/262,129 filed on October 5, 2021. 

Claim Interpretation
The terms “geometry” and “reshaping” used in claim 1 are not given explicit definitions in the specification. The term “geometry” is interpreted herein to refer to the dimensionality or structure of a gradient subspace. The term “reshaping” is interpreted herein as any projection, transformation or scaling function that in any way alters the gradient subspace.

The term “mirror map” used in claim 6 is not given an explicit definition in the specification. It is interpreted herein as any transformation function that maps or projects a set of gradients from one subspace to another.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows: 
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title. 
Claims 1-24 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The analysis of the claims will follow the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (“2019 PEG”).
Claim 1 
Step 1: The claim recites a method; therefore, it is directed to the statutory category of a process. 
Step 2A Prong 1: The claim recites, inter alia: 
obtaining a set of differentially private (DP) gradients each generated based on processing corresponding private data: This limitation encompasses the mathematical concept of calculating differentially private gradients, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
obtaining a set of public gradients each generated based on processing corresponding public data: This limitation encompasses the mathematical concept of calculating gradients, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
applying mirror descent to the set of public gradients to learn a geometry of the set of public gradients: This limitation encompasses the mathematical concept of executing a mirror descent algorithm, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
reshaping the set of DP gradients to conform to the learned geometry; and…reshaped set of DP gradients to ensure a population risk guarantee for convex losses with no explicit dependence on a dimension of the machine learning model: This limitation encompasses the mathematical concept of applying a transformation to a gradient set for convex losses with no dependence on a dimension of the model, which is an evaluation practically capable of being performed in the human mind or mathematically with the assistance of pen and paper.
Step 2A Prong 2: The abstract ideas listed above are not integrated into a practical application. Specifically, the additional element, [a] computer-implemented method when executed on data processing hardware causes the data processing hardware to perform operations comprising, amounts to invoking computers or other machinery merely as tools to perform an existing process. Thus, this additional element is recited at such a high level of generality that it represents no more than mere instructions to apply the abstract idea on a computer (see MPEP § 2106.05(f)). 
The additional element, training a machine learning model based on the reshaped set of DP gradients, amounts to invoking computers or other machinery merely as tools to perform an existing process. Thus, this additional element is recited at such a high level of generality that it represents no more than mere instructions to apply the abstract idea on a computer (see MPEP § 2106.05(f)). 
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. The claim recites additional hardware and machine learning elements that are recited at such a high level of generality such that they represent no more than mere instructions to apply the abstract ideas on a computer. The additional element listed above does not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 
Claim 2 
Step 1: A process, as above. 
Step 2A Prong 1: The claim recites, inter alia: 
wherein each DP gradient in the set of DP gradients is generated by: processing…corresponding private data to generate a corresponding predicted private output: This limitation encompasses the mathematical concept of calculating a differentially private gradient from a corresponding output, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
determining a private loss function based on the corresponding predicted private output and a corresponding private ground truth; and: This limitation encompasses the mathematical concept of calculating a loss, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
adding, to a private gradient derived from the private loss function, noise to generate the DP gradient: This limitation encompasses the mathematical concept of adding noise to a gradient, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
Step 2A Prong 2: The abstract ideas listed above are not integrated into a practical application. Specifically, the additional element, using a machine learning model, amounts to invoking computers or other machinery merely as tools to perform an existing process. Thus, this additional element is recited at such a high level of generality that it represents no more than mere instructions to apply the abstract idea on a computer (see MPEP § 2106.05(f)). 
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. The claim recites the additional element of a machine learning model recited at such a high level of generality such that it represents no more than mere instructions to apply the abstract ideas on a computer. The additional element listed above does not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 
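For illustration only, the gradient-generation steps recited in claim 2 (predicted output, loss against a ground truth, gradient, added noise) can be sketched as below. This is a minimal sketch under stated assumptions, not a characterization of the applicant's specification: the linear model, the squared-error loss, the norm clipping step, and the names dp_gradient, clip_norm and noise_std are all hypothetical choices made for the example.

```python
import math
import random


def dp_gradient(weights, features, label, clip_norm=1.0, noise_std=0.5, rng=None):
    """Return a noised gradient for one private example (illustrative only)."""
    rng = rng or random.Random()
    # Forward pass: predicted private output of an assumed linear model.
    prediction = sum(w * x for w, x in zip(weights, features))
    # Assumed loss 0.5 * (prediction - label)^2, whose gradient is residual * x.
    residual = prediction - label
    grad = [residual * x for x in features]
    # Assumption: clip the gradient norm so the noise scale can be calibrated.
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in grad]
    # Add Gaussian noise to the private gradient to obtain the DP gradient.
    return [g + rng.gauss(0.0, noise_std * clip_norm) for g in clipped]
```

Setting noise_std to 0.0 returns the clipped gradient unchanged, which is a convenient way to check the arithmetic by hand.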

Claim 3 
Step 1: A process, as above. 
Step 2A Prong 1: The claim recites: 
wherein the private loss function is convex and L-Lipschitz: This limitation encompasses the mathematical concept of executing a convex and L-Lipschitz loss function, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
Step 2A Prong 2: There are no additional elements in the claim that integrate the abstract idea into a practical application, and the claim is thus directed to the abstract idea. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. Therefore, the claim is subject-matter ineligible. 
Claim 4 
Step 1: A process, as above. 
Step 2A Prong 1: The claim recites: 
wherein the private data and the public data are derived from a same distribution of sources: This limitation encompasses the mental process of drawing private and public data from the same source, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
Step 2A Prong 2: There are no additional elements in the claim that integrate the abstract idea into a practical application, and the claim is thus directed to the abstract idea. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. Therefore, the claim is subject-matter ineligible. 
Claim 5 
Step 1: A process, as above. 
Step 2A Prong 1: The claim recites, inter alia: 
wherein each public gradient in the set of public gradients is generated by: processing…corresponding public data to generate a corresponding predicted public output: This limitation encompasses the mathematical concept of calculating a gradient from a corresponding output, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
determining a public loss function based on the corresponding predicted public output and a corresponding public ground truth; and: This limitation encompasses the mathematical concept of calculating a loss, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
deriving the public gradient from the public loss function: This limitation encompasses the mathematical concept of calculating a gradient from a loss function, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 

Step 2A Prong 2: The abstract ideas listed above are not integrated into a practical application. Specifically, the additional element, using a machine learning model, amounts to invoking computers or other machinery merely as tools to perform an existing process. Thus, this additional element is recited at such a high level of generality that it represents no more than mere instructions to apply the abstract idea on a computer (see MPEP § 2106.05(f)). 
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. The claim recites the additional element of a machine learning model recited at such a high level of generality such that it represents no more than mere instructions to apply the abstract ideas on a computer. The additional element listed above does not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 
Claim 6 
Step 1: A process, as above. 
Step 2A Prong 1: The claim recites: 
wherein applying mirror descent to the set of public gradients to learn the geometry for the set of DP gradients comprises applying mirror descent by using the public gradients derived from the public loss function as a mirror map to learn the geometry for the set of DP gradients: This limitation encompasses the mathematical concept of executing a mirror descent algorithm, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
Step 2A Prong 2: There are no additional elements in the claim that integrate the abstract idea into a practical application, and the claim is thus directed to the abstract idea. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. Therefore, the claim is subject-matter ineligible. 
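For illustration only, one mirror descent update of the general kind referenced in claim 6 can be sketched as below. The sketch assumes, hypothetically, that the mirror map is a diagonal quadratic psi(theta) = 0.5 * sum(h_i * theta_i^2) whose positive curvature values h_i would be estimated from public data; the function name mirror_descent_step and its parameters are not taken from the application.

```python
def mirror_descent_step(theta, dp_grad, public_curvature, lr=0.1):
    """One mirror descent step with an assumed diagonal quadratic mirror map."""
    # Map to the dual space: grad psi(theta) = h * theta (elementwise).
    dual = [h * t for h, t in zip(public_curvature, theta)]
    # Take the gradient step in the dual space using the (noisy) DP gradient.
    dual = [d - lr * g for d, g in zip(dual, dp_grad)]
    # Map back to the primal space by inverting grad psi.
    return [d / h for d, h in zip(dual, public_curvature)]
```

Under this assumed mirror map the update reduces to theta_i - lr * g_i / h_i, so coordinates with larger public curvature move less for the same DP gradient, which is one way to read “reshaping” under the interpretation given above.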
Claim 7 
Step 1: A process, as above. 
Step 2A Prong 1: The claim recites: 
wherein the public loss function is strongly convex: This limitation encompasses the mathematical concept of executing a strongly convex loss function, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
Step 2A Prong 2: There are no additional elements in the claim that integrate the abstract idea into a practical application, and the claim is thus directed to the abstract idea. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. Therefore, the claim is subject-matter ineligible. 
Claim 8 
Step 1: A process, as above. 
Step 2A Prong 1: The claim inherits the abstract ideas of claim 1, from which it depends. 
Step 2A Prong 2: The abstract ideas inherited from claim 1 are not integrated into a practical application. Specifically, the additional elements, wherein: the data processing hardware resides on a central server; and the set of DP gradients and the set of public gradients are stored in a central repository residing on the central server, amount to invoking computers or other machinery merely as tools to perform an existing process. Thus, these additional elements are recited at such a high level of generality that they represent no more than mere instructions to apply the abstract ideas on a computer (see MPEP § 2106.05(f)).
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. The claim recites additional elements such as processing hardware, a central repository and a central server that are recited at such a high level of generality such that they represent no more than mere instructions to apply the abstract ideas on a computer. The additional elements listed above do not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 
Claim 9 
Step 1: A process, as above. 
Step 2A Prong 1: The claim inherits the abstract ideas of claim 1, from which it depends. 
Step 2A Prong 2: The abstract ideas inherited from claim 1 are not integrated into a practical application. Specifically, the additional elements, wherein: the data processing hardware resides on a remote system…and each DP gradient in the set of DP gradients is generated locally at a respective one of the one or more client devices, amount to invoking computers or other machinery merely as tools to perform an existing process. Thus, these additional elements are recited at such a high level of generality that they represent no more than mere instructions to apply the abstract ideas on a computer (see MPEP § 2106.05(f)). 
The additional element, obtaining the set of DP gradients comprises receiving the set of DP gradients from one or more client devices via federated learning without receiving any of the corresponding private data; and, amounts to no more than mere data gathering, which is insignificant extra-solution activity that does not amount to an inventive concept (see MPEP § 2106.05(g) “Whether the limitation amounts to necessary data gathering and outputting, (i.e., all uses of the recited judicial exception require such data gathering or data output). See Mayo, 566 U.S. at 79, 101 USPQ2d at 1968; OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1092-93 (Fed. Cir. 2015) (presenting offers and gathering statistics amounted to mere data gathering).”). 
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. 
The claim recites additional elements such as processing hardware and client devices that are recited at such a high level of generality such that they represent no more than mere instructions to apply the abstract ideas on a computer. 
The claim recites additional elements directed to transmitting data over a network which is a well-understood, routine, and conventional computer function as recognized by the court decisions listed in MPEP § 2106.05(d) (“OIP Techs., Inc., v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network)”). 
The additional elements listed above do not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 
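For illustration only, the data flow recited in claim 9 (DP gradients generated locally at client devices and received by a remote system without the corresponding private data) can be sketched as below. The linear model, the squared-error loss, and the names client_update and server_aggregate are hypothetical; the sketch shows only which values cross the client boundary and is not a characterization of the applicant's system.

```python
import random


def client_update(weights, local_examples, noise_std, rng):
    """Runs on the client: returns a noised average gradient, never the raw examples."""
    total = [0.0] * len(weights)
    for features, label in local_examples:
        # Residual of an assumed linear model under squared-error loss.
        residual = sum(w * x for w, x in zip(weights, features)) - label
        for i, x in enumerate(features):
            total[i] += residual * x
    n = len(local_examples)
    # Noise is added before anything leaves the device.
    return [t / n + rng.gauss(0.0, noise_std) for t in total]


def server_aggregate(client_gradients):
    """Runs on the remote system: averages the DP gradients it received."""
    count = len(client_gradients)
    return [sum(column) / count for column in zip(*client_gradients)]
```

In this sketch the server function takes only gradient vectors as input, so the private examples never appear in its arguments.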
Claim 10 
Step 1: A process, as above. 
Step 2A Prong 1: The claim inherits the abstract ideas of claim 1, from which it depends. 
Step 2A Prong 2: The abstract ideas inherited from claim 1 are not integrated into a practical application. Specifically, the additional element, wherein the machine learning model comprises an image classification model, amounts to invoking computers or other machinery merely as tools to perform an existing process. Thus, this additional element is recited at such a high level of generality that it represents no more than mere instructions to apply the abstract ideas on a computer (see MPEP § 2106.05(f)). 
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. The claim recites the additional element of an image classification model that is recited at such a high level of generality such that it represents no more than mere instructions to apply the abstract ideas on a computer. The additional elements listed above do not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 
Claim 11 
Step 1: A process, as above. 
Step 2A Prong 1: The claim inherits the abstract ideas of claim 1, from which it depends. 
Step 2A Prong 2: The abstract ideas inherited from claim 1 are not integrated into a practical application. Specifically, the additional element, the machine learning model comprises a language model, amounts to invoking computers or other machinery merely as tools to perform an existing process. Thus, this additional element is recited at such a high level of generality that it represents no more than mere instructions to apply the abstract ideas on a computer (see MPEP § 2106.05(f)). 
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. The claim recites the additional element of a language model that is recited at such a high level of generality such that it represents no more than mere instructions to apply the abstract ideas on a computer. The additional elements listed above do not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 
Claim 12 
Step 1: A process, as above. 
Step 2A Prong 1: The claim inherits the abstract ideas of claim 1, from which it depends. 
Step 2A Prong 2: The abstract ideas inherited from claim 1 are not integrated into a practical application. Specifically, the additional element, wherein the machine learning model comprises a speech recognition model, amounts to invoking computers or other machinery merely as tools to perform an existing process. Thus, this additional element is recited at such a high level of generality that it represents no more than mere instructions to apply the abstract ideas on a computer (see MPEP § 2106.05(f)). 
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. The claim recites the additional element of a speech recognition model that is recited at such a high level of generality such that it represents no more than mere instructions to apply the abstract ideas on a computer. The additional elements listed above do not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 
Claim 13 contains similar limitations to claim 1 and is likewise deficient.
Claim 14 
Step 1: A machine, as above. 
Step 2A Prong 1: The claim recites, inter alia: 
wherein each DP gradient in the set of DP gradients is generated by: processing…corresponding private data to generate a corresponding predicted private output: This limitation encompasses the mathematical concept of calculating a differentially private gradient from a corresponding output, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
determining a private loss function based on the corresponding predicted private output and a corresponding private ground truth; and: This limitation encompasses the mathematical concept of calculating a loss, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
adding, to a private gradient derived from the private loss function, noise to generate the DP gradient: This limitation encompasses the mathematical concept of adding noise to a gradient, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
Step 2A Prong 2: The abstract ideas listed above are not integrated into a practical application. Specifically, the additional element, using a machine learning model, amounts to invoking computers or other machinery merely as tools to perform an existing process. Thus, this additional element is recited at such a high level of generality that it represents no more than mere instructions to apply the abstract idea on a computer (see MPEP § 2106.05(f)). 
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. The claim recites the additional element of a machine learning model recited at such a high level of generality such that it represents no more than mere instructions to apply the abstract ideas on a computer. The additional element listed above does not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 
Claim 15 
Step 1: A machine, as above. 
Step 2A Prong 1: The claim recites: 
wherein the private loss function is convex and L-Lipschitz: This limitation encompasses the mathematical concept of executing a convex and L-Lipschitz loss function, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
Step 2A Prong 2: There are no additional elements in the claim that integrate the abstract idea into a practical application, and the claim is thus directed to the abstract idea. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. Therefore, the claim is subject-matter ineligible.
Claim 16 
Step 1: A machine, as above. 
Step 2A Prong 1: The claim recites: 
wherein the private data and the public data are derived from a same distribution of sources: This limitation encompasses the mental process of drawing private and public data from the same source, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
Step 2A Prong 2: There are no additional elements in the claim that integrate the abstract idea into a practical application, and the claim is thus directed to the abstract idea. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. Therefore, the claim is subject-matter ineligible. 
Claim 17 
Step 1: A machine, as above. 
Step 2A Prong 1: The claim recites, inter alia: 
wherein each public gradient in the set of public gradients is generated by: processing…corresponding public data to generate a corresponding predicted public output: This limitation encompasses the mathematical concept of calculating a gradient from a corresponding output, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
determining a public loss function based on the corresponding predicted public output and a corresponding public ground truth; and: This limitation encompasses the mathematical concept of calculating a loss, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
deriving the public gradient from the public loss function: This limitation encompasses the mathematical concept of calculating a gradient from a loss function, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
Step 2A Prong 2: The abstract ideas listed above are not integrated into a practical application. Specifically, the additional element, using a machine learning model, amounts to invoking computers or other machinery merely as tools to perform an existing process. Thus, this additional element is recited at such a high level of generality that it represents no more than mere instructions to apply the abstract idea on a computer (see MPEP § 2106.05(f)). 
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. The claim recites the additional element of a machine learning model recited at such a high level of generality such that it represents no more than mere instructions to apply the abstract ideas on a computer. The additional element listed above does not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible.
Claim 18 
Step 1: A machine, as above. 
Step 2A Prong 1: The claim recites: 
wherein applying mirror descent to the set of public gradients to learn the geometry for the set of DP gradients comprises applying mirror descent by using the public gradients derived from the public loss function as a mirror map to learn the geometry for the set of DP gradients: This limitation encompasses the mathematical concept of executing a mirror descent algorithm, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
Step 2A Prong 2: There are no additional elements in the claim that integrate the abstract idea into a practical application, and the claim is thus directed to the abstract idea. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. Therefore, the claim is subject-matter ineligible. 
Claim 19 
Step 1: A machine, as above. 
Step 2A Prong 1: The claim recites: 
wherein the public loss function is strongly convex: This limitation encompasses the mathematical concept of executing a strongly convex loss function, which is an evaluation practically capable of being performed in the human mind with the assistance of pen and paper. 
Step 2A Prong 2: There are no additional elements in the claim that integrate the abstract idea into a practical application, and the claim is thus directed to the abstract idea. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. Therefore, the claim is subject-matter ineligible. 
Claim 20 
Step 1: A machine, as above. 
Step 2A Prong 1: The claim inherits the abstract ideas of claim 13, from which it depends.
Step 2A Prong 2: The abstract ideas inherited from claim 13 are not integrated into a practical application. Specifically, the additional elements, wherein: the data processing hardware resides on a central server; and the set of DP gradients and the set of public gradients are stored in a central repository residing on the central server, amount to invoking computers or other machinery merely as tools to perform an existing process. Thus, these additional elements are recited at such a high level of generality that they represent no more than mere instructions to apply the abstract ideas on a computer (see MPEP § 2106.05(f)).
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. The claim recites additional elements such as processing hardware, a central repository and a central server that are recited at such a high level of generality such that they represent no more than mere instructions to apply the abstract ideas on a computer. The additional elements listed above do not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 
Claim 21 
Step 1: A machine, as above. 
Step 2A Prong 1: The claim inherits the abstract ideas of claim 1, from which it depends. 
Step 2A Prong 2: The abstract ideas inherited from claim 1 are not integrated into a practical application. Specifically, the additional elements, wherein: the data processing hardware resides on a remote system…and each DP gradient in the set of DP gradients is generated locally at a respective one of the one or more client devices, amount to invoking computers or other machinery merely as tools to perform an existing process. Thus, these additional elements are recited at such a high level of generality that they represent no more than mere instructions to apply the abstract ideas on a computer (see MPEP § 2106.05(f)). 
The additional element, obtaining the set of DP gradients comprises receiving the set of DP gradients from one or more client devices via federated learning without receiving any of the corresponding private data; and, amounts to no more than mere data gathering, which is insignificant extra-solution activity that does not amount to an inventive concept (see MPEP § 2106.05(g) “Whether the limitation amounts to necessary data gathering and outputting, (i.e., all uses of the recited judicial exception require such data gathering or data output). See Mayo, 566 U.S. at 79, 101 USPQ2d at 1968; OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1092-93 (Fed. Cir. 2015) (presenting offers and gathering statistics amounted to mere data gathering).”). 
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. 
The claim recites additional elements such as processing hardware and client devices that are recited at such a high level of generality such that they represent no more than mere instructions to apply the abstract ideas on a computer. 
The claim recites additional elements directed to transmitting data over a network which is a well-understood, routine, and conventional computer function as recognized by the court decisions listed in MPEP § 2106.05(d) (“OIP Techs., Inc., v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355, 112 USPQ2d 1093, 1096 (Fed. Cir. 2014) (computer receives and sends information over a network)”). 
The additional elements listed above do not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 

Claim 22 
Step 1: A machine, as above. 
Step 2A Prong 1: The claim inherits the abstract ideas of claim 1, from which it depends. 
Step 2A Prong 2: The abstract ideas inherited from claim 1 are not integrated into a practical application. Specifically, the additional element, wherein the machine learning model comprises an image classification model, amounts to invoking computers or other machinery merely as tools to perform an existing process. Thus, this additional element is recited at such a high level of generality that it represents no more than mere instructions to apply the abstract ideas on a computer (see MPEP § 2106.05(f)). 
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. The claim recites the additional element of an image classification model that is recited at such a high level of generality such that it represents no more than mere instructions to apply the abstract ideas on a computer. The additional elements listed above do not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 
Claim 23 
Step 1: A machine, as above. 
Step 2A Prong 1: The claim inherits the abstract ideas of claim 1, from which it depends. 
Step 2A Prong 2: The abstract ideas inherited from claim 1 are not integrated into a practical application. Specifically, the additional element, the machine learning model comprises a language model, amounts to invoking computers or other machinery merely as tools to perform an existing process. Thus, this additional element is recited at such a high level of generality that it represents no more than mere instructions to apply the abstract ideas on a computer (see MPEP § 2106.05(f)). 
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. The claim recites the additional element of a language model that is recited at such a high level of generality such that it represents no more than mere instructions to apply the abstract ideas on a computer. The additional elements listed above do not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 
Claim 24 
Step 1: A machine, as above. 
Step 2A Prong 1: The claim inherits the abstract ideas of claim 1, from which it depends. 
Step 2A Prong 2: The abstract ideas inherited from claim 1 are not integrated into a practical application. Specifically, the additional element, wherein the machine learning model comprises a speech recognition model, amounts to invoking computers or other machinery merely as tools to perform an existing process. Thus, this additional element is recited at such a high level of generality that it represents no more than mere instructions to apply the abstract ideas on a computer (see MPEP § 2106.05(f)). 
Nothing in the claim integrates the abstract ideas into a practical application, and the claim is thus directed to the abstract ideas. 
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception because when considered separately or in combination, they do not constitute an inventive concept. The claim recites the additional element of a speech recognition model that is recited at such a high level of generality such that it represents no more than mere instructions to apply the abstract ideas on a computer. The additional elements listed above do not amount to significantly more than the abstract ideas. Therefore, the claim is subject-matter ineligible. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA  to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. 
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: 
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. 

Claims 1-6, 10, 13-18 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou et al. (“Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification,” hereinafter Zhou) in view of Arora et al. (“Private Stochastic Convex Optimization: Efficient Algorithms for Non-smooth Objectives,” hereinafter Arora). 

Regarding claim 1, Zhou teaches [a] computer-implemented method when executed on data processing hardware causes the data processing hardware to perform operations comprising (Zhou, D Experimental Setup and Additional Results, pp. 29; “All experiments have been run on NVIDIA Tesla K40 GPUs,” wherein “GPUs” encompass data processing hardware storing operations.). 
obtaining a set of differentially private (DP) gradients each generated based on processing corresponding private data (Zhou, 2 Preliminaries, pp. 4, paragraph 3; “Given a private dataset $S = \{z_1, \ldots, z_n\}$ drawn i.i.d. from the underlying distribution $\mathcal{P}$, we want to solve the following empirical risk minimization (ERM) problem subject to differential privacy: $\min_{\mathbf{w}} \hat{L}_n(\mathbf{w}) = \frac{1}{n}\sum_{i=1}^{n} \ell(\mathbf{w}, z_i)$ where the parameter $\mathbf{w} \in \mathbb{R}^{p}$. We optimize this objective with an iterative algorithm. At each step $t$, we write $\mathbf{w}_t$ as the algorithm’s iterate and use $\mathbf{g}_t$ to denote the mini-batch gradient, and $\nabla\hat{L}_n(\mathbf{w}_t) = \frac{1}{n}\sum_{i=1}^{n} \nabla\ell(\mathbf{w}_t, z_i)$ to denote the empirical gradient,” wherein this “empirical gradient” encompasses differentially private (DP) gradients each generated based on processing corresponding data, sampled from “private dataset S.”). 
obtaining a set of public gradients each generated based on processing corresponding public data; (Zhou, 2 Preliminaries, pp. 4, paragraph 3; “In addition to the private dataset, the algorithm can also freely access to a small public dataset $S_h = \{\tilde{z}_1, \ldots, \tilde{z}_m\}$ drawn from the same distribution.” Zhou, Algorithm 1; Zhou, 3.1 Gradient Subspace Identification, pp. 6, paragraph 3; “Note that, if $M_t = \frac{1}{m}\sum_{i=1}^{m} \nabla\ell(\mathbf{w}_t, \tilde{z}_i)\nabla\ell(\mathbf{w}_t, \tilde{z}_i)^{T}$ is evaluated on fresh public samples, the $\Sigma_t$ is the expectation of $M_t$, and the deviation of $M_t$ from $\Sigma_t$ can easily be analyzed by the Ahlswede-Winter Inequality,” wherein “$M_t$” encompasses a set of public gradients each generated based on processing corresponding public data, “$S_h$.” Zhou, Theorem 2, pp. 7, paragraph 4; “Under assumption 1, 2, the second moment matrix of the public gradient $\frac{1}{m}\sum_{i=1}^{m} \nabla\ell(\mathbf{w}_t, \tilde{z}_i)\nabla\ell(\mathbf{w}_t, \tilde{z}_i)^{T}$ approximates the population second moment matrix $\Sigma_t = \mathbb{E}_{z \sim \mathcal{P}}[\nabla\ell(\mathbf{w}_t, \tilde{z}_i)\nabla\ell(\mathbf{w}_t, \tilde{z}_i)^{T}]$, uniformly over all iterations,” further indicating that “$M_t$” encompasses a set of public gradients.). 
applying…descent to the set of public gradients to learn a geometry of the set of public gradients (Zhou, page 5, section 2; Here the public gradients “𝑀𝑡” are also evaluated on public dataset Sh. Zhou, 1 Introduction, pp. 2, paragraph 2; “In this paper, we aim to overcome such dependence on the ambient dimension by leveraging the structure p of the gradient space in the training of neural networks,” wherein “leveraging the structure p of the gradient space” is equivalent to learn[ing] a geometry for the set of public gradients. Zhou, 3 Projected Private Gradient Descent, pp. 5; “Given the dimension of gradient to be p, this method ends up in getting a factor of p in the error rate [Bassily et al., 2014, 2019a]. Our algorithm is inspired by the recent observations that stochastic gradients stay in a low-dimensional space in the training of deep nets [Li et al., 2020, Gur-Ari et al., 2018]. Such observation is also valid for the private training algorithm, i.e., DP-SGD (Figure 1 (b) and (c)),” wherein the “low-dimensional space” corresponding to “the dimension of gradient” or “p” encompasses a geometry.). 
… 
Zhou does not explicitly teach applying mirror descent. However, Arora, in the area of differentially private gradient descent algorithms, teaches this limitation (Arora, 4 Algorithm and Utility Analysis, pp. 5, paragraph 6; “Our setup fits the popular framework of Online Stochastic Mirror Descent (OSMD) algorithm, wherein, given a strictly convex potential Φ : ℝ𝑑→ℝ, the updates are given as” ).  Arora is analogous to the claimed invention as both are from the same field of endeavor, that is, differentially private gradient descent algorithms. Zhou applies a gradient descent algorithm to optimize the structure of a differentially private gradient subspace but does not specify that this algorithm is a form of mirror descent. Arora teaches this limitation. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the differentially private stochastic gradient descent algorithm of Zhou to incorporate mirror descent, as taught by Arora. The motivation to do so is to leverage the performance benefits of such an algorithm (Arora, Abstract; “We propose an algorithm based on noisy mirror descent, which achieves optimal rates both in terms of statistical complexity and number of queries to a first-order stochastic oracle in the regime when the privacy parameter is inversely proportional to the number of samples.”). 
Zhou further teaches reshaping the set of DP gradients to conform to the learned geometry; and (Zhou, Abstract; “We propose Projected DP-SGD that performs noise reduction by projecting the noisy gradients to a low-dimensional subspace, which is given by the top gradient eigenspace on a small public dataset,” wherein “projecting the noisy gradients” or DP gradients is a transformation that amounts to reshaping… to conform to the learned geometry or “a low-dimensional subspace.”) 
training a machine learning model based on the reshaped set of DP gradients (Zhou, 4 Experiments, pp. 10, paragraph 1; “We empirically evaluate PDP-SGD on training neural networks,” wherein “neural networks” encompass a machine learning model and the gradients produced by the “PDP-SGD” algorithm correspond to the reshaped set of DP gradients.). Zhou further teaches convergence guarantees for convex and non-convex optimization (Zhou, 1 Introduction, Convergence for convex and non-convex optimization, pp. 3, paragraph 4; “Building on the reconstruction error bound, we provide convergence and sample complexity results for our method PDP-SGD for solving Empirical risk minimization (ERM) in two types of loss functions, including 1) smooth and non-convex, 2) Lipschitz convex.”). Zhou also discloses circumventing the dependence on the ambient dimension by leveraging a low-dimensional structure of gradient space in deep networks (Zhou, Abstract), so as to ensure a population risk guarantee for convex losses with no explicit dependence on a dimension of the machine learning model.
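For illustration only, the projection step relied on above (building the second moment matrix $M_t$ from public gradients, taking its top-k eigenspace, and projecting a noisy gradient onto it) can be sketched as follows. This is a minimal sketch and not Zhou's implementation: the function names, the dimensions, and the synthetic gradients are all hypothetical.

```python
import numpy as np

def top_k_eigenspace(public_grads, k):
    """Return a (p, k) orthonormal basis for the top-k eigenspace of M_t."""
    m = public_grads.shape[0]
    second_moment = public_grads.T @ public_grads / m  # M_t, shape (p, p)
    _, eigvecs = np.linalg.eigh(second_moment)         # eigenvalues ascending
    return eigvecs[:, -k:]                             # k largest eigenvalues

def project(noisy_grad, basis):
    """Project a noisy gradient onto the subspace spanned by `basis`."""
    return basis @ (basis.T @ noisy_grad)

# Hypothetical data: public gradients lying in a 2-dimensional subspace of R^6.
rng = np.random.default_rng(0)
p, m, k = 6, 50, 2
true_basis = np.linalg.qr(rng.normal(size=(p, k)))[0]
public_grads = rng.normal(size=(m, k)) @ true_basis.T

basis = top_k_eigenspace(public_grads, k)
noisy_grad = true_basis @ np.array([1.0, -2.0]) + 0.1 * rng.normal(size=p)
projected = project(noisy_grad, basis)
```

In this toy setting the projection keeps the component of the noisy gradient that lies in the public-gradient subspace and discards the noise outside it.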

Regarding claim 2, the combination of Zhou and Arora teaches [t]he method of claim 1 (and thus the rejection of claim 1 is incorporated). 
Zhou further teaches wherein each DP gradient in the set of DP gradients is generated by: processing, using a machine learning model, corresponding private data to generate a corresponding predicted private output (Zhou, 3 Projected Private Gradient Descent, pp. 5; “At each iteration t, in order to obtain an approximated subspace without leaking the information of the private dataset S, we evaluated the second moment matrix 𝑀𝑡 on 𝑆ℎ and compute the top-k eigenvectors 𝑉̂𝑘(𝑡) of 𝑀𝑡 (line 4 in Algorithm 1),” wherein each DP gradient in the set of DP gradients is generated by processing “the private dataset S,” or the corresponding private data. That the algorithm avoids “leaking the information of the private dataset S” indicates that its predicted output is also private. Zhou, 4 Experiments, pp. 10, paragraph 1; “We empirically evaluate PDP-SGD on training neural networks,” thereby indicating that a machine learning model is performing the processing.). 
determining a private loss function based on the corresponding predicted private output and a corresponding private ground truth; and (Zhou, Algorithm 1, line 5; Here, the loss function “ℓ” necessarily takes as input the corresponding predicted private output and a corresponding private ground truth “𝑧𝑖” drawn from the “private dataset S” thereby qualifying it as a private loss function.) 
adding, to a private gradient derived from the private loss function, noise to generate the DP gradient (Zhou, Algorithm 1, line 6; Here, the algorithm add[s], to a private gradient “𝐠𝑡” derived from the private loss function “ℓ,” noise “𝐛𝑡” to generate the DP gradient “𝐠̃𝑡.” ). 
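As a rough illustration of the noise-addition step cited above (a private gradient $\mathbf{g}_t$ plus noise $\mathbf{b}_t$ yielding $\tilde{\mathbf{g}}_t$), the sketch below clips per-example gradients, averages them, and adds Gaussian noise. The clipping rule, the Gaussian noise distribution, and the names `dp_gradient`, `clip_norm`, and `noise_std` are assumptions chosen for the example; they are not taken from Zhou's Algorithm 1 or from the claims.

```python
import numpy as np

def dp_gradient(per_example_grads, clip_norm, noise_std, rng):
    """Clip each per-example gradient, average, and add Gaussian noise."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Scale down any gradient whose L2 norm exceeds clip_norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    g_t = (per_example_grads * scale).mean(axis=0)    # private gradient g_t
    b_t = rng.normal(0.0, noise_std, size=g_t.shape)  # noise b_t
    return g_t + b_t                                  # noisy gradient

rng = np.random.default_rng(0)
grads = np.array([[3.0, 4.0], [0.3, 0.4]])
noisy = dp_gradient(grads, clip_norm=1.0, noise_std=0.5, rng=rng)
```

The noise scale needed for a given privacy guarantee depends on the clipping norm and the privacy parameters, which this sketch does not calibrate.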

Regarding claim 3, the combination of Zhou and Arora teaches the method of claim 2 (and thus the rejection of claim 2 is incorporated). 
Zhou further teaches wherein the private loss function is convex (Zhou, 1 Introduction, Convergence for convex and non-convex optimization, pp. 3, paragraph 4; “Building on the reconstruction error bound, we provide convergence and sample complexity results for our method PDP-SGD in two types of loss functions, including 1) smooth and non-convex, 2) Lipschitz convex.”). 
Zhou does not explicitly teach and L-Lipschitz. However, Arora, in the area of differentially private gradient descent algorithms, teaches this limitation (Arora, 2 Notation and Preliminaries, pp. 2, paragraph 6; “We assume that 𝑓(⋅,z) is L-Lipschitz with respect to the dual norm ‖⋅‖∗, i.e., |𝑓(w1,𝑧)−𝑓(w2,𝑧)|≤𝐿‖w1−w2‖ for all z.”). 
Arora is analogous to the claimed invention as both are from the same field of endeavor, that is, differentially private gradient descent algorithms. Zhou teaches a loss function that is G-Lipschitz but does not explicitly specify that the constant 𝐺≥1. Arora, however, explicitly specifies its loss function as being L-Lipschitz. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the loss function of Zhou to guarantee the L-Lipschitz property, as taught by Arora. The motivation to do so is to improve the performance of the training algorithm by optimizing the error bounds (Arora, 3 Related Work, pp. 4, paragraph 2; “For the class of L-Lipschitz convex functions, Bassily et al. (2014) improved upon Chaudhuri et al. (2011) and gave optimal bounds on excess empirical risk of 𝑂(𝑑𝜖𝑛).”). 

Regarding claim 4, the combination of Zhou and Arora teaches [t]he method of claim 1 (and thus the rejection of claim 1 is incorporated). 
Zhou further teaches wherein the private data and the public data are derived from a same distribution of sources (Zhou, 2 Preliminaries, pp. 4, paragraph 3; “Given a private dataset 𝑆={𝑧1,…,𝑧𝑛} drawn i.i.d. from the underlying distribution 𝒫…In addition to the private dataset, the algorithm can also freely access to a small public dataset 𝑆ℎ={𝑧̃1,…,𝑧̃𝑚} drawn from the same distribution 𝒫.”). 
Regarding claim 5, the combination of Zhou and Arora teaches [t]he method of claim 1 (and thus the rejection of claim 1 is incorporated). 
Zhou further teaches wherein each public gradient in the set of public gradients is generated by: processing, using a machine learning model, corresponding public data to generate a corresponding predicted public output; (Zhou, Algorithm 1, line 4; Here, each public gradient “𝑀𝑡” is generated by processing corresponding public data “𝑆ℎ” to generate a corresponding predicted public output to be input to the loss function, “ℓ.”) 
determining a public loss function based on the corresponding predicted public output and a corresponding public ground truth; and (Zhou, Algorithm 1, line 4; That the loss function, “ℓ” is processing corresponding predicted public output and a corresponding public ground truth “𝑧̃𝑖” indicates that it is a public loss function.). 
deriving the public gradient from the public loss function (Zhou, Algorithm 1, line 4; [T]he public gradient “𝑀𝑡” is calculated using the public loss function “ℓ.”). 
Regarding claim 6, the combination of Zhou and Arora teaches [t]he method of claim 5 (and thus the rejection of claim 5 is incorporated). 
Zhou further teaches wherein applying…descent to the set of public gradients to learn the geometry for the set of DP gradients comprises applying…descent by using the public gradients derived from the public loss function as a mirror map to learn the geometry for the set of DP gradients (Zhou, Algorithm 1, line 6; Here the set of public gradients “𝑀𝑡” are calculated using the public loss function “ℓ” as part of a “differentially private stochastic gradient descent (DP-SGD)” algorithm. Zhou, 1 Introduction, pp. 2, paragraph 2; “In this paper, we aim to overcome such dependence on the ambient dimension by leveraging the structure p of the gradient space in the training of neural networks,” wherein “leveraging the structure p of the gradient space” is equivalent to learn[ing] the geometry for the set of DP gradients. Zhou, 3 Projected Private Gradient Descent, pp. 5; “Given the dimension of gradient to be p, this method ends up in getting a factor of p in the error rate [Bassily et al., 2014, 2019a]. Our algorithm is inspired by the recent observations that stochastic gradients stay in a low-dimensional space in the training of deep nets [Li et al., 2020, Gur-Ari et al., 2018]. Such observation is also valid for the private training algorithm, i.e., DP-SGD (Figure 1 (b) and (c)),” wherein the “low-dimensional space” corresponding to “the dimension of gradient” or “p” encompasses the geometry. Zhou, Algorithm 1, line 6; The “top-k eigenspace 𝑉̂𝑘(𝑡) of 𝑀𝑡” is used as a mirror map to “project” the “noisy gradient,” or set of DP gradients, to a new subspace, thereby learn[ing] the geometry. Note that no explicit definition is given for the term mirror map in the specification of the claimed invention; as such, it is interpreted here to mean any transform function mapping gradients from one subspace to another.). 
… 
Zhou does not explicitly teach applying mirror descent. However, Arora, in the area of differentially private gradient descent algorithms, teaches this limitation (Arora, 4 Algorithm and Utility Analysis, pp. 5, paragraph 6; “Our setup fits the popular framework of Online Stochastic Mirror Descent (OSMD) algorithm, wherein, given a strictly convex potential Φ : ℝ𝑑→ℝ, the updates are given as” ). Arora is analogous to the claimed invention as both are from the same field of endeavor, that is, differentially private gradient descent algorithms. Zhou applies a gradient descent algorithm to optimize the structure of a differentially private gradient subspace but does not specify that this algorithm is a form of mirror descent. Arora teaches this limitation. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the differentially private stochastic gradient descent algorithm of Zhou to incorporate mirror descent, as taught by Arora. The motivation to do so is to leverage the performance benefits of such an algorithm (Arora, Abstract; “We propose an algorithm based on noisy mirror descent, which achieves optimal rates both in terms of statistical complexity and number of queries to a first-order stochastic oracle in the regime when the privacy parameter is inversely proportional to the number of samples.”). 
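For context on the term, a single unconstrained mirror descent update with a strictly convex potential Φ takes a gradient step in the dual space, i.e., ∇Φ(w_{t+1}) = ∇Φ(w_t) − η g_t. The sketch below illustrates this with a quadratic potential Φ(w) = ½ wᵀAw, which is an assumption made purely for the example. It reproduces neither the potential used in Arora nor the mirror map of the claimed invention, and the names `mirror_descent_step`, `A`, and `lr` are hypothetical.

```python
import numpy as np

def mirror_descent_step(w, grad, A, lr):
    """One unconstrained mirror descent step with Phi(w) = 0.5 * w^T A w.

    A is assumed to be symmetric positive definite, so Phi is strictly convex.
    """
    dual = A @ w                     # map to the dual space: grad Phi(w) = A w
    dual = dual - lr * grad          # gradient step in the dual space
    return np.linalg.solve(A, dual)  # map back: w = A^{-1} * dual

w = np.array([1.0, 1.0])
grad = np.array([2.0, 4.0])
A = np.diag([2.0, 4.0])
w_next = mirror_descent_step(w, grad, A, lr=0.5)
```

With A set to the identity matrix, the update reduces to an ordinary gradient descent step; a different A rescales the step according to the geometry encoded by the potential.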

Regarding claim 10, the combination of Zhou and Arora teaches [t]he method of claim 1 (and thus the rejection of claim 1 is incorporated). 
Zhou further teaches wherein the machine learning model comprises an image classification model (Zhou, 4 Experiments, pp. 10, paragraph 1; “We empirically evaluate PDP-SGD on training neural networks with two datasets: the MNIST [LeCun et al., 1998] and Fashion MNIST [Xiao et al., 2017],” wherein “MNIST” is a popular image classification dataset thereby qualifying the “neural networks” as image classification model[s].). 

Claims 13-18 and 22 are system claims corresponding to the steps of claims 1-6 and 10 and are therefore rejected for the same reasons. Note that Zhou further teaches memory hardware in communication with the data processing hardware, the memory hardware storing instructions that when executed on the data processing hardware cause the data processing hardware to perform operations comprising: (Zhou, D Experimental Setup and Additional Results, pp. 29; “All experiments have been run on NVIDIA Tesla K40 GPUs,” wherein “GPUs” encompass memory hardware in communication with the data processing hardware.). 
Claims 7 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou in view of Arora in further view of Wu et al. (“Bolt-on Differential Privacy for Scalable Stochastic Gradient Descent-based Analytics,” hereinafter Wu). 

Regarding claim 7, the combination of Zhou and Arora teaches [t]he method of claim 5 (and thus the rejection of claim 5 is incorporated). 
Zhou does not explicitly teach wherein the public loss function is strongly convex. However, Wu, in the area of differentially private gradient descent algorithms, teaches this limitation (Wu, Algorithm 2; “Require: ℓ(⋅,𝑧) is 𝛾-strongly convex for every 𝑧,” thereby disclosing a strongly convex loss function. Wu, Figure 3; “Tuning using Public Data. Row 1 is MNIST, row 2 is Protein and row 3 is Forest Covertype. Each row gives the test accuracy results of 4 tests: Test 1 is Convex, (𝜀, 0)-DP, Test 2 is Convex, (𝜀, 𝛿)-DP, Test 3 is Strongly Convex, (𝜀, 0)-DP, and Test 4 is Strongly Convex, (𝜀, 𝛿)-DP. For Test 1 and 3, we compare Noiseless, our algorithm and SCS13. For Test 2 and 4, we compare all four algorithms,” wherein “tuning using public data” indicates that each loss function tested is a public loss function with tests 3 and 4 representing strongly convex loss functions.). 
Wu is analogous to the claimed invention as both are from the same field of endeavor, that is, differentially private gradient descent algorithms. Zhou discloses a loss function for processing a public dataset but does not specify that said loss function is strongly convex, merely that it is convex. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the public loss function of Zhou to guarantee strong convexity, as taught by Wu. The motivation to do so is to improve the accuracy of model training (Wu, Figure 3; As can be seen from the image, the accuracies of Wu’s algorithms (denoted by the red lines) employing the strongly convex loss functions in tests 3 and 4 exceed those employing the convex loss functions in tests 1 and 2). 

Claim 19 is a system claim corresponding to the steps of claim 7 and is therefore rejected for the same reasons. 

Claims 8, 9, 20 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou in view of Arora in further view of Zhao et al. (“Local Differential Privacy-Based Federated Learning for Internet of Things,” hereinafter Zhao). 

Regarding claim 8, the combination of Zhou and Arora teaches [t]he method of claim 1 (and thus the rejection of claim 1 is incorporated). 
Zhou does not explicitly teach wherein: the data processing hardware resides on a central server; and the set of DP gradients and the set of public gradients are stored in a central repository residing on the central server. However, Zhao, in the area of differentially private federated learning, teaches these limitations (Zhao, I. Introduction, pp. 8837, col. 1, paragraph 1; “By adding LDP noises to the gradients before uploading, we obtain the LDP-based federated stochastic gradient descent LDP-FedSGD algorithm, which prevents attackers from deducing original data even though they obtain perturbed gradients. As a result, the FL server gathers and averages users’ submitted perturbed gradients to obtain the averaged result to update the global model’s parameters,” wherein “the FL server” encompasses a central server. That this server is “gather[ing] and average[ing] users’ submitted perturbed gradients” indicates that it possesses data processing hardware. Zhao, IV. System Model and Local Differential Privacy-Based FedSGD Algorithm, A. System Model, pp. 8840, col. 1, paragraph 2; “After finishing predefined epochs locally, the cloud server calculates the average of uploaded gradients from vehicles and updates the global model with the average. The FL aggregator is honest-but-curious or semi honest, which follows the FL protocol but it will try to learn additional information using received data [36], [45]. With the injected LDP noise, servers or attackers cannot retrieve users’ information by reversing their uploaded gradients [3], [4].” Zhao, IV. System Model and Local Differential Privacy-Based FedSGD Algorithm, B. Federated Learning With LDP: LDP-FedSGD, pp. 8840, col. 1, paragraph 3; “Unlike the FedAvg algorithm, in the algorithm, FedSGD clients (i.e., vehicles) upload updated gradients instead of model parameters to the central aggregator (i.e., cloud server) [5],” wherein “upload[ing] updated gradients…to the central aggregator” or “cloud server” is equivalent to having the set of DP gradients and the set of public gradients stored in a central repository residing on the central server. Zhao, VIII. Experiments, pp. 8848, col. 1, paragraph 2; “We implemented both existing solutions and our proposed solutions, including PM-SUB, Three-Outputs, HM-TP proposed by us, PM and HM proposed by Wang et al. [8], Duchi et al.’s [22] solution and the traditional Laplace mechanism. Our data sets include…two public data sets extracted from Integrated Public Use Microdata Series [50] contain[ing] census records from Brazil (BR) and Mexico (MX).” Zhao, VIII. Experiments, B. Results on Empirical Risk Minimization, pp. 8849, col. 2, paragraph 2; “Consider each tuple of data as the data set of a vehicle, so vehicles calculate gradients and run different LDP mechanisms to generate noisy gradients. Each mini-batch is a group of vehicles. Thus, the centralized aggregator, i.e., cloud server updates the model after each group of vehicles send noisy gradients,” wherein calculating “noisy gradients” from “two public datasets” indicates that the “cloud server” also store[s] public gradients. Note that the gradients themselves need not be public, just that they derive from outputs calculated from public data, as outlined in the specification of the claimed invention at paragraph [0032], “The outputs 115 are referred to herein as predicted public outputs 115 to denote that they are generated based on the public data 160 not that they are necessarily publicly disclosed outside the remote system 110. However, the predicted public gradients 117 may be publicly exposed” (emphasis added).). 
Zhao is analogous to the claimed invention as both are from the same field of endeavor, that is, differentially private federated learning. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the combined differentially private mirror descent algorithm of Zhou and Arora on the federated learning architecture of Zhao. The motivation to do so is to guarantee the privacy of local user data in distributed systems (Zhao, I. Introduction, pp. 8836, col. 2, paragraph 1; “The development of sensors and communication technologies for Internet of Things (IoT) have enabled a fast and large-scale collection of user data, which has bred new services and applications, such as the Waze application that provides the intelligent transportation routing service. This kind of service benefits users’ daily life, but it may raise privacy concerns of sensitive data, such as users’ location information. To address these concerns, we propose a hybrid approach that integrates federated learning (FL) [1] with local differential privacy (LDP) [2] techniques.”). 
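By way of illustration, the server-side arrangement described in the Zhao passages (clients perturb gradients locally, and a central server averages the uploaded gradients to update a global model) can be sketched as below. This is a simplified sketch under stated assumptions: the Laplace noise, the learning rate, and the names `client_update` and `server_step` are hypothetical and do not reproduce the LDP mechanisms evaluated in Zhao.

```python
import numpy as np

def client_update(local_grad, noise_scale, rng):
    """Perturb a locally computed gradient before upload (Laplace noise assumed)."""
    return local_grad + rng.laplace(0.0, noise_scale, size=local_grad.shape)

def server_step(weights, uploaded_grads, lr):
    """Average the uploaded perturbed gradients and update the global model."""
    avg_grad = np.mean(uploaded_grads, axis=0)
    return weights - lr * avg_grad

rng = np.random.default_rng(0)
weights = np.zeros(3)
local_grads = [np.array([1.0, 0.0, -1.0]), np.array([3.0, 2.0, 1.0])]
# Only the perturbed gradients leave the clients; the raw data never does.
uploads = [client_update(g, noise_scale=0.1, rng=rng) for g in local_grads]
new_weights = server_step(weights, uploads, lr=0.5)
```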

Regarding claim 9, the combination of Zhou and Arora teaches [t]he method of claim 1 (and thus the rejection of claim 1 is incorporated). 
Zhou does not explicitly teach wherein: the data processing hardware resides on a remote system; obtaining the set of DP gradients comprises receiving the set of DP gradients from one or more client devices via federated learning without receiving any of the corresponding private data; and each DP gradient in the set of DP gradients is generated locally at a respective one of the one or more client devices. However, Zhao, in the area of differentially private federated learning, teaches these limitations. 
wherein: the data processing hardware resides on a remote system; (Zhao, I. Introduction, pp. 8837, col. 1, paragraph 1; “By adding LDP noises to the gradients before uploading, we obtain the LDP-based federated stochastic gradient descent LDP-FedSGD algorithm, which prevents attackers from deducing original data even though they obtain perturbed gradients. As a result, the FL server gathers and averages users’ submitted perturbed gradients to obtain the averaged result to update the global model’s parameters,” wherein “the FL server” encompasses a remote system. That this system is “gather[ing] and average[ing] users’ submitted perturbed gradients” indicates that it possesses data processing hardware.) 
obtaining the set of DP gradients comprises receiving the set of DP gradients from one or more client devices via federated learning without receiving any of the corresponding private data; and (Zhao, I. Introduction, pp. 8836, col. 2, paragraph 1; “[W]e propose a hybrid approach that integrates federated learning (FL) [1] with local differential privacy (LDP) [2] techniques. FL can facilitate the collaborative learning with uploaded gradients from users instead of sharing users’ raw data,” wherein “users” are equivalent to client devices.) 
each DP gradient in the set of DP gradients is generated locally at a respective one of the one or more client devices (Zhao, I. Introduction, pp. 8836, col. 2, paragraph 1; “[W]e propose a hybrid approach that integrates federated learning (FL) [1] with local differential privacy (LDP) [2] techniques. FL can facilitate the collaborative learning with uploaded gradients from users instead of sharing users’ raw data,” wherein “local differential privacy” indicates that each DP gradient in the set of DP gradients is generated locally at a respective one of the one or more client devices or “users.”). 
Zhao is analogous to the claimed invention as both are from the same field of endeavor, that is, differentially private federated learning. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the combined differentially private mirror descent algorithm of Zhou and Arora on the federated learning architecture of Zhao. The motivation to do so is to guarantee the privacy of local user data in distributed systems (Zhao, I. Introduction, pp. 8836, col. 2, paragraph 1; “The development of sensors and communication technologies for Internet of Things (IoT) have enabled a fast and large-scale collection of user data, which has bred new services and applications, such as the Waze application that provides the intelligent transportation routing service. This kind of service benefits users’ daily life, but it may raise privacy concerns of sensitive data, such as users’ location information. To address these concerns, we propose a hybrid approach that integrates federated learning (FL) [1] with local differential privacy (LDP) [2] techniques.”). 
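For illustration only, the arrangement that the passages quoted from Zhao describe (each user perturbs its gradient locally, and the FL server gathers and averages the perturbed gradients to update the global model) might be sketched as follows. This is a minimal sketch under stated assumptions, not Zhao's LDP-FedSGD algorithm and not the claimed method: the squared-error loss, the per-coordinate clipping bound `clip`, the even split of the budget `epsilon` across coordinates, and every function name are hypothetical, and the traditional Laplace mechanism stands in for whichever LDP mechanism a real deployment would use.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def local_gradient(w, x, y):
    # Hypothetical client-side step: gradient of 0.5 * (w.x - y)^2
    # computed on one private record that never leaves the device.
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]


def perturb(grad, clip, epsilon, rng):
    # Clip every coordinate to [-clip, clip], so the L1 sensitivity of the
    # whole vector is at most 2 * clip * d, then add Laplace noise with
    # scale sensitivity / epsilon (assumed budget handling, not Zhao's).
    d = len(grad)
    scale = 2.0 * clip * d / epsilon
    return [max(-clip, min(clip, g)) + laplace_noise(scale, rng) for g in grad]


def server_round(w, noisy_grads, lr):
    # Server-side step: average the uploaded noisy gradients and take one
    # gradient step. The server sees only perturbed gradients, no raw data.
    n = len(noisy_grads)
    avg = [sum(g[j] for g in noisy_grads) / n for j in range(len(w))]
    return [wi - lr * ai for wi, ai in zip(w, avg)]


def run_round(w, records, clip, epsilon, lr, rng):
    # One federated round over a mini-batch of clients (one record each).
    uploads = [perturb(local_gradient(w, x, y), clip, epsilon, rng)
               for x, y in records]
    return server_round(w, uploads, lr)
```

A call such as `run_round([0.0, 0.0], [([1.0, 0.5], 1.0), ([0.2, 1.0], 0.0)], clip=1.0, epsilon=1.0, lr=0.1, rng=random.Random(0))` returns the updated two-coordinate weight vector; the noise scale grows with the model dimension `d`, which is the kind of dimension dependence the parties dispute elsewhere in this action.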

Claims 20 and 21 are system claims corresponding to the steps of claims 8 and 9 and are therefore rejected for the same reasons. 

Claims 11 and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou in view of Arora in further view of Asi et al. (“Private Adaptive Gradient Methods for Convex Optimization,” hereinafter Asi). 

Regarding claim 11, the combination of Zhou and Arora teaches [t]he method of claim 1 (and thus the rejection of claim 1 is incorporated). 
Zhou does not explicitly teach wherein the machine learning model comprises a language model. However, Asi, in the area of differentially private gradient descent algorithms, teaches this limitation (Asi, 6. Experiments, pp. 7, col. 1; “We perform experiments both on synthetic data, where we may control all aspects of the experiment, and a real-world example training large-scale private language models”). 
Asi is analogous to the claimed invention as both are from the same field of endeavor, that is, differentially private gradient descent algorithms. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to train the large-scale private language models of Asi using the combined differentially private mirror descent algorithm of Zhou and Arora. The motivation to do so is to guarantee privacy in natural language processing applications. 

Claim 23 is a system claim corresponding to the steps of claim 11 and is therefore rejected for the same reasons. 

Claims 12 and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Zhou in view of Arora in further view of Jiang et al. (“A GDPR-compliant Ecosystem for Speech Recognition with Transfer, Federated, and Evolutionary Learning,” hereinafter Jiang). 
Regarding claim 12, the combination of Zhou and Arora teaches [t]he method of claim 1 (and thus the rejection of claim 1 is incorporated). 
Zhou does not explicitly teach wherein the machine learning model comprises a speech recognition model. However, Jiang, in the area of differentially private speech recognition machine learning methods, teaches this limitation (Jiang, 1 Introduction, pp. 2, paragraph 3; “In the proposed framework, transfer learning is responsible for tuning a highly customized ASR (Automated Speech Recognition) system for the client and overcoming the performance degrade caused by the “one-size-fit-all” LM (language model) and AM (acoustic model). Federated learning bridges the gap of information flow between the clients and the vendor in a privacy-preserving way.”). 
Jiang is analogous to the claimed invention as both are from the same field of endeavor, that is, differentially private federated learning. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to train the automated speech recognition model of Jiang using the combined differentially private mirror descent algorithm of Zhou and Arora. The motivation to do so is to guarantee the privacy of clients’ speech data in line with legal regulations while maintaining sufficient accuracy (Jiang, 1 Introduction, pp. 2, paragraph 2; “Considering that strict data regulations such as the European Union’s General Data Protection Regulation (GDPR) [72] has come into effect, such privacy-violating practice becomes illegal for real-life applications. Meanwhile, since the amount and diversity of speech data utilized for training the ASR system is critical for the performance of AM, the speech data stored on clients’ machines is invaluable resources for ASR vendors to further refine their ASR systems.”; Jiang, 1 Introduction, pp. 2, paragraph 3; “With differential privacy, the perturbed version of tuned AM works as a compact and secure proxy of clients’ data and are communicated between the clients and the vendor.”). 
Claim 24 is a system claim corresponding to the steps of claim 12 and is therefore rejected for the same reasons. 

Response to Arguments
Applicant's arguments filed 10/13/2025 have been fully considered but they are not persuasive. The Applicant states that “The rejection errs by focusing on the individual mathematical aspects of the steps (e.g., "calculating gradients," "executing a mirror descent algorithm") in isolation. The proper analysis requires determining the "character of the claim" as a whole by considering the specific, integrated steps. The amended claim, in particular, contains limitations that move the invention beyond mere abstract calculation into a non-abstract, technical process…” (pages 7-8). The examiner disagrees. The various limitations are mathematical calculations, which do not amount to significantly more than the mathematical operations or abstract idea. The training of a machine learning model amounts to the application of a computer to the abstract mathematical calculations.

Regarding claim 1, Applicant indicates that “Zhou and Arora, whether taken alone or in combination, fails to disclose or suggest training a machine learning model based on the reshaped set of DP gradients to ensure a population risk guarantee for convex losses with no explicit dependence on a dimension of the machine learning model. Namely, as Zhou and Arora are each directed toward DP-SGD variants, both Zhou and Arora are hindered with polynomial dependence on model dimensionality caused by DP-SGD.” (page 10). The examiner disagrees because Zhou teaches convergence for convex and non-convex optimization (Zhou, pp. 3, paragraph 4; “Building on the reconstruction error bound, we provide convergence and sample complexity results for our method PDP-SGD for solving Empirical risk minimization (ERM) in two types of loss functions, including 1) smooth and non-convex, 2) Lipschitz convex.”). Zhou also discloses circumventing the dependence on the ambient dimension by leveraging a low-dimensional structure of gradient space in deep networks (Zhou, abstract).

Claim 13 is similar to claim 1 and is likewise rejected.

Claims 7 and 19 are rejected at least based on the rejection of independent claims 1 and 13, as indicated above.

Claims 8-9 and 20-21 are rejected at least based on the rejection of independent claims 1 and 13, as indicated above.

Claims 12 and 24 are rejected at least based on the rejection of independent claims 1 and 13, as indicated above.

Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CESAR PAULA whose telephone number is (571)272-4128. The examiner can normally be reached Monday - Friday, 6:30 am - 4:30 pm ET. 
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Wiley, can be reached at (571)272-3923. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. 
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/CESAR B PAULA/Supervisory Patent Examiner, Art Unit 2145
Prosecution Timeline

Oct 04, 2022
Application Filed
Jul 30, 2025
Non-Final Rejection mailed — §101, §103
Oct 13, 2025
Response Filed
May 05, 2026
Final Rejection mailed — §101, §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/706,665
Patent 12596934
PREDICTION-MODEL-BUILDING METHOD, STATE PREDICTION METHOD AND DEVICES THEREOF
4y 0m to grant Granted Apr 07, 2026
17/450,353
Patent 12585982
MODEL MANAGEMENT USING CONTAINERS
4y 5m to grant Granted Mar 24, 2026
19/011,578
Patent 12585859
SYSTEM AND METHOD FOR IMPROVING THE CLARITY OF OVERLAPPING OBJECTS
1y 2m to grant Granted Mar 24, 2026
17/245,892
Patent 12579439
Kernelized Classifiers in Neural Networks
4y 10m to grant Granted Mar 17, 2026
17/699,489
Patent 12554971
METHOD OF PREDICTING CHARACTERISTICS OF SEMICONDUCTOR DEVICE AND COMPUTING DEVICE PERFORMING THE SAME
3y 11m to grant Granted Feb 17, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

3-4
Expected OA Rounds
34%
Grant Probability
41%
With Interview (+7.3%)
4y 6m (~10m remaining)
Median Time to Grant
Moderate
PTA Risk
Based on 172 resolved cases by this examiner. Grant probability derived from career allowance rate.