Prosecution Insights
Last updated: April 19, 2026
Application No. 17/715,768

Shuffled Secure Multiparty Deep Learning

Non-Final OA: §101, §103
Filed: Apr 07, 2022
Examiner: BOSTWICK, SIDNEY VINCENT
Art Unit: 2124
Tech Center: 2100 — Computer Architecture & Software
Assignee: Micron Technology, Inc.
OA Round: 3 (Non-Final)
Grant Probability: 52% (Moderate)
Expected OA Rounds: 3-4
Time to Grant: 4y 7m
With Interview: 90%

Examiner Intelligence

Career Allow Rate: 52% (71 granted / 136 resolved; -2.8% vs TC avg)
Interview Lift: +38.2% (strong; resolved cases with interview vs. without)
Typical Timeline: 4y 7m average prosecution; 68 applications currently pending
Career History: 204 total applications across all art units

Statute-Specific Performance

§101: 24.4% (-15.6% vs TC avg)
§103: 40.9% (+0.9% vs TC avg)
§102: 12.0% (-28.0% vs TC avg)
§112: 21.9% (-18.1% vs TC avg)
Tech Center averages are estimates • Based on career data from 136 resolved cases

Office Action

§101 §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/3/2025 has been entered.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on December 26, 2025 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Remarks

This Office Action is responsive to Applicants' Amendment filed on December 3, 2025, in which claim 1 is currently amended. Claims 1-20 are currently pending.

Response to Arguments

Applicant's arguments with respect to the rejection of claims 1-20 under 35 U.S.C. 103 based on the amendment have been considered but are not persuasive. With respect to Applicant's argument on p. 1 of the Remarks submitted 12/3/2025 that it is erroneous to read the "data sample" recited in the claims in a "vacuum" by asserting that "[t]here is nothing in the claims that limits 'data sample'", Examiner respectfully disagrees. There is nothing in the instant specification that explicitly limits "data sample" such that it would be unreasonable to interpret the model parameters in Radhakrishnan as "data samples". Radhakrishnan explicitly discloses that the local model is stored in memory, such that one of ordinary skill in the art would recognize that the trained local model is objectively data, the model parameters being a subset (samples) of that data, which is highly consistent with the ordinary definition of "data sample". Examiner notes MPEP 2111(II), "IT IS IMPROPER TO IMPORT CLAIM LIMITATIONS FROM THE SPECIFICATION": "Though understanding the claim language may be aided by explanations contained in the written description, it is important not to import into a claim limitations that are not part of the claim. For example, a particular embodiment appearing in the written description may not be read into a claim when the claim language is broader than the embodiment." For this reason, Examiner also asserts that the interpretation does not modify the disclosure of Radhakrishnan whatsoever but simply relies on the ordinary definition of a "data sample" as it would be understood by one of ordinary skill in the art. Examiner notes that claims 12, 17, and 18 are of significantly different scope than claim 1.

Applicant's arguments with respect to the rejection of claims 1-20 under 35 U.S.C. 101 based on the amendment have been considered but are not persuasive. With respect to Applicant's arguments on pp. 3-5 of the Remarks submitted 12/3/2025 that the claims integrate the judicial exception into a practical application, Examiner respectfully disagrees. The claims as a whole are directed to a judicial exception of mapping and shuffling data, which can readily and practically be performed entirely in the mind, with or without the assistance of tools such as pen and paper.
Examiner notes MPEP 2106.07(a)(II): "employing well-known computer functions to execute an abstract idea, even when limiting the use of the idea to one particular environment, does not integrate the exception into a practical application". With respect to Applicant's argument on p. 4 of the Remarks submitted 12/3/2025 that "a person of ordinary skill in the relevant field would recognize that the subject matter of claim 1 recites a technique for data privacy protection in collaborative processing by a recited 'computing device' and a recited 'first entity'", Examiner respectfully disagrees. First, a "first entity" is a nonce term which is not explicitly limited in the instant claims or specification. Second, nothing in the instant claims explicitly limits them to the field of data privacy protection. Even if the claims were so limited, the claims as a whole can readily be performed entirely in the mind, such that the recitation of generic computer components amounts to mere instructions to apply the judicial exception using generic computer components rather than a technique for improving the technology itself. Examiner notes MPEP 2106.05(a): "It is important to note, the judicial exception alone cannot provide the improvement. The improvement can be provided by one or more additional elements. [...] An important consideration in determining whether a claim improves technology is the extent to which the claim covers a particular solution to a problem or a particular way to achieve a desired outcome, as opposed to merely claiming the idea of a solution or outcome." For at least these reasons and those further detailed below, Examiner asserts that it is reasonable and appropriate to maintain the rejection under 35 U.S.C. 101.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. § 101 because the claimed invention is directed to non-statutory subject matter.

Regarding Claim 1: Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1 Analysis: Claim 1 is directed to a method, which is a process, one of the statutory categories.

Step 2A Prong One Analysis: Claim 1 under its broadest reasonable interpretation recites a series of mental processes.
For example, but for the generic computer components language, the limitations in the context of this claim encompass neural network processing, including the following:

- generating, […], a plurality of first parts from a first data sample of deep learning (observation, evaluation, and judgement);
- generating, […], a plurality of second parts from a second data sample of deep learning (observation, evaluation, and judgement);
- shuffling, […] according to a map, at least the first parts and the second parts to mix parts generated from the first data sample and the second data sample (observation, evaluation, and judgement);
- communicating, […] to a first entity, third parts to request the first entity to apply a same operation of computing to each of the third parts, the third parts identified according to the map to include a first subset from the first parts and a second subset from the second parts (observation, evaluation, and judgement);
- generating, […] based at least in part on the third results and the map, a first result of applying the same operation to the first data sample and a second result of applying the same operation to the second data sample (observation, evaluation, and judgement).

Therefore, claim 1 recites an abstract idea, which is a judicial exception.

Step 2A Prong Two Analysis: Claim 1 recites the additional element "by a computing device". However, this additional feature is a computer component recited at a high level of generality, such that it amounts to no more than mere instructions to apply the judicial exception using a generic computer component. An additional element that merely recites the words "apply it" (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application. Claim 1 also recites the additional element "receiving, by the computing device from the first entity, third results of applying the same operation to the third parts respectively", which amounts to gathering data, insignificant extra-solution activity that does not integrate the judicial exception into a practical application (see MPEP 2106.05(g)). Therefore, claim 1 is directed to a judicial exception.

Step 2B Analysis: Claim 1 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 1 amount to no more than mere instructions to apply the judicial exception using a generic computer component and insignificant extra-solution activity. The gathering of data is considered well-understood, routine, and conventional in the art (see MPEP 2106.05(d)(II)). For the reasons above, claim 1 is rejected as being directed to non-patentable subject matter under §101. This rejection applies equally to dependent claims 2-11.
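For orientation, the claim 1 flow that the rejection characterizes as mental steps reads on an additive secret-sharing pattern once the dependent claims are folded in. Below is a minimal NumPy sketch under assumptions the claims only reach in dependent form: parts are random shares that sum to the sample (claim 3), at least one part of each sample is withheld from the first entity (claims 2 and 15), and the "same operation" is linear (here a matrix multiply) so per-part results recombine by summation (claims 5 and 14). All names, shapes, and values are illustrative, not the application's actual embodiment.

```python
import numpy as np

rng = np.random.default_rng(0)

def split(sample, n_parts):
    # Claim 3 reading: all but one part are random numbers; the last part is
    # the residual, so the parts sum exactly to the data sample.
    parts = [rng.standard_normal(sample.shape) for _ in range(n_parts - 1)]
    parts.append(sample - sum(parts))
    return parts

# Two "data samples of deep learning" and a linear "same operation"
# (a weight-matrix multiply); both are assumptions for illustration.
x1, x2 = rng.standard_normal(4), rng.standard_normal(4)
W = rng.standard_normal((3, 4))

def op(part):
    return W @ part

pool = split(x1, 3) + split(x2, 3)   # first parts: pool[0:3]; second parts: pool[3:6]
mapping = rng.permutation(6)         # the "map" used to shuffle the mixed parts

# Withhold one part of each sample (claims 2, 15, 20): the first entity can
# never reconstruct x1 or x2 from what it receives.
keep1, keep2 = int(rng.integers(3)), 3 + int(rng.integers(3))

# "Third parts": a subset from the first parts and a subset from the second
# parts, identified according to the map, communicated to the first entity.
sent = [i for i in mapping.tolist() if i not in (keep1, keep2)]
third_parts = [pool[i] for i in sent]

# The first entity applies the same operation to each third part.
third_results = [op(p) for p in third_parts]

# The computing device handles the withheld parts itself, then uses the map
# to group per-part results by sample and sum them (claims 5 and 14).
results = dict(zip(sent, third_results))
results[keep1], results[keep2] = op(pool[keep1]), op(pool[keep2])
first_result = sum(results[i] for i in range(3))
second_result = sum(results[i] for i in range(3, 6))

assert np.allclose(first_result, op(x1)) and np.allclose(second_result, op(x2))
```

Because the operation is linear, the first entity's per-part results recombine exactly; a nonlinear operation would break the summation recited in claims 5 and 14.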
The additional limitations of the dependent claims are addressed briefly below:

- Dependent claim 2 recites additional insignificant extra-solution activity of gathering and outputting data ("communicate to the first entity parts from the first parts and the second parts but without communicating to the first entity at least one of the first parts and at least one of the second parts"), which is well-understood, routine, and conventional in the art (see MPEP 2106.05(d)(II)).
- Dependent claim 3 recites additional observation, evaluation, and judgement ("wherein each of the first parts is based on random numbers; and a sum of the first parts is equal to the first data sample").
- Dependent claim 4 recites additional observation, evaluation, and judgement ("wherein the generating of the plurality of first parts includes generating a set of random numbers as one of the plurality of first parts").
- Dependent claim 5 recites additional observation, evaluation, and judgement ("identifying, by the computing device according to the map, fourth results of applying the same operation to the first parts respectively; and summing, by the computing device, the fourth results to obtain the first result").
- Dependent claim 6 recites additional observation, evaluation, and judgement ("communicating, […], the at least one of the first parts to request the second entity to apply the same operation of computing to each of the at least one of the first parts" and "wherein the first result is generated based on the respective at least one result of applying the same operation to the at least one of the first parts"), as well as additional insignificant extra-solution activity of gathering data ("receiving, by the computing device from the second entity, respective at least one result of applying the same operation to the at least one of the first parts"), which is well-understood, routine, and conventional in the art.
- Dependent claim 7 recites additional observation, evaluation, and judgement ("wherein the first parts are provided at a same precision level as the first data sample").
- Dependent claim 8 recites additional observation, evaluation, and judgement ("wherein each respective data item in the first data sample has a corresponding data item in each of the first parts; and the respective data item and the corresponding data item are specified via a same number of bits").
- Dependent claim 9 recites additional elements ("wherein the same operation is representative of a computation in an artificial neural network") which amount to generally linking the judicial exception to a particular field or technology (see MPEP 2106.05(h)).
- Dependent claim 10 recites additional elements ("the same operation is configured to be performed via multiply-accumulate units") which amount to instructions to apply the judicial exception using generic computer components.
- Dependent claim 11 recites additional observation, evaluation, and judgement ("generating, […] from a description of a first artificial neural network, a description of a second artificial neural network describing the same operation to be performed using a deep learning accelerator of the first entity").

Regarding Claim 12: Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1 Analysis: Claim 12 is directed to a device, which is a product, one of the statutory categories.

Step 2A Prong One Analysis: Claim 12 under its broadest reasonable interpretation recites a series of mental processes.
For example, but for the generic computer components language, the limitations in the context of this claim encompass neural network processing, including the following:

- generate a plurality of first parts from a first data sample (observation, evaluation, and judgement);
- generate a plurality of second parts from a second data sample (observation, evaluation, and judgement);
- shuffle, according to a map, at least the first parts and the second parts to mix parts generated from the first data sample and the second data sample (observation, evaluation, and judgement);
- communicate, to a first entity, third parts to request the first entity to apply a same operation of computing to each of the third parts, the third parts identified according to the map to include a first subset from the first parts and a second subset from the second parts (observation, evaluation, and judgement).

Therefore, claim 12 recites an abstract idea, which is a judicial exception.

Step 2A Prong Two Analysis: Claim 12 recites the additional elements "memory" and "at least one microprocessor coupled to the memory and configured via instructions to". However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component. An additional element that merely recites the words "apply it" (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application. Therefore, claim 12 is directed to a judicial exception.

Step 2B Analysis: Claim 12 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 12 amount to no more than mere instructions to apply the judicial exception using a generic computer component. For the reasons above, claim 12 is rejected as being directed to non-patentable subject matter under §101. This rejection applies equally to dependent claims 13-16.

The additional limitations of the dependent claims are addressed briefly below:

- Dependent claim 13 recites additional insignificant extra-solution activity of gathering and outputting data ("receive, from the first entity, third results of applying the same operation to the third parts respectively"), which is well-understood, routine, and conventional in the art (see MPEP 2106.05(d)(II)), as well as additional observation, evaluation, and judgement ("generate, based at least in part on the third results and the map, a first result of applying the same operation to the first data sample and a second result of applying the same operation to the second data sample").
- Dependent claim 14 recites additional observation, evaluation, and judgement ("wherein the first data sample is equal to a sum of the first parts; and the first result is generated from a sum of results of applying the same operation to the first parts respectively").
- Dependent claim 15 recites additional observation, evaluation, and judgement ("to exclude communication of at least one of the first parts and at least one of the second parts to the first entity").
- Dependent claim 16 recites additional observation, evaluation, and judgement ("to generate, for each respective data item in the first data sample, a corresponding data item in each of the first parts; and the respective data item and the corresponding data item are specified via a same number of bits").

Regarding Claim 17: Claim 17 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1 Analysis: Claim 17 is directed to a device, which is a product, one of the statutory categories.

Step 2A Prong One Analysis: Claim 17 under its broadest reasonable interpretation recites a series of mental processes. For example, but for the generic computer components language, the limitations in the context of this claim encompass neural network processing, including the following: generating, based at least in part on the third results and a map used to shuffle the first parts and the second parts, a first result of applying the same operation to the first data sample and a second result of applying the same operation to the second data sample (observation, evaluation, and judgement). Therefore, claim 17 recites an abstract idea, which is a judicial exception.

Step 2A Prong Two Analysis: Claim 17 recites the additional elements "A non-transitory computer storage medium storing instructions which, when executed in a computing device, cause the computing device to perform a method, comprising". However, these additional features are computer components recited at a high level of generality, such that they amount to no more than mere instructions to apply the judicial exception using a generic computer component. An additional element that merely recites the words "apply it" (or an equivalent) with the judicial exception, or merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, does not integrate the judicial exception into a practical application. Claim 17 also recites the additional element "receiving, from a first entity, third results of applying a same operation to third parts respectively, wherein the first entity is provided with the third parts selected from a shuffled collection of first parts generated from a first data sample and second parts generated from a second data sample", which amounts to gathering data, insignificant extra-solution activity (see MPEP 2106.05(g)). Therefore, claim 17 is directed to a judicial exception.

Step 2B Analysis: Claim 17 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the lack of integration of the abstract idea into a practical application, the additional elements recited in claim 17 amount to no more than mere instructions to apply the judicial exception using a generic computer component and insignificant extra-solution activity. The gathering of data is considered well-understood, routine, and conventional in the art (see MPEP 2106.05(d)(II)). For the reasons above, claim 17 is rejected as being directed to non-patentable subject matter under §101. This rejection applies equally to dependent claims 18-20.
The additional limitations of the dependent claims are addressed briefly below:

- Dependent claim 18 recites additional observation, evaluation, and judgement ("generating the first parts from the first data sample and random numbers", "generating the second parts from the second data sample and random numbers", "shuffling, according to the map, at least the first parts and the second parts to mix parts generated from the first data sample and the second data sample", and "communicating, to the first entity, the third parts to request the first entity to apply the same operation of computing to each of the third parts, the third parts identified according to the map to include a first subset from the first parts and a second subset from the second parts").
- Dependent claim 19 recites additional observation, evaluation, and judgement ("wherein the first data sample is equal to a sum of the first parts; and the first result is generated from a sum of results of applying the same operation to the first parts respectively").
- Dependent claim 20 recites additional observation, evaluation, and judgement ("excluding at least one of the first parts and at least one of the second parts from being communicated to the first entity").

Therefore, when considering the elements separately and in combination, they do not add significantly more to the inventive concept. Accordingly, claims 1-20 are rejected under 35 U.S.C. § 101.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 2, 12, 13, 14, 15, 17, 18, 19, and 20 are rejected under 35 U.S.C. §103 as being unpatentable over Radhakrishnan (US20220374762A1).

[Image: FIG. 7 of US20220374762A1]
Regarding claim 1, Radhakrishnan teaches a method, comprising:

generating, by a computing device, a plurality of first parts from a first data sample of deep learning ([¶0002] "Federated learning (FL) provides a collaborative training mechanism, which allows multiple parties to build a machine learning (ML) model together." See FIG. 7, elements 1-4, interpreted as first parts from a first data sample of a trained local deep learning model.);

generating, by the computing device, a plurality of second parts from a second data sample of deep learning (see FIG. 7, elements k to k-4, interpreted as second parts from a second data sample);

shuffling, by the computing device according to a map, at least the first parts and the second parts to mix parts generated from the first data sample and the second data sample ([Abstract] "By using multiple decentralized aggregators, parties are enabled to partition their respective model updates at model-parameter granularity, and can map single weights to a specific aggregator entity. Parties also can dynamically shuffle fragmentary model updates at each training iteration to further obfuscate the information dispatched to each aggregator execution entity");

communicating, by the computing device to a first entity, third parts to request the first entity to apply a same operation of computing to each of the third parts, the third parts identified according to the map to include a first subset from the first parts and a second subset from the second parts (see FIG. 7, where aggregator 1 (interpreted as the first entity) receives elements 2 and k-1 from the first and second data sample to perform fusion on);

receiving, by the computing device from the first entity, third results of applying the same operation to the third parts respectively; and generating, by the computing device based at least in part on the third results and the map, a first result of applying the same operation to the first data sample and a second result of applying the same operation to the second data sample ([¶0089] "As also shown, the model updates are disassembled and rearranged for different aggregators (step (4) in FIG. 6) to generate the shuffled partitions. The shuffled partitions are then uploaded to the respective aggregators and the fusion is carried out to generate the aggregated partitions. After parties receive aggregated model updates from the different aggregators, they reversely shuffle the aggregated model update to the correct order. The same mapper 710 is then queried again to merge model updates to original positions within the local model (step (5) in FIG. 6). In FIG. 7, only one local model (trained and then merged) is depicted, but each of the parties has its own such local model constructs." See FIG. 7, merge step, where the mapper receives results from the aggregators.).

While Radhakrishnan does not explicitly teach that the mapper performs the shuffling, it would be obvious in view of FIGS. 6 and 7 that the mapper is used, or can be used, for the shuffling, which occurs prior to upload to the aggregator.

Regarding claim 2, Radhakrishnan teaches the method of claim 1, wherein the computing device is configured to communicate to the first entity parts from the first parts and the second parts but without communicating to the first entity at least one of the first parts and at least one of the second parts (Radhakrishnan, see FIG. 7, where aggregator 1 receives elements 2 and k-1 from the first and second data sample to perform fusion on but does not receive parts k and 3, for example).
Regarding claim 12, Radhakrishnan teaches a computing device, comprising: memory; and at least one microprocessor coupled to the memory and configured via instructions ([¶0025] "Processor unit 204 serves to execute instructions for software that may be loaded into memory 206. Processor unit 204 may be a set of one or more processors or may be a multi-processor core, depending on the particular implementation. Further, processor unit 204 may be implemented using one or more heterogeneous processor systems in which a main processor is present with secondary processors on a single chip. As another illustrative example, processor unit 204 may be a symmetric multi-processor (SMP) system containing multiple processors of the same type.") to:

generate a plurality of first parts from a first data sample (see FIG. 7, elements 1-4, interpreted as first parts from a first data sample);

generate a plurality of second parts from a second data sample (see FIG. 7, elements k to k-4, interpreted as second parts from a second data sample);

shuffle, according to a map, at least the first parts and the second parts to mix parts generated from the first data sample and the second data sample ([Abstract], quoted above for claim 1); and

communicate, to a first entity, third parts to request the first entity to apply a same operation of computing to each of the third parts, the third parts identified according to the map to include a first subset from the first parts and a second subset from the second parts (see FIG. 7, where aggregator 1 (interpreted as the first entity) receives elements 2 and k-1 from the first and second data sample to perform fusion on).

The mapper-shuffling observation made for claim 1 applies equally here.

Regarding claim 13, Radhakrishnan teaches the computing device of claim 12, wherein the at least one microprocessor is further configured via the instructions to: receive, from the first entity, third results of applying the same operation to the third parts respectively; and generate, based at least in part on the third results and the map, a first result of applying the same operation to the first data sample and a second result of applying the same operation to the second data sample (Radhakrishnan [¶0089], quoted above for claim 1; see FIG. 7, merge step, where the mapper receives results from the aggregators).
Regarding claim 14, Radhakrishnan teaches the computing device of claim 13, wherein the first data sample is equal to a sum of the first parts (Radhakrishnan, see FIG. 7: the first data sample is made entirely of the first parts), and the first result is generated from a sum of results of applying the same operation to the first parts respectively (Radhakrishnan [¶0067] "The aggregator computes the gradient sum of all parties and let the parties synchronize their model parameters").

Regarding claim 15, Radhakrishnan teaches the computing device of claim 14, wherein the at least one microprocessor is further configured via the instructions to exclude communication of at least one of the first parts and at least one of the second parts to the first entity (Radhakrishnan, see FIG. 7, where aggregator 1 receives elements 2 and k-1 from the first and second data sample to perform fusion on but does not receive parts k and 3, for example).

Regarding claim 17, Radhakrishnan teaches a non-transitory computer storage medium storing instructions which, when executed in a computing device, cause the computing device to perform a method ([¶0025], quoted above for claim 12), comprising: receiving, from a first entity, third results of applying a same operation to third parts respectively, wherein the first entity is provided with the third parts selected from a shuffled collection of first parts generated from a first data sample and second parts generated from a second data sample; and generating, based at least in part on the third results and a map used to shuffle the first parts and the second parts, a first result of applying the same operation to the first data sample and a second result of applying the same operation to the second data sample (Radhakrishnan [¶0089], quoted above for claim 1; see FIG. 7, merge step, where the mapper receives results from the aggregators). The mapper-shuffling observation made for claim 1 applies equally here.
Regarding claim 18, Radhakrishnan teaches the non-transitory computer storage medium of claim 17, wherein the method further comprises: generating the first parts from the first data sample and random numbers (Radhakrishnan, see FIG. 7, elements 1-4, interpreted as first parts from a first data sample); generating the second parts from the second data sample and random numbers (Radhakrishnan, see FIG. 7, elements k to k-4, interpreted as second parts from a second data sample); shuffling, according to the map, at least the first parts and the second parts to mix parts generated from the first data sample and the second data sample (Radhakrishnan [Abstract], quoted above for claim 1); and communicating, to the first entity, the third parts to request the first entity to apply the same operation of computing to each of the third parts, the third parts identified according to the map to include a first subset from the first parts and a second subset from the second parts (Radhakrishnan, see FIG. 7, where aggregator 1 (interpreted as the first entity) receives elements 2 and k-1 from the first and second data sample to perform fusion on).

Regarding claim 19, Radhakrishnan teaches the non-transitory computer storage medium of claim 18, wherein the first data sample is equal to a sum of the first parts (Radhakrishnan, see FIG. 7: the first data sample is made entirely of the first parts), and the first result is generated from a sum of results of applying the same operation to the first parts respectively (Radhakrishnan [¶0067] "The aggregator computes the gradient sum of all parties and let the parties synchronize their model parameters").

Regarding claim 20, Radhakrishnan teaches the non-transitory computer storage medium of claim 19, wherein the method further comprises: excluding at least one of the first parts and at least one of the second parts from being communicated to the first entity (Radhakrishnan, see FIG. 7, where aggregator 1 receives elements 2 and k-1 from the first and second data sample to perform fusion on but does not receive parts k and 3, for example).
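For context on this mapping, the Radhakrishnan passages quoted above describe parties partitioning their own model updates at parameter granularity, routing each fragment to one of several aggregators according to a shared map, and reverse-shuffling the fused fragments back into position. A rough sketch of that FIG. 6/7 flow as the rejection characterizes it; the sum-based fusion (following the gradient-sum language of ¶0067) and all names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
K, N_AGG = 6, 3   # parameters per local model; number of aggregators

# Each party's trained local-model update (three parties, illustrative values).
updates = [rng.standard_normal(K) for _ in range(3)]

# The mapper's map for this training iteration: every parameter index is
# assigned to one aggregator, and all parties share the same assignment.
mapping = rng.integers(N_AGG, size=K)

# Step (4): each party disassembles its update into per-aggregator fragments,
# so no aggregator ever receives a party's complete model update.
inboxes = [[] for _ in range(N_AGG)]
for update in updates:
    for a in range(N_AGG):
        inboxes[a].append(update[mapping == a])

# Fusion at each aggregator over only the fragments it received.
fused = [np.sum(np.stack(frags), axis=0) for frags in inboxes]

# Step (5): parties query the same mapper to reverse-shuffle the fused
# fragments back to their original positions in the local model.
merged = np.empty(K)
for a in range(N_AGG):
    merged[mapping == a] = fused[a]

assert np.allclose(merged, np.sum(updates, axis=0))
```

Note the contrast with the sketch given for claim 1: here each fragment is a slice of one party's real parameters, not a random additive share of a data sample.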
Claims 3, 4, 5, 6, and 9 are rejected under 35 U.S.C. §103 as being unpatentable over the combination of Radhakrishnan and Fan ("Learning From Pseudo-Randomness With an Artificial Neural Network—Does God Play Pseudo-Dice?", 2018).

Regarding claim 3, Radhakrishnan teaches that a sum of the first parts is equal to the first data sample (Radhakrishnan, see FIG. 7). However, Radhakrishnan doesn't explicitly teach the method of claim 2, wherein each of the first parts is based on random numbers. Fan, in the same field of endeavor, teaches this limitation ([p. 22988 §IIA] "FIGURE 2. Pseudo-randomness of π' [...] In our study, we cast this exemplary question into a binary version, which is even more challenging. Specifically, we binarized the sequence of π with the threshold 5 to obtain a 0-1 sequence π', as shown in Figure 1. Then, we used an artificial neural network with the configuration of 6-30-20-1 to predict the seventh digit from its precedent consecutive six numbers. The training dataset consisted of 40,000 instances made from the 1st to 40,007th digits of π'"). Radhakrishnan as well as Fan are directed towards neural network training. Therefore, Radhakrishnan as well as Fan are reasonably pertinent analogous art.

It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Radhakrishnan with the teachings of Fan by using pseudo-random numbers as a training dataset for the machine learning model. Fan provides as additional motivation for combination ([p. 22991 §III] "A major point we want to make is that the use of a neural network in extremely uncertain environments such as with pseudo-randomness can be interesting and promising. Our numerical simulation suggests a non-trivial extension of what machine learning can do. More powerful architectures of neural networks, such as GAN and RNN, could bring more insight into pseudo-randomness"). This motivation for combination also applies to the remaining claims which depend on this combination.

Regarding claim 4, the combination of Radhakrishnan and Fan teaches the method of claim 3, wherein the generating of the plurality of first parts includes generating a set of random numbers as one of the plurality of first parts (Fan [p. 22988 §IIA], quoted above for claim 3).

Regarding claim 5, the combination of Radhakrishnan and Fan teaches the method of claim 3, wherein the generating of the first result includes: identifying, by the computing device according to the map, fourth results of applying the same operation to the first parts respectively; and summing, by the computing device, the fourth results to obtain the first result (Radhakrishnan [¶0067] "The aggregator computes the gradient sum of all parties and let the parties synchronize their model parameters").

Regarding claim 6, the combination of Radhakrishnan and Fan teaches the method of claim 5, further comprising: communicating, by the computing device to a second entity, the at least one of the first parts to request the second entity to apply the same operation of computing to each of the at least one of the first parts (Radhakrishnan, see FIG. 7, where aggregator 2 (interpreted as the second entity) receives elements 2 and k-1 from the first and second data sample to perform fusion on); and receiving, by the computing device from the second entity, respective at least one result of applying the same operation to the at least one of the first parts; wherein the first result is generated based on the respective at least one result of applying the same operation to the at least one of the first parts (Radhakrishnan [¶0089], quoted above for claim 1; see FIG. 7, merge step, where the mapper receives results from the aggregators).
Regarding claim 9, the combination of Radhakrishnan and Fan teaches the method of claim 6, wherein the same operation is representative of a computation in an artificial neural network (Radhakrishnan [¶0006] "According to this disclosure, a federated learning system and method for neural network training to defend against privacy leakage and data reconstruction attacks is described"; [¶0067] "The aggregator computes the gradient sum of all parties and let the parties synchronize their model parameters").

Claims 7 and 8 are rejected under 35 U.S.C. §103 as being unpatentable over the combination of Radhakrishnan, Fan, and Tang ("How to Train a Compact Binary Neural Network with High Accuracy?", 2017).

Regarding claim 7, the combination of Radhakrishnan and Fan teaches the method of claim 6. However, the combination of Radhakrishnan and Fan doesn't explicitly teach wherein the first parts are provided at a same precision level as the first data sample. Tang, in the same field of endeavor, teaches this limitation ([p. 2625] "binarizes both the weights and activations of a full precision neural network"; [p. 2626] "Here we mainly focus our discussion on methods in which both weights and activations are binary-valued, since this kind of network not only reduces the memory storage, but also is more computational efficient." Tang teaches that all the model weights in the network have a same number of bits (1).).

The combination of Radhakrishnan and Fan as well as Tang are directed towards neural network training. Therefore, the combination of Radhakrishnan and Fan as well as Tang are reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Radhakrishnan and Fan with the teachings of Tang by using binary weight networks. Tang provides as additional motivation for combination ([p. 2626], quoted above). This motivation for combination also applies to the remaining claims which depend on this combination.

Regarding claim 8, the combination of Radhakrishnan and Fan teaches the method of claim 6. However, the combination of Radhakrishnan and Fan doesn't explicitly teach wherein each respective data item in the first data sample has a corresponding data item in each of the first parts; and the respective data item and the corresponding data item are specified via a same number of bits. Tang, in the same field of endeavor, teaches these limitations ([p. 2626], quoted above for claim 7: all the model weights in the network have a same number of bits (1)).
The same analogous-art reasoning and motivation to combine set forth for claim 7 apply to claim 8.

Claims 10 and 11 are rejected under 35 U.S.C. §103 as being unpatentable over the combination of Radhakrishnan, Fan, and Kaul (US20180315399A1).

Regarding claim 10, the combination of Radhakrishnan and Fan teaches the method of claim 9. However, the combination of Radhakrishnan and Fan doesn't explicitly teach wherein the same operation is configured to be performed via multiply-accumulate units. Kaul, in the same field of endeavor, teaches this limitation ([¶0023] "FIG. 17A-17B illustrates logic units including merged computation circuits to perform floating point and integer fused-multiply accumulate operations, according to an embodiment"). The combination of Radhakrishnan and Fan as well as Kaul are directed towards distributed neural network training. Therefore, the combination of Radhakrishnan and Fan as well as Kaul are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Radhakrishnan and Fan with the teachings of Kaul by using a multiply-accumulate unit to perform the fusion function in Radhakrishnan. Kaul provides as additional motivation for combination ([¶0211] "Logic unit 1740 of FIG. 17B provides a merged floating-point/integer multiply-accumulate design with a local accumulator width that is twice as wide as the input operands. This enables much higher accumulation accuracy for operations like dot-products without impacting memory storage footprint of input operands and affects a small portion of the design for only 11% total area impact."). This motivation for combination also applies to the remaining claims which depend on this combination.

Regarding claim 11, the combination of Radhakrishnan, Fan, and Kaul teaches the method of claim 10, further comprising: generating, by the computing device from a description of a first artificial neural network, a description of a second artificial neural network describing the same operation to be performed using a deep learning accelerator of the first entity (Radhakrishnan [¶0066] "each party trains a local model with its private training data and uploads the model updates to a central server. Typically, the aggregator also is responsible for managing parties, orchestrating training tasks, and merging model updates. The aggregated model updates are dispatched to parties for synchronizing their local models (typically) after each training iteration"; the updated version of the artificial neural network is interpreted as a second artificial neural network).
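For context on the claim 10 mapping, the Kaul passage quoted above describes multiply-accumulate hardware whose local accumulator is twice the operand width. A minimal sketch of that general idea (not Kaul's circuit): a dot product as repeated multiply-accumulate steps, with int8 operands feeding an int32 accumulator; values are illustrative.

```python
import numpy as np

# int8 operands with an int32 accumulator: each step multiplies one operand
# pair and adds the product into the wider running accumulator, so the
# intermediate sums cannot overflow the narrow input type.
a = np.array([90, -120, 75, 33], dtype=np.int8)
b = np.array([101, 44, -87, 19], dtype=np.int8)

acc = np.int32(0)
for x, y in zip(a, b):
    acc += np.int32(x) * np.int32(y)   # one multiply-accumulate step

assert acc == int(np.dot(a.astype(np.int32), b.astype(np.int32)))
```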
Claim 16 is rejected under 35 U.S.C. §103 as being unpatentable over the combination of Radhakrishnan and Tang.

Regarding claim 16, Radhakrishnan teaches the computing device of claim 14, wherein the at least one microprocessor is further configured via the instructions to generate, for each respective data item in the first data sample, a corresponding data item in each of the first parts (Radhakrishnan [¶0089], quoted above for claim 1; see FIG. 7, merge step, where the mapper receives results from the aggregators). However, Radhakrishnan doesn't explicitly teach that the respective data item and the corresponding data item are specified via a same number of bits. Tang, in the same field of endeavor, teaches this limitation ([p. 2626], quoted above for claim 7: all the model weights in the network have a same number of bits (1)). Radhakrishnan as well as Tang are directed towards neural network training. Therefore, Radhakrishnan as well as Tang are reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Radhakrishnan with the teachings of Tang by using binary weight networks, with the same motivation for combination quoted above for claim 7. This motivation for combination also applies to the remaining claims which depend on this combination.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Zheng ("Towards Secure and Practical Machine Learning via Secret Sharing and Random Permutation", 2022) is directed towards shuffling deep learning data for privacy preservation. Cheu ("Distributed Differential Privacy via Shuffling", 2019) is also directed towards shuffling deep learning data for privacy preservation.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK, whose telephone number is (571) 272-4720. The examiner can normally be reached M-F 7:30am-5:00pm EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Miranda Huang, can be reached at (571) 270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SIDNEY VINCENT BOSTWICK/
Examiner, Art Unit 2124

/MIRANDA M HUANG/
Supervisory Patent Examiner, Art Unit 2124

Prosecution Timeline

Apr 07, 2022 • Application Filed
May 01, 2025 • Non-Final Rejection (§101, §103)
Aug 06, 2025 • Response Filed
Aug 27, 2025 • Final Rejection (§101, §103)
Nov 03, 2025 • Response after Non-Final Action
Dec 03, 2025 • Request for Continued Examination
Dec 10, 2025 • Response after Non-Final Action
Jan 02, 2026 • Non-Final Rejection (§101, §103) (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12561604 • SYSTEM AND METHOD FOR ITERATIVE DATA CLUSTERING USING MACHINE LEARNING • Granted Feb 24, 2026 (2y 5m to grant)
Patent 12547878 • Highly Efficient Convolutional Neural Networks • Granted Feb 10, 2026 (2y 5m to grant)
Patent 12536426 • Smooth Continuous Piecewise Constructed Activation Functions • Granted Jan 27, 2026 (2y 5m to grant)
Patent 12518143 • FEEDFORWARD GENERATIVE NEURAL NETWORKS • Granted Jan 06, 2026 (2y 5m to grant)
Patent 12505340 • STASH BALANCING IN MODEL PARALLELISM • Granted Dec 23, 2025 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 52%
With Interview: 90% (+38.2%)
Median Time to Grant: 4y 7m
PTA Risk: High
Based on 136 resolved cases by this examiner. Grant probability derived from career allow rate.
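These figures appear to fit together as simple arithmetic on the examiner's career data (an inference from the numbers shown, not documented by the tool): the base probability tracks the career allow rate, and the interview figure adds the lift in percentage points.

```python
granted, resolved = 71, 136
base = granted / resolved               # 0.522 -> displayed as 52%
interview_lift = 0.382                  # +38.2 percentage points
with_interview = base + interview_lift  # 0.904 -> displayed as 90%
print(f"base {base:.0%}, with interview {with_interview:.0%}")
```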
