Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Prior Art Availability and Incomplete IDS
On the IDS filed 9/21/2022, Applicant cites Assisted Learning and Imitation Privacy by Xian et al. and gives a publication date of 12/20/2020. The inventor was a co-author of this paper. The paper teaches claim elements as filed, and it was published to arXiv on 4/1/2020 and 5/31/2020; see the excerpt below from https://arxiv.org/abs/2004.00566v2.
[media_image1.png, greyscale: excerpt from https://arxiv.org/abs/2004.00566v2 showing the arXiv submission history]
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
Claims 9, 14 and 17-20 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 9 recites “creating, by the processing circuitry, a learner unit for the first agent by fitting, into a fitted label set, an initial label set based on the first machine learning model and the first feature set…” Creating a learner by fitting labels into a fitted label set is not described in the specification. It is not a term of art, and a person of ordinary skill in the art would not understand it.
Applicant goes on to claim “wherein the learner unit is configured with the first learning technique to generate the first label set to comprise the fitted label set, wherein the first machine learning model is configured to generate a mapping between the first feature set and the initial label set.” Claim 9. There is a section of the specification where Applicant fits “the label set… to a fitted label set…” Spec. para. 37. However, these two remaining clauses are not described as part of that step.
Claim 14 recites “centralized feature datasets or decentralized feature datasets.” This is not a term of art and it is not described in the specification.
Claim 17 recites, “map a first feature set to a first label set based on confidence scores for samples…” This is never described in the application.
Claim 18 recites “map the first feature set to the task label set based on the next iteration of the confidence scores and the second model weight.” This is never described in the specification.
Claim 19 recites “modifying a corresponding label of the first label set for a sample of the first feature set in satisfaction of a threshold.” This is never described in the specification.
Claim 20 recites “The method of claim 17, wherein non-transitory, computer-readable medium comprising executable instructions, which when executed by processing circuitry, cause a computing device to perform the steps of the method.” “The method” lacks sufficient antecedent basis because there is a method in both claim 1 and claim 17.
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
Claims 1-20 are rejected as failing to define the invention in the manner required by 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph.
The claims are replete with indefinite language. The steps and structures which make up the methods must be clearly and positively specified. The structure must be organized and correlated in such a manner as to present a complete operative device.
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 1, 4, 5 and 17 recite the limitation "the updated machine learning model". There is insufficient antecedent basis for this limitation in the claim. The Examiner assumes Applicant intends to claim the updated first machine learning model.
Claim 1 recites “configured to determine a second model weight for fitting, into the observed label set, the second label set…” It is unclear what fitting a second label set into an observed label set means.
Claim 1 recites “wherein a first set of sample weights correspond to training the first machine learning model…” It is unclear what it means for a set of sample weights to “correspond to” training a model; this is not a term of art.
Claim 7 recites “the sending and receiving”. Claim 1 recites multiple sending and receiving steps, so it is unclear which sending and which receiving are referenced. This renders the claim indefinite.
Claim 7 recites “out-sample error”. It is unclear which model’s out-sample error is calculated.
Claim 9 recites “creating, by the processing circuitry, a learner unit for the first agent by fitting, into a fitted label set, an initial label set based on the first machine learning model and the first feature set, wherein the learner unit is configured with the first learning technique to generate the first label set to comprise the fitted label set, wherein the first machine learning model is configured to generate a mapping between the first feature set and the initial label set.” This claim does not make sense as a whole, and in particular it does not make sense to create a learner unit by fitting an initial label set into a fitted label set. That is not a term of art, and a person of ordinary skill in the art would not understand it.
Claims 11-16 recite the limitation "the at least one other machine learning model". These claims are indefinite because Applicant goes on to claim many different machine learning models, such as the “second machine learning model” and the “trained machine learning model”. It is unclear which machine learning model is the “other” machine learning model in the context of multiple named machine learning models.
Claims 11-16 have the same problem with respect to “the at least one other agent”. It is unclear which agent is the “other” agent when several other named agents are recited.
Where applicant acts as his or her own lexicographer to specifically define a term of a claim contrary to its ordinary meaning, the written description must clearly redefine the claim term and set forth the uncommon definition so as to put one reasonably skilled in the art on notice that the applicant intended to so redefine that claim term. Process Control Corp. v. HydReclaim Corp., 190 F.3d 1350, 1357, 52 USPQ2d 1029, 1033 (Fed. Cir. 1999). The term “confidence scores” in claims 11-16 is used by the claims to mean “a set of sample weights,” while the accepted meaning is a number representing how accurate a prediction is. The term is indefinite because the specification does not clearly redefine it. If the confidence score is the same as the sample weights, it should be called sample weights. If the confidence score is some version of the sample weights, then that version needs to be defined. Otherwise, this claim limitation is indefinite in light of the accepted meaning of a confidence score.
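For illustration only, the following hypothetical Python sketch shows the accepted distinction the Examiner relies on: a confidence score is an output derived from a model's predictions, while a sample weight is an input to training. All names and values here are the Examiner's illustrative assumptions, not Applicant's definitions or Xian's notation.

```python
import numpy as np

# "Confidence score" in the accepted sense: a per-sample number expressing how
# sure the model is of its prediction, e.g., the top softmax probability.
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.1, 3.0]])                 # model outputs for two samples
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
confidence_scores = probs.max(axis=1)                # derived FROM the model's output

# "Sample weight" in the accepted sense: a per-sample number fed INTO training
# to emphasize or de-emphasize samples, e.g., boosting-style re-weighting.
misclassified = np.array([True, False])
sample_weights = np.ones(2)
sample_weights[misclassified] *= 2.0                 # upweight hard samples
sample_weights /= sample_weights.sum()               # input TO the next training pass
```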
Claim 14 recites “centralized feature datasets or decentralized feature datasets.” This is not a term of art and it is not defined in the specification. It is unclear what these feature datasets are.
Claim 17 recites “a second model weight for fitting, into the task label set, a dataset combining the second feature set, a second label set, and the second confidence scores…” It is unclear what it means to fit a dataset combining features, labels, and confidence scores into a label set.
Claim 18 recites “map the first feature set to a third label set based on the next iteration of the confidence scores and the second model weight.” The specification does not explain how to map based on a confidence score; a confidence score is typically derived from the mappings or labels, not the reverse.
Claim 19 recites, “modifying a corresponding label of the first label set for a sample of the first feature set in satisfaction of a threshold.” It is unclear what modifying a label “in satisfaction of a threshold” means.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Assisted Learning: A Framework for Multi-Organization Learning by Xian et al.
Xian teaches claim 1. A method comprising: by processing circuitry of a computing device, (Xian fig. 1) sending first statistical information from a first agent to a second agent in an architecture having at least two agents, (Xian p. 4 sec. D “a given learner, say Alice, by allowing its module, Ma = (Aa, XA) to exchange statistics with other m modules…”) wherein a first machine learning model is configured to map a first feature set to a first label set and the first agent is configured to train the first machine learning model to predict an observed label set, (Xian abs. “a learner Alice with supervised learning tasks…” Learning is training to map features to labels.) wherein a first set of sample weights correspond to training the first machine learning model, wherein the first set of sample weights determine a first model weight for fitting the first label set with the observed label set based on a first learning technique and the first machine learning model, wherein the first set of sample weights and the first model weight determine a second set of sample weights corresponding to training a second machine learning model at the second agent, wherein the first statistical information comprises the second set of sample weights,
wherein the second agent receives the first statistical information comprising the second set of sample weights and comprises the second machine learning model configured to map a second feature set to a second label set, (Xian p. 6 sec. E “If k is even, Alice will update her current weight wa,k to wa,k+1 as well as w˜k to w˜k+1. Bob will fix his weight for the next iteration, i.e. wb,k+1 = wb,k. If k is odd, Alice will fix her weight for next iteration, i.e. wa,k+1 = wa,k and updates w˜k to w˜k+1. Then she sends w˜k and other related information to Bob.”)
wherein the second agent is configured to determine a second model weight for fitting, into the observed label set, the second label set based on a second learning technique, the second machine learning model, and the first statistical information, wherein the second agent is further configured to update the first set of sample weights based on the second model weight and the second set of sample weights, wherein the second agent executes on at least one computing system; (Xian p. 6 sec. E “Bob will use the information to update his current weight wb,k to wb,k+1. Such a procedure repeats K times until the out-sample error (measured by, e.g., cross-validation) of Alice no longer decreases.”)
by the processing circuitry of the computing device, receiving, from the second agent, second statistical information comprising the second model weight and the updated first set of sample weights or, from a third agent of the architecture, third statistical information comprising a third model weight and a next iteration of the first set of sample weights, wherein the third model weight is derived from the first statistical information and the second statistical information; and (Xian p. 6 sec. E “at the kth iteration, Alice first calculates wa,k^T XA, and inquiries Bob’s wb,k^T XB, then combine them to feed the neural network.” Alice can perform this with other modules, including a third agent/module, Xian p. 4 sec. D “say Alice, by allowing its module, Ma = (Aa, XA) to exchange statistics with other m modules M1, M2, . . . , Mm.”)
updating, by the processing circuitry of the computing device, the first machine learning model using the second statistical information or the third statistical information, wherein the updated machine learning model is configured to map the first feature set to an updated first label set based on the updated first set of sample weights or the next iteration of the first set of sample weights. (Xian p. 6 sec. E “If k is even, Alice will update her current weight wa,k to wa,k+1 as well as w˜k to w˜k+1. Bob will fix his weight for the next iteration, i.e. wb,k+1 = wb,k.” Xian p. 7 procedure 3 “Alice sets wa,k+1 ← wa,k and updates w˜k by using back-propagation to get w˜k+1”)
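For the reader's convenience, the following is a minimal Python sketch of the iterative statistic exchange quoted above (Xian p. 6 sec. E, p. 7 procedure 3), with least-squares modules and residual exchange standing in for the exchanged statistics. The function names and data are hypothetical; this is the Examiner's simplified rendering, not Xian's verbatim procedure and not Applicant's claimed method.

```python
import numpy as np

def fit_ls(X, y):
    """Least-squares module: weights w minimizing ||Xw - y||^2."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def assisted_learning(X_alice, X_bob, y, K=10):
    """Alternate fitting between Alice and Bob, exchanging only statistics
    (here, residuals) rather than raw features; repeats K times (Xian p. 6 sec. E)."""
    residual = y.astype(float).copy()
    w_a = np.zeros(X_alice.shape[1])
    w_b = np.zeros(X_bob.shape[1])
    for _ in range(K):
        delta = fit_ls(X_alice, residual)   # Alice updates her weights
        w_a += delta
        residual -= X_alice @ delta         # statistic sent to Bob
        delta = fit_ls(X_bob, residual)     # Bob updates his weights
        w_b += delta
        residual -= X_bob @ delta           # statistic returned to Alice
    return w_a, w_b

# Each party applies its own weights to its own features; the partial
# predictions are then combined (summed here for simplicity).
rng = np.random.default_rng(0)
X_a, X_b = rng.normal(size=(50, 3)), rng.normal(size=(50, 2))
y = X_a @ np.array([1.0, -2.0, 0.5]) + X_b @ np.array([0.7, 1.5])
w_a, w_b = assisted_learning(X_a, X_b, y)
prediction = X_a @ w_a + X_b @ w_b
```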
Xian teaches claim 2. The method of claim 1 further comprising sending, by the processing circuitry of the computing device, to the second agent or a fourth agent in the architecture, fourth statistical information defined by an updated first model weight for fitting, into the observed label set, the updated first label set using the first learning technique, wherein the fourth statistical information comprises an updated second set of sample weights based on the updated first model weight and the updated first set of sample weights or the next iteration of the first set of sample weights. (The procedure is iterative, so Bob reads on the fourth agent because his model has been updated since he served as the second agent. Xian p. 6 “Then she sends w˜k and other related information to Bob. Bob will use the information to update his current weight wb,k to wb,k+1.” Alice can perform this with other modules, including a fourth agent/module, Xian p. 4 sec. D “say Alice, by allowing its module, Ma = (Aa, XA) to exchange statistics with other m modules M1, M2, . . . , Mm.”)
Xian teaches claim 3. The method of claim 2, wherein sending the fourth statistical information further comprises updating, by the processing circuitry, the first model weight into the updated first model weight based on the second model weight or the third model weight. (Xian p. 6 “Then she sends w˜k and other related information to Bob. Bob will use the information to update his current weight wb,k to wb,k+1.”)
Xian teaches claim 4. The method of claim 1, further comprising: generating, based on the updated machine learning model, a predicted label set for a new feature set. (Xian p. 6 sec. E “Alice first calculates wa,k^T XA… Alice queries the corresponding wb,K^T x*_i (i ∈ Sb) from Bob, and uses the trained neural network to get the assisted learning prediction.” This multiplies Alice’s weights by Alice’s features X; the combined partial predictions feed the trained network to produce the final predicted labels.)
Xian teaches claim 5. The method of claim 4, further comprising:
generating a first prediction label vector based on the updated machine learning model and a feature vector of a new sample in the new feature set; (Xian p. 7 lines 12 and 13 in procedure 3, “On arrival of a new data {x*_i, i ∈ S}, Alice calculates wa,K^T x*_i (x*_i, i ∈ Sa).”)
querying the second agent in the architecture for a second predicted label vector based on the feature vector of the new sample or a partially aligned feature vector of the new sample; and (Xian p. 7 procedure 3 “13: Alice queries wb,K^T x*_i, (x*_i, i ∈ Sb) from Bob and combine them to feed the neural network to get the final prediction ỹ*”.)
combining the first predicted label vector with the second predicted label vector into a final predicted label vector. (Xian p. 7 procedure 3 “13: Alice queries wb,K^T x*_i, (x*_i, i ∈ Sb) from Bob and combine them to feed the neural network to get the final prediction ỹ*”.)
Xian teaches claim 6. The method of claim 5, wherein a learner unit the first machine learning model comprises a network ensemble in which a neural network is configured to combine the first predicted label vector with the second predicted label vector into the final predicted label vector. (Xian p. 7 lines 12 and 13 in procedure 3, “On arrival of a new data {x*_i, i ∈ S}, Alice calculates wa,K^T x*_i (x*_i, i ∈ Sa). 13: Alice queries wb,K^T x*_i, (x*_i, i ∈ Sb) from Bob and combine them to feed the neural network to get the final prediction ỹ*”.)
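Again for illustration only, the prediction-time flow relied on for claims 4-6 can be sketched as follows; `combine` stands in for Xian's trained neural network, and all names are the Examiner's hypothetical assumptions.

```python
import numpy as np

def predict_assisted(x_new_a, x_new_b, w_aK, w_bK, combine):
    """Alice computes her partial score wa,K^T x*, queries Bob for his partial
    score wb,K^T x*, and a combiner (Xian: the trained neural network) yields
    the final predicted label."""
    score_alice = w_aK @ x_new_a            # first predicted label vector
    score_bob = w_bK @ x_new_b              # queried second predicted label vector
    return combine(score_alice, score_bob)  # final predicted label vector

# A trivial additive combiner standing in for the trained network:
final = predict_assisted(np.array([1.0, 2.0]), np.array([0.5]),
                         np.array([0.3, -0.1]), np.array([2.0]),
                         combine=lambda a, b: a + b)
```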
Xian teaches claim 7. The method of claim 1, further comprising: repeating the sending and the receiving until an out-sample error satisfies a criterion, wherein the out-sample error is computed by cross-validation. (Xian p. 6 sec. E “Such a procedure repeats K times until the out-sample error (measured by, e.g., cross-validation) of Alice no longer decreases.”)
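A sketch of the quoted stopping rule, with `step` and `out_sample_error` as hypothetical callables for one send/receive exchange and a cross-validation error estimate, respectively; illustrative only.

```python
def repeat_until_converged(step, out_sample_error, max_rounds=100):
    """Repeat the sending and the receiving until the out-sample error
    (e.g., measured by cross-validation) no longer decreases (Xian p. 6 sec. E)."""
    best = float("inf")
    for _ in range(max_rounds):
        step()                    # one send-and-receive iteration
        err = out_sample_error()  # out-sample error, e.g., a CV estimate
        if err >= best:           # criterion satisfied: no further decrease
            break
        best = err
    return best
```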
Xian teaches claim 8. The method of claim 1, wherein sending the first statistical information further comprises determining the first model weight to minimize an in-sample prediction loss associated with the first machine learning model. (Xian p. 7 procedure 3 “Alice updates wa,k, w˜k by using back-propagation to get wa,k+1, w˜k+1 respectively…” In-sample here means that the data is in the training sample. Backpropagation minimizes error.)
Xian teaches claim 9. The method of claim 1 further comprising: creating, by the processing circuitry, a learner unit for the first agent by fitting, into a fitted label set, an initial label set based on the first machine learning model and the first feature set, wherein the learner unit is configured with the first learning technique to generate the first label set to comprise the fitted label set, (Xian p. 3 “module M receives a user’s query of a label vector y ∈ Y^n that is collated with the rows of X; a prediction function fM,y is produced and privately stored; the fitted value fM,y(X) = [fM,y(x1), . . . , fM,y(xn)]^T is sent to the user.”) wherein the first machine learning model is configured to generate a mapping between the first feature set and the initial label set. (Xian p. 3 “the fitted value, fM,y(X), returned from the service module (Bob) upon an inquiry of y…” Xian p. 7 procedure 3 “Module Alice, its initial label y ∈ R^n, initial weight wa,1 (from input to hidden layers) and w˜1 (the rest weights) for the neural network, assisting module Bob, (optional) new predictors {x*_i, i ∈ S}…”)
Xian teaches claim 10. The method of claim 1 further comprising modifying an ordering of multiple agents of the architecture based on performance information. (Xian p. 16 “If they are all restricted to use the same model, e.g., ordinary linear regression, then the out-sample prediction performance can be very bad. However, if they follow the assisted learning protocol, i.e. learners with noisy data opt to robust learning technique, then the out-sample prediction performances will be very promising.” This is ordering the learners/agents based on their performance on noisy data.)
Xian teaches claim 11. A computing device for an agent of an assisted learning architecture comprising: (Xian fig. 1)
processing circuitry coupled to memory and configured to: (Xian fig. 1)
execute a training process on a machine learning model by exchanging, with at least one other agent of the assisted learning architecture, confidence scores over a number of iterations, wherein the at least one other agent is configured to train at least one other machine learning model, wherein for each iteration,
the training process determines a set of sample weights as the confidence scores for the machine learning model and communicates, to a second agent of the at least one agent, a second set of sample weights as the confidence scores for a second machine learning model of the at least one other machine learning model, wherein the confidence scores for the machine learning model corresponds to a progress level in the training process and the confidence scores for the second machine learning model correspond to a progress level in training the second machine learning model when compared to the progress level in training the machine learning model, and (Xian p. 6 sec. E “If k is even, Alice will update her current weight wa,k to wa,k+1 as well as w˜k to w˜k+1. Bob will fix his weight for the next iteration, i.e. wb,k+1 = wb,k. If k is odd, Alice will fix her weight for next iteration, i.e. wa,k+1 = wa,k and updates w˜k to w˜k+1. Then she sends w˜k and other related information to Bob.” Updating weights is training. Applicant’s “confidence scores” are a set of sample weights, not confidence scores in the accepted sense.)
the second agent updates the set of sample weights in response to further training the second machine learning model and returns, to the agent for a next iteration of the confidence scores for the machine learning model, the updated set of sample weights and a model weight determined from the confidence scores of the second machine learning model. (Xian p. 6 sec. E “Bob will use the information to update his current weight wb,k to wb,k+1. Such a procedure repeats K times until the out-sample error (measured by, e.g., cross-validation) of Alice no longer decreases.”)
Xian teaches claim 12. The computing device of claim 11, wherein the processing circuitry is further configured to:
execute an evaluation process to apply the trained machine learning model to a feature set to generate a first predicted label set, (Xian p. 7 lines 12 and 13 in procedure 3, “On arrival of a new data {x*_i, i ∈ S}, Alice calculates wa,K^T x*_i (x*_i, i ∈ Sa).”) query the at least one agent to return at least one second predicted label, and generate a final predicted label set based on the first predicted label set and the at least one second predicted label set. (Xian p. 7 procedure 3 “13: Alice queries wb,K^T x*_i, (x*_i, i ∈ Sb) from Bob and combine them to feed the neural network to get the final prediction ỹ*”.)
Xian teaches claim 13. The computing device of claim 11, wherein to execute the training process, the processing circuitry is further configured to:
output the trained machine learning model in response to a determination that an iteration of the confidence scores satisfies a threshold for the confidence score. (Xian p. 6 sec. E “Such a procedure repeats K times until the out-sample error (measured by, e.g., cross-validation) of Alice no longer decreases.”)
Xian teaches claim 14. The computing device of claim 11, wherein the agent and the at least one agent implement centralized feature datasets or decentralized feature datasets. (Xian p. 3 sec. C “In a decentralized scenario where the trustworthiness or learning capability of Bob is doubtful, Alice tends not to release private data.”)
Xian teaches claim 15. The computing device of claim 11, wherein the processing circuitry is further configured to: terminate the training process in response to determining that an out-sample error satisfies a criterion, wherein the out-sample error is computed by cross-validation. (Xian p. 6 sec. E “Such a procedure repeats K times until the out-sample error (measured by, e.g., cross-validation) of Alice no longer decreases.”)
Xian teaches claim 16. The computing device of claim 11, wherein the processing circuitry is further configured to: determine the model weight to minimize an in-sample prediction loss associated with the machine learning model. (Xian p. 7 procedure 3 “Alice updates wa,k, w˜k by using back-propagation to get wa,k+1, w˜k+1 respectively…” In-sample here means that the data is in the training sample. Backpropagation minimizes error.)
Xian teaches claim 17. A method performed by processing circuitry, the method comprising: (Xian fig. 1) creating a learner unit comprising a first machine learning model configured with a first learning technique, wherein the learner unit uses the first learning technique to train the first machine learning model to map a first feature set to a first label set based on confidence scores for samples in the first feature set and generates a model weight for fitting, into a task label set, a dataset combining the first feature set and the first label set; (Xian p. 6 sec. E “If k is even, Alice will update her current weight wa,k to wa,k+1 as well as w˜k to w˜k+1. Bob will fix his weight for the next iteration, i.e. wb,k+1 = wb,k. If k is odd, Alice will fix her weight for next iteration, i.e. wa,k+1 = wa,k and updates w˜k to w˜k+1. Then she sends w˜k and other related information to Bob.”)
sending, to a second learner unit of a second computing device, second confidence scores for samples in a second feature set used by the second learner unit in training a second machine learning model with the task label set using a second learning technique, wherein the second confidence scores are computed from the model weight and the confidence scores for the samples in the first feature set, wherein the second learning technique generates a second model weight for fitting, into the task label set, a dataset combining the second feature set, a second label set, and the second confidence scores; (Xian p. 6 sec. E “Bob will use the information to update his current weight wb,k to wb,k+1. Such a procedure repeats K times until the out-sample error (measured by, e.g., cross-validation) of Alice no longer decreases.” Applicant’s confidence score is a set of weights.)
receiving, from the second learner unit of the second computing device, the second model weight and a next iteration of the confidence scores defined by the second model weight and the second label set; and (Xian p. 6 sec. E “Bob will use the information to update his current weight wb,k to wb,k+1. Such a procedure repeats K times until the out-sample error (measured by, e.g., cross-validation) of Alice no longer decreases.” Applicant’s confidence score is a set of weights.)
updating the machine learning model using the first learning technique, wherein the updated machine learning model is configured to map the first feature set to a third label set based on the next iteration of the confidence scores and the second model weight. (Xian p. 6 sec. E “If k is even, Alice will update her current weight wa,k to wa,k+1 as well as w˜k to w˜k+1. Bob will fix his weight for the next iteration, i.e. wb,k+1 = wb,k.” Xian p. 7 procedure 3 “Alice sets wa,k+1 ← wa,k and updates w˜k by using back-propagation to get w˜k+1”)
Xian teaches claim 18. The method of claim 17, wherein the updated machine learning model is configured to map the first feature set to the task label set based on the next iteration of the confidence scores and the second model weight. (Xian p. 6 sec. E “If k is even, Alice will update her current weight wa,k to wa,k+1 as well as w˜k to w˜k+1. Bob will fix his weight for the next iteration, i.e. wb,k+1 = wb,k.” Xian p. 7 procedure 3 “Alice sets wa,k+1 ← wa,k and updates w˜k by using back-propagation to get w˜k+1”)
Xian teaches claim 19. The method of claim 17, wherein updating the machine learning model further comprises based on the next iteration of the confidence scores, modifying a corresponding label of the first label set for a sample of the first feature set in satisfaction of a threshold. (Xian p. 7 procedure 3 “Alice sets wa,k+1 ← wa,k and updates w˜k by using back-propagation to get w˜k+1”)
Xian teaches claim 20. The method of claim 17, wherein non-transitory, computer-readable medium comprising executable instructions, which when executed by processing circuitry, cause a computing device to perform the steps of the method. (Xian fig. 1)
Notice of References Cited Not Relied Upon
US10824634B2 abstract teaches the hierarchy of devices shown in Applicant’s Figures 1 and 5.
US20220129791A1 abstract expands on Applicant’s decentralized data embodiment, “a given target data sample, which improves on exhaustive or random data sample generation algorithms. Specifically, using principles of locality and approximation of local decision boundaries, techniques described herein identify a hypersphere (or data sample neighborhood) over which to train the surrogate ML model such that the surrogate ML model produces valuable, high-quality information explaining data samples in the neighborhood of the target data sample.”
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Austin Hicks whose telephone number is (571)270-3377. The examiner can normally be reached Monday - Thursday 8-4 PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571) 270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AUSTIN HICKS/Primary Examiner, Art Unit 2124