Detailed Action
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is in response to the application filed on April 21, 2023. Claims 1-20 are pending in this application.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 1, under Step 1 of the Subject Matter Eligibility Test for Products and Processes, the claim is directed towards a node, which is interpreted as a machine, one of the four statutory categories.
Next, under a Step 2A Prong 1 Analysis, the claim recites:
determine one or more parameters for the global model and to determine one or more parameters for a local model
As drafted, this is a process that, under the broadest reasonable interpretation, falls under the “mental processes” grouping of abstract ideas.
The claim is therefore examined under Step 2A Prong 2, which considers the additional elements within the claim. The claim’s additional elements are:
receive one or more global parameters for a distributed multitask support vector machine from a coordinator node
model trainer circuitry
perform training for the distributed multitask support vector machine based on the one or more global parameters associated with a global model
the distributed multitask support vector machine
and the coordinator node interface circuitry is further to send the one or more parameters for the global model to the coordinator node.
The “model trainer circuitry” and “distributed multitask support vector machine” are limitations that, as drafted, merely show the field of use and technological environment of the judicial exception and “generally link” the judicial exception to circuitry and a machine. (See MPEP 2106.05(h)) The “coordinator node interface circuitry is further to send the one or more parameters for the global model to the coordinator node” is interpreted to be mere instructions to apply the judicial exception, as it instructs to send parameters using interface circuitry. (See MPEP 2106.05(f)) Lastly, to “receive one or more global parameters for a distributed multitask support vector machine from a coordinator node” and to “perform training for the distributed multitask support vector machine based on the one or more global parameters associated with a global model” are considered insignificant extra-solution activity. (See MPEP 2106.05(g)) Therefore, these additional elements do not integrate the abstract idea into a practical application. The claim is directed to an abstract idea.
Under a Step 2B analysis, the claim’s additional elements do not amount to significantly more than the judicial exception, as explained above in Step 2A Prong 2. Additionally, to “receive one or more global parameters for a distributed multitask support vector machine from a coordinator node” and to “perform training for the distributed multitask support vector machine based on the one or more global parameters associated with a global model” are considered well-understood, routine, and conventional. To “receive one or more global parameters for a distributed multitask support vector machine from a coordinator node” is merely considered receiving data, (See MPEP 2106.05(d)) and to “perform training for the distributed multitask support vector machine based on the one or more global parameters associated with a global model” is well-understood, routine, and conventional, as disclosed by Nguyen. (U.S. Patent Application Publication No. US 20190042887) (“Typically, computer systems for training, validating and tuning machine learning models implement a serial process in which a model is trained, its out-of-sample performance is measured, and parameters are tuned and the cycle repeats itself. This can be accomplished serially on local machines or cloud-based machines. However, this process is time intensive and may result in long lead times before a machine learning model can be deployed into a production environment.”, Paragraph 5) Therefore, the claim is ineligible.
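For illustration only, the workflow recited in claim 1 (receive global parameters, perform local training, and return parameters to the coordinator) corresponds to a routine federated training round. The sketch below is a minimal, hypothetical example: the hinge-loss gradient solver, the step counts, and all identifiers (participant_round, lr, lam) are assumptions of this illustration, not the claimed implementation.

```python
import numpy as np

def participant_round(global_w, X, y, lr=0.1, lam=0.01, steps=5):
    """Hypothetical participant-node round: start from the received
    global parameters, take a few hinge-loss (linear SVM) gradient
    steps on local data, then return the parameters for sending back
    to the coordinator node."""
    w = global_w.copy()
    for _ in range(steps):
        margins = y * (X @ w)                  # y in {-1, +1}
        violators = margins < 1                # points inside the margin
        grad = lam * w                         # regularization term
        if violators.any():
            grad -= (y[violators, None] * X[violators]).mean(axis=0)
        w -= lr * grad
    return w
```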
Regarding claim 2, the claim recites “to receive one or more regularization parameters, wherein a first regularization parameter of the one or more regularization parameters indicates a relative weighting between the one or more parameters for the global model and one or more parameters for the local model.” The “to receive one or more regularization parameters” is considered to be insignificant extra-solution activity, as it is merely receiving data over a network, (See MPEP 2106.05(g) and (d)) and the “first regularization parameter of the one or more regularization parameters indicates a relative weighting between the one or more parameters for the global model and one or more parameters for the local model”, as drafted, merely indicates the field of use and technological environment of the judicial exception and “generally links” the judicial exception to relative weighting between parameters. (See MPEP 2106.05(h)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 1.
Regarding claim 3, the claim recites “to receive a plurality of groups of regularization parameters, wherein individual groups of regularization parameters of the plurality of groups of regularization parameters comprise a first regularization parameter that indicates a relative weighting between the one or more parameters for the global model and one or more parameters for the local model and a second regularization parameter that indicates a tolerance to errors caused by outliers.” The “to receive a plurality of groups of regularization parameters” is considered to be insignificant extra-solution activity, as it is merely receiving data over a network, (See MPEP 2106.05(g) and (d)) and the “individual groups of regularization parameters of the plurality of groups of regularization parameters comprise a first regularization parameter that indicates a relative weighting between the one or more parameters for the global model and one or more parameters for the local model and a second regularization parameter that indicates a tolerance to errors caused by outliers”, as drafted, merely indicates the field of use and technological environment of the judicial exception and “generally links” the judicial exception to parameters that indicate tolerance and relative weighting. (See MPEP 2106.05(h)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 1.
Regarding claim 4, the claim recites “to perform training for the distributed multitask support vector machine for individual groups of regularization parameters of the plurality of groups of regularization parameters, wherein, after the one or more parameters for the global model are sent to the coordinator node, the coordinator node interface circuitry is further to receive an updated one or more global parameters, wherein to receive the updated one or more global parameters comprises to receive an indication that at least one group of regularization parameters of the plurality of groups of regularization parameters has been removed from the global model.”
The “after the one or more parameters for the global model are sent to the coordinator node, the coordinator node interface circuitry is further to receive an updated one or more global parameters” is interpreted to be mere instructions to apply the judicial exception, as it instructs to update global parameters after parameters are sent to a coordinator node. (See MPEP 2106.05(f)) To “perform training for the distributed multitask support vector machine for individual groups of regularization parameters of the plurality of groups of regularization parameters” and “to receive the updated one or more global parameters comprises to receive an indication that at least one group of regularization parameters of the plurality of groups of regularization parameters has been removed from the global model” are considered insignificant extra-solution activity. (See MPEP 2106.05(g)) Further, “to receive the updated one or more global parameters comprises to receive an indication that at least one group of regularization parameters of the plurality of groups of regularization parameters has been removed from the global model” is well-understood, routine, and conventional, as it is considered to be receiving data, (See MPEP 2106.05(d)) and “to perform training for the distributed multitask support vector machine for individual groups of regularization parameters of the plurality of groups of regularization parameters” is well-understood, routine, and conventional, as disclosed by Nguyen. (U.S. Patent Application Publication No. US 20190042887) (“Typically, computer systems for training, validating and tuning machine learning models implement a serial process in which a model is trained, its out-of-sample performance is measured, and parameters are tuned and the cycle repeats itself. This can be accomplished serially on local machines or cloud-based machines. However, this process is time intensive and may result in long lead times before a machine learning model can be deployed into a production environment.”, Paragraph 5) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 3.
Regarding claim 5, the claim recites “the participant node does not send the one or more parameters for the local model to the coordinator node.” The limitation, as drafted, is interpreted to be mere instructions to apply the judicial exception by limiting how the abstract idea is applied. (See MPEP 2106.05(f)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 1.
Regarding claim 6, the claim recites “the distributed multitask support vector machine is an anomaly detection algorithm.” The limitation, as drafted, merely indicates the field of use and technological environment of the judicial exception and “generally links” the judicial exception to an anomaly detection algorithm. (See MPEP 2106.05(h)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 1.
Regarding claim 7, the claim recites “the distributed multitask support vector machine is a classification algorithm.” The limitation, as drafted, merely indicates the field of use and technological environment of the judicial exception and “generally links” the judicial exception to a classification algorithm. (See MPEP 2106.05(h)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 1.
Regarding claim 8, the claim recites “to transform local training data using random Fourier feature mapping.” The limitation, as drafted, is interpreted to be mere instructions to apply the judicial exception, as it instructs to transform the data using random Fourier feature mapping. (See MPEP 2106.05(f)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 1.
Regarding claim 9, the claim recites “the coordinator node interface circuitry is further to receive a seed for a random number generator from the coordinator node, wherein to transform the local training data comprises to generate the random Fourier feature mapping using the seed and the random number generator.” The “coordinator node interface circuitry is further to receive a seed for a random number generator from the coordinator node” merely indicates the field of use and technological environment of the judicial exception and “generally links” the judicial exception to a random seed, (See MPEP 2106.05(h)) and “to transform the local training data comprises to generate the random Fourier feature mapping using the seed and the random number generator” is interpreted to be mere instructions to apply the judicial exception, as it instructs to generate the random Fourier feature mapping using a seed and random number generator. (See MPEP 2106.05(f)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 8.
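For context on the role of the seed: if the coordinator node distributes a single seed, every participant node can deterministically regenerate the identical random Fourier feature map, so locally transformed data remains comparable across nodes. The sketch below follows the standard random Fourier feature construction of Rahimi and Recht; the function name and parameters are illustrative assumptions only.

```python
import numpy as np

def random_fourier_features(X, n_features, gamma, seed):
    """Approximate the RBF kernel exp(-gamma * ||x - y||^2).
    Seeding the generator makes every node draw the same projection."""
    rng = np.random.default_rng(seed)   # seed received from the coordinator
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```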
Regarding claim 10, the claim recites “to subsample training data and performing training on the subsampled training data.” The limitation, as drafted, is interpreted to be mere instructions to apply the judicial exception, as it instructs to subsample the data and then perform training on the subsampled data. (See MPEP 2106.05(f)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 1.
Regarding claim 11, the claim recites “remove a random subset of data points of the training data to generate reduced training data set; and determine one or more parameters for the global model based on the reduced training data set.” The “remove a random subset of data points of the training data to generate reduced training data set” is interpreted to be mere instructions to apply the judicial exception, as it instructs to remove a random subset of data to generate a reduced training dataset, (See MPEP 2106.05(f)) and to “determine one or more parameters for the global model based on the reduced training data set”, as drafted, falls under the “mental processes” grouping of abstract ideas. Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 10.
Regarding claim 12, the claim recites “to add a random amount of noise to each data points of the training data during training.” The limitation, as drafted, is interpreted to be mere instructions to apply the judicial exception, as it instructs one to add random noise to the training data. (See MPEP 2106.05(f)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 10.
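For reference, the data-handling steps characterized in claims 10-12 (subsampling, removing a random subset, and adding noise) amount to a few lines of routine array manipulation, as the hypothetical sketch below shows; the sizes and noise scale are arbitrary assumptions of this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))        # stand-in for local training data

# Claims 10-11: keep a random subsample, i.e., remove a random subset
# of data points to generate a reduced training data set.
keep = rng.choice(len(X), size=len(X) // 2, replace=False)
X_reduced = X[keep]

# Claim 12: add a random amount of noise to each data point during
# training (the noise scale here is arbitrary).
X_noisy = X_reduced + rng.normal(scale=0.01, size=X_reduced.shape)
```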
Regarding claim 13, the claim recites “A system comprising the participant node of claim 1, further comprising the coordinator node, the coordinator node comprising: parameter initialization circuitry to determine the one or more global parameters for the distributed multitask support vector machine; participant node interface circuitry to: send the one or more global parameters to one or more participant nodes, wherein the one or more participant nodes includes the participant node; and receive model updates from the one or more participant nodes, wherein the model updates are based on training data associated with the one or more participant nodes; and model updater circuitry to update one or more of the one or more global parameters based on the model updates from the one or more participant nodes.”
The claim is directed towards a system, which is interpreted as a machine, one of the four statutory categories.
The “participant node of claim 1, further comprising the coordinator node” is an abstract idea without significantly more, for the reasons set forth in the rejection of claim 1. The remaining additional elements are:
parameter initialization circuitry to determine the one or more global parameters for the distributed multitask support vector machine;
participant node interface circuitry to: send the one or more global parameters to one or more participant nodes
wherein the one or more participant nodes includes the participant node
and receive model updates from the one or more participant nodes
wherein the model updates are based on training data associated with the one or more participant nodes
and model updater circuitry to update one or more of the one or more global parameters based on the model updates from the one or more participant nodes.
The “parameter initialization circuitry to determine the one or more global parameters for the distributed multitask support vector machine”, “participant node interface circuitry to: send the one or more global parameters to one or more participant nodes”, and “model updater circuitry to update one or more of the one or more global parameters based on the model updates from the one or more participant nodes” are interpreted to be mere instructions to apply the judicial exception, as they instruct how the various circuitries are used. (See MPEP 2106.05(f)) The “one or more participant nodes includes the participant node” and “the model updates are based on training data associated with the one or more participant nodes” merely indicate the field of use and technological environment of the judicial exception and “generally link” the judicial exception to participant nodes and their updates. (See MPEP 2106.05(h)) To “receive model updates from the one or more participant nodes” is considered to be insignificant extra-solution activity and well-understood, routine, and conventional, as it amounts to receiving or transmitting data over a network. (See MPEP 2106.05(g) and MPEP 2106.05(d)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 1.
Regarding claim 14, under Step 1 of the Subject Matter Eligibility Test for Products and Processes, the claim is directed towards a node, which is interpreted as a machine, one of the four statutory categories.
Next, under a Step 2A Prong 1 Analysis, the claim recites to “determine a one or more global parameters for a distributed multitask support vector machine.” As drafted, this is a process that, under the broadest reasonable interpretation, falls under the “mental processes” grouping of abstract ideas.
The claim is therefore examined under Step 2A Prong 2, which considers the additional elements within the claim. The claim’s additional elements are:
parameter initialization circuitry
participant node interface circuitry
send the one or more global parameters to one or more participant nodes
receive model updates from the one or more participant nodes
the model updates are based on training data associated with the one or more participant nodes
model updater circuitry
and update one or more of the one or more global parameters based on the model updates from the one or more participant nodes.
The “parameter initialization circuitry”, “participant node interface circuitry”, “model updater circuitry”, and “the model updates are based on training data associated with the one or more participant nodes” merely indicate the field of use and technological environment of the judicial exception and “generally link” the judicial exception to circuitry that can run it and to updates associated with the abstract idea. (See MPEP 2106.05(h)) To “send the one or more global parameters to one or more participant nodes”, “receive model updates from the one or more participant nodes”, and “update one or more of the one or more global parameters based on the model updates from the one or more participant nodes” are considered to be insignificant extra-solution activity. (See MPEP 2106.05(g)) Therefore, these additional elements do not integrate the abstract idea into a practical application. The claim is directed to an abstract idea.
Under a Step 2B analysis, the claim’s additional elements do not amount to significantly more than the judicial exception, as explained above in Step 2A Prong 2. Additionally, to “send the one or more global parameters to one or more participant nodes”, “receive model updates from the one or more participant nodes”, and “update one or more of the one or more global parameters based on the model updates from the one or more participant nodes” are considered well-understood, routine, and conventional, as they amount to receiving or transmitting data over a network. (See MPEP 2106.05(d)) Therefore, the claim is ineligible.
Regarding claim 15, the claim recites “to determine a plurality of groups of regularization parameters, wherein individual groups of regularization parameters of the plurality of groups of regularization parameters comprise a first regularization parameter that indicates a relative weighting between the one or more parameters for a global model and one or more parameters for a local model and a second regularization parameter that indicates a tolerance to errors caused by outliers, wherein the model updater circuitry is further to remove at least one group of regularization parameters from the plurality of groups of regularization parameters to generate a reduced plurality of groups of regularization parameters, wherein the participant node interface circuitry is further to send an indication of the reduced plurality of groups of regularization parameters.”
To “determine a plurality of groups of regularization parameters” falls under the “mental processes” grouping of abstract ideas. The “individual groups of regularization parameters of the plurality of groups of regularization parameters comprise a first regularization parameter that indicates a relative weighting between the one or more parameters for a global model and one or more parameters for a local model” and “a second regularization parameter that indicates a tolerance to errors caused by outliers” merely indicate the field of use and technological environment of the judicial exception and “generally link” the judicial exception to parameters that indicate tolerance and relative weighting. (See MPEP 2106.05(h)) Further, “the model updater circuitry is further to remove at least one group of regularization parameters from the plurality of groups of regularization parameters to generate a reduced plurality of groups of regularization parameters” and “the participant node interface circuitry is further to send an indication of the reduced plurality of groups of regularization parameters” are interpreted to be mere instructions to apply the judicial exception, as they instruct how the various circuitries are used. (See MPEP 2106.05(f)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 14.
Regarding claim 16, the claim recites “to update the one or more of the one or more global parameters based on the model updates from individual participant nodes of the plurality of participant nodes comprises to perform an alternating direction method of multipliers.” To “update the one or more of the one or more global parameters based on the model updates from individual participant nodes of the plurality of participant nodes” is interpreted to be mere instructions to apply the judicial exception, as it instructs how to update the parameters, (See MPEP 2106.05(f)) and “to perform an alternating direction method of multipliers” merely indicates the field of use and technological environment of the judicial exception and “generally links” the judicial exception to the alternating direction method of multipliers. (See MPEP 2106.05(h)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 14.
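For reference, the alternating direction method of multipliers named in claim 16 is a well-known decomposition technique. The sketch below shows one generic consensus-ADMM coordination step in the textbook form of Boyd et al.; it is offered only to illustrate the technique and is not asserted to be the applicant's or Li's formulation.

```python
import numpy as np

def admm_consensus_step(local_ws, local_us):
    """One generic consensus-ADMM step: the global iterate z averages
    each node's weights plus its scaled dual variable, and each dual
    then absorbs that node's remaining disagreement with z."""
    z = np.mean([w + u for w, u in zip(local_ws, local_us)], axis=0)
    new_us = [u + w - z for w, u in zip(local_ws, local_us)]
    return z, new_us
```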
Regarding claim 17, the claim recites “individual participant nodes of the plurality of participant nodes have training data with different random distributions.” The limitation, as drafted, merely indicates the field of use and technological environment of the judicial exception and “generally links” the judicial exception to training data with different random distributions. (See MPEP 2106.05(h)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 14.
Regarding claim 18, under Step 1 of the Subject Matter Eligibility Test for Products and Processes, the claim is directed towards computer-readable media, which is interpreted as a manufacture, one of the four statutory categories.
Next, under a Step 2A Prong 1 Analysis, the claim recites to “determine one or more parameters for the global model for the distributed multitask support vector machine and to determine one or more parameters for a local model for the distributed multitask support vector machine.” As drafted, these are processes that, under the broadest reasonable interpretation, fall under the “mental processes” grouping of abstract ideas.
The claim is therefore examined under Step 2A Prong 2, which considers the additional elements within the claim. The claim’s additional elements are:
receive a one or more global parameters for a distributed multitask support vector machine from a coordinator node
perform training for the distributed multitask support vector machine based on the one or more global parameters associated with a global model for the distributed multitask support vector machine
and send the one or more parameters for the global model to the coordinator node.
The limitations, as drafted, are considered to be insignificant extra-solution activity. (See MPEP 2106.05(g)) Therefore, these additional elements do not integrate the abstract idea into a practical application. The claim is directed to an abstract idea.
Under a Step 2B analysis, the claim’s additional elements do not amount to significantly more than the judicial exception, as explained above in Step 2A Prong 2. Additionally, to “receive a one or more global parameters for a distributed multitask support vector machine from a coordinator node” and “send the one or more parameters for the global model to the coordinator node” are considered well-understood, routine, and conventional, as they amount to receiving or transmitting data over a network, (See MPEP 2106.05(d)) and to “perform training for the distributed multitask support vector machine based on the one or more global parameters associated with a global model for the distributed multitask support vector machine” is considered well-understood, routine, and conventional, as disclosed by Nguyen. (U.S. Patent Application Publication No. US 20190042887) (“Typically, computer systems for training, validating and tuning machine learning models implement a serial process in which a model is trained, its out-of-sample performance is measured, and parameters are tuned and the cycle repeats itself. This can be accomplished serially on local machines or cloud-based machines. However, this process is time intensive and may result in long lead times before a machine learning model can be deployed into a production environment.”, Paragraph 5) Therefore, the claim is ineligible.
Regarding claim 19, the claim recites “to receive a plurality of groups of regularization parameters, wherein individual groups of regularization parameters of the plurality of groups of regularization parameters comprise a first regularization parameter that indicates a tolerance to errors caused by outliers and a second regularization parameter that indicates a relative weighting between the one or more parameters for the global model and one or more parameters for the local model, wherein to perform training for the distributed multitask support vector machine comprises to perform training for the distributed multitask support vector machine for individual groups of regularization parameters of the plurality of groups of regularization parameters, wherein, after the one or more parameters for the global model are sent to the coordinator node, the plurality of instructions further causes the participant node to receive an updated one or more global parameters, wherein to receive the updated one or more global parameters comprises to receive an indication that at least one group of regularization parameters from the plurality of groups of regularization parameters has been removed from the global model.” To “receive a plurality of groups of regularization parameters” and “to receive the updated one or more global parameters comprises to receive an indication that at least one group of regularization parameters from the plurality of groups of regularization parameters has been removed from the global model” are considered to be insignificant extra-solution activity and well-understood, routine, and conventional, as they amount to receiving or transmitting data. (See MPEP 2106.05(g) and MPEP 2106.05(d)) The limitation that to “perform training for the distributed multitask support vector machine comprises to perform training for the distributed multitask support vector machine for individual groups of regularization parameters of the plurality of groups of regularization parameters” is also considered to be insignificant extra-solution activity and is well-understood, routine, and conventional, as disclosed by Nguyen. (U.S. Patent Application Publication No. US 20190042887) (“Typically, computer systems for training, validating and tuning machine learning models implement a serial process in which a model is trained, its out-of-sample performance is measured, and parameters are tuned and the cycle repeats itself. This can be accomplished serially on local machines or cloud-based machines. However, this process is time intensive and may result in long lead times before a machine learning model can be deployed into a production environment.”, Paragraph 5) (See MPEP 2106.05(g) and MPEP 2106.05(d)) The “individual groups of regularization parameters of the plurality of groups of regularization parameters comprise a first regularization parameter that indicates a tolerance to errors caused by outliers and a second regularization parameter that indicates a relative weighting between the one or more parameters for the global model and one or more parameters for the local model” merely indicates the field of use and technological environment of the judicial exception and “generally links” the judicial exception to parameters that indicate tolerance and relative weighting. (See MPEP 2106.05(h)) Lastly, “after the one or more parameters for the global model are sent to the coordinator node, the plurality of instructions further causes the participant node to receive an updated one or more global parameters” is interpreted to be mere instructions to apply the judicial exception, as it instructs how the nodes operate. (See MPEP 2106.05(f)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 18.
Regarding claim 20, the claim recites “to subsample training data and performing training on the subsampled training data.” The limitation, as drafted, is interpreted to be mere instructions to apply the judicial exception, as it instructs to subsample the data and then perform training on the subsampled data. (See MPEP 2106.05(f)) Therefore, the claim is not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claim 18.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claim(s) 1, 2, 7, 13, 14, and 18 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Rui Li, “Online Federated Multitask Learning” (herein referred to as Li).
Regarding claim 1, Li teaches coordinator node interface circuitry to receive one or more global parameters for a distributed multitask support vector machine from a coordinator node (“Here, we introduce a central server to coordinate the training process so that each user only needs to communicate with the central server…Global SVM. In this setting, we assume that all the raw data storing on different mobile devices are transmitted to the server. We then train a classifier using SVM (Support Vector Machines) on the server, i.e., the global model. All the mobile devices (i.e., local servers) share the same parameters.”, pg. 1, right column, bottom paragraph; pg. 4, right column, under “Baselines.”) (Here, the central server acts as the claimed coordinator node, and the central server and local servers share the same parameters, i.e., the global and local parameters.) and model trainer circuitry to perform training for the distributed multitask support vector machine based on the one or more global parameters associated with a global model for the distributed multitask support vector machine (“As shown in Figure 1, the new device first trains its local model fm+1(x), and the parameters of this model is represented by wm+1, which is then sent to the central server… We then train a classifier using SVM (Support Vector Machines) on the server, i.e., the global model. All the mobile devices (i.e., local servers) share the same parameters.”, pg. 2, left column, paragraph 3; pg. 4, right column, under “Baselines.” (See Figure 1 below)) and wherein the coordinator node interface circuitry is further to send the one or more parameters for the global model to the coordinator node. (“We then train a classifier using SVM (Support Vector Machines) on the server, i.e., the global model. All the mobile devices (i.e., local servers) share the same parameters.”, pg. 4, right column, under “Baselines.”) (The server hosts the global model, acts as the coordinator node, and shares the parameters with local devices.)
[Image placeholder: media_image1.png (greyscale)]
Li’s Figure 1
Regarding claim 2, Li teaches the participant node of claim 1, wherein to receive the one or more global parameters comprises to receive one or more regularization parameters (“Actually, the regularizer has been proved to be joint convex w.r.t Wˆ and Ωˆ −1 when Ωˆ −1 is a positive semidefinite matrix (satisfied by the second constraint)”, pg. 2, right column, right above “Alternating Optimization Algorithm”) (Ωˆ acts as a regularization parameter.) wherein a first regularization parameter of the one or more regularization parameters indicates a relative weighting between the one or more parameters for the global model and one or more parameters for the local model. (“In each iteration of the algorithm, we first optimize wm+1 of the new device with fixed precision matrix Ωˆ, and then update the precision matrix Ωˆ based on the model parameter wm+1 that we just learned in the first step and the model parameters w1, w2, ··· , wm of all the previous devices”, pg. 2, right column, under “Alternating Optimization Algorithm”) (Ωˆ optimizes the weights based on the current model parameters of all devices, including the central server (global model) and the local devices (local model).)
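Paraphrasing the alternating optimization Li describes (fix the precision matrix Ωˆ and refit the new device's weights, then fix the weights and refit Ωˆ), the control flow is roughly as sketched below. The subproblem solvers here are deliberately simplified placeholders (ridge regression standing in for the SVM subproblem, and a trace-normalized matrix square root as the relationship update), not Li's actual update rules.

```python
import numpy as np

rng = np.random.default_rng(1)
d, m = 5, 3
W = rng.normal(size=(d, m))                   # existing devices' weights
X = rng.normal(size=(40, d))                  # new device's local data
y = rng.normal(size=40)

Omega = np.eye(m + 1)                         # task-relationship matrix
for _ in range(10):
    # Step 1: with Omega fixed, fit the new device's weights w_{m+1}
    # (ridge regression as a placeholder for the SVM subproblem).
    lam = Omega[-1, -1]
    w_new = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    # Step 2: with all weights fixed, update the relationship matrix
    # via a trace-normalized square root of W^T W (a common heuristic).
    W_all = np.column_stack([W, w_new])
    vals, vecs = np.linalg.eigh(W_all.T @ W_all)
    root = vecs @ np.diag(np.sqrt(np.clip(vals, 0, None))) @ vecs.T
    Omega = root / np.trace(root)
```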
Regarding claim 7, Li teaches the participant node of claim 1, wherein the distributed multitask support vector machine is a classification algorithm. (“We then train a classifier using SVM (Support Vector Machines) on the server, i.e., the global model”, pg. 4, right column, under “Baselines.”)
Regarding claim 13, Li teaches a system comprising the participant node of claim 1, further comprising the coordinator node, the coordinator node comprising: parameter initialization circuitry to determine the one or more global parameters for the distributed multitask support vector machine (“To learn a global model, intermediate parameters are transmitted between mobile devices and the central server.”, pg. 1, left column, second paragraph under “Introduction”; See also Algorithm 1, prior to step 1 of initialization, pg. 4, right column) (This shows the global intermediate parameters.) participant node interface circuitry to: send the one or more global parameters to one or more participant nodes (“We use the weight matrix W := (w1, w2, ··· , wm) ∈ Rd×m to represent all the model parameters of current devices uploaded in the system and introduce a precision matrix Ω ∈ Rm×m to model the relationships between each pair of devices. Both parameters are stored on the central server.”, pg. 2, left column, paragraph 2; See also Algorithm 1, pg. 4, right column) (The intermediate parameters, W and Ω, are used in both the local and global models, with the local models acting as participant nodes. Algorithm 1 then shows the variables being initialized partially based on the weight matrix and the precision matrix, which are stored on the central server.) and receive model updates from the one or more participant nodes (“Global SVM. In this setting, we assume that all the raw data storing on different mobile devices are transmitted to the server. We then train a classifier using SVM (Support Vector Machines) on the server, i.e., the global model. All the mobile devices (i.e., local servers) share the same parameters… Local SVM. For the local model, we assume that there is no central server, and each mobile device only utilizes the data residing on itself to train its local SVM. For the online setting, if a new mobile device joins the system, the new device simply utilizes its own local data to train the local model… The proposed OFMTL needs to update the overall model when the number of new devices exceeds a threshold.”, pg. 4, right column, under “Baselines”; pg. 5, left column, under “Implementation Details”) (The new devices train their own local models, which then cause the overall model to update whenever a threshold is exceeded, based on the parameters shared between the local models and the global model.) wherein the model updates are based on training data associated with the one or more participant nodes (“As shown in Figure 1, the new device first trains its local model fm+1(x), and the parameters of this model is represented by wm+1, which is then sent to the central server. Based on the uploaded local parameters wm+1, the proposed model can update the parameters W and Ω, without revisiting existing devices. Using the updated parameters Wˆ and Ωˆ, the new device updates its parameters again.”, pg. 2, left column, paragraph right above “Methodology”) and model updater circuitry to update one or more of the one or more global parameters based on the model updates from the one or more participant nodes. (“Updating the precision matrix Ωˆ on the central server based on the model parameter wm+1 computed on the second part and all the previous model parameters w1, w2, ···, wm saved on the server, which is in the pseudo code line 10.”, pg. 4, right column, first paragraph; See Algorithm 1, step 10)
[Image placeholder: media_image2.png (greyscale)]
Li’s Algorithm 1
Regarding claim 14, Li teaches a coordinator node comprising: parameter initialization circuitry to determine a one or more global parameters for a distributed multitask support vector machine (“To learn a global model, intermediate parameters are transmitted between mobile devices and the central server.”, pg. 1, left column, second paragraph under “Introduction”; See also Algorithm 1, prior to step 1 of initialization, pg. 4, right column) (This shows the global intermediate parameters.) participant node interface circuitry to: send the one or more global parameters to one or more participant nodes (“We use the weight matrix W := (w1, w2, ··· , wm) ∈ Rd×m to represent all the model parameters of current devices uploaded in the system and introduce a precision matrix Ω ∈ Rm×m to model the relationships between each pair of devices. Both parameters are stored on the central server.”, pg. 2, left column, paragraph 2; See also Algorithm 1, pg. 4, right column) (The intermediate parameters, W and Ω, are used in both the local and global models, with the local models acting as participant nodes. Algorithm 1 then shows the variables being initialized partially based on the weight matrix and the precision matrix, which are stored on the central server.) and receive model updates from the one or more participant nodes (“Global SVM. In this setting, we assume that all the raw data storing on different mobile devices are transmitted to the server. We then train a classifier using SVM (Support Vector Machines) on the server, i.e., the global model. All the mobile devices (i.e., local servers) share the same parameters… Local SVM. For the local model, we assume that there is no central server, and each mobile device only utilizes the data residing on itself to train its local SVM. For the online setting, if a new mobile device joins the system, the new device simply utilizes its own local data to train the local model… The proposed OFMTL needs to update the overall model when the number of new devices exceeds a threshold.”, pg. 4, right column, under “Baselines”; pg. 5, left column, under “Implementation Details”) (The new devices train their own local models, which then cause the overall model to update whenever a threshold is exceeded, based on the parameters shared between the local models and the global model.) wherein the model updates are based on training data associated with the one or more participant nodes (“As shown in Figure 1, the new device first trains its local model fm+1(x), and the parameters of this model is represented by wm+1, which is then sent to the central server. Based on the uploaded local parameters wm+1, the proposed model can update the parameters W and Ω, without revisiting existing devices. Using the updated parameters Wˆ and Ωˆ, the new device updates its parameters again.”, pg. 2, left column, paragraph right above “Methodology”) and model updater circuitry to update one or more of the one or more global parameters based on the model updates from the one or more participant nodes. (“Updating the precision matrix Ωˆ on the central server based on the model parameter wm+1 computed on the second part and all the previous model parameters w1, w2, ···, wm saved on the server, which is in the pseudo code line 10.”, pg. 4, right column, first paragraph; See Algorithm 1, step 10 above)
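On the coordinator side, the claim-14 workflow reduces to broadcast, collect, and fold-in. The hypothetical sketch below uses plain averaging as the fold-in step purely for illustration (Li instead maintains the weight matrix W and precision matrix Ωˆ on the central server); participant_updates is an assumed list of callables, such as the participant_round sketch earlier in this action.

```python
import numpy as np

def coordinator_round(global_w, participant_updates):
    """Hypothetical coordinator round: broadcast the global parameters,
    collect each participant node's returned model update, and fold
    the updates into the global model (mean as a placeholder)."""
    updates = [update(global_w) for update in participant_updates]
    return np.mean(updates, axis=0)   # model updater step
```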
Regarding claim 18, Li teaches one or more computer-readable media comprising a plurality of instructions stored thereon that, when executed, causes a participant node to: receive a one or more global parameters for a distributed multitask support vector machine from a coordinator node, (“Here, we introduce a central server to coordinate the training process so that each user only needs to communicate with the central server…Global SVM. In this setting, we assume that all the raw data storing on different mobile devices are transmitted to the server. We then train a classifier using SVM (Support Vector Machines) on the server, i.e., the global model. All the mobile devices (i.e., local servers) share the same parameters.”, pg. 1, right column, bottom paragraph; pg. 4, right column, under “Baselines.”) (Here, the central server acts as the claimed coordinator node, and the central server and local servers share the same parameters, i.e., the global and local parameters.) perform training for the distributed multitask support vector machine based on the one or more global parameters associated with a global model for the distributed multitask support vector machine (“As shown in Figure 1, the new device first trains its local model fm+1(x), and the parameters of this model is represented by wm+1, which is then sent to the central server… We then train a classifier using SVM (Support Vector Machines) on the server, i.e., the global model. All the mobile devices (i.e., local servers) share the same parameters.”, pg. 2, left column, paragraph 3; pg. 4, right column, under “Baselines.” (See Figure 1 above)) wherein to perform training comprises to determine one or more parameters for the global model for the distributed multitask support vector machine and to determine one or more parameters for a local model for the distributed multitask support vector machine (“The parameters of each local model can be represented by wt ∈ Rd. We use ft(x) = wt · x to represent the decision boundary. To guarantee the privacy of data, without uploading data to the central server, only local model parameters are sent to the central server. We use the weight matrix W: = (w1, w2, ···, wm) ∈ Rd×m to represent all the model parameters of current devices uploaded in the system and introduce a precision matrix Ω ∈ Rm×m to model the relationships between each pair of devices. Both parameters are stored on the central server.”, pg. 2, left column, second paragraph) and send the one or more parameters for the global model to the coordinator node. (“We then train a classifier using SVM (Support Vector Machines) on the server, i.e., the global model. All the mobile devices (i.e., local servers) share the same parameters.”, pg. 4, right column, under “Baselines.”) (The server hosts the global model, acts as the coordinator node, and shares the parameters with local devices.)
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Li in view of Qiang Shen, “Tolerance-based Adaptive Online Outlier Detection for Internet of Things” (herein referred to as Shen).
Regarding claim 3, Li teaches the participant node of claim 1, wherein to receive the one or more global parameters comprises to receive a plurality of groups of regularization parameters, (“Based on the uploaded local parameters wm+1, the proposed model can update the parameters W and Ω, without revisiting existing devices … Actually, the regularizer has been proved to be joint convex w.r.t Wˆ and Ωˆ −1 when Ωˆ −1 is a positive semidefinite matrix (satisfied by the second constraint)”, pg. 2, left column, right above “Methodology”; pg. 2, right column, right above “Alternating Optimization Algorithm”) (This shows Ωˆ as a regularization parameter, and, with the plurality of local parameters uploaded to the proposed model, Li teaches the receiving of groups of regularization parameters.) and individual groups of regularization parameters of the plurality of groups of regularization parameters comprising a first regularization parameter that indicates a relative weighting between the one or more parameters for the global model. (“In each iteration of the algorithm, we first optimize wm+1 of the new device with fixed precision matrix Ωˆ, and then update the precision matrix Ωˆ based on the model parameter wm+1 that we just learned in the first step and the model parameters w1, w2, ··· , wm of all the previous devices”, pg. 2, right column, under “Alternating Optimization Algorithm”) (Ωˆ optimizes the weights based on the current model parameters of all devices, including the central server (global model) and the local devices (local model).)
However, Li does not explicitly teach a second regularization parameter that indicates a tolerance to errors caused by outliers.
Shen teaches a second regularization parameter that indicates a tolerance to errors caused by outliers. (“With an accurate ratio of outliers and a tolerance parameter, a tolerance-based adaptive online outlier detection (TAOOD) algorithm is proposed”, pg. 1, Abstract)
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the nodes and parameters of Li with the tolerance parameter of Shen's tolerance-based adaptive online outlier detection (TAOOD) algorithm. One would have been motivated to combine the two teachings because a tolerance-based adaptive online outlier detection algorithm decreases the amount of transmitted data by discarding duplicate data and outliers, and eliminates the limitation of the original window-based outlier detection algorithm, as disclosed by Shen. (“The contributions of TAOOD are two folds: (i) TAOOD decreases the amount of transmitted data by discarding duplicate data and outliers; (ii) TAOOD eliminates the limitation of original window-based outlier detection algorithm by adapting an accurate ratio of outliers and a tolerance parameter”, pg. 1, Abstract)
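To make the tolerance concept concrete, a tolerance-based outlier screen can be as simple as the generic sketch below. This is illustrative only; Shen's TAOOD is a window-based algorithm with an adaptively estimated outlier ratio, which this placeholder does not reproduce.

```python
import numpy as np

def tolerance_outlier_screen(X, outlier_ratio=0.05, tolerance=1.5):
    """Generic screen: score points by distance from the centroid and
    discard those beyond a tolerance-widened quantile cutoff."""
    scores = np.linalg.norm(X - X.mean(axis=0), axis=1)
    cutoff = np.quantile(scores, 1.0 - outlier_ratio) * tolerance
    return X[scores <= cutoff]
```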
Claims 4, 15, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Li in view of Shen, and further in view of Gaoyang Liu, “FedEraser: Enabling Efficient Client-Level Data Removal from Federated Learning Models” (herein referred to as Liu).
Regarding claim 4, Li, as modified by Shen, teaches the participant node of claim 3, wherein to perform training for the distributed multitask support vector machine comprises to perform training for the distributed multitask support vector machine fo