Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The certified copy of priority Application No. JP2021-164984, filed on 10/06/2021, has been filed.
Information Disclosure Statement
The information disclosure statements (IDSs) submitted on 03/18/2024 and 02/06/2025 are being considered by the examiner.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitations are: “sensing device” in claims 1 and 11-19; “calculation unit” in claims 1, 3-4, 8, and 11; “learning unit” in claims 1-2, 5-6, and 8-12; “prediction unit” in claims 5-6; “data management unit” in claim 7; “transmission unit” in claims 10, 12, 15-16, and 19; “receiving unit” in claims 15-17; “collection unit” in claims 15-19; “sensor unit” in claim 18; and “learning device” in claims 15-17 and 19-20.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Tajima (US 20230144809 A1) in view of Barshan (US 20210103829 A1).
Regarding claims 1 and 14, Tajima discloses a calculation unit that calculates a degree of influence of data collected by a sensing device ([0069] The SWD calculation unit 261 determines presence or absence of an invalid label. If an invalid label is present, the SWD calculation unit 261 excludes the invalid label from the plurality of labels (S601). The remaining labels are valid labels. For example, at least one valid label (or invalid label) may be manually selected via the input UI provided by the user input unit 270. Here, the “invalid label” is a label for which the number of target data elements (data elements in the target data) is less than a predetermined value L (L is a natural number).); and
a learning unit that generates a trained model by a few-label learning process of training the model by using data in which the degree of influence satisfies a condition ("[0038] For each of the one or more second periods, the variation point specifying unit 262 determines whether the calculated SWD is greater than or equal to a first threshold, and specifies a time according to the second period as a variation point when a result of the determination is true. The candidate determination unit 263 determines one or more training data candidates respectively corresponding to one or more periods in the entire period from a part or all of the entire data in the data store 131 based on one or more variation points. The result output unit 264 outputs the determined one or more training data candidates or meta-information about them for relearning of the learning model.
[0059] The generation of the pseudo label enables utilization of partial data, such as calculation of an SWD for each label and a determination of a training data candidate for each label even if the label is not associated with the data element." (the examiner interprets the few-label learning disclosed by the instant application to be the same as the pseudo labeling disclosed by prior art Tajima, since pseudo-labeling is a semi-supervised machine learning technique that boosts model performance by using a model trained on a small labeled dataset to generate “pseudo-labels” (predicted labels) for a larger set of unlabeled data, then retraining the model on both the original labeled data and the new pseudo-labeled data)). Tajima implicitly discloses the degree of influence as an SWD (sliced Wasserstein distance): the prior art identifies a point of change based on the SWD and selects, as training data candidates, data after that point of change.
However, in a similar field of endeavor of applying an influence function to training data points, Barshan explicitly teaches, in better detail, a calculation unit that calculates a degree of influence of data collected by a sensing device on model training by machine learning ([0064] In order to identify the influential training data points, an influence function 410 is provided the database of labeled training data points 210, the trained MLA 230, the input 310, and/or the prediction 320 corresponding to the input 310. The influence function 410 may determine an indicator of influence, i.e. an influence score, for each of the training data points in the database of labeled training data points 210. The influence score may be a numerical indicator of how much influence the training data point has with respect to the prediction 320.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Barshan’s teaching of influence data scores and functions, with Tajima’s disclosure of pseudo labeling, in order to provide training data points that are more related to a prediction, and reduce the likelihood that outlier training data points will be provided to the user (Barshan [0008]).
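The pseudo-labeling interpretation applied above can be sketched minimally as follows. This is an illustrative sketch only: the 1-nearest-neighbor toy model and the data are assumptions for clarity, not drawn from either cited reference.

```python
# Illustrative sketch of pseudo-labeling (hypothetical model and data):
# a model fit on a small labeled set predicts labels for an unlabeled pool,
# and training then continues on the combined labeled + pseudo-labeled set.

def nearest_label(x, labeled):
    """Predict by copying the label of the closest labeled point (toy model)."""
    return min(labeled, key=lambda pair: abs(pair[0] - x))[1]

def pseudo_label_round(labeled, unlabeled):
    """One pseudo-labeling round: label the pool, then merge for retraining."""
    pseudo = [(x, nearest_label(x, labeled)) for x in unlabeled]
    return labeled + pseudo

small_labeled = [(0.0, "low"), (10.0, "high")]   # few-label seed set
unlabeled_pool = [1.0, 2.0, 9.0]
merged = pseudo_label_round(small_labeled, unlabeled_pool)
# merged holds the two seed points plus three pseudo-labeled points
```

In practice the second training pass would use a real learner rather than the toy nearest-neighbor rule; the structure of the loop is what matters here.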
Regarding claim 2, Tajima discloses the learning unit performs the few-label learning process using data having the degree of influence greater than a predetermined threshold ("[0038] For each of the one or more second periods, the variation point specifying unit 262 determines whether the calculated SWD is greater than or equal to a first threshold, and specifies a time according to the second period as a variation point when a result of the determination is true. The candidate determination unit 263 determines one or more training data candidates respectively corresponding to one or more periods in the entire period from a part or all of the entire data in the data store 131 based on one or more variation points. The result output unit 264 outputs the determined one or more training data candidates or meta-information about them for relearning of the learning model.
[0098] The meta-information list 112 is a list of meta-information about training data candidates. For example, the meta-information list 112 represents, for each training data candidate, an ID of a candidate, a start time of a period corresponding to the candidate, an end time of the period, an SWD of the candidate, and an exclusion period during the period."). Tajima implicitly discloses the degree of influence as an SWD (sliced Wasserstein distance): the prior art identifies a point of change based on the SWD and selects, as training data candidates, data after that point of change.
However, in a similar field of endeavor of applying an influence function to training data points, Barshan explicitly teaches, in better detail, a degree of influence ([0064] In order to identify the influential training data points, an influence function 410 is provided the database of labeled training data points 210, the trained MLA 230, the input 310, and/or the prediction 320 corresponding to the input 310. The influence function 410 may determine an indicator of influence, i.e. an influence score, for each of the training data points in the database of labeled training data points 210. The influence score may be a numerical indicator of how much influence the training data point has with respect to the prediction 320.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Barshan’s teaching of influence data scores and functions, with Tajima’s disclosure of pseudo labeling, in order to provide training data points that are more related to a prediction, and reduce the likelihood that outlier training data points will be provided to the user (Barshan [0008]).
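Tajima’s SWD-and-threshold mechanism relied on above can be sketched generically. The projection count, seed, and toy 2-D data below are assumptions; for 1-D samples of equal size, the Wasserstein-1 distance reduces to the mean absolute difference of sorted values.

```python
import math
import random

def w1_1d(a, b):
    """Wasserstein-1 distance between equal-sized 1-D samples:
    mean absolute difference of the sorted values."""
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

def sliced_wasserstein(xs, ys, n_proj=64, seed=0):
    """Approximate SWD between two sets of 2-D points by averaging
    w1_1d over random 1-D projections (illustrative sketch)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_proj):
        theta = rng.uniform(0.0, math.pi)
        c, s = math.cos(theta), math.sin(theta)
        total += w1_1d([c * x + s * y for x, y in xs],
                       [c * x + s * y for x, y in ys])
    return total / n_proj

def is_variation_point(swd, threshold):
    """Tajima-style test: flag a period when its SWD meets the threshold."""
    return swd >= threshold

base = [(0.0, 0.0), (1.0, 1.0)]       # assumed base-data sample
drifted = [(5.0, 5.0), (6.0, 6.0)]    # assumed later-period sample
```

A period whose data has drifted away from the base data yields a large SWD and is flagged as a variation point; data after that point becomes the training data candidate.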
Regarding claim 3, Tajima does not disclose but Barshan teaches the calculation unit calculates the degree of influence based on a loss function ([0064] The influence score for a training data point may be determined based on the gradient of loss corresponding to the respective training data point.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Barshan’s teaching of influence data scores and functions, with Tajima’s disclosure of pseudo labeling, in order to provide training data points that are more related to a prediction, and reduce the likelihood that outlier training data points will be provided to the user (Barshan [0008]).
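The loss-gradient-based influence score referenced above can be sketched for a one-parameter model. This is a generic first-order sketch under assumed toy data, not the exact formulation of either cited reference.

```python
# Hypothetical sketch: influence as an inner product of loss gradients for
# a one-parameter model y ~ w*x with squared loss.

def grad_sq_loss(w, x, y):
    """Gradient of (w*x - y)**2 with respect to w."""
    return 2.0 * (w * x - y) * x

def influence_score(w, train_pt, test_pt):
    """Inner product of the training-point and test-point loss gradients:
    positive when upweighting the training point moves the parameter in a
    direction that also reduces the test-point loss."""
    return grad_sq_loss(w, *train_pt) * grad_sq_loss(w, *test_pt)

w = 1.0
test_pt = (2.0, 3.0)     # model under-predicts here (w*2 = 2 < 3)
aligned = (1.0, 2.0)     # training point that pulls w upward
opposed = (1.0, 0.5)     # training point that pulls w downward
```

Under this sketch, the aligned training point receives a positive score with respect to the test point and the opposed one a negative score, which is the ranking behavior the rejection attributes to an influence function.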
Regarding claim 4, Tajima does not disclose but Barshan teaches the calculation unit calculates the degree of influence by influence functions ([0064] In order to identify the influential training data points, an influence function 410 is provided the database of labeled training data points 210, the trained MLA 230, the input 310, and/or the prediction 320 corresponding to the input 310. The influence function 410 may determine an indicator of influence, i.e. an influence score, for each of the training data points in the database of labeled training data points 210.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Barshan’s teaching of influence data scores and functions, with Tajima’s disclosure of pseudo labeling, in order to provide training data points that are more related to a prediction, and reduce the likelihood that outlier training data points will be provided to the user (Barshan [0008]).
Regarding claim 5, Tajima discloses a prediction unit that predicts a label of unlabeled data in which no label is assigned ("[0053] The SWD calculation unit 261 specifies target data for which the SWD is not calculated (S301). In a case where a data element with which a label is not associated is present in the target data, the pseudo label generation unit 267 sets a result of clustering the base data as a pseudo label, generates a pseudo label to be given to the data element by causing an identification model such as a k-nearest neighbor algorithm to learn a relationship between the pseudo label and the base data (S302), and causes the generated label to be included in the data element in the data store 131.
[0057] An example of a case where a data element including an input variable has no label is a case where the learning model is a learning model without teacher data. In such a case, generation and assignment of a pseudo label are effective. For each label, inference surveillance may be performed, or a determination may be made whether a set of data elements including labels is used as a training data candidate."), wherein
the learning unit performs the few-label learning process using a predicted label predicted, by the prediction unit, with unlabeled data in which the degree of influence satisfies the condition as target data, and the target data ([0053] The SWD calculation unit 261 calculates the SWD between the base data and the target data selected in S302 (S303), and stores the calculated target data in the data store 131 as at least a part of the meta-information about the target data.
Claim 9. The model operation support system according to claim 4, wherein the processor further generates a pseudo label for a target data element when the target data element has no label.). Tajima implicitly discloses the degree of influence as an SWD (sliced Wasserstein distance): the prior art identifies a point of change based on the SWD and selects, as training data candidates, data after that point of change.
However, in a similar field of endeavor of applying an influence function to training data points, Barshan explicitly teaches, in better detail, a degree of influence ([0064] In order to identify the influential training data points, an influence function 410 is provided the database of labeled training data points 210, the trained MLA 230, the input 310, and/or the prediction 320 corresponding to the input 310. The influence function 410 may determine an indicator of influence, i.e. an influence score, for each of the training data points in the database of labeled training data points 210. The influence score may be a numerical indicator of how much influence the training data point has with respect to the prediction 320.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Barshan’s teaching of influence data scores and functions, with Tajima’s disclosure of pseudo labeling, in order to provide training data points that are more related to a prediction, and reduce the likelihood that outlier training data points will be provided to the user (Barshan [0008]).
Regarding claim 6, Tajima discloses the prediction unit predicts the predicted label of the target data using a classifier trained with a dataset of labeled data in which a label is assigned ([0053] The SWD calculation unit 261 specifies target data for which the SWD is not calculated (S301). In a case where a data element with which a label is not associated is present in the target data, the pseudo label generation unit 267 sets a result of clustering the base data as a pseudo label, generates a pseudo label to be given to the data element by causing an identification model such as a k-nearest neighbor algorithm to learn a relationship between the pseudo label and the base data (S302), and causes the generated label to be included in the data element in the data store 131.), and
the learning unit generates the trained model using a dataset to which the target data with the predicted label assigned is added ([0058] The generation of the pseudo label may be performed as follows, for example. That is, the pseudo label generation unit 267 clusters a plurality of data elements in the base data used to create the learning model into a plurality of data sets, and assigns a class to each of the data sets. The pseudo label generation unit 267 creates a model (for example, a neural network) in which a class assigned to each of the data sets is a pseudo label, and causes a pseudo label generation model as this model to learn through self-teaching.).
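The cluster-then-classify pseudo-label generation cited above ([0053], [0058]) can be sketched as follows. The toy 1-D data and the two-means/nearest-centroid stand-ins for the clustering and k-nearest-neighbor identification model are assumptions for illustration.

```python
# Sketch of Tajima-style pseudo-label generation: cluster the base data,
# use the cluster index as a pseudo label, and assign labels to data
# elements via a nearest-centroid rule (toy stand-in for k-NN).

def two_means_1d(points, iters=10):
    """Toy 2-means clustering of 1-D points; returns two centroids."""
    c0, c1 = min(points), max(points)
    for _ in range(iters):
        g0 = [p for p in points if abs(p - c0) <= abs(p - c1)]
        g1 = [p for p in points if abs(p - c0) > abs(p - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return c0, c1

def pseudo_label(point, centroids):
    """Index of the nearest centroid, used as the pseudo label."""
    return min(range(len(centroids)), key=lambda i: abs(point - centroids[i]))

base = [0.1, 0.2, 0.3, 9.8, 9.9, 10.0]   # assumed unlabeled base data
centroids = two_means_1d(base)
labels = [pseudo_label(p, centroids) for p in base]
```

The resulting cluster indices then serve as labels for training, matching the citation's flow of clustering base data and teaching an identification model the cluster-to-label relationship.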
Regarding claim 7, Tajima discloses a data management unit that deletes data in which the degree of influence does not satisfy the condition ([0069] The SWD calculation unit 261 determines presence or absence of an invalid label. If an invalid label is present, the SWD calculation unit 261 excludes the invalid label from the plurality of labels (S601). The remaining labels are valid labels. For example, at least one valid label (or invalid label) may be manually selected via the input UI provided by the user input unit 270. Here, the “invalid label” is a label for which the number of target data elements (data elements in the target data) is less than a predetermined value L (L is a natural number).),
and stores data in which the degree of influence satisfies the condition into a storage unit as a log ([0048] The model operation support system 100 includes an interface device 251, a storage device 252, and a processor 253 connected to them. Through the interface device 251, data is received from the collection device 210, and a UI is provided to the user terminal 230. The storage device 252 is a base of the data store 131. Data stored in the data store 131 is stored in the storage device 252.). Tajima implicitly discloses the degree of influence as an SWD (sliced Wasserstein distance): the prior art identifies a point of change based on the SWD and selects, as training data candidates, data after that point of change.
However, in a similar field of endeavor of applying an influence function to training data points, Barshan explicitly teaches, in better detail, a data management unit that deletes data in which the degree of influence does not satisfy the condition ([0063]For example the user may wish to identify training data points causing erroneous predictions and remove those labeled training data points from the database of labeled training data points 210. Identifying the influential training data points may provide a human-understandable explanation as to how the trained MLA 230 made the prediction 320. As described above, humans are more likely to rely on the prediction 320 if they can understand how the prediction 320 was made.
[0064] In order to identify the influential training data points, an influence function 410 is provided the database of labeled training data points 210, the trained MLA 230, the input 310, and/or the prediction 320 corresponding to the input 310. The influence function 410 may determine an indicator of influence, i.e. an influence score, for each of the training data points in the database of labeled training data points 210. The influence score may be a numerical indicator of how much influence the training data point has with respect to the prediction 320.
[0079] If at step 540 a determination is made that one or more of the training data points are causing errors, the method 500 may continue at step 545 where the training data points causing errors are removed from the set of training data. The training data points may be deleted from the database of labeled training data points 210, or an indication may be stored that these training data points should not be used for training the MLA.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Barshan’s teaching of removing training data points whose influence scores do not satisfy a condition, with Tajima’s disclosure of pseudo labeling, in order to provide training data points that are more related to a prediction, and reduce the likelihood that outlier training data points will be provided to the user (Barshan [0008]). The modification would also reduce erroneous predictions (Barshan [0063]).
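The data-management behavior mapped above can be sketched as a simple split on influence score. The names, scores, and threshold are illustrative assumptions, not values from the record.

```python
# Sketch of the claimed data management step: delete points whose
# influence does not satisfy the condition, and log the points that do.

def manage_training_data(scored_points, threshold):
    """Split (point, influence_score) pairs: points meeting the threshold
    are kept and logged; the rest are marked for deletion."""
    kept_log, deleted = [], []
    for point, score in scored_points:
        (kept_log if score >= threshold else deleted).append(point)
    return kept_log, deleted

scored = [("img_a", 0.9), ("img_b", 0.1), ("img_c", 0.7)]   # assumed scores
log, removed = manage_training_data(scored, threshold=0.5)
```

This mirrors Barshan [0079]: points causing errors are removed (or flagged as unusable), while the retained points remain available for retraining.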
Regarding claim 8, Tajima discloses the calculation unit calculates the degree of influence of image data collected by an image sensor ([0047] The target device 220 is a device as an object of the learning model. The collection device 210 collects data about the target device 220 and transmits the collected data to the model operation support system 100. For example, in a case where the learning model is a model for detecting a value indicated by an analog meter, the target device 220 may be an analog meter, the collection device 210 may be a camera that images the analog meter or a device that collects a captured image from the camera, and the collected data may be captured image data of the analog meter.), and
the learning unit performs the few-label learning process using image data in which the degree of influence satisfies the condition ("[0038] For each of the one or more second periods, the variation point specifying unit 262 determines whether the calculated SWD is greater than or equal to a first threshold, and specifies a time according to the second period as a variation point when a result of the determination is true. The candidate determination unit 263 determines one or more training data candidates respectively corresponding to one or more periods in the entire period from a part or all of the entire data in the data store 131 based on one or more variation points. The result output unit 264 outputs the determined one or more training data candidates or meta-information about them for relearning of the learning model.
[0059] The generation of the pseudo label enables utilization of partial data, such as calculation of an SWD for each label and a determination of a training data candidate for each label even if the label is not associated with the data element."). Tajima implicitly discloses the degree of influence as an SWD (sliced Wasserstein distance): the prior art identifies a point of change based on the SWD and selects, as training data candidates, data after that point of change.
However, in a similar field of endeavor of applying an influence function to training data points, Barshan explicitly teaches, in better detail, the calculation unit calculates the degree of influence of image data collected by an image sensor ([0064] In order to identify the influential training data points, an influence function 410 is provided the database of labeled training data points 210, the trained MLA 230, the input 310, and/or the prediction 320 corresponding to the input 310. The influence function 410 may determine an indicator of influence, i.e. an influence score, for each of the training data points in the database of labeled training data points 210.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Barshan’s teaching of influence data scores and functions, with Tajima’s disclosure of pseudo labeling, in order to provide training data points that are more related to a prediction, and reduce the likelihood that outlier training data points will be provided to the user (Barshan [0008]).
Regarding claim 9, Tajima discloses the learning unit performs the few-label learning process using corrected image data obtained by correcting the image data in which the degree of influence satisfies the condition ([0076] In a case where the detection is made that the current time is the relearning timing of the learning model, this processing is performed. For example, the model surveillance unit 268 may surveil whether the degree of change in the tendency of inference including the input to the inference model and the output from the inference model has reached a certain degree or more. The relearning timing may be timing at which it is detected that the degree of change is greater than or equal to the certain degree. This makes it possible to automatically determine an appropriate training data candidate and cause the learning model to relearn when the degree of change in the tendency of inference reaches the certain degree or more.). Tajima implicitly discloses the degree of influence as an SWD (sliced Wasserstein distance): the prior art identifies a point of change based on the SWD and selects, as training data candidates, data after that point of change.
However, in a similar field of endeavor of applying an influence function to training data points, Barshan explicitly teaches, in better detail, the learning unit performs the few-label learning process using corrected image data obtained by correcting the image data in which the degree of influence satisfies the condition ("[0066] Influence function formulation can be viewed as an inner product between the decrease loss vector for a given test sample and the vector of parameter change caused by upweighting a training sample. For low probability training samples (e.g., outliers) the magnitude of loss gradient and consequently the vector of parameter change are often large. This large-sized magnitude can dominate the effect of directional similarity in the inner product and leads to a significantly larger influence score for low probability training samples compared to more typical ones.
[0067] A normalizing function 430 may be used to determine samples that change the model parameters in the direction of reinforcing the generated prediction the most. To accomplish this, influence scores may be modified to reflect the directional alignment between the change in model parameters and improved loss for the generated prediction.
[0080] After removing the training data points causing errors at step 545, the MLA may be re-trained at step 550. The MLA may be retrained without using the training data points that were removed at step 545. Instead of retraining the MLA, the MLA may be modified so that the removed training data points no longer have an influence on the MLA. In other words, rather than retraining the entire MLA, the MLA may be modified so that it acts as if it were retrained without the removed training data points. ").
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Barshan’s teaching of influence data scores and functions, with Tajima’s disclosure of pseudo labeling, in order to provide training data points that are more related to a prediction, and reduce the likelihood that outlier training data points will be provided to the user (Barshan [0008]).
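The normalization idea in the [0066]-[0067] citation (preventing large-magnitude outlier gradients from dominating the inner-product influence score) can be sketched with cosine similarity. The gradient vectors below are assumed toy values.

```python
import math

# Sketch: raw inner-product influence vs. a direction-only (cosine)
# variant in which gradient magnitude no longer dominates.

def raw_influence(test_grad, train_grad):
    """Unnormalized inner-product influence score."""
    return sum(a * b for a, b in zip(test_grad, train_grad))

def cosine_influence(test_grad, train_grad):
    """Direction-only influence: cosine of the angle between the
    test-loss gradient and the training point's gradient."""
    norm = (math.sqrt(sum(a * a for a in test_grad))
            * math.sqrt(sum(b * b for b in train_grad)))
    return raw_influence(test_grad, train_grad) / norm

test_g = [1.0, 0.0]        # assumed test-loss gradient
typical = [1.0, 0.0]       # perfectly aligned, small magnitude
outlier = [100.0, 100.0]   # partly aligned, huge magnitude
raw_typical = raw_influence(test_g, typical)
raw_outlier = raw_influence(test_g, outlier)
```

The raw score ranks the outlier far above the typical point purely on magnitude, while the cosine variant ranks the aligned typical point highest, reflecting the directional-alignment goal of the citation.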
Regarding claim 10, Tajima discloses a transmission unit that transmits the trained model generated by the learning unit to an external device ("[0042] In addition, as described later, the result output unit 264 outputs one or more determined training data candidates for relearning of the learning model, and performs model deployment that is to change an inference model by replacing the learning model with the relearned learning model in response to a manual instruction or without a manual instruction. As described above, in the present embodiment, the model deployment of the relearned learning model is performed using one or more training data candidates instead of or in addition to one or more training data candidates or the meta-information about them. Therefore, convenience is high.
[0047] The target device 220 is a device as an object of the learning model. The collection device 210 collects data about the target device 220 and transmits the collected data to the model operation support system 100. For example, in a case where the learning model is a model for detecting a value indicated by an analog meter, the target device 220 may be an analog meter, the collection device 210 may be a camera that images the analog meter or a device that collects a captured image from the camera, and the collected data may be captured image data of the analog meter.
[0048] Through the interface device 251, data is received from the collection device 210, and a UI is provided to the user terminal 230. The storage device 252 is a base of the data store 131.
[0053] The SWD calculation unit 261 specifies target data for which the SWD is not calculated (S301). In a case where a data element with which a label is not associated is present in the target data, the pseudo label generation unit 267 sets a result of clustering the base data as a pseudo label, generates a pseudo label to be given to the data element by causing an identification model such as a k-nearest neighbor algorithm to learn a relationship between the pseudo label and the base data (S302), and causes the generated label to be included in the data element in the data store 131. ").
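For illustration only (not taken from the reference; the functions, data, and parameters are hypothetical), the pseudo-label generation of Tajima [0053] (S302), in which a clustering result over base data serves as a pseudo label and an identification model such as a k-nearest neighbor algorithm assigns that label to unlabeled data elements, may be sketched as:

```python
# Illustrative sketch: cluster the base data to obtain pseudo labels,
# then label a new, unlabeled element by k-nearest-neighbor vote.

def cluster(base, centers):
    """Assign each base value the index of its nearest center;
    the cluster index serves as the pseudo label."""
    return [min(range(len(centers)), key=lambda c: abs(x - centers[c]))
            for x in base]

def knn_label(base, labels, x, k=3):
    """k-nearest-neighbor identification: label x by majority vote
    among the k base values closest to it."""
    nearest = sorted(range(len(base)), key=lambda i: abs(base[i] - x))[:k]
    votes = [labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

base = [0.1, 0.2, 0.15, 5.0, 5.2, 4.9]      # two visible groups
pseudo = cluster(base, centers=[0.0, 5.0])  # pseudo labels 0 and 1
print(pseudo)                        # -> [0, 0, 0, 1, 1, 1]
print(knn_label(base, pseudo, 4.8))  # unlabeled element -> pseudo label 1
```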
Regarding claim 11, Tajima discloses the calculation unit calculates the degree of influence of data collected by a sensing device that is the external device using the trained model ([0045] The model operation support system 100 communicates with a user terminal 230 and one or a plurality of collection devices 210 (for example, two collection devices 210A and 210B) via a network 200 (for example, the Internet). The collection device 210 collects data from one or more target devices 220. For example, the collection device 210A collects data from one target device 220A, and the collection device 210B collects data from two target devices 220B1 and 220B2. For example, the model operation support system 100 may be an example of a core system, and the collection device 210 may be an example of an edge system.
[0069] The SWD calculation unit 261 determines presence or absence of an invalid label. If an invalid label is present, the SWD calculation unit 261 excludes the invalid label from the plurality of labels (S601). The remaining labels are valid labels. For example, at least one valid label (or invalid label) may be manually selected via the input UI provided by the user input unit 270. Here, the “invalid label” is a label for which the number of target data elements (data elements in the target data) is less than a predetermined value L (L is a natural number).), and
the learning unit updates the trained model using data in which the degree of influence calculated by the calculation unit satisfies the condition ([0076] The relearning timing may be timing at which it is detected that the degree of change is greater than or equal to the certain degree. This makes it possible to automatically determine an appropriate training data candidate and cause the learning model to relearn when the degree of change in the tendency of inference reaches the certain degree or more.
[0093] On the other hand, if the result of the determination in S707 is true (YES in S707), for example, in a case where the evaluation of the relearned learning model is higher than or equal to the second evaluation, the result output unit 264 performs model deployment that is to change the inference model by replacing the learning model with the relearned learning model without presenting the operation support UI 110 (S709).). Tajima implicitly discloses a degree of influence as an SWD (sliced Wasserstein distance): the reference obtains a point of change using the SWD as a base point and selects, as training data candidates, the data after that point of change.
However, in a similar field of endeavor of applying an influence function to training data points, Barshan explicitly teaches, in greater detail, a degree of influence ([0064] In order to identify the influential training data points, an influence function 410 is provided the database of labeled training data points 210, the trained MLA 230, the input 310, and/or the prediction 320 corresponding to the input 310. The influence function 410 may determine an indicator of influence, i.e. an influence score, for each of the training data points in the database of labeled training data points 210. The influence score may be a numerical indicator of how much influence the training data point has with respect to the prediction 320.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Barshan’s teaching of influence scores and influence functions with Tajima’s disclosure of pseudo labeling in order to provide training data points that are more related to a prediction and to reduce the likelihood that outlier training data points will be provided to the user (Barshan [0008]).
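For illustration only (not taken from the reference; the function names, window size, and data are hypothetical), Tajima's implicit use of the SWD to obtain a point of change and to select training data candidates from the data after that point may be sketched with the one-dimensional Wasserstein distance (the 1-D special case of the sliced Wasserstein distance):

```python
# Illustrative sketch: compare the base distribution against a sliding
# window of incoming data; the first window whose distance exceeds a
# threshold marks the point of change, and the data from that point on
# are selected as training data candidates.

def wasserstein_1d(a, b):
    """1-D Wasserstein distance: mean absolute difference of sorted
    samples (assumes equal-length samples)."""
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

def candidates_after_change(base, stream, window, threshold):
    """Return the data after the first window whose distance from the
    base distribution exceeds the threshold (the 'point of change')."""
    for start in range(len(stream) - window + 1):
        win = stream[start:start + window]
        if wasserstein_1d(base, win) > threshold:
            return stream[start:]   # candidates: data after the change
    return []

base = [1.0, 1.1, 0.9, 1.0]
stream = [1.0, 1.05, 0.95, 1.0, 3.0, 3.1, 2.9, 3.05]
print(candidates_after_change(base, stream, window=4, threshold=0.5))
```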
Regarding claim 12, Tajima discloses the transmission unit transmits the trained model updated by the learning unit to the sensing device ([0042] In addition, as described later, the result output unit 264 outputs one or more determined training data candidates for relearning of the learning model, and performs model deployment that is to change an inference model by replacing the learning model with the relearned learning model in response to a manual instruction or without a manual instruction.).
Regarding claim 13, Tajima discloses the learning device is a server device that provides a model to the sensing device ([0045] The model operation support system 100 communicates with a user terminal 230 and one or a plurality of collection devices 210 (for example, two collection devices 210A and 210B) via a network 200 (for example, the Internet). The collection device 210 collects data from one or more target devices 220. For example, the collection device 210A collects data from one target device 220A, and the collection device 210B collects data from two target devices 220B1 and 220B2. For example, the model operation support system 100 may be an example of a core system, and the collection device 210 may be an example of an edge system.).
Regarding claims 15 and 20, Tajima discloses a transmission unit that transmits data collected by sensing to a learning device that generates, in a case where a degree of influence of the data on model training by machine learning satisfies a condition, a trained model by a few-label learning process of training the model by using the data ("[0042] In addition, as described later, the result output unit 264 outputs one or more determined training data candidates for relearning of the learning model, and performs model deployment that is to change an inference model by replacing the learning model with the relearned learning model in response to a manual instruction or without a manual instruction. As described above, in the present embodiment, the model deployment of the relearned learning model is performed using one or more training data candidates instead of or in addition to one or more training data candidates or the meta-information about them. Therefore, convenience is high.
[0047] The target device 220 is a device as an object of the learning model. The collection device 210 collects data about the target device 220 and transmits the collected data to the model operation support system 100. For example, in a case where the learning model is a model for detecting a value indicated by an analog meter, the target device 220 may be an analog meter, the collection device 210 may be a camera that images the analog meter or a device that collects a captured image from the camera, and the collected data may be captured image data of the analog meter.
[0053] The SWD calculation unit 261 specifies target data for which the SWD is not calculated (S301). In a case where a data element with which a label is not associated is present in the target data, the pseudo label generation unit 267 sets a result of clustering the base data as a pseudo label, generates a pseudo label to be given to the data element by causing an identification model such as a k-nearest neighbor algorithm to learn a relationship between the pseudo label and the base data (S302), and causes the generated label to be included in the data element in the data store 131.
[0093] On the other hand, if the result of the determination in S707 is true (YES in S707), for example, in a case where the evaluation of the relearned learning model is higher than or equal to the second evaluation, the result output unit 264 performs model deployment that is to change the inference model by replacing the learning model with the relearned learning model without presenting the operation support UI 110 (S709).");
a receiving unit that receives the trained model trained by the learning device from the learning device ("[0045] The model operation support system 100 communicates with a user terminal 230 and one or a plurality of collection devices 210 (for example, two collection devices 210A and 210B) via a network 200 (for example, the Internet). The collection device 210 collects data from one or more target devices 220. For example, the collection device 210A collects data from one target device 220A, and the collection device 210B collects data from two target devices 220B1 and 220B2. For example, the model operation support system 100 may be an example of a core system, and the collection device 210 may be an example of an edge system.
[0051] In addition, it is assumed that the model learning unit 265 creates a learning model and the model inference unit 266 uses the learning model as an inference model. For example, it is assumed that the data collection unit 269 receives data collected for the target device 220 from the collection device 210 and stores the data in the data store 131, and the model inference unit 266 inputs an input variable in the data to the inference model to obtain an output variable."); and
a collection unit that collects data by sensing using the trained model ([0050] The data collection unit 269 collects data from the collection device 210 and stores the collected data in the data store 131.). Tajima implicitly discloses a degree of influence as an SWD (sliced Wasserstein distance): the reference obtains a point of change using the SWD as a base point and selects, as training data candidates, the data after that point of change.
However, in a similar field of endeavor of applying an influence function to training data points, Barshan explicitly teaches, in greater detail, a degree of influence ([0064] In order to identify the influential training data points, an influence function 410 is provided the database of labeled training data points 210, the trained MLA 230, the input 310, and/or the prediction 320 corresponding to the input 310. The influence function 410 may determine an indicator of influence, i.e. an influence score, for each of the training data points in the database of labeled training data points 210. The influence score may be a numerical indicator of how much influence the training data point has with respect to the prediction 320.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to combine Barshan’s teaching of influence scores and influence functions with Tajima’s disclosure of pseudo labeling in order to provide training data points that are more related to a prediction and to reduce the likelihood that outlier training data points will be provided to the user (Barshan [0008]).
Regarding claim 16, Tajima discloses the transmission unit transmits the data collected, by the collection unit, by sensing using the trained model to the learning device ([0047] The target device 220 is a device as an object of the learning model. The collection device 210 collects data about the target device 220 and transmits the collected data to the model operation support system 100. For example, in a case where the learning model is a model for detecting a value indicated by an analog meter, the target device 220 may be an analog meter, the collection device 210 may be a camera that images the analog meter or a device that collects a captured image from the camera, and the collected data may be captured image data of the analog meter.).
Regarding claim 17, Tajima discloses the receiving unit receives, from the learning device, the trained model updated using the data collected, by the sensing device, by sensing using the trained model ("[0045] The model operation support system 100 communicates with a user terminal 230 and one or a plurality of collection devices 210 (for example, two collection devices 210A and 210B) via a network 200 (for example, the Internet). The collection device 210 collects data from one or more target devices 220. For example, the collection device 210A collects data from one target device 220A, and the collection device 210B collects data from two target devices 220B1 and 220B2. For example, the model operation support system 100 may be an example of a core system, and the collection device 210 may be an example of an edge system.
[0051] In addition, it is assumed that the model learning unit 265 creates a learning model and the model inference unit 266 uses the learning model as an inference model. For example, it is assumed that the data collection unit 269 receives data collected for the target device 220 from the collection device 210 and stores the data in the data store 131, and the model inference unit 266 inputs an input variable in the data to the inference model to obtain an output variable."), and
the collection unit collects data by sensing using the trained model updated by the learning device ([0050] The data collection unit 269 collects data from the collection device 210 and stores the collected data in the data store 131.).
Regarding claim 18, Tajima discloses the collection unit collects image data detected by a sensor unit ([0047] The target device 220 is a device as an object of the learning model. The collection device 210 collects data about the target device 220 and transmits the collected data to the model operation support system 100. For example, in a case where the learning model is a model for detecting a value indicated by an analog meter, the target device 220 may be an analog meter, the collection device 210 may be a camera that images the analog meter or a device that collects a captured image from the camera, and the collected data may be captured image data of the analog meter.).
Regarding claim 19, Tajima discloses the transmission unit transmits image data collected by sensing to the learning device ([0047] The target device 220 is a device as an object of the learning model. The collection device 210 collects data about the target device 220 and transmits the collected data to the model operation support system 100.),
the receiving unit receives, from the learning device, the trained model trained by the learning device using image data ([0051] FIG. 3 illustrates an example of a flow of processing from specification of target data to calculation of SWD. Note that the base data and the target data in the following description are acquired from the data store 131. In addition, it is assumed that the model learning unit 265 creates a learning model and the model inference unit 266 uses the learning model as an inference model. For example, it is assumed that the data collection unit 269 receives data collected for the target device 220 from the collection device 210 and stores the data in the data store 131, and the model inference unit 266 inputs an input variable in the data to the inference model to obtain an output variable.), and
the collection unit collects image data by sensing using the trained model ([0047] The target device 220 is a device as an object of the learning model. The collection device 210 collects data about the target device 220 and transmits the collected data to the model operation support system 100. For example, in a case where the learning model is a model for detecting a value indicated by an analog meter, the target device 220 may be an analog meter, the collection device 210 may be a camera that images the analog meter or a device that collects a captured image from the camera, and the collected data may be captured image data of the analog meter.).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20190377982 A1 to claim 17: [0008] One or more embodiments provide a processing method for generating learning data, which may include: a step of specifying requirement information for generating learning data, based on request information for making a request for learning; and a step of transmitting the requirement information to a device that generates the learning data.
Kuchnik "EFFICIENT AUGMENTATION VIA DATA SUBSAMPLING" to claim 1: "abstract: We propose a novel set of subsampling policies, based on model influence and loss, that can achieve a 90% reduction in augmentation set size while maintaining the accuracy gains of standard data augmentation.
Page 3, sec 4: Our proposed policies consist of two parts: (i) an augmentation score which maps each training point (xi , yi) to a value s ∈ R, and (ii) a policy by which to sample points based on these augmentation scores. In Section 4.1, we describe two metrics, loss and model influence, by which augmentation scores are generated."
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AHMED A NASHER whose telephone number is (571)272-1885. The examiner can normally be reached Mon - Fri 0800 - 1700.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Moyer can be reached at (571) 272-9523. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AHMED A NASHER/Examiner, Art Unit 2675
/ANDREW M MOYER/Supervisory Patent Examiner, Art Unit 2675