DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is responsive to the application filed on 05/08/2023. Claims 1-20 are pending. Claims 1, 8, and 15 are independent claims.
Priority
Applicant's claim for the benefit of United States Provisional Patent Application No. 63/445,600, filed on February 14, 2023, is acknowledged.
Information Disclosure Statement
The information disclosure statements submitted on 05/09/2023 and 10/11/2024 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: Claims 1-20 are directed to a method. Therefore, the claims fall within the statutory category of a process and are eligible under Step 1.
Independent claim 1:
Step 2A Prong 1:
The claim recites:
identifying a plurality of machine learning models - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and identifying a plurality of machine learning models based on judgment, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper;
determining provenance information for each machine learning model of the plurality of machine learning models - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and identifying parent information of machine learning models that were used in creating each machine learning model based on judgment, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper; and
generating, using the provenance information, a lineage graph with a plurality of nodes and a plurality of provenance edges, wherein the plurality of nodes correspond to the plurality of machine learning models and a provenance edge between two nodes indicates a node is derived from another node - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of observing data and generating a lineage graph from observed data based on judgment, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the claim does not integrate the abstract idea into a practical application and does not amount to significantly more. As such, the claim is ineligible.
Dependent claim 2:
Step 2A Prong 1: The claim recites the abstract ideas of claim 1.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional element:
wherein the plurality of machine learning models include machine learning model derivatives that depend on another machine learning model - this limitation is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the plurality of machine learning models include machine learning model derivatives that depend on another machine learning model - viewed individually or in combination, describes selecting a particular data source or type of data to be manipulated similar to selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display described in MPEP § 2106.05(g).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 3:
Step 2A Prong 1: The claim recites the abstract ideas of claim 1.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional element:
wherein the provenance information provides parent information of machine learning models that were used in creating each machine learning model - this limitation is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the provenance information provides parent information of machine learning models that were used in creating each machine learning model - viewed individually or in combination, describes selecting a particular data source or type of data to be manipulated similar to selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display described in MPEP § 2106.05(g).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 4:
Step 2A Prong 1: The claim recites the abstract ideas of claim 1.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional element:
wherein each node of the plurality of nodes includes a creation function that identifies how each machine learning model is created - this limitation is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein each node of the plurality of nodes includes a creation function that identifies how each machine learning model is created - viewed individually or in combination, describes selecting a particular data source or type of data to be manipulated similar to selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display described in MPEP § 2106.05(g).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 5:
Step 2A Prong 1: The claim recites the abstract ideas of claim 1.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional element:
wherein the lineage graph further includes: a versioning edge between nodes in the lineage graph with consecutive versions of a machine learning model - this limitation is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the lineage graph further includes: a versioning edge between nodes in the lineage graph with consecutive versions of a machine learning model - viewed individually or in combination, describes selecting a particular data source or type of data to be manipulated similar to selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display described in MPEP § 2106.05(g).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 6:
Step 2A Prong 1:
The claim recites:
wherein the lineage graph is automatically generated by:
determining a structural difference between two machine learning models - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and identifying a structural difference between two machine learning models based on judgment, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper;
determining a contextual difference between the two machine learning models - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and identifying a contextual difference between two machine learning models based on judgment, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper; and
using the structural difference and the contextual difference to insert one of the two machine learning models into the lineage graph - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and identifying a machine learning model to insert into the lineage graph based on judgment, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the claim does not integrate the abstract idea into a practical application and does not amount to significantly more. As such, the claim is ineligible.
Dependent claim 7:
Step 2A Prong 1: The claim recites the abstract ideas of claim 6.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional elements:
receiving a modification of the lineage graph from a user - this step is recited at a high level of generality and amounts to mere data gathering, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)); and
updating the lineage graph in response to the modification - this step is recited at a high level of generality and amounts to merely modifying data, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
receiving a modification of the lineage graph from a user - which is a well-understood, routine, and conventional activity similar to receiving or transmitting data over a network, as described in MPEP § 2106.05(d)(II); and
updating the lineage graph in response to the modification - this step is recited at a high level of generality and amounts to merely modifying data, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Independent claim 8:
Step 2A Prong 1:
The claim recites:
obtaining a lineage graph with a plurality of nodes and a plurality of provenance edges, wherein each node of the plurality of nodes corresponds to a machine learning model and a provenance edge between two nodes indicates a machine learning model is derived from another machine learning model - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of obtaining and evaluating data, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper;
performing a traversal of the lineage graph - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of traversing the lineage graph, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper; and
using the provenance edge of the node to identify another node connected to the node - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and identifying another node connected to the node based on judgment, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional elements:
applying, in response to the traversal, a function to a node of the plurality of nodes - this step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), i.e., mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)); and
automatically applying the function to the other node - this step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), i.e., mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
applying, in response to the traversal, a function to a node of the plurality of nodes - this step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), i.e., mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)); and
automatically applying the function to the other node - this step is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), i.e., mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 9:
Step 2A Prong 1: The claim recites:
wherein the traversal of the lineage graph includes visiting nodes of the lineage graph in an arbitrary order in response to an identification of a type of edge to traverse - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of evaluating data and identifying nodes to traverse based on judgment, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the claim does not integrate the abstract idea into a practical application and does not amount to significantly more. As such, the claim is ineligible.
Dependent claim 10:
Step 2A Prong 1: The claim recites the abstract ideas of claim 8.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional element:
wherein the function is debugging the machine learning model of the node - this limitation is recited at a high level of generality and amounts to insignificant application of the judicial exception, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the function is debugging the machine learning model of the node - this limitation is recited at a high level of generality and amounts to insignificant application of the judicial exception, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 11:
Step 2A Prong 1: The claim recites the abstract ideas of claim 8.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional element:
wherein the function is a modification to the machine learning model of the node - this limitation is recited at a high level of generality and amounts to insignificant application of the judicial exception, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the function is a modification to the machine learning model of the node - this limitation is recited at a high level of generality and amounts to insignificant application of the judicial exception, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 12:
Step 2A Prong 1: The claim recites the abstract ideas of claim 8.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional element:
wherein the function is applying a test to the machine learning model of the node - this limitation is recited at a high level of generality and amounts to insignificant application of the judicial exception, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the function is applying a test to the machine learning model of the node - this limitation is recited at a high level of generality and amounts to insignificant application of the judicial exception, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 13:
Step 2A Prong 1: The claim recites the abstract ideas of claim 8.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional element:
wherein the function runs diagnosis on machine learning models - this limitation is recited at a high level of generality and amounts to insignificant application of the judicial exception, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the function runs diagnosis on machine learning models - this limitation is recited at a high level of generality and amounts to insignificant application of the judicial exception, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 14:
Step 2A Prong 1: The claim recites:
wherein the diagnosis includes measuring sparsity levels of the machine learning models or computing deltas between the machine learning models - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses a mathematical concept, namely the mathematical calculations of measuring sparsity levels of the machine learning models or computing deltas between the machine learning models.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the claim does not integrate the abstract idea into a practical application and does not amount to significantly more. As such, the claim is ineligible.
Independent claim 15:
Step 2A Prong 1:
The claim recites:
obtaining a lineage graph with a plurality of nodes and a plurality of provenance edges, wherein each node of the plurality of nodes corresponds to a machine learning model and a provenance edge between two nodes indicates a machine learning model is derived from another machine learning model - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of obtaining and evaluating data, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper; and
using the lineage graph to determine a storage optimization for generating a compressed lineage graph - Under its broadest reasonable interpretation in light of the specification, this limitation encompasses the mental process of obtaining, evaluating and identifying data, which is observing, evaluating and judging that is practically capable of being performed in the human mind with the assistance of pen and paper.
Step 2A Prong 2 & Step 2B: No additional elements are recited, so the claim does not integrate the abstract idea into a practical application and does not amount to significantly more. As such, the claim is ineligible.
Dependent claim 16:
Step 2A Prong 1: The claim recites the abstract ideas of claim 15.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional element:
wherein the compressed lineage graph reduces a storage footprint for a plurality of machine learning models represented in the lineage graph - this limitation is recited at a high level of generality and amounts to insignificant application of the judicial exception, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the compressed lineage graph reduces a storage footprint for a plurality of machine learning models represented in the lineage graph - this limitation is recited at a high level of generality and amounts to insignificant application of the judicial exception, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 17:
Step 2A Prong 1: The claim recites the abstract ideas of claim 15.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional element:
wherein the storage optimization is recursively applied to the lineage graph - this limitation is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), i.e., mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the storage optimization is recursively applied to the lineage graph - this limitation is recited at a high level of generality and amounts to no more than a recitation of the words "apply it" (or an equivalent), i.e., mere instructions to implement an abstract idea or other exception on a computer (see MPEP § 2106.05(f)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 18:
Step 2A Prong 1: The claim recites the abstract ideas of claim 15.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional element:
wherein the lineage graph provides provenance information for the plurality of nodes in the lineage graph with parent information of each machine learning model associated with the plurality of nodes - this limitation is recited at a high level of generality and amounts to selecting a particular data source or type of data to be manipulated, which is a form of insignificant extra-solution activity (see MPEP § 2106.05(g)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the lineage graph provides provenance information for the plurality of nodes in the lineage graph with parent information of each machine learning model associated with the plurality of nodes - viewed individually or in combination, describes selecting a particular data source or type of data to be manipulated similar to selecting information, based on types of information and availability of information in a power-grid environment, for collection, analysis and display described in MPEP § 2106.05(g).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 19:
Step 2A Prong 1: The claim recites the abstract ideas of claim 15.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional element:
wherein the storage optimization is content-based hashing - this limitation is recited at a high level of generality and amounts to merely indicating a field of use or technological environment in which the judicial exception is performed (see MPEP § 2106.05(h)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the storage optimization is content-based hashing - this limitation is recited at a high level of generality and amounts to merely indicating a field of use or technological environment in which the judicial exception is performed (see MPEP § 2106.05(h)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Dependent claim 20:
Step 2A Prong 1: The claim recites the abstract ideas of claim 15.
Step 2A Prong 2: This judicial exception is not integrated into a practical application because the claim recites the following additional element:
wherein the storage optimization is delta compression - this limitation is recited at a high level of generality and amounts to merely indicating a field of use or technological environment in which the judicial exception is performed (see MPEP § 2106.05(h)).
Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is thus directed to the abstract idea.
Step 2B: The claims do not include additional elements that amount to significantly more than the judicial exception.
The additional elements:
wherein the storage optimization is delta compression - this limitation is recited at a high level of generality and amounts to merely indicating a field of use or technological environment in which the judicial exception is performed (see MPEP § 2106.05(h)).
Accordingly, these additional elements do not amount to significantly more than the judicial exception. As such, the claims are ineligible.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-3, 8, and 10-14 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Bowers et al., US 2016/0300156 A1 (hereinafter "Bowers").
Regarding independent claim 1, Bowers teaches a method, comprising:
identifying a plurality of machine learning models ([0025] FIG. 2 is a block diagram illustrating an application service system 200 implementing a machine learner system 202; The application service system 200 can run one or more application services (e.g., an application service 204A, an application service 204B, and an application service 204C, collectively as “the application services 204”); [0028] The model tracking engine 214 implements a model tracking service. The model tracking service can track one or more machine learning models (e.g., including the production models 206A, 206B, and 206C) for one or more of the application services 204. The model tracker database 234 is configured to record data and metadata associated with the machine learning models tracked by the model tracking engine 214. For example, the model tracker database 234 can store and index the machine learning models by version histories, sources of training dataset, training datasets, training configurations, evaluative metrics, or any combination thereof, such that the model tracker database 234 can be queried using one of these variables);
determining provenance information for each machine learning model of the plurality of machine learning models ([0020] the version history can include the production copy as a parent model. The production copy can be based on (e.g., modified from) another machine learning model, and this other machine learning model can be a grandparent model to the latent model in question. In some embodiments, the model tracking service can track and record one or more differences in training configurations of the latent model as compared to its parent model (e.g., the production copy) in the model tracker database 118; [0045] FIG. 4 is a data flow diagram illustrating an example of how a machine learner system tracks source information of a machine learning model (i.e. provenance information)); and
generating, using the provenance information, a lineage graph with a plurality of nodes and a plurality of provenance edges, wherein the plurality of nodes correspond to the plurality of machine learning models and a provenance edge between two nodes indicates a node is derived from another node ([0029] In some embodiments, the model tracking service can track a version history of a machine learning model (e.g., a latent or a production model). The version history can be represented by a provenance chain of one or more machine learning models that are based on one another in order (i.e. a lineage graph). For example, the version history can include one or more modifications from a previous machine learning model in the provenance chain to a subsequent machine learning model in the provenance chain; [0045] FIG. 4 illustrates an example of a version history 400 of a production model 402 in the form of a provenance chain maintained by a machine learner system (e.g., the machine learner system 202 of FIG. 2). The version history 400 includes an ex-production model 410A and an ex-production model 410B. The version history 400 also includes a test model 412A and a test model 412B. The test model 412A is built based on a template copy of the ex-production model 410A. The ex-production model 410B can be a variant of the test model 412A (e.g., built with updated training dataset compared to the test model 412A). The test model 412B can be built based on a template copy of the ex-production model 410B. The production model 402 can be a variant of the test model 412B).
Regarding dependent claim 2, Bowers teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Bowers further teaches wherein the plurality of machine learning models include machine learning model derivatives that depend on another machine learning model ([0020] the version history can include the production copy as a parent model. The production copy can be based on (e.g., modified from) another machine learning model, and this other machine learning model can be a grandparent model to the latent model in question. In some embodiments, the model tracking service can track and record one or more differences in training configurations of the latent model as compared to its parent model (e.g., the production copy) in the model tracker database 118; Fig. 4, 412A and 412B; [0045] FIG. 4 illustrates an example of a version history 400 of a production model 402 in the form of a provenance chain maintained by a machine learner system (e.g., the machine learner system 202 of FIG. 2). The version history 400 includes an ex-production model 410A and an ex-production model 410B. The version history 400 also includes a test model 412A and a test model 412B. The test model 412A is built based on a template copy of the ex-production model 410A. The ex-production model 410B can be a variant of the test model 412A (e.g., built with updated training dataset compared to the test model 412A). The test model 412B can be built based on a template copy of the ex-production model 410B. The production model 402 can be a variant of the test model 412B).
Regarding dependent claim 3, Bowers teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Bowers further teaches wherein the provenance information provides parent information of machine learning models that were used in creating each machine learning model ([0020] the version history can include the production copy as a parent model. The production copy can be based on (e.g., modified from) another machine learning model, and this other machine learning model can be a grandparent model to the latent model in question. In some embodiments, the model tracking service can track and record one or more differences in training configurations of the latent model as compared to its parent model (e.g., the production copy) in the model tracker database 118; [0045] FIG. 4 is a data flow diagram illustrating an example of how a machine learner system tracks source information of a machine learning model).
Regarding independent claim 8, Bowers teaches a method, comprising:
obtaining a lineage graph with a plurality of nodes and a plurality of provenance edges, wherein each node of the plurality of nodes corresponds to a machine learning model and a provenance edge between two nodes indicates a machine learning model is derived from another machine learning model ([0029] In some embodiments, the model tracking service can track a version history of a machine learning model (e.g., a latent or a production model). The version history can be represented by a provenance chain of one or more machine learning models that are based on one another in order (i.e. a lineage graph). For example, the version history can include one or more modifications from a previous machine learning model in the provenance chain to a subsequent machine learning model in the provenance chain; [0045] FIG. 4 illustrates an example of a version history 400 of a production model 402 in the form of a provenance chain maintained by a machine learner system (e.g., the machine learner system 202 of FIG. 2). The version history 400 includes an ex-production model 410A and an ex-production model 410B. The version history 400 also includes a test model 412A and a test model 412B. The test model 412A is built based on a template copy of the ex-production model 410A. The ex-production model 410B can be a variant of the test model 412A (e.g., built with updated training dataset compared to the test model 412A). The test model 412B can be built based on a template copy of the ex-production model 410B. The production model 402 can be a variant of the test model 412B);
performing a traversal of the lineage graph ([0020] The model trainer engine 112 can be coupled to a model tracking service (e.g., implemented by the model tracking engine 214 of FIG. 2). The model tracking service can record the latent model in a model tracker database 118. The model tracker database 118 can store one or more machine learning models 120. The model tracking service can also record the training configurations used to generate the latent model in the model tracker database 118. In some embodiments, the model tracking service can index the latent model in the model tracker database 118 based on the training data source used, the training dataset used, the data features used, or any combination thereof, in creating and training the latent model. In some embodiments, the model tracking service can store a version history of the latent model in the model tracker database 118. The version history can include a provenance chain of the latent model. Tracking the version history can include tracking one or more modifications from a previous machine learning model to a subsequent machine learning model. For example, the version history can include the production copy as a parent model. The production copy can be based on (e.g., modified from) another machine learning model, and this other machine learning model can be a grandparent model to the latent model in question. In some embodiments, the model tracking service can track and record one or more differences in training configurations of the latent model as compared to its parent model (e.g., the production copy) in the model tracker database 118. 
The tracked differences in the training configurations can include differences in one or more sources of training datasets, one or more training datasets, one or more data features, or any combination thereof that were used to train the latent model; [0029] In some embodiments, the model tracking service can track a version history of a machine learning model (e.g., a latent or a production model). The version history can be represented by a provenance chain of one or more machine learning models that are based on one another in order);
applying, in response to the traversal, a function to a node of the plurality of nodes ([0021] In some embodiments, a model evaluation engine 122 can perform offline testing of the latent model and compute evaluative metrics 124 based on the offline testing results 126; [0023] In several embodiments, the model evaluation engine 122 can also perform live data testing (referred to as “online testing”) of one or more of the models; [0029] For example, the version history can include one or more modifications from a previous machine learning model in the provenance chain to a subsequent machine learning model in the provenance chain. The version history can be used to facilitate roll back of a defective model in production; [0035] The model evaluation engine 222 implements a model evaluation service, including testing, evaluation, and/or validation. The model evaluation service can compute one or more evaluative metrics of a machine learning model (e.g., as described for the model evaluation engine 122). In some embodiments, the machine learner interface can present the evaluative metric for a machine learning model along with the training configurations of the machine learning model to facilitate evaluation of the resulting model (e.g., by a developer/analyst user using the machine learner interface). In some embodiments, the model evaluation engine 222 can detect corruption of a machine learning model based on a computed evaluative metric of the machine learning model);
using the provenance edge of the node to identify another node connected to the node ([0032] the machine learner interface can receive a provenance query targeting a target machine learning model. In response, the machine learner interface can render a diagram representing the version history from the model tracker database 234. The diagram can illustrate one or more related machine learning models of the target machine learning model in response to the provenance query); and
automatically applying the function to the other node ([0020] The version history can include a provenance chain of the latent model. Tracking the version history can include tracking one or more modifications from a previous machine learning model to a subsequent machine learning model. For example, the version history can include the production copy as a parent model. The production copy can be based on (e.g., modified from) another machine learning model, and this other machine learning model can be a grandparent model to the latent model in question. In some embodiments, the model tracking service can track and record one or more differences in training configurations of the latent model as compared to its parent model (e.g., the production copy) in the model tracker database 118. The tracked differences in the training configurations can include differences in one or more sources of training datasets, one or more training datasets, one or more data features, or any combination thereof that were used to train the latent model; [0031] The interface engine 230 can implement the user interface of the machine learner platform (e.g., referred to as the “machine learner interface”) for developer and/or analyst users. The machine learner interface can present interactive controls for building, modifying, tracking, training (e.g., manually or automatically according to a schedule and a training plan), evaluating, and/or deploying the machine learning models tracked by the model tracking engine 214. In some embodiments, the machine learner interface can present a comparison report of two or more of the machine learning models by presenting a rendering of each model's training configurations and evaluative metrics side-by-side).
Regarding dependent claim 10, Bowers teaches all the limitations as set forth in the rejection of claim 8 that is incorporated. Bowers further teaches wherein the function is debugging the machine learning model of the node ([0015] The model tracker service can maintain source control trails of the production machine learning models for the application services and thereby enabling error identification in and/or roll back of the production machine learning models).
Regarding dependent claim 11, Bowers teaches all the limitations as set forth in the rejection of claim 8 that is incorporated. Bowers further teaches wherein the function is a modification to the machine learning model of the node ([0030] For example, the model tracking service can receive an indication that a machine learning model is corrupted. Based on tracked training configuration modification of the corrupted model as compared to a previously working model in the version history, the model tracking service can identify a problematic training dataset or a problematic data feature. The previously working model can be the most recent working model in the provenance chain of the corrupted model. The model tracking service can instead receive user indication of a problem data source or a problem data feature. In that case, the model tracking service can identify a model as being corrupted in the model tracker database 234, where the model is trained with the problem data source or the problem data feature. Regardless, in response to identifying the corrupted model and when the corrupted model is in production, the model tracking service can trigger/cause a rollback of the corrupted model by replacing the corrupted model with the previously working model).
Regarding dependent claim 12, Bowers teaches all the limitations as set forth in the rejection of claim 8 that is incorporated. Bowers further teaches wherein the function is applying a test to the machine learning model of the node ([0021] In some embodiments, a model evaluation engine 122 can perform offline testing of the latent model and compute evaluative metrics 124 based on the offline testing results 126; [0023] the model evaluation engine 122 can also perform live data testing (referred to as “online testing”) of one or more of the models).
Regarding dependent claim 13, Bowers teaches all the limitations as set forth in the rejection of claim 8 that is incorporated. Bowers further teaches wherein the function runs diagnosis on machine learning models ([0035] The model evaluation engine 222 implements a model evaluation service, including testing, evaluation, and/or validation).
Regarding dependent claim 14, Bowers teaches all the limitations as set forth in the rejection of claim 13 that is incorporated. Bowers further teaches wherein the diagnosis includes measuring sparsity levels of the machine learning models or computing deltas between the machine learning models ([0020] the model tracking service can track and record one or more differences in training configurations of the latent model as compared to its parent model (e.g., the production copy) in the model tracker database 118. The tracked differences in the training configurations can include differences in one or more sources of training datasets, one or more training datasets, one or more data features, or any combination thereof that were used to train the latent model; [0030] The model evaluation service can compute one or more evaluative metrics of a machine learning model (e.g., as described for the model evaluation engine 122). In some embodiments, the machine learner interface can present the evaluative metric for a machine learning model along with the training configurations of the machine learning model to facilitate evaluation of the resulting model (e.g., by a developer/analyst user using the machine learner interface). In some embodiments, the model evaluation engine 222 can detect corruption of a machine learning model based on a computed evaluative metric of the machine learning model; [0038] In some embodiments, the model evaluation engine 222 computes a ranking of at least a subset of the machine learning models in the model tracker database 234. For example, the ranking can be based on values of one type of evaluative metrics corresponding to the subset of the machine learning models stored in the model tracker database 234).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 4-5 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Bowers as applied in claims 1 and 8, in view of OLINER et al. (hereinafter OLINER), US 20220414157 A1.
Regarding dependent claim 4, Bowers teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Bowers does not explicitly disclose wherein each node of the plurality of nodes includes a creation function that identifies how each machine learning model is created.
However, in the same field of endeavor, OLINER teaches wherein each node of the plurality of nodes includes a creation function that identifies how each machine learning model is created ([0078] Formally, we say that the model repository tracks two types of objects: artifacts and executors. An artifact is a typed piece of data where typed means that the model repository has specific methods (conflict rules) for resolving conflicts between artifacts that are based on their types; [0079] An executor is a piece of code and an environment in which that code is run which takes one or more artifacts as inputs and produces a new artifact as an output; [0080] Internally, the model repository maintains a directed acyclic graph, where artifacts are nodes, and executors are multi-edges which join those nodes. This is illustrated in FIG. 6, which shows executors E1 and E2 producing artifacts A1 and A2. Executor E3 then processes the artifacts to produce artifact A3; [0082] A given artifact can only be produced by an execution of a particular executor; thus, the repository also stores metadata describing these individual executions).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of storing metadata describing executions for reproducibility purposes as suggested in OLINER into Bowers’s system because both of these systems are addressing maintaining a repository for tracking machine learning models. This modification would have been motivated by the desire to provide a technique to maintain a repository of machine learning directed acyclic graphs for reuse to reduce cost and effort (OLINER, [0004]-[0006]).
Regarding dependent claim 5, Bowers teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Bowers does not explicitly disclose
wherein the lineage graph further includes: a versioning edge between nodes in the lineage graph with consecutive versions of a machine learning model.
However, in the same field of endeavor, OLINER teaches wherein the lineage graph further includes: a versioning edge between nodes in the lineage graph with consecutive versions of a machine learning model ([0080] Internally, the model repository maintains a directed acyclic graph, where artifacts are nodes, and executors are multi-edges which join those nodes. This is illustrated in FIG. 6, which shows executors E1 and E2 producing artifacts A1 and A2. Executor E3 then processes the artifacts to produce artifact A3. Each artifact and executor is tagged with a version hash to enable both direct and relational interactions with these artifacts/executors. One example of a direct interaction is retrieval of a particular version of a specific artifact/executor. This is illustrated in FIGS. 7 and 8).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of tagging each artifact and executor with a version hash to enable both direct and relational interactions with these artifacts/executors as suggested in OLINER into Bowers’s system because both of these systems are addressing maintaining a repository for tracking machine learning models. This modification would have been motivated by the desire to provide a technique to maintain a repository of machine learning directed acyclic graphs for reuse to reduce cost and effort (OLINER, [0004]-[0006]).
Regarding dependent claim 9, Bowers teaches all the limitations as set forth in the rejection of claim 8 that is incorporated. Bowers does not explicitly disclose wherein the traversal of the lineage graph includes visiting nodes of the lineage graph in an arbitrary order in response to an identification of a type of edge to traverse.
However, in the same field of endeavor, OLINER teaches wherein the traversal of the lineage graph includes visiting nodes of the lineage graph in an arbitrary order in response to an identification of a type of edge to traverse ([0088] In the disclosed model repository, whether two hashes refer semantically to the same artifact depends on the set of edges which connect those two artifacts. Types help to resolve this complexity. For example, if the user points to a versioned artifact, the model repository knows its type (e.g., model). Using this information, the repository can answer the question “what was the state of this artifact in the past?” by searching backwards through the edges which lead to this artifact and filtering out the nodes which specifically correspond to models (as opposed to, say, training data)).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of using the type information to search through the edges which lead to the artifact and filtering out the nodes which specifically correspond to models as suggested in OLINER into Bowers’s system because both of these systems are addressing maintaining a repository for tracking machine learning models. This modification would have been motivated by the desire to provide a technique to maintain a repository of machine learning directed acyclic graphs for reuse to reduce cost and effort (OLINER, [0004]-[0006]).
Claims 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Bowers as applied in claim 1, in view of Rossi, US 20240119251 A1.
Regarding dependent claim 6, Bowers teaches all the limitations as set forth in the rejection of claim 1 that is incorporated. Bowers does not explicitly disclose wherein the lineage graph is automatically generated by:
determining a structural difference between two machine learning models;
determining a contextual difference between the two machine learning models; and
using the structural difference and the contextual difference to insert one of the two machine learning models into the lineage graph.
However, in the same field of endeavor, Rossi teaches wherein the lineage graph is automatically generated by ([0018] In one or more embodiments, the model selection system selects a machine-learning model for a graph representation. In particular, the model selection system utilizes estimated graph learning performance metrics to determine a particular machine-learning model to use with the graph representation. For example, the model selection system selects a machine-learning model to process data associated with the graph representation based on the estimated graph learning performance metrics generated for the extracted meta-graph features. Accordingly, the model selection system automatically selects the best machine-learning model for the graph representation based on the structural characteristics of the graph representation):
determining a structural difference between two machine learning models ([0086] Turning now to FIG. 7, this figure shows a flowchart of a series of acts 700 of selecting a machine-learning model for a graph learning task based on meta-graph features of a graph representation; [0087] As shown, the series of acts 700 includes an act 702 of extracting meta-graph features from a graph representation. For example, act 702 involves extracting, utilizing a graph feature machine-learning model, meta-graph features representing structural characteristics of a graph representation comprising a plurality of nodes and a plurality of edges indicating relationships between the plurality of nodes. For instance, act 702 can involve extracting, utilizing a graph feature machine-learning model comprising parameters learned based on a graph dataset and corresponding model performances for a plurality of machine-learning models, meta-graph features comprising structural characteristics of a graph representation in a latent space, the graph representation comprising a plurality of nodes and a plurality of edges indicating relationships between the plurality of nodes);
determining a contextual difference between the two machine learning models ([0092] The series of acts 700 also includes an act 704 of generating estimated graph learning performance metrics for machine-learning models according to the meta-graph features. For example, act 704 involves generating, utilizing the graph feature machine -learning model, a plurality of estimated graph learning performance metrics for a plurality of machine-learning models according to the meta-graph features, wherein the plurality of estimated graph learning performance metrics indicate predicted performances of the plurality of machine-learning models in a graph learning task for the graph representation. Act 704 can involve generating, utilizing the graph feature machine-learning model, a plurality of estimated graph learning performance metrics for a plurality of machine-learning models according to the meta-graph features and learned mappings between the meta-graph features and graph learning performance metrics of the plurality of machine-learning models. Act 704 can involve determining, utilizing the graph feature machine-learning model, based on learned mappings between meta-graph features and model graph learning performance metrics of the plurality of machine-learning models; [0095] Additionally, the series of acts 700 also optionally includes an act 704a of training a graph feature machine-learning model based on learned mappings. For example, act 704a can involve extracting, utilizing the graph feature machine-learning model, a plurality of sets of meta-graph features for training graph representations in the graph dataset. Act 704a can also involve generating, for the plurality of machine-learning models, a plurality of sets of ground-truth graph learning performance metrics according to the plurality of sets of meta-graph features. 
Act 704a can further involve learning the parameters of the graph feature machine-learning model by determining mappings between the plurality of sets of meta-graph features and the plurality of sets of ground-truth graph learning performance metrics); and
using the structural difference and the contextual difference to insert one of the two machine learning models into the lineage graph ([0096] The series of acts 700 further includes an act 706 of selecting a machine-learning model according to the estimated graph learning performance metrics. For example, act 706 involves selecting a machine-learning model to process data associated with the graph representation according to the plurality of estimated graph learning performance metrics; [0097] Act 706 can involve selecting a machine-learning model of the plurality of machine-learning models corresponding to a highest estimated graph learning performance metric of the plurality of estimated graph learning performance metrics. For example, act 706 can involve selecting the first machine-learning model in response to determining that the first estimated graph learning performance metric is higher than the second estimated graph learning performance metric).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of selecting a machine-learning model for a graph learning task based on meta-graph features of a graph representation as suggested in Rossi into Bowers’s system because both of these systems are addressing graph representation of machine-learning models comprising a plurality of nodes and a plurality of edges indicating relationships between the plurality of nodes. This modification would have been motivated by the desire to provide a technique to select an appropriate machine-learning model to accurately capture and interpret relationships between data points (Rossi, [0001]).
Regarding dependent claim 7, the combination of Bowers and Rossi teaches all the limitations as set forth in the rejection of claim 6 that is incorporated. Rossi further teaches further comprising:
receiving a modification of the lineage graph from a user ([0022] As shown in FIG. 1, the server device(s) 104 includes or hosts the data graph learning system 110. The data graph learning system 110 can include, or be part of, one or more systems that implement data processing via graph representations. For example, the data graph learning system 110 provides tools for performing various operations on data (e.g., in the database 116). To illustrate, the data graph learning system 110 provides tools to utilize one or more machine-learning models to perform data processing operations, including, but not limited to, link prediction, node classification, graph classification, node clustering, or graph modification; [0026] In addition, as shown in FIG. 1, the system environment 100 includes the client device 106. In one or more embodiments, the client device 106 includes, but is not limited to, a mobile device (e.g., smartphone or tablet), a laptop, a desktop, including those explained below with reference to FIG. 8. Furthermore, although not shown in FIG. 1, the client device 106 can be operated by a user (e.g., a user included in, or associated with, the system environment 100) to perform a variety of functions. In particular, the client device 106 performs functions such as, but not limited to, accessing, viewing, and interacting with a variety of digital content (e.g., datasets associated with a particular domain). In some embodiments, the client device 106 also performs functions for generating, capturing, or accessing data to provide to the data graph learning system 110 and the model selection system 102 in connection with performing graph learning tasks on datasets. For example, the client device 106 communicates with the server device(s) 104 via the network 108 to provide information (e.g., user interactions) associated with datasets (e.g., stored at the database 116)); and
updating the lineage graph in response to the modification ([0082] The model performance manager 602 also includes the performance mapping manager 606 to determine mappings of model performance of a plurality of machine-learning models to the meta-graph features of graph representations. In particular, the performance mapping manager 606 generates mappings between the meta-graph features and graph learning performance metrics of the machine-learning models. Additionally, in some embodiments, the performance mapping manager 606 communicates with the machine-learning model manager 608 to generate a meta-graph model including model nodes and graph nodes with edges linking machine-learning models and graph representations based on the mappings; [0083] The model performance manager 602 further includes the machine-learning model manager 608 to manage a plurality of machine-learning models for graph learning tasks. For example, the machine-learning model manager 608 identifies a plurality of machine-learning models for various graph learning tasks including link prediction, node classification, graph classification, node clustering, or graph modification. In particular, the machine-learning model manager 608 manages a plurality of hyperparameters of various machine-learning models that use various methods for the different graph learning tasks. The machine-learning model manager 608 communicates with the performance mapping manager 606 to generate mappings between model performances and meta-graph features. The machine-learning model manager 608 also generates, trains, or utilizes a graph feature machine-learning model or a meta-graph model).
Claims 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Bowers, in view of KUHTZ et al. (hereinafter KUHTZ), US 20180218005 A1.
Regarding independent claim 15, Bowers teaches a method, comprising:
obtaining a lineage graph with a plurality of nodes and a plurality of provenance edges, wherein each node of the plurality of nodes corresponds to a machine learning model and a provenance edge between two nodes indicates a machine learning model is derived from another machine learning model ([0029] In some embodiments, the model tracking service can track a version history of a machine learning model (e.g., a latent or a production model). The version history can be represented by a provenance chain of one or more machine learning models that are based on one another in order (i.e. a lineage graph). For example, the version history can include one or more modifications from a previous machine learning model in the provenance chain to a subsequent machine learning model in the provenance chain; [0045] FIG. 4 illustrates an example of a version history 400 of a production model 402 in the form of a provenance chain maintained by a machine learner system (e.g., the machine learner system 202 of FIG. 2). The version history 400 includes an ex-production model 410A and an ex-production model 410B. The version history 400 also includes a test model 412A and a test model 412B. The test model 412A is built based on a template copy of the ex-production model 410A. The ex-production model 410B can be a variant of the test model 412A (e.g., built with an updated training dataset compared to the test model 412A). The test model 412B can be built based on a template copy of the ex-production model 410B. The production model 402 can be a variant of the test model 412B).
Bowers does not explicitly disclose using the lineage graph to determine a storage optimization for generating a compressed lineage graph.
However, in the same field of endeavor, KUHTZ teaches
using the lineage graph to determine a storage optimization for generating a compressed lineage graph ([0003] Some technologies described herein are directed to the technical activity of identifying chunks or blocks whose content has already been stored in a chunk or block storage system. The terms “chunk” and “block” are used interchangeably herein to mean a portion of a file which is generally but not always less than the entire file. A file may be kept as a sequence of chunks. In this case, a chunk is a sub-file unit in a storage system. Although many files will include a sequence of multiple chunks, some files may be small enough to fit in a single chunk. A file may also be kept as a base plus zero or more deltas from that base; this is known as “delta encoding”. When delta encoding is used, a chunk may include the base version of a file, or a chunk may include a delta from the base version or a delta from a later version of the file. Some of the technologies herein are directed to reducing storage system operations (i.e. storage optimization) such as presence queries or uploads involving artifacts that are created or updated during a software build; [0203] Some discussions of deduplication technology described herein spoke in terms of data deduplication via a provenance graph. Thus, “provenance graph” may be encountered as another name for a dedup graph 422; Figs. 5-6; [0220] In this example, the dedup graph 422 includes a directed acyclic graph data structure 424 that resides in and configures the dedup processing memory 112; [0229] The deduplication ratio for data indicates the data's original size versus its size after removing redundancy. It is calculated as the size of data before deduplication (redundancy removal) divided by the size after deduplication. 
For example, a 4:1 deduplication ratio means 4 terabytes of data can be stored in 1 terabyte of physical memory; [0280] As to artifacts directory structure, the artifacts of a build are written to the file system during the build. The resulting directory structure of the artifacts is independent of the build graph. FIG. 6 shows how the artifacts of the build from FIG. 5 may be stored in the filesystem. This directory structure layout is optimized for distribution, usage, and consumption by package managers. By contrast, a build graph is optimized for tracking causal dependencies between resources of the build in order to reduce build time and for avoiding unnecessary rebuilds).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of constructing and utilizing dedup graphs to support storage deduplication to optimize storage, as suggested in KUHTZ, into Bowers's system because both systems address maintaining a repository for provenance graphs. This modification would have been motivated by the desire to facilitate efficient choices between storage-compute-network options by constructing a dedup graph (KUHTZ, [0006]).
Regarding dependent claim 16, the combination of Bowers and KUHTZ teaches all the limitations as set forth in the rejection of claim 15, which is incorporated herein. Bowers teaches machine learning models represented in the lineage graph ([0029] In some embodiments, the model tracking service can track a version history of a machine learning model (e.g., a latent or a production model). The version history can be represented by a provenance chain of one or more machine learning models that are based on one another in order (i.e. a lineage graph). For example, the version history can include one or more modifications from a previous machine learning model in the provenance chain to a subsequent machine learning model in the provenance chain). KUHTZ further teaches wherein the compressed lineage graph reduces a storage footprint for a plurality of nodes represented in the lineage graph ([0005] The dedup graph does not necessarily have the same structure as a file system directory graph for the build, so a change caused by the build may impact fewer dedup graph nodes than directory graph nodes, resulting in fewer storage operations to update the storage with new or changed build artifacts. Storage operations directed using the dedup graph may be performed concurrently with software build operations; [0210] Dedup graphs, by contrast, are intended and configured to support storage deduplication).
Regarding dependent claim 17, the combination of Bowers and KUHTZ teaches all the limitations as set forth in the rejection of claim 15, which is incorporated herein. KUHTZ further teaches wherein the storage optimization is recursively applied to the lineage graph ([0165] 1128 recursively process child nodes to determine whether they are present in the chunk store, e.g., by placing them in the query queue and getting a response from the chunk store, or by comparing their expiration dates to cached expiration dates).
Regarding dependent claim 18, the combination of Bowers and KUHTZ teaches all the limitations as set forth in the rejection of claim 15, which is incorporated herein. Bowers further teaches wherein the lineage graph provides provenance information for the plurality of nodes in the lineage graph with parent information of each machine learning model associated with the plurality of nodes ([0028] The model tracker database 234 is configured to record data and metadata associated with the machine learning models tracked by the model tracking engine 214. For example, the model tracker database 234 can store and index the machine learning models by version histories, sources of training dataset, training datasets, training configurations, evaluative metrics, or any combination thereof, such that the model tracker database 234 can be queried using one of these variables).
Regarding dependent claim 19, the combination of Bowers and KUHTZ teaches all the limitations as set forth in the rejection of claim 15, which is incorporated herein. KUHTZ further teaches wherein the storage optimization is content-based hashing ([0021] The present disclosure describes and illustrates solutions that leverage the hierarchical and incremental nature of the software build systems to reduce the number of storage operations needed. The reduction may be from a number of storage operations that is proportional to the total number of files in a build, down to a number of storage operations that is proportional to the amount of change in the build, e.g., the number of changed file chunks. Specifically, by a deduplicating process which shadows (e.g., partially mirrors) the causal dependency graph of the build workflow, some embodiments reduce or even minimize the number of churned nodes in a hash tree representation of the build's transformations; [0198] FIGS. 3, 5, 8, and 9 illustrate aspects of graphs that are used in some embodiments. In general, the graphs used by embodiments are directed acyclic graphs (DAGs) with hashes that identify their nodes (a.k.a. vertexes) and that are based on the hashes of child nodes (which are adjacent vertexes)).
Regarding dependent claim 20, the combination of Bowers and KUHTZ teaches all the limitations as set forth in the rejection of claim 15, which is incorporated herein. KUHTZ further teaches wherein the storage optimization is delta compression ([0003] Some technologies described herein are directed to the technical activity of identifying chunks or blocks whose content has already been stored in a chunk or block storage system. The terms “chunk” and “block” are used interchangeably herein to mean a portion of a file which is generally but not always less than the entire file. A file may be kept as a sequence of chunks. In this case, a chunk is a sub-file unit in a storage system. Although many files will include a sequence of multiple chunks, some files may be small enough to fit in a single chunk. A file may also be kept as a base plus zero or more deltas from that base; this is known as “delta encoding”. When delta encoding is used, a chunk may include the base version of a file, or a chunk may include a delta from the base version or a delta from a later version of the file. Some of the technologies herein are directed to reducing storage system operations such as presence queries or uploads involving artifacts that are created or updated during a software build).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Applicant is required under 37 C.F.R. § 1.111(c) to consider these references fully when responding to this action.
Ashrafi et al. (US 20220292309 A1) discloses capturing data transitions in machine learning models.
It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. In re Heck, 699 F.2d 1331, 1332-33, 216 U.S.P.Q. 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 U.S.P.Q. 275, 277 (C.C.P.A. 1968)).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMY P HOANG whose telephone number is (469)295-9134. The examiner can normally be reached M-Th, 8:30 AM-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JENNIFER WELCH can be reached at 571-272-7212. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AMY P HOANG/Examiner, Art Unit 2143
/JENNIFER N WELCH/Supervisory Patent Examiner, Art Unit 2143