DETAILED ACTION
Remarks
This Office Action is issued in response to the communication filed on 8/18/2022. Claims 1-20 are pending in this Office Action.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claims 1, 8, and 15 are objected to because of the following informalities: claims 1, 8, and 15 recite the limitation “train the machine learning model.” There is insufficient antecedent basis for this limitation in the claims. It is not clear which “machine learning model” this limitation refers to, since the claims recite both “a machine learning model” and “a graph-attention augmented temporal neural network machine learning model.” Appropriate correction is required.
Claims 1, 8, and 15 recite the undefined variables t, T, and l (i.e., negative, fractional, or floating-point values would render these claims indefinite). Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
2. Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Claim 1:
Step 1: Statutory Category?: Yes. Claim 1 recites a method (i.e., a “process”), which is a statutory category.
Step 2A-Prong 1: Judicial Exception Recited?: Yes.
The limitations “generating, by the computing device, a global guidance correlation graph data object, wherein: (i) each node of the global guidance correlation graph data object corresponds to a feature in the plurality of features, and (ii) each edge of the global guidance correlation graph data object corresponds to a feature pair and describes a co-occurrence probability for the feature pair; for each temporal sequence, generating, by the computing device, one or more dynamic co-occurrence graph data object based at least in part on the global guidance correlation graph, wherein each dynamic co-occurrence graph data object for a particular temporal sequence describes a projection of the global guidance correlation graph data object on the input data object for the temporal sequence” and “performing one or more prediction-based actions based at least in part on the one or more predictive classification labels” are mental processes that can be performed in the human mind using observation, evaluation, judgment, and opinion, including with pen and paper.
The limitation “training the machine learning model comprises, for each combination of a given temporal sequence t of T number of temporal sequences in the plurality of temporal sequences, a given non-initial embedding layer l of the one or more embedding layers, and a given feature i of the plurality of features, generating a historical node representation based at least in part on: (i) a prior-layer historical node representation for the given temporal sequence t and the given feature i as generated by a preceding embedding layer l-1, and (ii) neighbor nodes for a target node associated with the given feature i in the dynamic co-occurrence graph corresponding to the given temporal sequence t” recites mathematical relationships that fall within the mathematical concepts grouping of abstract ideas.
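For orientation only, and not as part of the record of examination: the recited graph-generation limitations describe, in substance, counting pairwise feature co-occurrences across sequences and then filtering edges per sequence. A minimal sketch of that kind of computation follows; all function names and data are hypothetical and are not taken from the claims or the specification.

```python
from collections import Counter
from itertools import combinations

def build_global_graph(sequences):
    """Count pairwise feature co-occurrences across all temporal sequences.

    Each sequence is treated as a set of features; each unordered feature
    pair appearing together in a sequence contributes one co-occurrence.
    """
    edge_counts = Counter()
    for features in sequences:
        for pair in combinations(sorted(features), 2):
            edge_counts[pair] += 1
    return edge_counts

def project_graph(global_graph, feature_subset):
    """Project the global graph onto one sequence: keep only edges whose
    endpoints are both in that sequence's related feature subset."""
    feature_subset = set(feature_subset)
    return {pair: w for pair, w in global_graph.items()
            if pair[0] in feature_subset and pair[1] in feature_subset}

# Hypothetical toy data: three temporal sequences, each a feature subset.
sequences = [{"A", "B", "C"}, {"A", "B"}, {"B", "C"}]
g = build_global_graph(sequences)      # e.g. g[("A", "B")] == 2
sub = project_graph(g, {"A", "B"})     # only the ("A", "B") edge survives
```

This sketch illustrates only the counting-and-filtering character of the limitations; the claims themselves express these counts as co-occurrence probabilities.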
Step 2A-Prong 2: Integrated into a practical application? No.
Claim 1 recites the additional element of “receiving, by a computing device, one or more input data objects, each input data object comprising a temporal sequence in a plurality of temporal sequences and comprising a related feature subset of a plurality of features associated with the temporal sequence,” which is simply a data-gathering step and therefore is insignificant extra-solution activity. (See MPEP 2106.05(g)).
The additional element of a “computing device” is recited at a very high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components.
The additional elements of “generating, by the computing device, using the machine learning model, and based at least in part on the plurality of temporal sequences and each dynamic co-occurrence graph data object, one or more predicted classification labels” and “an initial embedding layer is configured to, for an initial temporal sequence, generate historical node representations for the plurality of features using a tree-of-sequences based at least in part on initial embeddings that are generated using a sequential long short-term memory machine learning model” amount to no more than using a generic computer with a generic machine learning model to apply the abstract idea.
Step 2B: Recites additional elements that amount to significantly more than the judicial exception? No.
Claim 1 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As indicated above, the additional element of the “computing device” and the generating steps are at best the equivalent of merely adding the words “apply it” to the exception. The receiving step is mere data gathering and is a well-understood, routine, and conventional activity previously known to the industry, and therefore does not amount to significantly more than the judicial exception. (See MPEP 2106.05(d), subsection II). Even when considered in combination, the additional elements do not provide an inventive concept; claim 1 is therefore ineligible.
Claim 2 recites the additional element of “wherein each edge of the one or more dynamic co-occurrence graph data objects for a particular temporal sequence is associated with a respective feature pair that are both in the related feature subset for the particular temporal sequence,” which is a mental process that can be performed in the human mind using observation, evaluation, judgment, and opinion. Claim 2 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 2 is not patent eligible.
Claim 3 recites the additional element of “wherein an initial embedding for a particular feature is generated based at least in part on a latent representation of text data associated with the particular feature and hidden representation of sequential long short-term memory machine learning models for one or more related features for the particular feature as defined by a classification tree of a tree-of-sequences long short-term memory machine learning model,” which amounts to no more than using a generic computer with a generic long short-term memory machine learning model to apply the abstract idea and is at best the equivalent of merely adding the words “apply it” to the exception. Claim 3 is not patent eligible.
Claim 4 recites the additional element of “wherein the one or more predicted classification labels are generated based at least in part on a hidden state generated based at least in part on historical node representations for the related feature subset of a final temporal sequence,” which is a mental process that can be performed in the human mind using observation, evaluation, judgment, and opinion. Claim 4 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 4 is not patent eligible.
Claim 5 recites the additional element of “wherein: each dynamic co-occurrence graph comprises a sequence of adjacency matrices,” which is a mental process that can be performed in the human mind using observation, evaluation, judgment, and opinion. Claim 5 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 5 is not patent eligible.
Claim 6 recites the additional element of “wherein the historical node representation for the given temporal sequence t, the given non-initial embedding layer l, and the given feature i is generated using operations of [equation reproduced in the claim as media_image1.png], where [expression reproduced in the claim as media_image2.png] comprises the neighbor nodes for the target node associated with the given feature i in the dynamic co-occurrence graph corresponding to the given temporal sequence t,” which is a mathematical relationship. Claim 6 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 6 is not patent eligible.
Claim 7 recites the additional element of “wherein the co-occurrence probability for a particular feature pair describes a count of co-occurrences of the particular feature pair in a common temporal sequence across all of the plurality of input data objects,” which is a mathematical relationship. Claim 7 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 7 is not patent eligible.
Claim 8:
Step 1: Statutory Category?: Yes. Claim 8 recites an apparatus (i.e., a “machine”), which is a statutory category.
Step 2A-Prong 1: Judicial Exception Recited?: Yes.
The limitations “generate a global guidance correlation graph data object, wherein: (i) each node of the global guidance correlation graph data object corresponds to a feature in the plurality of features, and (ii) each edge of the global guidance correlation graph data object corresponds to a feature pair and describes a co-occurrence probability for the feature pair; for each temporal sequence, generate one or more dynamic co-occurrence graph data object based at least in part on the global guidance correlation graph, wherein each dynamic co-occurrence graph data object for a particular temporal sequence describes a projection of the global guidance correlation graph data object on the input data object for the temporal sequence” and “perform one or more prediction-based actions based at least in part on the one or more predictive classification labels” are mental processes that can be performed in the human mind using observation, evaluation, judgment, and opinion, including with pen and paper.
The limitation “training the machine learning model comprises, for each combination of a given temporal sequence t of T number of temporal sequences in the plurality of temporal sequences, a given non-initial embedding layer l of the one or more embedding layers, and a given feature i of the plurality of features, generating a historical node representation based at least in part on: (i) a prior-layer historical node representation for the given temporal sequence t and the given feature i as generated by a preceding embedding layer l-1, and (ii) neighbor nodes for a target node associated with the given feature i in the dynamic co-occurrence graph corresponding to the given temporal sequence t” recites mathematical relationships that fall within the mathematical concepts grouping of abstract ideas.
Step 2A-Prong 2: Integrated into a practical application? No.
Claim 8 recites the additional element of “receive one or more input data objects, each input data object comprising a temporal sequence in a plurality of temporal sequences and comprising a related feature subset of a plurality of features associated with the temporal sequence,” which is simply a data-gathering step and therefore is insignificant extra-solution activity. (See MPEP 2106.05(g)).
The additional element of a “memory and processor” is recited at a very high level of generality such that it amounts to no more than mere instructions to apply the exception using generic computer components.
The additional elements of “generate, using the machine learning model, and based at least in part on the plurality of temporal sequences and each dynamic co-occurrence graph data object, one or more predicted classification labels” and “an initial embedding layer is configured to, for an initial temporal sequence, generate historical node representations for the plurality of features using a tree-of-sequences based at least in part on initial embeddings that are generated using a sequential long short-term memory machine learning model” amount to no more than using a generic computer with a generic machine learning model to apply the abstract idea.
Step 2B: Recites additional elements that amount to significantly more than the judicial exception? No.
Claim 8 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As indicated above, the additional elements of the “memory and processor” and “machine learning model” and the generate steps are at best the equivalent of merely adding the words “apply it” to the exception. The receiving step is mere data gathering and is a well-understood, routine, and conventional activity previously known to the industry, and therefore does not amount to significantly more than the judicial exception. (See MPEP 2106.05(d), subsection II). Even when considered in combination, the additional elements do not provide an inventive concept; claim 8 is therefore ineligible.
Claim 9 recites the additional element of “wherein each edge of the one or more dynamic co-occurrence graph data objects for a particular temporal sequence is associated with a respective feature pair that are both in the related feature subset for the particular temporal sequence,” which is a mental process that can be performed in the human mind using observation, evaluation, judgment, and opinion. Claim 9 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 9 is not patent eligible.
Claim 10 recites the additional element of “wherein an initial embedding for a particular feature is generated based at least in part on a latent representation of text data associated with the particular feature and hidden representation of sequential long short-term memory machine learning models for one or more related features for the particular feature as defined by a classification tree of a tree-of-sequences long short-term memory machine learning model,” which amounts to no more than using a generic computer with a generic long short-term memory machine learning model to apply the abstract idea and is at best the equivalent of merely adding the words “apply it” to the exception. Claim 10 is not patent eligible.
Claim 11 recites the additional element of “wherein the one or more predicted classification labels are generated based at least in part on a hidden state generated based at least in part on historical node representations for the related feature subset of a final temporal sequence,” which is a mental process that can be performed in the human mind using observation, evaluation, judgment, and opinion. Claim 11 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 11 is not patent eligible.
Claim 12 recites the additional element of “wherein: each dynamic co-occurrence graph comprises a sequence of adjacency matrices,” which is a mental process that can be performed in the human mind using observation, evaluation, judgment, and opinion. Claim 12 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 12 is not patent eligible.
Claim 13 recites the additional element of “wherein the historical node representation for the given temporal sequence t, the given non-initial embedding layer l, and the given feature i is generated using operations of [equation reproduced in the claim as media_image1.png], where [expression reproduced in the claim as media_image2.png] comprises the neighbor nodes for the target node associated with the given feature i in the dynamic co-occurrence graph corresponding to the given temporal sequence t,” which is a mathematical relationship. Claim 13 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 13 is not patent eligible.
Claim 14 recites the additional element of “wherein the co-occurrence probability for a particular feature pair describes a count of co-occurrences of the particular feature pair in a common temporal sequence across all of the plurality of input data objects,” which is a mathematical relationship. Claim 14 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 14 is not patent eligible.
Claim 15:
Step 1: Statutory Category?: Yes. Claim 15 recites a computer program product (i.e., an article of manufacture), which is a statutory category.
Step 2A-Prong 1: Judicial Exception Recited?: Yes.
The limitations “generate a global guidance correlation graph data object, wherein: (i) each node of the global guidance correlation graph data object corresponds to a feature in the plurality of features, and (ii) each edge of the global guidance correlation graph data object corresponds to a feature pair and describes a co-occurrence probability for the feature pair; for each temporal sequence, generate one or more dynamic co-occurrence graph data object based at least in part on the global guidance correlation graph, wherein each dynamic co-occurrence graph data object for a particular temporal sequence describes a projection of the global guidance correlation graph data object on the input data object for the temporal sequence” and “perform one or more prediction-based actions based at least in part on the one or more predictive classification labels” are mental processes that can be performed in the human mind using observation, evaluation, judgment, and opinion, including with pen and paper.
The limitation “training the machine learning model comprises, for each combination of a given temporal sequence t of T number of temporal sequences in the plurality of temporal sequences, a given non-initial embedding layer l of the one or more embedding layers, and a given feature i of the plurality of features, generating a historical node representation based at least in part on: (i) a prior-layer historical node representation for the given temporal sequence t and the given feature i as generated by a preceding embedding layer l-1, and (ii) neighbor nodes for a target node associated with the given feature i in the dynamic co-occurrence graph corresponding to the given temporal sequence t” recites mathematical relationships that fall within the mathematical concepts grouping of abstract ideas.
Step 2A-Prong 2: Integrated into a practical application? No.
Claim 15 recites the additional element of “receive one or more input data objects, each input data object comprising a temporal sequence in a plurality of temporal sequences and comprising a related feature subset of a plurality of features associated with the temporal sequence,” which is simply a data-gathering step and therefore is insignificant extra-solution activity. (See MPEP 2106.05(g)).
The additional element of a “non-transitory computer readable storage medium” is recited at a very high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component.
The additional elements of “generate, using the machine learning model, and based at least in part on the plurality of temporal sequences and each dynamic co-occurrence graph data object, one or more predicted classification labels” and “an initial embedding layer is configured to, for an initial temporal sequence, generate historical node representations for the plurality of features using a tree-of-sequences based at least in part on initial embeddings that are generated using a sequential long short-term memory machine learning model” amount to no more than using a generic computer with a generic machine learning model to apply the abstract idea.
Step 2B: Recites additional elements that amount to significantly more than the judicial exception? No.
Claim 15 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As indicated above, the additional elements of the “non-transitory computer readable storage medium” and “machine learning model” and the generate steps are at best the equivalent of merely adding the words “apply it” to the exception. The receiving step is mere data gathering and is a well-understood, routine, and conventional activity previously known to the industry, and therefore does not amount to significantly more than the judicial exception. (See MPEP 2106.05(d), subsection II). Even when considered in combination, the additional elements do not provide an inventive concept; claim 15 is therefore ineligible.
Claim 16 recites the additional element of “wherein each edge of the one or more dynamic co-occurrence graph data objects for a particular temporal sequence is associated with a respective feature pair that are both in the related feature subset for the particular temporal sequence,” which is a mental process that can be performed in the human mind using observation, evaluation, judgment, and opinion. Claim 16 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 16 is not patent eligible.
Claim 17 recites the additional element of “wherein an initial embedding for a particular feature is generated based at least in part on a latent representation of text data associated with the particular feature and hidden representation of sequential long short-term memory machine learning models for one or more related features for the particular feature as defined by a classification tree of a tree-of-sequences long short-term memory machine learning model,” which amounts to no more than using a generic computer with a generic long short-term memory machine learning model to apply the abstract idea and is at best the equivalent of merely adding the words “apply it” to the exception. Claim 17 is not patent eligible.
Claim 18 recites the additional element of “wherein the one or more predicted classification labels are generated based at least in part on a hidden state generated based at least in part on historical node representations for the related feature subset of a final temporal sequence,” which is a mental process that can be performed in the human mind using observation, evaluation, judgment, and opinion. Claim 18 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 18 is not patent eligible.
Claim 19 recites the additional element of “wherein: each dynamic co-occurrence graph comprises a sequence of adjacency matrices,” which is a mental process that can be performed in the human mind using observation, evaluation, judgment, and opinion. Claim 19 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 19 is not patent eligible.
Claim 20 recites the additional element of “wherein the historical node representation for the given temporal sequence t, the given non-initial embedding layer l, and the given feature i is generated using operations of [equation reproduced in the claim as media_image1.png], where [expression reproduced in the claim as media_image2.png] comprises the neighbor nodes for the target node associated with the given feature i in the dynamic co-occurrence graph corresponding to the given temporal sequence t,” which is a mathematical relationship. Claim 20 does not include any additional element that integrates the abstract idea into a practical application in Step 2A-Prong 2 or amounts to significantly more than the judicial exception in Step 2B. Claim 20 is not patent eligible.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the claims at issue are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the reference application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO internet Web site contains terminal disclaimer forms which may be used. Please visit http://www.uspto.gov/forms/. The filing date of the application will determine what form should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to http://www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-4, 8-11, and 16-18 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1-4, 8-11, and 15-17, respectively, of co-pending U.S. Patent Application 18/153,047 (hereinafter “the ’047 application”). Although the claims at issue are not identical, they are not patentably distinct from each other because all the elements of claims 1-4, 8-11, and 16-18 of the instant application are found in claims 1-4, 8-11, and 15-17, respectively, of the ’047 application.
Instant Application (17/820,861)
Co-Pending Application 18/153,047
1. A computer-implemented method for classification using a machine learning model, the computer-implemented method comprising: receiving, by a computing device, one or more input data objects, each input data object comprising a temporal sequence in a plurality of temporal sequences and comprising a related feature subset of a plurality of features associated with the temporal sequence;
generating, by the computing device, a global guidance correlation graph data object, wherein: (i) each node of the global guidance correlation graph data object corresponds to a feature in the plurality of features, and (ii) each edge of the global guidance correlation graph data object corresponds to a feature pair and describes a co-occurrence probability for the feature pair;
for each temporal sequence, generating, by the computing device, one or more dynamic co-occurrence graph data object based at least in part on the global guidance correlation graph, wherein each dynamic co-occurrence graph data object for a particular temporal sequence describes a projection of the global guidance correlation graph data object on the input data object for the temporal sequence;
generating, by the computing device, using the machine learning model, and based at least in part on the plurality of temporal sequences and each dynamic co-occurrence graph data object, one or more predicted classification labels,
wherein: the machine learning model comprises a graph-attention augmented temporal neural network machine learning model comprising a plurality of embedding layers, training the machine learning model comprises,
for each combination of a given temporal sequence t of T number of temporal sequences in the plurality of temporal sequences, a given non-initial embedding layer l of the one or more embedding layers, and a given feature i of the plurality of features,
generating a historical node representation based at least in part on: (i) a prior-layer historical node representation for the given temporal sequence t and the given feature i as generated by a preceding embedding layer l-1, and
(ii) neighbor nodes for a target node associated with the given feature i in the dynamic co-occurrence graph corresponding to the given temporal sequence t,
an initial embedding layer is configured to, for an initial temporal sequence, generate historical node representations for the plurality of features using a tree-of-sequences based at least in part on initial embeddings that are generated using a sequential long short-term memory machine learning model; and
performing one or more prediction-based actions based at least in part on the one or more predictive classification labels.
2. The computer-implemented method of claim 1, wherein each edge of the one or more dynamic co-occurrence graph data objects for a particular temporal sequence is associated with a respective feature pair that are both in the related feature subset for the particular temporal sequence.
3. The computer-implemented method of claim 1, wherein an initial embedding for a particular feature is generated based at least in part on a latent representation of text data associated with the particular feature and hidden representation of sequential long short-term memory machine learning models for one or more related features for the particular feature as defined by a classification tree of a tree-of-sequences long short-term memory machine learning model.
4. The computer-implemented method of claim 1, wherein the one or more predicted classification labels are generated based at least in part on a hidden state generated based at least in part on historical node representations for the related feature subset of a final temporal sequence.
1. A computer-implemented method for classification using a machine learning model, the computer-implemented method comprising: receiving, by one or more processors, one or more input data objects, each input data object comprising (i) a temporal sequence in a plurality of temporal sequences, (ii) a related classification feature subset of a plurality of classification features associated with the temporal sequence, and (iii) a descriptive text feature associated with the temporal sequence; generating, by the one or more processors, a global guidance correlation graph data object, wherein: (i) each node of the global guidance correlation graph data object corresponds to a classification feature in the plurality of classification features, and (ii) each edge of the global guidance correlation graph data object corresponds to a classification feature pair and describes a co-occurrence probability for the classification feature pair; for each temporal sequence, generating, by the one or more processors, one or more dynamic co-occurrence graph data object based on the global guidance correlation graph, wherein each dynamic co-occurrence graph data object for a particular temporal sequence describes a projection of the global guidance correlation graph data object on the input data object for the temporal sequence;
generating, by the one or more processors, using the machine learning model, and based on the plurality of temporal sequences and each dynamic co-occurrence graph data object, a plurality of prediction classification features,
wherein: the machine learning model comprises a graph-attention augmented temporal neural network machine learning model comprising a plurality of embedding layers, training the machine learning model comprises,
for each combination of a given temporal sequence t of T number of temporal sequences in the plurality of temporal sequences, a given non-initial embedding layer l of the plurality embedding layers, and a given classification feature i of the plurality of classification features,
a) generating a historical node representation based on: (i) a prior-layer historical node representation for the given temporal sequence t and the given classification feature i as generated by a preceding embedding layer l – 1, and
(ii) neighbor nodes for a target node associated with the given classification feature i in the dynamic co-occurrence graph corresponding to the given temporal sequence t, and b) appending an attention vector comprising a node attention layer associated with the descriptive text feature to the historical node representation, an initial embedding layer is configured to, for an initial temporal sequence, generate historical node representations for the plurality of classification features using a tree-of-sequences based on initial embeddings that are generated using a sequential long short-term memory machine learning model; generating, by the one or more processors, one or more predicted edges in the global guidance correlation graph data object, wherein the one or more predicted edges are connected to at least one node of the global guidance correlation graph data object associated with the plurality of prediction classification features; determining, by the one or more processors, one or more lowest common ancestor nodes of the plurality of prediction classification features from a classification feature graph based on the one or more predicted edges; generating, by the one or more processors, one or more link predictions based on the one or more lowest common ancestor nodes; and initiating performance, by the one or more processors, of one or more prediction-based actions based on the one or more link predictions.
2. The computer-implemented method of claim 1, wherein each edge of the one or more dynamic co-occurrence graph data objects for a particular temporal sequence is associated with a respective classification feature pair that are both in the related classification feature subset for the particular temporal sequence.
3. The computer-implemented method of claim 1, wherein an initial embedding for a particular classification feature is generated based on a latent representation of text data associated with the particular classification feature and hidden representation of sequential long short-term memory machine learning models for one or more related classification features for the particular classification feature as defined by a classification tree of a tree-of-sequences long short-term memory machine learning model.
4. The computer-implemented method of claim 1, wherein the plurality of prediction classification features are generated based on a hidden state generated based on historical node representations for the related classification feature subset of a final temporal sequence.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Su et al., “GATE: Graph-Attention Augmented Temporal Neural Network for Medication Recommendation,” IEEE Access, Vol. 8, July 7, 2020, hereinafter “Su.”
As to claim 1, Su teaches a computer-implemented method for classification using a machine learning model, the computer-implemented method comprising: receiving, by a computing device, one or more input data objects, each input data object comprising a temporal sequence in a plurality of temporal sequences and comprising a related feature subset of a plurality of features associated with the temporal sequence (Su page 125450 teaches “The EHR of each patient can be represented as a set of temporal admission sequences: E_n = {x_1^n, x_2^n, ..., x_{T(n)}^n}, where T(n) is the number of admissions of the n-th patient. Each admission sequence x_t = {d_t, p_t, m_t} is a collection of codes that contains all the diagnosis event codes d_t, procedure event codes p_t and medication prescription event codes m_t at the t-th admission, where d refer to the collection of codes corresponding to the diagnosis symptoms recorded like acute renal failure and anemia, p refer to the collection of codes corresponding to various examinations and operations performed such as liver transplantation, liver biopsy, etc, and m refer to the collection of codes corresponding to the medications (e.g. insulin, cardiac glycoside) prescribed according to the patient’s condition”);
generating, by the computing device, a global guidance correlation graph data object, wherein: (i) each node of the global guidance correlation graph data object corresponds to a feature in the plurality of features, and (ii) each edge of the global guidance correlation graph data object corresponds to a feature pair and describes a co-occurrence probability for the feature pair; (Su page 125450, left column teaches “We first need to construct a global guidance correlation graph G, where each node is a clinical event code. These nodes include all diagnosis event codes, procedure event codes and medication prescription event codes that ever appeared in the dataset. Edges are based on co-occurrence probability between events in each admission of every patient used for guidance.”)
for each temporal sequence, generating, by the computing device, one or more dynamic co-occurrence graph data object based at least in part on the global guidance correlation graph, wherein each dynamic co-occurrence graph data object for a particular temporal sequence describes a projection of the global guidance correlation graph data object on the input data object for the temporal sequence; (Su page 125451, left column teaches “We construct dynamic co-occurrence graphs from each patient’s history admission sequences E_{1:t−1} = {x_1, x_2, x_3, ..., x_{t−1}} and the clinical events occurred at current admission x_t = {d_t, p_t}, which is specifically expressed as a sequence of adjacency matrices A = {A_1, A_2, ..., A_t}. The co-occurrence graph at each time step can be considered as a local mapping of the global guidance co-occurrence graph.”)
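For illustration only (this sketch is not part of the claims or of Su; the function name and event codes are hypothetical), the projection of the global guidance graph onto the events of a single admission, as described in the quoted passage, can be sketched as:

```python
# Hypothetical sketch: project a global co-occurrence graph onto one
# admission by keeping only edges whose two endpoint events both occur
# in that admission.
def project_graph(global_edges, admission_events):
    """global_edges: {(event_i, event_j): weight}; admission_events: event codes."""
    present = set(admission_events)
    return {(i, j): w for (i, j), w in global_edges.items()
            if i in present and j in present}

# Global graph as an edge -> co-occurrence weight mapping (made-up codes).
global_edges = {("d1", "p1"): 0.8, ("d1", "m1"): 0.5, ("p2", "m2"): 0.7}

# One admission's diagnosis/procedure/medication codes.
local = project_graph(global_edges, ["d1", "p1", "m1"])
# `local` retains only the edges among {d1, p1, m1}.
```

The per-admission graphs produced this way correspond to Su's sequence of adjacency matrices, one per time step.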
generating, by the computing device, using the machine learning model, and based at least in part on the plurality of temporal sequences and each dynamic co-occurrence graph data object, one or more predicted classification labels (Su page 125453, left column teaches “In this module, we make the final multi-label classification for this feature bag. Inspired by the DeepMIML [12] model that applies deep learning to MIML, we introduce our MIML classification module to discover instance-label relationships”), wherein: the machine learning model comprises a graph-attention augmented temporal neural network machine learning model comprising a plurality of embedding layers (Su page 125447, title/abstract section teaches GATE: Graph-Attention Augmented Temporal Neural Network for medication recommendation),
training the machine learning model comprises, for each combination of a given temporal sequence t of T number of temporal sequences in the plurality of temporal sequences (Su page 125452, left column teaches “Given the co-occurrence correlation matrix A_t at time t, we use a two-layer graph-attention neural network to encode node features, which allows the encoded event node vectors to contain the information of other co-occurrence events at the same admission with different degrees of correlation to obtain a more comprehensive representation”), a given non-initial embedding layer l of the one or more embedding layers, and a given feature i of the plurality of features, generating a historical node representation based at least in part on: (i) a prior-layer historical node representation for the given temporal sequence t and the given feature i as generated by a preceding embedding layer l − 1 (Su page 125451 teaches “At each layer, it embeds a set of node representations H_t = {h_{t,1}, h_{t,2}, ..., h_{t,|c|}} by recursively aggregating information from their neighbors. Formally, the node representation
[Equation image: media_image3.png]
, and (ii) neighbor nodes for a target node associated with the given feature i in the dynamic co-occurrence graph corresponding to the given temporal sequence t (N_i is the neighborhood of node i in the graph),
an initial embedding layer is configured to, for an initial temporal sequence, generate historical node representations for the plurality of features using a tree-of-sequences based at least in part on initial embeddings that are generated using a sequential long short-term memory machine learning model (Su page 125452, right column teaches
[Equation image: media_image4.png])
; and
performing one or more prediction-based actions based at least in part on the one or more predictive classification labels. (Su page 125457, left column teaches “our model can make more complete and accurate medication recommendations in actual scenarios”)
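The neighbor-aggregation step cited in the rejection above (each node's new representation is built from its neighbors' prior-layer representations in the dynamic co-occurrence graph) can be sketched as follows. This is an illustrative simplification only: the attention weights here are uniform over neighbors, whereas Su's graph-attention layer learns them; all names and shapes are hypothetical.

```python
# Hedged sketch of one graph-attention-style aggregation layer:
# H_prev holds prior-layer node representations, A is the admission's
# co-occurrence adjacency matrix (with self-loops), W the layer weights.
import numpy as np

def gat_layer(H_prev, A, W):
    """H_prev: (n, d); A: (n, n) 0/1 adjacency with self-loops; W: (d, d)."""
    # Uniform attention: normalize each adjacency row to sum to 1.
    alpha = A / A.sum(axis=1, keepdims=True)
    # Aggregate neighbor representations, transform, apply ReLU.
    return np.maximum(alpha @ H_prev @ W, 0.0)

n, d = 4, 3
H0 = np.ones((n, d))                              # toy prior-layer representations
A = np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)  # path graph + self-loops
H1 = gat_layer(H0, A, np.eye(d))                  # next-layer representations, shape (4, 3)
```

Stacking two such layers, as Su describes, lets each node's vector reflect events two hops away in the admission's co-occurrence graph.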
As to claim 2, Su teaches the computer-implemented method of claim 1, wherein each edge of the one or more dynamic co-occurrence graph data objects for a particular temporal sequence is associated with a respective feature pair that are both in the related feature subset for the particular temporal sequence. (Su page 125448, right column teaches “specifically, for each admission record of a patient, we construct the patient’s diagnoses, treatments and medication history into a co-occurrent graph, in which each node represents the clinical event and the weight of the edge between nodes represents the relevance between clinical events”)
As to claim 3, Su teaches the computer-implemented method of claim 1, wherein an initial embedding for a particular feature is generated based at least in part on a latent representation of text data associated with the particular feature and hidden representation of sequential long short-term memory machine learning models for one or more related features for the particular feature as defined by a classification tree of a tree-of-sequences long short-term memory machine learning model. (Su page 125452 teaches “we capture the temporal development of different kinds of diseases and treatments, as well as the long-term historical record information by using GRU as the basic model. Note that other types of RNN, e.g. LSTM, can also be choices”)
As to claim 4, Su teaches the computer-implemented method of claim 1, wherein the one or more predicted classification labels are generated based at least in part on a hidden state generated based at least in part on historical node representations for the related feature subset of a final temporal sequence. (Su page 125452 teaches “We propose a temporal dependency encoding module to comprehensively consider the clinical events experienced by all historical admissions of the patient, and embed all nodes that have ever occurred up to time step t − 1 to have the final historical representation of size |c| × d. For that, we capture the temporal development of different kinds of diseases and treatments, as well as the long-term historical record information by using GRU as the basic model. Note that other types of RNN, e.g. LSTM, can also be choices”)
As to claim 5, Su teaches the computer-implemented method of claim 1, wherein: each dynamic co-occurrence graph comprises a sequence of adjacency matrices. (Su page 125451, right column teaches “We construct dynamic co-occurrence graphs from each patient’s history admission sequences E_{1:t−1} = {x_1, x_2, x_3, ..., x_{t−1}} and the clinical events occurred at current admission x_t = {d_t, p_t}, which is specifically expressed as a sequence of adjacency matrices A = {A_1, A_2, ..., A_t}.”)
As to claim 6, Su teaches the computer-implemented method of claim 1, wherein the historical node representation for the given temporal sequence t, the given non-initial embedding layer l, and the given feature i is generated using operations of
[Equation image: media_image1.png]
, where
[Equation images: media_image2.png and media_image5.png]
(Su pages 125451-125452, left column, teach [Equation image: media_image6.png])
As to claim 7, Su teaches the computer-implemented method of claim 1, wherein the co-occurrence probability for a particular feature pair describes a count of co-occurrences of the particular feature pair in a common temporal sequence across all of the plurality of input data objects. (Su page 125450 teaches “where d(i, j) is the total number of admission records that events i and j co-occurred”)
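The count d(i, j) quoted above (the number of admission records in which events i and j co-occur, tallied across all records) can be sketched as follows. This is an illustrative sketch only; the function name and event codes are hypothetical, not Su's code.

```python
# Hypothetical sketch: compute d(i, j), the number of admissions in
# which clinical events i and j both appear, across all admissions.
from itertools import combinations
from collections import Counter

def cooccurrence_counts(admissions):
    """admissions: iterable of event-code collections, one per admission."""
    d = Counter()
    for events in admissions:
        # Count each unordered pair of distinct events once per admission.
        for i, j in combinations(sorted(set(events)), 2):
            d[(i, j)] += 1
    return d

counts = cooccurrence_counts([["d1", "p1"], ["d1", "p1", "m1"], ["p1", "m1"]])
# counts[("d1", "p1")] is 2: the pair co-occurs in the first two admissions.
```

Normalizing such counts (e.g., by the number of admissions in which either event appears) would yield the co-occurrence probabilities used as edge weights in the global guidance graph.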
Claims 8-14 merely recite an apparatus for performing the method of claims 1-7, respectively. Accordingly, Su teaches every limitation of claims 8-14 as indicated in the above rejection of claims 1-7, respectively.
Claims 15-20 merely recite a computer program product comprising a non-transitory computer-readable storage medium having program code stored therein configured to perform the method of claims 1-6, respectively. Accordingly, Su teaches every limitation of claims 15-20 as indicated in the above rejection of claims 1-6, respectively.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HIEN DUONG whose telephone number is (571)270-7335. The examiner can normally be reached Monday-Friday 8:00AM-5:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Viker Lamardo can be reached at 571-270-5871. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/HIEN L DUONG/Primary Examiner, Art Unit 2147