DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Acknowledgment is made of applicant’s claim to domestic benefit of provisional application 63/449,713 filed on March 3, 2023.
Information Disclosure Statement
The information disclosure statement(s) (IDS) submitted on 5/18/2023, 06/05/2023, 06/20/2024, and 09/26/2025 is/are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement(s) is/are being considered by the examiner.
Specification Objections
The disclosure is objected to because of the following informalities:
Paragraph [0065] recites “KPS”. It appears “KPS” should be “KBPs” instead, since there is no explanation as to what “KPS” is.
Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-14 and 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: Claims 1-14 are directed to a process. Claim 20 is directed to a machine or an article of manufacture.
With respect to claim(s) 1:
2A Prong 1: The claim(s) recite(s) an abstract idea. Specifically:
generating, based on the at least one first tensor and at least one query generator parameter of the attention layer, a structured query comprising: a target property name embedding vector, a match property name embedding vector, and a match property value embedding vector associated with the match property name embedding vector; (Mathematical concepts – Generating a structured query based on at least one tensor and at least one query parameter involves mathematical calculations (see paragraph [0076]). Additionally, the structured query is represented by a single embedding vector comprising, in part, a target property name embedding vector (a 1 × d tensor), a match property name embedding vector (a k × d tensor), and a match property value embedding vector (a k × d tensor), thus the outputs of a mathematical calculation (see paragraphs [0069], [0072]). Furthermore, the target property name embedding vector (i.e., Q_tpn), the match property name embedding vector (i.e., Q_mpn), and the match property value embedding vector (i.e., Q_mpv) are generated via matrix multiplication (see paragraph [0080]) and are terms involved in the mathematical computation of the target property value embedding vector (i.e., O_pv), as shown in Fig. 7 and paragraphs [0070-0071]. Moreover, the property name embedding vector (i.e., I_pn) is a gradient-based tunable parameter of the knowledge base (see paragraph [0116]). – see MPEP § 2106.04(a)(2)(I))
computing a condition property name match score between the match property name embedding vector of the structured query and a first property name embedding vector, the first property name embedding vector numerically representing a first property name of a first property name-value pair of a structured knowledge base; (Mathematical concepts – Computing a condition property name match score involves matrix multiplication (see paragraphs [0070-0072], [0085], [0089], and Fig. 7) – see MPEP § 2106.04(a)(2)(I))
computing a condition property value match score between the match property value embedding vector of the structured query and a first property value embedding vector, the first property value embedding vector numerically representing a first property value of the first name-value pair of the structured knowledge base; and (Mathematical concepts – Computing a condition property value match score involves matrix multiplication (see paragraphs [0070-0072], [0086-0087], [0089], and Fig. 7) – see MPEP § 2106.04(a)(2)(I))
based on the condition property name match score, the condition property value match score, a second property name-value pair of the structured knowledge base, and the target property name embedding vector, calculating a target property value embedding vector numerically representing a target property value. (Mathematical concepts – Computing the target property value embedding vector (i.e., O_pv) involves mathematical calculations (see paragraphs [0070-0071] and Fig. 7) – see MPEP § 2106.04(a)(2)(I))
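For orientation, the computations recited in the limitations above reduce to ordinary matrix arithmetic. The sketch below is a hypothetical NumPy illustration only: the names Q_tpn, Q_mpn, Q_mpv, I_pn, I_pv and the 1 × d / k × d shapes follow the notation in the cited paragraphs, while the weight matrices, dimensions, and inner-product scoring are assumptions made for the example, not the application's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 8, 2, 5  # embedding dim, match slots, knowledge-base pairs (arbitrary)

# The "at least one first tensor" received at the attention layer, and
# hypothetical query generator parameters (matrix multiplication per [0080]).
x = rng.standard_normal(d)
W_tpn = rng.standard_normal((d, d))
W_mpn = rng.standard_normal((d, k * d))
W_mpv = rng.standard_normal((d, k * d))

# Structured query: target property name (1 x d), match property name and
# match property value embeddings (k x d each).
Q_tpn = (x @ W_tpn).reshape(1, d)
Q_mpn = (x @ W_mpn).reshape(k, d)
Q_mpv = (x @ W_mpv).reshape(k, d)

# Knowledge-base property name / property value embedding vectors.
I_pn = rng.standard_normal((n, d))
I_pv = rng.standard_normal((n, d))

# Condition match scores as inner products (one score per match slot and pair).
cond_name_score = Q_mpn @ I_pn.T   # condition property name match scores, k x n
cond_value_score = Q_mpv @ I_pv.T  # condition property value match scores, k x n
```

Each score is a single matrix product, which is why the analysis characterizes these steps as mathematical calculations.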
If claim limitations, under their broadest reasonable interpretation, cover mathematical calculations or performance of the limitations as a mental process but for the recitation of generic computer components, then the claim limitations fall within the mathematical concepts or mental processes groupings of abstract ideas. Accordingly, the claim “recites” an abstract idea.
2A Prong 2: The additional elements recited in the claim(s) do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
A computer-implemented method comprising: (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
receiving at an attention layer of a neural network at least one first tensor; (Mere data gathering – Adding insignificant extra-solution activity of mere data gathering to the judicial exception – see MPEP § 2106.05(g).)
Since the claim as a whole, looking at the additional elements individually and in combination, does not contain any other additional elements that are indicative of integration into a practical application, the claim is directed to an abstract idea.
2B: The claim(s) do(es) not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
A computer-implemented method comprising: (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
receiving at an attention layer of a neural network at least one first tensor; (Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (WURC) – see MPEP § 2106.05(d)(II) – Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information).)
Considering the additional elements individually and in combination, and the claim as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, the claim is not patent eligible.
With respect to claim(s) 2:
2A Prong 1: The claim(s) recite(s) an abstract idea. Specifically:
wherein the structured knowledge base comprises an entity-property mapping tensor numerically encoding a first entity-property mapping relating to the first property name-value pair, and a second entity-property mapping relating to the second property name-value pair; (Mathematical concepts – An entity-property mapping tensor numerically encoding involves mathematical calculations (see paragraph [0063] and [0072]) – see MPEP § 2106.04(a)(2)(I))
wherein the target property value embedding vector is calculated based on the entity-property mapping tensor. (Mathematical concepts – Calculating a target property value embedding vector (i.e., O_pv) based on the entity-property mapping tensor involves mathematical calculations (see paragraph [0063-0065] and [0072]) – see MPEP § 2106.04(a)(2)(I))
Additionally, the claim(s) do not recite any new additional elements that would amount to an integration of the abstract idea into a practical application (individually or in combination) or significantly more than the judicial exception.
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim(s) 3:
2A Prong 1: The claim(s) recite(s) an abstract idea. Specifically:
wherein the structured knowledge base comprises a knowledge base entity-entity relationship tensor numerically encoding relationships between entities, wherein the structured query comprises a target entity-entity relationship tensor, and wherein the target property value embedding vector is calculated based on the knowledge base entity-entity relationship tensor and the target entity-entity relationship tensor. (Mathematical concepts – The relationship tensors and property value tensors are represented numerically for tensor operations (see paragraph [0072]) – see MPEP § 2106.04(a)(2)(I))
Additionally, the claim(s) do not recite any new additional elements that would amount to an integration of the abstract idea into a practical application (individually or in combination) or significantly more than the judicial exception.
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim(s) 4:
2A Prong 1: The claim(s) recite(s) an abstract idea. Specifically:
computing a target property name match score between the target property name embedding vector and a second property name embedding vector that numerically represents a second property name of the second property name-value pair; (Mathematical concepts – Computing a target property name match score (i.e., Ω(I_pn × Q_tpn^T)) involves mathematical calculations (see paragraphs [0070-0071] and Fig. 7) – see MPEP § 2106.04(a)(2)(I))
wherein the target property value embedding vector is calculated based on: the target property name match score, and a second property value embedding vector that numerically represents a second property value of the second property name-value pair. (Mathematical concepts – Calculating a target property value embedding vector (i.e., O_pv) involves mathematical calculations (see paragraphs [0070-0071] and Fig. 7) – see MPEP § 2106.04(a)(2)(I))
Additionally, the claim(s) do not recite any new additional elements that would amount to an integration of the abstract idea into a practical application (individually or in combination) or significantly more than the judicial exception.
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim(s) 5:
2A Prong 1: The claim(s) recite(s) an abstract idea. Specifically:
wherein the target property value embedding vector is calculated based on: a product of the condition property name match score with the condition property value match score, and the second property value embedding vector weighted by the target property name match score. (Mathematical concepts – Calculating a target property value embedding vector (i.e., O_pv) involves mathematical calculations (see paragraphs [0070-0071] and Fig. 7) – see MPEP § 2106.04(a)(2)(I))
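The weighted combination recited in this limitation admits a compact sketch. The following hypothetical NumPy illustration shows one way such a combination could be computed; the softmax normalization, the single match slot, and all dimensions are assumptions made for the example, not the application's actual formula.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 4, 3  # embedding dim and number of knowledge-base pairs (arbitrary)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

I_pn = rng.standard_normal((n, d))  # KB property name embeddings
I_pv = rng.standard_normal((n, d))  # KB property value embeddings
Q_tpn = rng.standard_normal(d)      # target property name embedding
Q_mpn = rng.standard_normal(d)      # match property name embedding (one slot)
Q_mpv = rng.standard_normal(d)      # match property value embedding (one slot)

cond_name = softmax(I_pn @ Q_mpn)   # condition property name match scores
cond_value = softmax(I_pv @ Q_mpv)  # condition property value match scores
row_weight = cond_name * cond_value # product of the two condition scores

target_name = softmax(I_pn @ Q_tpn) # target property name match scores

# Target property value embedding: KB value embeddings weighted by the
# condition-score product and the target-name match score, then summed.
O_pv = ((row_weight * target_name)[:, None] * I_pv).sum(axis=0)
```

The output O_pv is a single d-dimensional vector, consistent with the characterization of the step as a mathematical calculation.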
Additionally, the claim(s) do not recite any new additional elements that would amount to an integration of the abstract idea into a practical application (individually or in combination) or significantly more than the judicial exception.
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim(s) 6:
2A Prong 1: The claim(s) recite(s) an abstract idea. Specifically:
wherein the structured knowledge base comprises an entity-property mapping tensor numerically encoding a first entity-property mapping relating to the first property name-value pair, and a second entity-property mapping relating to the second property name-value pair; (Mathematical concepts – An entity-property mapping and property name-value pair are denoted by mathematical tensor structures and are gradient-based tunable parameters of the knowledge base (see paragraph [0063-0065] and [0116]) – see MPEP § 2106.04(a)(2)(I))
wherein the target property value embedding vector is calculated based on the entity-property mapping tensor. (Mathematical concepts – Calculating a target property value embedding vector (i.e., O_pv) involves mathematical calculations (see paragraphs [0070-0071] and Fig. 7) – see MPEP § 2106.04(a)(2)(I))
Additionally, the claim(s) do not recite any new additional elements that would amount to an integration of the abstract idea into a practical application (individually or in combination) or significantly more than the judicial exception.
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim(s) 7:
2A Prong 1: The claim(s) recite(s) an abstract idea. Specifically:
wherein the structured knowledge base comprises a plurality of property name-value pairs, each property name-value pair comprising: a property name embedding vector numerically representing a property name, and a property value embedding vector numerically representing a property value associated with the property name, the plurality of name-value pairs comprising the first name-value pair and the second name-value pair; (Mathematical concepts – Name-value pairs are denoted by mathematical tensor structures and are gradient-based tunable parameters of the knowledge base (see paragraph [0063-0065] and [0116]) – see MPEP § 2106.04(a)(2)(I))
wherein the target property value embedding vector is calculated based on: a target property name match score computed between the target property name embedding vector of the structured query and the property name embedding vector of each property name-value pair of the structured knowledge base. (Mathematical concepts – Calculating a target property value embedding vector (i.e., O_pv) involves mathematical calculations (see paragraphs [0070-0071] and Fig. 7) – see MPEP § 2106.04(a)(2)(I))
Additionally, the claim(s) do not recite any new additional elements that would amount to an integration of the abstract idea into a practical application (individually or in combination) or significantly more than the judicial exception.
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim(s) 8:
2A Prong 1: The claim(s) recite(s) an abstract idea. Specifically:
wherein the target property value embedding vector is calculated based on the property value embedding vector of each property name-value pair of the structured knowledge base. (Mathematical concepts – Calculating a target property value embedding vector (i.e., O_pv) involves mathematical calculations (see paragraphs [0070-0071] and Fig. 7) – see MPEP § 2106.04(a)(2)(I))
Additionally, the claim(s) do not recite any new additional elements that would amount to an integration of the abstract idea into a practical application (individually or in combination) or significantly more than the judicial exception.
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim(s) 9:
2A Prong 1: The claim(s) recite(s) an abstract idea. Specifically:
wherein the structured knowledge base comprises a plurality of property name-value pairs, each property name-value pair comprising: a property name embedding vector numerically representing a property name, and a property value embedding vector numerically representing a property value associated with the property name, the plurality of name-value pairs comprising the first name-value pair and the second name-value pair; (Mathematical concepts – Property name-value pairs are denoted by mathematical tensor structures and are gradient-based tunable parameters of the knowledge base (see paragraph [0063-0065] and [0116]) – see MPEP § 2106.04(a)(2)(I))
wherein the target property value embedding vector is calculated based on: a condition property name match score computed between the match property name embedding vector of the structured query and the property name embedding vectors of each property name-value pair of the structured knowledge base, a condition property value match score computed between the match property value embedding vector of the structured query and the property value of each property name-value pair of the structured knowledge base, and an entity-property mapping tensor numerically encoding an entity-property mapping for each property name-value pair of the knowledge database. (Mathematical concepts – Calculating a target property value embedding vector (i.e., O_pv) involves mathematical calculations (see paragraphs [0070-0071] and Fig. 7) – see MPEP § 2106.04(a)(2)(I))
Additionally, the claim(s) do not recite any new additional elements that would amount to an integration of the abstract idea into a practical application (individually or in combination) or significantly more than the judicial exception.
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim(s) 10:
2A Prong 1: The claim(s) recite(s) an abstract idea. Specifically:
computing […] a gradient of a training loss function with respect to the at least one query generator parameter; and (Mathematical concepts – Computing a gradient of a training loss function involves mathematical calculations (see paragraph [0115-0116]) – see MPEP § 2106.04(a)(2)(I))
updating the at least one query generator parameter based on the gradient of the training loss function. (Mathematical concepts – Updating parameters based on the gradient of the training loss function involves mathematical calculations (see paragraph [0115-0116]) – see MPEP § 2106.04(a)(2)(I))
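The gradient computation and parameter update recited above can be sketched as a single gradient-descent step. This is a hypothetical illustration with a toy squared-error loss; the application's actual training loss and parameters per [0115-0116] are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.standard_normal((4, 4))  # stand-in for "at least one query generator parameter"
x = rng.standard_normal(4)
target = rng.standard_normal(4)
lr = 0.01                        # learning rate (arbitrary)

def loss(W):
    # Hypothetical squared-error training loss.
    return float(np.sum((W @ x - target) ** 2))

# Analytic gradient of the loss with respect to W: dL/dW = 2 (W x - t) x^T.
grad = 2.0 * np.outer(W @ x - target, x)

# Update the parameter based on the gradient of the training loss.
W_updated = W - lr * grad
```

A single step of this form strictly reduces the toy loss, which is the sense in which the recited updating is purely mathematical.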
2A Prong 2: The additional elements recited in the claim(s) do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
[…] on a graphical processing unit or other accelerator processor […] (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
2B: The claim(s) do(es) not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
[…] on a graphical processing unit or other accelerator processor […] (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim(s) 11:
2A Prong 1: The claim(s) recite(s) an abstract idea. Specifically:
computing […] a first gradient of a joint training loss function with respect to the at least one query generator parameter; (Mathematical concepts – Computing gradient descent involves mathematical calculations (see paragraph [0115-0116]) – see MPEP § 2106.04(a)(2)(I))
updating the at least one query generator parameter based on the first gradient; computing a second gradient of the joint training loss function with respect to a knowledge base parameter of the structured knowledge base; and updating, based on the second gradient of the joint training loss function, the knowledge base parameter. (Mathematical concepts – Computing gradient descent and updating parameters based on the gradient descent involves mathematical calculations (see paragraph [0115-0116]) – see MPEP § 2106.04(a)(2)(I))
2A Prong 2: The additional elements recited in the claim(s) do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
[…] on a graphical processing unit or other accelerator processor […] (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
2B: The claim(s) do(es) not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
[…] on a graphical processing unit or other accelerator processor […] (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim(s) 12:
2A Prong 1: The claim(s) recite(s) an abstract idea. Specifically:
wherein the training loss function encodes a joint masked modeling task. (Mathematical concepts – Masking involves mathematical probabilities (see paragraph [0136]) – see MPEP § 2106.04(a)(2)(I))
Additionally, the claim(s) do not recite any new additional elements that would amount to an integration of the abstract idea into a practical application (individually or in combination) or significantly more than the judicial exception.
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim(s) 13:
2A Prong 2: The additional elements recited in the claim(s) do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
generating an output using the neural network applied to an input comprising at least one of: image data, video data, audio data, text data, cybersecurity data, sensor data, medical data. (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
2B: The claim(s) do(es) not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
generating an output using the neural network applied to an input comprising at least one of: image data, video data, audio data, text data, cybersecurity data, sensor data, medical data. (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim(s) 14:
2A Prong 2: The additional elements recited in the claim(s) do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
generating, based on the target property name embedding vector, a detection output; and performing a cybersecurity action based on the detection output. (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
wherein the at least one first tensor embodies cybersecurity telemetry, the method comprising: (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
2B: The claim(s) do(es) not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
generating, based on the target property name embedding vector, a detection output; and performing a cybersecurity action based on the detection output. (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
wherein the at least one first tensor embodies cybersecurity telemetry (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
With respect to claim(s) 20:
2A Prong 1: The claim(s) recite(s) an abstract idea. Specifically:
a match property name embedding vector, a match property value embedding vector associated with the match property name embedding vector, and a target property name embedding vector numerically representing a target property name; (Mathematical concepts – A target property name embedding vector (i.e., Q_tpn), a match property name embedding vector (i.e., Q_mpn), and a match property value embedding vector (i.e., Q_mpv) are generated via matrix multiplication (see paragraph [0080]) and are terms involved in the mathematical computation of the target property value embedding vector (i.e., O_pv), as shown in Fig. 7 and paragraphs [0070-0071]. Moreover, the property name embedding vector (i.e., I_pn) is a gradient-based tunable parameter of the knowledge base (see paragraph [0116]). – see MPEP § 2106.04(a)(2)(I))
computing a condition property name match score between the match property name embedding vector of the structured query and a first property name embedding vector, the first property name embedding vector numerically representing a first property name of a first property name-value pair of a structured knowledge base; (Mathematical concepts – Computing a condition property name match score involves matrix multiplication (see paragraphs [0070-0072], [0085], [0089], and Fig. 7) – see MPEP § 2106.04(a)(2)(I))
computing a condition property value match score between the match property value embedding vector of the structured query and a first property value embedding vector, the first property value embedding vector numerically representing a first property value of the first property name-value pair of the structured knowledge base; and (Mathematical concepts – Computing a condition property value match score involves matrix multiplication (see paragraphs [0070-0072], [0086-0087], [0089], and Fig. 7) – see MPEP § 2106.04(a)(2)(I))
based on the condition property name match score, the condition property value match score, and a second property-name value pair of the structured knowledge base, returning a target property value embedding vector numerically representing a target property value associated with the target property name. (Mathematical concepts – Computing the target property value embedding vector (i.e., O_pv) involves mathematical calculations (see paragraphs [0070-0071] and Fig. 7) – see MPEP § 2106.04(a)(2)(I))
2A Prong 2: The additional elements recited in the claim(s) do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
A computer-readable storage medium configured to store executable instructions, which are configured to, upon execution by at least one processor, cause the at least one processor to implement operations comprising: (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
receiving a structured query comprising: (Mere data gathering – Adding insignificant extra-solution activity of mere data gathering to the judicial exception – see MPEP § 2106.05(g).)
2B: The claim(s) do(es) not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
A computer-readable storage medium configured to store executable instructions, which are configured to, upon execution by at least one processor, cause the at least one processor to implement operations comprising: (Mere instructions to apply an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
receiving a structured query comprising: (Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (WURC) – see MPEP § 2106.05(d)(II) – Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information).)
Since the claim does not recite additional elements that either integrate the judicial exception into a practical application or provide significantly more than the judicial exception, the claim is not patent eligible.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 15-17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by YIN ("Neural Enquirer: Learning to Query Tables with Natural Language"), hereafter YIN.
Regarding Claim 15:
YIN teaches:
A computer system comprising: at least one memory configured to store: executable instructions, and a structured knowledge base comprising: […] at least one processor coupled to the at least one memory, and configured to execute the executable instructions, which upon execution cause the at least one processor to perform at least one of: (YIN [pg. 16, Section B] teaches: “We train Neural Enquirer-CPU and Sempre on a machine with Intel Core i7-3770@3.40GHz and 16GB memory, while Neural Enquirer-GPU is tuned on Nvidia Tesla K40.”)
property name-value pairs, each property name-value pair comprising: a property name embedding vector numerically representing a property name, and a property value embedding vector numerically representing a property value associated with the property name (YIN [pg. 4, Figure 2] teaches a table embedding in which both field names (i.e., property name embedding vector) and values (i.e., property value embedding vector) are embedding vectors forming multiple name-value pairs.)
an entity-property mapping tensor numerically encoding an entity-property mapping for each property name-value pair; (YIN [pg. 4, section 3.2. Table Encoder] teaches: "More specifically, for the entry in the m-th row and n-th column with a value of w_mn, Table Encoder computes a d_ℇ-dimensional embedding vector e_mn (i.e., entity-property mapping tensor) by fusing the embedding of the entry value with the embedding of its corresponding field name […]".)
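The fusion step YIN describes (combining a value embedding w_mn with its field-name embedding f_n into a single entry embedding e_mn) can be sketched minimally as follows; the single projection matrix W and tanh nonlinearity are illustrative assumptions, since YIN models the fusion with a DNN:

```python
import numpy as np

rng = np.random.default_rng(0)
d_word, d_e = 4, 3  # word-embedding and fused-embedding dimensions (illustrative)

f_n = rng.normal(size=d_word)   # field-name embedding (property name)
w_mn = rng.normal(size=d_word)  # entry-value embedding (property value)

# Fuse the value embedding with its field-name embedding into one
# entry embedding e_mn via a learned projection (hypothetical W;
# YIN's Table Encoder uses a DNN for this fusion).
W = rng.normal(size=(d_e, 2 * d_word))
e_mn = np.tanh(W @ np.concatenate([f_n, w_mn]))
```

The resulting e_mn retains both the property name and the property value information in a single d_ℇ-dimensional vector, which is what the rejection maps to the claimed entity-property mapping tensor.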
a read operation comprising extracting from the knowledge base, based on the entity-property mapping tensor: a target property name embedding vector, or a target property value embedding vector, (YIN [pg. 4, section 3.2. Table Encoder] teaches: "More specifically, for the entry in the m-th row and n-th column with a value of w_mn, Table Encoder computes a d_ℇ-dimensional embedding vector e_mn (i.e., entity-property mapping tensor) by fusing the embedding of the entry value with the embedding of its corresponding field name (i.e., target property value embedding vector) […]". Examiner's note: Computing an embedding vector for values in a knowledge table requires obtaining the values for the computation. Under broadest reasonable interpretation, a read operation can be interpreted as the step of obtaining (i.e., extracting) the values prior to computing the embedding vector.)
a write operation comprising at least one of: […] modifying a property name embedding vector contained in the knowledge base, modifying a property value embedding vector contained in the knowledge base, […] (YIN [pg. 4, section 3.2] teaches: “Our Table Encoder functions differently from classical knowledge embedding models (e.g., TransE [4]), where embeddings of entities (entry values) and relations (field names) are learned in a unsupervised fashion via minimizing certain reconstruction errors. Embeddings in Neural Enquirer are optimized (i.e., modifying both a property name embedding and a property value embedding vector) via supervised learning towards end-to-end QA tasks.” YIN [pg. 9, section 5.2] teaches: “Neural Enquirer is trained via standard back-propagation. Objective functions are optimized using SGD in a mini-batch of size 100 with adaptive learning rates (AdaDelta [16]).”)
Regarding Claim 16:
YIN teaches the elements of claim 15 as outlined above. YIN further teaches:
The computer system of claim 15, wherein the executable instructions are configured to cause the at least one processor to: (YIN [pg. 16, Section B] teaches: “We train Neural Enquirer-CPU and Sempre on a machine with Intel Core i7-3770@3.40GHz and 16GB memory, while Neural Enquirer-GPU is tuned on Nvidia Tesla K40.”)
compute a gradient of a training loss function with respect to a knowledge base parameter of the structured knowledge base; and update the knowledge base parameter based on the gradient of the training loss function. (YIN [pg. 1, Abstract] teaches: “Neural Enquirer can be trained with gradient descent, with which not only the parameters of the controlling components and semantic parsing component, but also the embeddings of the tables and query words can be learned from scratch.” YIN [pg. 9, section 5.2] teaches: “Neural Enquirer is trained via standard back-propagation. Objective functions are optimized using SGD in a mini-batch of size 100 with adaptive learning rates (AdaDelta [16]).” Examiner’s note: a knowledge base parameter of the structured knowledge base can be interpreted as the learned and optimized embeddings of the knowledge-base tables.)
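The compute-gradient-and-update limitation corresponds to an ordinary gradient-descent step on a table embedding; a minimal sketch, assuming a squared-error loss and a fixed learning rate (both illustrative, not YIN's actual objective or optimizer):

```python
import numpy as np

# Knowledge-base parameter: one table-entry embedding to be learned.
kb_param = np.array([0.5, -0.2, 0.1])
target = np.array([1.0, 0.0, 0.0])   # supervision signal (illustrative)
lr = 0.1

# Training loss L = ||kb_param - target||^2,
# so the gradient is dL/d(kb_param) = 2 * (kb_param - target).
grad = 2.0 * (kb_param - target)

# Update the knowledge-base parameter based on the gradient (SGD step).
kb_param = kb_param - lr * grad
```

A single step moves the embedding toward the supervision signal and reduces the loss, which is the behavior the cited back-propagation training of the table embeddings relies on.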
Regarding Claim 17:
YIN teaches the elements of claim 16 as outlined above. YIN further teaches:
The computer system of claim 16, wherein the at least one processor comprises a graphical processing unit or other accelerator processor configured to compute the gradient of the training loss function. (YIN [pg. 16, Section B] teaches: “We train Neural Enquirer-CPU and Sempre on a machine with Intel Core i7-3770@3.40GHz and 16GB memory, while Neural Enquirer-GPU is tuned on Nvidia Tesla K40.” YIN [pg. 1, Abstract] teaches: “Neural Enquirer can be trained with gradient descent […].”)
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-2, 4, 6-11, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over RAJENDRAN (“NE-Table: A Neural Key-Value Table for Named Entities”) in view of YIN, MILLER ("Key-Value Memory Networks for Directly Reading Documents"), and CHO ("Adversarial TableQA: Attention Supervision for Question Answering on Tables"), hereafter RAJENDRAN, YIN, MILLER, and CHO respectively.
Regarding Claim 1:
RAJENDRAN teaches:
A computer-implemented method comprising: receiving at an attention layer of a neural network at least one first tensor; (RAJENDRAN [pg. 981, Figure 1] teaches: "For input question - Who teaches EECS-545, the NE-Embedding Generation Module (f_ϕ) takes the context embedding as input and generates a NE-Embedding for the NE EECS-545.” RAJENDRAN [pg. 984, section 3.3] teaches: "Both models use a Recurrent Neural Network (RNN) to encode the question (i.e., at least one first tensor) and use the multiple-attention based neural retrieval mechanism to retrieve answers." RAJENDRAN [pg. 981, section 2] teaches: "For our purposes, f_ϕ is a multi-layer perceptron (MLP).")
generating, based on the at least one first tensor and at least one query generator parameter of the attention layer, a structured query comprising: a target property name embedding vector, a match property name embedding vector, and a match property value embedding vector associated with the match property name embedding vector; (RAJENDRAN [pg. 984, section 3.3] teaches: "Both models use a Recurrent Neural Network (RNN) to encode the question (i.e., based on the at least one first tensor) and use the multiple-attention based neural retrieval mechanism to retrieve answers." RAJENDRAN [pg. 990, section C] teaches: "DB-Retrieval Module (h_ψ) does that by generating 3 different attention key embeddings (vectors) (i.e., a structured query): Attention over Columns for Columns (ACC) (i.e., a target property name embedding vector), Attention over Columns for Rows (ACR) (i.e., a match property name embedding vector), Attention over Rows for Rows (ARR) (i.e., and a match property value embedding vector associated with the match property name embedding vector)." RAJENDRAN [pg. 989, section A.1] teaches: “The entire model is trained using stochastic gradient descent (learning rate = 0.05), minimizing a standard cross-entropy loss between predicted answer â and the correct answer a.” Examiner’s note: the embedding generation model f_ϕ is a multi-layer perceptron trained using stochastic gradient descent, and thus the layer’s parameters that use attention in the neural embedding space are updated as the model learns to correctly retrieve information.)
RAJENDRAN is not relied upon for teaching:
computing a condition property name match score between the match property name embedding vector of the structured query and a first property name embedding vector, the first property name embedding vector numerically representing a first property name of a first property name-value pair of a structured knowledge base;
computing a condition property value match score between the match property value embedding vector of the structured query and a first property value embedding vector, the first property value embedding vector numerically representing a first property value of the first name-value pair of the structured knowledge base; and
based on the condition property name match score, the condition property value match score, a second property name-value pair of the structured knowledge base, and the target property name embedding vector, calculating a target property value embedding vector numerically representing a target property value.
However, YIN teaches: computing a condition property name match score between the match property name embedding vector of the structured query and a first property name embedding vector, the first property name embedding vector numerically representing a first property name of a first property name-value pair of a structured knowledge base; (YIN [pg. 6, section 3.3.1 Reader] teaches: "where ω̃(·) is the normalized attention weights (i.e., computing a condition property name match score) given by:

ω̃(f_n, q, g^(l-1)) = exp(ω(f_n, q, g^(l-1))) / Σ_{n'=1}^{N} exp(ω(f_{n'}, q, g^(l-1)))"

YIN [pg. 4, section 3.2 Table Encoder] teaches: "[...] where f_n is the embedding of the field name (of the n-th column) (i.e., a first property name embedding vector)." YIN [pg. 3, section 2 Overview of Neural Enquirer] teaches: "Table Encoder (Section 3.2), which encodes entries in the table into distributed vectors. Table Encoder outputs an embedding vector for each table entry, which retains the two dimensional structure of the table. (i.e., the first property name embedding vector numerically representing a first property name of a first property name-value pair of a structured knowledge base)." YIN [pg. 3, section 3.1 Query Encoder] teaches: "Given an NL query Q composed of a sequence of words {w_1, w_2, …, w_T}, Query Encoder parses Q into a d_Q-dimensional vectorial representation q: Q →_encode q ∈ ℝ^(d_Q) (i.e., the match property name embedding vector […])." YIN [pg. 2, section 1. Introduction] teaches: "Our work, inspired by above-mentioned threads of research, aims to design a neural network system that can learn to understand the query and execute it on a knowledge-base table from examples of queries and answers.")
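The normalized attention weight ω̃ quoted above is a softmax over per-field scores; the sketch below illustrates it with a dot-product score standing in for YIN's DNN-modeled ω(f_n, q, g^(l-1)) (an assumption for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 5, 8  # number of fields (columns) and embedding dimension (illustrative)

F = rng.normal(size=(N, d))  # field-name embeddings f_1 .. f_N
q = rng.normal(size=d)       # query representation

# Raw scores omega(f_n, q); YIN models omega with a DNN over
# (f_n, q, g^(l-1)) -- a plain dot product is used here only to illustrate.
scores = F @ q

# Normalized attention weights: softmax over the N field names,
# matching the quoted exp(.) / sum exp(.) formula.
w = np.exp(scores - scores.max())
w = w / w.sum()
```

The resulting w is a probability distribution over the field names, i.e., a per-column match score between the query and each property name embedding.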
computing a condition property value match score between the match property value embedding vector of the structured query and a first property value embedding vector, the first property value embedding vector numerically representing a first property value of the first name-value pair of the structured knowledge base; (YIN [pg. 7, section 3.3.3] teaches: "Instead of computing annotations based on read vectors, the last executor in Neural Enquirer directly outputs the probability (i.e., computing a condition property value match score) of the value of each entry in T being the answer: […] where f^l_ANS(·) is modeled as a DNN. Note that the last executor, which is devoted to returning answers, carries out a specific kind of execution using f^l_ANS(·) based on the entry value, the query, and annotations from previous layer." YIN [pg. 4, section 3.2. Table Encoder] teaches: "More specifically, for the entry in the m-th row and n-th column with a value of w_mn". Examiner's note: Under BRI, a first property value embedding vector can be interpreted as w_mn, which is encoded in a d_ℇ-dimensional embedding vector e_mn by fusing the embedding of the entry value with the embedding of its corresponding field name (see YIN [pg. 4, section 3.2 Table Encoder]).)
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of RAJENDRAN and YIN before them, to include YIN's calculation of normalized attention weights in RAJENDRAN’s neural retrieval mechanism. One would have been motivated to make such a combination in order to guide the learning process using the attention weights when processing difficult queries (YIN [pg. 7-8, section 4 Learning]).
RAJENDRAN in view of YIN is not relied upon for teaching, but MILLER teaches: based on the condition property name match score […], a second property name-value pair of the structured knowledge base, and the target property name embedding vector, calculating a target property value embedding vector numerically representing a target property value. (MILLER [pg. 3-4, section 3.1 Model Description] teaches the following (expanded) formula:

O = Σ_i softmax(AΦ_X(x) · AΦ_K(k_{h_i})) · AΦ_V(v_{h_i})

where AΦ_X(x) represents the query, AΦ_K(k_{h_i}) denotes the key, and AΦ_V(v_{h_i}) denotes the target value stored in the knowledge base. Under broadest reasonable interpretation, the target property name embedding vector can be interpreted as AΦ_X(x), the condition property name match score can be interpreted as p_{h_i}, and a target property value embedding vector numerically representing a target property value can be interpreted as the output vector O.)
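MILLER's readout can be illustrated as a softmax-weighted sum of value embeddings scored by query-key dot products; in the sketch below, random embeddings stand in for the feature maps AΦ_X, AΦ_K, and AΦ_V (illustrative assumptions, not MILLER's learned maps):

```python
import numpy as np

rng = np.random.default_rng(2)
n_mem, d = 4, 6  # memory slots and embedding dimension (illustrative)

q = rng.normal(size=d)            # stands in for A Phi_X(x): query embedding
K = rng.normal(size=(n_mem, d))   # stands in for A Phi_K(k_{h_i}): key embeddings
V = rng.normal(size=(n_mem, d))   # stands in for A Phi_V(v_{h_i}): value embeddings

# p_{h_i} = softmax(q . k_{h_i}): relevance of each key to the query.
logits = K @ q
p = np.exp(logits - logits.max())
p = p / p.sum()

# O = sum_i p_{h_i} * v_{h_i}: weighted sum of the values read from memory.
O = p @ V
```

The output vector O is what the rejection maps to the target property value embedding vector: a value read from the knowledge base, weighted by how well each key matches the query.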
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of RAJENDRAN, YIN, and MILLER before them, to include MILLER’s output vector calculation based on a knowledge base in RAJENDRAN and YIN’s neural retrieval mechanism. One would have been motivated to make such a combination in order to allow the model to encode prior knowledge for the considered task and leverage possibly complex transforms between keys and values, while still being trained via stochastic gradient descent (MILLER [pg. 2, section 1 Introduction]).
However, RAJENDRAN in view of YIN and MILLER is not relied upon for teaching, but CHO teaches: based on […] the condition property value match score, […] calculating a target property value […] (CHO [pg. 399, section 4.4. Operand Selector] teaches: "Moreover, the operand selector uses the product of both the row scores p_{r_j}^{t'} and the column attention scores from Column SelRU a_{ff_k}^{t'} (Eq. 3) to calculate the cell scores C_{j,k}, which represents the probability of selecting the operands used when calculating the final answer (i.e., target property value), as shown in Eq. 7.

C_{j,k} = p_{r_j}^{t'} × a_{ff_k}^{t'}

At test time, we filter the cells using a threshold γ, where cells with scores C_{j,k} > γ are the selected operands." CHO [pg. 399, section 4.5. Operation Solver] teaches: "Finally, we use the cell scores C_{j,k} to solve all the operations available in the model." Examiner's note: Under broadest reasonable interpretation, the condition property value match score can be interpreted as p_{r_j}^{t'}, which denotes the scalar row score with values corresponding to the probability of selecting the row. Furthermore, a person having ordinary skill in the art would note that a_{ff_k}^{t'} corresponds to the probability value of selecting a column, and is thus similar to MILLER's AΦ_K(k_{h_i}) in the relevance probability calculation.)
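CHO's Eq. 7 is an outer product of row and column selection probabilities followed by a test-time threshold; a minimal sketch, with the probabilities and γ chosen purely for illustration:

```python
import numpy as np

# Row scores p_{r_j} and column attention scores a_{f_k} (illustrative values).
p_row = np.array([0.9, 0.1, 0.7])   # probability of selecting each row
a_col = np.array([0.8, 0.2])        # probability of selecting each column

# C_{j,k} = p_{r_j} x a_{f_k}: probability of selecting cell (j, k) as an operand.
C = np.outer(p_row, a_col)

# At test time, cells with C_{j,k} > gamma are the selected operands.
gamma = 0.5
selected = np.argwhere(C > gamma)  # cells (0, 0) and (2, 0)
```

Each cell score combines a row-level match (mapped by the rejection to the condition property value match score) with a column-level match, so only cells that satisfy both conditions survive the threshold.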
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of RAJENDRAN, YIN, MILLER, and CHO before them, to include CHO's cell scores calculation in RAJENDRAN, YIN, and MILLER's neural retrieval mechanism. One would have been motivated to make such a combination in order to learn both the operand and the operation to calculate the final answer and provide an intuitive process for users who want to know how the answer was generated (CHO [pg. 396, section 4. Our model: Neural Operator]).
Regarding Claim 2:
RAJENDRAN in view of YIN, MILLER, and CHO teaches the elements of claim 1 as outlined above. YIN further teaches:
wherein the structured knowledge base comprises an entity-property mapping tensor numerically encoding a first entity-property mapping relating to the first property name-value pair, and a second entity-property mapping relating to the second property name-value pair; (YIN [pg. 4, section 3.2. Table Encoder] teaches: "More specifically, for the entry in the m-th row and n-th column with a value of w_mn, Table Encoder computes a d_ℇ-dimensional embedding vector e_mn (i.e., entity-property mapping tensor) by fusing the embedding of the entry value with the embedding of its corresponding field name […]".)
wherein the target property value […] is calculated based on the entity-property mapping tensor. (YIN [pg. 7, section 3.3.3.] teaches “the last executor in Neural Enquirer directly outputs the probability of the value of each entry in T being the answer […].” Examiner’s note: Equation 6 uses the embedding vector e_mn (i.e., based on the entity-property mapping tensor) to compute the probability that the entry is the answer.)
MILLER further teaches: […] target property value […] is calculated based on the […] tensor (MILLER [pg. 3-4, section 3.1] teaches the (expanded) formula:

O = Σ_i softmax(AΦ_X(x) · AΦ_K(k_{h_i})) · AΦ_V(v_{h_i})

where the values of the memories are read by taking their weighted sum using probabilities to compute the vector O. Both MILLER’s equation to compute the O vector and YIN’s equation 6 use softmax to compute an answer for a query or question. Therefore, it would have been obvious for a person of ordinary skill in the art to test combinations of YIN’s and MILLER’s equations to arrive at a combination that could allow higher probabilities of selecting the correct answer.)
Regarding Claim 4:
RAJENDRAN in view of YIN, MILLER, and CHO teaches the elements of claim 1 as outlined above. CHO further teaches:
computing a target property name match score between the target property name embedding vector and a second property name embedding vector that numerically represents a second property name of the second property name-value pair; (CHO [pg. 398, section 4.2. Selective Recurrent Units] teaches equation (3) for computing attention scores (i.e., target property name match score) using a softmax function between the query's target column embedding and each column header embedding f_k.)
wherein the target property value […] is calculated based on: the target property name match score, and a second property value embedding vector that numerically represents a second property value of the second property name-value pair. (CHO [pg. 398, section 4.3. Row RNN] teaches the selected cell vector c̃_j^t = Σ_k a_{f_k} × c_{j,k}, which calculates the output value using the computed column attention scores (i.e., target property name match score) and a weighted sum of the cell vectors c_{j,k} (i.e., and a second property value embedding vector that numerically represents a second property value of the second property name-value pair). CHO [pg. 399, section 4.4. Operand Selector] teaches Equation 7 for calculating the final answer (i.e., the target property value […]) as:

C_{j,k} = p_{r_j}^{t'} × a_{ff_k}^{t'}

where p_{r_j}^{t'} computes a per-row score probability of selecting a row. CHO [pg. 398, section 4.3. Row RNN] teaches the selected cell vector c̃_j^t = Σ_k a_{f_k} × c_{j,k}, which calculates the weighted sum of the cell vectors using the computed column attention scores a_{f_k} and the cell vectors c_{j,k}.)
MILLER further teaches: […] the target property value embedding vector is calculated based on […] property value embedding vector […] (MILLER [pg. 3-4, section 3.1] teaches the (expanded) formula:

O = Σ_i softmax(AΦ_X(x) · AΦ_K(k_{h_i})) · AΦ_V(v_{h_i})

where AΦ_X(x) represents the query, AΦ_K(k_{h_i}) denotes the key, and AΦ_V(v_{h_i}) denotes the target value (i.e., property value embedding vector) stored in the knowledge base. The values of the memories are read by taking their weighted sum using probabilities to compute the vector O. Both MILLER’s equation to compute the O vector (i.e., target property value embedding vector) and CHO’s equation 3 use softmax to compute an answer for a query or question. Therefore, it would have been obvious for a person of ordinary skill in the art to test combinations of CHO’s and MILLER’s equations to arrive at a combination that could allow higher probabilities of selecting the correct answer.)
Regarding Claim 6:
RAJENDRAN in view of YIN, MILLER, and CHO teaches the elements of claim 4 as outlined above. YIN further teaches:
wherein the structured knowledge base comprises an entity-property mapping tensor numerically encoding a first entity-property mapping relating to the first property name-value pair, and a second entity-property mapping relating to the second property name-value pair; (YIN [pg. 4, section 3.2. Table Encoder] teaches: "More specifically, for the entry in the m-th row and n-th column with a value of w_mn, Table Encoder computes a d_ℇ-dimensional embedding vector e_mn (i.e., entity-property mapping tensor) by fusing the embedding of the entry value with the embedding of its corresponding field name […]”. YIN [pg. 4, section 3.3] teaches: “An executor processes a table row-by-row. For the m-th row, with N (field, value) composite embeddings R_m = {e_m1, e_m2, …, e_mN}, the Reader fetches a read vector r_m^l from R_m, […].” Examiner’s note: the embeddings e_mN are stored in the knowledge base.)
wherein the target property value […] is calculated based on the entity-property mapping tensor. (YIN [pg. 4, section 3.2. Table Encoder] teaches: "More specifically, for the entry in the m-th row and n-th column with a value of w_mn, Table Encoder computes a d_ℇ-dimensional embedding vector e_mn (i.e., entity-property mapping tensor) by fusing the embedding of the entry value with the embedding of its corresponding field name […]”. YIN [pg. 7, section 3.3.3.] teaches using e_mn (i.e., based on the entity-property mapping tensor) as input for computing the probability of each entry in the knowledge base being the answer (i.e., the target property value is calculated).)
Regarding Claim 7:
RAJENDRAN in view of YIN, MILLER, and CHO teaches the elements of claim 4 as outlined above. CHO further teaches:
wherein the structured knowledge base comprises a plurality of property name-value pairs, each property name-value pair comprising: a property name embedding vector numerically representing a property name, and a property value embedding vector numerically representing a property value associated with the property name, the plurality of name-value pairs comprising the first name-value pair and the second name-value pair; (CHO [pg. 396, section 4.1] teaches a table with multiple columns, each having a field embedding f_k obtained from the word embedding matrix (i.e., property name embedding vector numerically representing a property name) and cell vectors c_{j,k} encoding each cell's value (i.e., a property value embedding vector numerically representing a property value associated with the property name). The table has multiple columns with multiple embedding fields and cells (i.e., the plurality of name-value pairs comprising the first name-value pair and the second name-value pair).)
wherein the target property value […] is calculated based on: a target property name match score computed between the target property name embedding vector of the structured query and the property name embedding vector of each property name-value pair of the structured knowledge base. (CHO [pg. 398, Equations 2-3] teaches computing attention scores a_{ff_k} (i.e., a target property name match score) for every column k by comparing the query vector q_t^f (i.e., the target property name embedding vector of the structured query) against each column's field embedding f_k (i.e., the property name embedding vector of each property name-value pair of the structured knowledge base). Furthermore, CHO [pg. 398, section 4.3] teaches using each cell's value to compute a weighted sum of the cell vectors c̃_j^t. CHO [pg. 399, section 4.4. Operand Selector] teaches Equation 7 for calculating the final answer (i.e., the target property value […]) as:

C_{j,k} = p_{r_j}^{t'} × a_{ff_k}^{t'}

where p_{r_j}^{t'} computes a per-row score probability of selecting a row. CHO [pg. 398, section 4.3. Row RNN] teaches the selected cell vector c̃_j^t = Σ_k a_{f_k} × c_{j,k}, which calculates the weighted sum of the cell vectors using the computed column attention scores a_{f_k} and the cell vectors c_{j,k}.)
MILLER further teaches: […] the target property value embedding vector is calculated based on […] property name embedding vector […] (MILLER [pg. 3-4, section 3.1] teaches the (expanded) formula:

O = Σ_i softmax(AΦ_X(x) · AΦ_K(k_{h_i})) · AΦ_V(v_{h_i})

where AΦ_X(x) represents the query, AΦ_K(k_{h_i}) denotes the key (i.e., property name embedding vector), and AΦ_V(v_{h_i}) denotes the target value stored in the knowledge base. The values of the memories are read by taking their weighted sum using probabilities to compute the vector O. Both MILLER’s equation to compute the O vector (i.e., target property value embedding vector) and CHO’s equation 3 use softmax to compute an answer for a query or question. Therefore, it would have been obvious for a person of ordinary skill in the art to test combinations of CHO’s and MILLER’s equations to arrive at a combination that could allow higher probabilities of selecting the correct answer.)
Regarding Claim 8:
RAJENDRAN in view of YIN, MILLER, and CHO teaches the elements of claim 7 as outlined above. CHO further teaches:
wherein the target property value […] is calculated based on the property value embedding vector of each property name-value pair of the structured knowledge base. (CHO [pg. 398, section 4.3] teaches using each cell's value to compute a weighted sum of the cell vectors c̃_j^t. CHO [pg. 399, section 4.4. Operand Selector] teaches Equation 7 for calculating the final answer (i.e., the target property value […]) as:

C_{j,k} = p_{r_j}^{t'} × a_{ff_k}^{t'}

where p_{r_j}^{t'} computes a per-row score probability of selecting a row. CHO [pg. 398, section 4.3. Row RNN] teaches the selected cell vector c̃_j^t = Σ_k a_{f_k} × c_{j,k}, which calculates the weighted sum of the cell vectors using the computed column attention scores a_{f_k} and the cell vectors c_{j,k} (i.e., based on the property value embedding vector of each property name-value pair of the structured knowledge base).)
MILLER further teaches: […] the target property value embedding vector is calculated based on […] property value embedding vector […] (MILLER [pg. 3-4, section 3.1] teaches the (expanded) formula:

O = Σ_i softmax(AΦ_X(x) · AΦ_K(k_{h_i})) · AΦ_V(v_{h_i})

where AΦ_X(x) represents the query, AΦ_K(k_{h_i}) denotes the key, and AΦ_V(v_{h_i}) denotes the target value (i.e., property value embedding vector) stored in the knowledge base. The values of the memories are read by taking their weighted sum using probabilities to compute the vector O. Both MILLER’s equation to compute the O vector (i.e., target property value embedding vector) and CHO’s equation 3 use softmax to compute an answer for a query or question. Therefore, it would have been obvious for a person of ordinary skill in the art to test combinations of CHO’s and MILLER’s equations to arrive at a combination that could allow higher probabilities of selecting the correct answer.)
Regarding Claim 9:
RAJENDRAN in view of YIN, MILLER, and CHO teaches the elements of claim 1 as outlined above. CHO further teaches:
wherein the structured knowledge base comprises a plurality of property name-value pairs, each property name-value pair comprising: a property name embedding vector numerically representing a property name, and a property value embedding vector numerically representing a property value associated with the property name, the plurality of name-value pairs comprising the first name-value pair and the second name-value pair; (CHO [pg. 397, section 4.1.] teaches a table T consisting of n rows and m columns forming an n × m matrix. Each column header is represented by field embedding f_k and each cell vector is initialized from the word embedding matrix. The field embeddings and cell vectors form multiple name-value pairs.)
YIN further teaches: wherein the target property value […] is calculated based on: […] a condition property name match score computed between the match property name embedding vector of the structured query and the property name embedding vectors of each property name-value pair of the structured knowledge base (YIN [pg. 6, section 3.3.1 Reader] teaches: "where ω̃(·) is the normalized attention weights (i.e., a condition property name match score computed) given by:

ω̃(f_n, q, g^(l-1)) = exp(ω(f_n, q, g^(l-1))) / Σ_{n'=1}^{N} exp(ω(f_{n'}, q, g^(l-1)))"

YIN [pg. 4, section 3.2 Table Encoder] teaches: "[...] where f_n is the embedding of the field name (of the n-th column) (i.e., the property name embedding vector of each property name-value pair of the structured knowledge base)." YIN [pg. 3, section 2 Overview of Neural Enquirer] teaches: "Table Encoder (Section 3.2), which encodes entries in the table into distributed vectors. Table Encoder outputs an embedding vector for each table entry, which retains the two dimensional structure of the table." YIN [pg. 3, section 3.1 Query Encoder] teaches: "Given an NL query Q composed of a sequence of words {w_1, w_2, …, w_T}, Query Encoder parses Q into a d_Q-dimensional vectorial representation q: Q →_encode q ∈ ℝ^(d_Q) (i.e., the match property name embedding vector)." YIN [pg. 2, section 1. Introduction] teaches: "Our work, inspired by above-mentioned threads of research, aims to design a neural network system that can learn to understand the query and execute it on a knowledge-base table from examples of queries and answers." Examiner’s note: Equation 6 uses the embedding vector e_mn to compute the probability that the entry is the answer (i.e., wherein the target property value […] is calculated based on).)
a condition property value match score computed between the match property value embedding vector of the structured query and the property value of each property name-value pair of the structured knowledge base, (YIN [pg. 7, section 3.3.3] teaches: "Instead of computing annotations based on read vectors, the last executor in Neural Enquirer directly outputs the probability (i.e., computing a condition property value match score) of the value of each entry in T being the answer: […] where f^l_ANS(·) is modeled as a DNN. Note that the last executor, which is devoted to returning answers, carries out a specific kind of execution using f^l_ANS(·) based on the entry value, the query, and annotations from previous layer." YIN [pg. 4, section 3.2. Table Encoder] teaches: "More specifically, for the entry in the m-th row and n-th column with a value of w_mn". Examiner's note: Under BRI, the property value embedding vector of each property name-value pair of the structured knowledge base can be interpreted as w_mn, which is encoded in a d_ℇ-dimensional embedding vector e_mn by fusing the embedding of the entry value with the embedding of its corresponding field name (see YIN [pg. 4, section 3.2 Table Encoder]).)
an entity-property mapping tensor numerically encoding an entity-property mapping for each property name-value pair of the knowledge database. (YIN [pg. 4, section 3.2. Table Encoder] teaches: "More specifically, for the entry in the m-th row and n-th column with a value of w_mn, Table Encoder computes a d_E-dimensional embedding vector e_mn (i.e., entity-property mapping tensor) by fusing the embedding of the entry value with the embedding of its corresponding field name […]”.)
MILLER further teaches: wherein the target property value embedding vector is calculated based on: […] each property name-value pair of the knowledge database. (MILLER [pg. 3-4, section 3.1] teaches the (expanded) formula: O = Σ_i softmax(AΦ_X(x) ⋅ AΦ_K(k_hi)) ⋅ AΦ_V(v_hi), where AΦ_X(x) represents the query, AΦ_K(k_hi) denotes the key, and AΦ_V(v_hi) denotes the target value stored in the knowledge base. The values of the memories are read by taking their weighted sum using probabilities to compute the vector O. MILLER’s equation to compute the O vector, CHO’s equation 3, and YIN’s equation 6 all use softmax to compute an answer for a query or question. Therefore, it would have been obvious for a person of ordinary skill in the art to test combinations of CHO’s, YIN’s, and MILLER’s equations in search of a combination that could yield higher probabilities of selecting the correct answer.)
Regarding Claim 10:
RAJENDRAN in view of YIN, MILLER, and CHO teaches the elements of claim 1 as outlined above. YIN further teaches:
[…] on a graphical processing unit or other accelerator processor […] (YIN [pg. 16, Section B] teaches: “We train Neural Enquirer-CPU and Sempre on a machine with Intel Core i7-3770@3.40GHz and 16GB memory, while Neural Enquirer-GPU is tuned on Nvidia Tesla K40,” i.e., training the model on a graphical processing unit.)
RAJENDRAN further teaches: computing […] a gradient of a training loss function with respect to the at least one query generator parameter; and updating the at least one query generator parameter based on the gradient of the training loss function. (RAJENDRAN [pg. 989, section A.1] teaches: “The entire model is trained using stochastic gradient descent (learning rate = 0.05), minimizing a standard cross-entropy loss between predicted answer â and the correct answer a. We use the same embedding matrix for encoding both story and the query.” RAJENDRAN teaches: “Supervision is provided for DB-Retrieval attentions (i.e., with respect to the at least one query generator parameter) and standard cross-entropy loss is used.”)
Regarding Claim 11:
RAJENDRAN in view of YIN, MILLER, and CHO teaches the elements of claim 1 as outlined above. RAJENDRAN further teaches:
computing […] a first gradient of a joint training loss function with respect to the at least one query generator parameter; updating the at least one query generator parameter based on the first gradient; (RAJENDRAN [pg. 989, section A.1] teaches: “The entire model is trained using stochastic gradient descent (learning rate = 0.05), minimizing a standard cross-entropy loss between predicted answer â and the correct answer a. We use the same embedding matrix for encoding both story and the query.” RAJENDRAN teaches: “Supervision is provided for DB-Retrieval attentions (i.e., with respect to the at least one query generator parameter based on the first gradient) and standard cross-entropy loss is used.”)
YIN further teaches: […] on a graphical processing unit or other accelerator processor […] (YIN [pg. 16, Section B] teaches: “We train Neural Enquirer-CPU and Sempre on a machine with Intel Core i7-3770@3.40GHz and 16GB memory, while Neural Enquirer-GPU is tuned on Nvidia Tesla K40,” i.e., training the model on a graphical processing unit.)
computing a second gradient of the joint training loss function with respect to a knowledge base parameter of the structured knowledge base; and updating, based on the second gradient of the joint training loss function, the knowledge base parameter. (YIN [pg. 1, Abstract] teaches: “Neural Enquirer can be trained with gradient descent, with which not only the parameters of the controlling components and semantic parsing component, but also the embeddings of the tables and query words can be learned from scratch.” YIN [pg. 9, section 5.2] teaches: “Neural Enquirer is trained via standard back-propagation. Objective functions are optimized using SGD in a mini-batch of size 100 with adaptive learning rates (AdaDelta [16]).” Examiner’s note: a knowledge base parameter of the structured knowledge base can be interpreted as the learned and optimized embeddings of the knowledge-base tables.)
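The joint training described above, taking gradients of one cross-entropy loss with respect to both the query-side parameters and the knowledge-base embeddings and applying SGD updates to each, can be sketched with a toy bilinear scoring model. All names here are hypothetical and the sketch is illustrative only, not any cited reference's implementation:

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def loss_and_grads(w, kb, target):
    # Score each KB entry embedding against the query parameter w, then
    # take cross-entropy of the softmax against the correct entry index.
    scores = [sum(wi * ei for wi, ei in zip(w, e)) for e in kb]
    p = softmax(scores)
    loss = -math.log(p[target])
    # dL/dscore_i = p_i - 1{i == target}, propagated to both parameter sets.
    ds = [pi - (1.0 if i == target else 0.0) for i, pi in enumerate(p)]
    grad_w = [sum(ds[i] * kb[i][d] for i in range(len(kb))) for d in range(len(w))]
    grad_kb = [[ds[i] * wd for wd in w] for i in range(len(kb))]
    return loss, grad_w, grad_kb

w = [0.1, -0.2]                   # toy "query generator parameter"
kb = [[1.0, 0.0], [0.0, 1.0]]     # toy knowledge base embeddings
loss0, gw, gkb = loss_and_grads(w, kb, target=0)
lr = 0.5
w = [wd - lr * g for wd, g in zip(w, gw)]                                      # first update
kb = [[e - lr * g for e, g in zip(row, grow)] for row, grow in zip(kb, gkb)]   # second update
loss1, _, _ = loss_and_grads(w, kb, target=0)
```

One step of descent on both parameter sets lowers the joint loss, mirroring the two-gradient structure recited in the claim.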
Regarding Claim 13:
RAJENDRAN in view of YIN, MILLER, and CHO teaches the elements of claim 1 as outlined above. RAJENDRAN further teaches:
generating an output using the neural network applied to an input comprising at least one of: image data, video data, audio data, text data, cybersecurity data, sensor data, medical data. (RAJENDRAN [pg. 984, section 3.3] teaches: "Both models use a Recurrent Neural Network (RNN) to encode the question […].” RAJENDRAN [pg. 984, section 3.2] teaches: “For the question Who teaches EECS545? (i.e., text data) […].”)
Regarding Claim 20:
YIN teaches:
A computer-readable storage medium configured to store executable instructions, which are configured to, upon execution by at least one processor, cause the at least one processor to implement operations comprising: (YIN [pg. 16, Section B] teaches: “We train Neural Enquirer-CPU and Sempre on a machine with Intel Core i7-3770@3.40GHz and 16GB memory, while Neural Enquirer-GPU is tuned on Nvidia Tesla K40.”)
receiving a structured query comprising: a match property name embedding vector, a match property value embedding vector associated with the match property name embedding vector, and a target property name embedding vector numerically representing a target property name; (RAJENDRAN [pg. 984, section 3.3] teaches: "Both models use a Recurrent Neural Network (RNN) to encode the question (i.e., based on the at least one first tensor) and use the multiple-attention based neural retrieval mechanism to retrieve answers." RAJENDRAN [pg. 990, section C] teaches: "In order to retrieve a particular cell from the table, the system needs to find the correct column and row corresponding to it. DB-Retrieval Module (h_ψ) does that by generating 3 different attention key embeddings (vectors) (i.e., a structured query): Attention over Columns for Columns (ACC) (i.e., a target property name embedding vector numerically representing a target property name), Attention over Columns for Rows (ACR) (i.e., a match property name embedding vector), Attention over Rows for Rows (ARR) (i.e., and a match property value embedding vector associated with the match property name embedding vector)." RAJENDRAN [pg. 989, section A.1] teaches: “The entire model is trained using stochastic gradient descent (learning rate = 0.05), minimizing a standard cross-entropy loss between predicted answer â and the correct answer a.” Examiner’s note: the embedding generation model f_ϕ is a multi-layer perceptron trained using stochastic gradient descent, and thus the layer’s parameters that use attention in the neural embedding space are updated as the model learns to correctly retrieve information. Furthermore, under broadest reasonable interpretation, receiving a structured query can be interpreted as using the 3 generated attention key embedding vectors for retrieving a particular cell from the table.)
RAJENDRAN is not relied upon for teaching:
computing a condition property name match score between the match property name embedding vector of the structured query and a first property name embedding vector, the first property name embedding vector numerically representing a first property name of a first property name-value pair of a structured knowledge base;
computing a condition property value match score between the match property value embedding vector of the structured query and a first property value embedding vector, the first property value embedding vector numerically representing a first property value of the first property name-value pair of the structured knowledge base; and
based on the condition property name match score, the condition property value match score, and a second property-name value pair of the structured knowledge base, returning a target property value embedding vector numerically representing a target property value associated with the target property name.
However, YIN teaches: computing a condition property name match score between the match property name embedding vector of the structured query and a first property name embedding vector, the first property name embedding vector numerically representing a first property name of a first property name-value pair of a structured knowledge base; (YIN [pg. 6, section 3.3.1 Reader] teaches: "where ω̃(⋅) is the normalized attention weights (i.e., computing a condition property name match score) given by: ω̃(f_n, q, g^(l-1)) = exp(ω(f_n, q, g^(l-1))) / Σ_{n'=1}^{N} exp(ω(f_{n'}, q, g^(l-1)))" YIN [pg. 4, section 3.2 Table Encoder] teaches: "[...] where f_n is the embedding of the field name (of the n-th column) (i.e., a first property name embedding vector)." YIN [pg. 3, section 2 Overview of Neural Enquirer] teaches: "Table Encoder (Section 3.2), which encodes entries in the table into distributed vectors. Table Encoder outputs an embedding vector for each table entry, which retains the two dimensional structure of the table. (i.e., the first property name embedding vector numerically representing a first property name of a first property name-value pair of a structured knowledge base)." YIN [pg. 3, section 3.1 Query Encoder] teaches: "Given an NL query Q composed of a sequence of words {w_1, w_2, …, w_T}, Query Encoder parses Q into a d_Q-dimensional vectorial representation q: Q --encode--> q ∈ R^(d_Q) (i.e., the match property name embedding vector […])." YIN [pg. 2, section 1. Introduction] teaches: "Our work, inspired by above-mentioned threads of research, aims to design a neural network system that can learn to understand the query and execute it on a knowledge-base table from examples of queries and answers.")
computing a condition property value match score between the match property value embedding vector of the structured query and a first property value embedding vector, the first property value embedding vector numerically representing a first property value of the first property name-value pair of the structured knowledge base; and (YIN [pg. 7, section 3.3.3] teaches: "Instead of computing annotations based on read vectors, the last executor in Neural Enquirer directly outputs the probability (i.e., computing a condition property value match score) of the value of each entry in T being the answer: […] where f^l_ANS(⋅) is modeled as a DNN. Note that the last executor, which is devoted to returning answers, carries out a specific kind of execution using f^l_ANS(⋅) based on the entry value, the query, and annotations from previous layer." YIN [pg. 4, section 3.2. Table Encoder] teaches: "More specifically, for the entry in the m-th row and n-th column with a value of w_mn” Examiner's note: Under BRI, a first property value embedding vector can be interpreted as w_mn, which is encoded in a d_E-dimensional embedding vector e_mn by fusing the embedding of the entry value with the embedding of its corresponding field name (see YIN [pg. 4, section 3.2 Table Encoder]).)
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of RAJENDRAN and YIN before them, to include YIN's calculation of normalized attention weights in RAJENDRAN’s neural retrieval mechanism. One would have been motivated to make such a combination in order to guide the learning process using the attention weights when processing difficult queries (YIN [pg. 7-8, section 4 Learning]).
RAJENDRAN in view of YIN is not relied upon for teaching, but MILLER teaches: based on the condition property name match score, […] and a second property-name value pair of the structured knowledge base, returning a target property value embedding vector numerically representing a target property value associated with the target property name. (MILLER [pg. 3-4, section 3.1 Model Description] teaches the following (expanded) formula: O = Σ_i softmax(AΦ_X(x) ⋅ AΦ_K(k_hi)) ⋅ AΦ_V(v_hi), where AΦ_X(x) represents the query, AΦ_K(k_hi) denotes the key, and AΦ_V(v_hi) denotes the target value stored in the knowledge base. Under broadest reasonable interpretation, the target property name embedding vector can be interpreted as AΦ_X(x), the condition property name match score can be interpreted as p_hi, and a target property value embedding vector numerically representing a target property value can be interpreted as the output vector O.)
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of RAJENDRAN, YIN, and MILLER before them, to include MILLER’s output vector calculation based on a knowledge base in RAJENDRAN and YIN’s neural retrieval mechanism. One would have been motivated to make such a combination in order to allow the model encode prior knowledge for the considered task and leverage possibly complex transforms between keys and values, while still being trained via stochastic gradient descent (MILLER [pg. 2, section 1 Introduction]).
However, RAJENDRAN in view of YIN and MILLER is not relied upon for teaching, but CHO teaches: based on […] the condition property value match score, […] calculating a target property value […] (CHO [pg. 399, section 4.4. Operand Selector] teaches: "Moreover, the operand selector uses the product of both the row scores pr_j^(t') and the column attention scores from Column SelRU aff_k^(t') (Eq. 3) to calculate the cell scores C_(j,k), which represents the probability of selecting the operands used when calculating the final answer (i.e., target property value), as shown in Eq. 7: C_(j,k) = pr_j^(t') × aff_k^(t'). At test time, we filter the cells using a threshold γ, where cells with scores C_(j,k) > γ are the selected operands." CHO [pg. 399, section 4.5. Operation Solver] teaches: "Finally, we use the cell scores C_(j,k) to solve all the operations available in the model." Examiner's note: Under broadest reasonable interpretation, the condition property value match score can be interpreted as pr_j^(t'), which denotes the scalar row score with values corresponding to the probability of selecting the row. Furthermore, a person having ordinary skill in the art would note that aff_k^(t') corresponds to the probability value of selecting a column, and thus is similar to MILLER's AΦ_K(k_hi) in the relevance probability calculation.)
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of RAJENDRAN, YIN, MILLER, and CHO before them, to include CHO's cell scores calculation in RAJENDRAN, YIN, and MILLER's neural retrieval mechanism. One would have been motivated to make such a combination in order to learn both the operand and the operation to calculate the final answer and provide an intuitive process for users who want to know how the answer was generated (CHO [pg. 396, section 4. Our model: Neural Operator]).
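The cell-score computation quoted from CHO, a product of per-row and per-column selection probabilities followed by threshold filtering, can be sketched as follows. The function names are hypothetical and the sketch is illustrative only:

```python
def cell_scores(row_scores, col_scores):
    # C[j][k] = pr_j * aff_k: outer product of per-row and per-column
    # selection probabilities (cf. CHO's Eq. 7).
    return [[pr * af for af in col_scores] for pr in row_scores]

def select_operands(scores, gamma):
    # Keep the (row, column) cells whose score exceeds the threshold gamma,
    # mirroring CHO's test-time filtering C[j][k] > gamma.
    return [(j, k) for j, row in enumerate(scores)
            for k, c in enumerate(row) if c > gamma]

C = cell_scores([0.9, 0.1], [0.8, 0.2])   # toy row/column probabilities
picked = select_operands(C, gamma=0.5)    # only the strongest cell survives
```

With the toy inputs above, only the cell combining the highest row and column probabilities exceeds the threshold.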
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over RAJENDRAN in view of YIN, MILLER, and CHO as applied respectively above to claims 1 and 4, and further in view of COHEN ("Scalable Neural Methods For Reasoning With A Symbolic Knowledge Base"), hereafter COHEN.
Regarding Claim 3:
RAJENDRAN in view of YIN, MILLER, and CHO teaches the elements of claim 2 as outlined above. RAJENDRAN in view of YIN, MILLER, and CHO is not relied upon for teaching, but COHEN teaches:
wherein the structured knowledge base comprises a knowledge base entity-entity relationship tensor numerically encoding relationships between entities, […] (COHEN [pg. 2, Table 1] teaches: "M_subj, M_obj, M_rel: the reified KB encoded as matrices mapping triple id l to subject, object, and relation ids." COHEN [pg. 2, section 2.1] teaches: "KBs, entities, and relations. A KB consists of entities and relations. We use x to denote an entity and r to denote a relation. Each entity has an integer index between 1 and N_E, where N_E is the number of entities in the KB, and we write x_i for the entity that has index i. A relation is a set of entity pairs, and represents a relationship between entities: for instance, if x_i represents “Quentin Tarantino” and x_j represents “Pulp Fiction” then (x_i, x_j) would be a member of the relation director_of. A relation r can thus be represented as a subset of {1, …, N_E} × {1, …, N_E}. Finally, a KB consists of a set of relations and a set of entities.")
[…] wherein the structured query comprises a target entity-entity relationship tensor, and [...]; (COHEN [pg. 6, section 3.2] teaches computing relation sets r_t = f_t(q) from the question q (i.e., structured query comprises a target entity-entity relationship tensor).)
[…] wherein the target property value embedding vector is calculated based on the knowledge base entity-entity relationship tensor and the target entity-entity relationship tensor. (COHEN [pg. 4, Equation 4] teaches the relation-set following operation follow(x, r), which uses the matrices in the reified KB as inputs (i.e., based on the knowledge base entity-entity relationship tensor) for computing an output vector, together with the relation set r computed from the question q (i.e., target entity-entity relationship tensor). COHEN [pg. 4, section 2.3] teaches: “so x must be a dense vector of size O(N_E), as is the output of relation-set following (i.e., target property value embedding vector).”)
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of RAJENDRAN, YIN, MILLER, CHO, and COHEN before them, to include COHEN’s sets of relations and entities in RAJENDRAN, YIN, MILLER, and CHO’s neural retrieval mechanism. One would have been motivated to make such a combination in order to enable neural KB inference modules scalable enough to use with realistically large KBs (COHEN [pg. 10, section 5]).
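COHEN's relation-set following can be illustrated on a toy triple store. The sketch below replaces the sparse-matrix form of COHEN's Equation 4 with an equivalent weighted walk over (subject, relation, object) triples; the names are hypothetical and the code is illustrative only, not COHEN's implementation:

```python
def follow(x_weights, r_weights, triples):
    # x_weights: weight per entity id (the weighted set of starting entities x).
    # r_weights: weight per relation id (the relation set r computed from the query).
    # Returns a weighted set over object entities reachable via matching triples.
    out = {}
    for subj, rel, obj in triples:
        w = x_weights.get(subj, 0.0) * r_weights.get(rel, 0.0)
        if w:
            out[obj] = out.get(obj, 0.0) + w
    return out

# Toy KB echoing COHEN's director_of example.
triples = [("tarantino", "director_of", "pulp_fiction"),
           ("tarantino", "born_in", "knoxville")]
result = follow({"tarantino": 1.0}, {"director_of": 1.0}, triples)
```

Starting from "Quentin Tarantino" and following the director_of relation yields only "Pulp Fiction", which corresponds to the dense output vector of relation-set following in the mapping above.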
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over RAJENDRAN in view of YIN, MILLER, and CHO as applied above to claim 4, and further in view of NEELAKANTAN ("Neural Programmer: Inducing Latent Programs With Gradient Descent"), hereafter NEELAKANTAN.
Regarding Claim 5:
RAJENDRAN in view of YIN, MILLER, and CHO teaches the elements of claim 4 as outlined above. CHO further teaches:
wherein the target property value […] is calculated based on: […] the second property value embedding vector weighted by the target property name match score. (CHO [pg. 399, section 4.4. Operand Selector] teaches Equation 7 for calculating the final answer (i.e., the target property value […]) as: C_(j,k) = pr_j^(t') × aff_k^(t'), where pr_j^(t') computes a per-row score probability of selecting a row. CHO [pg. 398, section 4.3. Row RNN] teaches the selected cell vector c̃_j^t = Σ_k af_k × c_(j,k), which calculates the weighted sum of the cell vectors using the computed column attention scores af_k (i.e., weighted by the target property name match score) and the cell vectors c_(j,k) (i.e., second property value embedding vector).)
MILLER further teaches: […] the target property value embedding vector is calculated based on […] property value embedding vector […] (MILLER [pg. 3-4, section 3.1] teaches the (expanded) formula: O = Σ_i softmax(AΦ_X(x) ⋅ AΦ_K(k_hi)) ⋅ AΦ_V(v_hi), where the values of the memories are read by taking their weighted sum using probabilities to compute the vector O. Both MILLER’s equation to compute the O vector and CHO’s equation 3 use softmax to compute an answer for a query or question. Therefore, it would have been obvious for a person of ordinary skill in the art to combine CHO’s and MILLER’s equations in search of a combination that could yield higher probabilities of selecting the correct answer.)
RAJENDRAN in view of YIN, MILLER, and CHO is not relied upon for teaching, but NEELAKANTAN teaches: wherein the target property value […] is calculated based on: […] a product of the condition property name match score with the condition property value match score (NEELAKANTAN [pg. 6, section 2.3] teaches the column selector α_t^col, which compares the question representation q against the column name representations P to produce a probability distribution over columns (i.e., condition property name match score) using a softmax function. NEELAKANTAN [pg. 7, section 2.3] teaches α_t^col(j) being multiplied (i.e., a product) with the comparison result g_t^i[j] (i.e., condition property value match score) inside the row selector variable calculation to compute the final answer as either scalar_answer_T (single numerical value output) or lookup_answer_T (matrix output).)
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of RAJENDRAN, YIN, MILLER, CHO, and NEELAKANTAN before them, to include NEELAKANTAN's row and column selectors in RAJENDRAN, YIN, MILLER, and CHO's neural retrieval mechanism. One would have been motivated to make such a combination in order to perform the selection process for finding the answer in a differentiable fashion so that the whole network can be trained jointly by gradient descent (NEELAKANTAN [pg. 1, Abstract]).
Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over RAJENDRAN in view of YIN, MILLER, and CHO as applied above to claim 11, and further in view of SUN ("CoLAKE: Contextualized Language and Knowledge Embedding"), hereafter SUN.
Regarding Claim 12:
RAJENDRAN in view of YIN, MILLER, and CHO teaches the elements of claim 11 as outlined above. RAJENDRAN in view of YIN, MILLER, and CHO is not relied upon for teaching, but SUN teaches:
wherein the training loss function encodes a joint masked modeling task. (SUN [pg. 3660, Abstract] teaches: "In this paper, we propose the Contextualized Language and Knowledge Embedding (CoLAKE), which jointly learns contextualized representation for both language and knowledge with the extended MLM objective." SUN [pg. 3661, section 1 Introduction] teaches: "Besides, we extend the masked language model (MLM) objective (Devlin et al., 2019) to the whole input graph. That is, apply the same masking strategy to word, entity, and relation nodes and training the model to predict the masked nodes based on the rest of the graph." SUN [pg. 3664, section 3.3 Pre-Training Objective] teaches: "The Masked Language Model (MLM) objective is to randomly mask some of tokens from the input and train the model to predict the original vocabulary id of the masked tokens based on their contexts. In this section, we extend the MLM from word sequences to WK graphs.”)
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of RAJENDRAN, YIN, MILLER, CHO, and SUN before them, to include SUN’s joint learning of representations with extended masked language model (MLM) in RAJENDRAN, YIN, MILLER, and CHO's neural retrieval mechanism. One would have been motivated to make such a combination in order to train the model and entities in the KG in a tractable (i.e., manageable) manner due to the large number of entities (SUN [pg. 3665, section 3.4 Model Training]).
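The masking strategy SUN describes (randomly mask input elements and train the model to predict the originals) can be sketched, for the token case, as follows. The names and masking rate are hypothetical; this is an illustrative sketch only, not SUN's implementation:

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", rate=0.15, seed=0):
    # Replace a random fraction of tokens with a mask symbol, recording
    # the original token at each masked position as the prediction target.
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < rate:
            masked.append(mask_token)
            targets[i] = tok
        else:
            masked.append(tok)
    return masked, targets

masked, targets = mask_tokens(
    "the quick brown fox jumps over the lazy dog".split(),
    rate=0.3, seed=0)
```

SUN's extended objective applies the same strategy not only to word nodes but also to entity and relation nodes of the input graph; the training loss is then the prediction error on the recorded targets.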
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over RAJENDRAN in view of YIN, MILLER, and CHO as applied above to claim 1, and further in view of HAGI (US 20190149565 A1), hereafter HAGI.
Regarding Claim 14:
RAJENDRAN in view of YIN, MILLER, and CHO teaches the elements of claim 1 as outlined above. RAJENDRAN in view of YIN, MILLER, and CHO is not relied upon for teaching, but HAGI teaches:
wherein the at least one first tensor embodies cybersecurity telemetry, the method comprising: (HAGI [0003] teaches: "The set of cybersecurity data can comprise numeric data and textual data collected from a plurality of computational sources.” HAGI [0042] teaches: "In some embodiments, data received from HTM network 130 and post-processed by feature extraction system 126 is output to a user interface 146 (e.g., as a warning, a score, a probability, an infographic, a chart, a report, etc.).")
generating, based on the target property name embedding vector, a detection output; (HAGI [0085] teaches: "In some embodiments, the anomaly detection system can use a database (e.g., database 128 of FIG. 1) to convert relevant text of the query to an appropriate VSM, tensor, and/or SDR as used by the HTM network." HAGI [0081] teaches: "In operation 514, the anomaly detection system presents output from operation 512. The output can comprise prediction(s), confidence(s), anomaly detection(s), pattern(s), warning(s), score(s), or a different output.")
performing a cybersecurity action based on the detection output. (HAGI [0055] teaches: "In operation 214, the anomaly detection system automatically (or in response to user input) mitigates an anomaly by reconfiguring a cybersecurity environment.")
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of RAJENDRAN, YIN, MILLER, CHO, and HAGI before them, to include HAGI’s set of cybersecurity data for processing in RAJENDRAN, YIN, MILLER, and CHO’s neural retrieval mechanism. One would have been motivated to make such a combination in order to transform data into homogeneous data (e.g., vector space models (VSMs), tensors, spatial-temporal multi-dimensional arrays, and/or sparse distributed representations (SDRs)) to improve accuracy in anomaly detection using otherwise incompatible data (HAGI [0021]).
Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over YIN as applied above to claim 16, and further in view of PETERS ("Knowledge Enhanced Contextual Word Representations"), hereafter PETERS.
Regarding Claim 18:
YIN teaches the elements of claim 16 as outlined above. YIN is not relied upon for teaching, but PETERS teaches:
wherein the training loss function encodes an entity linking task. (PETERS [pg. 46, section 3.3 KAR] teaches: "If entity linking (EL) supervision is available, we can compute a loss with the gold entity e_m^g.")
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of YIN and PETERS before them, to include PETERS’ entity linking supervision in YIN’s neural enquirer. One would have been motivated to make such a combination in order to enhance general purpose knowledge representations that can be applied to a wide range of downstream tasks (PETERS [pg. 43, section 1 Introduction]).
Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over YIN as applied above to claim 15, and further in view of WINN ("Enterprise Alexandria: Online High-Precision Enterprise Knowledge Base Construction with Typed Entities"), hereafter WINN.
Regarding Claim 19:
YIN teaches the elements of claim 15 as outlined above. YIN is not relied upon for teaching, but WINN teaches:
wherein the structured knowledge base additionally comprises a human-interpretable representation of each property name-value pair. (WINN [pg. 5, section 3.3] teaches: "However, the schema is modified to allow values of the Types property to be any string of 1-3 words, rather than one of the fixed set of known types." WINN [pg. 7, section 4. Incremental Clustering] teaches: "During this process, the human curator can asynchronously edit the knowledge base and/or add new entities. Both the AI mined and curated entities are included in Q_i allowing mined and curated entities to be linked together into coherent merged entities.")
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of YIN and WINN before them, to include WINN’s curated entities in YIN’s neural enquirer. One would have been motivated to make such a combination in order to allow entities in a knowledge base to have alternative names and allow them to be linked into coherent merged entities (WINN [pg. 5, section 3.2] & [pg. 5, section 3.3]).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Alvaro S Laham Bauzo whose telephone number is (571)272-5650. The examiner can normally be reached Mon-Fri 7:30 AM - 11:00 AM | 1:00 PM - 5:30 PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed can be reached on (571) 272-4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/A.S.L./Examiner, Art Unit 2146
/USMAAN SAEED/Supervisory Patent Examiner, Art Unit 2146