DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status
Claims 1-20 are pending in the instant application, No. 18/092,045.
Priority / Filing Date
No priority is claimed. The effective filing date of this application is December 30, 2022.
Abstract
The abstract is acceptable for examination purposes.
Drawings
The drawings received on December 30, 2022 are acceptable for examination purposes.
Information Disclosure Statement
As required by M.P.E.P. 609(C), the Applicant’s submission of the Information Disclosure Statement filed on October 24, 2023 is acknowledged by the Examiner, and the cited references have been considered in the examination of the claims. As required by M.P.E.P. 609(C)(2), a copy of the PTOL-1449 initialed and dated by the Examiner is attached to the instant Office action.
Claim Objections
Claim 11 is objected to for failing to provide proper antecedent basis for “…the computation of the semantic similarity…” since no such computation is recited previously in claim 1.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
The claimed invention in claims 1-20 is directed to a judicial exception (i.e., an abstract idea) without significantly more.
Claims 1-20 pass step 1 of the 35 U.S.C. 101 analysis since each claim is directed to either a method, a non-transitory computer readable medium, or an apparatus comprising a memory and at least one processor (i.e., hardware components; see [00107] of the instant specification).
Claims 1, 14, and 15 each recite, in part, elements that are directed to an abstract idea (“Courts have examined claims that required the use of a computer and still found that the underlying, patent-ineligible invention could be performed via pen and paper or in a person’s mind.” Versata Dev. Group v. SAP Am., Inc., 793 F.3d 1306, 1335, 115 USPQ2d 1681, 1702 (Fed. Cir. 2015)). Each claim recites the limitations of determining a count of unique values in a column of a database table; and performing a query on the database table…based on the count of unique values. The limitations, as drafted, recite a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components (e.g., mentally determining the count of unique values; and mentally running a search on a printed spreadsheet based on the count of unique values). That is, other than reciting generic components (e.g., processor, memory, and computer-executable instructions), nothing in the claims precludes the limitations from being performed in the human mind per step 2A – prong 1 of the Abstract Idea Analysis. Thus, the limitations are part of a mental process, since there are no additional elements for further consideration.
Claim 2 recites an additional element …determining the technique by comparing the count of unique values to a given threshold which is implementable in a human mind and/or with the aid of pen/paper similar to the above analysis (e.g., mentally comparing the count of unique values to a limit to perform the mental search). Thus, the claim is ineligible.
Claims 3, and 16 recite in each claim additional elements of accessing a vector corresponding to a predicate of the query; selecting a row of the database table; accessing an entry in the selected row…; accessing a stored vector…; computing…a semantic similarity…; and storing a result of the computed semantic similarity…, which are all implementable in a human mind and/or with the aid of pen/paper, similar to the above analysis (e.g., visually viewing a vector drawn on paper of a predicate of the query; mentally selecting a row of the database table; visually viewing an entry in the selected row; visually viewing a vector drawn on paper corresponding to the viewed entry; mentally computing the semantic similarity between the vectors; and writing down the result of the computed similarity on paper). Thus, the claims are ineligible. Even assuming that the stored vector and the storing of a result of the computed semantic similarity… involve physical storage, these elements are considered extra-solution activities (per step 2A – prong 2 of the Abstract Idea Analysis) that do not integrate the exception into a practical application (e.g., the elements recite trivial steps that occurred or would occur after the mental process), since such limitation(s) is/are no more than mere instructions to apply the exception using generic computer components (e.g., processor, memory, and computer-executable instructions). The extra-solution activities identified in step 2A – prong 2 are reevaluated in step 2B to determine whether each limitation is more than what is well-understood, routine, conventional activity in the field. The background of the limitations does not provide any indication that the computer components (e.g., processor, memory, and computer-executable instructions) are anything other than off-the-shelf computer components.
The Symantec, TLI, and OIP Techs court decisions cited in MPEP 2106.05(d)(II) indicate that mere receiving, generating, storing, determining, identifying, and transmitting of data over a network are well-understood, routine, and conventional functions when claimed in a merely generic manner (as they are here). Accordingly, a conclusion that the claimed activity is well-understood, routine, and conventional is supported under Berkheimer Option 2. For these reasons, there is no inventive concept in each claim; thus, the claims are ineligible.
Claim 4 recites an additional element …accessing the stored result in response to the entry…being a repeated occurrence… which is implementable in a human mind and/or with the aid of pen/paper similar to the above analysis (e.g., visually viewing the computed result written on paper based on an occurrence condition when performing the mental search). Thus, the claim is ineligible.
Claim 5 merely defines a range of the result corresponding to a range of similarity for the mental search analyzed above. Thus, the claim is ineligible.
Claims 6, and 17 recite in each claim additional elements of pre-calculating a plurality of semantic similarities by: accessing…a vector corresponding to each unique value of the unique pair…; computing…a semantic similarity between the two…vectors; and storing…a result of the computed semantic similarity which are all implementable in a human mind and/or with the aid of pen/paper similar to the above analysis (e.g., visually viewing a vector drawn on paper corresponding to each unique value of the unique pair; mentally computing a semantic similarity between the viewed vectors for each unique pair; and writing down result of the computed semantic similarity on paper). Thus, the claims are ineligible.
Claims 7, and 18 recite in each claim additional elements of accessing a predicate of the query; selecting a row of the database table; accessing an entry in the selected row…; accessing the stored result… which are all implementable in a human mind and/or with the aid of pen/paper similar to the above analysis (e.g., visually viewing a predicate of the query, mentally selecting a row of the database table; visually viewing an entry in the selected row; visually viewing the stored result). Thus, the claims are ineligible.
Claims 8, and 19 recite in each claim additional elements of accessing an entry in each row…; accessing a stored vector corresponding to a specific value…; clustering rows of the database table together…; storing a cluster identifier and a corresponding centroid value… which are all implementable in a human mind and/or with the aid of pen/paper similar to the above analysis (e.g., visually viewing an entry in each row; visually viewing a stored vector corresponding to a specific value of each entry; mentally grouping the rows of the database table together; and writing down on paper a cluster ID and a representative value of the cluster for each candidate set). Thus, the claims are ineligible.
Claims 9, and 20 recite in each claim additional elements of accessing a vector corresponding to a predicate of the query; identifying one or more of the candidate sets having a centroid value most similar to the vector…; …performing the query…on only rows of the database tables…included…in the one or more…set which are all implementable in a human mind and/or with the aid of pen/paper similar to the above analysis (e.g., visually viewing a vector corresponding to a predicate of the query; mentally identifying one or more of the candidate sets based on a centroid value and the vector; mentally performing the search based on the rows of the database tables and the candidate sets). Thus, the claims are ineligible.
Claim 10 merely recites that the clustering is k-means clustering, specifying a k-means algorithm to be utilized for the mental search analyzed above. Thus, the claim is ineligible.
Claim 11 merely recites that the computation of the semantic similarity is implemented with a dot product calculation, specifying a dot product to be utilized for the mental computation analyzed above. Thus, the claim is ineligible.
Claim 12 recites additional elements of converting each unique entry…to a corresponding vector; and storing each corresponding vector in a vector table indexed by a value…, which are implementable in a human mind and/or with the aid of pen/paper similar to the above analysis (e.g., mentally converting each entry into a vector; and writing down each vector in a table format on paper with a certain index value). Thus, the claim is ineligible.
Claim 13 recites an additional element of performing the query further comprises processing rows of a candidate set in batches…, which is implementable in a human mind and/or with the aid of pen/paper similar to the above analysis (e.g., mentally processing multiple rows of a candidate set to perform the mental search). Thus, the claim is ineligible.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-2, 14, and 17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Hu et al. (Pub. No. US 2022/0050843, published on February 17, 2022; hereinafter Hu).
Regarding claims 1, 14, and 17, Hu clearly shows and discloses a method (Abstract); a non-transitory computer readable medium comprising computer executable instructions which when executed by a computer cause the computer to perform the method; and an apparatus comprising: a memory; and at least one processor, coupled to said memory, and operative to perform operations (Figure 6) comprising:
determining a count of unique values in a column of a database table (embodiments described herein maintain statistics including the cardinality and data size of columns used by steps of the execution plans in the execution phase, [0041]. A DBMS may obtain statistics, including cardinality data, from single-dimensional or multi-dimensional histograms or histograms of the table, [0045]); and
performing a query on the database table, wherein a technique for performing the query is selected based on the count of unique values (The cardinality and data size values and other stored and/or estimated statistics may be used by a query rewrite function of a cost-based optimizer in the compilation phase to generate a current best execution plan for the query, [0041]. Example embodiments attempt to choose the best execution plan for a query in training mode from among different execution plans generated for the query based on the actual cost of the execution plans and, optionally, on stored and estimated cardinalities determined for the steps of the execution plan, [0051]. The plan selection module 124 selects the execution plan 122 having the smallest cost. As these costs are based on statistics including cardinality estimates, better statistics can improve the performance of the plan selection module 124, [0070]).
Regarding claim 2, Hu further discloses the performing the query further comprises determining the technique by comparing the count of unique values to a given threshold (The plan selection module 124 selects the execution plan 122 having the smallest cost. As these costs are based on statistics including cardinality estimates, better statistics can improve the performance of the plan selection module 124, [0070]. As each execution plan is executed, the method 200 determines or updates a cost of the current execution plan and compares the cost of the current execution plan to the cost of a best execution plan. When the cost of the current execution plan is lower, the method 200 identifies the current execution plan as the best execution plan, [0103]).
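For illustration only (a minimal sketch with hypothetical table data, column names, and threshold, drawn neither from the claims as filed nor from Hu), the mapped concept of determining a count of unique values in a column and selecting a query technique by threshold comparison can be expressed as:

```python
# Illustrative sketch: count distinct values in a column, then pick a
# (hypothetical) query strategy based on a threshold comparison.

def count_unique(table, column):
    """Return the number of distinct values appearing in the given column."""
    return len({row[column] for row in table})

def select_technique(table, column, threshold=3):
    """Choose a hypothetical query strategy from the unique-value count."""
    count = count_unique(table, column)
    return "precompute_pairs" if count <= threshold else "row_by_row"

table = [
    {"city": "NY"}, {"city": "LA"}, {"city": "NY"}, {"city": "SF"},
]
print(count_unique(table, "city"))      # 3
print(select_technique(table, "city"))  # precompute_pairs
```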
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 3, 5, 8-11, 13, 16, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Hu in view of Bandyopadhyay et al. (Pub. No. US 2018/0267977, published on September 20, 2018; hereinafter Bandyopadhyay).
Regarding claims 3, and 16, Bandyopadhyay then discloses:
accessing a vector corresponding to a predicate of the query (Some embodiments enable CI queries using the word vectors in the vector space as user-defined functions (UDFs), [0027]. A query to identify similar customers would examine the word vectors for each customer (i.e. custA, custB, custC, custD), [0030]-[0031]);
selecting a row of the database table (To prepare the WFFD for CI queries, the nutrients were partitioned into groups (vitamins, amino acids, etc.). The numeric values were grouped into clusters using K-means, and the word2Vec model was trained using 200 dimensions, [0049]);
accessing an entry in the selected row (A query to identify similar customers would examine the word vectors for each customer (i.e. custA, custB, custC, custD). So, for custD, the relevant row (tuple) 404 would be “custD 9/16 Walmart NY Stationery ‘Crayons, Folders’ 25”, [0030]), the entry corresponding to a column identified by the query (The columns contain information such as ingredients, categories, nutrients, etc., [0048]. Similarity queries were run over ingredients (text), nutrients (text) and country (text), [0049]);
accessing a stored vector corresponding to a specific value of the accessed entry in the selected row (In the vector space, the word vector of custD is more similar to the word vector of custB as both bought stationery, including crayons. Likewise, the word vector of custA is more similar to the word vector of custC as both bought fresh produce, including bananas, [0030]);
computing, using cosine similarity (When comparing two sets of vectors, similarity UDFs may be used to output a scalar similarity value. Similarity measures between any pair of vectors are determined using cosine and max-norm algorithms, [0045]), a semantic similarity between the vector corresponding to the predicate of the query and the stored vector corresponding to the entry in the selected row in response to the entry in the selected row being a first occurrence of encountering the specific value during the performance of the query (a query to identify similar customers would determine that custA is more similar to custD as both purchased goods in NY on 9/16 for similar amounts. Likewise, custB is now more similar to custC as both purchased goods on 10/16 for similar amounts, [0030]-[0031]); and
storing a result of the computed semantic similarity in response to computing the semantic similarity (Results for products having similar ingredients and similar nutrients in similar countries is shown in Table 2. For example, Special K original from Kellogg's is similar to Crispy Flakes with Red Berries Cereal from Market Pantry in the United States, [0051]-[0052]).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Bandyopadhyay with the teachings of Hu for the purpose of adapting a relational database containing multiple data types to enhance processing of a query based on relationships amongst a set of representative vectors such that relevant results are returned corresponding to the query.
Regarding claim 5, Bandyopadhyay further discloses the result varies between 1.0 and -1.0, with 1.0 representing a greatest similarity and -1.0 representing a smallest similarity (When comparing two sets of vectors, similarity UDFs may be used to output a scalar similarity value. Similarity measures between any pair of vectors are determined using cosine, [0045]. It is well-known that similarity scores calculated using cosine similarity (i.e., the cosine of the angle between two vectors) range from -1 to 1, with 1 being maximum similarity and -1 being maximum dissimilarity).
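As an illustrative aside (hypothetical vectors, not taken from Bandyopadhyay), the well-known cosine-similarity bounds noted above can be confirmed with a minimal sketch:

```python
# Illustrative sketch of cosine similarity: scores range from -1.0
# (opposite directions) to 1.0 (identical direction).
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))   # 1.0  (maximum similarity)
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # -1.0 (maximum dissimilarity)
```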
Regarding claims 8, and 19, Bandyopadhyay then discloses:
accessing an entry in each row of the database table, each entry corresponding to a given column (A query to identify similar customers would examine the word vectors for each customer (i.e. custA, custB, custC, custD), [0030]-[0031]);
accessing a stored vector corresponding to a specific value of each accessed entry (So, for custD, the relevant row (tuple) 404 would be “custD 9/16 Walmart NY Stationery ‘Crayons, Folders’ 25”. In the vector space, the word vector of custD is more similar to the word vector of custB as both bought stationery, including crayons, [0030]-[0031]);
clustering rows of the database table together into a plurality of candidate sets based on a semantic similarity of the accessed vectors (To prepare the WFFD for CI queries, the nutrients were partitioned into groups (vitamins, amino acids, etc.). The numeric values were grouped into clusters using K-means, and the word2Vec model was trained using 200 dimensions. Similarity queries were run over ingredients (text), nutrients (text) and country (text). Both the single-model and the ensemble approach were used, [0049]); and
storing a cluster identifier and a corresponding centroid value for each candidate set (For example, in a relational database having a number representing sales dollars, the actual dollar amount is converted to a cluster ID and expressed as “sales_clusterlD”. So, an actual token value of 5000 may be expressed as “sales_272” where 272 is the cluster ID of the cluster containing 5000, [0038]).
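As an illustrative aside (hypothetical one-dimensional data; the naive initialization and fixed iteration count are assumptions of this sketch), the mapped concept of clustering rows and recording a cluster identifier with a corresponding centroid can be sketched as:

```python
# Illustrative sketch: group 1-D values into k clusters with a minimal
# k-means loop, yielding a cluster ID -> member list mapping plus the
# centroid (mean) stored for each cluster.
def kmeans_1d(values, k=2, iters=10):
    centroids = sorted(values)[:k]  # naive init: first k sorted values
    for _ in range(iters):
        clusters = {i: [] for i in range(k)}
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in clusters.items()]
    return centroids, clusters

values = [1.0, 1.2, 0.9, 8.0, 8.2, 7.9]
centroids, clusters = kmeans_1d(values)
# Two candidate sets emerge, with centroids near 1.03 and 8.03.
```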
Regarding claims 9, and 20, Bandyopadhyay further discloses:
accessing a vector corresponding to a predicate of the query (Some embodiments enable CI queries using the word vectors in the vector space as user-defined functions (UDFs), [0027]. A query to identify similar customers would examine the word vectors for each customer (i.e. custA, custB, custC, custD), [0030]-[0031]);
identifying one or more of the candidate sets having a centroid value most similar to the vector corresponding to the predicate (To prepare the WFFD for CI queries, the nutrients were partitioned into groups (vitamins, amino acids, etc.). The numeric values were grouped into clusters using K-means, and the word2Vec model was trained using 200 dimensions. Similarity queries were run over ingredients (text), nutrients (text) and country (text), [0049]-[0051]. Results for products having similar ingredients and similar nutrients in similar countries is shown in Table 2. For example, Special K original from Kellogg's is similar to Crispy Flakes with Red Berries Cereal from Market Pantry in the United States., [0052]); and
wherein the performing of the query on the database table is performed on only rows of the database table that are included in the one or more identified candidate sets (For the ensemble approach, more than one embedding model or clustering strategy (discussed below) are used for different data types (e.g., latitude/longitude, images or time-series). A default clustering approach or user-provided similarity functions may be used. The results are computed for each model or clustering group and final results are computed by merging multiple result sets. The final results are merged by finding the intersection between row-sets that represent results for each clustering group, [0034]).
Regarding claim 10, Bandyopadhyay further discloses the clustering is k-means clustering (Any traditional clustering algorithm may be used to cluster data (e.g., K-means, hierarchical clustering, etc.), [0038]).
Regarding claim 11, Bandyopadhyay then discloses the computation of the semantic similarity is implemented with a dot product calculation (When comparing two sets of vectors, similarity UDFs may be used to output a scalar similarity value. Similarity measures between any pair of vectors are determined using cosine, [0045]. It is well-known that, by definition, the cosine similarity is the dot product of two vectors divided by the product of their magnitudes).
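As an illustrative aside (hypothetical vectors), the well-known relationship cited above, that cosine similarity is a dot product divided by the product of the magnitudes, can be sketched as:

```python
# Illustrative sketch: cosine similarity equals the dot product of the
# two unit-normalized vectors, i.e., dot(a, b) / (|a| * |b|).
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

a, b = [3.0, 4.0], [4.0, 3.0]
cosine = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
via_dot = dot(normalize(a), normalize(b))
print(round(cosine, 6) == round(via_dot, 6))  # True
```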
Regarding claim 13, Bandyopadhyay further discloses the performing the query further comprises processing rows of a candidate set in batches (For the ensemble approach, more than one embedding model or clustering strategy (discussed below) are used for different data types (e.g., latitude/longitude, images or time-series). A default clustering approach or user-provided similarity functions may be used. The results are computed for each model or clustering group and final results are computed by merging multiple result sets. The final results are merged by finding the intersection between row-sets that represent results for each clustering group, [0034]).
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Hu in view of Bandyopadhyay and further in view of Hu et al. (Pub. No. US 2021/0248258, published on August 21, 2021; hereinafter Hu II).
Regarding claim 4, Hu II then discloses accessing the stored result in response to the entry in the selected row being a repeated occurrence of encountering the specific value during the performance of the query (Historical access result data may be retrieved from the data structure for each of the subset of detectors used for the current access request. Historical access result data may include a history of outcomes (e.g., accepted or rejected) for past access requests and a plurality of data elements. In some embodiments, the one or more data elements associated with the current access request may not be the same or only a subset of the plurality of data elements within the historical access result data. Furthermore, the historical access result data may be stored within a vector or in any other suitable manner, [0134]).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Hu II with the teachings of Hu, as modified by Bandyopadhyay, for the purpose of utilizing a stored vector of historical result values to determine whether a past result can be reused based on matching attributes associated with a current query.
Claims 6, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Hu in view of Douglas (Pub. No. US 2018/0374563, published on December 27, 2018).
Regarding claims 6, and 17, Douglas then discloses comprising pre-calculating a plurality of semantic similarities by:
accessing, for each unique pair of unique values from a column of the database table, a vector corresponding to each unique value of the unique pair (facilitating record matching and entity resolution and for enabling improvements in record linkage including determining records that refer to the same entity or individual as one or more other records in a collection of records that are stored in a computer system and detecting matches of a new record with one or more others that already exist and are stored in online databases. In an embodiment, a phenotypic bit-vector “fingerprint” pattern-specific weight is incorporated into conventional record linkage methods to enhance the record linkage accuracy and statistical performance, [0012], [0029]), the column being identified by the query (for each entity, a record linkage weight (rl_wt), shown in column 307, is determined for each row. Column 310 shows a combined composite weight of the rl_wt and ps_wt. In this example embodiment, RMS is used to determine the composite weight or score. Furthermore, here the scores are normalized to (0,1), [0072]-[0073]);
computing, for each unique pair of unique values, a semantic similarity between the two accessed vectors (Binary fingerprints are formed by (a) constructing bit-vectors (“fingerprints”) by calculating similarities or distances for each such combination and combining each pairwise similarity or distance with the corresponding conventional record-linkage weights, such as by using a root-mean-square, dot product cosine measure, [0014]. Outputting the unique identifiers of record matches identified by the pair-wise matching algorithm, [0088]); and
storing, for each unique pair of unique values, a result of the computed semantic similarity (modified Tanimoto similarity determination from steps associate with 290, such as by fingerprint calculation, may be treated as one marker, indicator, or ‘weight’ that measures the similarity of a record associated with the current entity to records from putative matching entities stored in the target database, [0067]).
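As an illustrative aside (hypothetical values and vectors, not taken from Douglas), pre-calculating and storing a semantic similarity for each unique pair of unique column values can be sketched as:

```python
# Illustrative sketch: pre-compute a cosine similarity for every unique
# pair of unique column values and cache the results keyed by the pair.
from itertools import combinations
import math

vectors = {"apple": [1.0, 0.0], "pear": [0.8, 0.6], "nail": [0.0, 1.0]}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

similarity_cache = {
    frozenset(pair): cosine(vectors[pair[0]], vectors[pair[1]])
    for pair in combinations(vectors, 2)
}
print(round(similarity_cache[frozenset(("apple", "pear"))], 6))  # 0.8
```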
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Douglas with the teachings of Hu for the purpose of facilitating record matching for enabling improvements in record linkage, including determining records that refer to the same entity as one or more other records in a collection of records that are stored in a computer system and detecting matches of a new record with one or more others that already exist and are stored in different databases.
Claims 7, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Hu in view of Douglas in view of Bandyopadhyay.
Regarding claims 7, and 18, Bandyopadhyay then discloses:
accessing a predicate of the query (Some embodiments enable CI queries using the word vectors in the vector space as user-defined functions (UDFs), [0027]. A query to identify similar customers would examine the word vectors for each customer (i.e. custA, custB, custC, custD), [0030]-[0031]);
selecting a row of the database table (To prepare the WFFD for CI queries, the nutrients were partitioned into groups (vitamins, amino acids, etc.). The numeric values were grouped into clusters using K-means, and the word2Vec model was trained using 200 dimensions, [0049]);
accessing an entry in the selected row (A query to identify similar customers would examine the word vectors for each customer (i.e. custA, custB, custC, custD). So, for custD, the relevant row (tuple) 404 would be “custD 9/16 Walmart NY Stationery ‘Crayons, Folders’ 25”, [0030]), the entry corresponding to a column identified by the query (The columns contain information such as ingredients, categories, nutrients, etc., [0048]. Similarity queries were run over ingredients (text), nutrients (text) and country (text), [0049]); and
accessing the stored result corresponding to the unique pair that includes both the predicate and the entry (In the vector space, the word vector of custD is more similar to the word vector of custB as both bought stationery, including crayons. Likewise, the word vector of custA is more similar to the word vector of custC as both bought fresh produce, including bananas, [0030]. When comparing two sets of vectors, similarity UDFs may be used to output a scalar similarity value. Similarity measures between any pair of vectors are determined using cosine and max-norm algorithms, [0045]. CI queries may be used in a number of retail cases, such as customer analytics to find similar customers based on buying patterns (e.g., purchased items, frequency, amount spent, etc.). CI queries may also be used for advanced sales predictions using external data to predict sales of a new item being introduced based on sales of related or similar items currently being sold. CI queries may also be used to analyze historical sales data using external data, [0051]-[0052]).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Bandyopadhyay with the teachings of Hu, as modified by Douglas, for the purpose of adapting a relational database containing multiple data types to enhance processing of a query based on relationships amongst a set of representative vectors such that relevant results are returned corresponding to the query.
Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Hu in view of Cason et al. (Pub. No. US 2020/0410050, published on January 27, 2019; hereinafter Cason).
Regarding claim 12, Cason then discloses: converting each unique entry in the database table to a corresponding vector; and storing each corresponding vector in a vector table indexed by a value of the unique entry (vector table 617 includes vectors 618A and 618B. Vector 618A is the result of one-hot encoding of row 608I in consideration table 614, and vector 618B is the result of one-hot encoding of row 608J in consideration table 614. Vector table 617 has been constructed as though the traversals that produced rows 608I and 608J are the only traversals present in the training data set. Thereby, the columns 610 represent every unique value present in consideration table 614 (i.e., every combination of an attribute type and the attribute value is represented, although not all of the columns 610 are visible in FIG. 6C), [0162]).
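As an illustrative aside (the row keys and attributes here are hypothetical, loosely echoing Cason's rows 608I/608J rather than reproducing them), one-hot encoding every unique attribute/value combination of a table into a vector table indexed by row key can be sketched as:

```python
# Illustrative sketch: one-hot encode table rows so that every unique
# (attribute, value) combination becomes a column, and each row becomes
# a 0/1 vector stored in a vector table indexed by its row key.
rows = {
    "608I": {"color": "red", "size": "S"},
    "608J": {"color": "blue", "size": "S"},
}
# Columns: every unique attribute/value combination across the table.
columns = sorted({(attr, val) for r in rows.values() for attr, val in r.items()})
vector_table = {
    key: [1 if r.get(attr) == val else 0 for attr, val in columns]
    for key, r in rows.items()
}
print(columns)              # [('color', 'blue'), ('color', 'red'), ('size', 'S')]
print(vector_table["608I"])  # [0, 1, 1]
```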
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Cason with the teachings of Hu for the purpose of converting the training data into training conversion tables and using one-hot encoding of the training conversion tables to generate training vectors to accurately apply classification to data within a query pipeline.
Relevant Prior Art
The following prior art references are deemed relevant to the claims:
Vadlamani et al. (Pub. No. US 2011/0196851) teaches presenting and generating lateral concepts in response to a query from a user. The lateral concepts are presented in addition to search results that match the user query. Categories associated with the content are identified by the lateral concept generator. The lateral concept generator also obtains additional content associated with each category. A comparison between the retrieved content and the additional content is performed by the lateral concept generator to assign scores to each identified category. The lateral concept generator selects several categories based on scores assigned to content corresponding to each category and returns the retrieved content and several categories as lateral concepts.
Shmueli et al. (Pub. No. US 2023/0061341) teaches managing a dataset of records, each record associated with set(s) of vectors of real numbers that encode an approximation of lineage of the respective record, the set(s) of vectors computed by an encoding process, obtaining result record(s) in response to executing a query on the dataset, computing set(s) of vectors for the result record(s), searching the set(s) of vectors on the records of the dataset to identify a record associated with a subset of vectors that are statistically similar to the set(s) of vectors for the result record(s), and providing a subset of the records corresponding to the identified subset of records, the subset of the records having a likelihood of contributing to the existence of the result record(s) in response to execution of the query.
Contact Information
Any inquiry concerning this communication or earlier communications from the Examiner should be directed to Son Hoang whose telephone number is (571) 270-1752. The Examiner can normally be reached on Monday – Friday (7:00 AM – 4:00 PM).
If attempts to reach the Examiner by telephone are unsuccessful, the Examiner’s supervisor, Sherief Badawi can be reached on (571) 272-9782. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SON T HOANG/
Primary Examiner, Art Unit 2169
January 18, 2026