DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status
The instant application, No. 19/209,870, has claims 1-20 pending.
Priority / Filing Date
Applicant’s claim for priority of provisional application No. 63/651,210 (filed on May 23, 2024) is acknowledged. The effective filing date of this application is May 23, 2024.
Abstract
The abstract of the disclosure is acceptable for examination purposes.
Drawings
The drawings received on May 16, 2025 are acceptable for examination purposes.
Information Disclosure Statement
As required by M.P.E.P. 609(C), the Applicant’s submissions of the Information Disclosure Statements filed on 21 May 2025, 23 May 2025, and 2 July 2025 are acknowledged by the Examiner, and the cited references have been considered in the examination of the claims. As required by M.P.E.P. 609(C)(2), a copy of each of the PTOL-1449s, initialed and dated by the Examiner, is attached to the instant Office action.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the claims at issue are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the reference application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/forms/. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to http://www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-20 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-18 of co-pending Application No. 19/209,875.
Claims 1-20 of the instant application recite limitations similar to those of claims 1-18 of ‘875, as compared in the table below. For the purpose of illustration, only claims 1-7 (system claims) of the instant application are compared to the claims of the co-pending application (underlining is used to indicate conflicting limitations). The remaining claims of the instant application recite different statutory categories (i.e., method and medium claims) and are therefore not compared, for simplicity.
Instant Application
App. No. US 19/209,875
Instant Application, Claim 1
A computing system for tabular data models using localized context, comprising one or more processors configured to execute instructions; and a non-transitory computer-readable storage medium containing instructions executable by the one or more processors for:
identifying a query to apply a tabular data model to a query data point;
identifying a set of query domain data including a plurality of domain data points associated with a domain of the query;
selecting a local context of context points from the set of query domain data based on a distance of the context points to the query data point; and
applying the local context and query data point to a trained tabular data model to generate a data point classification of the query data point.
‘875 Application, Claim 1
A computing system for training a tabular data model with localized context, comprising: one or more processors configured to execute instructions; and a non-transitory computer-readable storage medium containing instructions executable by the one or more processors for:
selecting a training data point from a set of training data points for a domain of tabular data;
identifying a subset of data points in the set of training data that form a neighborhood around the training data point;
selecting a context and a plurality of query points from the subset of data points that form the neighborhood around the training data point; and
training parameters of a tabular data model with a training batch including the context and the plurality of query points.
See further Barel and Moon below for mapping and motivation to combine with the claims of ‘875.
Instant Application, Claim 2
The computing system of claim 1, wherein the trained tabular data model is trained with data different from the query domain data.
See Moon below for mapping and motivation to combine with the claims of ‘875.
Instant Application, Claim 3
The computing system of claim 1, wherein the trained tabular data model is not trained with data points in the query domain data.
See Moon below for mapping and motivation to combine with the claims of ‘875.
Instant Application, Claim 4
The computing system of claim 1, wherein the tabular data model is a transformer architecture having an attention layer that attends to the local context.
‘875 Application, Claim 4
The computing system of claim 1, wherein the tabular data model is a transformer model.
‘875 Application, Claim 5
The computing system of claim 1, wherein training parameters of the tabular data model comprises masking attention between the plurality of query points during application of the tabular data model.
Instant Application, Claim 5
The computing system of claim 1, wherein selecting the local context comprises determining a number of nearest data points in the set of query domain data to the query data point.
‘875 Application, Claim 2
The computing system of claim 1, wherein identifying the subset of data points comprises selecting nearest-neighbors of the identified training data point as the neighborhood.
Instant Application, Claim 6
The computing system of claim 5, wherein the number of nearest data points is dynamically determined based on the distance of the respective data points in the set of query domain data to the query data points.
‘875 Application, Claim 3
The computing system of claim 1, wherein a number of the subset of data points varies based on the distance of data points to the training data point.
Instant Application, Claim 7
The computing system of claim 1, wherein the distance of a context data point to the query data point is measured in a tabular data space of the query data point.
See Barel below for mapping and motivation to combine with the claims of ‘875.
Although the conflicting claims are not identical, they are not patentably distinct from each other because they are substantially similar in scope and use similar limitations to produce the same end result of providing local context for context-based tabular classification.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the elements of claims 1-18 of ‘875 with any combination of the cited references below to arrive at claims 1-20 of the instant application, for the purpose of applying a tabular transformer to a proximity-based neighborhood to create a localized structure for scaling to large data tables and thereby achieve a more resource-efficient model.
Further, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify or to omit the additional elements of claims 1-18 of ‘875 to arrive at claims 1-20 of the instant application, because such a person would have realized that the remaining elements would perform the same functions as before. “Omission of element and its function in combination is obvious expedient if the remaining elements perform same functions as before.” See In re Karlson, 136 USPQ 184 (CCPA), decided January 16, 1963, Appeal No. 6857, U.S. Court of Customs and Patent Appeals.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are directed to a judicial exception (i.e., an abstract idea) without significantly more.
Claims 1-20 pass step 1 of the 35 U.S.C. 101 analysis since each claim is directed to either a method; a system comprising one or more processors and a non-transitory computer-readable storage medium; or a non-transitory computer-readable medium.
Claims 1, 8, and 15 each recite, in part, elements that are directed to an abstract idea (“Courts have examined claims that required the use of a computer and still found that the underlying, patent-ineligible invention could be performed via pen and paper or in a person’s mind.” Versata Dev. Group v. SAP Am., Inc., 793 F.3d 1306, 1335, 115 USPQ2d 1681, 1702 (Fed. Cir. 2015)).
Each claim recites the limitations of identifying a query to apply a tabular data model…; identifying a set of query domain data…; selecting a local context… The limitations, as drafted, recite a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components (e.g., mentally identifying a query to apply to a tabular data model; mentally identifying a set of query domain data; and mentally selecting a local context based on a distance to the query data point). That is, other than reciting generic components (e.g., processor, memory, and computer-executable instructions), nothing in the claim precludes the limitations from being performed in the human mind to implement mathematical algorithms and proximity calculations per step 2A – prong 1 of the Abstract Idea Analysis.
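To illustrate this point, the recited steps reduce to simple comparisons of the kind that could be carried out with pen and paper. The following sketch is purely hypothetical (the data values and the squared-distance helper are illustrative assumptions, not the Applicant’s implementation):

```python
# Illustrative only: toy values standing in for a query point and its domain data.
query_point = (2.0, 3.0)                            # identify a query data point
domain_data = [(1.0, 1.0), (2.5, 3.5), (9.0, 9.0)]  # identify query domain data

def squared_distance(a, b):
    """Pen-and-paper arithmetic: sum of squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Select a local context of the points nearest the query, by distance.
local_context = sorted(domain_data, key=lambda p: squared_distance(p, query_point))[:2]
print(local_context)  # the two domain points closest to the query
```

Each step is a generic comparison or sort; no specialized hardware or non-conventional computer function is implicated.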
Each claim further recites an additional step of applying the local context…to a trained tabular data model to generate a data point classification…, which is an extra-solution activity implementing a high-level functional requirement that does not provide a specific technical improvement to the computer’s internal operation. The step only uses a generic computer as a tool (i.e., trained tabular data model) to apply the local context to generate a data point classification. The claims are drafted at a high level of generality to describe what the system does (e.g., applying a local context to a trained generic tabular model) rather than how the computer’s hardware or software architecture is fundamentally improved. Simply applying a mathematical selection process to an existing machine learning model does not constitute a technical improvement to the computer’s functionality itself. Thus, the claim does not pass step 2A – prong 2 of the Abstract Idea Analysis since each of the additional limitation(s) is no more than mere instructions to apply the exception using a generic computer component (e.g., processor, memory, and computer-executable instructions).
The extra-solution activity identified in step 2A – prong 2 is reevaluated in step 2B to determine whether each limitation is more than what is well-understood, routine, conventional activity in the field (i.e., nearest-neighbor selection, or kNN based on a distance, is a well-known and conventional technique in data science). Simply applying an abstract idea (i.e., local context selection) using existing, well-known machine learning architectures does not constitute an inventive concept. The background of the limitations does not provide any indication that the computer components (e.g., processor, memory, and computer-executable instructions) are anything other than off-the-shelf computer components. The Symantec, TLI, and OIP Techs. court decisions cited in MPEP 2106.05(d)(II) indicate that mere receiving, generating, storing, determining, identifying, and transmitting of data over a network are well-understood, routine, and conventional functions when claimed in a merely generic manner (as they are here). Accordingly, a conclusion that the claims recite well-understood, routine, conventional (WURC) activity is supported under Berkheimer Option 2. For these reasons, there is no inventive concept in each claim; thus, the claims are ineligible.
Claims 2, 9, and 16 further recite in each claim …the trained tabular data model is trained with data different from the query domain data. This limitation merely describes a transfer learning or pre-training data management strategy. Limiting the source of data for a mathematical model does not change the underlying nature of the process from an abstract idea to a practical application. At the core, the limitation remains an abstract concept of comparing new data to a previously established mathematical pattern. Thus, the claims are ineligible.
Claims 3, 10, and 17 further recite in each claim …the trained tabular data model is not trained with data points in the query domain data. This limitation is merely a negative limitation regarding the data set. Identifying what is not in a training set is a mental determination that does not provide a technical solution to a computer-centric problem. This limitation is a common feature of zero-shot or in-context learning, which is a known mathematical methodology. Thus, the claims are ineligible.
Claims 4, 11, and 18 further recite in each claim the tabular data model is a transformer architecture having an attention layer… The recitation of a specific neural network architecture (i.e., transformer) and its internal mechanism (i.e., attention) constitutes a field-of-use limitation to a specific technological environment. Simply implementing an abstract idea on a specific well-known computer architecture does not transform the idea into patent-eligible subject matter. Thus, the claims are ineligible.
Claims 5, 12, and 19 further recite in each claim selecting the local context comprises determining a number of nearest data points in the set of query domain data… The selection of k-nearest neighbors is a classic mathematical algorithm used for data classification. Choosing a subset of data based on proximity is a mathematical concept that can be performed manually or through general algorithmic steps. It does not offer an inventive step beyond the abstract idea of organizing information based on similarity. Thus, the claims are ineligible.
Claims 6, 13, and 20 further recite in each claim the number of nearest data points is dynamically determined based on the distance of the respective data points… Adding a dynamic aspect to a mathematical calculation (e.g., increasing the sample size if points are far away) is at best a mathematical refinement. This logic (i.e., if distance > X, then increase k) is a basic logical/mathematical heuristic. It does not result in an improvement to the computer’s hardware or basic operation. Thus, the claims are ineligible.
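The heuristic character of this limitation can be sketched in a few lines. The radius threshold and distance values below are illustrative assumptions only, not taken from the claims:

```python
def dynamic_context_size(sorted_distances, radius):
    """Count how many nearest points fall within `radius` of the query,
    so k grows or shrinks with the local density around the query point."""
    k = 0
    for d in sorted_distances:   # distances assumed pre-sorted ascending
        if d > radius:
            break
        k += 1
    return k

# Three points lie within the radius, so the context size adapts to k = 3.
k = dynamic_context_size([0.5, 0.9, 1.1, 4.0, 7.2], radius=2.0)
```

The entire operation is a threshold comparison in a loop, i.e., basic logical/mathematical bookkeeping.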
Claims 7 and 14 further recite in each claim the distance of a context data point to the query data point is measured in a tabular data space of the query data point. Defining the space in which a mathematical distance calculation occurs (e.g., Euclidean distance between feature columns) is a mathematical definition. This does not render the claim non-abstract since it simply defines the parameters for the calculation. Thus, the claims are ineligible.
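For illustration, measuring distance “in a tabular data space” amounts to ordinary arithmetic over a row’s feature columns. The column names and values below are hypothetical, chosen only to show the calculation:

```python
import math

# Hypothetical feature columns of a tabular data space (illustrative only).
columns = ["age", "income", "score"]
query_row   = {"age": 30.0, "income": 50.0, "score": 0.7}
context_row = {"age": 33.0, "income": 54.0, "score": 0.5}

# Euclidean distance computed column-by-column in the query's tabular space.
distance = math.sqrt(sum((query_row[c] - context_row[c]) ** 2 for c in columns))
```

Choosing which columns define the space merely sets the parameters of the same mathematical formula.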
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Barel et al. (Pub. No. US 2017/0161271, published on June 8, 2017; hereinafter Barel) in view of Moon et al. (Pub. No. US 2024/0242024, published on July 18, 2024; hereinafter Moon).
Regarding claims 1, 8, and 15, Barel clearly shows and discloses a method for tabular data models using localized context (Abstract); a computing system for tabular data models using localized context, comprising one or more processors configured to execute instructions; and a non-transitory computer-readable storage medium containing instructions executable by the one or more processors for implementing the method; and a non-transitory computer-readable medium for tabular data models using localized context, the non-transitory computer-readable medium comprising instructions executable by a processor for implementing the method (Figures 14-15), comprising:
identifying a query to apply a data model to a query data point (input query 111 may be a multidimensional input query. ANN search tree module 101 may evaluate input query 111 and ANN search tree module 101 may traverse an ANN search tree from a root node to a resultant leaf node (RLN) 112 based on input query 111, [0033]. ANN search tree module 101 may receive input query 111 representative of a patch of image data 511, [0055]);
identifying a set of query domain data including a plurality of domain data points (During the traversal of the approximate nearest neighbor search tree, a priority queue of best match entries may be maintained. For example, the priority queue may maintain a list of a particular number of closest database entries to the input query. As the approximate nearest neighbor search tree is traversed, if a node provides a closer entry than any of the entries in the priority queue, the entry associated with the node may be kept and the entry associated with a farthest distance from the input query may be discarded, [0028]) associated with a domain of the query (Device 100 may implement an image or video pipeline to generate demosaiced color patches based on input data from an image sensor, [0032]);
selecting a local context of context points from the set of query domain data based on a distance of the context points to the query data point (for a particular leaf node, multiple candidate entries (e.g., a particular number of candidate entries such as M candidate entries) from the database that are the highest frequency entries of the frequency distribution table for the particular leaf node may be provided as candidate entries, [0087]. It is clear that the hashing table is populated with the highest frequency entries that were found to be the actual nearest neighbors, i.e., closest by distance, for queries that landed in a specific leaf node. The selected candidate entries are local context of points that have a high probability of being the closest points to any query data point that reaches that specific locale in the search tree); and
applying the local context and query data point to a trained data model to generate a data point classification of the query data point (Entries evaluation/final search results module 103 may receive priority queue of best match entries 114 and candidate entries 113 and entries evaluation/final search results module 103 may generate final search results (FSRs) 115. For example, final search results 115 may include the best K entries based on minimum distances with respect to input query 111, [0036]. Based on a patch of image data 511, it may be desirable to find, within the database or dictionary of patches, a number of best matches. From the best matches, associated color patches may be fetched and combined to provide final color patch 513 for the patch of image data 511. Such processing may provide demosaicing for the patch of image sensor data to a final color patch. An important feature of such demosaicing is the accuracy of final search results 115 based on the patch of image data 511, [0050]. It is clear that the “trained” nature of the model is shown by its ability to take a query and a retrieved context of candidate entries to generate a final color patch or search result based on the learned proximity. In the field of image processing, assigning a specific color value or patch based on a neighborhood of known examples is a form of classification of an unknown raw input into a known color category).
Moon then discloses the model / trained model being tabular model / trained tabular model (The data contained in the cells of the table may be embedded by the computer program. Specifically, the computer program may be trained to represent data in each cell as a vector representation using neural network layers and pre-trained large language models, [0050]. It is noted that training, using a transformer-style model, involves batches of data where attention mechanisms relate subject cells to other cells in the table).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Moon with the teachings of Barel, for the purpose of applying a tabular transformer to a proximity-based neighborhood to create a localized structure for scaling to large data tables and thereby achieve a more resource-efficient model.
Regarding claims 2, 9, and 16, Moon then discloses the trained tabular data model is trained with data different from the query domain data (Understanding the content in its original data format is critical for certain downstream tasks, such as numerical analysis, textual summarization, etc. Thus, in embodiments, a unique representation of the table contents based on their original data type format may be encoded. This preserves important information from the table. Cells containing textual data may be encoded with semantic embeddings obtained from large language models to obtain a semantic representation. Cells containing numeric data may be encoded to obtain a numeric representation, [0035]. A machine learning model may be trained to map the similarity between text and the contextual tabular embeddings. As another example, machine learning models may be trained to highlight relevant components of a large table given natural language query using grounding techniques. This allows users to quickly pick out relevant information from potentially large and complex tables. In still another embodiment, the contextual tabular embeddings may use unsupervised clustering algorithms and may identify similar groups of tables. These clusters may contain tables from similar domains (e.g., financial, healthcare, etc.), which may be used to label and organize the vast amount of tabular data available, [0055]-[0058]. It is clear that LLMs are inherently pre-trained on different general-purpose datasets different form a specific query table such that the underlying model parameters were learned from data different from the specific query domain data points).
Regarding claims 3, 10, and 17, Moon further discloses the trained tabular data model is not trained with data points in the query domain data (In step 235, the computer program may be used for various downstream tasks. In one embodiment, the contextual table embeddings may be used to generate stylized textual reports using, for example, few shot learning. For example, a sequence-to-sequence model may be trained on a few examples of specific types of reports (e.g., financial reports, project progress reports, etc.) to generate stylized reports. This sequence-to-sequence model architecture may take the contextual representations obtained from the computer program and generate a sequence of words (i.e., summary) using a decoder, [0054]-[0058]. It is clear that the tabular model utilizes few-shot learning capability to generate insights for complex tabular data that the model was not explicitly trained on using its generalized pre-trained weights. Such trained model interprets unseen sequential inputs to generate summaries or classifications).
Regarding claims 4, 11, and 18, Moon further discloses the tabular data model is a transformer architecture (In step 230, the computer program may obtain contextual embeddings using a table transformer. In one embodiment, the table transformer may include an encoder that learns to generate a contextual representation of sequential input (e.g., sequence of cells in a table) and a decoder that can interpret the encoded input to generate another meaningful sequence (e.g., a sequence of words describing the table), [0051]-[0052]) having an attention layer that attends to the local context (the computer program may be trained to obtain contextual embeddings of the table by solving a masked-cell prediction task during the pre-training step. In doing so, the computer program (e.g., the neural network layers) may learn to embed a table with masked cells. This embedding may then be used to reconstruct the table, and the computer program may evaluate how well the masked cells are reconstructed. By learning to accurately reconstruct the masked cells, the computer program learns to generate meaningful contextual tabular embeddings, [0058]).
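The masking mechanism referenced in the mapping above can be sketched minimally as follows. The numbers are toy values and the function is not Moon’s actual model; it is offered only to clarify what “an attention layer that attends to the local context” entails:

```python
import math

def masked_attention(scores, values, context_mask):
    """Softmax attention in which positions outside the local context are
    masked out, so the query attends only to its selected context points."""
    max_s = max(s for s, m in zip(scores, context_mask) if m)
    exps = [math.exp(s - max_s) if m else 0.0 for s, m in zip(scores, context_mask)]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * v for w, v in zip(weights, values))

# Position 1 lies outside the local context and receives zero attention weight.
out = masked_attention(scores=[1.0, 0.0, 1.0], values=[1.0, 2.0, 3.0],
                       context_mask=[True, False, True])
```

Masking simply zeroes the softmax weight of non-context positions, which is consistent with Moon’s masked-cell prediction training described at [0058].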
Regarding claims 5, 12, and 19, Barel further discloses selecting the local context comprises determining a number of nearest data points in the set of query domain data to the query data point (it may be desirable to find one or more nearest neighbors in a database based on a query point, [0025]. The hashing table, based on the key, may provide pointers to one or more candidate entries for further evaluation to determine nearest neighbors for the input query, [0027]. The priority queue may maintain a list of a particular number of closest database entries to the input query. As the approximate nearest neighbor search tree is traversed, if a node provides a closer entry than any of the entries in the priority queue, the entry associated with the node may be kept and the entry associated with a farthest distance from the input query may be discarded, [0028]. Resultant leaf node 812, in this context, may be considered an actual nearest neighbor guess for the input query from training set 815, [0063]. It is noted that the leaf node is reached specifically by evaluating the distance of the input query against node thresholds. Because the search tree is a metric structure, the points associated with the leaf node are spatially proximate to the query, such that the points form a neighborhood around the query data point).
Regarding claims 6, 13, and 20, Barel further discloses the number of nearest data points is dynamically determined based on the distance of the respective data points in the set of query domain data to the query data points (As the approximate nearest neighbor search tree is traversed, if a node provides a closer entry than any of the entries in the priority queue, the entry associated with the node may be kept and the entry associated with a farthest distance from the input query may be discarded, [0028]. The database entries at traversed nodes having a minimum distance with respect to the input query may be maintained and updated at every node along the traversal based on input query 111. The distance between input query 111 and the database entry associated with a particular node may be any suitable distance function, metric, or distance such as a Euclidean distance, [0034]).
Regarding claims 7 and 14, Barel further discloses the distance of a context data point to the query data point is measured (the database to be searched may include any suitable database including any suitable number of entries. For example, the entries of the database may be multidimensional points that reside in a metric space, [0033]. The distance between input query 111 and the database entry associated with a particular node may be any suitable distance function, metric, or distance such as a Euclidean distance, [0034]) in a tabular data space of the query data point (The patches such as patches 603-605 within dictionary 601 may each be represented by a multidimensional data point (e.g., a 25-dimensional data point with each point associated with a pixel of patches 603-605) such that dictionary 601 may be represented by a database of multidimensional data point entries, [0051]).
Relevant Prior Art
The following references are considered relevant to the claims:
Rangan et al. (Pub. No. US 2012/0296891) teaches automatic sampling evaluation to evaluate convergence of one or more search processes. Each individual document's similarity in the one or more non-retrieved collections is automatically evaluated against other documents in any retrieved sets. Given a goal of achieving a high recall, documents with high similarity can then be analyzed for additional noun phrases that may be used for a next iteration of a search. Convergence can be expected if the information gain in the new feedback loop is less than previous iterations, and if the additional documents identified are below a certain threshold document count.
Patel (Pub. No. US 2020/0311345) teaches enabling a user to extract relevant information from character-based contextual embedding of entities in the document, thus overcoming language barrier during interpretation of the document. Specifically, character-based embedding of information is performed in the document so as to derive a language-independent interpretation of the document. Such language-independent interpretation of the document further enables programs equipped with artificial intelligence to perform a variety of operations such as named entity recognition, information extraction, information retrieval, machine translation, sentiment analysis, feedback analysis, link prediction, comparison, summarization and so forth.
Dunning et al. (Pub. No. US 2008/0189232) teaches using data structured according to an indicator-based recommendation paradigm such that items to be considered for recommendation are stored in a text retrieval system, along with associated meta-data such as title and description. To these conventional characteristics are added additional characteristics known as indicators which are derived from an analysis of the usage of the system by users. This indicator-based system provides a more robust recommendation system that is able to capture a greater depth and variety of real-world relationships among items, and is able to handle data of higher relations.
Contact Information
Any inquiry concerning this communication or earlier communications from the Examiner should be directed to Son Hoang whose telephone number is (571) 270-1752. The Examiner can normally be reached on Monday – Friday (7:00 AM – 4:00 PM).
If attempts to reach the Examiner by telephone are unsuccessful, the Examiner’s supervisor, Sherief Badawi can be reached on (571) 272-9782. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SON T HOANG/Primary Examiner, Art Unit 2169
February 7, 2026