DETAILED ACTION
This nonfinal action is in response to the amendment and remarks filed 12/19/2025 for application 17/162,970.
Claims 1, 3, 5, 8, and 19 have been amended. Claims 9-18 and 20 are cancelled. Claims 21-32 are newly added claims.
Claims 1-8, 19, and 21-32 thereby remain pending in the application. Claims 1, 8, and 19 are independent claims.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous office action mailed 09/19/2025 has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/19/2025 has been entered.
Response to Amendment
The amendment filed 12/19/2025 has been entered.
Applicant’s amendment to the claims with respect to resolving claim objections and indefiniteness rejections under 35 U.S.C. 112(b) (and/or rendering objections/rejections moot through cancellation of claims) has been considered, and the objections and 112(b) rejections set forth in the office action mailed 09/19/2025 are consequently withdrawn.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 1-8, 19, and 21-32 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Regarding claim 1, it recites, inter alia, the following limitations:
responsive to the query, perform a process using a plurality of models in real time, the process comprising a first subprocess and a second subprocess:
the first subprocess using a first model and an ensemble model, in the plurality of
models, comprising a second model and a third model different from the first model, to:
identify, using the first model, based on one or more attributes of the query, a plurality of data segments of semi-structured data stored in a semi-structured database;
generate, using the ensemble model, a ranking of the plurality of data
segments based on a plurality of computations of statistical measures, each respective
computation of statistical measure characterizing one or more unique similarities between each data segment of the plurality of data segments and the query provided by a corresponding model of the ensemble model;
further identify a data segment from the plurality of data segments based
on the ranking, wherein the data segment is collectively the most similar of the plurality of data segments to the query;
the second subprocess using a machine reading comprehension model, in the
plurality of models, to
input the data segment to the machine reading comprehension model and receive from the machine reading comprehension model a portion of the data segment deemed by the machine reading comprehension model to be most relevant to the query.
As best understood [see also the 112(b) rejection of claim 1 below], the claim appears to recite utilization of "a plurality of models": the plurality comprises a "first model" and an "ensemble model" (i.e., based on its plain meaning, a machine learning model that combines predictions from multiple learner models) used to generate data segment rankings, wherein the ensemble model further comprises at least individual "second" and "third" models that each compute a "statistical measure"; the claim then recites an additional "machine reading comprehension model" to identify a most relevant portion of the most similar data segment.
However, the originally filed specification provides no support for such a configuration of a plurality of models, particularly the described “ensemble model”.
The provided description appears to support an “ensemble” (not an ensemble learning model) comprising a “non-stochastic retrieval based module” [¶ 0032], and a generic “BERT question similarity model” which is not further described [¶ 0032].
The described retrieval module appears to perform the functions ascribed to the "ensemble model" within the claims of "generat[ing] a ranking of the plurality of data segments based on a plurality of computations of statistical measure characterizing one or more unique similarities between each data segment of the plurality of data segments and the query" and "further identify[ing] a data segment from the plurality of data segments based on the ranking, wherein the data segment is collectively the most similar of the plurality of data segments to the query". However, there is no support for the retrieval module being structured as an "ensemble model", or for its utilizing different sub-models to respectively compute different statistical measures.
The described BERT model (which, based on its plain meaning, is known in the art as a type of machine reading comprehension (MRC) model) appears to perform the functions ascribed to the "machine reading comprehension model" within the claims of receiving a most similar data segment and identifying "a portion of the data segment deemed to be most relevant to the query" [¶ 0033]. There is no support for the described BERT model being included as part of an "ensemble model" or "plurality" of models, for an "ensemble" of different BERT models, or for a separate BERT model being utilized prior to the described "machine reading comprehension" model.
At best, the specification may be construed to provide support for a “first model” that identifies data segments via attributes of the query (e.g., via a mapping module which accesses the index store constructed by the index builder [¶ 0032 and 0037]), a second “retrieval” model that generates rankings of segments and identifies a most similar segment [¶ 0032], and a third “machine reading comprehension” model that identifies a most relevant portion of the most similar segment [¶ 0033]. However, there is no support in the specification for a “plurality of models” that comprises additional models beyond what is described, or for any of these models to be an “ensemble model” comprising further individual models that individually calculate statistical measures. Ultimately, the subject matter in the claim extends beyond the scope of what is supported by the originally filed specification.
Regarding claims 8 and 19, they have the same deficiencies as found in claim 1 above. Consequently, they are likewise rejected under 112(a) for failing to comply with the written description requirement.
Regarding claims 25-27, they inherit deficiencies from parent claim 1 and likewise appear to further recite multiple individual computations of statistical measures by respective "models", or modules, within the "ensemble model" (e.g., "a first computation of statistical measure associated with an ensemble of a Bidirectional Encoder Representations for Transformers (BERT)", "a second computation of statistical measure associated with a non-stochastic retrieval based function", "a third computation of statistical measure associated with a term frequency-inverse document frequency (tf-idf) function") that are not supported by the originally filed specification. They are thereby likewise rejected under 112(a) for failing to comply with the written description requirement.
Regarding claims 2-7, 21-24, and 28-32, they inherit the deficiencies of their parent claim. Consequently, they are likewise rejected under 112(a) for failing to comply with the written description requirement.
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-8, 19, and 21-32 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Regarding claim 1, it recites, inter alia, the limitation "the first subprocess using a first model and an ensemble model, in the plurality of models, comprising a second model and a third model different from the first model". It is unclear if the modifying phrase "comprising a second model and a third model different from the first model" is intended to refer to the "ensemble model", or instead to the "plurality of models". Consequently, one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
For purposes of examination, the limitation is interpreted as “the first subprocess using a first model and an ensemble model within the plurality of models, the ensemble model comprising a second model and a third model different from the first model”.
Claim 1 further recites “generate, using the ensemble model, a ranking of the plurality of data segments based on a plurality of computations of statistical measures, each respective computation of statistical measure characterizing one or more unique similarities between each data segment of the plurality of data segments and the query provided by a corresponding model of the ensemble model”. It is unclear what element the modifying phrase “provided by a corresponding model of the ensemble model” refers to (e.g., could be understood to refer to any of “each respective computation of statistical measure”, “one or more unique similarities”, “each data segment”, “the query”, etc.). It is further unclear if “a corresponding model” is intended to refer to a particular model within the ensemble model, or is instead referring to “each respective” model within the ensemble model. Consequently, one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
For purposes of examination, the limitation is interpreted as “generate, using the ensemble model, a ranking of the plurality of data segments based on a plurality of computations of statistical measures, each computation of statistical measure provided by a respective model of the ensemble model, and each respective computation of statistical measure characterizing one or more unique similarities between each data segment of the plurality of data segments and the query”.
Regarding claims 8 and 19, they have the same deficiencies as those found in claim 1 above. Consequently, they are rejected for the same reasons and are likewise interpreted as detailed above.
Regarding claim 27, it recites, inter alia, the limitation “wherein the plurality of computations of statistical measures comprises a third computation of statistical measure associated with a term frequency-inverse document frequency (tf-idf) function”. However, there is insufficient antecedent basis for “a third computation of statistical measure” in the claim – parent claim 25 only recites “a first computation of statistical measure”. Consequently, one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
For purposes of examination, the limitation is interpreted as “wherein the plurality of computations of statistical measures comprises a second computation of statistical measure associated with a term frequency-inverse document frequency (tf-idf) function”.
Regarding claim 32, it recites the limitation "create the inverted index comprising a plurality of probable text pairs generated from a plurality of text different from the plurality of probable text pairs". However, there is insufficient antecedent basis for the term "the inverted index" in the claim, as parent claim 1 does not previously recite an "inverted index".
For purposes of examination, the limitation is interpreted as “create an inverted index comprising a plurality of probable text pairs generated from a plurality of text different from the plurality of probable text pairs”.
Regarding claims 2-7, 21-26, and 28-31, they inherit the deficiencies of their parent claims. Consequently, they are also rejected under 35 U.S.C. 112(b) as being indefinite for depending on an indefinite parent claim.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-8, 19, and 21-32 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The analysis of the claims will follow the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (“2019 PEG”).
Independent Claims (Claim 1, Claim 8, Claim 19):
Step 1: Claim 1 is drawn to a system/apparatus, claim 8 is drawn to a method, and claim 19 is drawn to a product. Therefore, each of these claims falls under one of the four categories of statutory subject matter (process/method, machine/apparatus, manufacture/product, or composition of matter).
Step 2A Prong 1: Claims 1, 8, and 19 each recite a judicially recognized exception of an abstract idea.
Claim 1 recites, inter alia:
A system for retrieving information without determining intent comprising: receiv[ing] a query from a user; responsive to the query, perform[ing] a process, the process comprising a first subprocess and a second subprocess; – This limitation amounts to performing a series of actions/processes to respond to a person’s request and obtain information, and therefore recites a process of evaluation that a human could reasonably perform in the mind or using pen and paper.
the first subprocess comprising: identify[ing] based on one or more attributes of the query, a plurality of data segments of semi-structured data; generat[ing] a ranking of the plurality of data segments based on a plurality of computations of statistical measures, each respective computation of statistical measure characterizing one or more unique similarities between each data segment of the plurality of data segments and the query; further identify[ing] a data segment from the plurality of data segments based on the ranking, wherein the data segment is collectively the most similar of the plurality of data segments to the query – Within the described processes of evaluation, this limitation further recites a procedure of identifying "attributes" of a query (e.g., words of a sentence), observing a plurality of data segments, ranking them based on their determined relevance to the query (where the relevance is based on calculation of generic statistical measures (e.g., term frequency, segment length, etc.)), and identifying which data segment is "most similar" to the query through the ranking. It therefore recites a process of data observation and analysis that a human could reasonably perform using pen and paper.
the second subprocess [comprising]: input the data segment and receive a portion of the data segment deemed to be most relevant to the query; transmit the portion of the data segment, thereby causing display of the portion of the data segment – Within the described processes of evaluation, this limitation further recites a procedure of further observing the data segment deemed “most similar”, and determining which portion of the most similar data segment is most relevant to the query (e.g., identifying key words/phrases), and providing that portion back to the initial requester. It therefore recites a process of evaluation that a human could reasonably perform in the mind or using pen and paper.
Claims 8 and 19 recite substantially similar abstract idea limitations to those found in claim 1, and therefore recite the same judicial exception.
Step 2A Prong 2: The following additional elements recited in claims 1, 8, and 19 also do not integrate the recited judicial exceptions into a practical application.
Claim 1 additionally recites:
A system comprising: a processor; and a non-transitory memory storing instructions that, when executed, cause the processor to: – These limitations amount to mere instructions to implement an abstract idea on a computer or computer components.
[retrieving information] from a semi-structured knowledge base; [a plurality of data segments] of semi-structured data stored in a semi-structured database – These limitations amount to insignificant pre-solution steps of gathering data, and therefore recite insignificant extra-solution activity.
[receive a query] from a remote device; transmit to the remote device, causing display at the remote device – These limitations amount to insignificant pre- and post- solution steps of gathering and transmitting data, and therefore recite insignificant extra-solution activity.
[perform a process] using a plurality of models in real time; [the first subprocess] using a first model and an ensemble model, in the plurality of models, comprising a second model and a third model different from the first model; [identify] using the first model; [generate] using the ensemble model, [each respective computation of statistical measure] provided by a corresponding model of the ensemble model; – These limitations do no more than invoke generic, high-level computational and/or machine learning models as tools to perform abstract mentally performable steps of data observation and analysis, and thereby amount to no more than mere instructions to “apply” an exception.
[the second subprocess] using a machine reading comprehension model, in the plurality of models; [input the data segment] to the machine reading comprehension model, [receive from] the machine reading comprehension model, [portion of the data segment deemed to be most relevant to the query] by the machine reading comprehension model – These limitations do no more than invoke generic, high-level natural language models as a tool to perform abstract mentally performable steps of data observation and analysis, and thereby amount to no more than mere instructions to “apply” an exception.
Claims 8 and 19 recite substantially similar additional elements to those recited in claim 1, and therefore also do not integrate the recited judicial exception into a practical application.
Step 2B: The additional elements recited in claims 1, 8, and 19, viewed individually or as an ordered combination, do not provide an inventive concept or otherwise amount to significantly more than the recited abstract ideas themselves.
Claim 1 additionally recites:
A system comprising: a processor; and a non-transitory memory storing instructions that, when executed, cause the processor to: – Mere instructions to implement an abstract idea on a computer or computer components do not provide an inventive concept or significantly more to the recited abstract idea.
[retrieving information] from a semi-structured knowledge base; [a plurality of data segments] of semi-structured data stored in a semi-structured database – Storing and retrieving information in memory is well-understood, routine, and conventional activity (see MPEP § 2106.05(d); “Storing and retrieving information in memory”) and thereby does not provide an inventive concept or significantly more to the recited abstract idea.
[receive a query] from a remote device; transmit to the remote device, causing display at the remote device – Receiving and transmitting data across devices (e.g., a network) is well-understood, routine, and conventional activity (see MPEP § 2106.05(d); “Receiving or transmitting data over a network”) and thereby does not provide an inventive concept or significantly more to the recited abstract idea.
[perform a process] using a plurality of models in real time; [the first subprocess] using a first model and an ensemble model, in the plurality of models, comprising a second model and a third model different from the first model; [identify] using the first model; [generate] using the ensemble model, [each respective computation of statistical measure] provided by a corresponding model of the ensemble model; – Invoking generic, high-level computational and/or machine learning models as tools to perform abstract mentally performable steps of data observation and analysis does not provide an inventive concept or significantly more to the recited abstract idea.
[the second subprocess] using a machine reading comprehension model, in the plurality of models; [input the data segment] to the machine reading comprehension model, [receive from] the machine reading comprehension model, [portion of the data segment deemed to be most relevant to the query] by the machine reading comprehension model – Invoking generic, high-level natural language models as a tool to perform abstract mentally performable steps of data observation and analysis does not provide an inventive concept or significantly more to the recited abstract idea.
Even when considered as an ordered combination, the additional elements recited in the claims ultimately do no more than place the claims in the context of receiving and/or transmitting information for an abstract procedure of observing, analyzing, and displaying text data, and performing said abstract procedure using generic computational/ML models without any significant details of technical implementation. As such, claims 1, 8, and 19 are not patent eligible.
Dependent Claims (Claims 2-7, Claims 21-32):
Dependent claims 2-7 and 21-32 narrow the scope of independent claim 1, and likewise narrow the recited judicial exceptions. They recite abstract idea limitations that are similar to those recited within the independent claims (i.e., mental processes and/or mathematical concepts), and thereby merely expand on the already recited exceptions. The dependent claims also do not recite any further additional elements that successfully integrate the recited judicial exceptions into a practical application or provide significantly more than the recited abstract ideas themselves. Consequently, claims 2-7 and 21-32 are also rejected under 35 U.S.C. 101.
Step 1: Claims 2-7 and 21-32 are drawn to a system/apparatus. Therefore, each of these claims falls under one of the four categories of statutory subject matter (process/method, machine/apparatus, manufacture/product, or composition of matter).
Step 2A Prong 1: Claims 2-7 and 21-32 each recite a judicially recognized exception of an abstract idea.
Claim 2 recites, inter alia:
create an inverted index of the data segments – This limitation amounts to creating a reverse lookup table, which is a process of evaluation that a human could reasonably perform using pen and paper.
Claim 3 recites the same judicial exception as claim 2.
Claim 4 recites, inter alia:
create the inverted index from a title, a question field, and an answer field of the data segments – This limitation amounts to creating a reverse lookup table based on observable data types, which is a process of evaluation that a human could reasonably perform using pen and paper.
Claim 5 recites the same judicial exception as parent claim 1.
Claim 6 recites, inter alia:
segment the received data into the plurality of data segments – This limitation amounts to merely observing data and splitting the data into portions, which is a process of evaluation that a human could reasonably perform in the mind or using pen and paper.
Claim 7 recites, inter alia:
rank two or more identified data segments – This limitation amounts to ranking data segments based on reasoning, which is a process of evaluation that a human could reasonably perform in the mind or using pen and paper.
Claim 21 recites, inter alia:
wherein the one or more attributes comprises one or more bigrams extracted from the query – This limitation amounts to observing two-word segments / phrases within a query (e.g., sentences), which is a process of evaluation that a human could reasonably perform in the mind or using pen and paper.
Claim 22 recites, inter alia:
wherein the one or more attributes comprises one or more trigrams extracted from the query – This limitation amounts to observing three-word segments / phrases within a query (e.g., sentences), which is a process of evaluation that a human could reasonably perform in the mind or using pen and paper.
Claim 23 recites, inter alia:
wherein a first data segment in the plurality of data segments comprises an object comprising a question and answer field – This limitation amounts to organizing data based on observable data types, which is a process of evaluation that a human could reasonably perform in the mind or using pen and paper.
Claim 24 recites, inter alia:
wherein the first data segment in the plurality of data segments comprises a title field – This limitation amounts to organizing data based on observable data types, which is a process of evaluation that a human could reasonably perform in the mind or using pen and paper.
Claim 25 recites the same judicial exception as claim 1.
Claim 26 recites, inter alia:
wherein the plurality of computations of statistical measures comprises a second computation of statistical measure associated with a non-stochastic retrieval based function – This limitation explicitly recites computation of variables (statistical measures) via mathematical methods (non-stochastic retrieval based function), and therefore amounts to mathematical calculation.
Claim 27 recites, inter alia:
wherein the plurality of computations of statistical measures comprises a third computation of statistical measure associated with a term frequency-inverse document frequency (tf-idf) function – This limitation explicitly recites computation of variables (statistical measures) via mathematical methods (tf-idf function), and therefore amounts to mathematical calculation.
Claims 28-30 recite the same judicial exception as claim 1.
Claim 31 recites the same judicial exception as claim 3.
Claim 32 recites, inter alia:
create the inverted index comprising a plurality of probable text pairs generated from a plurality of text different from the plurality of probable text pairs – This limitation amounts to creating a reverse lookup table based on observable data types, which is a process of evaluation that a human could reasonably perform using pen and paper.
Step 2A Prong 2: Claims 4, 6-7, 21-22, 24, 26-27, and 32 do not recite any further additional elements besides those recited in the independent claims, and the following additional elements recited in claims 2-3, 5, 23, 25, and 28-31 also do not integrate the recited judicial exceptions into a practical application.
Claim 2 additionally recites:
receive data from the semi-structured database; and store the inverted index in an index database – These limitations amount to mere steps of gathering and organizing data to enable further analysis, and therefore recite insignificant extra-solution activity.
Claim 3 additionally recites:
wherein the semi-structured database is a plurality of uniform resource locators (URLs) – This limitation amounts to selecting a type of data to be manipulated, and therefore recites insignificant extra-solution activity.
Claim 5 additionally recites:
wherein the semi-structured database is a JavaScript Object Notation (JSON) database – This limitation amounts to merely specifying a data format / selecting a type of data to be manipulated, and therefore recites insignificant extra-solution activity.
Claim 23 additionally recites:
wherein a first data segment comprising a JavaScript Object Notation (JSON) object – This limitation amounts to merely specifying a data format / selecting a type of data to be manipulated, and therefore recites insignificant extra-solution activity.
Claim 25 additionally recites:
wherein the plurality of computations of statistical measures comprises a first computation of statistical measure associated with an ensemble of a Bidirectional Encoder Representations for Transformers (BERT) – This limitation amounts to no more than generally linking a judicial exception to the field of use of natural language processing models without providing any significant details of implementation.
Claim 28 additionally recites:
wherein the machine reading comprehension model is pre trained to extract a text-based answer from the data segment that addresses some or all of the query – Whereas claim 1 invokes generic, high-level natural language models as a tool to perform abstract mentally performable steps of data observation and analysis, this limitation recites further insignificant details of NLP model implementation, and therefore recites insignificant extra-solution activity.
Claim 29 additionally recites:
wherein the receiving the query is performed in real time during the generating the query at the remote device – This limitation amounts to an insignificant pre-solution step with respect to implementing gathering of data, and therefore recites insignificant extra-solution activity.
Claim 30 additionally recites:
wherein the machine reading comprehension model transmits the portion of the data segment to the remote device automatically – This limitation amounts to an insignificant post-solution step with respect to implementing transmission of data, and therefore recites insignificant extra-solution activity.
Claim 31 additionally recites:
wherein the plurality of URLs is maintained in one or more specified patterns – This limitation amounts to an insignificant step with respect to implementing gathering of data, and therefore recites insignificant extra-solution activity.
Step 2B: The additional elements recited in claims 2-3, 5, 23, 25, and 28-31, viewed individually or as an ordered combination, do not provide an inventive concept or otherwise amount to significantly more than the recited abstract ideas themselves.
Claim 2 additionally recites:
receive data from the semi-structured database; and store the inverted index in an index database – Storing and retrieving information in memory is well-understood, routine, and conventional activity (see MPEP § 2106.05(d); “Storing and retrieving information in memory”, “Receiving or transmitting data over a network”) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
Claim 3 additionally recites:
wherein the semi-structured database is a plurality of uniform resource locators (URLs) – Extracting data from a list of URLs (e.g., web scraping) is well-understood, routine, and conventional activity (see Asikri et al., “Using Web Scraping In a Knowledge Environment To Build Ontologies Using Python and Scrapy”, [page 2 Introduction]) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
Claim 5 additionally recites:
wherein the semi-structured database is a JavaScript Object Notation (JSON) database – Storing JSON files as a means of transporting data is well-understood, routine, and conventional activity (see Asikri et al., “Using Web Scraping In a Knowledge Environment To Build Ontologies Using Python and Scrapy”, [page 3 Introduction]) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
Claim 23 additionally recites:
wherein a first data segment comprising a JavaScript Object Notation (JSON) object – Storing JSON files as a means of transporting data is well-understood, routine, and conventional activity (see Asikri et al., “Using Web Scraping In a Knowledge Environment To Build Ontologies Using Python and Scrapy”, [page 3 Introduction]) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
Claim 25 additionally recites:
wherein the plurality of computations of statistical measures comprises a first computation of statistical measure associated with an ensemble of a Bidirectional Encoder Representations for Transformers (BERT) – Generally linking a judicial exception to the field of use of natural language processing models without providing any significant details of implementation does not provide an inventive concept or significantly more to the recited abstract idea.
Claim 28 additionally recites:
wherein the machine reading comprehension model is pre trained to extract a text-based answer from the data segment that addresses some or all of the query – Fine-tuning BERT models (a type of machine reading comprehension model), which are pre-trained by nature, is well-understood, routine, and conventional activity (see Gillioz et al., “Overview of the Transformer-based Models for NLPs”, [page 181 BERT and POST-BERT]) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
Claim 29 additionally recites:
wherein the receiving the query is performed in real time during the generating the query at the remote device – Receiving and transmitting data across devices (i.e., a network) is well-understood, routine, and conventional activity (see MPEP § 2106.05(d); “Storing and retrieving information in memory”, “Receiving or transmitting data over a network”) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
Claim 30 additionally recites:
wherein the machine reading comprehension model transmits the portion of the data segment to the remote device automatically – Receiving and transmitting data across devices (i.e., a network) is well-understood, routine, and conventional activity (see MPEP § 2106.05(d); “Storing and retrieving information in memory”, “Receiving or transmitting data over a network”) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
Claim 31 additionally recites:
wherein the plurality of URLs is maintained in one or more specified patterns – Storing and retrieving information is well-understood, routine, and conventional activity (see MPEP § 2106.05(d); “Storing and retrieving information in memory”, “Receiving or transmitting data over a network”) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
Even when considered as an ordered combination, the additional elements recited in the claims ultimately do no more than generically place the claims in the context of using information that has been extracted from webpages, using generic NLP models without any significant details of technical implementation, and transmitting data to remote devices. As such, claims 2-7 and 21-32 are likewise not patent eligible.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 5, 7-8, 19, 21-22, and 25-30 are rejected under 35 U.S.C. 103 as being unpatentable over Yilmaz et al. (“Cross-Domain Modeling of Sentence-Level Evidence for Document Retrieval”, available 2019), hereinafter Yilmaz, in view of Choudhary et al. (“Document Retrieval Using Deep Learning”, available 2020), hereinafter Choudhary, Xiong et al. (“Per-query Database Partition Relevance for Search”, filed 03/31/2020), hereinafter Xiong, Asikri et al. (“Using Web Scraping In a Knowledge Environment To Build Ontologies Using Python and Scrapy”, available 2020), hereinafter Asikri, and Billawala et al. (Pub. No. US 20190236086 A1, “Scalable and Effective Document Summarization Framework”, published 08/01/2019), hereinafter Billawala.
Regarding claim 1, Yilmaz teaches A system for retrieving information without determining intent (“The dominant approach to ad hoc document retrieval using neural networks today is to deploy the neural model as a reranker over an initial list of candidate documents retrieved using a standard bag-of-words term-matching technique…Extending our own previous work (Yang et al., 2019c), the main contribution of this paper is a successful application of BERT to yield large improvements in ad hoc document retrieval” [Yilmaz page 1 Introduction]; “During inference, we first retrieve an initial ranked list of documents to depth 1000 from the collection using the Anserini toolkit1 (postv0.5.1 commit from mid-August 2019, based on Lucene 8.0). Following Lin (2018) and Yang et al. (2019a), we use BM25 with RM3 query expansion (default parameters), which is a strong baseline, and has already been shown to beat most existing neural models” [Yilmaz page 3 Experimental Setup]; A candidate list of documents is initially retrieved using BM-25, a standard document ranking function that finds documents based on query term-matching techniques rather than based on semantic meaning of queries (i.e., determining intent)) comprising:
a processor; (“We conduct end-to-end document ranking experiments on three TREC newswire collections: the Robust Track from 2004 (Robust04) and the Common Core Tracks from 2017 and 2018 (Core17 and Core18). Robust04 comprises 250 topics, with relevance judgments on a collection of 500K documents (TREC Disks 4 and 5). Core17 and Core18 have only 50 topics each; the former uses 1.8M articles from the New York Times Annotated Corpus while the latter uses around 600K articles from the TREC Washington Post Corpus…Code for replicating all the experiments described in this paper is available as part of our recently-developed Birch IR engine.2” [Yilmaz page 3 Experimental Setup]; Performing the disclosed experiments on millions of documents inherently requires a computer (“Birch IR engine”) with adequate processing capabilities (i.e., at least one processor)) and
a non-transitory memory storing instructions that, when executed, cause the processor to ([Yilmaz page 3 Experimental Setup] as detailed above; Performing the disclosed experiments on millions of documents inherently requires a computer (“Birch IR engine”) with adequate processing and storage capabilities (i.e., non-transitory memory coupled to processor storing instructions)):
receive a query from a user; (“Formally, in response to a user query Q, the system’s task is to produce a ranking of documents from a corpus that maximizes some ranking metric” [Yilmaz page 2 Background and Approach])
responsive to the query, perform a process using a plurality of models in real time, the process comprising a first subprocess and a second subprocess (“During inference, we first retrieve an initial ranked list of documents to depth 1000 from the collection using the Anserini toolkit1 (post v0.5.1 commit from mid-August 2019, based on Lucene 8.0). Following Lin (2018) and Yang et al. (2019a), we use BM25 with RM3 query expansion (default parameters), which is a strong baseline, and has already been shown to beat most existing neural models…. We clean the retrieved documents by stripping any HTML/XML tags and splitting each document into its constituent sentences with NLTK…All sentences are then fed to the BERT model” [Yilmaz page 3 Experimental Setup]; The first subprocess utilizes the BM25 algorithm to initially rank documents (i.e., a data segment ranking model), followed by feeding the retrieved documents into a BERT model for re-ranking (i.e., the second subprocess)):
the first subprocess using a data segment ranking model in the plurality of models to:
generate, using the data segment ranking model, a ranking of the plurality of data segments based on a plurality of computations of statistical measures, each respective computation of statistical measure characterizing one or more unique similarities between each data segment of the plurality of data segments and the query; (“The dominant approach to ad hoc document retrieval using neural networks today is to deploy the neural model as a reranker over an initial list of candidate documents retrieved using a standard bag-of-words term-matching technique” [Yilmaz page 1 Introduction]; “During inference, we first retrieve an initial ranked list of documents to depth 1000 from the collection using the Anserini toolkit1 (post v0.5.1 commit from mid-August 2019, based on Lucene 8.0). Following Lin (2018) and Yang et al. (2019a), we use BM25 with RM3 query expansion (default parameters), which is a strong baseline, and has already been shown to beat most existing neural models.” [Yilmaz page 3 Experimental Setup]; The list of candidate documents (i.e., data segments) is initially retrieved using BM-25, a standard document ranking function that finds relevant (i.e., similar to query) documents based on bag-of-words term-matching techniques (e.g., determining relevance ranking based on query term frequencies (i.e., computations of statistical measures based on similarities between each data segment and the query)) – examiner note: see Connelly (“Practical BM25 – Part 2: The BM25 Algorithm and its variables”, cited in PTO-892) for detailed explanation of standard BM25 implementation)
further identify a data segment from the plurality of data segments based on the ranking, wherein the identified data segment is collectively the most similar of the plurality of data segments to the query; ([Yilmaz page 1 Introduction] and [Yilmaz page 3 Experimental Setup] as detailed above; Determining a ranked list of candidate documents (i.e., data segments) using BM-25 inherently identifies an order of documents based on rank, wherein the document with the highest rank is identified as most similar to the query)
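Examiner note: for reference only, the standard BM25 term-matching computation discussed above may be sketched as follows. This is an illustrative sketch, not taken from any cited reference; the parameter defaults k1 = 1.2 and b = 0.75 are common conventions.

```python
import math

def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Score one document against a query using the standard BM25
    term-matching formula (illustrative sketch only)."""
    avg_len = sum(len(d) for d in corpus) / len(corpus)
    n_docs = len(corpus)
    score = 0.0
    for term in query_terms:
        tf = doc_terms.count(term)                # term frequency in this document
        df = sum(1 for d in corpus if term in d)  # document frequency across corpus
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        norm = k1 * (1 - b + b * len(doc_terms) / avg_len)
        score += idf * (tf * (k1 + 1)) / (tf + norm)
    return score
```

Because the score depends only on exact query-term matches and their frequencies, the function ranks documents without modeling the semantic meaning of the query, consistent with the characterization of BM-25 above.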
the second subprocess using a machine reading comprehension model, in the plurality of models, ([Yilmaz page 3 Experimental Setup] as detailed above; The subsequent BERT model is a machine reading comprehension model) to
input the data segment to the machine reading comprehension model and receive from the machine reading comprehension model a portion of the data segment deemed by the machine reading comprehension model to be most relevant to the query; (“There are two challenges to applying BERT to document retrieval: First, BERT has mostly been applied to sentence-level tasks, and was not designed to handle long spans of input, having a maximum of 512 tokens. For reference, the retrieved results from a typical bag-of-words query on Robust04 has a median length of 679 tokens, and 66% of documents are longer than 512 tokens” [Yilmaz page 2 Background and Approach]; “We clean the retrieved documents by stripping any HTML/XML tags and splitting each document into its constituent sentences with NLTK. If the length of a sentence with the metatokens exceeds BERT’s maximum limit of 512, we further segment the spans into fixed size chunks. All sentences are then fed to the BERT model” [Yilmaz page 3 Experimental Setup]; “The core of our model is a BERT sentence-level relevance classifier…To determine document relevance, we apply inference over each individual sentence in a candidate document, and then combine the top n scores with the original document score as follows:
Sf = a · Sdoc + (1 − a) · Σi wi · Si
where Sdoc is the original document score and Si is the i-th top scoring sentence according to BERT” [Yilmaz page 3 Model Details]; For a given candidate document (i.e., data segment) in the list of retrieved ranked documents, including the highest-ranking candidate document (i.e., identified data segment), it is split into sentences (i.e., portions) of 512 tokens and then input to a BERT model (i.e., machine reading comprehension model), wherein the BERT model identifies the top n highest-scoring sentences (i.e., most relevant portions) within the document, including the top scoring sentence)
However, Yilmaz does not expressly teach generating data segment rankings using an ensemble model wherein each respective statistical measure characterizing one or more unique similarities between each data segment of the plurality of data segments and the query [is] provided by a corresponding model of the ensemble model.
In the same field of endeavor, Choudhary teaches a system of retrieving relevant text information in response to a query (“In this paper, we propose an ensemble of BERT and TF-IDF for a document retrieval system, where TFIDF and BERT together score the documents against a query, to retrieve a final set of top K documents” [Choudhary Abstract]) using an ensemble model comprising a second model and a third model wherein each respective statistical measure characterizing one or more unique similarities between each data segment of the plurality of data segments and the query [is] provided by a corresponding model of the ensemble model (“In this paper, we adapt BERT by implementing an ensemble of Term Frequency - Inverse Document Frequency (TF-IDF) and BERT to overcome the shortcomings of TFIDF. Both models run in parallel, and documents are ranked based on the weighted sum of each score” [Choudhary page 1 Introduction]; “Intuitively, TF-IDF determines the importance of a given term in a particular document. The term frequency (TF) component refers to the number of times a term appears in a given document normalized by size of the document to prevent bias toward longer documents. Inverse document frequency (IDF) determines the number of documents in the corpus in which the given term occurs. It is a measure of how common or rare the term in the entire corpus” [Choudhary page 2 TF-IDF Model]; “Next, the cosine similarity was computed between every query-document pair to give each document a score. This score can be called a “semantic score” as it essentially captures the semantic relationships across documents [2]. With BERT, our main aim was to rank the documents in the corpus according to the semantic relevance.” [Choudhary page 3 BERT Model])
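Examiner note: the parallel weighted-sum ensemble that Choudhary describes may be illustrated by the following sketch; the weight values are hypothetical, as the quoted passages do not fix them.

```python
def ensemble_rank(docs, tfidf_scores, bert_scores, w_tfidf=0.4, w_bert=0.6):
    """Rank documents by a weighted sum of a TF-IDF score and a BERT
    semantic score, per Choudhary's description (weights illustrative)."""
    combined = {d: w_tfidf * tfidf_scores[d] + w_bert * bert_scores[d] for d in docs}
    return sorted(docs, key=combined.get, reverse=True)
```

Each component model contributes its own statistical measure of query–document similarity (term frequencies for TF-IDF, semantic similarity for BERT), and the final ranking reflects both.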
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated generating document rankings using an ensemble model wherein each respective statistical measure characterizing one or more unique similarities between each data segment of the plurality of data segments and the query [is] provided by a corresponding model of the ensemble model as taught by Choudhary into Yilmaz because they are both directed towards a system of retrieving relevant text information in response to a query. Given that Choudhary leverages the described ensemble to overcome the shortcomings of the TF-IDF model [Choudhary page 1 Introduction], a person of ordinary skill in the art would recognize the value of substituting the ensemble model of Choudhary for the BM-25 document ranking model of Yilmaz to potentially leverage its demonstrated advantages and performance improvement [Choudhary page 5 Conclusion].
However, the combination does not expressly teach the plurality of models including a first model that identif[ies], based on one or more attributes of the query, a plurality of data segments of data stored in a database.
In the same field of endeavor, Xiong teaches a system of retrieving relevant text data from a knowledge base in response to user queries (“Large-scale information retrieval systems often store documents in different search indexes, or shards. In product searches for e-commerce websites, a product is often assigned to a particular shard according to the product's category or categories. If an association can be determined between a search query and one or more product categories, infrastructure cost and compute overhead can be reduced by only retrieving search results (e.g., relevant products) from the most relevant shards. In various embodiments described herein, a machine learning model may be trained based on past user queries and past user behaviors to predict shard association for new queries. In various examples, a feed-forward neural network, with language-independent features, may be employed to provide low-latency prediction for the per-query shard relevance task. In various embodiments, machine learning models are described that may predict relevant shards for input search queries without degrading the user experience (e.g., in terms of accuracy and/or latency)” [Xiong col. 1 lines 56 – col. 2 line 7]) including a first model that identif[ies], based on one or more attributes of the query, a plurality of data segments of data stored in a database (“When each selected shard receives a query request, one replica is designated as being responsible for handling the request in a load-balanced manner. The responsible replica returns a ranked list of relevant documents for the input search query. The results from different selected shards (e.g., for the same query) are merged into a single, coherent ranked list and are returned to the user through result merging.” [Xiong col. 2 lines 28–35]; “FIG. 2 is a diagram illustrating an example feed forward machine learning model 200 for predicting per-query shard relevance, in accordance with various embodiments of the present disclosure. As depicted, in various examples, the feed-forward machine learning model 200 may receive character-level and word-level n-gram tokens as input (e.g., character-level token data and/or word-level token data). For example, for the query “red running shoes”, the list of character trigrams 202c is “re”, “red”…” [Xiong col. 7 lines 54–62]; The described model divides the query into shards (i.e., attributes) and identifies relevant segments from the database for each shard respectively)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the plurality of models including a first model that identif[ies], based on one or more attributes of the query, a plurality of data segments of data stored in a database as taught by Xiong into the combination because Yilmaz and Xiong are both systems of retrieving relevant text data from a knowledge base in response to user queries. Incorporating the teachings of Xiong would be advantageous for large database systems due to the increased computational efficiency of separating data into “shards” and searching through shard sets based on query relevance accordingly (“Separation of databases into shards distributed across different server instances reduces index size which generally improves search performance. Additionally, separation of a database into shards distributed across different server instances and/or physical machines may greatly improve search performance through load balancing and a reduced search set for the physical hardware” [Xiong col. 1 lines 11-17]; “If an association can be determined between a search query and one or more product categories, infrastructure cost and compute overhead can be reduced by only retrieving search results (e.g., relevant products) from the most relevant shards” [Xiong col. 1 lines 60–64]).
However, the combination does not expressly teach receiving data segments as semi-structured data stored in a semi-structured database.
In the same field of endeavor, Asikri teaches a system of retrieving text data from a knowledge base for further analysis (“People might want to collect and analyse data from multiple websites…Web Scraping is the technique which people can extract data from multiple websites to a single spreadsheet or database so that it becomes easy to analyse or even visualize the data” [Asikri page 1 Introduction]) which retrieves a plurality of data segments of semi-structured data stored in a semi-structured database ("Web scrapers typically take something out of a page, to make use of it for another purpose somewhere else. An example would be to find and copy names and phone numbers, or companies and their URLs, to a list (contact scraping)" [Asikri pages 2-3 Introduction]; "A big percent of the world‘s data is unstructured, estimated around 70%-80%. Websites are a rich source for unstructured text that can be mined and turned into useful insights. The process of extracting information from websites is usually referred to as Web scraping [14]" [Asikri page 7 Scraping e-commerce web site with scrapy and python]; "Many websites have large collections of pages generated dynamically from an underlying structured source like a database...some semi-structured data query languages, such as XQuery and the HTQL, can be used to parse HTML pages and to retrieve and transform page content" [Asikri page 5 HTML processing]; Web scrapers retrieve text information (i.e., data segments) from webpages for later processing, wherein webpages are generated dynamically from underlying database storage; webpages are furthermore a source of semi-structured data, as they store unstructured text elements while also organizing content via HTML tags and attributes).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated retrieving a plurality of data segments of semi-structured data stored in a semi-structured database as taught by Asikri into the combination because both Yilmaz and Asikri are directed towards retrieving text data from a knowledge base for further analysis. Given that Yilmaz already discusses means of “cleaning” retrieved data (e.g., stripping HTML/XML tags) prior to further analysis by the BERT model (“We clean the retrieved documents by stripping any HTML/XML tags” [Yilmaz page 3 Model Details]), a person of ordinary skill in the art would recognize the value of incorporating the web scraping techniques, as taught in Asikri, into Yilmaz to expand the BERT model’s field of use to analyzing text data hosted on webpages ("Scrapers are basically adopted to transform unstructured data and save them in structured databases...In this paper, we focus on web scrapers that extract textual information from Web pages" [Asikri page 6 Web scraping]; "Websites are a rich source for unstructured text that can be mined and turned into useful insights" [Asikri page 7 Scraping e-commerce web site with scrapy and python]).
However, the combination does not expressly teach queries being generated from a user at a remote device, and transmit[ting] the portion of the data segment to the remote device thereby causing display of the portion of the data segment at the remote device.
In the same field of endeavor, Billawala teaches a method of providing web-based textual information in response to a user request (“For example, a user operating user device 110 may control a web browsing application to transmit a viewing request to web server 120 for viewing content provided on a website hosted by web server 120. According to some embodiments, web server 120 may include a summarization engine configured to generate a summary of content hosted by web server 120. The summarization engine may be software, hardware, firmware, or a combination thereof, corresponding to web server 120. Although not illustrated, according to some embodiments the summarization engine may be a separate server in communication with web server 120. It follows that the summarization engine may receive the viewing request from user device 110 and, in response to the viewing request, generate a summary of the requested content” [Billawalla ¶ 0016]) with queries being generated from a user at a remote device ([Billawalla ¶ 0016] as detailed above), and transmit[ting] a portion of a data segment to the remote device thereby causing display of the portion of the data segment at the remote device (“The summarization engine described herein may generate the summary of web-based content based, at least in part, on a target summary length (e.g., target length of a textual summary measured in number of characters, words, or sentences” [Billawalla ¶ 0013]; “It follows that the summarization engine may receive the viewing request from user device 110 and, in response to the viewing request, generate a summary of the requested content. The summarization engine may further transmit the summary back to the user device 110. According to some embodiments, the summarization engine may further control certain aspects of displaying the summary on the user device 110” [Billawalla ¶ 0016]; “User device 110 may be a computing device which allows a user to connect to network 140. 
User device 110 may provide an interface for requesting/accessing/viewing web-based information made available by, for example, web server 120…User device 110 may also be referred to as a client device…A client device may vary in terms of capabilities or features…For example, a cell phone may include a numeric keypad or a display of limited functionality, such as a monochrome liquid crystal display (LCD) for displaying text” [Billawalla ¶ 0018]; “With respect to network 140, network 140 may couple devices so that communications may be exchanged, such as between web server 120 and user device 110, or other types of devices, including between wireless devices coupled via a wireless network, for example… Furthermore, a computing device or other related electronic devices may be remotely coupled to network 140, such as via a telephone line or link, for example” [Billawalla ¶ 0020]; Given a generated textual summary (i.e., portion of data segment) in response to a user request, the system is configured to transmit the summary to a remote user device (e.g., connected via wireless network) for display)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated queries being generated from a user at a remote device, and transmit[ting] the portion of the data segment to the remote device thereby causing display of the portion of the data segment at the remote device as taught by Billawalla into the combination because both Yilmaz and Billawalla are directed towards providing textual information in response to a user request. Given that Billawalla already suggests combination of the disclosed network system of devices with language preprocessing techniques for generating text summaries [Billawalla ¶ 0039] that have similarity to a user viewing request (e.g., an internet search query leading to the viewing request of the current web content) [Billawalla ¶ 0046], one of ordinary skill in the art would recognize the value of incorporating the techniques for transmitting and displaying information over a network, as taught by Billawalla, into the combination of Yilmaz and Asikri, thereby enabling the combined system to modify text within spans identified by the BERT model for proper display on user devices with varied display capabilities / screen sizes (“With the increasing popularity of mobile and portable devices, users are increasingly viewing web-based content (e.g., webpages, applications displaying websites, etc.) on their mobile and portable devices. However, when compared to a traditional home personal computer, a user's mobile or portable device may have a much smaller display screen. It follows that the present disclosure describes a summarization engine configured to generate a summary of web-based content, wherein the summary may be more easily viewable when displayed on a user's mobile or portable device” [Billawalla ¶ 0012-0013]).
Regarding claim 8, it is a method claim that corresponds to the system/apparatus of claim 1, which is already taught by the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla as detailed above. Consequently, it is rejected for the same reasons.
Regarding claim 19, it is a product claim that corresponds to the system/apparatus of claim 1, which is already taught by the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla as detailed above. Yilmaz further teaches A non-transitory computer readable medium having instructions stored thereon, wherein the instructions, when executed by at least one processor, cause a device to perform operations comprising: the claimed functions (“We conduct end-to-end document ranking experiments on three TREC newswire collections: the Robust Track from 2004 (Robust04) and the Common Core Tracks from 2017 and 2018 (Core17 and Core18). Robust04 comprises 250 topics, with relevance judgments on a collection of 500K documents (TREC Disks 4 and 5). Core17 and Core18 have only 50 topics each; the former uses 1.8M articles from the New York Times Annotated Corpus while the latter uses around 600K articles from the TREC Washington Post Corpus…Code for replicating all the experiments described in this paper is available as part of our recently-developed Birch IR engine.2” [Yilmaz page 3 Experimental Setup]; Performing the disclosed experiments on millions of documents inherently requires a computer (“Birch IR engine”) with adequate processing and storage (i.e., computer-readable medium having instructions) capabilities). Consequently, claim 19 is rejected for the same reasons as claim 1.
Regarding claim 5, the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla teaches the limitations of parent claim 1, and Asikri further teaches wherein the semi-structured database is a JavaScript Object Notation (JSON) database (“Newer forms of web scraping involve listening to data feeds from web servers. For example, JSON is commonly used as a transport storage mechanism between the client and the web server” [Asikri page 3 Introduction]; "There are several good open source Web scraping frameworks, including Scrapy , Nutch [15] and Heritrix [16] . For medium sized scraping projects, Scrapy stands out from the rest since it is:….built-in support for exporting to CSV, JSON and XML" [Asikri page 7 Scraping e-commerce web site with scrapy and python]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the database [being] a JavaScript Object Notation (JSON) database as taught by Asikri into the combination because both Yilmaz and Asikri are directed towards retrieving text data from a knowledge base. JSON is a commonly used format to transport web data between a client and a server (“For example, JSON is commonly used as a transport storage mechanism between the client and the web server” [Asikri page 3 Introduction]) and intrinsically provides a structure to the data; therefore, incorporating handling of JSON data, as taught by Asikri, into Yilmaz would further enable expanding the BERT model’s field of use to text data hosted on webpages ([Asikri page 6 Web scraping]; [Asikri page 7 Scraping e-commerce web site with scrapy and python]).
Regarding claim 7, the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla teaches the limitations of parent claim 1 and Yilmaz further teaches rank[ing] two or more identified data segments ("Formally, in response to a user query Q, the system’s task is to produce a ranking of documents from a corpus that maximizes some ranking metric—in our case, average precision (AP)” [Yilmaz page 2 Background and Approach]; “Our second innovation involves the aggregation of sentence-level evidence for document ranking. That is, given an initial ranked list of documents, we segment each into sentences, and then apply inference over each sentence separately, after which sentence-level scores are aggregated to yield a final score for ranking documents” [Yilmaz page 2 Key Insights]; Sentence-level analysis (i.e., analyzing spans of data segments) is performed for each document (i.e., data segment) in the candidate list of documents (i.e., two or more data segments), and is aggregated into the produced final ranking of documents).
Regarding claim 21, the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla teaches the limitations of parent claim 1, and Xiong further teaches wherein the one or more attributes comprises one or more bigrams extracted from the query (“In the example depicted in FIG. 2, input tokens may include one or more of character unigrams 202a, character bigrams 202b, character trigrams 202c, word unigrams 202d, word bigrams 202e, etc. More or fewer token inputs may be used in accordance with the desired implementation” [Xiong col. 7 line 65 – col. 8 line 2]).
Regarding claim 22, the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla teaches the limitations of parent claim 1, and Xiong further teaches wherein the one or more attributes comprises one or more trigrams extracted from the query (“In the example depicted in FIG. 2, input tokens may include one or more of character unigrams 202a, character bigrams 202b, character trigrams 202c, word unigrams 202d, word bigrams 202e, etc. More or fewer token inputs may be used in accordance with the desired implementation” [Xiong col. 7 line 65 – col. 8 line 2]).
Regarding claim 25, the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla teaches the limitations of parent claim 1, and Choudhary further teaches wherein the plurality of computations of statistical measures comprises a first computation of statistical measure associated with an ensemble of a Bidirectional Encoder Representations for Transformers (BERT) ([Choudhary page 1 Introduction] and [Choudhary page 3 BERT Model] as detailed in claim 1 above).
Regarding claim 26, the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla teaches the limitations of parent claim 25, and Choudhary further teaches wherein the plurality of computations of statistical measures comprises a second computation of statistical measure associated with a non-stochastic retrieval based function ([Choudhary page 1 Introduction] and [Choudhary page 2 TF-IDF Model] as detailed in claim 1 above; the TF-IDF function is a non-stochastic retrieval based function by definition).
Regarding claim 27, the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla teaches the limitations of parent claim 25, and Choudhary further teaches wherein the plurality of computations of statistical measures comprises a third computation of statistical measure associated with a term frequency-inverse document frequency (tf-idf) function ([Choudhary page 1 Introduction] and [Choudhary page 2 TF-IDF Model] as detailed in claim 1 above).
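For illustration of the tf-idf function referenced above, the following minimal Python sketch uses the generic textbook formulation of term frequency-inverse document frequency; the two-document corpus and the particular weighting details are hypothetical and are not taken from Choudhary.

```python
import math

# Illustrative textbook tf-idf (not the cited reference's implementation);
# the two-document corpus below is hypothetical.
docs = [["apple", "banana", "apple"], ["banana", "cherry"]]

def tf_idf(term, doc, corpus):
    tf = doc.count(term) / len(doc)            # term frequency within the document
    df = sum(1 for d in corpus if term in d)   # number of documents containing the term
    idf = math.log(len(corpus) / df)           # inverse document frequency
    return tf * idf

# "apple" is concentrated in one document, so it scores above zero;
# "banana" occurs in every document, so its idf (and thus its score) is zero.
print(tf_idf("apple", docs[0], docs))
print(tf_idf("banana", docs[0], docs))
```

The zero score for a term appearing in every document illustrates why tf-idf is a deterministic (non-stochastic) retrieval function: identical inputs always yield identical scores.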
Regarding claim 28, the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla teaches the limitations of parent claim 1, and Yilmaz further teaches wherein the machine reading comprehension model is pre trained to extract a text-based answer from the data segment that addresses some or all of the query (“We begin with BERTLarge (uncased, 340m parameters) from Devlin et al. (2019), and then finetune on the collections described in Section 2.1, individually and in combination” [Yilmaz page 3]; BERTLarge (i.e., machine reading comprehension model) is a pre-trained model that is further fine-tuned prior to being leveraged for answering queries).
Regarding claim 29, the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla teaches the limitations of parent claim 1, and Billawalla further teaches wherein the receiving the query is performed in real time during the generating the query at the remote device (“The summarization engine may be software, hardware, firmware, or a combination thereof, corresponding to web server 120. Although not illustrated, according to some embodiments the summarization engine may be a separate server in communication with web server 120. It follows that the summarization engine may receive the viewing request from user device 110 and, in response to the viewing request, generate a summary of the requested content…In the network system 100, user device 110 is coupled, through network 140, with web server 120. A user operating user device 110 may be running a web browser application on the user device 110 to access a web page, documents, or other web-based information available on web server 120. The web-based information made available by web server 120 may be stored directly on a memory of web server 120, or may be stored on database 130 and accessed by web server 120. Herein, the phrase “coupled with” is defined to mean directly connected to or indirectly connected through one or more intermediate components.” [Billawalla ¶ 0016-0017]; It is implicit that devices within the network system are appropriately coupled to allow for near-instantaneous communication (i.e., in real time)).
Regarding claim 30, the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla teaches the limitations of parent claim 1, and Billawalla further teaches wherein the machine reading comprehension model transmits the portion of the data segment to the remote device automatically ([Billawalla ¶ 0016-0017] as detailed above; It is implicit that devices within the network system are appropriately coupled to allow for automatic communication).
Claims 2-3, 6, and 31 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla, as applied to claim 1 above, further in view of Kamphius et al., (“Which BM25 Do You Mean? A Large-Scale Reproducibility Study of Scoring Variants”, available 2020), hereinafter Kamphius.
Regarding claim 2, the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla teaches the limitations of parent claim 1, and Asikri further teaches receiv[ing] data from the semi-structured database ([Asikri pages 2-3 Introduction] and [Asikri page 7 Scraping e-commerce web site with scrapy and python] and [Asikri page 5 HTML processing] as detailed in claim 1 above).
However, the combination does not explicitly teach creat[ing] an inverted index of the data segments and stor[ing] the inverted index in an index database.
In the same field of endeavor, Kamphius teaches a system of retrieving information from a knowledge base (“BM25 [8] is perhaps the most well-known scoring function for “bag of words” document retrieval...Our goal is a large-scale reproducibility study to explore the nuances of different variants of BM25 and their impact on retrieval effectiveness. We include in our study the specifics of the implementation of BM25 in the Lucene open-source search library" [Kamphius pages 1-2 Introduction]) that creat[es] an inverted index of the data segments ("Although learning-to-rank approaches and neural ranking models are widely used today, they are typically deployed as part of a multi-stage reranking architecture, over candidate documents supplied by a simple term-matching method using traditional inverted indexes [1]." [Kamphius page 1 Introduction]; As part of ranking model approaches (e.g., BM25), a term-matching method using inverted indexes can be used to re-rank candidate documents) and,
store[s] the inverted index in an index database (“As an alternative, it is possible to “export” the inverted index to a relational database and recast the document ranking problem into a database (specifically, SQL) query” [Kamphius page 2 Introduction]; Inverted indexes can be exported to a separate relational database).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated creat[ing] an inverted index of the data segments and stor[ing] the inverted index in an index database as taught in Kamphius into the combination because both Yilmaz and Kamphius are directed towards retrieving information from a knowledge base. Given that Yilmaz already makes use of BM25 to retrieve initially ranked data ("During inference, we first retrieve an initial ranked list of documents...following Lin (2018) and Yang et al. (2019a), we use BM25 with RM3 query expansion (default parameters)" [Yilmaz page 3 Model Details]), a person of ordinary skill in the art would recognize the value of incorporating the teachings of Kamphius in order to provide multiple variants of BM25 that could be tested for retrieval effectiveness ("As many researchers have previously observed, e.g., Trotman et al. [11], the referent of BM25 is quite ambiguous. There are, in fact, many variants of the scoring function: beyond the original version proposed by Robertson et al. [8], many variants exist that include small tweaks by subsequent researchers...Our goal is a large-scale reproducibility study to explore the nuances of different variants of BM25 and their impact on retrieval effectiveness" [Kamphius Introduction pages 1-2]) prior to the re-ranking performed by the BERT model.
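The creation of an inverted index and its "export" to a relational database, as the Kamphius quotations describe, can be illustrated with a minimal Python sketch; the segment texts, identifiers, and table schema below are hypothetical and are not drawn from the cited reference.

```python
import sqlite3

# Hypothetical data segments (segment id -> text); illustrative only.
segments = {1: "bm25 ranks documents", 2: "bert reranks documents"}

# Build the inverted index: each term maps to the segments containing it.
index = {}
for seg_id, text in segments.items():
    for term in text.split():
        index.setdefault(term, []).append(seg_id)

# "Export" the inverted index to a relational database, so that document
# ranking can be recast as a SQL query (per the Kamphius quotation).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE postings (term TEXT, seg_id INTEGER)")
db.executemany(
    "INSERT INTO postings VALUES (?, ?)",
    [(t, s) for t, ids in index.items() for s in ids],
)

# Term matching via SQL: which segments contain the term "documents"?
rows = db.execute("SELECT seg_id FROM postings WHERE term = 'documents'").fetchall()
print(sorted(r[0] for r in rows))
```

A BM25-style scoring function would then operate over the retrieved postings; only the index construction and database storage are sketched here.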
Regarding claim 3, the combination of Yilmaz, Choudhary, Xiong, Asikri, Billawalla, and Kamphius teaches the limitations of parent claim 2, and Asikri further teaches wherein the semi-structured database is a plurality of uniform resource locators (URLs) (“Scrapy [6] is a powerful Web scraping framework for Python, where robots are defined as classes inheriting from BaseSpider class, which defines a set of ‘starting urls‘ and a ‘parse‘ function called at each Web iteration. Web pages are automatically parsed and Web contents are extracted using XPath expressions” [Asikri page 4 Frameworks]; Web scraping can be performed using a list of URLs from which to extract web content).
Regarding claim 6, the combination of Yilmaz, Choudhary, Xiong, Asikri, Billawalla, and Kamphius teaches the limitations of parent claim 2, and Asikri further teaches segment[ing] the received data into the plurality of data segments (“In this paper, we focus on Web scrapers that extract textual information from Web pages” [Asikri page 6 Web scraping]; “Basicly [sic], selectors are the path (or formula) of the items we need to extract data from inside a HTML page” [Asikri page 9 Understanding selectors]; “When the extractions finishes we will have a CSV file (deals.csv) containing Jumia.ma deals for the day, contains a sample out” [Asikri page 9 Example of scraping e-commerce web site]; From the received HTML page (i.e., received data), the web scraper extracts text data (i.e., data segments) to be output to a CSV file (i.e., segmented into tabular data with commas separating each field)).
Regarding claim 31, the combination of Yilmaz, Choudhary, Xiong, Asikri, Billawalla, and Kamphius teaches the limitations of parent claim 3, and Asikri further teaches wherein the plurality of URLs is maintained in one or more specified patterns (see “url = href.extract()” in Example of scraping e-commerce web site [Asikri page 8]; URLs are stored within href attributes of HTML pages).
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Yilmaz, Choudhary, Xiong, Asikri, Billawalla, and Kamphius as applied to claim 2 above, further in view of Sermet et al., (“A Semantic Web Framework for Automated Smart Assistants: COVID-19 Case Study”, available 2020), hereinafter Sermet.
Regarding claim 4, the combination of Yilmaz, Choudhary, Xiong, Asikri, Billawalla, and Kamphius teaches the limitations of parent claim 2, and Asikri further teaches retrieving the title of the data segments ("This is the code used to scrape the web site: [code]" [Asikri pages 8-9 Example of scraping e-commerce web site]; See line "'Title': response.css('.product-title:text').extract()[0]" within code that extracts title field from webpage data). Kamphius further teaches creat[ing] the inverted index from fields of the data segments ([Kamphius page 1 Introduction] as detailed in claim 2 above; As part of ranking model approaches (e.g., BM25), a term-matching method (i.e., matching spans or fields of data segments) using inverted indexes can be used to re-rank candidate documents).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated retrieving the title fields of the data segments as taught by Asikri into the combination because both Yilmaz and Asikri are directed towards retrieving text data from a knowledge base. Web scraping intrinsically comprises selecting desired field components (e.g., title fields) of a webpage for extraction (“Before scraping the website, it is important to understand the concept of selectors in scrapy…selectors are the path (or formula) of the items we need to extract data from inside a HTML page” [Asikri page 8 Understanding Selectors]); therefore, incorporating the teachings of Asikri into Yilmaz would further enable expanding the BERT model’s field of use to text data hosted on webpages ([Asikri page 6 Web scraping]; [Asikri page 7 Scraping e-commerce web site with scrapy and python]).
However, the combination does not explicitly teach retrieving a question field and an answer field of the data segments.
In the same field of endeavor, Sermet teaches a system of using web scraping to retrieve text information from a knowledge base ("There are two main approaches for powering the Instant Expert with a knowledge base; knowledge engine mode and question and answer (Q&A) mode...The Q&A mode requires a list of question & answer pairs supplied manually or by providing the URL of a webpage containing such pairs (e.g. Frequently Asked Questions)" [Sermet page 5 Knowledge Generation Module]) that retrieves a question field, and an answer field of the data segments ("More concretely, the HTML elements containing the FAQ question will likely share certain characteristics such as element tag, class, styling, immediate parents, and DOM depth...The parsing process begins with finding innermost HTML elements containing a question mark to retrieve unique texts. Per our assumption, the majority of these question marks should belong to FAQ questions...the system is able to extract all answers regardless of their scope and structuring. Thus, all Q&A pairs are extracted from an FAQ webpage in a heuristic manner requiring no data other than the URL of the Q&A page" [Sermet pages 6-7 Q&A Parsing and Collection]).
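The question-mark heuristic that the Sermet quotation describes can be illustrated with a minimal Python sketch: text units ending in "?" are treated as probable FAQ questions, and the following text is paired with them as the answer. HTML parsing is omitted, and the input lines are hypothetical, not drawn from the cited reference.

```python
# Hypothetical text units as might be extracted from an FAQ webpage
# (illustrative only; real input would come from parsed HTML elements).
lines = [
    "How long is shipping?",
    "Orders ship within two days.",
    "Can I return an item?",
    "Returns are accepted for 30 days.",
]

# Per the heuristic, a unit ending in a question mark is likely an FAQ
# question; the first following non-question unit is taken as its answer.
pairs = []
question = None
for text in lines:
    if text.rstrip().endswith("?"):
        question = text
    elif question is not None:
        pairs.append((question, text))
        question = None

print(pairs)
```

This produces question/answer pairs from otherwise unlabeled text, which is the sense in which the pairs are "generated from a plurality of text different from" the pairs themselves.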
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated retrieving a question field and an answer field of the data segments as taught by Sermet into the combination because Yilmaz, Asikri, and Sermet are all directed towards retrieving text information from a knowledge base; furthermore, Asikri is also particularly directed towards extracting fields of data from a webpage via web scraping. Incorporating the teachings of Sermet into the combination would thereby further expand possible applications of the BERT model to handling text data hosted on webpages, e.g., enabling smart assistants/chatbots for websites that instantly provide responses to queries ("Instant Expert is capable of automatically parsing, processing, and modeling internal (same-origin) or external (cross-origin) Frequently Asked Questions (FAQ) webpages as an information resource as well as communicating with an external knowledge engine for more advanced use cases...The presented component makes it possible for any web system on any domain to have its own voice-enabled smart assistant to instantly provide factual responses to natural language queries" [Sermet page 3 Introduction]).
Claims 23-24 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla, as applied to claim 1 above, further in view of Sermet et al., (“A Semantic Web Framework for Automated Smart Assistants: COVID-19 Case Study”, available 2020), hereinafter Sermet.
Regarding claim 23, the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla teaches the limitations of parent claim 1.
However, the combination does not expressly teach wherein a first data segment in the plurality of data segments comprises a JavaScript Object Notation (JSON) object comprising a question and answer field.
In the same field of endeavor, Sermet teaches a system of using web scraping to retrieve text information from a knowledge base ("There are two main approaches for powering the Instant Expert with a knowledge base; knowledge engine mode and question and answer (Q&A) mode...The Q&A mode requires a list of question & answer pairs supplied manually or by providing the URL of a webpage containing such pairs (e.g. Frequently Asked Questions)" [Sermet page 5 Knowledge Generation Module]) wherein a first data segment comprises a JavaScript Object Notation (JSON) object comprising a question and answer field (“If the developer enabled downloadModel attribute of the Instant Expert web component, then the framework will generate a JSON file consisting of the tensor matrix and the Q&A pairs. This JSON file can be hosted on a server and the URL to access the file can be provided to Instant Expert...Thus, the framework allows three different usage styles for the Q&A Mode: same-origin or cross-origin FAQ webpage, manual Q&A definition, and JSON file containing Q&A pairs and question embeddings.” [Sermet page 8 Q&A Encoding]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated wherein a first data segment comprises a JavaScript Object Notation (JSON) object comprising a question and answer field as taught by Sermet into the combination because Yilmaz, Asikri, and Sermet are all directed towards retrieving text information from a knowledge base; furthermore, Asikri is also particularly directed towards extracting fields of data from a webpage via web scraping and JSON files. Incorporating the teachings of Sermet into the combination would thereby further expand possible applications of the BERT model to handling text data hosted on webpages, e.g., enabling smart assistants/chatbots for websites that instantly provide responses to queries ("Instant Expert is capable of automatically parsing, processing, and modeling internal (same-origin) or external (cross-origin) Frequently Asked Questions (FAQ) webpages as an information resource as well as communicating with an external knowledge engine for more advanced use cases...The presented component makes it possible for any web system on any domain to have its own voice-enabled smart assistant to instantly provide factual responses to natural language queries" [Sermet page 3 Introduction]).
Regarding claim 24, the combination of Yilmaz, Choudhary, Xiong, Asikri, Billawalla, and Sermet teaches the limitations of parent claim 23, and Asikri further teaches wherein the first data segment in the plurality of data segments comprises a title field ("This is the code used to scrape the web site: [code]" [Asikri pages 8-9 Example of scraping e-commerce web site]; See line "'Title': response.css('.product-title:text').extract()[0]" within code that extracts title field from webpage data).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated retrieving the title fields of the data segments as taught by Asikri into the combination because both Yilmaz and Asikri are directed towards retrieving text data from a knowledge base. Web scraping intrinsically comprises selecting desired field components (e.g., title fields) of a webpage for extraction (“Before scraping the website, it is important to understand the concept of selectors in scrapy…selectors are the path (or formula) of the items we need to extract data from inside a HTML page” [Asikri page 8 Understanding Selectors]); therefore, incorporating the teachings of Asikri into Yilmaz would further enable expanding the BERT model’s field of use to text data hosted on webpages ([Asikri page 6 Web scraping]; [Asikri page 7 Scraping e-commerce web site with scrapy and python]).
Claim 32 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla, as applied to claim 1 above, further in view of Kamphius et al., (“Which BM25 Do You Mean? A Large-Scale Reproducibility Study of Scoring Variants”, available 2020), hereinafter Kamphius, and Sermet et al., (“A Semantic Web Framework for Automated Smart Assistants: COVID-19 Case Study”, available 2020), hereinafter Sermet.
Regarding claim 32, the combination of Yilmaz, Choudhary, Xiong, Asikri, and Billawalla teaches the limitations of parent claim 1.
However, the combination does not expressly teach creating [an] inverted index of data segments.
In the same field of endeavor, Kamphius teaches a system of retrieving information from a knowledge base (“BM25 [8] is perhaps the most well-known scoring function for “bag of words” document retrieval...Our goal is a large-scale reproducibility study to explore the nuances of different variants of BM25 and their impact on retrieval effectiveness. We include in our study the specifics of the implementation of BM25 in the Lucene open-source search library" [Kamphius pages 1-2 Introduction]) that creat[es] an inverted index of data segments ("Although learning-to-rank approaches and neural ranking models are widely used today, they are typically deployed as part of a multi-stage reranking architecture, over candidate documents supplied by a simple term-matching method using traditional inverted indexes [1]." [Kamphius page 1 Introduction]; As part of ranking model approaches (e.g., BM25), a term-matching method using inverted indexes can be used to re-rank candidate documents).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated creat[ing] an inverted index of the data segments as taught in Kamphius into the combination because both Yilmaz and Kamphius are directed towards retrieving information from a knowledge base. Given that Yilmaz already makes use of BM25 to retrieve initially ranked data ("During inference, we first retrieve an initial ranked list of documents...following Lin (2018) and Yang et al. (2019a), we use BM25 with RM3 query expansion (default parameters)" [Yilmaz page 3 Model Details]), a person of ordinary skill in the art would recognize the value of incorporating the teachings of Kamphius in order to provide multiple variants of BM25 that could be tested for retrieval effectiveness ("As many researchers have previously observed, e.g., Trotman et al. [11], the referent of BM25 is quite ambiguous. There are, in fact, many variants of the scoring function: beyond the original version proposed by Robertson et al. [8], many variants exist that include small tweaks by subsequent researchers...Our goal is a large-scale reproducibility study to explore the nuances of different variants of BM25 and their impact on retrieval effectiveness" [Kamphius Introduction pages 1-2]) prior to the re-ranking performed by the BERT model.
However, the combination does not expressly teach data segments comprising a plurality of probable text pairs generated from a plurality of text different from the plurality of probable text pairs.
In the same field of endeavor, Sermet teaches a system of using web scraping to retrieve text information from a knowledge base ("There are two main approaches for powering the Instant Expert with a knowledge base; knowledge engine mode and question and answer (Q&A) mode...The Q&A mode requires a list of question & answer pairs supplied manually or by providing the URL of a webpage containing such pairs (e.g. Frequently Asked Questions)" [Sermet page 5 Knowledge Generation Module]) with data segments comprising a plurality of probable text pairs generated from a plurality of text different from the plurality of probable text pairs ("More concretely, the HTML elements containing the FAQ question will likely share certain characteristics such as element tag, class, styling, immediate parents, and DOM depth...The parsing process begins with finding innermost HTML elements containing a question mark to retrieve unique texts. Per our assumption, the majority of these question marks should belong to FAQ questions...the system is able to extract all answers regardless of their scope and structuring. Thus, all Q&A pairs are extracted from an FAQ webpage in a heuristic manner requiring no data other than the URL of the Q&A page" [Sermet pages 6-7 Q&A Parsing and Collection]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated data segments comprising a plurality of probable text pairs generated from a plurality of text different from the plurality of probable text pairs as taught by Sermet into the combination because Yilmaz, Asikri, and Sermet are all directed towards retrieving text information from a knowledge base; furthermore, Asikri is also particularly directed towards extracting fields of data from a webpage via web scraping. Incorporating the teachings of Sermet into the combination would thereby further expand possible applications of the BERT model to handling text data hosted on webpages, e.g., enabling smart assistants/chatbots for websites that instantly provide responses to queries ("Instant Expert is capable of automatically parsing, processing, and modeling internal (same-origin) or external (cross-origin) Frequently Asked Questions (FAQ) webpages as an information resource as well as communicating with an external knowledge engine for more advanced use cases...The presented component makes it possible for any web system on any domain to have its own voice-enabled smart assistant to instantly provide factual responses to natural language queries" [Sermet page 3 Introduction]).
Response to Arguments
The remarks filed 12/19/2025 have been fully considered.
Applicant’s remarks [Remarks pages 8-10] traversing the subject matter eligibility rejections under 35 U.S.C. 101 set forth in the office action mailed 9/19/2025, in view of claims 1-8, 19, and 21-32 as amended, have been considered but are not persuasive.
Applicant alleges that the claims are patent eligible under Step 2A Prong 1 because they do not recite mental processes, that the claims are patent eligible under Step 2A Prong 2 because any alleged abstract idea is integrated into a practical application, and that the claims are patent eligible under Step 2B because they amount to significantly more than the alleged abstract idea.
The examiner respectfully disagrees. Applicant is directed towards the grounds of rejection under 35 U.S.C. 101 with respect to amended claims 1-8, 19, and 21-32 set forth above. Applicant’s arguments are further summarized and addressed below.
Applicant argues [Remarks pages 8-9] that the claims contain limitations that cannot be practically performed in the human mind, that they output a modified computer data structure, and that, because processes are performed in “real time”, they are “computationally complex” beyond what can be mentally performed.
In response, the examiner reiterates that a claim being performed on a computer or in a computer environment can still recite a mental process (see MPEP § 2106.04(a)(2)), and that simply invoking computers as tools to perform claim steps does not absolve the claims from abstraction. Applicant has not explained, beyond conclusory assertions, how the limitations at issue represent “computational complexity that is beyond that which can be mentally performed”.
Applicant argues [Remarks page 10] that independent claims provide a first and second subprocess that ultimately transmits a portion of a data segment to a remote device responsive to a query received from the remote device based upon semi-structured knowledge and lacking a known intent of the query, such that network resources are conserved, and processing efficiency of the computer system is increased. This concept of processing efficiency and conservation of resources is allegedly recited in a specific manner that represents a technical improvement over systems that are configured to process and transmit recommended items in an unrestricted manner during the respective session at the website.
In response, the examiner reiterates that simply claiming the improved speed or efficiency inherent with applying an abstract idea on a computer or in a computer environment is inadequate to suggest integration into a practical application or improvement over conventional technology (see MPEP § 2106.05(f)). Applicant’s alleged improvement is set forth in a conclusory manner and without support given from any significant technical details in the claims that would adequately reflect an improvement of computational efficiency. The claimed improvements to processing efficiency and conservation of resources thereby appear to rely on no more than the speed inherent to the conventional receiving and transmitting of data between devices.
Applicant argues [Remarks page 10] that the first and second subprocesses represent an unconventional combination of features that confine the claims to a particular useful application under MPEP § 2106.05(d).
In response, the examiner reiterates that conclusory assertions of an “unconventional combination of features”, without further detail or explanation provided, are inadequate to suggest subject matter eligibility.
Applicant has not presented further arguments with respect to the dependent claims. As such, amended claims 1-8, 19, and 21-32 stand rejected under 35 U.S.C. 101.
Applicant’s remarks traversing the obviousness rejections under 35 U.S.C. 103 set forth in the office action mailed 9/19/2025, in view of claims 1-8, 19, and 21-32 as amended, have been considered but are moot because the new grounds of rejection set forth above do not rely on the reference(s) applied in the prior rejection of record for the subject matter specifically challenged in applicant’s argument.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Gillioz et al. (“Overview of the Transformer-based Models for NLP Tasks”, available 2020) discloses an overview and explanations of the latest transformer models in the field of natural language processing.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VIJAY M BALAKRISHNAN whose telephone number is (571) 272-0455. The examiner can normally be reached 10am-5pm EST Mon-Thurs.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JENNIFER WELCH can be reached on (571) 272-7212. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/V.M.B./
Examiner, Art Unit 2143
/JENNIFER N WELCH/
Supervisory Patent Examiner, Art Unit 2143