Last updated: May 29, 2026
Application No. 17/705,399
RETRIEVAL METHOD, COMPUTER-READABLE RECORDING MEDIUM, AND RETRIEVAL DEVICE

Final Rejection §101§103
Filed
Mar 28, 2022
Priority
Oct 31, 2019 — continuation of PCTJP2019042950
Examiner
KRIANGCHAIVECH, KETTIP
Art Unit
1686
Tech Center
1600 — Biotechnology & Organic Chemistry
Assignee
Fujitsu Limited
OA Round
2 (Final)
Office Action

§101 §103
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Applicant's response, filed on 12/29/2025, has been fully considered.  The following rejections and/or objections are either reiterated or newly applied.  They constitute the complete set presently being applied to the instant application.

Status of Claims
Claims 1-9 are amended. 
Claims 1-9 are pending. 
Claims 1 and 8-9 are independent claims.
Claims 1-9 are examined below.

Priority
As detailed on the 03/31/2022 filing receipt, this application claims domestic priority to as early as 10/31/2019 of PCT/JP2019/042950. 

Drawings
	The drawings filed 03/28/2022 are accepted.

Withdrawn Rejections/Objections
The objection to the disclosure in the Office action mailed 10/01/2025 is withdrawn in view of the amendments filed 12/29/2025. Applicant amended the specification to delete the embedded hyperlink.
The rejection of claims 1-9 under 35 U.S.C. §112(a), in the Office action mailed 10/01/2025 is withdrawn in view of the amendments filed 12/29/2025.
The rejection of claims 1-9 under 35 U.S.C. §102(a)(1) as being anticipated by Willett, in the Office action mailed 10/01/2025 is withdrawn in view of the amendments filed 12/29/2025. However, a new rejection is applied.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-9 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
In accordance with MPEP § 2106, claims found to recite statutory subject matter (Step 1: YES) are then analyzed to determine if the claims recite any concepts that equate to an abstract idea, law of nature or natural phenomenon (Step 2A, Prong 1). In the instant application, the claims recite the following limitations that equate to an abstract idea:

Mental processes recited include:
Claims 1 and 8-9 recite:  specifying each compound name included in the input document and the plurality of documents, transforming each compound name to an individual chemical structure, dividing the individual chemical structure into one or more substructures, counting a number of each of substructures in the input document, and counting a number of respective substructures in each of the plurality of documents, generating a substructure vector for the input document based on each substructure and the number of substructures, and a substructure vector for each of the plurality of documents based on each substructure and the number of substructures, the vector representing meaning of respective documents based on meaning of compounds, calculating similarity between the input document and the plurality of documents based on comparison between the substructure vector for the input document and the substructure vector for each of the plurality of documents, and outputting, as a retrieval result, a document with highest similarity to the input document from the plurality of documents. The identified claim limitations involve evaluating, analyzing, judging and organizing data that could be practically performed in the human mind and/or with pen and paper. 
Claim 2 recites: generating of the substructure vector for the input document and the substructure vector for each of the plurality of documents includes generating the vector including information indicating the number of each of the substructures or information indicating whether the number of each of the substructures is zero as a component of the vector. 
Claim 3 recites: counting includes further counting number of each combination of the substructures included in the input document, and the generating includes generating the substructure vector for the input document based on both the number of each of the substructures and the number of each combination of the substructures that are totalized by totalization processing. Counting involves acts of analyzing, judging and evaluating data that can be practically performed in the human mind and/or with pen and paper.
Claim 4 recites counting includes calculating a sum of products between the number of each of the substructures included in each of the compounds and number of each of compound names indicating the compounds included in the input document, as the number of each of the substructures included in the input document.
Claim 5 recites: ... calculating the similarity between the input document and the plurality of documents based on comparison between a vector obtained by weighting the substructure vector for the input document generated in the generating on a basis of appearance frequency of each substructure and the substructure vector for each of the plurality of documents. Calculating and comparing involves acts of analyzing, judging and evaluating data that can be practically performed in the human mind and/or with pen and paper.
Claim 6 recites: ... calculating similarity between the input document and the plurality of documents based on comparison between the substructure vector for the input document and the substructure vectors for each of the plurality of documents and semantic comparison between the input document and the plurality of documents. The process of calculating involves analyzing, judging and evaluating data that can be practically performed in the human mind and/or with pen and paper.
Claim 7 recites: “…a list of documents included in the plurality of documents in a descending order of the calculated similarity.” Ordering involves analyzing, judging, organizing and evaluating data that can be practically performed in the human mind and/or with pen and paper.

Mathematical concepts recited include:
Claims 1 and 8-9 recite: counting a number of respective substructures in each of the plurality of documents, generating a substructure vector for the input document based on each substructure and the number of substructures, and a substructure vector for each of the plurality of documents based on each substructure and the number of substructures, the vector representing meaning of respective documents based on meaning of compounds, calculating similarity between the input document and the plurality of documents based on comparison between the substructure vector for the input document and the substructure vector for each of the plurality of documents... Counting, generating a substructure vector, and calculating similarity involve mathematical concepts and formulas and require performing a series of mathematical calculations.
Claim 2 recites: generating of the substructure vector for the input document and the substructure vector for each of the plurality of documents includes generating the vector including information indicating the number of each of the substructures or information indicating whether the number of each of the substructures is zero as a component of the vector. Generating substructure vectors is a mathematical concept.
Claim 3 recites: counting includes further counting number of each combination of the substructures included in the input document, and the generating includes generating the substructure vector for the input document based on both the number of each of the substructures and the number of each combination of the substructures that are totalized by totalization processing. Counting, generating substructure vector and totalization process are mathematical concepts.
Claim 4 recites:  counting includes calculating a sum of products between the number of each of the substructures included in each of the compounds and number of each of compound names indicating the compounds included in the input document, as the number of each of the substructures included in the input document. The claim limitation is a mathematical concept and formula and requires performing a series of mathematical calculations.  
Claim 5 recites: calculating includes calculating the similarity between the input document and the plurality of documents based on comparison between a vector obtained by weighting the substructure vector for the input document generated in the generating on a basis of appearance frequency of each substructure and the substructure vector for each of the plurality of documents. This limitation involves mathematical concepts that require performing a series of mathematical calculations.
Claim 6 recites: calculating similarity between the input document and the plurality of documents based on comparison between the substructure vector for the input document and the substructure vectors for each of the plurality of documents and semantic comparison between the input document and the plurality of documents. Calculating similarity is a mathematical concept and/or formula.
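For illustration only, the sum of products recited in claim 4 can be sketched in Python as follows. The compound names, substructure labels, and counts are hypothetical and are not taken from the application or the cited references; the function name is likewise an assumption.

```python
from collections import Counter

def document_substructure_counts(name_counts, substructures_per_compound):
    """For each substructure, sum over compounds:
    (mentions of the compound name in the document) x (count of the substructure in the compound)."""
    totals = Counter()
    for name, n_mentions in name_counts.items():
        for sub, n_sub in substructures_per_compound[name].items():
            totals[sub] += n_mentions * n_sub
    return totals

# Hypothetical document: "ethanol" is named twice, "acetic acid" once.
name_counts = {"ethanol": 2, "acetic acid": 1}

# Hypothetical substructure counts per compound (labels are illustrative only).
substructures_per_compound = {
    "ethanol": {"C-C": 1, "C-O": 1},
    "acetic acid": {"C-C": 1, "C-O": 1, "C=O": 1},
}

totals = document_substructure_counts(name_counts, substructures_per_compound)
print(dict(totals))  # {'C-C': 3, 'C-O': 3, 'C=O': 1}
```

In this sketch, "C-C" receives 2×1 + 1×1 = 3, which is the per-document count that would feed the substructure vector.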

The mental processes of claims 1, 3, and 5-9 include specifying a compound name, transforming a compound name to a chemical structure, counting, comparing, ordering and calculating that involve evaluating, analyzing, judging and organizing data that could be practically performed in the human mind and/or with pen and paper. Therefore, under the broadest reasonable interpretation, the claims can be practically carried out in the human mind or with pen and paper as claimed, which falls under the "Mental processes" grouping of abstract ideas. Although claims 1-9 recite performing the steps on a computer, there are no additional limitations to indicate that anything other than a generic computer is required. Merely requiring that the steps are carried out with a generic computer does not negate the mental nature of these steps and equates rather to merely using a computer as a tool to perform the mental process.
The mathematical concepts of claims 1-6 and 8-9 as discussed above include calculating and generating vectors that are mathematical concepts and require carrying out a series of mathematical calculations. This falls under the “mathematical concepts” grouping of abstract ideas. 
As such, claims 1-9 recite an abstract idea (Step 2A, Prong 1: YES). 

Claims found to recite a judicial exception under Step 2A, Prong 1 are then further analyzed to determine if the claims as a whole integrate the recited judicial exception into a practical application or not (Step 2A, Prong 2). This judicial exception is not integrated into a practical application because the claims do not recite an additional element that reflects an improvement to technology or applies or uses the recited judicial exception in some other meaningful way. Rather, the instant claims recite additional elements that equate to mere instructions to implement an abstract idea or insignificant extra solution activity. Specifically, the instant claims recite the following additional elements:

claims 1 and 8-9: the recited "specifying each compound name included in the input document and the plurality of documents, transforming each compound name to an individual chemical structure…, dividing the individual chemical structure into one or more substructures ... outputting, as a retrieval result, a document with highest similarity to the input document from the plurality of documents." step/element. These limitations equate to mere data gathering and outputting. 
claims 1-7: the recited "A non-transitory computer-readable recording medium having stored therein a program that causes a computer to execute a process" step/element. These elements equate to a generic computer environment. 
Claim 7: the recited “outputting includes displaying, on a display screen, a list of documents…”
claim 8: the recited "a computer" step/element. This element equates to a generic computer environment. 
claim 9: the recited "A retrieval device, comprising: a memory; and a processor coupled to the memory, the processor being configured to execute a process " step/element. These elements equate to a generic computer environment. 

The components identified in claims 1-9 equate to generic computer storage with stored instructions that are executed to implement the abstract idea on a generic computer. These limitations equate to a generic computer environment. The limitations of claims 1 and 7-9, as discussed above, also equate to mere data gathering and outputting via generic computer components, such as receiving data at a computer or outputting data, which amounts to insignificant extra-solution activity (MPEP 2106.05(g)). As such, as currently recited, the claims do not appear to recite an improvement to technology or apply or use the recited judicial exception in some other meaningful way. Therefore, claims 1-9 are directed to an abstract idea (Step 2A, Prong 2: NO).

Claims found to be directed to a judicial exception are then further evaluated to determine if the claims recite an inventive concept that provides significantly more than the judicial exception itself (Step 2B). The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the claims recite additional elements that equate to well-understood, routine and conventional activities, insignificant extra-solution activity or mere instructions to implement the abstract idea on a generic computer. The instant claims recite the following additional elements:
claims 1 and 8-9: the recited "specifying each compound name included in the input document and the plurality of documents, transforming each compound name to an individual chemical structure…, dividing the individual chemical structure into one or more substructures ... outputting, as a retrieval result, a document with highest similarity to the input document from the plurality of documents." step/element. These limitations equate to mere data gathering and outputting. 
claims 1-7: the recited "A non-transitory computer-readable recording medium having stored therein a program that causes a computer to execute a process" step/element. These elements equate to a generic computer environment. 
Claim 7: the recited “outputting includes displaying, on a display screen, a list of documents…”
claim 8: the recited "a computer" step/element. This element equates to a generic computer environment. 
claim 9: the recited "A retrieval device, comprising: a memory; and a processor coupled to the memory, the processor being configured to execute a process " step/element. These elements equate to a generic computer environment. 

Limitations that equate to mere data gathering and outputting via generic computer components, such as receiving data at a computer or outputting data, amount to insignificant extra-solution activity as set forth by the courts in Mayo, 566 U.S. at 79, 101 USPQ2d at 1968 and OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1092-93 (Fed. Cir. 2015). Also, the additional elements include storing and retrieving information in memory. Also, the use of a computer or other machinery in its ordinary capacity for economic or other tasks (e.g., to receive, store, or transmit data) or simply adding a general purpose computer or computer components after the fact to an abstract idea (e.g., a fundamental economic practice or mathematical equation) does not integrate a judicial exception into a practical application or provide significantly more as identified by the courts in Affinity Labs v. DirecTV, 838 F.3d 1253, 1262, 120 USPQ2d 1201, 1207 (Fed. Cir. 2016) (cellular telephone); TLI Communications LLC v. AV Auto. LLC, 823 F.3d 607, 613, 118 USPQ2d 1744, 1748 (Fed. Cir. 2016) (computer server and telephone unit). Overall, the additional elements do not comprise an inventive concept when considered individually or as an ordered combination that transforms the claimed judicial exception into a patent-eligible application of the judicial exception. Therefore, the claims do not amount to significantly more than the judicial exception itself (Step 2B: NO). As such, claims 1-9 are not patent eligible.


Response to 35 USC § 101 Arguments (Remarks filed 12/29/2025, pages 10-12) 
Applicant amended claims 1-9.
Applicant states that the claims are amended to include:
1. Extracting compound names from a document and mapping them to specific chemical structures.  
2. Dividing those structures into substructures and generating high-dimensional vectors. 
3. Calculating similarity scores. 
Applicant references the August 4, 2025, USPTO Memorandum. Applicant states that the human mind is not equipped to perform high-dimensional data processing in real-time or across vast databases. Applicant also states that the human mind cannot perform the steps of amended claims indicated above for hundreds of thousands of documents.  

In response, Applicant’s arguments under Step 2A, Prong 1 are not persuasive because there is no indication in the claims that the process or amount of data is too complicated or too large to be performed by the human mind and/or with pen and paper. Therefore, independent claims 1 and 8-9 recite mental processes and mathematical concepts as discussed in the 101 rejection section above.


Applicant further states that even if the claims were found to involve an abstract idea, the abstract ideas are integrated into a practical application by providing a technological solution to a technological problem. Applicant indicates that paragraph [0008] of the instant Specification discloses that in the chemical field, one compound may have dozens of different names (synonyms), and there are approximately one hundred million kinds of compound names. Applicant discusses that traditional keyword search fails because it cannot account for these semantic variations. Applicant also mentions that the claims are directed to using a substructure vector to represent the chemical characteristics of a document regardless of the specific nomenclature used. Applicant indicates that paragraph [0015] of the instant Specification, discloses the standardization of diverse nomenclature into a vector space, which improves the functioning of the retrieval device itself, i.e., making it more accurate. Applicant asserts that the claims provide an inventive concept because they recite a specific ordered combination of steps, e.g., extracting names, transforming them into substructures, vectorizing those substructures, and calculating similarity, that is not well-understood, routine, or conventional in the field of general document retrieval. 

In response, Applicant’s arguments under Step 2A Prong 2 of the 101 analysis regarding improvement have been fully considered and are not persuasive. Applicant’s argument of improvement is a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art. From the asserted improvement, it is not clear how the claimed invention improves over existing technology and it is also not clear how one would gauge the improvement since there are no metrics for comparison between the claimed technology and previous technology. Overall, one of ordinary skill in the art cannot gauge whether the improvements asserted are delivered by the claims because the details provided in the specification do not provide sufficient details such that the improvement would be apparent, do not explain the details of an unconventional technical solution expressed in the claim, or identify technical improvements realized by the claim over the prior art.  As stated in MPEP 2106.05(a) and MPEP 2106.04(d), the disclosure must provide sufficient details such that one of ordinary skill in the art would recognize the claimed invention as providing an improvement. Furthermore, if the specification explicitly sets forth an improvement but in a conclusory manner (i.e., a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art), the examiner should not determine the claim improves technology. An indication that the claimed invention provides an improvement can include a discussion in the specification that identifies a technical problem and explains the details of an unconventional technical solution expressed in the claim, or identifies technical improvements realized by the claim over the prior art. (see MPEP 2106.05(a) and MPEP 2106.04(d)). 
Under Step 2B of the 101 analysis, the claims are evaluated to determine whether they recite a non-conventional arrangement of additional elements (i.e., elements in addition to any judicial exception). As indicated in the 101 rejection section above, the additional elements of independent claims 1 and 8-9 include generic computer components, inputting data and outputting data. The inputting of data or a document to be processed by extracting names, transforming them into substructures, vectorizing those substructures, and calculating similarity is pre-solution activity, and the outputting of data in the form of a list of similar documents is post-solution activity; together these amount to insignificant extra-solution activity. The post-solution activity of outputting a list of similar documents is an element that is not integrated into the claim as a whole. MPEP 2106.05(g) provides the following example, which is similar to the claimed process:

An example of pre-solution activity is a step of gathering data for use in a claimed process, e.g., a step of obtaining information about credit card transactions, which is recited as part of a claimed process of analyzing and manipulating the gathered information by a series of steps in order to detect whether the transactions were fraudulent. An example of post-solution activity is an element that is not integrated into the claim as a whole, e.g., a printer that is used to output a report of fraudulent transactions, which is recited in a claim to a computer programmed to analyze and manipulate information about credit card transactions in order to detect whether the transactions were fraudulent. (MPEP 2106.05(g), paragraph 1).

Also, text extraction, calculating similarities between documents and generating vectors are known methods as disclosed by Bao ("A fast document copy detection model." Soft Computing 10.1 (2006): 41-46.; cited on the attached 892 form). Bao discloses “Text similarity measure is a common issue in Information Retrieval, Text Mining, Web Mining, Text Classification/Clustering and Document Copy Detection etc. The most popular approach is word frequency based scheme, which uses a word frequency vector to represent a document. Cosine function, dot product and proportion function are regular similarity measures of vector.” (Bao, Abstract). Therefore, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the claims recite additional elements that equate to well-understood, routine and conventional activities, insignificant extra-solution activity or mere instructions to implement the abstract idea on a generic computer. As discussed in the 101 rejection section above, the additional elements equate to mere data gathering and outputting via generic computer components, such as receiving data at a computer or outputting data, which amounts to insignificant extra-solution activity as set forth by the courts in Mayo, 566 U.S. at 79, 101 USPQ2d at 1968 and OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1092-93 (Fed. Cir. 2015). Also, the use of a computer or other machinery in its ordinary capacity for economic or other tasks (e.g., to receive, store, or transmit data) or simply adding a general purpose computer or computer components after the fact to an abstract idea (e.g., a fundamental economic practice or mathematical equation) does not integrate a judicial exception into a practical application or provide significantly more as identified by the courts in Affinity Labs v. DirecTV, 838 F.3d 1253, 1262, 120 USPQ2d 1201, 1207 (Fed. Cir. 2016) (cellular telephone); TLI Communications LLC v. AV Auto. LLC, 823 F.3d 607, 613, 118 USPQ2d 1744, 1748 (Fed. Cir. 2016) (computer server and telephone unit). Overall, the additional elements do not comprise an inventive concept when considered individually or as an ordered combination that transforms the claimed judicial exception into a patent-eligible application of the judicial exception.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-2, 6-9, 11-12 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Hull ("Latent semantic structure indexing (LaSSI) for defining chemical similarity." Journal of medicinal chemistry 44.8 (2001): 1177-1184.; cited on the attached “Notice of References Cited” form 892), in view of Snedden ("Improving Search Ranking Using a Composite Scoring Approach." (2017).; cited on the attached “Notice of References Cited” form 892) and Lowe ("Chemical name to structure: OPSIN, an open source solution." (2011): 739-753.; cited on the attached “Notice of References Cited” form 892). 

Regarding independent claims 1 and 8-9, 
Hull teaches the non-transitory computer-readable recording medium having stored therein a program in claim 1; a computer in claim 8; and a memory and a processor coupled to the memory in claim 9 with “There are two distinct phases of processing: (1) constructing a LaSSI version of a chemical database and (2) calculating the similarity of the molecules of the LaSSI database to the probe molecule(s). The first phase is computationally expensive. However, it only needs to be performed once to create the database. The second phase, on the other hand, can be accomplished very quickly - a search of an average-sized database (~10^5 molecules) can be performed in under 1 min on a modest computer workstation.” (page 1180, col. 2, para. 2). The computer workstation as taught by Hull would include a non-transitory computer-readable recording medium, a memory and a processor.
Hull teaches the claim limitation of dividing the individual chemical structure into one or more substructures, counting a number of each of substructures in the input document, and counting a number of respective substructures in each of the plurality of documents, generating a substructure vector for the input document based on each substructure and the number of substructures, and a substructure vector for each of the plurality of documents based on each substructure and the number of substructures, the vector representing meaning of respective documents based on meaning of compounds with “One very practical approach to describing molecules is the vector space model popularized by Willett. This method involves representing a molecule as a set of 2D or 3D substructures and their frequencies. Sometimes only the presence or absence of a substructure is noted, as in molecular fingerprint methods.” (page 1177, col. 2, para. 2) and “Atom pairs (APs) are substructures… All of the APs and/or TTs in a molecule are counted to form a frequency vector.” (page 1179, col. 2, para. 3). The recited “substructure vector” corresponds to the frequency vector as taught by Hull.

	Hull does not explicitly teach calculating similarity between the input document and the plurality of documents based on comparison between the substructure vector for the input document and the substructure vector for each of the plurality of documents and outputting, as a retrieval result, a document with highest similarity to the input document from the plurality of documents. However, this limitation is taught by Snedden.
Hull also does not teach specifying each compound name included in the input document and the plurality of documents, transforming each compound name to an individual chemical structure. However, this limitation is taught by Lowe.

Snedden teaches the claim limitation of calculating similarity between the input document and the plurality of documents based on comparison between the substructure vector for the input document and the substructure vector for each of the plurality of documents and outputting, as a retrieval result, a document with highest similarity to the input document from the plurality of documents with “Document 1 can be vector x represented here as x (3, 0, 1) while Document 2 can be represented as y (2, 1, 1). These two vectors are combined computing an inner-dot product. Vx × Vy = 3×2 + 0×1 + 1×1 = 7. Now the absolute value of the vectors is determined: |x| = √(3² + 0² + 1²) = 3.16, |y| = √(2² + 1² + 1²) = 2.45. These two absolute values are multiplied and divided by the inner-dot product shown here: 7 / (3.16 ∗ 2.45) = 7 / 7.742 = .90. The closer the number is to one, the more similar the two documents are while the closer they are to zero means they are exactly orthogonal to each other with no similarity at all. This allows for a ranking of documents by a numerical score based on how similar they are to each other or to a search query represented as a term frequency vector.” (Page 12 to page 13, para. 1).
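For illustration only, the arithmetic in the Snedden passage can be reproduced with a short Python sketch. The function names and the second candidate document are hypothetical; only the vectors x = (3, 0, 1) and y = (2, 1, 1) come from the quoted passage. As the quoted figures show, the score is the inner product divided by the product of the two vector magnitudes.

```python
import math

def cosine_similarity(x, y):
    """Inner product of x and y divided by the product of their magnitudes."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_x * norm_y)

def most_similar(query, docs):
    """Return the id of the document whose vector scores highest against the query."""
    return max(docs, key=lambda doc_id: cosine_similarity(query, docs[doc_id]))

x = (3, 0, 1)  # Document 1 in the quoted passage
y = (2, 1, 1)  # Document 2 in the quoted passage
score = cosine_similarity(x, y)
print(round(score, 2))  # 0.9

# Hypothetical collection: doc_b shares no terms with x, so its score is 0.
docs = {"doc_a": y, "doc_b": (0, 5, 0)}
best = most_similar(x, docs)
print(best)  # doc_a
```

Without intermediate rounding the denominator is √60 ≈ 7.746 rather than 7.742, and the score still rounds to 0.90.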

Lowe teaches the claim limitation of specifying each compound name included in the input document and the plurality of documents, transforming each compound name to an individual chemical structure with Fig. 1. Fig. 1 depicts the components of OPSIN’s workflow, showing the process from chemical name through to a structure.

It would have been prima facie obvious to combine the teachings of Hull, Snedden and Lowe to arrive at the claimed invention.  A person of ordinary skill in the art would have been motivated to modify the method of Hull to include comparing documents for similarity using substructure vectors and outputting the document with the highest similarity as taught by Snedden to better compare and detect similar documents. A person of ordinary skill in the art would have also been motivated to modify the method of Hull to include converting chemical names to chemical structures as taught by Lowe because Lowe’s method converts a chemical name to a structure quickly and accurately (Lowe, page 752, col. 1, para. 1). Furthermore, there would have been a reasonable expectation of success, since Hull and Snedden use frequency vectors to compare documents and Hull and Lowe teach methods that pertain to identifying chemicals in documents.  

Regarding claim 2, Hull teaches the claim limitation of wherein the generating of the substructure vector for the input document and the substructure vector for each of the plurality of documents includes generating the vector including information indicating the number of each of the substructures or information indicating whether the number of each of the substructures is zero as a component of the vector with “One very practical approach to describing molecules is the vector space model popularized by Willett. This method involves representing a molecule as a set of 2D or 3D substructures and their frequencies. Sometimes only the presence or absence of a substructure is noted, as in molecular fingerprint methods.” (page 1177, col. 2, para. 2) and “Atom pairs (APs) are substructures… All of the APs and/or TTs in a molecule are counted to form a frequency vector.” (page 1179, col. 2, para. 3). The recited “substructure vector” corresponds to the frequency vector as taught by Hull.
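
The two alternatives recited in claim 2 (a count per substructure, or an indication of whether the count is zero) can be illustrated with a minimal sketch. The substructure labels, the vocabulary, and both function names are hypothetical and are not Hull's atom-pair or topological-torsion encoding.

```python
from collections import Counter

def frequency_vector(substructures, vocabulary):
    """One component per vocabulary entry: how many times it occurs."""
    counts = Counter(substructures)
    return [counts[s] for s in vocabulary]

def presence_vector(substructures, vocabulary):
    """One component per vocabulary entry: 1 if it occurs at all, else 0."""
    present = set(substructures)
    return [1 if s in present else 0 for s in vocabulary]

# Hypothetical substructure labels, for illustration only.
vocabulary = ["C-C", "C-O", "C=O", "C-N"]
molecule = ["C-C", "C-C", "C-O", "C=O"]

print(frequency_vector(molecule, vocabulary))  # [2, 1, 1, 0]
print(presence_vector(molecule, vocabulary))   # [1, 1, 1, 0]
```

The second form corresponds to the presence-or-absence fingerprint mentioned in the quoted Hull passage; the first keeps the frequencies.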

Regarding claim 3, Hull teaches the claim limitation of wherein the counting includes further counting number of each combination of the substructures included in the input document, and the generating includes generating the substructure vector for the input document based on both the number of each of the substructures and the number of each combination of the substructures that are totalized by totalization processing with “A collection of molecules in a chemical database is initially represented as a set of vectors, where each vector v_i = (d_1i, d_2i, ..., d_ni)^T consists of the nonnegative frequency of occurrence of each descriptor j in molecule i and where n is the total number of uniquely occurring descriptors in the entire set of molecules.” (page 1178, col. 1, para. 4).

Regarding claim 4, Hull teaches the claim limitation of wherein the counting includes calculating a sum of products between the number of each of the substructures included in each of the compounds with “A collection of molecules in a chemical database is initially represented as a set of vectors, where each vector v_i = (d_1i, d_2i, ..., d_ni)^T consists of the nonnegative frequency of occurrence of each descriptor j in molecule i and where n is the total number of uniquely occurring descriptors in the entire set of molecules.” (page 1178, col. 1, para. 4).
Hull does not teach the claim limitation of number of each of compound names indicating the compounds included in the input document, as the number of each of the substructures included in the input document in claim 4. However, this limitation is taught by Lowe. 
Lowe teaches the claim limitation of number of each of compound names indicating the compounds included in the input document, as the number of each of the substructures included in the input document in claim 4 with Table 2 (pages 742-743). Table 2 teaches the frequency of token character. The recited compound name corresponds to the token character as taught by Lowe. 
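
One possible reading of the "sum of products" language of claim 4, as quoted in this action, is that each substructure's count for a document equals the sum, over compounds, of the number of times the compound name appears multiplied by the number of times the substructure occurs in that compound. The sketch below follows that reading only; the function name, compound names, and substructure counts are hypothetical and are not drawn from the application, Hull, or Lowe.

```python
from collections import Counter

def document_substructure_counts(name_counts, substructures_by_compound):
    """Sum over compounds of (name mentions) x (substructure count per compound)."""
    totals = Counter()
    for compound, n_mentions in name_counts.items():
        per_compound = substructures_by_compound.get(compound, {})
        for substructure, n_sub in per_compound.items():
            totals[substructure] += n_mentions * n_sub
    return totals

# Hypothetical data: how often each compound name appears in one document,
# and how often each substructure occurs in each compound.
name_counts = {"ethanol": 3, "acetic acid": 1}
substructures_by_compound = {
    "ethanol": {"C-C": 1, "C-O": 1},
    "acetic acid": {"C-C": 1, "C-O": 1, "C=O": 1},
}

totals = document_substructure_counts(name_counts, substructures_by_compound)
print(dict(totals))  # {'C-C': 4, 'C-O': 4, 'C=O': 1}
```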

Hull does not explicitly teach wherein the calculating includes calculating the similarity between the input document and the plurality of documents based on comparison between a vector obtained by weighting the substructure vector for the input document generated in the generating on a basis of appearance frequency of each substructure and the substructure vector for each of the plurality of documents of claim 5; wherein the outputting includes calculating similarity between the input document and the plurality of documents based on comparison between the substructure vector for the input document and the substructure vectors for each of the plurality of documents and semantic comparison between the input document and the plurality of documents of claim 6 and wherein the outputting a list of documents included in the plurality of documents in a descending order of the calculated similarity of claim 7. However, these limitations are taught by Snedden.

Regarding claim 5, Snedden teaches the claim limitation of wherein the calculating includes calculating the similarity between the input document and the plurality of documents based on comparison between a vector obtained by weighting the substructure vector for the input document generated in the generating on a basis of appearance frequency of each substructure and the substructure vector for each of the plurality of documents with “Document 1 can be vector x represented here as x (3, 0, 1) while Document 2 can be represented as y (2, 1, 1). These two vectors are combined computing an inner-dot product. Vx × Vy = 3×2 + 0×1 + 1×1 = 7. Now the absolute value of the vectors is determined: |x| = √(3² + 0² + 1²) = 3.16, |y| = √(2² + 1² + 1²) = 2.45. These two absolute values are multiplied and divided by the inner-dot product shown here: 7 / (3.16 × 2.45) = 7 / 7.742 = .90. The closer the number is to one, the more similar the two documents are while the closer they are to zero means they are exactly orthogonal to each other with no similarity at all. This allows for a ranking of documents by a numerical score based on how similar they are to each other or to a search query represented as a term frequency vector.” (Page 12 to page 13, para. 1). The recited “weighting of the substructure vectors” corresponds to the absolute value of the vectors as taught by Snedden.

Regarding claim 6, Snedden teaches the claim limitation of wherein the outputting includes calculating similarity between the input document and the plurality of documents based on comparison between the substructure vector for the input document and the substructure vectors for each of the plurality of documents with “Document 1 can be vector x represented here as x (3, 0, 1) while Document 2 can be represented as y (2, 1, 1). These two vectors are combined computing an inner-dot product. Vx × Vy = 3×2 + 0×1 + 1×1 = 7. Now the absolute value of the vectors is determined: |x| = √(3² + 0² + 1²) = 3.16, |y| = √(2² + 1² + 1²) = 2.45. These two absolute values are multiplied and divided by the inner-dot product shown here: 7 / (3.16 × 2.45) = 7 / 7.742 = .90. The closer the number is to one, the more similar the two documents are while the closer they are to zero means they are exactly orthogonal to each other with no similarity at all. This allows for a ranking of documents by a numerical score based on how similar they are to each other or to a search query represented as a term frequency vector.” (Page 12 to page 13, para. 1). 
Snedden teaches the claim limitation of semantic comparison between the input document and the plurality of documents with “The final user selectable feature involved with applying user semantic meaning to ideas is named “phrase slop”. This effects how the Composite Scorer scores during the SpanQuery phase. It is not used in the TF/IDF scoring. This determines the maximum number of terms allowed between matching terms during scoring. The Composite Scoring Method uses a phrase query api call that allows the scorer to use the “phrase slop” value. The idea behind its use is that the closer terms are to each other from a multiterm phrase, the more likely the relationship to one another semantically. A large amount of phrase slop would result in higher retrieval but lower accuracy while a smaller might result in a missed concept or idea. This is influenced in the written structure of the documents as discussed in section 2.5 Structure.” (page 46, para. 1).

Regarding claim 7, Hull teaches that the outputting includes displaying, on a display screen, with “There are two distinct phases of processing: (1) constructing a LaSSI version of a chemical database and (2) calculating the similarity of the molecules of the LaSSI database to the probe molecule(s). The first phase is computationally expensive. However, it only needs to be performed once to create the database. The second phase, on the other hand, can be accomplished very quickly - a search of an average-sized database (~10^5 molecules) can be performed in under 1 min on a modest computer workstation.” (page 1180, col. 2, para. 2). The computer workstation as taught by Hull would include a display screen for displaying the outputs. 
Snedden teaches the claim limitation of wherein the outputting includes displaying, on a display screen, a list of documents included in the plurality of documents in a descending order of the calculated similarity with “The closer the number is to one, the more similar the two documents are while the closer they are to zero means they are exactly orthogonal to each other with no similarity at all. This allows for a ranking of documents by a numerical score based on how similar they are to each other or to a search query represented as a term frequency vector.” (Bottom Page 12 to page 13, para. 1).
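
The ranking step described in the quoted passage (ordering documents by a numerical similarity score) can be sketched as follows. The function name, document identifiers, and score values are hypothetical; the scores are assumed to have been computed already by some similarity measure.

```python
def rank_by_similarity(scores):
    """Return (document, score) pairs in descending order of score.

    Python's sort is stable, so documents with equal scores keep
    their original insertion order.
    """
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical similarity scores of three stored documents to an input document.
scores = {"doc_a": 0.42, "doc_b": 0.90, "doc_c": 0.10}

ranking = rank_by_similarity(scores)
print(ranking)        # [('doc_b', 0.9), ('doc_a', 0.42), ('doc_c', 0.1)]
print(ranking[0][0])  # doc_b, the document with the highest similarity
```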

It would have been prima facie obvious to combine the teachings of Hull, Snedden and Lowe to arrive at the claimed invention. A person of ordinary skill in the art would have been motivated to modify the method of Hull to include comparing and calculating document similarity as taught by Snedden to better compare and detect similar documents. A person of ordinary skill in the art would have also been motivated to modify the method of Hull to include outputting the document with the highest similarity as taught by Snedden for the advantage of being able to quickly identify the most relevant document. A person of ordinary skill in the art would have also been motivated to modify the method of Hull to include determining the number of compound names in a document as taught by Lowe for the purpose of identifying relevant documents. Furthermore, there would have been a reasonable expectation of success, since Hull and Snedden use frequency vectors to compare documents and Hull and Lowe teach methods that pertain to identifying chemicals in documents.

Response to 35 USC § 102 Arguments (Remarks filed 12/29/2025, pages 12-13) 
Applicant amended claims 1-9. 
It is noted that Applicant’s remarks are based on amended claims.
	Applicant explains that the cited art, Willett, is directed to comparing descriptors of a "target structure" against a database. Applicant states that Willett does not teach the specific claimed mechanism of using vectors based on respective substructures in the chemical structure of a compound to retrieve a document including the compound. Applicant also states that Willett fails to disclose or suggest all of the limitations recited in the amended independent claims because Willett lacks the specific transformation of names to meaning-based substructure vectors.

	In response, Applicant's arguments have been fully considered and are persuasive. Therefore, the rejections have been withdrawn.  However, upon further consideration, a new ground of rejection is made in view of the claim amendments as discussed above.

Conclusion
No claims are allowed.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KETTIP KRIANGCHAIVECH whose telephone number is (571)272-1735. The examiner can normally be reached 8:30am-5:00pm EDT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Larry D. Riggs can be reached at (571) 270-3062. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/K.K./Examiner, Art Unit 1686    

/LARRY D RIGGS II/Supervisory Patent Examiner, Art Unit 1686
Prosecution Timeline

Mar 28, 2022
Application Filed
May 31, 2023
Response after Non-Final Action
Oct 01, 2025
Non-Final Rejection mailed — §101, §103
Dec 29, 2025
Response Filed
Feb 24, 2026
Final Rejection mailed — §101, §103
May 15, 2026
Interview Requested
Precedent Cases

Applications granted by this same examiner with similar technology

16/988,965
Patent 12597484
TRAIT PREDICTION COORDINATION FOR GENOMIC APPLICATION ENVIRONMENT
5y 8m to grant Granted Apr 07, 2026
18/513,357
Patent 12584844
FLOW CYTOMETRY IMMUNOPROFILING OF PERIPHERAL BLOOD
2y 4m to grant Granted Mar 24, 2026
16/631,405
Patent 12512185
DNA-BASED DATA STORAGE AND RETRIEVAL
5y 11m to grant Granted Dec 30, 2025
16/347,104
Patent 12415981
AUTOMATED COLLECTION OF A SPECIFIED NUMBER OF CELLS
6y 4m to grant Granted Sep 16, 2025
16/237,959
Patent 12364989
HIGH THROUGHPUT METHOD AND SYSTEM FOR ANALYZING THE EFFECTS OF AGENTS ON PLANARIA
6y 6m to grant Granted Jul 22, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

3-4
Expected OA Rounds
21%
Grant Probability
54%
With Interview (+32.8%)
4y 8m (~6m remaining)
Median Time to Grant
Moderate
PTA Risk
Based on 48 resolved cases by this examiner. Grant probability derived from career allowance rate.