Prosecution Insights
Last updated: April 19, 2026
Application No. 18/750,828

GENERATING LARGE-LANGUAGE-MODEL COMPATIBLE SEQUENTIAL ATTACHMENT-BASED FRAGMENT EMBEDDING MOLECULAR REPRESENTATIONS

Non-Final OA §101 §102 §103
Filed
Jun 21, 2024
Examiner
VILLENA, MARK
Art Unit
2658
Tech Center
2600 — Communications
Assignee
Recursion Pharmaceuticals Inc.
OA Round
1 (Non-Final)
70%
Grant Probability
Favorable
1-2
OA Rounds
3y 10m
To Grant
85%
With Interview

Examiner Intelligence

Grants 70% — above average
70%
Career Allow Rate
334 granted / 478 resolved
+7.9% vs TC avg
Strong +16% interview lift
+15.5%
Interview Lift
resolved cases with interview
Typical timeline
3y 10m
Avg Prosecution
22 currently pending
Career history
500
Total Applications
across all art units

Statute-Specific Performance

§101: 13.7% (-26.3% vs TC avg)
§103: 51.5% (+11.5% vs TC avg)
§102: 20.4% (-19.6% vs TC avg)
§112: 5.0% (-35.0% vs TC avg)
TC avg = Tech Center average estimate. Based on career data from 478 resolved cases.

Office Action

§101 §102 §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statement (IDS) submitted on 07/25/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Drawings

The drawings were submitted on 06/21/2024. These drawings are reviewed and accepted by the examiner.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim(s) recite(s) steps for generating a molecular string representation. This judicial exception is not integrated into a practical application because no additional elements are recited outside of the method steps.

Step 2A, Prong One: The claims recite mathematical concepts. For example, in claim 1: “identifying a molecular string representation”, “generating a set of fragments”, “generating a sequential attachment-based fragment embedding”, “concatenating fragments”, and “generating ring link characters” (see also claims 2-20). Limitations such as generating an embedding molecular string representation and concatenating fragments recite mathematical relationships and calculations. The claims recite mental processes, namely, observations/evaluations and decisions that could be performed conceptually in the human mind, including identifying connections between atom representations, generating fragments, and generating ring link characters.
Step 2A, Prong Two: The claims are “computer-implemented” and include steps like “identifying a molecular string representation”, “generating a set of fragments”, “concatenating fragments,” and “generating ring link characters.” These are generic computer functions involving data gathering, processing (mathematical), decision-making, and output. Merely applying an abstract idea on a generic computer or using conventional speech recognition does not integrate the exception into a practical application. See Alice Corp. v. CLS Bank Int’l, 573 U.S. 208 (2014); Credit Acceptance Corp. v. Westlake Servs., 859 F.3d 1044 (Fed. Cir. 2017). The claims do not recite an improvement to the functioning of the computer or to another technology/technical field. There is no recitation of a specific, technological improvement.

Step 2B: Beyond the abstract ideas, the claims recite generic computer implementation: identifying a string representation, fragmenting data, generating an embedding, concatenating fragmented data, and generating characters. The specification, as reflected by the claim language, does not require any unconventional hardware or a particular machine.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claim(s) 1, 3-7, 12, and 14-19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Cheng et al. (“Group SELFIES: a robust fragment-based molecular string representation”, 2023).
Regarding claims 1, 12, and 16, Cheng teaches:

“identifying a molecular string representation comprising ring structure identifiers that indicate virtual connections between atom representations of a molecular compound” (pg. 750, 3.1 SELFIES framework; ‘The encoder takes in a molecule and converts it to a SELFIES string, and the decoder takes in a SELFIES string and converts it to a molecule.’);

“generating a set of fragments from the molecular string representation” (pg. 752, 3.5 Determining fragments; ‘Fragments can also be obtained from several fragment libraries found in the literature.38–40 Generally, a useful set of groups will appear in many molecules in the dataset and replace many atoms, with similar fragments merged together to reduce redundancy.’); and

“generating a sequential attachment-based fragment embedding (SAFE) molecular string representation that represents the molecular string representation as an order agnostic sequence of interconnected fragment blocks” (pg. 751, Fig. 2; ‘Bottom: celecoxib represented in Group SELFIES. Tokens are colored by the groups and atoms they refer to. Index overloads are shown where interpreted. Colored arrows indicate how the decoder navigates around the attachment points of the groups.’) [Image: media_image1.png] by:

“concatenating fragments from the set of fragments utilizing a separation character between the fragments to generate a linked fragment string” (pg. 750, 3.3 Groups; ‘We call this dictionary a “group set”, and every group set defines its own distinct instance of Group SELFIES. In particular, the decoder will not recognize a Group SELFIES string that contains group tokens not present in the current group set.’); and

“generating ring link characters in the linked fragment string to represent attachment points for fragment links” (pg. 751, top left; ‘To distinguish group tokens from other tokens, we include a : character at the front of the token (e.g. [:1parabenzene]). All group tokens are of the form [:S<group-name>], where S is the starting attachment index of the group, and <group-name> is any alphanumeric string that does not contain dashes or start with a number.’).

Regarding claims 3 (dep. on claim 1) and 14 (dep. on claim 12), Cheng further teaches: “generating the linked fragment string by ordering the fragments from the set of fragments based on fragment size” (pg. 750, 3.2 Basic tokens in group SELFIES; ‘The next X tokens immediately following [RingX] will be interpreted as a number N, and we will count backwards N atoms in placement order to determine the target of the ring bond.’).

Regarding claims 4 (dep. on claim 1), 15 (dep. on claim 12), and 17 (dep. on claim 16), Cheng further teaches: “generating the SAFE molecular string representation by: extracting attachment point indicators from the molecular string representation; and utilizing the attachment point indicators to generate the linked fragment string” (pg. 751, Fig. 2; ‘Bottom: celecoxib represented in Group SELFIES. Tokens are colored by the groups and atoms they refer to. Index overloads are shown where interpreted. Colored arrows indicate how the decoder navigates around the attachment points of the groups.’).

Regarding claim 5 (dep. on claim 4), Cheng further teaches: “generating the SAFE molecular string representation by replacing the attachment point indicators in the linked fragment string with the ring link characters” (pg. 751, Fig. 2; ‘Bottom: celecoxib represented in Group SELFIES. Tokens are colored by the groups and atoms they refer to. Index overloads are shown where interpreted. Colored arrows indicate how the decoder navigates around the attachment points of the groups.’).

Regarding claims 6 (dep. on claim 1) and 18 (dep. on claim 16), Cheng further teaches: “generating an additional SAFE molecular string representation from the SAFE molecular string representation by reordering fragment blocks comprising the fragments and the ring link characters, wherein the additional SAFE molecular string representation represents the molecular string representation” (pg. 750, 3.1 SELFIES framework; ‘For instance, when decoding [C][O][=C], adding [=C] would exceed the valency of [O], so SELFIES changes the bond order and adds [C] instead.’; 3.2; ‘All tokens except [pop] can be modified by adding =, #, \ or / to change the bond order or stereochemistry of their parent bond (e.g. [#Branch] or [/C]).’).

Regarding claims 7 (dep. on claim 1) and 19 (dep. on claim 16), Cheng further teaches: “wherein the ring link characters comprise ring digits” (pg. 751, top left; ‘To distinguish group tokens from other tokens, we include a : character at the front of the token (e.g. [:1parabenzene]). All group tokens are of the form [:S<group-name>], where S is the starting attachment index of the group, and <group-name> is any alphanumeric string that does not contain dashes or start with a number.’).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 2 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Cheng in view of Arus-Pous et al. (“SMILES‑based deep generative scaffold decorator for de‑novo drug design,” 2020).

Regarding claims 2 (dep. on claim 1) and 13 (dep. on claim 12), Cheng does not expressly teach: “generating the set of fragments by utilizing a bond slicing algorithm with the molecular string representation.” Arus-Pous teaches: “generating the set of fragments by utilizing a bond slicing algorithm with the molecular string representation” (pg. 2, top, right col.; ‘The second experiment instead used a subset of drug-like molecules in ChEMBL, which was exhaustively sliced using the same algorithm but restricting the acyclic bonds to cut to those that complied with the synthetic chemistry-based RECAP [35] rules.’). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Cheng’s molecule grouping by incorporating Arus-Pous’s slicing in order to generate training sets that help generative models generalize for a wide range of scaffolds. (Arus-Pous: pg. 2, left col., bottom par.).

Claim(s) 8-11 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Cheng in view of Qian et al. (“Can Large Language Models Empower Molecular Property Prediction?”, 2023).

Regarding claims 8 (dep. on claim 1) and 20 (dep. on claim 16), Cheng does not expressly teach large language models, as in: “generating, utilizing a large language model from the SAFE molecular string representation, an additional SAFE molecular string representation representing an additional molecular compound.” Qian teaches: “generating, utilizing a large language model from the SAFE molecular string representation, an additional SAFE molecular string representation representing an additional molecular compound” (pg. 2, left col., top paragraph; ‘Then, we propose a novel molecular representation called Captions as new Representation (CaR), which leverages ChatGPT to generate informative and professional textual analyses for SMILES. Then the textual explanation can serving as new representation for molecules, as illustrated in Figure 1.’). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Cheng’s molecular string representation by incorporating Qian’s large language model in order to generate informative and professional textual analyses for SMILES. The combination provides textual descriptions which are meaningful for assisting in molecular-related tasks. (Qian: pg. 2, left col., top paragraph)

Regarding claim 9 (dep. on claim 1), the combination of Cheng in view of Qian further teaches: “generating, utilizing a large language model from the SAFE molecular string representation, a complete SAFE molecular compound sequence representation from a partial SAFE molecular compound sequence representation” (Qian: pg. 2, left col., top paragraph; ‘Then, we propose a novel molecular representation called Captions as new Representation (CaR), which leverages ChatGPT to generate informative and professional textual analyses for SMILES. Then the textual explanation can serving as new representation for molecules, as illustrated in Figure 1.’).

Regarding claim 10 (dep. on claim 1), the combination of Cheng in view of Qian further teaches: “generating, utilizing a large language model from the SAFE molecular string representation, a linking SAFE molecular string representation for two or more molecular compound sequences” (Cheng: pg. 750, 3.3 Groups; ‘Each group is defined as a set of atoms and bonds representing the molecular group with its attachment points, indicating how the group can participate in bonding.’; Qian: pg. 2, left col., top paragraph; ‘Then, we propose a novel molecular representation called Captions as new Representation (CaR), which leverages ChatGPT to generate informative and professional textual analyses for SMILES. Then the textual explanation can serving as new representation for molecules, as illustrated in Figure 1.’).

Regarding claim 11 (dep. on claim 1), the combination of Cheng in view of Qian further teaches: “generating, utilizing a large language model from the SAFE molecular string representation, a molecular compound sequence based on one or more target molecule compound constraints” (Cheng: pg. 750, 2.2 Genetic programming; ‘A string representation of molecules such as SELFIES can be thought of as a programming language where programs specify how to construct molecules. Genetic programming35 uses genetic algorithms to design programs that fulfill desired constraints.’; Qian: pg. 2, left col., top paragraph; ‘Then, we propose a novel molecular representation called Captions as new Representation (CaR), which leverages ChatGPT to generate informative and professional textual analyses for SMILES. Then the textual explanation can serving as new representation for molecules, as illustrated in Figure 1.’).

Conclusion

Other pertinent prior art are cited in the PTO-892 for the applicant's consideration.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARK VILLENA whose telephone number is (571)270-3191. The examiner can normally be reached 10 am - 6pm EST Monday through Friday.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil can be reached at (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MARK VILLENA/
Examiner, Art Unit 2658
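
For orientation, the claim steps quoted in the §102 rejection (concatenating fragments with a separation character, replacing attachment point indicators with ring link digits, ordering by fragment size, and reordering fragment blocks) can be pictured with a toy string manipulation. This is only a sketch under stated assumptions, not the applicant's implementation or Cheng's: the `[*:k]` indicator format, its placement directly after the bonded atom, the `.` separator, and the helper names `link_fragments` and `reorder_blocks` are all hypothetical, and the fragments are written by hand rather than produced by a slicing algorithm.

```python
import re

# Hypothetical attachment point indicator: "[*:k]" written directly after
# the atom that carries the open bond. Matching labels mark the two ends
# of one fragment link.
ATTACH = re.compile(r"\[\*:(\d+)\]")


def link_fragments(fragments, first_free_digit):
    """Join fragments with '.' and swap attachment indicators for ring digits.

    first_free_digit is the lowest ring digit not already used inside the
    fragments; this toy version assumes fewer than ten digits are needed.
    """
    # Order by size (longest string first), loosely mirroring claims 3 and 14.
    ordered = sorted(fragments, key=len, reverse=True)
    # Separation character between fragments gives the linked fragment string.
    linked = ".".join(ordered)

    digit_for = {}

    def to_ring_digit(match):
        label = match.group(1)
        if label not in digit_for:
            digit_for[label] = str(first_free_digit + len(digit_for))
        return digit_for[label]

    # Each indicator becomes a ring digit; both ends of a link share a digit.
    return ATTACH.sub(to_ring_digit, linked)


def reorder_blocks(linked):
    """Return the same fragment blocks in reverse order (claims 6 and 18)."""
    return ".".join(reversed(linked.split(".")))


# Hand-written fragments: a phenyl ring and a carboxyl group, one link each.
fragments = ["C[*:1](=O)O", "c1ccccc1[*:1]"]
linked = link_fragments(fragments, first_free_digit=2)
print(linked)
print(reorder_blocks(linked))
```

Under standard SMILES rules a ring-closure digit may pair across a `.`, so both printed strings should describe the same molecule regardless of block order, which is the "order agnostic" property the claim language refers to.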

Prosecution Timeline

Jun 21, 2024
Application Filed
Mar 06, 2026
Non-Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12591407
ROBUST VOICE ACTIVITY DETECTOR SYSTEM FOR USE WITH AN EARPHONE
2y 5m to grant Granted Mar 31, 2026
Patent 12592232
SYSTEMS, METHODS, AND APPARATUSES FOR DETECTING AI MASKING USING PERSISTENT RESPONSE TESTING IN AN ELECTRONIC ENVIRONMENT
2y 5m to grant Granted Mar 31, 2026
Patent 12586581
ELECTRONIC DEVICE CONTROL METHOD AND APPARATUS
2y 5m to grant Granted Mar 24, 2026
Patent 12578922
Natural Language Processing Platform For Automated Event Analysis, Translation, and Transcription Verification
2y 5m to grant Granted Mar 17, 2026
Patent 12573394
ESTIMATION METHOD, RECORDING MEDIUM, AND ESTIMATION DEVICE
2y 5m to grant Granted Mar 10, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2
Expected OA Rounds
70%
Grant Probability
85%
With Interview (+15.5%)
3y 10m
Median Time to Grant
Low
PTA Risk
Based on 478 resolved cases by this examiner. Grant probability derived from career allow rate.
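
The headline percentages appear to follow from simple arithmetic on the career counts shown earlier (334 granted out of 478 resolved). The sketch below reproduces them under one assumption that the page does not state: that the "With Interview" figure is the career allow rate plus the interview lift taken as percentage points. Variable names are illustrative.

```python
# Career counts as shown in the Examiner Intelligence section.
granted = 334
resolved = 478

# Grant probability is described as derived from the career allow rate.
allow_rate = granted / resolved

# Assumption: the +15.5% interview lift is additive in percentage points.
interview_lift = 0.155
with_interview = allow_rate + interview_lift

print(f"Career allow rate: {allow_rate:.1%}")
print(f"With interview:    {with_interview:.1%}")
```

Rounded to whole percentages, these land on the 70% and 85% shown in the projections.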
