Prosecution Insights
Last updated: April 19, 2026
Application No. 18/749,483

TOKENIZING PROGRAMMING CODE WITH CANONICAL REPRESENTATIONS

Non-Final OA §102
Filed: Jun 20, 2024
Examiner: AGUSTIN, PETER VINCENT
Art Unit: 2688
Tech Center: 2600 (Communications)
Assignee: Aurora Labs Ltd.
OA Round: 1 (Non-Final)
Grant Probability: 84% (Favorable)
OA Rounds: 1-2
To Grant: 1y 11m
With Interview: 95%

Examiner Intelligence

Grants 84%, above average
Career Allow Rate: 84% (725 granted / 864 resolved), +21.9% vs TC avg
Interview Lift: +11.5% (resolved cases with interview); moderate +12% lift
Avg Prosecution: 1y 11m (fast prosecutor); 5 currently pending
Total Applications: 869 across all art units (career history)

Statute-Specific Performance

§101: 3.5% (-36.5% vs TC avg)
§103: 31.7% (-8.3% vs TC avg)
§102: 40.6% (+0.6% vs TC avg)
§112: 13.0% (-27.0% vs TC avg)
The Tech Center average is an estimate. Based on career data from 864 resolved cases.
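Each statute row pairs the examiner's figure with a difference from the Tech Center average. As a rough sketch, and assuming that difference is a plain percentage-point gap (the page does not say so), the baseline behind each row can be backed out by subtraction. The names `figures` and `implied_baseline` are hypothetical.

```python
from decimal import Decimal

# Figures as shown on the page: (examiner rate in %, delta vs TC avg).
# Assumption: each delta is a percentage-point difference, not a relative change.
figures = {
    "§101": (Decimal("3.5"), Decimal("-36.5")),
    "§103": (Decimal("31.7"), Decimal("-8.3")),
    "§102": (Decimal("40.6"), Decimal("0.6")),
    "§112": (Decimal("13.0"), Decimal("-27.0")),
}


def implied_baseline(rate, delta):
    """Back out the Tech Center average implied by a rate and its delta."""
    return rate - delta


baselines = {statute: implied_baseline(rate, delta)
             for statute, (rate, delta) in figures.items()}

for statute, baseline in baselines.items():
    print(f"{statute}: implied TC avg = {baseline}%")
```

Under that assumption all four rows imply the same baseline, which fits the page's note that the Tech Center average is an estimate rather than a per-statute measurement.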

Office Action

§102
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-10, 12-21, 23 & 24 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Ben-Artzi et al. (US 2009/0313613).

In regard to claim 1, Ben-Artzi et al. discloses a non-transitory computer-readable medium (see paragraph 0010) including instructions that, when executed by at least one processor, cause the at least one processor to perform operations for creating and using tokens representing portions of programming code (see abstract), the operations comprising:

identifying a body of programming code (paragraph 0022: “source programming language code”);

associating a plurality of tokens with respective portions of the body of programming code, wherein the associating comprises determining at least one canonical representation of at least one of the respective portions of the body of programming code (paragraph 0024: “Tokenizer 106 transforms streams of characters from the source programming language code into a list of tokens.”; see also paragraph 0033);

configuring model input data for a code language processing model, wherein the model input data comprises the plurality of tokens including the at least one canonical representation (paragraph 0035: “In an embodiment of the invention, list of tokens 406 comprises columns of token list 408a and token type list 408b. Token list 408a comprises the tokens generated from input stream 402 and the token type list 408b comprises the description for the type of tokens. Tokens in list of tokens 406 are categorized block of text. Referring to list of tokens 406, the token `Sum` in tokens 408a is defined by tokenizer 106 as an `identifier` in type 408b. Similarly, the complete programming code of the source programming language can be processed to form a list of tokens. Subsequently, list of tokens 406 is processed by parser 108 to generate structured information.”); and

analyzing at least a part of the body of programming code using the code language processing model influenced by the model input data (paragraph 0050: “the list of token is analyzed syntactically by parser 108 to generate a grammatical data structure, at step 804. In an embodiment of the invention, the grammatical data structure is a hierarchical data structure and is referred to as an Abstract Syntax Tree (AST). Thereafter, at step 806, the AST is processed by generator 110 to generate a document object model. Document object model is a simplified grammatical data structure in a hierarchical data structure format. Subsequently, the document object model is processed by analyzer 112 to generate a target list of tokens. The target list of tokens is thereafter processed by analyzer 112 to generate the target programming language code, at step 808”).

In regard to claim 2, Ben-Artzi et al. discloses that determining the at least one canonical representation comprises determining the at least one canonical representation from among a plurality of canonical representations, each of the canonical representations representing multiple programming code elements (see paragraph 0035).

In regard to claim 3, Ben-Artzi et al. discloses that the multiple programming code elements are associated with different programming languages (see paragraph 0023).

In regard to claim 4, Ben-Artzi et al. discloses that the multiple programming code elements are associated with different bodies of programming code (see paragraph 0035).

In regard to claim 5, Ben-Artzi et al. discloses that associations between the multiple programming code elements and the canonical representations are determined using the code language processing model (see paragraph 0035).

In regard to claim 6, Ben-Artzi et al. discloses that the associations between the multiple programming code elements and the canonical representations are determined by applying the code language processing model to the different bodies of programming code (see paragraph 0035).

In regard to claim 7, Ben-Artzi et al. discloses that the at least one canonical representation represents different code elements with a same functionality (see paragraph 0046).

In regard to claim 8, Ben-Artzi et al. discloses that the at least one canonical representation represents different code elements with functionalities within a similarity threshold range (see paragraphs 0037-0038).

In regard to claim 9, Ben-Artzi et al. discloses that the operations further comprise identifying a portion of the body of programming code for token designation (see paragraph 0036).

In regard to claim 10, Ben-Artzi et al. discloses that the operations further comprise: determining functionality of the identified portion; and based on the functionality, designating a new token for association with the identified portion (suggested in paragraph 0036).

Claims 12-21 have similar limitations as claims 1-10 and are therefore rejected on the same grounds.

In regard to claim 23, Ben-Artzi et al. discloses a non-transitory computer-readable medium (see paragraph 0010) including instructions that, when executed by at least one processor, cause the at least one processor to perform operations for creating and using tokens representing portions of programming code (see abstract), the operations comprising:

identifying a body of programming code (paragraph 0022: “source programming language code”);

associating a plurality of tokens with respective portions of the body of programming code to generate a token-based representation of the body of programming code, wherein the associating comprises determining at least one canonical representation of at least one of the respective portions of the body of programming code (paragraph 0024: “Tokenizer 106 transforms streams of characters from the source programming language code into a list of tokens.”; see also paragraph 0033);

providing the token-based representation of the body of programming code to an emulator, the emulator being configured to interpret token-based representations (see paragraphs 0025 & 0026); and

receiving, from the emulator, an emulation result (see paragraphs 0025 & 0026).

In regard to claim 24, Ben-Artzi et al. discloses that the emulator is not configured to interpret assembly language (see paragraphs 0004 & 0005).

Allowable Subject Matter

Claims 11 & 22 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion

The prior art made of record and not relied upon (see attached PTO-892 form) is considered pertinent to applicant's disclosure.
Gosling (US 5,367,685) discloses a compiler comprising a lexical analyzer and parser, an intermediate representation builder, a semantic analyzer, and a code generator, wherein these elements are sequentially coupled to each other, and together, they transform a program source code into tokenized statements, intermediate representations, annotated intermediate representations, and ultimately intermediate form code with data references made on a symbolic basis.

Nackman et al. (US 6,182,281) discloses a computer-implemented method for compiling a C++ source code program in an enhanced compiler effecting lexical analysis to tokenize the source code program, parsing and semantic analysis to produce an intermediate representation of the source code program, comprising the steps of: parsing the tokenized source code program in any order with respect to declarations in the program through multiple parsing passes, each pass accumulating information to parse the declarations in the source code program for which all identifiers are unknown, from program definitions, wherein the multiple parsing passes comprise an initial pass that parses only type declarations, a second pass that parses types of functions and variables, and a third pass that parses variable initializers and function bodies.

Ota (US 7,657,878) discloses a compile apparatus for generating object code from the application program comprising a lexical analyzer configured to divide an operation described in a source code of the application program into tokens, a syntax analyzer configured to analyze whether or not the tokens conform to grammatical rules.

Kraft (US 2013/0212563) discloses a symbol database including a tokenized representation of a program code which is a higher-level representation where the characters of the program code text have been converted into lexemes (also known as tokens), according to the grammar of the programming language at hand.

Olson et al. (US 2021/0056211) discloses obtaining source code from a client codebase, wherein the client codebase is a complete or an incomplete body of the source code for a given software program or an application; and using a machine learning (ML) model to perform a ML based analysis on an abstract syntax tree (AST) for detecting a first security vulnerability over a static source code.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Peter Vincent Agustin whose telephone number is (571) 272-7567. The examiner can normally be reached on Monday - Thursday 8:30 am - 6:30 pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Steven Lim, can be reached on 571-270-1210. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Peter Vincent Agustin/
Primary Examiner, Art Unit 2688
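The passage the examiner quotes for claim 1 (paragraph 0035 of Ben-Artzi et al.) describes a tokenizer that turns an input character stream into a list of tokens paired with a token type, with `Sum` categorized as an `identifier`. The following is a minimal sketch of that general idea only, not the reference's or the applicant's implementation. The token categories other than `identifier`, the regular expressions, and the sample input are all assumptions made for illustration.

```python
import re

# Hypothetical token categories; the quoted passage names only `identifier`.
TOKEN_SPEC = [
    ("number", r"\d+(?:\.\d+)?"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("operator", r"[+\-*/=]"),
    ("punctuation", r"[();,]"),
    ("whitespace", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Turn a character stream into a list of (token, token_type) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        # Whitespace separates tokens but is not recorded as one.
        if match.lastgroup != "whitespace":
            tokens.append((match.group(), match.lastgroup))
        pos = match.end()
    return tokens


# Made-up input line; prints one token and its type per row.
token_list = tokenize("Sum = a + 42;")
for token, token_type in token_list:
    print(f"{token:<6}{token_type}")
```

The two-column output mirrors the token list and token type list described in the quotation. Nothing in this sketch addresses the claimed canonical representations, which is the limitation the rejection maps to the same paragraphs.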

Prosecution Timeline

Jun 20, 2024: Application Filed
Dec 29, 2025: Non-Final Rejection, §102 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603106: DISK DEVICE (2y 5m to grant; granted Apr 14, 2026)
Patent 12597440: MAGNETIC DISK DEVICE (2y 5m to grant; granted Apr 07, 2026)
Patent 12586603: DISK DEVICE (2y 5m to grant; granted Mar 24, 2026)
Patent 12579998: METHOD FOR STORING AND ACQUIRING INFORMATION USING FLUORESCENCE DEFECTS IN WIDE BANDGAP MATERIALS (2y 5m to grant; granted Mar 17, 2026)
Patent 12562181: SIGNAL PROCESSING DEVICE, SIGNAL PROCESSING METHOD, AND NON-TRANSITORY COMPUTER READABLE RECORDING MEDIUM STORING SIGNAL PROCESSING PROGRAM (2y 5m to grant; granted Feb 24, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 84%
With Interview: 95% (+11.5%)
Median Time to Grant: 1y 11m
PTA Risk: Low
Based on 864 resolved cases by this examiner. Grant probability derived from career allow rate.
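Since the grant probability is said to be derived from the career allow rate, the headline figures can be reproduced from the counts shown earlier (725 granted out of 864 resolved). The sketch below assumes the interview lift is added to the allow rate in percentage points, which the page does not state explicitly. The variable names are hypothetical.

```python
granted = 725
resolved = 864
interview_lift_points = 11.5  # shown as "+11.5%"; assumed to be percentage points

# Career allow rate in percent.
allow_rate = 100 * granted / resolved

# Assumption: the with-interview figure is the allow rate plus the lift.
with_interview = allow_rate + interview_lift_points

print(f"Grant probability: {allow_rate:.0f}%")
print(f"With interview:    {with_interview:.0f}%")
```

Under that assumption the rounded results match the 84% and 95% shown above. The unrounded allow rate is about 83.9%, which is why adding 11.5 rounds to 95% rather than 96%.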
