Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This action is in response to the claim listing filed on 02/13/2026.
Claims 1, 3-10, 12-19 are pending.
Response to Arguments
This is in response to the Remarks filed on 02/13/2026. The amendment necessitated the new grounds of rejection presented in this Action. Therefore, Applicant's submissions in the Remarks are moot in view of the prior art added in this Action.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 3-4, 7-10, 12-13, and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Taviss et al., “Asm2Seq: Explainable Assembly Code Functional Summary Generation for Reverse Engineering and Vulnerability Analysis”, May 18, 2023, 25 pages (Applicant’s submitted prior art #2 in IDS, receipt date 11/01/2024; hereinafter: Taviss), in view of Taviss, “Asm2Seq: Explainable Assembly Code Functional Summary Generation”, 2021, Master’s Thesis, Queen’s University, Canada, 103 pages (hereinafter: Taviss Thesis), and further in view of Kusupati et al., “Natural Language to Code Using Transformers”, 2022, arXiv, 7 pages (Applicant’s submitted prior art #7 in IDS, receipt date 09/19/2023; hereinafter: Kusupati).
As per Claim 1: Taviss discloses the limitations in bold below:
1. (Currently Amended) A method for creating a model to add a code summary to functions of assembly language code, the method comprising:
tokenizing an assembly code dataset, wherein the assembly code dataset comprises [the functions of assembly language code and comment pairings];
(See p. 2, within the second bullet of the Introduction sec., ‘We created the first labelled datasets for binary code summarization….. The creation of the assembly-description pairs will be very beneficial for future work on developing an AI model that can truly understand the semantics of assembly code. The complete datasets are available at….’. See p. 3, within sec. 2.2, ‘A more simplified approach models code summarization as a machine translation problem, where the input is represented through a sequence of tokens, and the output is the natural language equivalent’.)
Thus, in the creation of the dataset (the input as a sequence of tokens), Taviss mentions that a complete dataset of assembly-description pairs will be very beneficial, but does not explicitly mention:
wherein the assembly code dataset comprises [the functions of assembly language code and comment pairings].
Taviss Thesis discloses “the assembly code dataset comprises the functions of assembly language code and comment pairings” (See Taviss Thesis, p. 5, lines 1-6: “Specially, we compile the Juliet Test Suite and the NDSS18 vulnerable source code dataset and link them with their corresponding vulnerability descriptions. In total, we generated 97,492 unique pairs. The creation of the assembly-description pairs will be very beneficial to future work on developing an AI model that can truly understand the semantics of assembly code”. On p. 27, within sec. 3.1.2, Code Summarization and Generation, second para.: “The comments included throughout source code are often a good means for producing an accurate summary of a codes purpose. In this regard, text summarization techniques are useful as comments are generally written in natural language. Due to potentially a multitude of reasons, software developers and programmers do not always include commenting in their code”.
P. 56, last four lines: “The assembly instructions associated with one source code file are combined into a string of tokens. This string of tokens is related to the description extracted from the source code file used to create these assembly instructions.”)
Thus, Taviss Thesis suggests that comments on the assembly code/subroutines/functions in a program would be helpful for summarization, and this suggestion shows that “assembly-description pair” is another term for assembly language code and comment pairings.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the application to combine the tokenized assembly-description pairs of Taviss with the code and comment pairings in the tokenized dataset of Taviss Thesis, as a necessity for model understanding and because it would be helpful for code summarization.
The claim continues to recite, where Taviss further discloses the limitations in bold below:
inputting the tokenized assembly code dataset to a pre-trained transformer-based model,
the pre-trained transformer-based model [having an architecture comprising self-attention layers];
(See Taviss above, “created the first labelled datasets”, and p. 20, sec. 10: ‘We focus on qualitatively evaluating the summary generated from the previously trained model from Juliet and NDSS dataset on both in-sample and out-of-sample CWE categories.’)
using an encoder to create fixed length embeddings;
(See Taviss, p. 5, within sec 3.1 Encoder, ‘An encoder is a neural network that processes the variable-length input sequences to a fixed-length vector’)
and using a decoder on the fixed length embeddings to generate the code summary.
(See Taviss, p. 6, within sec 3.2 Decoder ‘A decoder maps the fixed-length vectors back into variable-length sequences’; and see Fig. 1, Asm2Seq).
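For illustration of the encoder/decoder mechanism cited from Taviss secs. 3.1-3.2 (a variable-length token sequence compressed to a fixed-length vector, then expanded back into a variable-length sequence), a minimal sketch follows. The averaging encoder and nearest-neighbor decoder below are hypothetical simplifications for illustration only, not the actual Asm2Seq model:

```python
# Sketch: encoder maps any-length token sequence to a fixed-length
# vector; decoder maps the fixed-length vector back to tokens.
# Pseudo-embeddings via a toy string hash are illustrative only.

def hash_tok(tok):
    h = 0
    for ch in tok:
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    return h

def encode(tokens, dim=4):
    """Average per-token pseudo-embeddings into one fixed-length vector."""
    vec = [0.0] * dim
    for tok in tokens:
        for i in range(dim):
            vec[i] += ((hash_tok(tok) >> (8 * i)) & 0xFF) / 255.0
    return [v / max(len(tokens), 1) for v in vec]

def decode(vector, vocab, length=3):
    """Toy decoder: emit `length` tokens whose encodings lie closest
    to the fixed-length vector."""
    out = []
    for _ in range(length):
        best = min(vocab, key=lambda w: sum((a - b) ** 2
                                            for a, b in zip(encode([w]), vector)))
        out.append(best)
    return out

# The embedding length stays fixed no matter the input length.
emb = encode(["mov", "eax", "1", "ret"])
```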
Taviss in combination with Taviss Thesis does not explicitly disclose the limitation in italics below:
the pre-trained transformer-based model [having an architecture comprising self-attention layers];
Kusupati discloses “the pre-trained transformer-based model having an architecture comprising self-attention layers” (See Figure 1 on p. 2; the transformer model includes pre-training per p. 5, sec. 3.4: “FINETUNE: We first pre-train the transformer using the mined data and then finetune the model using annotated data alone.”
On p. 2, Figure 1 shows layers of “Multi-Head Attention”, and sec. 2.1 (left col.) states, “A transformer is similar to many sequence to sequence models in the sense that it contains an encoder and decoder to compress the sentence to an encoding and further generate each token conditioned on the previous tokens.”, and (right col.) “In addition to this generic setup of self-attention and fully connected network, the decoder layer contains another attention in between them, which computes attention specific to each output word over the input encodings from the output of the encoder stack.”)
The self-attention layers in the pre-trained transformer of Kusupati help tokens interact with one another, resolving ambiguity during training.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the application to combine the teaching of tokenized assembly-description pairs in Taviss with the code and comment pairings in the tokenized dataset of Taviss Thesis, and further with the teaching of self-attention layers carrying positional information over long token sequences in Kusupati.
The combination would yield predictable results because self-attention layers in the transformer model would help code summarization by binding information, resolving ambiguity, and preserving information from being lost, thus improving the performance of code summarization.
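For illustration of the self-attention computation described in Kusupati sec. 2.1, a minimal sketch follows. The identity query/key/value projections and the toy two-dimensional embeddings are simplifying assumptions, not Kusupati's actual model:

```python
# Sketch of scaled dot-product self-attention: each token attends to
# every token in the sequence, letting context resolve ambiguity.
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(X):
    """out[i] = sum_j softmax_j( X[i].X[j] / sqrt(d) ) * X[j],
    using identity Q/K/V projections as a simplification."""
    d = len(X[0])
    out = []
    for q in X:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in X]
        weights = softmax(scores)  # attention over all positions, sums to 1
        out.append([sum(w * v[j] for w, v in zip(weights, X)) for j in range(d)])
    return out

# three hypothetical token embeddings of dimension 2
Y = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Each output row is a convex combination of the input rows, i.e., every token's new representation mixes in information from the whole sequence.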
As per Claim 3: Taviss in view of Taviss Thesis, and further in view of Kusupati, where Taviss discloses the limitations in bold below:
3. (Previously presented) The method of claim 1, wherein the dataset is created by:
retrieving source code [with comment pairings]; compiling the source code to create a binary output; (Taviss: p. 2, Fig. 1, and in p. 4, item 4)
disassembling the binary output to assembly language code; and
correlating functions within the assembly language code and the source code to associate the comment pairings with the assembly language code.
(Taviss: p. 4, sec. 2.3, using disassembly tool IDA Pro, etc., and “A similar study using RNNs predicts function types from disassembled binary code functions..” )
Taviss does not mention “comment pairings” (See the rationale addressed in claim 1.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the application to combine the tokenized assembly-description pairs of Taviss, further in view of Kusupati, with the code and comment pairings in the tokenized dataset of Taviss Thesis, as a necessity for code summarization.
As per Claim 4: Taviss in view of Taviss Thesis, and further in view of Kusupati, where Taviss further discloses:
4. (Previously presented) The method of claim 1, further comprising training the pre-trained transformer-based model with a subset of the assembly code dataset and testing the model using a further subset of the assembly code dataset.
(Taviss, in p. 14: sec. 8.2, Experiments, “The validation set is the sample of data used to evaluate the model on the training dataset and is frequently used to fine-tune the hyperparameters. The testing set is the sample of data used to evaluate the final model after training is complete.”: The term fine-tune is the training step after pre-training. And see p.9, within sec. 5.1, ‘Rather than encoding a whole input sequence into a single fixed-length vector, attention models encode input sequences into a series of vectors and choose a subset of these vectors as the model decodes and predicts an output.’)
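For illustration of the train/validation/test partitioning Taviss describes in sec. 8.2, a minimal sketch follows. The 80/10/10 split ratio is an illustrative assumption, not a figure from Taviss:

```python
# Sketch: split an assembly-description pair dataset into a training
# subset, a validation subset (for hyperparameter fine-tuning), and
# a held-out test subset used to evaluate the final model.

def split_dataset(pairs, train=0.8, val=0.1):
    n = len(pairs)
    n_train = int(n * train)
    n_val = int(n * val)
    return (pairs[:n_train],
            pairs[n_train:n_train + n_val],
            pairs[n_train + n_val:])

# hypothetical assembly/description pairs
data = [(f"asm_{i}", f"desc_{i}") for i in range(100)]
train_set, val_set, test_set = split_dataset(data)
```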
As per Claim 7: Taviss in view of Taviss Thesis, and further in view of Kusupati, where Taviss further discloses:
7. The method of claim 1, wherein the fixed length embeddings are further created using padding and truncation.
(Taviss, p. 13, within sec. 8.1.2, ‘Sequences are truncated or padded with zeroes until they have the appropriate length to ensure all samples are of the same length.’)
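The cited mechanism from Taviss sec. 8.1.2 (sequences truncated or padded with zeroes to a common length) can be sketched minimally as follows; the token ids and the fixed length of 6 are illustrative assumptions:

```python
# Sketch: bring every token-id sequence to the same fixed length by
# truncating long sequences and padding short ones with zeroes.

def pad_or_truncate(seq, length, pad=0):
    if len(seq) >= length:
        return seq[:length]          # truncate to the fixed length
    return seq + [pad] * (length - len(seq))  # pad with zeroes

batch = [pad_or_truncate(s, 6) for s in [[5, 9, 2], [1, 2, 3, 4, 5, 6, 7, 8]]]
```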
As per Claim 8: Taviss in view of Taviss Thesis, and further in view of Kusupati, where Taviss further discloses:
8. The method of claim 7, wherein the fixed length is optimized for accuracy and model training time.
(Taviss, p. 10, entire sec. 6: Optimization)
As per Claim 9: Taviss in view of Taviss Thesis, and further in view of Kusupati, where Taviss further discloses:
9. The method of claim 1, wherein each of the fixed length embeddings is a contextual vector representation of an input token.
(Taviss, p. 5-6, sec. 3.1, Encoder, ‘An encoder is a neural network that processes the variable-length input sequences to a fixed-length vector ’; p. 6, sec. 3.2, Decoder ‘A decoder maps the fixed-length vectors back into variable-length sequences’)
As per Claims 10, 12-13, 16-18:
Claims 10, 12-13, and 16-18 recite a device, where the claims recite limitations that perform the method of claims 1, 3-4, and 7-9. These claims are rejected with the same rationales as addressed in the rejection of method claims 1, 3-4, and 7-9.
As per Claim 19:
Claim 19 recites a non-transitory computer readable medium, where the claim recites limitations that perform the method of claim 1. The claim is rejected with the same rationales as addressed in the rejection of method claim 1.
Claims 5-6 and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Taviss, “Asm2Seq: Explainable Assembly Code Functional Summary Generation for Reverse Engineering and Vulnerability Analysis”, May 18, 2023, 25 pages (Applicant’s submitted prior art #2 in IDS, receipt date 11/01/2024; hereinafter: Taviss), in view of Taviss, “Asm2Seq: Explainable Assembly Code Functional Summary Generation”, 2021, Master’s Thesis, Queen’s University, Canada, 103 pages (hereinafter: Taviss Thesis), in view of Kusupati et al., “Natural Language to Code Using Transformers”, 2022, arXiv, 7 pages (Applicant’s submitted prior art #7 in IDS, receipt date 09/19/2023), and further in view of Feng et al., “CodeBERT: A Pre-Trained Model for Programming and Natural Languages”, 2020, 12 pages (Applicant’s submitted prior art #38 in IDS, receipt date 09/19/2023; hereinafter: Feng).
As per Claim 5: Regarding,
5. The method of claim 1, wherein the pre-trained transformer-based model is a CodeBERT model.
As per the above limitation, Taviss, in view of Taviss Thesis and Kusupati, does not explicitly mention that the model is “a CodeBERT model”.
Feng discloses that the pre-trained transformer-based model is a CodeBERT model (See Feng, Abstract; sec. 3, CodeBERT, on page 3). CodeBERT is an available pre-trained machine-learning model that operates bimodally on both programming language and natural language.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the application to include and utilize the pre-trained CodeBERT of Feng as the pre-trained model of Taviss, in view of Taviss Thesis and Kusupati, conforming to the availability of the model.
As per Claim 6: Regarding,
6. The method of claim 1, wherein the tokenizing is performed by a WordPiece tokenizer.
As per the above limitation, Taviss shows assembly code being tokenized into words and then mapped into numbers (p. 12, Fig. 8), but
Taviss, in view of Taviss Thesis and Kusupati, does not explicitly mention that the tokenizing is performed by “a WordPiece tokenizer”.
Feng discloses that tokenizing is performed by “a WordPiece tokenizer” (See page 3, within sec. 3.2, ‘Following the standard way of processing text in Transformer, we regard a natural language text as a sequence of words, and split it as WordPiece (Wu et al., 2016). We regard a piece of code as a sequence of tokens.’) Tokenizing a dataset with WordPiece is standardized for languages that use Latin characters and is used bimodally.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the application to perform the tokenizing of Taviss, in view of Taviss Thesis and Kusupati, with the WordPiece tokenizer of Feng, conforming to the standard and its availability.
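For illustration of the WordPiece scheme Feng cites (Wu et al., 2016), a minimal sketch of the greedy longest-match-first segmentation follows; the tiny subword vocabulary and the “##” continuation convention shown are illustrative assumptions:

```python
# Sketch: WordPiece-style segmentation of a word against a subword
# vocabulary. At each position, the longest matching piece is taken;
# "##" marks word-internal pieces; unmatched words map to [UNK].

VOCAB = {"token", "##iz", "##er", "[UNK]"}  # hypothetical tiny vocabulary

def wordpiece(word, vocab=VOCAB):
    pieces, start = [], 0
    while start < len(word):
        end, cur = len(word), None
        while start < end:
            sub = word[start:end]
            if start > 0:
                sub = "##" + sub  # word-internal continuation marker
            if sub in vocab:
                cur = sub         # longest match found at this position
                break
            end -= 1
        if cur is None:
            return ["[UNK]"]      # no piece matches: unknown word
        pieces.append(cur)
        start = end
    return pieces
```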
As per Claims 14-15:
Claims 14-15 recite a device, where the claims recite limitations that perform the method of claims 5-6. These claims are rejected with the same rationales as addressed in the rejection of method claims 5-6.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ted T Vo whose telephone number is (571)272-3706. The examiner can normally be reached 8am-4:30pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Wei Y Mui can be reached on (571) 272-3708. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
TTV
March 21, 2026
/Ted T. Vo/
Primary Examiner, Art Unit 2191