DETAILED ACTION
1. This action is responsive to Application No. 18/751,047, filed 6/21/2024. All claims have been examined and are currently pending.
Notice of Pre-AIA or AIA Status
2. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
3. Claims 4-5 and 9-10 are objected to because of the following informalities: the claims recite a finite state machine (FSA), whereas the specification discloses a finite state automaton. Further, the abbreviation used in the claims, FSA, also corresponds to finite state automaton. Appropriate correction is required for consistency.
Claim 18 recites "the method of claim 11," where it appears it should read "the method of claim 13." Appropriate correction is required.
Claim Rejections - 35 USC § 102
4. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
5. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
6. Claims 13-18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Scholak et al (2022/0358125).
Regarding claim 13 Scholak teaches A method for decoding tokens provided by a language model (0011 decoding…of the language model), comprising:
receiving a sequence of tokens from a language model (0011: natural language query; 40; 44: may define a number of tokens…configured to output n potential next tokens);
performing a constrained search for a subsequent token (0011: constrains the output of the language model; at each decoding step of the language model, the model generates a predicted next token; 39; 44);
integrating the subsequent token and the sequence of tokens into two or more new sequences of tokens (41: identifies one or more valid potential translations; selects the highest scoring); and
selecting one of the two or more new sequences of tokens based on a probability score
(0011: the DSL parser may also score and rank the set of partial potential translations at each auto-regressive decoding step, at the conclusion of the decoding process, or any combination thereof, based on confidence values generated by the language model for the tokens of the partial potential translation, based on the analysis of the partial potential translation by the DSL parser, or any combination thereof. As such, by incrementally parsing at each decoding step, the DSL parser enables the NLQ-to-DSLQ translation system to “fail early” with respect to invalid and low-scoring translations as they are being generated, which reduces overall computational resource usage and enables the expended computational resources to be focused on generating and validating the most promising potential translations;
41: identifies one or more valid potential translations; selects the highest scoring.).
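For illustration only, the constrained-decoding-with-early-failure scheme described above (score candidate continuations, prune those the parser rejects, keep the highest-scoring sequences) can be sketched as follows. All names are hypothetical and are not drawn from the cited references; the language model is stood in for by a scoring callback.

```python
# Illustrative sketch of constrained beam search: at each decoding step,
# a validity check (standing in for the DSL parser) prunes candidate
# continuations so invalid translations "fail early."

def constrained_beam_search(score_next, is_valid_prefix, beam_width, max_len):
    """score_next(seq) -> {token: log-score}; stand-in for a language model.
    is_valid_prefix(seq) -> bool; stand-in for the incremental parser."""
    beams = [((), 0.0)]  # (token sequence, cumulative log-score)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, logp in score_next(seq).items():
                new_seq = seq + (tok,)
                if is_valid_prefix(new_seq):  # parser check: fail early
                    candidates.append((new_seq, score + logp))
        if not candidates:
            break
        # keep only the highest-scoring valid sequences
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return max(beams, key=lambda b: b[1])[0]
```

In this sketch, selecting "one of the two or more new sequences of tokens based on a probability score" corresponds to the final `max` over the surviving beams.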
Regarding claim 14 Scholak teaches The method of claim 13, wherein a probability score is determined for each of the new sequences of tokens (11: score; 41).
Regarding claim 15 Scholak teaches The method of claim 13, wherein a plurality of token sequences is received from the language model and a plurality of subsequent tokens are integrated with each of the plurality of token sequences (0011 generates predicted next token; 44: n potential next tokens).
Regarding claim 16 Scholak teaches The method of claim 15, wherein a probability score is determined for each new token sequence generated from the plurality of token sequences and the plurality of tokens (11: score; 41).
Regarding claim 17 Scholak teaches The method of claim 16, wherein selecting one of the plurality of new sequences includes selecting a subset of the plurality of new token sequences having the highest probability score (11; 41: identifies one or more valid potential translations; selects the highest scoring).
Regarding claim 18 Scholak teaches The method of claim 11, further comprising:
determining that the constrained search results in no allowable tokens (11; 40: invalid or low-scoring); and
changing a token selection from a previous iteration of decoding (52: may be reduced in rank or entirely ejected from the beam in subsequent decoding steps of the language model).
Claim Rejections - 35 USC § 103
7. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
8. Claims 1, 6, and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Scholak in view of Hildebrandt et al (2010/0312755) in further view of Ren et al (2022/0269431).
Regarding claim 1 Scholak et al (2022/0358125) teaches A method for performing constrained decoding by a language model (0011: language model and…parser that constrains the output of the language model; decoding step of the language model), comprising:
accessing a {context free} grammar {(CFG)} (11: grammar);
{converting the CFG to a byte-level CFG;}
{constructing} a {byte-level} representation of a tokenizer vocabulary of the language model (11: at each decoding step of the language model, the model generates a predicted next token; 44 define a number of tokens in the vocabulary); and
parsing a {byte} sequence by a parsing mechanism to determine if the {byte} sequence corresponds to an allowed string prefix {according to the byte-level CFG} (11: At each decoding step of the language model, the model generates a predicted next token for each of a set of partial potential translations of the NLQ. The DSL parser evaluates each of the partial potential translations generated by the model at each decoding step based on a set of stored DSL rules, which define valid terminology, syntax, grammar, and/or other constraints of the DSL.);
but does not specifically teach
accessing a context free grammar (CFG);
converting the CFG to a byte-level CFG;
constructing a byte-level representation of a tokenizer vocabulary of the language model; and
parsing a byte sequence by a parsing mechanism to determine if the byte sequence corresponds to an allowed string prefix according to the byte-level CFG.
In a similar field of endeavor, Hildebrandt et al (2010/0312755) teaches a context free grammar and a representation of the compressed grammar
([0037] First of all, there is a description of the compression of data through the production of a context-free grammar according to an embodiment of the invention.
[0041] The context-free grammar to be produced for data to be compressed can additionally be obtained by means of so-called context compression. In context compression, a multiplicity of (basic) rules K.sub.1 to K.sub.n is either predetermined or used from a previously created grammar, which can then be referenced to produce a new, context-free grammar from the data currently to be compressed. Therefore, the rules of context grammar K.sub.1 to K.sub.n can be used both to create new rules and also in start rule S.sub.0.
[0042] After compression has been carried out by means of the context-free grammar, for further improvement of this first compression, a code is then used to store the grammar, wherein frequent symbols are assigned shorter code words than infrequent symbols. For this purpose, it is possible, for example, to use a Huffman code).
It would have been obvious to one of ordinary skill in the art before the effective filing date to incorporate Hildebrandt's grammar compression, using reduced components to minimize storage requirements and enable more efficient processing, for improved decoding by the language model.
Scholak teaches the benefits of reducing data:
[0042] In addition to constraining the output of the language model 304 into the DSL, it may be appreciated that the NLQ-to-DSLQ translation system 302 also offers advantages in terms of selection of the language model 304. For example, in one embodiment, a NLQ-to-DSLQ translation system 302 having a smaller language model 304 (e.g., T5-base model) in combination with the DSL parser 306 performed better at NLQ-to-DSL translation than a comparable translation system having a larger language model 304 (e.g., T5-large) without the DSL parser 306. As such, by including and applying the DSL parser 306, the disclosed NLQ-to-DSLQ translation system 302 can enable enhanced translation performance using smaller language models, which consume fewer computing resources during operation.
Thus, one could look to Hildebrandt for reduction, with the benefits discussed below:
[0003] The compression of digital data by electronic means, i.e. in an electronic system for information processing or data transfer, is used above all to economize on storage space and transmission capacity. Especially in cases where large volumes of digital data are transferred over data networks, compression is important not only for the efficient use of existing transmission capacities, for example of available bandwidth, but also in order to speed up the data transfer process. Yet also in relation to the storage of large volumes of digital data of the order of gigabytes or even terabytes, such as in databases, efficient compression is frequently necessary in order to reduce the amount of storage space that would be required for the uncompressed digital data, thereby making it possible to economize on technical resources.
Scholak and Hildebrandt do not specifically teach converting to a byte-level representation, where Ren teaches ([0078] The compression is a byte-level data reduction technology. A concept of the compression is to use an encoding technology to represent longer data in a shorter encoded format to reduce a data size.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to incorporate Ren to further compress the data and to represent longer data in a shorter encoded format to reduce a data size (Ren 78).
Ren 0003: Deduplication and compression are key technologies in the storage industry. A storage device performs deduplication and compression, so that an amount of actually stored data can be reduced, storage space occupied by the data in the storage device can be reduced, and storage efficiency of the storage device can be improved.
Thus, the combination of Scholak, Hildebrandt, and Ren would teach:
accessing a context free grammar (CFG);
converting the CFG to a byte-level CFG;
constructing a byte-level representation of a tokenizer vocabulary of the language model; and
parsing a byte sequence by a parsing mechanism to determine if the byte sequence corresponds to an allowed string prefix according to the byte-level CFG.
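For illustration only, the byte-level limitations above (constructing a byte-level representation of a tokenizer vocabulary and testing whether a byte sequence is an allowed string prefix) can be sketched as follows. This sketch is hypothetical and greatly simplified: the "grammar" is reduced to a finite set of allowed strings rather than a full byte-level CFG, and all names are illustrative.

```python
# Illustrative sketch: map each vocabulary token to its UTF-8 bytes,
# then admit only tokens whose bytes extend a valid string prefix.
# The finite allowed-string set here stands in for a byte-level CFG.

def byte_vocab(tokenizer_vocab):
    """Byte-level representation of a tokenizer vocabulary."""
    return {tok: tok.encode("utf-8") for tok in tokenizer_vocab}

def is_allowed_prefix(byte_seq, allowed_strings):
    """True if byte_seq is a prefix of some allowed string's encoding."""
    return any(s.encode("utf-8").startswith(byte_seq) for s in allowed_strings)

def allowed_next_tokens(prefix_bytes, vocab_bytes, allowed_strings):
    """Tokens whose byte encoding keeps the sequence an allowed prefix."""
    return [tok for tok, b in vocab_bytes.items()
            if is_allowed_prefix(prefix_bytes + b, allowed_strings)]
```

Working at the byte level lets the prefix test apply uniformly even when tokens split multi-byte characters, which is one motivation for a byte-level rather than character-level representation.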
Regarding claim 6 Scholak, Hildebrandt, and Ren teach A non-transitory computer readable storage medium having embodied thereon a program, the program being executable by a processor (Scholak figures 1-4) to perform constrained decoding by a language model, the method comprising:
accessing a context free grammar (CFG);
converting the CFG to a byte-level CFG;
constructing a byte-level representation of a tokenizer vocabulary of the language model; and
parsing a byte sequence by a parsing mechanism to determine if the byte sequence corresponds to an allowed string prefix according to the byte-level CFG.
Claim 6 recites limitations similar to claim 1 and is rejected for similar rationale and reasoning.
Regarding claim 11 Scholak, Hildebrandt, and Ren teach A system for performing constrained decoding by a language model (Scholak fig 3-4; 0037: server hosts), comprising:
one or more servers, wherein each server includes a memory and a processor (Scholak fig 3,4; para 33; 37); and
one or more modules stored in the memory and executed by at least one of the one or more processors (fig 3,4; para 33; 37) to
access a context free grammar (CFG),
convert the CFG to a byte-level grammar,
construct a byte-level representation of a tokenizer vocabulary of the language model, and
parse a byte-level representation of a tokenizer vocabulary of the language model.
Claim 11 recites limitations similar to claim 1 and is rejected for similar rationale and reasoning.
9. Claims 2 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Scholak in view of Hildebrandt et al (2010/0312755) in further view of Ren et al (2022/0269431) in further view of Buchholz (2007/0016398).
Regarding claim 2 Scholak, Hildebrandt, and Ren do not specifically teach where Buchholz (2007/0016398) teaches The method of claim 1, wherein parsing includes parsing each incrementally generated string during left-to-right decoding of the language model ([0066] As discussed above the parsing method of the present invention determines the heads and grammatical roles of tokens strictly from left to right, i.e. in the first step, it determines which role the first token takes and which other token is the first token's head, in the second step it determines the same for the second token, and so on until the last token.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to incorporate Buchholz with Scholak to allow for proper parsing, presenting a reasonable expectation of success. Scholak already teaches a parser for parsing, and one could look to Buchholz to ensure the parsing is done in a manner consistent with the specific linguistic requirements.
Claim 7 recites limitations similar to claim 2 and is rejected for similar rationale and reasoning.
10. Claims 3-5, 8-10, 12 are rejected under 35 U.S.C. 103 as being unpatentable over Scholak in view of Hildebrandt et al (2010/0312755) in further view of Ren et al (2022/0269431) in further view of Levit et al (2015/0325235).
Regarding claim 3 Scholak, Hildebrandt, and Ren teach
The method of claim 1, wherein parsing includes performing {lattice} parsing {on a lattice} representing a plurality of tokens to determine a set of byte sequences that are accepted by the byte-level CFG (rejected for similar rationale and reasoning as claim 1);
But do not specifically teach where Levit teaches
wherein parsing includes performing lattice parsing on a lattice ([0028] Storage 106 may also store information about parsed representations of corpora (i.e., parses). In some embodiments, corpora parses are stored as a lattice structure, as described in connection to parsing component 124. Information about the parses may include tokens created from words, entities, or phrases of a corpus; statistics associated with the tokens; and tags, which may identify the token type. In some embodiments, tokens are tagged by parsing component 124 to represent a type of sequences of words,; [0037] In an embodiment, parsing component 124 determines a “lattice” data structure of nonlinear sequences of corpus elements. The lattice data structure is a directed graph providing a compact representation of a number of alternative parses. Each path through the lattice produces a different parse of the corpus, and each path is associated with a probability.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to incorporate lattice-based parsing for improved parsing (of Scholak), allowing the system to better determine the potential results, while presenting a reasonable expectation of success in allowing the parsing to still be completed.
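For illustration only, the lattice concept cited from Levit (a directed graph of alternative parses where each path carries a probability) can be sketched as follows. The structure and all names are hypothetical; the enumeration is brute-force purely for brevity.

```python
# Illustrative lattice sketch: each decoding slot holds alternative
# (token, probability) options; each path through the slots is one
# candidate parse. An accept predicate stands in for the parser, and
# the highest-probability accepted path is returned.
from itertools import product

def best_accepted_path(lattice, accepts):
    """lattice: list of slots, each a list of (token, prob) options."""
    best, best_p = None, 0.0
    for combo in product(*lattice):
        tokens = [tok for tok, _ in combo]
        p = 1.0
        for _, q in combo:
            p *= q  # path probability is the product along the path
        if accepts(tokens) and p > best_p:
            best, best_p = tokens, p
    return best
```

A production system would score and prune paths incrementally over the shared graph rather than enumerating them, which is what makes the lattice a compact representation of many alternative parses.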
Regarding claim 4 Scholak, Hildebrandt, and Ren teach
The method of claim 1, further comprising {minimizing a finite state machine (FSA) that} represents a set of possible byte sequences corresponding to the language model's vocabulary (rejected for similar rationale and reasoning as claim 1)
But do not specifically teach where Levit teaches further comprising minimizing a finite state machine (FSA) ([0031] Entity definitions may also comprise implicitly defined instances of entity-types. In particular, for certain entity-types, it is not efficient to explicitly enumerate all possible instances of the entity-type. For example, while all (or most) actors could be explicitly included in a definition for the actor entity-type, it is not efficient to enumerate all possible phone numbers, temporal information, such as dates and times, or other combinatorial entity-types. Therefore, in some embodiments, these entities may be implicitly defined by combinatorial models that can provide the entity definition. For example, a finite state machine (FSM) or similar model may be used. ).
It would have been obvious to one of ordinary skill in the art before the effective filing date to incorporate the finite state machine for improved and more efficient decoding (for determining possible, potential results), while presenting a reasonable expectation of success, thus teaching minimizing a finite state machine (FSA) that represents a set of possible byte sequences corresponding to the language model's vocabulary.
Regarding claim 5 Scholak, Hildebrandt, and Ren teach potential byte sequences that can be parsed,
But do not specifically teach, where Levit teaches minimized FSA and simultaneous parsing using lattice parsing (28; 31; 37).
It would have been obvious to one of ordinary skill in the art before the effective filing date to incorporate the finite state machine and lattice for improved and more efficient decoding (for determining possible, potential results), while presenting a reasonable expectation of success, teaching
wherein the minimized FSA includes a plurality of potential byte sequences that can be parsed simultaneously using lattice parsing.
Scholak, Hildebrandt, and Ren already teach byte representations, and parsing byte sequences to determine (linguistically) allowed outputs. Incorporating Levit and a finite state machine and lattice parsing would allow for parsing using the provided structures, which are linguistic tools to process all possible alternatives and paths to present a collection of best or most relevant outputs.
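For illustration only, the shrinking of an acceptor over a set of byte sequences (e.g., a tokenizer vocabulary) can be sketched as follows. This is suffix-sharing (merging states with identical suffix languages) rather than full Hopcroft minimization, but it shows why the automaton becomes smaller; all names are hypothetical.

```python
# Illustrative sketch: build a trie over byte sequences, then merge
# states whose outgoing transitions (and hence suffix languages) are
# identical, yielding a smaller acceptor for the same set.

def build_trie(byte_seqs):
    root = {}
    for seq in byte_seqs:
        node = root
        for b in seq:  # iterating bytes yields ints 0-255
            node = node.setdefault(b, {})
        node[-1] = {}  # -1 marks an accepting state
    return root

def merge_equivalent(node, registry):
    """Return a canonical shared node per distinct suffix language."""
    canon = {b: merge_equivalent(child, registry) for b, child in node.items()}
    key = tuple(sorted((b, id(c)) for b, c in canon.items()))
    if key not in registry:
        registry[key] = canon
    return registry[key]

def count_states(node, seen=None):
    if seen is None:
        seen = set()
    if id(node) not in seen:
        seen.add(id(node))
        for child in node.values():
            count_states(child, seen)
    return len(seen)
```

Shared suffixes such as "-at" and "-art" collapse into single states, so the merged automaton accepts exactly the same byte sequences with far fewer states.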
Claims 8 and 12 recite limitations similar to claim 3 and are rejected for similar rationale and reasoning.
Claim 9 recites limitations similar to claim 4 and is rejected for similar rationale and reasoning.
Claim 10 recites limitations similar to claim 5 and is rejected for similar rationale and reasoning.
Conclusion
11. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: See PTO-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAUN A ROBERTS whose telephone number is (571)270-7541. The examiner can normally be reached Monday-Friday 9-5 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders can be reached on 571-272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SHAUN ROBERTS/Primary Examiner, Art Unit 2655