Prosecution Insights
Last updated: April 19, 2026
Application No. 18/767,544

VOICE COMMAND DETECTION AND PREDICTION

Non-Final OA: §101, §103, Double Patenting

Filed: Jul 09, 2024
Examiner: SIRJANI, FARIBA
Art Unit: 2659
Tech Center: 2600 — Communications
Assignee: Comcast Cable Communications LLC
OA Round: 1 (Non-Final)
Grant Probability: 76% (Favorable)
OA Rounds: 1-2
To Grant: 2y 10m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 76% (414 granted / 547 resolved), +13.7% vs TC average (above average)
Interview Lift: +31.0% (strong), comparing resolved cases with an interview vs. without
Typical Timeline: 2y 10m average prosecution; 31 applications currently pending
Career History: 578 total applications across all art units

Statute-Specific Performance

§101: 14.1% (-25.9% vs TC avg)
§103: 49.1% (+9.1% vs TC avg)
§102: 14.7% (-25.3% vs TC avg)
§112: 10.7% (-29.3% vs TC avg)

Tech Center averages are estimates. Based on career data from 547 resolved cases.
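The headline figures above are simple ratios; a quick sanity check of the displayed 76% career allow rate (414 granted of 547 resolved) and the baseline implied by the "+13.7% vs TC avg" delta. The implied Tech Center average is derived here for illustration, not reported by the source.

```python
# Sanity-check of the headline examiner statistics shown above.
# Inputs come from the report; the implied TC-average baseline is
# derived from the published delta, not reported directly.

granted, resolved = 414, 547

allow_rate = 100 * granted / resolved
print(f"Career allow rate: {allow_rate:.1f}%")            # 75.7%, shown as 76%

implied_tc_avg = allow_rate - 13.7                        # from "+13.7% vs TC avg"
print(f"Implied TC 2600 average: {implied_tc_avg:.1f}%")  # ~62.0%
```

The displayed 76% is the rounded ratio, consistent with the "414 granted / 547 resolved" career data.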

Office Action

Rejections: §101, §103, Double Patenting
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

Claims 1-20 are pending. Claims 1, 8, and 15 are independent. This Application was published as U.S. 20250006185. Apparent priority: 27 February 2019. This Application is a continuation of 18/302644, issued as U.S. 12067975, which is a continuation of 16/287666, issued as U.S. 11657801. Terminal Disclaimers over the terms of both parents are required as set forth below.

Double Patenting

The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).

A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting, provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement.
See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA.

A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b). The USPTO Internet website contains terminal disclaimer forms which may be used; please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines which form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims are rejected on the ground of nonstatutory double patenting as being unpatentable over claims of U.S. Patents No. 11741317 and No. 12067975 as shown below. Although the claims at issue are not identical, they are not patentably distinct from each other because of the following mapping:

Instant Application (Claim 1) vs. Reference Patent U.S. 11657801 (Claim 1):

Instant: A method comprising: determining, based on historical user input data, one or more patterns associated with one or more voice commands;
Reference: A method comprising: receiving a signal comprising a voice input;

Instant: receiving a voice input indicating data associated with a first portion of a command;
Reference: detecting, in the voice input, data that is associated with a first portion of a command;

Instant: predicting, based on the first portion and while the voice input is being received, and based on the determined one or more patterns, a second portion of the command; and
Reference: predicting, based on the first portion and while the voice input is being received, and using a machine learning model trained based on historical user input data, a second portion of the command; and

Instant: causing execution of the command, based on the first portion and the predicted second portion, prior to an end of the voice input.
Reference: causing execution of the command, based on the first portion and the predicted second portion, prior to an end of the voice input.

The remaining Claims are either parallel to Claim 1 or depend from Claim 1 and are rejected under the combination of claim 1 of the reference and the 35 U.S.C. 103 references applied to each dependent Claim below.

Instant Application (Claim 1) vs. Reference Patent U.S. 12067975 (Claim 1):

Instant: A method comprising: determining, based on historical user input data, one or more patterns associated with one or more voice commands;
Reference: A method comprising: training, based on historical user input data, a machine learning model to determine patterns associated with voice commands;

Instant: receiving a voice input indicating data associated with a first portion of a command;
Reference: receiving a voice input indicating data associated with a first portion of a command;

Instant: predicting, based on the first portion and while the voice input is being received, and based on the determined one or more patterns, a second portion of the command; and
Reference: predicting, based on the first portion and while the voice input is being received, and based on the machine learning model, a second portion of the command; and

Instant: causing execution of the command, based on the first portion and the predicted second portion, prior to an end of the voice input.
Reference: causing execution of the command, based on the first portion and the predicted second portion, prior to an end of the voice input.

"Training" in the reference parent is different from "determining" in the instant Application. However, due to the broad and otherwise identical language of both the Claim of the Application and the claim of the parent, and because "determining" patterns is a first step in training, and because a subsequent "predicting" is usually based on some type of previous "training," each claim is obvious in view of the other. The remaining Claims are either parallel to Claim 1 or depend from Claim 1 and are rejected under the combination of claim 1 of the reference and the 35 U.S.C. 103 references applied to each dependent Claim below.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.

Step 1: The independent Claims are directed to statutory categories. Claim 1 is a method claim and directed to the process category of patentable subject matter. Claim 8 is a system claim and directed to the machine or manufacture category of patentable subject matter. Claim 15 is a computer-readable-storage-device claim and is directed to the machine or manufacture category of patentable subject matter.

Step 2A, Prong One: Does the Claim recite a judicially recognized exception, i.e., an abstract idea? Are these Claims considered abstract as a Mathematical Concept (mathematical relationships, mathematical formulas or equations, mathematical calculations); a Mental Process (concepts performed in the human mind, including an observation, evaluation, judgment, or opinion); or Certain Methods of Organizing Human Activity (1 - fundamental economic principles or practices, including hedging, insurance, and mitigating risk; 2 - commercial or legal interactions, including agreements in the form of contracts, legal obligations, advertising, marketing or sales activities or behaviors, and business relations; 3 - managing personal behavior or relationships or interactions between people, including social activities, teaching, and following rules or instructions), and do they thus fall under the judicial exception to patentable subject matter? The rejected Claims recite Mental Processes or Methods of Organizing Human Activity.

Step 2A, Prong Two: Do additional elements integrate the judicial exception into a practical application? This step involves identifying whether there are any additional elements recited in the claim beyond the judicial exception(s), and evaluating those additional elements to determine whether they integrate the exception into a practical application of the exception. "Integration into a practical application" requires an additional element or a combination of additional elements in the claim to apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception, such that the claim is more than a drafting effort designed to monopolize the exception. The analysis uses the considerations laid out by the Supreme Court and the Federal Circuit to evaluate whether the judicial exception is integrated into a practical application.

The rejected Claims do not include additional limitations that point to integration of the abstract idea into a practical application and are therefore directed to the abstract idea. Claim 1 is not even a generic automation but rather a mere expression of the abstract idea of predicting a command based on past patterns and performing the command. There is no automation or connection to a machine in the claim.

1. A method comprising:
determining, based on historical user input data, one or more patterns associated with one or more voice commands; [Jack works at an ice cream stand and knows the patterns of ice cream orders, such as which toppings are usually requested with which type of ice cream.]
receiving a voice input indicating data associated with a first portion of a command; [Jack's customer asks for a butter pecan scoop.]
predicting, based on the first portion and while the voice input is being received, and based on the determined one or more patterns, a second portion of the command; and [Jack can predict what type of topping the customer is going to ask for based on the type of ice cream.]
causing execution of the command, based on the first portion and the predicted second portion, prior to an end of the voice input. [Jack begins preparing the order and puts whipped cream on top before the customer has a chance to say the rest of the order.]

Step 2B: Search for an Inventive Concept: the additional elements do not amount to significantly more. The method Claim 1 has no additional limitations, and for parallel Claims 8 and 15, the limitations of processors, memory, and programs are well-understood, routine, and conventional machine components that are being used for their well-understood, routine, conventional, and rather generic functions. Additionally, these limitations are expressed parenthetically and lack nexus to the Claim language, and as such are a separable and divisible mention of a machine. Accordingly, they are not sufficient to cause the Claim as a whole to amount to significantly more than the underlying abstract idea.

The Dependent Claims do not add limitations that could integrate the abstract idea into a practical technological application or could help the Claim as a whole to amount to significantly more than the abstract idea identified for the Independent Claim:

2. The method of claim 1, further comprising: storing second data indicative of a complete voice input; and [Jack notes and remembers that the customer is asking for mint chips and not for whipped cream.] determining, based on the stored second data, that the predicted second portion is incorrect; and [Jack realizes that his prediction was incorrect.] causing execution of a second command that is associated with the complete voice input. [Jack removes the whipped cream and sprinkles mint chips.]

3. The method of claim 1, wherein the predicting second portion is further based on at least one of: one or more common input commands, metadata, time information, location information, demographic information, or differences between a format of the voice input and formats of previous inputs and changes in acoustic features. [Jack knows that customers of a particular age like a particular topping: demographic information.]

4. The method of claim 1, wherein the predicting further comprises determining the end of the voice input based on at least one of: one or more acoustic features of the voice input, one or more linguistic features of the voice input, or detection of one or more additional voice inputs. [Jack can easily tell when the command ends based on the customer going quiet.]

5. The method of claim 4, wherein the one or more acoustic features comprise one or more energy levels of the voice input. [Jack can tell the customer has stopped when the volume/energy of the voice goes down.]

6. The method of claim 4, wherein the one or more linguistic features comprise one or more formats of the voice input. [Jack can tell the customer has stopped based on the format of his voice, whatever "format" may mean in this Claim.]

7. The method of claim 4, wherein the one or more additional voice inputs indicate one or more voices. [Jack can tell the customer has stopped when other voices are heard.]

The remaining Claims are parallel system and CRM Claims and are rejected under similar rationale. Claims 8-14 are system claims with limitations corresponding to the limitations of Claims 1-7, respectively, and are rejected under similar rationale; the additional elements are addressed above. Claims 15-20 are computer program product claims with limitations corresponding to the limitations of method Claims 1 and 3-7, respectively, and are rejected under similar rationale; the additional elements are addressed above.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 7-11, 14-17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ortega (U.S. 6564213) in view of Kumar (U.S. 20200058300).

Regarding Claim 1, Ortega teaches:

1. A method comprising: [Ortega: the hardware is shown in Figure 1, which shows the types of devices that may be used or are involved in the process.]

determining, based on historical user input data, one or more patterns associated with one or more voice commands; [Ortega, Figure 1, "database 22" as part of the "query server 24." See Figure 4, "autocomplete database 22" including "dataset 30." "The string extraction component 46 is responsible for periodically generating one or more datasets 30 that contain autocompletion strings (terms and/or phrases) that describe items within the database 22. The autocompletion strings are preferably extracted from a log of query submissions (as illustrated in FIGS. 4 and 5) and/or from the item descriptions within the database 22 (as illustrated in FIG. 6), although other sources of information could be used such as customer reviews or manufacturers' databases." 4:1-9. For example, in Figure 5: start from state 80 (get query log data for the last M days) and proceed to state 86 (identify and assign scores to the most frequently used search terms and phrases). This is extracting patterns from data associated with historical queries over the past M days.
Query teaches the command of the Claim and is a command for information.]

receiving a voice input indicating data associated with a first portion of a command; [Ortega, "The invention may be used regardless of the particular text input method used (stylus/graffiti, voice, keyboard, etc.)." 2:18-20. "The invention may be used with any of a variety of text entry methods, including but not limited to handwriting recognition (e.g., graffiti), voice recognition, and keyboard entry." 3:17-20.]

predicting, based on the first portion and while the voice input is being received, and based on the determined one or more patterns, a second portion of the command; and [Ortega is directed to autocompleting/predicting the remainder of input queries/commands. See Figures 2A and 2B, where when "So-" and "Sony" are entered as the "first portion," and before the rest of the input is received (by voice, text, stylus, etc.), the possible remainders/second portions are predicted and provided at 62. "FIGS. 2(a) and 2(b) illustrate the general form of a user interface that may be used by the autocompletion client 50 for both PCs and handheld computing devices. In this example, as the user enters a search query into a search field 60 of the Amazon.com web site (by voice, stylus, etc.), the autocompletion client displays suggested autocompletion terms and phrases in a drop-down box 62. As illustrated in FIG. 2(a), terms are displayed in an upper pane of the box and phrases are displayed in a lower pane of the box. In other implementations, the autocompletion client may only suggest one type of string (terms or phrases) without the other. As illustrated in FIG. 2(b), once the user has completed a term, the autocompletion client may only display suggested phrases." 5:22-36. The Claim broadly refers to first and second portions of a command without specifying what the first and second portions may be. Ortega teaches several options where the first portion is a partial term or a full and complete term, and the second portion is a partial term completing the first portion, or a full phrase with several terms.]

causing execution of the command, based on the first portion and the predicted second portion, prior to an end of the voice input. [Ortega, Figures 2A and 2B, provide a slew of choices that lead to the execution of the query/command, all before the user says the entire query/command: "The user interface may implement one or more methods for the user to perform the dual action of selecting and submitting a displayed autocompletion string with a single selection action, such as a single (or double) mouse click, a single (or double) tap on the string with a stylus, or a voice command. In one embodiment, for example, if the user taps or clicks once on a suggested string (term or phrase), the string is automatically added to the search field 60; and if the user taps or clicks twice on a string, the string is automatically submitted as the search query. In another embodiment, tapping or clicking once on a string causes the string to be submitted as the search query. In either case, the user can advantageously initiate the search without moving the stylus or mouse cursor away from the selected string. In embodiments that support voice recognition, each suggested string may be displayed in conjunction with a number (1, 2, 3, . . . ) that can be used as a voice command to perform the dual action of selecting the string and initiating the search." 5:36-55.]

Ortega is directed to query/response systems, where a query is a type of command that elicits a response. Ortega does not mention a "command" expressly. Kumar teaches:

1. A method comprising: [Kumar, Figures 7, 8, and 9 show the hardware that teaches the processors 704, memory 706, and storage 708 that are called for in the parallel independent Claims 8 and 15.]

determining, based on historical user input data, one or more patterns associated with one or more voice commands; [Kumar, Figure 3, "Personal Graph Generator 345a," which generates the "personal graph data 335," teaches historical patterns of the user's input. "[0063] Linkages between intents in the unpersonalized graph input in the personal graph generator(s) 345 may be based on system user history across domains of the system from many different users. …" "[0076] Various machine learning techniques may be used to train and operate the personal graph generator 345 as well as the context merging component 325. … Focusing on SVM as an example, SVM is a supervised learning model with associated learning algorithms that analyze data and recognize patterns in the data, and which are commonly used for classification and regression analysis. …" The "intent pairs" in [0059]-[0061] are examples of pairs of commands such as Order Pizza or Play Music.]

receiving a voice input indicating data associated with a first portion of a command; [Kumar, Figure 1A, User 5 is inputting his command to the Amazon Alexa 110a by Audio 11. See Figure 3, "Audio Data 305" and "Speech Recognition 250." Figure 4 shows "Intent 1 402" as the first portion of the command, with certain probabilities leading to potential second portions as "Intent 2 404" or "Intent 3 406."]

predicting, based on the first portion and while the voice input is being received, and based on the determined one or more patterns, a second portion of the command; and [Kumar, Figure 4 shows "Intent 1 402" as the first portion of the command, with certain probabilities leading to potential second portions as "Intent 2 404" or "Intent 3 406." "Techniques for determining a command or intent likely to be subsequently invoked by a user of a system are described. A user inputs a command (either via a spoken utterance or textual input) to a system. The system determines content responsive to the command. The system also determines a second command or corresponding intent likely to be invoked by the user subsequent to the previous command. Such determination may involve analyzing pairs of intents, with each pair being associated with a probability that one intent of the pair will be invoked by a user subsequent to a second intent of the pair. The system then outputs first content responsive to the first command and second content soliciting the user as to whether the system to execute the second command."]

causing execution of the command, based on the first portion and the predicted second portion, prior to an end of the voice input. [Kumar, Figures 6A and 6B show the output of the results of the execution of the first command/first portion of the Claim and the second command/second portion of the Claim responsive only to the input of the first command/first portion. The second command/portion is guessed by the system based on context and history. "[0086] The server(s) 120 determines (620) user data associated with the user that either spoke the utterance or generated the textual input. The server(s) 120 also determines (622) context data associated with processing of the previous user command. The server(s) 120 may further determine (623) data indicating previous instances of intent suggestion success and failure. That is, the data may indicate when the user previously instructed the system to execute a suggested intent as well as when the user previously instructed the system not to execute a suggested intent. The server(s) 120 determines (624) an intent likely to be subsequently invoked by the user based on the input text data associated with the previous command, the intent associated with the previous command, the user data, the context data, and the data indicating the previous instances of intent suggestion success and failure. For a 1P application, the server(s) 120 may determine the second intent prior to determining a 1P application 290 configured to execute the intent. For a 3P application, the server(s) 120 may determine a 3P application prior to determining the second intent that may be performed by the 3P application."]

Ortega and Kumar pertain to spoken commands and queries, and it would have been obvious to combine the system of Kumar with Ortega to fill and complement the process of Ortega with an express teaching of commands that are executed to perform a function (such as ordering pizza), to arrive at a more versatile system. This combination falls under combining prior art elements according to known methods to yield predictable results, or use of a known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141; KSR, 550 U.S. at 418, 82 USPQ2d at 1396.

Regarding Claim 2, Ortega teaches:

2. The method of claim 1, further comprising: storing second data indicative of a complete voice input; and [Ortega, Figures 2A and 2B. In Figure 2A, the first portion that is input is "So-" and therefore the autocomplete suggestions include Software, Songs, Socks, etc., which were not intended. In Figure 2B, all of "Sony" is input.] determining, based on the stored second data, that the predicted second portion is incorrect; and [Ortega, in Figure 2B, where all of "Sony" is input, the options of Software, Songs, etc. are determined as incorrect.] causing execution of a second command that is associated with the complete voice input. [Ortega, Figure 2B. One of the selected options of Figure 2B will be executed that includes the complete "Sony" input.]

Ortega is directed primarily to query/response, and Kumar expressly teaches the execution of a command: causing execution of a second command that is associated with the complete voice input. [Kumar causes execution of the two successive commands (first portion and second portion of the Claim), with examples provided as: "[0059] [0.345] <GetWeather>; <GetTraffic> [0060] [0.217] <OrderPizza>; <PlayMovie> [0061] [0.121] <PlayMusic>; <SetVolume>".] Rationale for combination as provided for Claim 1.

Regarding Claim 3, Ortega teaches:

3. The method of claim 1, wherein the predicting second portion is further based on at least one of: one or more common input commands, metadata, time information, location information, demographic information, or differences between a format of the voice input and formats of previous inputs and changes in acoustic features. [Ortega selects the autocomplete prediction based on frequency/common inputs. Figure 5, 86. "Another method, which is illustrated in FIGS. 4 and 5 and described below, involves monitoring query submissions over time to identify the most frequently used search terms and/or phrases." 4:27-31. "In state 86, the process identifies and assigns scores to the most frequently used terms and phrases, excluding common stop words. This may be accomplished, for example, by counting the number of times each word or phrase appears within the relevant log data. …" 7:66-8:3.]

Regarding Claim 4, Ortega teaches:

4. The method of claim 1, wherein the predicting further comprises determining the end of the voice input based on at least one of: one or more acoustic features of the voice input, one or more linguistic features of the voice input, or detection of one or more additional voice inputs. [Ortega in Figures 2A and 2B, upon display of the autocomplete suggestions, permits the selection of an option, which would constitute the "end of the voice input," and additionally such selection may be done with a voice input selecting a number associated with the selection. "… In embodiments that support voice recognition, each suggested string may be displayed in conjunction with a number (1, 2, 3, . . . ) that can be used as a voice command to perform the dual action of selecting the string and initiating the search." 5:36-55.]
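For readers less familiar with the cited art, the log-based scoring Ortega describes (Figure 5, state 86: count term frequency in recent query logs, excluding common stop words, then suggest completions for a partially entered "first portion") can be sketched roughly as follows. This is an illustrative sketch, not code from the reference; all names, the stop-word list, and the sample log are assumptions.

```python
from collections import Counter

# Rough sketch of Ortega's autocompletion-dataset idea (Figs. 4-5):
# score terms by frequency in a recent query log, excluding common
# stop words, then suggest the highest-scoring completions for a
# partially entered prefix. Illustrative only.

STOP_WORDS = {"the", "a", "of", "for"}

def build_scores(query_log):
    """State 86: identify and score the most frequently used terms."""
    counts = Counter()
    for query in query_log:
        for term in query.lower().split():
            if term not in STOP_WORDS:
                counts[term] += 1
    return counts

def suggest(prefix, scores, limit=3):
    """Offer completions for the partial input, best score first."""
    matches = [t for t in scores if t.startswith(prefix.lower())]
    return sorted(matches, key=lambda t: -scores[t])[:limit]

log = ["sony tv", "sony camera", "songs of the summer", "socks", "sony tv"]
scores = build_scores(log)
print(suggest("so", scores))  # → ['sony', 'songs', 'socks']
```

The same scoring supports the Claim 3 mapping above: "common input commands" correspond to the highest-frequency entries in the log.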
Regarding Claim 7, Ortega teaches: 7. The method of claim 4, wherein the one or more additional voice inputs indicate one or more voices. [Ortega permits the selection of an option with would constitute the “end of the voice input” with a voice input selecting a number associated with the selection and this teaches “one or more voices” of the Claim. “… In embodiments that support voice recognition, each suggested string may be displayed in conjunction with a number (1, 2, 3, . . . ) that can be used as a voice command to perform the dual action of selecting the string and initiating the search.” 5: 36-55.] Claim 8 is a system claim with limitations corresponding to the limitations of Claim 1 and is rejected under similar rationale. Claim 9 is a system claim with limitations corresponding to the limitations of Claim 2 and is rejected under similar rationale. Claim 10 is a system claim with limitations corresponding to the limitations of Claim 3 and is rejected under similar rationale. Claim 11 is a system claim with limitations corresponding to the limitations of Claim 4 and is rejected under similar rationale. Claim 14 is a system claim with limitations corresponding to the limitations of Claim 7 and is rejected under similar rationale. Claim 15 is a computer program product system claim with limitations corresponding to the limitations of method Claim 1 and is rejected under similar rationale. Claim 16 is a computer program product system claim with limitations corresponding to the limitations of method Claim 3 and is rejected under similar rationale. Claim 17 is a computer program product system claim with limitations corresponding to the limitations of method Claim 4 and is rejected under similar rationale. Claim 20 is a computer program product system claim with limitations corresponding to the limitations of method Claim 7 and is rejected under similar rationale. Claims 5, 12, and 18 are rejected under 35 U.S.C. 
103 as being unpatentable over Ortega and Kumar in view of Maas (U.S. 10854192). Regarding Claim 5, Ortega does not determine the end of speech from the acoustic features of the user’s voice. Kumar does not include endpoint detection either. Maas teaches: 5. The method of claim 4, wherein the one or more acoustic features comprise one or more energy levels of the voice input. [Maas teaches endpoint detection of speech. See Abstract. And teaches that one technique of end-pointing depends on the energy/volume of speech. “To determine the beginning or end of an audio command, a number of techniques may be used. … The beginning/end of an utterance may also be detected using speech/voice characteristics. Other techniques may also be used to determine the beginning of an utterance (also called beginpointing) or end of an utterance (endpointing). Beginpointing/endpointing may be based, for example, on the number of silence/non-speech audio frames, for instance the number of consecutive silence/non-speech frames. For example, some systems may employ energy based or acoustic model based voice activity detection (VAD) techniques. Such techniques may determine whether speech is present in an audio input based on various quantitative aspects of the audio input, such as the spectral slope between one or more frames of the audio input; the energy levels (such as a volume, intensity, amplitude, etc.) of the audio input … These factors may be compared to one or more thresholds to determine if a break in speech has occurred that qualifies as a beginpoint/endpoint….” 11:12-52.] Ortega/Kumar and Maas pertain to spoken commands, and endpoint detection is an implementation feature that is used in spoken command systems but is not elaborated upon in references that are focused on other aspects. It would have been obvious to combine a known method of endpoint detection with the Ortega/Kumar combination to detect the end of a command. 
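The energy-based endpointing Maas describes (comparing per-frame energy levels to a threshold and declaring an endpoint after enough consecutive silence/non-speech frames) can be illustrated with a minimal sketch. The threshold, frame-count parameter, and function name are arbitrary illustrative choices, not values from the reference.

```python
def find_endpoint(frame_energies, energy_threshold=0.1, min_silence_frames=3):
    """Return the index of the frame where the utterance ends, i.e. the
    start of the first run of `min_silence_frames` consecutive frames whose
    energy falls below `energy_threshold`; return None if no such run."""
    silent_run = 0
    for i, energy in enumerate(frame_energies):
        if energy < energy_threshold:
            silent_run += 1
            if silent_run >= min_silence_frames:
                return i - min_silence_frames + 1
        else:
            silent_run = 0  # speech resumed; reset the silence counter
    return None

# Three high-energy (speech) frames followed by low-energy (silence) frames.
energies = [0.8, 0.7, 0.9, 0.05, 0.02, 0.01, 0.03]
endpoint = find_endpoint(energies)  # silence run begins at frame 3
```

Real VAD implementations weigh additional factors Maas lists (spectral slope, acoustic models); this sketch shows only the energy-threshold comparison.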
This combination falls under combining prior art elements according to known methods to yield predictable results or use of known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141; KSR, 550 U.S. at 418, 82 USPQ2d at 1396. Claim 12 is a system claim with limitations corresponding to the limitations of Claim 5 and is rejected under similar rationale. Claim 18 is a computer program product claim with limitations corresponding to the limitations of method Claim 5 and is rejected under similar rationale. Claims 6, 13, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Ortega and Kumar in view of Stanford (U.S. 20030110042). Regarding Claim 6, Ortega does not determine the end of speech from the format of the voice input. Kumar does not include endpoint detection either. Stanford teaches: 6. The method of claim 4, wherein the one or more linguistic features comprise one or more formats of the voice input. [Stanford includes end-pointing of speech commands: “[0011] … This may allow the user to be more comfortable interacting with a speech recognition system, as well as reducing timing considerations such as detecting the start and end points of a speech command.” It teaches that the endpoint markers may be based on format of the speech: “[0037] … ASR control module 510 may coordinate insertion of the appropriate start and end codes for the speech information in accordance with the position signal, and instruct vocoder 506 to format the uncompressed or compressed digital speech signals into the appropriate transmission format.”] Ortega/Kumar and Stanford pertain to spoken commands, and endpoint detection is an implementation feature that is used in spoken command systems but is not elaborated upon in references that are focused on other aspects. It would have been obvious to combine a known method of endpoint detection with the Ortega/Kumar combination to detect the end of a command. 
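Stanford’s insertion of start and end codes around the speech information can be illustrated with a minimal framing sketch: the command’s endpoints are recoverable from the transmission format itself. The marker bytes and function names below are hypothetical, chosen only to show the idea.

```python
START_CODE = b"\x02"  # hypothetical start-of-command marker
END_CODE = b"\x03"    # hypothetical end-of-command marker

def frame_command(speech_bytes):
    """Wrap a speech payload with start/end codes for transmission."""
    return START_CODE + speech_bytes + END_CODE

def extract_command(stream):
    """Recover the payload between the first start and end codes,
    locating the command's endpoints from the format alone."""
    start = stream.index(START_CODE) + len(START_CODE)
    end = stream.index(END_CODE, start)
    return stream[start:end]

framed = frame_command(b"volume up")
payload = extract_command(b"noise" + framed + b"trailing")
```

This is only a toy framing scheme; Stanford’s codes are inserted in accordance with a position signal and a vocoder-specific transmission format.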
This combination falls under combining prior art elements according to known methods to yield predictable results or use of known technique to improve similar devices (methods, or products) in the same way. See MPEP 2141; KSR, 550 U.S. at 418, 82 USPQ2d at 1396. Claim 13 is a system claim with limitations corresponding to the limitations of Claim 6 and is rejected under similar rationale. Claim 19 is a computer program product claim with limitations corresponding to the limitations of method Claim 6 and is rejected under similar rationale. Conclusion The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Any inquiry concerning this communication or earlier communications from the examiner should be directed to FARIBA SIRJANI whose telephone number is (571)270-1499. The examiner can normally be reached 9 to 5, M-F. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre Desir, can be reached at 571-272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 
If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /Fariba Sirjani/ Primary Examiner, Art Unit 2659

Prosecution Timeline

Jul 09, 2024
Application Filed
Jan 26, 2026
Non-Final Rejection — §101, §103, §DP (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603099
SELF-ADJUSTING ASSISTANT LLMS ENABLING ROBUST INTERACTION WITH BUSINESS LLMS
2y 5m to grant · Granted Apr 14, 2026
Patent 12579482
Schema-Guided Response Generation
2y 5m to grant · Granted Mar 17, 2026
Patent 12572737
GENERATIVE THOUGHT STARTERS
2y 5m to grant · Granted Mar 10, 2026
Patent 12537013
AUDIO-VISUAL SPEECH RECOGNITION CONTROL FOR WEARABLE DEVICES
2y 5m to grant · Granted Jan 27, 2026
Patent 12492008
Cockpit Voice Recorder Decoder
2y 5m to grant · Granted Dec 09, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
76%
Grant Probability
99%
With Interview (+31.0%)
2y 10m
Median Time to Grant
Low
PTA Risk
Based on 547 resolved cases by this examiner. Grant probability derived from career allow rate.
