Last updated: May 29, 2026
Application No. 17/341,691
VOICE-BASED ORDER PROCESSING

Final Rejection §101 §103
Filed: Jun 08, 2021
Priority: Mar 28, 2019 (continuation of 11/132,740)
Examiner: SULLIVAN, THOMAS J
Art Unit: 3689
Tech Center: 3600 (Transportation & Electronic Commerce)
Assignee: NCR Voyix Corporation
OA Round: 8 (Final)
Interview Optional

An interview has already been conducted in this application's prosecution history. This examiner has a 28% grant rate with a +21.9% interview lift. Since an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 130 resolved cases, 2023–2026
Examiner Intelligence

SULLIVAN, THOMAS J
Career Allowance Rate: 28% (37 granted / 130 resolved), -23.5% vs TC avg
Interview Lift: +21.9% (resolved cases with interview vs. without)
Avg Prosecution (typical timeline): 3y 3m
24 currently pending
Career history: 169 total applications across all art units
Statute-Specific Performance

§101: 19.2% (-20.8% vs TC avg)
§103: 69.6% (+29.6% vs TC avg)
§102: 8.0% (-32.0% vs TC avg)
§112: 2.1% (-37.9% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 130 resolved cases
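The figures above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming each "vs TC avg" figure is a percentage-point difference (the page does not state this explicitly); the variable names are illustrative only.

```python
# Figures copied from the examiner dashboard above.
granted, resolved = 37, 130
allowance_rate = 100 * granted / resolved  # shown on the page rounded to 28%

# (examiner value, delta vs TC avg), both in percent.
# Assumption: each delta is a percentage-point difference.
statute_stats = {
    "101": (19.2, -20.8),
    "103": (69.6, 29.6),
    "102": (8.0, -32.0),
    "112": (2.1, -37.9),
}

# Implied Tech Center average = examiner value minus delta.
implied_tc_avg = {s: round(v - d, 1) for s, (v, d) in statute_stats.items()}

print(f"allowance rate: {allowance_rate:.1f}%")
for statute, avg in implied_tc_avg.items():
    print(f"§{statute}: implied TC avg {avg}%")
```

Under that assumption, all four statutes imply the same 40.0% baseline, which fits the legend's description of a single Tech Center average estimate.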
Office Action

§101 §103
Detailed Action
Status of Claims
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Action is in reply to the Amendment filed on 1/27/2026. Claims 2-18 and 20-21 are currently pending and have been examined. Claims 1 and 19 stand cancelled. Claims 2, 13, and 20 have been amended. 

Priority
The examiner acknowledges that the instant application claims priority from parent Application 16/368,772, filed 03/28/2019. Therefore, the claims receive the effective filing date of 03/28/2019.

Claim Rejection - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 2-12 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
	
First, it is determined whether the claims are directed to a statutory category of invention. In the instant case, claims 2-12 are directed to a process. Therefore, claims 2-12 are directed to statutory subject matter under Step 1 of MPEP 2106  (Step 1: YES).
The claims are then analyzed to determine whether the claims are directed to a judicial exception. In determining whether the claims are directed to a judicial exception, the claims are analyzed to evaluate whether the claims recite a judicial exception (Prong One of Step 2A), as well as analyzed to evaluate whether the claims recite additional elements that integrate the judicial exception into a practical application of the judicial exception (Prong Two of Step 2A). 
Claim 2 recites at least the following limitations that are believed to recite an abstract idea:
detecting a user in a vehicle at a drive thru terminal by processing images using an algorithm that is refined on other images to determine when the vehicle is present at the drive thru terminal, by events, or by receiving a vocal request from the user; 
greeting the user by an initial greeting and displaying corresponding text presented on a display of the drive thru terminal based on the detecting; 
assigning a unique identifier for a transaction with the user from a transaction manager of a point of sale associated with the drive thru;
engaging the user in a natural language dialogue to start a session for the transaction with the user using a predefined set of vocabulary terms of a lexicon that is specific to a menu and menu options associated with the drive thru terminal and using the transaction identifier obtained from the transaction manager, wherein engaging further includes translating speech provided the user into text commands associated with an order using the lexicon, wherein the lexicon is associated with a type of restaurant;
	wherein the lexicon includes predefined order-specific words and phrases comprising commands for one or more of: want, like, make, add, order, buy, purchase, cancel, delete, remove, modify, or change;
	wherein the commands are identified as processing actions that are processed by the transaction manager for a particular order;
configuring the lexicon to be specific to a particular restaurant type and providing the lexicon to a voice-enabled service as a specialized feature, wherein the voice-enable service is a modified version of a consumer-voice service that includes the specialized feature for a specialized lexicon and that issues commands and operates through a voice-enable communication;
providing a real-time speech based dialogue for the user to verbally communicate the order, wherein the real-time speech-based dialogue provides a dialogue flow based on the lexicon associated with the type of restaurant to enable order processing;
 receiving the order from the user during the session based on the natural language dialogue; 
providing the order to order fulfillment processing for fulfillment; 
providing instructions to the user during the natural language dialogue for the user to provide a payment to complete the transaction, the session, and the natural language dialogue; and
providing language-based ordering for the user at the drive-thru during the session tailored to the type of restaurant associated with the drive thru, wherein the user is instructed to obtain the order once fulfilled from a designated window or a food storage bin that  unlocks when the order is completed, wherein providing language-based ordering further includes finalizing, at the drive thru terminal, the order for subsequent order fulfillment processing.

The above limitations recite the concept of drive-through ordering. These limitations, under their broadest reasonable interpretation, fall within the “Certain Methods of Organizing Human Activity” grouping of abstract ideas, enumerated in MPEP 2106, in that they recite commercial interactions, e.g. sales activities/behaviors, and managing personal behavior or relationships or interactions between people, e.g., following rules or instructions. Accordingly, under Prong One of Step 2A, claim 2 recites an abstract idea (Step 2A, Prong One: YES).
Prong Two of Step 2A is the next step in the eligibility analyses and looks at whether the abstract idea is integrated into a practical application. This requires an additional element or combination of additional elements in the claims to apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception, such that the claim is more than a drafting effort designed to monopolize the exception.
	In this instance, the claims recite the additional elements of:
providing executable instructions to a processor of a device from a non-transitory computer-readable storage medium
processing images captured by a camera using a machine-learning algorithm that is trained
events triggered by a sensor
A microphone
The greeting being played over a speaker
A point-of-sale terminal
Automated speech recognition
An order interface
A network service and network device
A real-time speech-based interface configured to adapt a dialogue flow
Automated natural language-based ordering
The unlocking being automatic
However, these elements do not amount to an improvement in the functioning of a computer or any other technology or technical field; apply the judicial exception with, or by use of, a particular machine; or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort to monopolize the exception.
In addition, the recitations are recited at a high level of generality and also do not amount to an improvement in the functioning of a computer or any other technology or technical field; apply the judicial exception with, or by use of, a particular machine; or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort to monopolize the exception.
 
The dependent claims also fail to recite elements which amount to an improvement in the functioning of a computer or any other technology or technical field; apply the judicial exception with, or by use of, a particular machine; or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort to monopolize the exception. For example, claims 3, 5, 7-9, and 12 are directed to the abstract idea itself and do not amount to an integration according to any one of the considerations above. As for claims 4, 6, and 10-11 these claims are similar to the independent claims except that they recite the further additional elements of a camera, automated text, a mobile device, a speaker, and a server. These additional elements are recited at a high level of generality and also do not amount to an improvement in the functioning of a computer or any other technology or technical field; apply the judicial exception with, or by use of, a particular machine; or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort to monopolize the exception. Therefore the dependent claims do not create an integration for the same reasons.
Step 2B is the next step in the eligibility analyses and evaluates whether the claims recite additional elements that amount to an inventive concept (i.e., “significantly more”) than the recited judicial exception. According to Office procedure, revised Step 2A overlaps with Step 2B, and thus, many of the considerations need not be re-evaluated in Step 2B because the answer will be the same.
In Step 2A, several additional elements were identified as additional limitations:
providing executable instructions to a processor of a device from a non-transitory computer-readable storage medium
processing images captured by a camera using a machine-learning algorithm that is trained
events triggered by a sensor
A microphone
The greeting being played over a speaker
A point-of-sale terminal
Automated speech recognition
An order interface
A network service and network device
A real-time speech-based interface configured to adapt a dialogue flow
Automated natural language-based ordering
The unlocking being automatic
These additional limitations, including the limitations in the dependent claims, do not amount to an inventive concept because they were already analyzed under Step 2A and did not amount to a practical application of the abstract idea. Therefore, the claims lack one or more limitations which amount to an inventive concept in the claims.
For these reasons, the claims are rejected under 35 U.S.C. 101.
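The eligibility analysis applied above follows a fixed sequence: Step 1, Step 2A Prong One, Step 2A Prong Two, then Step 2B. The following is a minimal sketch of that sequence as a decision function, offered only as a reading aid. The class, field, and function names are hypothetical, and each boolean stands in for a legal determination that the examiner makes in prose.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    """Hypothetical container for the four determinations in the analysis."""
    statutory_category: bool      # Step 1: process, machine, manufacture, or composition
    recites_exception: bool       # Step 2A, Prong One: recites an abstract idea
    practical_application: bool   # Step 2A, Prong Two: exception integrated
    inventive_concept: bool       # Step 2B: "significantly more"


def eligible(f: Findings) -> bool:
    """Walk the steps in order; later steps matter only if earlier ones do not settle it."""
    if not f.statutory_category:
        return False              # fails Step 1
    if not f.recites_exception:
        return True               # no judicial exception recited
    if f.practical_application:
        return True               # integrated into a practical application
    return f.inventive_concept    # last chance: Step 2B


# Determinations as recorded in this action for claim 2:
# Step 1 YES, Prong One YES, Prong Two not met, Step 2B not met.
claim_2 = Findings(True, True, False, False)
print(eligible(claim_2))  # False
```

The sketch shows why the action treats Step 2B briefly: once Prong Two is answered in the negative on the same additional elements, the outcome turns entirely on whether those elements supply an inventive concept.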
Claims 13-18 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
	
First, it is determined whether the claims are directed to a statutory category of invention. In the instant case, claims 13-18 are directed to a process. Therefore, claims 13-18 are directed to statutory subject matter under Step 1 of MPEP 2106  (Step 1: YES).
The claims are then analyzed to determine whether the claims are directed to a judicial exception. In determining whether the claims are directed to a judicial exception, the claims are analyzed to evaluate whether the claims recite a judicial exception (Prong One of Step 2A), as well as analyzed to evaluate whether the claims recite additional elements that integrate the judicial exception into a practical application of the judicial exception (Prong Two of Step 2A). 
Claim 13 recites at least the following limitations that are believed to recite an abstract idea:
initiating an initial greeting for a user detected in a vehicle adjacent to a drive thru terminal with a voice prompt, upon detecting the vehicle using visual recognition techniques comprising an algorithm that is refined on images to determine when a vehicle is present at the drive thru terminal;
obtaining a transaction identifier for a transaction based on the initial greeting being spoken; 
configuring speech based on a specific lexicon of words associated with a menu and menu items available from the drive thru terminal; 
wherein the specific lexicon of words includes nouns identifying menu items, adjectives defining characteristics of menu items, prepositions for including or not including items with menu items, and exclamations for confirming an order;
configuring the specific lexicon to be specific to a particular restaurant type and providing the specific lexicon to a voice-enabled service as a specialized feature, wherein the voice- enabled service is a modified version of a consumer-voice service that includes the specialized feature for a specialized lexicon and that issues commands and operates through a voice- enabled communication;
initiating a session with the user using the transaction identifier; 
engaging the user during the session in a natural language dialogue comprising user-provided speech and generated speech to receive order details from the user for an order using speech processing, wherein engaging further includes translating speech provided the user into text commands associated with an order using the specific lexicon of words, wherein the specific lexicon of words is associated with a type of restaurant; 
instructing the user during the natural language dialogue on where and how a payment for the order can be supplied to complete the order; 
submitting the order details with the transaction identifier for order fulfillment and completion of the transaction; 
providing a real-time speech based dialogue for the user to verbally communicate the order, wherein the real-time speech-based dialogue provides a dialogue flow based on the specific lexicon of words associated with the type of restaurant to enable order processing; and 
providing language-based ordering for the user at the drive thru during the session tailored to the type of restaurant associated with the drive thru, wherein the user is instructed to obtain the order once fulfilled from a designated window or a food storage bin that unlocks when the order is fulfilled, wherein providing further includes finalizing, at the drive thru terminal, the order for subsequent order fulfillment processing.

The above limitations recite the concept of drive-through ordering. These limitations, under their broadest reasonable interpretation, fall within the “Certain Methods of Organizing Human Activity” grouping of abstract ideas, enumerated in MPEP 2106, in that they recite commercial interactions, e.g. sales activities/behaviors, and managing personal behavior or relationships or interactions between people, e.g., following rules or instructions. Accordingly, under Prong One of Step 2A, claim 13 recites an abstract idea (Step 2A, Prong One: YES).
Prong Two of Step 2A is the next step in the eligibility analyses and looks at whether the abstract idea is integrated into a practical application. This requires an additional element or combination of additional elements in the claims to apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception, such that the claim is more than a drafting effort designed to monopolize the exception.
	In this instance, the claims recite the additional elements of:
providing executable instructions to a processor of a device from a non-transitory computer-readable storage medium 
speech synthesis
a machine-learning algorithm that is trained
a point of sale terminal
the greeting being spoken automatically
a network service and network device
Automated generated speech
Automated speech processing
An order interface
A real-time speech-based interface configured to adapt a dialogue flow
Automated natural language-based ordering
The unlocking being automatic
However, these elements do not amount to an improvement in the functioning of a computer or any other technology or technical field; apply the judicial exception with, or by use of, a particular machine; or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort to monopolize the exception.
In addition, the recitations are recited at a high level of generality and also do not amount to an improvement in the functioning of a computer or any other technology or technical field; apply the judicial exception with, or by use of, a particular machine; or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort to monopolize the exception.
 
The dependent claims also fail to recite elements which amount to an improvement in the functioning of a computer or any other technology or technical field; apply the judicial exception with, or by use of, a particular machine; or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort to monopolize the exception. For example, claims 14-18 are directed to the abstract idea itself and do not amount to an integration according to any one of the considerations above. Therefore the dependent claims do not create an integration for the same reasons.
Step 2B is the next step in the eligibility analyses and evaluates whether the claims recite additional elements that amount to an inventive concept (i.e., “significantly more”) than the recited judicial exception. According to Office procedure, revised Step 2A overlaps with Step 2B, and thus, many of the considerations need not be re-evaluated in Step 2B because the answer will be the same.
In Step 2A, several additional elements were identified as additional limitations:
providing executable instructions to a processor of a device from a non-transitory computer-readable storage medium 
speech synthesis
a machine-learning algorithm that is trained
a point of sale terminal
the greeting being spoken automatically
a network service and network device
Automated generated speech
Automated speech processing
An order interface
A real-time speech-based interface configured to adapt a dialogue flow
Automated natural language-based ordering
The unlocking being automatic
These additional limitations, including the limitations in the dependent claims, do not amount to an inventive concept because they were already analyzed under Step 2A and did not amount to a practical application of the abstract idea. Therefore, the claims lack one or more limitations which amount to an inventive concept in the claims.
For these reasons, the claims are rejected under 35 U.S.C. 101.
Claims 20-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
	
First, it is determined whether the claims are directed to a statutory category of invention. In the instant case, claims 20-21 are directed to a machine. Therefore, claims 20-21 are directed to statutory subject matter under Step 1 of MPEP 2106  (Step 1: YES).
The claims are then analyzed to determine whether the claims are directed to a judicial exception. In determining whether the claims are directed to a judicial exception, the claims are analyzed to evaluate whether the claims recite a judicial exception (Prong One of Step 2A), as well as analyzed to evaluate whether the claims recite additional elements that integrate the judicial exception into a practical application of the judicial exception (Prong Two of Step 2A). 
Claim 20 recites at least the following limitations that are believed to recite an abstract idea:
initiating a voice interaction with a user when a vehicle of the user is detected adjacent to the drive thru terminal based on images using an algorithm that is refined on images to determine when the vehicle is present at the drive thru terminal and when the user utters a specific command;
obtaining a unique transaction identifier for a transaction with the user during the voice interaction from a transaction manager;
generating speech responses to user-provided speech during the voice interaction, wherein generating further includes translating speech provided the user into text commands associated with an order using a specific vocabulary for a specific lexicon of words, wherein the specific lexicon of words is associated with a type of restaurant;
	wherein the specific lexicon includes a restricted set of words and phrases to identify user order commands, menu items as nouns, adjectives affecting the menu items, prepositions affecting menu items, and exclamations that confirm or do not confirm a particular user order;
configuring the specific lexicon to be specific to a particular restaurant type and providing the specific lexicon to a voice-enabled service as specialized feature, wherein a voice- enabled service is a modified version of a consumer-voice service that includes the specialized feature for a specialized lexicon and that issues commands and operates through a voice- enabled communication;
obtaining order details for an order being placed by the user based on the user-provided speech during the voice interaction using the unique transaction identifier for the transaction; 
providing the order details and the unique transaction identifier for order fulfillment; 
instructing the user on how and where to provide a payment to pay for the order and complete the order; 
providing a real-time speech-based dialogue for the user through the drive thru during the voice interaction, wherein the real-time speech-based dialogue provides a dialogue flow based on the specific lexicon of words associated with a particular type of restaurant to enable order processing;
providing language-based ordering for the user at the drive thru during the voice interaction tailored to the type of restaurant associated with the drive thru, wherein the user is instructed to obtain the order once fulfilled from a designated pickup location associated with the drive thru, wherein the designated window or a food storage bin that unlocks when the order is complete, wherein providing language-based ordering further includes finalizing, at the drive thru terminal, the order for subsequent order fulfillment processing.

The above limitations recite the concept of drive-through ordering. These limitations, under their broadest reasonable interpretation, fall within the “Certain Methods of Organizing Human Activity” grouping of abstract ideas, enumerated in MPEP 2106, in that they recite commercial interactions, e.g. sales activities/behaviors, and managing personal behavior or relationships or interactions between people, e.g., following rules or instructions. Accordingly, under Prong One of Step 2A, claim 20 recites an abstract idea (Step 2A, Prong One: YES).
Prong Two of Step 2A is the next step in the eligibility analyses and looks at whether the abstract idea is integrated into a practical application. This requires an additional element or combination of additional elements in the claims to apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception, such that the claim is more than a drafting effort designed to monopolize the exception.
	In this instance, the claims recite the additional elements of:
a drive thru terminal comprising a display, a microphone, and a camera
a Point-Of-Sale (POS) terminal interfaced to an order fulfillment system
a server comprising a processor and a non-transitory computer-readable storage medium; the non-transitory computer-readable storage medium comprises processor-executable instructions
images provided by the camera
a machine-learning algorithm that is trained
automated speech responses
an order interface
a network service and network device
a real-time speech-based interface configured to adapt a dialogue flow
Automated natural language-based ordering
The unlocking being automatic
However, these elements do not amount to an improvement in the functioning of a computer or any other technology or technical field; apply the judicial exception with, or by use of, a particular machine; or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort to monopolize the exception.
In addition, the recitations are recited at a high level of generality and also do not amount to an improvement in the functioning of a computer or any other technology or technical field; apply the judicial exception with, or by use of, a particular machine; or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort to monopolize the exception.
 
The dependent claims also fail to recite elements which amount to an improvement in the functioning of a computer or any other technology or technical field; apply the judicial exception with, or by use of, a particular machine; or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort to monopolize the exception. As for claim 21, these claims are similar to the independent claims except that they recite the further additional elements of a cloud processing environment and a thin client device. These additional elements are recited at a high level of generality and also do not amount to an improvement in the functioning of a computer or any other technology or technical field; apply the judicial exception with, or by use of, a particular machine; or apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort to monopolize the exception. Therefore the dependent claims do not create an integration for the same reasons.
Step 2B is the next step in the eligibility analyses and evaluates whether the claims recite additional elements that amount to an inventive concept (i.e., “significantly more”) than the recited judicial exception. According to Office procedure, revised Step 2A overlaps with Step 2B, and thus, many of the considerations need not be re-evaluated in Step 2B because the answer will be the same.
In Step 2A, several additional elements were identified as additional limitations:
a drive thru terminal comprising a display, a microphone, and a camera
a Point-Of-Sale (POS) terminal interfaced to an order fulfillment system
a server comprising a processor and a non-transitory computer-readable storage medium; the non-transitory computer-readable storage medium comprises processor-executable instructions
images provided by the camera
a machine-learning algorithm that is trained
automated speech responses
an order interface
a network service and network device
a real-time speech-based interface configured to adapt a dialogue flow
Automated natural language-based ordering
The unlocking being automatic
These additional limitations, including the limitations in the dependent claims, do not amount to an inventive concept because they were already analyzed under Step 2A and did not amount to a practical application of the abstract idea. Therefore, the claims lack one or more limitations which amount to an inventive concept in the claims.
For these reasons, the claims are rejected under 35 U.S.C. 101.

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

Claim Rejection – 35 USC § 103

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or non-obviousness.
Claims 2-6 and 8-12 are rejected under 35 U.S.C. 103 as being unpatentable over Coleman et al (US 20190108566 A1), hereinafter Coleman, in view of Carpenter II et al (US 20190171711 A1), hereinafter Carpenter, and further in view of Kelly et al (US 20180253805 A1), hereinafter Kelly.

Examiner Note: Coleman acronyms: DTOA = Drive-Thru Ordering Area; AOS = Automated Ordering System; RIS = Restaurant Information System; NLU/NLG = Natural Language Understanding/Generation; MOP = Manual Order Process

Regarding claim 2, Coleman discloses a method, comprising: providing executable instructions to a processor of a device from a non-transitory computer-readable storage medium causing the processor to perform operations (Coleman: [0016]), comprising:	
detecting a user in a vehicle at a drive thru terminal by processing images captured by a camera using a machine-learning algorithm that is trained on other images to determine when the vehicle is present at the drive thru terminal, by events triggered by a sensor [detector], or by receiving a vocal request from the user through a microphone (Coleman: “At 305, an identification of a vehicle present in an ordering area of a first entity can be made. The vehicle may be associated with a customer, such as an individual customer planning to interact with an ordering system.” [0082] – “a customer 6 drives their car to a particular DTOA 1, … Upon arrival, at least one detector 7 can sense the customer's presence at the particular DTOA 1.” [0062] – “The detectors 7 may be any device or sensor operable to sense or otherwise detect a customer's presence within the DTOA” [0028]);
greeting the user by playing an initial greeting over a speaker and displaying corresponding text presented on a display of the drive thru terminal based on the detecting (Coleman: “At least one speaker 9 is used to produce audible messages to customers, including greetings upon arrival and interactions during and after the ordering interactions are performed.” [0030] – “the detector ' s signal can be provided to the AOS … and cause the AOS 2 to initialize a new order session at 207. …At 208, …The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9 …generating responses to the customer through the NLG and speaker 9 and/or through the digital board 10.” [0069-0070]);
assigning a unique identifier for a transaction with the user from a transaction manager of a point-of-sale (POS) terminal associated with the drive thru terminal (Coleman: “The RIS 3 can include at least one computer-based and/or software-based point-of-sale (POS) system that allows for the manual entry of orders, executes and records transactions, … The RIS 3 also may include a customer loyalty or rewards system that tracks transactions with specific customers” [0035] – “a determination can be made, automatically and without user input, whether to initiate the interaction with the customer in an automated interaction mode … The initial determination can be based on a current context of the customer and/or the current context of the first entity. The current context of the customer may include or be based on the identification of the customer using any suitable analysis, including facial recognition …, a vehicle license plate analysis and lookup …, a method of customer identity input within the ordering area (e.g., a loyalty card or account identification or presentation, etc.)” [0083-0084] – “The ongoing … interactions and transactions may be used in the determination, including a relative volume of transactions with customers in the DTOA 1 … a current number of customers (or expected customers) in line to enter a DTOA 1, entering the DTOA 1, … the number of DTOAs 1 currently in use, … and a current number of transactions being performed” [0085] – It is recognized that either the user identity/account or an indication of the session necessary to determine “the number of DTOAs currently in use” may constitute an identifier of the session, which ends in an order transaction; [0022] further supports historical transactions being associated with a customer’s identity.);
engaging the user in a natural language dialogue to start a session for the transaction with the user using a predefined set of vocabulary terms of a lexicon for automated speech recognition that is specific to a menu and menu options associated with the drive thru terminal and using the unique identifier obtained from the transaction manager, wherein engaging further includes translating speech provided by the user into text commands associated with an order interface using the lexicon, wherein the lexicon is associated with a type of restaurant (Coleman: “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9. The AOS 2 then processes the customer's voice response via the microphone 8 and the NLU. The ordering and conversation algorithms determine the AOS's next action based on the customer's response and continues to interact with the customer 6 through the ordering and interaction process. The AOS 2 will continue to interact with the customer 6 by processing the voice input from the customer 6 through the NLU, executing one or more actions by the ordering and conversation algorithms, and generating responses to the customer through the NLG and speaker 9 and/or through the digital board 10. The AOS 2 may process many rounds of interactions with the customer 6 to complete an order” [0070] – “The AOS 2 may personalize the interaction in a variety of ways, such as suggesting specific menu items to the customer” [0079] – It is recognized in light of at least [0084-0085] that the session is initiated based on a determination of a user identity/account or a number of transactions or customers. – “The data is populated by information regarding or associated with the prior activities of the AOS 2 and a current state of the interaction with the customer. The data also may include data received from the RIS 3, such as the restaurant's menu” [0057] – It is further recognized that the session is customized for the specific type of restaurant insofar as the system provides recommendations using language specific to the restaurant (“specific menu items”).);
configuring the lexicon to be specific to a particular restaurant type and providing the lexicon to a voice-enabled network service as a specialized feature, wherein the voice-enabled network service is a modified version of a consumer-voice service that includes the specialized feature for a specialized lexicon and that issues commands and operates through a voice-enabled network device (Coleman: “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9 . … and continues to interact with the customer 6 through the ordering and interaction process” [0070] – “The AOS 2 may personalize the interaction in a variety of ways , such as suggesting specific menu items to the customer” [0079] - “The data is populated by information regarding or associated with the prior activities of the AOS 2 and a current state of the interaction with the customer. The data also may include data received from the RIS 3, such as the restaurant ' s menu” [0057]– The system’s vocabulary/lexicon is specialized to the restaurant to be able to use language specific to the restaurant (“specific menu items”). – The system operates through a network service/device: [0104-0105]);
providing a real-time speech-based interface for the user to verbally communicate an order, wherein the real-time speech-based interface is configured to adapt a dialogue flow based on the lexicon associated with the type of restaurant to enable order processing (Coleman: “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9 . The AOS 2 then processes the customer's voice response via the microphone 8 and the NLU. The ordering and conversation algorithms determine the AOS's next action based on the customer's response and continues to interact with the customer 6 through the ordering and interaction process. The AOS 2 will continue to interact with the customer 6 by processing the voice input from the customer 6 through the NLU, executing one or more actions by the ordering and conversation algorithms, and generating responses to the customer through the NLG and speaker 9 and / or through the digital board 10. The AOS 2 may process many rounds of interactions with the customer 6 to complete an order” [0070] – “The AOS 2 may personalize the interaction in a variety of ways , such as suggesting specific menu items to the customer” [0079] – With reference to [0094], it is noted that operations of the system occur in real-time, “without any intentional delay, taking into account…time required to…gather…analyze…or transmit data.”);
 receiving the order from the user during the session based on the natural language dialogue (Coleman: “an order is received and processed , the digital board 10 may present the updated items included in the order to provide visual feedback to the customer 6 regarding the interaction .” [0031] – “In an ongoing interaction, the items included in a current order may be used to identify one or more items to recommend or likely items to be requested, as well as particular actions or clarifications to be made.” [0058]); 
providing the order to order fulfillment processing for fulfillment (Coleman: “At 218, the AOS 2 transmits the order information to the RIS 3 so that the order can be fulfilled by the RIS 3” [0080]); 
providing instructions to the user during the natural language dialogue using automated speech processing for the user to provide a payment to complete the transaction, the session, and the natural language dialogue (Coleman: “the AOS 2 determines that the order is complete and can generate a response to the customer 6 with instructions on proceeding to pick up and / or pay for the ordered food and beverages.” [0080] – See also Figure 1 & [0035], which note that electronic payment may be rendered through a POS device of the DTOA.); and
providing automated natural language-based ordering for the user through the drive thru terminal during the session tailored to the type of restaurant associated with the drive thru terminal (Coleman: “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9 . The AOS 2 then processes the customer's voice response via the microphone 8 and the NLU. The ordering and conversation algorithms determine the AOS's next action based on the customer's response and continues to interact with the customer 6 through the ordering and interaction process. The AOS 2 will continue to interact with the customer 6 by processing the voice input from the customer 6 through the NLU, executing one or more actions by the ordering and conversation algorithms, and generating responses to the customer through the NLG and speaker 9 and / or through the digital board 10. The AOS 2 may process many rounds of interactions with the customer 6 to complete an order” [0070] – “The AOS 2 may personalize the interaction in a variety of ways , such as suggesting specific menu items to the customer” [0079] – “The data is populated by information regarding or associated with the prior activities of the AOS 2 and a current state of the interaction with the customer. The data also may include data received from the RIS 3, such as the restaurant ' s menu” [0057] – It is recognized that the session is customized for the specific type of restaurant insofar as the system provides recommendations using language specific to the restaurant (“specific menu items”).), 
wherein the user is instructed to obtain the order once fulfilled from a designated window or a food storage bin (Coleman: “the AOS 2 determines that the order is complete and can generate a response to the customer 6 with instructions on proceeding to pick up … the ordered food and beverages.” [0080] – “the DTOA1 may be …a "pull through” drive-thru (e.g., order is placed at the DTOA 1 and the customer drives to a window or other area to receive the order)” [0027]),
wherein providing automated natural language-based ordering further includes finalizing, at the drive thru terminal, the order for subsequent order fulfillment processing (Coleman: “The AOS 2 may personalize the interaction in a variety of ways , such as suggesting specific menu items to the customer …At 218 , the AOS 2 transmits the order information to the RIS 3 so that the order can be fulfilled” [0080]),
but does not specifically teach that the lexicon includes predefined order-specific words and phrases comprising commands for one or more of: want, like, make, add, order, buy, purchase, cancel, delete, remove, modify, and change; the commands are identified as processing actions that are processed by the transaction manager for a particular order; or that the designated window or food storage bin automatically unlocks when the order is completed.
However, Carpenter teaches an artificially intelligent natural-language drive-through ordering system (Carpenter: Abstract), including that the lexicon includes predefined order-specific words and phrases comprising commands for one or more of: want, like, make, add, order, buy, purchase, cancel, delete, remove, modify, or change (Carpenter: “a highly accurate speech recognition component that is able to be trained to recognize a wide vocabulary of words” [0019] – “The NLP 51 pulls meaning out of the text. In an example, when the text comprises “I want a cheeseburger.” … an instruction set adding one cheeseburger to the order is generated, representing the intent of the order” [0050] – “providing an audio stream of a customer order … converting a word or words in the audio stream to text using the speech recognition module … receives recognized text from the order processor and creates or modifies an order based upon the recognized text … e.g. medium or well-done; mustard or no mustard… hold the onions on the burger or no ice in the drink” [0011] – See also [0037]. Examiner notes that this limitation merely recites non-functional language directed to conveying meaning to a human reader rather than toward a function [MPEP 2111.05]. Accordingly, this limitation is granted little to no patentable weight.);
wherein the commands are identified as processing actions that are processed by the transaction manager for a particular order (Carpenter: “The NLP 51 pulls meaning out of the text. In an example, when the text comprises “I want a cheeseburger.” … an instruction set adding one cheeseburger to the order is generated, representing the intent of the order” [0050]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these references because the results would be predictable. Specifically, Coleman would continue to teach engaging the user in a natural language dialogue to start a session for the transaction with the user using a predefined set of vocabulary terms of a lexicon for automated speech recognition that is specific to a menu and menu options associated with the drive thru terminal, except that now it would also teach that the lexicon includes predefined order-specific words and phrases comprising commands for one or more of: want, like, make, add, order, buy, purchase, cancel, delete, remove, modify, or change; and that the commands are identified as processing actions that are processed by the transaction manager for a particular order, according to the teachings of Carpenter. This is a predictable result of the combination.
In addition, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these references because it would result in an improved ability to streamline order processing, thereby enhancing the speed of the process and customer satisfaction (Carpenter: [0004]).

While Coleman/Carpenter teach that the user is instructed to obtain the order once fulfilled from a designated window or a food storage bin (Coleman: [0080], [0027]), they do not specifically teach that the designated window or food storage bin automatically unlocks when the order is completed.
However, Kelly teaches automated drive-thru techniques (Kelly: Abstract), including that the designated window or food storage bin automatically unlocks when the order is completed (Kelly: “When the food is ready for pick-up, the food conveyance and conditioning pod 940 may open at the front access port 920, and the partition door 982 may move to its open position. This frees the distribution tray to be grasped by a consumer and slid out of the food conveyance and conditioning pod 940.” [0156]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these references because the results would be predictable. Specifically, Coleman/Carpenter would continue to teach the user obtaining the order from the designated window or food storage bin, except that now it would also teach that the designated window or food storage bin automatically unlocks when the order is completed, according to the teachings of Kelly. This is a predictable result of the combination.
In addition, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these references because it would improve the accuracy and effectiveness of fast food restaurant order realization (Kelly: [0141]).

Regarding Claim 3, Coleman/Carpenter/Kelly teach the method of claim 2, wherein detecting further includes identifying a wake-up word or a wake-up phrase in user-provided speech to the microphone (Carpenter: “the artificially intelligent order processing system is configured such that the arrival of a customer 25 at an order station triggers an alert … or the customer 25 may make their presence known by speaking into a microphone 21 . A transaction is initiated when … the artificially intelligent order processing system 22 … alert the customer 25 that he has been recognized by the artificially intelligent order processing system as having arrived at the ordering station” [0033] – It is understood that the customer’s speaking into the microphone constitutes a wake word/phrase insofar as the spoken words “make their presence known,” triggering the system to initiate the transaction dialogue.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Carpenter with Coleman/Kelly for the reasons identified above with respect to claim 2. 

Regarding Claim 4, Coleman/Carpenter/Kelly teach the method of claim 2, wherein detecting further includes detecting a vehicle adjacent to or in front of the drive thru terminal in an image captured by a camera associated with the drive thru terminal (Coleman: “a customer 6 drives their car to a particular DTOA 1 … Upon arrival , at least one detector 7 can sense the customer ' s presence at the particular DTOA 1” [0062] – “the microphone 8 or camera 11 may serve as the detector 7” [0033]).

Regarding Claim 5, Coleman/Carpenter/Kelly teach the method of claim 2, wherein engaging further includes translating user-provided speech during the natural language dialogue into feedback text and presenting the feedback text on the display to confirm with the user the user-provided speech provided as responses during the natural language dialogue (Coleman: “generating responses to the customer through the NLG and speaker 9 and / or through the digital board 10.” [0070]).

Regarding Claim 6, Coleman/Carpenter/Kelly teach the method of claim 5, wherein translating further includes translating automated speech for the automated speech processing generated during the natural language dialogue into automated text and presenting the automated text on the display to ensure the user hears the automated speech and can also read the automated speech during the natural language dialogue (Coleman: “generating responses to the customer through the NLG and speaker 9 and / or through the digital board 10.” [0070]). 
Examiner Note: It is noted that a recitation of the intended use of the claimed invention does not impose any limit on the interpretation of the claim unless such a recitation results in a structural difference between the claimed invention and the prior art in order to patentably distinguish the claimed invention from the prior art. [MPEP 2111.04] In claim 6, the language “to ensure the user hears the automated speech and can also read the automated speech during the dialogue” represents an intended use of the terminal by the user, and as such is afforded little weight. See also MPEP 2103: “statements of intended use ... including statements of purpose” in the claims “raise a question as to its limiting effect.”

Regarding Claim 8, Coleman/Carpenter/Kelly teach the method of claim 2, wherein receiving further includes initiating a remote voice call to an agent when the user-provided speech is unable to be translated for completing the order during the natural language dialogue (Coleman: “At 209 , a determination can be made as to whether the microphone input from the customer 6 is useable , such as whether excessive background noise in the DTOA 1 degrades the performance of the NLU, or whether the customer 6 is unable to provide sufficient inputs for the DTOA 1 to be able to accurately evaluate the inputs. …If the input is not useable, method 200 can move to 214 , where the transaction is re - routed to the MOP 4 . … At 210, a determination is made as to whether the customer 6 is speaking or providing inputs in an unclear manner such that the performance of the NLU is degraded or unusable as a primary source of ordering determinations . If the input is sufficient , method 200 continues at 211, while if not, method 200 continues to 214.” [0072-0073] – “At 214 , if an active ordering process with the AOS 2 is re - routed to MOP 4 , then the human agents are alerted to the re - routed interaction and communicate with the customer 6 through the DTOA 1 equipment and can complete the order in the usual manual manner.” [0077]).

Regarding Claim 9, Coleman/Carpenter/Kelly teach the method of claim 8, wherein initiating further includes receiving order details for the order from the agent (Coleman: “At 214 , if an active ordering process with the AOS 2 is re - routed to MOP 4 , then the human agents are alerted to the re - routed interaction and communicate with the customer 6 through the DTOA 1 equipment and can complete the order in the usual manual manner.” [0077]).

Regarding claim 10, Coleman/Carpenter/Kelly teach the method of claim 2, further comprising receiving the user-provided speech during the natural language dialogue from a mobile device operated by the user (Kelly: “the customer has the ability to input or speak into a kiosk (or into an app provided on a mobile device) a user id and items ordered.” [0097] – “the mobile device 802 may comprise a microphone 810, wherein the microphone 810 and associated circuitry may convert the sound of the environment, including spoken words, into machine-compatible signals.” [0133]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Coleman/Carpenter with Kelly for the reasons identified above with respect to claim 2. 

Regarding Claim 11, Coleman/Carpenter/Kelly teach the method of claim 10, further comprising providing the automated speech generated during the natural language dialogue to a speaker of the mobile device (Kelly: “the mobile device 802 may comprise a microphone … input facilities 814 may include a touchscreen display. Visual feedback 832 to the user may occur through a visual display, touchscreen display, or indicator lights. Audible feedback 834 may be transmitted through a loudspeaker or other audio transducer.” [0133]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Coleman/Carpenter with Kelly for the reasons identified above with respect to claim 2. 

Regarding Claim 12, Coleman/Carpenter/Kelly teach the method of claim 2, further comprising processing the method on a server remotely located from the drive thru terminal or processing the method on the POS terminal located at a same establishment that the drive thru terminal is located (Coleman: “The AOS components operate through software executing, via one or more processors, on computers and/or computing devices located on or at the restaurant site or at one or more remote sites, which may include remote computing environments hosted by third parties, the restaurant, or the AOS vendor.” [0034] – “The system and methods described herein may be associated with a network that facilitates wireless or wireline communications between the components of the environment 100, as well as with any other local or remote computer, such as mobile devices, clients, servers, remotely executed or located portions of a particular component, … one or more of the components may be included within network as one or more cloud-based services or operations.” [0098]).

Claims 13, 15-18, and 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over Coleman, in view of Madden et al (US 10726723 B1), hereinafter Madden, in view of Carpenter, and further in view of Kelly.

Regarding Claim 13, Coleman discloses a method, comprising: providing executable instructions to a processor of a device from a non-transitory computer-readable storage medium causing the processor to perform operations (Coleman: [0016]), comprising:
initiating an initial greeting for a user detected in a vehicle adjacent to a drive thru terminal with a voice prompt generated by speech synthesis, upon detecting the vehicle using visual recognition techniques comprising a machine-learning algorithm [artificial intelligence system] to determine when a vehicle is present at the drive thru terminal (Coleman: “At 305, an identification of a vehicle present in an ordering area of a first entity can be made. The vehicle may be associated with a customer, such as an individual customer planning to interact with an ordering system.” [0082] – “a customer 6 drives their car to a particular DTOA 1, … Upon arrival , at least one detector 7 can sense the customer's presence at the particular DTOA 1 .” [0062] – “At least one speaker 9 is used to produce audible messages to customers, including greetings upon arrival and interactions during and after the ordering interactions are performed.” [0030] – “At least one camera 11 can be operable to monitor and capture actions at the DTOA 1 , including detecting a new customer arriving at the DTOA and/or to capture the customer's license plate number, facial features, or other images for purposes of uniquely identifying the particular customer” [0032] – “the customer 6 may be identified using an artificial intelligence system operable to process an image captured by the camera 11 of the customer's license plate or vehicle, … or an image captured by the camera 11 of the customer's face.” [0078]– “the detector ' s signal can be provided to the AOS … and cause the AOS 2 to initialize a new order session at 207. …At 208, …The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9 …generating responses to the customer through the NLG and speaker 9 and/or through the digital board 10.” [0069-0070] - See also Figure 1, which illustrates the vehicle being adjacent to the terminal.); 
obtaining a transaction identifier for a transaction from a point-of-sale (POS) terminal based on the initial greeting being spoken automatically (Coleman: “The RIS 3 can include at least one computer-based and/or software-based point-of-sale (POS) system that allows for the manual entry of orders, executes and records transactions, … The RIS 3 also may include a customer loyalty or rewards system that tracks transactions with specific customers” [0035] – “a determination can be made, automatically and without user input, whether to initiate the interaction with the customer in an automated interaction mode … The initial determination can be based on a current context of the customer and/or the current context of the first entity. The current context of the customer may include or be based on the identification of the customer using any suitable analysis, including facial recognition …, a vehicle license plate analysis and lookup …, a method of customer identity input within the ordering area (e.g., a loyalty card or account identification or presentation, etc.)” [0083-0084] – “The ongoing … interactions and transactions may be used in the determination, including a relative volume of transactions with customers in the DTOA 1 …, a current number of customers (or expected customers) in line to enter a DTOA 1, entering the DTOA 1, … the number of DTOAs 1 currently in use, … and a current number of transactions being performed” [0085] – It is understood, without further clarifying language, that either the user identity/account or an indication of the session necessary to determine “the number of DTOAs currently in use” may constitute an identifier of the session, which ends in an order transaction; [0022] further supports historical transactions being associated with a customer’s identity.);
configuring speech based on a specific lexicon of words associated with a menu and menu items available from the drive thru terminal (Coleman: “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9 . The AOS 2 then processes the customer's voice response via the microphone 8 and the NLU. The ordering and conversation algorithms determine the AOS's next action based on the customer's response and continues to interact with the customer 6 through the ordering and interaction process. The AOS 2 will continue to interact with the customer 6 by processing the voice input from the customer 6 through the NLU, executing one or more actions by the ordering and conversation algorithms, and generating responses to the customer through the NLG and speaker 9 and / or through the digital board 10. The AOS 2 may process many rounds of interactions with the customer 6 to complete an order” [0070] – “The AOS 2 may personalize the interaction in a variety of ways , such as suggesting specific menu items to the customer” [0079] – “The data is populated by information regarding or associated with the prior activities of the AOS 2 and a current state of the interaction with the customer. The data also may include data received from the RIS 3, such as the restaurant ' s menu” [0057] – Examiner notes that the NLG comprises “a set of ordering and conversation algorithms” [0034]); 
configuring the specific lexicon to be specific to a particular restaurant type and providing the specific lexicon to a voice-enabled network service as a specialized feature, wherein the voice-enabled network service is a modified version of a consumer-voice service that includes the specialized feature for a specialized lexicon and that issues commands and operates through a voice-enabled network device (Coleman: “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9. … and continues to interact with the customer 6 through the ordering and interaction process” [0070] – “The AOS 2 may personalize the interaction in a variety of ways, such as suggesting specific menu items to the customer” [0079] – “The data is populated by information regarding or associated with the prior activities of the AOS 2 and a current state of the interaction with the customer. The data also may include data received from the RIS 3, such as the restaurant's menu” [0057] – The system’s vocabulary/lexicon is specialized to the restaurant to be able to use language specific to the restaurant (“specific menu items”). – The system operates through a network service/device: [0104-0105]);
initiating a session with the user using the transaction identifier (Coleman: “The current context of the customer may include or be based on the identification of the customer using … a method of customer identity input within the ordering area (e.g., a loyalty card or account identification … Depending on an analysis of the initial customer context, a determination can be made whether to initiate an automated or manual ordering process.” [0084] – “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9.” [0070] – It is understood that the identity or account may constitute an identifier to be associated with transactions. [0022] further supports historical transactions being associated with a customer’s identity.);
engaging the user during the session in a natural language dialogue comprising the user-provided speech and automated generated speech to receive order details from the user for an order the user-provided speech and automated generated speech using the automated speech processing, wherein engaging further includes translating speech provided by the user into text commands associated with an order interface using the specific lexicon of words, wherein the specific lexicon of words is associated with a type of restaurant (Coleman: “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9. The AOS 2 then processes the customer's voice response via the microphone 8 and the NLU. The ordering and conversation algorithms determine the AOS's next action based on the customer's response and continues to interact with the customer 6 through the ordering and interaction process. The AOS 2 will continue to interact with the customer 6 by processing the voice input from the customer 6 through the NLU, executing one or more actions by the ordering and conversation algorithms, and generating responses to the customer through the NLG and speaker 9 and/or through the digital board 10. The AOS 2 may process many rounds of interactions with the customer 6 to complete an order” [0070] – “The AOS 2 may personalize the interaction in a variety of ways, such as suggesting specific menu items to the customer” [0079] – It is recognized in light of at least [0084-0085] that the session is initiated based on a determination of a user identity/account or a number of transactions or customers. – “The data is populated by information regarding or associated with the prior activities of the AOS 2 and a current state of the interaction with the customer. The data also may include data received from the RIS 3, such as the restaurant's menu” [0057] – It is further recognized that the session is customized for the specific type of restaurant insofar as the system provides recommendations using language specific to the restaurant (“specific menu items”).);
instructing the user during the natural language dialogue on where and how a payment for the order can be supplied to complete the order (Coleman: “the AOS 2 determines that the order is complete and can generate a response to the customer 6 with instructions on proceeding to pick up and / or pay for the ordered food and beverages.” [0080] – See also Figure 1 & [0035], which note that electronic payment may be rendered through a POS device of the DTOA.); 
submitting the order details with the transaction identifier to the POS terminal for order fulfillment and completion of the transaction (Coleman: “At 218, the AOS 2 transmits the order information to the RIS 3 so that the order can be fulfilled by the RIS 3” [0080]); and
providing a real-time speech-based interface for the user to verbally communicate the order through the drive thru terminal during the session, wherein the real-time speech-based interface is configured to adapt a dialogue flow based on the specific lexicon of words associated with the type of restaurant to enable order processing (Coleman: “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9 . The AOS 2 then processes the customer's voice response via the microphone 8 and the NLU. The ordering and conversation algorithms determine the AOS's next action based on the customer's response and continues to interact with the customer 6 through the ordering and interaction process. The AOS 2 will continue to interact with the customer 6 by processing the voice input from the customer 6 through the NLU, executing one or more actions by the ordering and conversation algorithms, and generating responses to the customer through the NLG and speaker 9 and / or through the digital board 10. The AOS 2 may process many rounds of interactions with the customer 6 to complete an order” [0070] – “The AOS 2 may personalize the interaction in a variety of ways , such as suggesting specific menu items to the customer” [0079] – With reference to [0094], it is noted that operations of the system occur in real-time, “without any intentional delay, taking into account…time required to…gather…analyze…or transmit data.”);
providing automated natural language-based ordering for the user through the drive thru terminal during the session tailored to the type of restaurant associated with the drive thru terminal (Coleman: “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9 . The AOS 2 then processes the customer's voice response via the microphone 8 and the NLU. The ordering and conversation algorithms determine the AOS's next action based on the customer's response and continues to interact with the customer 6 through the ordering and interaction process. The AOS 2 will continue to interact with the customer 6 by processing the voice input from the customer 6 through the NLU, executing one or more actions by the ordering and conversation algorithms, and generating responses to the customer through the NLG and speaker 9 and / or through the digital board 10. The AOS 2 may process many rounds of interactions with the customer 6 to complete an order” [0070] – “The AOS 2 may personalize the interaction in a variety of ways , such as suggesting specific menu items to the customer” [0079] – “The data is populated by information regarding or associated with the prior activities of the AOS 2 and a current state of the interaction with the customer. The data also may include data received from the RIS 3, such as the restaurant ' s menu” [0057] – It is recognized that the session is customized for the specific type of restaurant insofar as the system provides recommendations using language specific to the restaurant (“specific menu items”).),
wherein the user is instructed to obtain the order once fulfilled from a designated window or a food storage bin (Coleman: “the AOS 2 determines that the order is complete and can generate a response to the customer 6 with instructions on proceeding to pick up … the ordered food and beverages.” [0080] – “the DTOA1 may be …a "pull through” drive-thru (e.g., order is placed at the DTOA 1 and the customer drives to a window or other area to receive the order)” [0027]), 
wherein providing natural language-based ordering further includes finalizing, at the drive thru terminal, the order for subsequent order fulfillment processing (Coleman: “The AOS 2 may personalize the interaction in a variety of ways , such as suggesting specific menu items to the customer …At 218 , the AOS 2 transmits the order information to the RIS 3 so that the order can be fulfilled” [0080]).
However, Coleman does not specifically teach that the machine-learning algorithm is trained on images; that the specific lexicon of words includes nouns identifying menu items, adjectives defining characteristics of menu items, prepositions for including or not including items with menu items, and exclamations for confirming an order; or that the designated window or food storage bin automatically unlocks when the order is completed.

However, Madden teaches systems for detecting vehicles and customers (Madden: Abstract), including that the machine-learning algorithm is trained on images (Madden: “deep neural networks can be used to train the one or more recognition models to detect and classify tracked objects (e.g., vehicles, customers,” Col. 8, lines 5-15 – “the server 108 may collect imaging data from the distributed camera system 102 and use trained deep neural networks to detect and classify the vehicle in the collected imaging data” Col. 13, lines 15-25 – See also Col. 3, lines 15-25; Col. 10, lines 60-65; Col. 15, lines 5-15, which note that the system detects arrival of vehicles and customers, such as for food pickups & at restaurants.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these references because the results would be predictable. Specifically, Coleman would continue to teach initiating an initial greeting for a user detected in a vehicle adjacent to a drive thru terminal with a voice prompt generated by speech synthesis, upon detecting the vehicle using visual recognition techniques comprising a machine-learning algorithm to determine when a vehicle is present at the drive thru terminal, except that now it would also teach that the machine-learning algorithm is trained on images, according to the teachings of Madden. This is a predictable result of the combination.
In addition, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these references because it would result in an improved ability to detect vehicles and/or customers (Madden: Col. 2, lines 50-55).

However, Coleman/Madden do not specifically teach that the specific lexicon of words includes nouns identifying menu items, adjectives defining characteristics of menu items, prepositions for including or not including items with menu items, and exclamations for confirming an order; or that the designated window or food storage bin automatically unlocks when the order is completed.


However, Carpenter teaches an artificially intelligent natural-language drive-through ordering system (Carpenter: Abstract), including that the specific lexicon of words includes nouns identifying menu items, adjectives defining characteristics of menu items, prepositions for including or not including items with menu items, and exclamations for confirming an order (Carpenter: “a highly accurate speech recognition component that is able to be trained to recognize a wide vocabulary of words” [0019] – “The NLP 51 pulls meaning out of the text . In an example , when the text comprises “ I want a cheeseburger." … an instruction set adding one cheeseburger to the order is generated, representing the intent of the order” [0050] – “providing an audio stream of a customer order … converting a word or words in the audio stream to text using the speech recognition module … receives recognized text from the order processor and creates or modifies an order based upon the recognized text … e.g. medium or well-done; mustard or no mustard… hold the onions on the burger or no ice in the drink” [0011] – See also [0037] & [0051].).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these references because the results would be predictable. Specifically, Coleman/Madden would continue to teach that the real-time speech-based interface is configured to adapt a dialogue flow based on the specific lexicon of words associated with the type of restaurant to enable order processing, except that now it would also teach that the specific lexicon of words includes nouns identifying menu items, adjectives defining characteristics of menu items, prepositions for including or not including items with menu items, and exclamations for confirming an order, according to the teachings of Carpenter. This is a predictable result of the combination.
In addition, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these references because it would result in an improved ability to streamline order processing, thereby enhancing the speed of the process and customer satisfaction (Carpenter: [0004]).
While Coleman/Madden/Carpenter teach that the user is instructed to obtain the order once fulfilled from a designated window or a food storage bin (Coleman: [0080], [0027]), they do not specifically teach that the designated window or food storage bin automatically unlocks when the order is completed.
However, Kelly teaches automated drive-thru techniques (Kelly: Abstract), including that the designated window or food storage bin automatically unlocks when the order is completed (Kelly: “When the food is ready for pick-up, the food conveyance and conditioning pod 940 may open at the front access port 920, and the partition door 982 may move to its open position. This frees the distribution tray to be grasped by a consumer and slid out of the food conveyance and conditioning pod 940.” [0156]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these references because the results would be predictable. Specifically, Coleman/Madden/Carpenter would continue to teach the user obtaining the order from a designated window or food storage bin, except that now it would also teach that the designated window or food storage bin automatically unlocks when the order is completed, according to the teachings of Kelly. This is a predictable result of the combination.
In addition, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these references because it would result in improved accuracy and effectiveness of fast food restaurant order realization (Kelly: [0141]).

Regarding Claim 15, Coleman/Madden/Carpenter/Kelly teach the method of claim 13, wherein engaging further includes translating user-provided speech to user-identified text, translating the automated generated speech to feedback text, and presenting the user-identified text with the feedback text on a display of the drive thru terminal during the session as visual feedback to the user (Coleman: “generating responses to the customer through the NLG and speaker 9 and / or through the digital board 10.” [0070]).

Regarding Claim 16, Coleman/Madden/Carpenter/Kelly teach the method of claim 13, wherein instructing further includes using the automated generated speech to instruct the user to pull a vehicle of the user ahead to a next location where the payment for the order will be collected from the user (Coleman: “the DTOA1 may be … a “pull through” drive-thru ( e . g . , order is placed at the DTOA 1 and the customer drives to a window or other area to receive the order ) without departing from the solution .” [0027] – “the AOS 2 determines that the order is complete and can generate a response to the customer 6 with instructions on proceeding to pick up and / or pay for the ordered food and beverages .” [0080]). 

Regarding Claim 17, Coleman/Madden/Carpenter/Kelly teach the method of claim 13, wherein instructing further includes activating a card reader at the drive thru terminal to receive a payment card for the payment (Kelly: “the user may elect to pay at the kiosk 1105. In some examples the user may provide a credit card, a debit card, cash” [0176] – “The point-of-sale and drive-thru kiosks are responsible for translating the order to the kitchen staff …Both kiosks are responsible for handling payments, … handling traditional means of payment such as insert/slide credit card” [0096]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Coleman/Madden/Carpenter with Kelly for the reasons identified above with respect to claim 13.

Regarding Claim 18, Coleman/Madden/Carpenter/Kelly teach the method of claim 13, wherein instructing further includes instructing the user to pull ahead to a next location to receive items associated with the order after processing the payment (Coleman: “the DTOA1 may be … a “pull through” drive-thru ( e . g . , order is placed at the DTOA 1 and the customer drives to a window or other area to receive the order ) without departing from the solution .” [0027] – “the AOS 2 determines that the order is complete and can generate a response to the customer 6 with instructions on proceeding to pick up and / or pay for the ordered food and beverages .” [0080]).

Regarding Claim 20, Coleman discloses a system, comprising: a drive thru terminal comprising a display, a microphone, and a camera (Coleman: [0028], Figure 1); 
a point-of-sale (POS) terminal interfaced to an order fulfillment system (Coleman: “The RIS 3 can include at least one computer - based and / or software - based point - of sale ( POS ) system that allows for the manual entry of orders, executes and records transactions” [0035]);
a server (Coleman: [0098]) comprising a processor and a non-transitory computer-readable storage medium; the non-transitory computer-readable storage medium comprises executable instructions (Coleman: [0110]); 
the executable instructions when executed by the processor from the non-transitory computer-readable storage medium cause the processor to perform operations comprising:
initiating a voice interaction with a user when a vehicle of the user is detected adjacent to the drive thru terminal based on images provided by the camera  using a machine-learning algorithm [artificial intelligence system] to determine when the vehicle is present at the drive thru terminal and when the user speaks into the microphone (Coleman: “At 305, an identification of a vehicle present in an ordering area of a first entity can be made. The vehicle may be associated with a customer, such as an individual customer planning to interact with an ordering system. … At 310, a determination can be made , automatically and without user input , whether to initiate the interaction with the customer in an automated interaction mode” [0082-0083] – “At least one camera 11 can be operable to monitor and capture actions at the DTOA 1 , including detecting a new customer arriving at the DTOA 1” [0032]  – “the customer 6 may be identified using an artificial intelligence system operable to process an image captured by the camera 11 of the customer's license plate or vehicle, … or an image captured by the camera 11 of the customer's face.” [0078]– “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9 . The AOS 2 then processes the customer's voice response via the microphone 8 and the NLU. … The AOS 2 will continue to interact with the customer 6 by processing the voice input from the customer 6 through the NLU, executing one or more actions by the ordering and conversation algorithms, and generating responses to the customer through the NLG and speaker 9 and / or through the digital board 10.” [0070]); 
obtaining a unique transaction identifier for a transaction with the user during the voice interaction from a transaction manager of the POS terminal (Coleman: “The RIS 3 can include at least one computer - based and / or software - based point - of sale ( POS ) system that allows for the manual entry of orders , executes and records transactions , … The RIS 3 also may include a customer loyalty or rewards system that tracks transactions with specific customers” [0035] – “a determination can be made, automatically and without user input , whether to initiate the interaction with the customer in an automated interaction mode …The initial determination can be based on a current context of the customer and/or the current context of the first entity. The current context of the customer may include or be based on the identification of the customer using any suitable analysis , including facial recognition … , a vehicle license plate analysis and lookup …, a method of customer identity input within the ordering area (e.g., a loyalty card or account identification or presentation, etc.) ” [0083-0084] – “The ongoing … interactions and transactions may be used in the determination, including a relative volume of transactions with customers in the DTOA 1 … , a current number of customers ( or expected customers ) in line to enter a DTOA 1 , entering the DTOA 1 , … the number of DTOAs 1 currently in use, … and a current number of transactions being performed” [0085] – It is understood, without further clarifying language, that either the user identity/account, or an indication of the session necessary to determine “the number of DTOAs currently in use” may constitute an identifier of the session, which ends in an order transaction; [0022] further supports historical transactions being associated with a customer’s identity.);
generating automated speech responses to user-provided speech during the voice interaction, wherein generating further includes translating speech provided by the user into text commands associated with an order interface using a specific vocabulary for a specific lexicon of words, wherein the specific lexicon of words is associated with a type of restaurant (Coleman: “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9 . The AOS 2 then processes the customer's voice response via the microphone 8 and the NLU. The ordering and conversation algorithms determine the AOS's next action based on the customer's response and continues to interact with the customer 6 through the ordering and interaction process. The AOS 2 will continue to interact with the customer 6 by processing the voice input from the customer 6 through the NLU, executing one or more actions by the ordering and conversation algorithms, and generating responses to the customer through the NLG and speaker 9 and / or through the digital board 10. The AOS 2 may process many rounds of interactions with the customer 6 to complete an order” [0070] – “The AOS 2 may personalize the interaction in a variety of ways , such as suggesting specific menu items to the customer” [0079] – It is recognized in light of at least [0084-0085] that the session is initiated based on a determination of a user identity/account or a number of transactions or customers. – “The data is populated by information regarding or associated with the prior activities of the AOS 2 and a current state of the interaction with the customer. The data also may include data received from the RIS 3, such as the restaurant ' s menu” [0057]– It is further recognized that the session is customized for the specific type of restaurant insofar as the system provides recommendations using language specific to the restaurant (“specific menu items”).);
configuring the specific lexicon to be specific to a particular restaurant type and providing the specific lexicon to a voice-enabled network service as specialized feature, wherein a voice- enabled network service is a modified version of a consumer-voice service that includes the specialized feature for a specialized lexicon and that issues commands and operates through a voice- enabled network device (Coleman: “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9 . … and continues to interact with the customer 6 through the ordering and interaction process” [0070] – “The AOS 2 may personalize the interaction in a variety of ways , such as suggesting specific menu items to the customer” [0079] - “The data is populated by information regarding or associated with the prior activities of the AOS 2 and a current state of the interaction with the customer. The data also may include data received from the RIS 3, such as the restaurant ' s menu” [0057]– The system’s vocabulary/lexicon is specialized to the restaurant to be able to use language specific to the restaurant (“specific menu items”). – The system operates through a network service/device: [0104-0105]);
obtaining order details for an order being placed by the user based on the user-provided speech during the voice interaction using the unique transaction identifier for the transaction (Coleman: “The AOS 2 may process many rounds of interactions with the customer 6 to complete an order” [0070] – “At 218, the AOS 2 transmits the order information to the RIS 3 so that the order can be fulfilled by the RIS 3” [0080] – It is recognized in light of at least [0084-0085] that the session is initiated based on a determination of a user identity/account or a number of transactions or customers.); 
providing the order details and the unique transaction identifier to the POS terminal for order fulfillment (Coleman: “At 218, the AOS 2 transmits the order information to the RIS 3 so that the order can be fulfilled by the RIS 3” [0080] – It is recognized in light of at least [0084-0085] that the session is initiated based on a determination of a user identity/account or a number of transactions or customers.); 
instructing the user on how and where to provide a payment to pay for the order and complete the order (Coleman: “the AOS 2 determines that the order is complete and can generate a response to the customer 6 with instructions on proceeding to pick up and / or pay for the ordered food and beverages.” [0080] – See also Figure 1 & [0035], which note that electronic payment may be rendered through a POS device of the DTOA.); and 
providing a real-time speech-based interface for the user through the drive thru terminal during the voice interaction, wherein the real-time speech-based interface is configured to adapt a dialogue flow based on the specific lexicon of words associated with a particular type of restaurant, to enable order processing (Coleman: “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9 . The AOS 2 then processes the customer's voice response via the microphone 8 and the NLU. The ordering and conversation algorithms determine the AOS's next action based on the customer's response and continues to interact with the customer 6 through the ordering and interaction process. The AOS 2 will continue to interact with the customer 6 by processing the voice input from the customer 6 through the NLU, executing one or more actions by the ordering and conversation algorithms, and generating responses to the customer through the NLG and speaker 9 and / or through the digital board 10. The AOS 2 may process many rounds of interactions with the customer 6 to complete an order” [0070] – “The AOS 2 may personalize the interaction in a variety of ways , such as suggesting specific menu items to the customer” [0079] – With reference to [0094], it is noted that operations of the system occur in real-time, “without any intentional delay, taking into account…time required to…gather…analyze…or transmit data.”);
providing automated natural language-based ordering for the user through the drive thru terminal during the voice interaction tailored to the specific type of restaurant associated with the drive thru terminal (Coleman: “The AOS 2 provides an initial voice prompt generated by the NLG to the customer through the speaker 9 . The AOS 2 then processes the customer's voice response via the microphone 8 and the NLU. The ordering and conversation algorithms determine the AOS's next action based on the customer's response and continues to interact with the customer 6 through the ordering and interaction process. The AOS 2 will continue to interact with the customer 6 by processing the voice input from the customer 6 through the NLU, executing one or more actions by the ordering and conversation algorithms, and generating responses to the customer through the NLG and speaker 9 and / or through the digital board 10. The AOS 2 may process many rounds of interactions with the customer 6 to complete an order” [0070] – “The AOS 2 may personalize the interaction in a variety of ways , such as suggesting specific menu items to the customer” [0079] – “The data is populated by information regarding or associated with the prior activities of the AOS 2 and a current state of the interaction with the customer. The data also may include data received from the RIS 3, such as the restaurant ' s menu” [0057] – It is recognized that the session is customized for the specific type of restaurant insofar as the system provides recommendations using language specific to the restaurant (“specific menu items”).), 
wherein the user is instructed to obtain the order once fulfilled from a designated window or a food storage bin (Coleman: “the AOS 2 determines that the order is complete and can generate a response to the customer 6 with instructions on proceeding to pick up … the ordered food and beverages.” [0080] – “the DTOA1 may be …a “pull through” drive-thru (e.g., order is placed at the DTOA 1 and the customer drives to a window or other area to receive the order)” [0027]),
wherein providing automated natural language-based ordering further includes finalizing, at the drive thru terminal, the order for subsequent order fulfillment processing (Coleman: “The AOS 2 may personalize the interaction in a variety of ways , such as suggesting specific menu items to the customer …At 218 , the AOS 2 transmits the order information to the RIS 3 so that the order can be fulfilled” [0080]),
but does not specifically teach that the machine-learning algorithm is trained on images; that the specific lexicon includes a restricted set of words and phrases to identify user order commands, menu items as nouns, adjectives affecting the menu items, prepositions affecting the menu items, and exclamations that confirm or do not confirm a particular user order; that the voice interaction is initiated when the user utters a specific command into the microphone; or that the designated window or food storage bin automatically unlocks when the order is complete.
However, Madden teaches systems for detecting vehicles and customers (Madden: Abstract), including that the machine-learning algorithm is trained on images (Madden: “deep neural networks can be used to train the one or more recognition models to detect and classify tracked objects (e.g., vehicles, customers,” Col. 8, lines 5-15 – “the server 108 may collect imaging data from the distributed camera system 102 and use trained deep neural networks to detect and classify the vehicle in the collected imaging data” Col. 13, lines 15-25 – See also Col. 3, lines 15-25; Col. 10, lines 60-65; Col. 15, lines 5-15, which note that the system detects arrival of vehicles and customers, such as for food pickups & at restaurants.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these references because the results would be predictable. Specifically, Coleman would continue to teach initiating a voice interaction with a user when a vehicle of the user is detected adjacent to the drive thru terminal based on images provided by the camera using a machine-learning algorithm to determine when the vehicle is present at the drive thru terminal and when the user speaks into the microphone, except that now it would also teach that the machine-learning algorithm is trained on images, according to the teachings of Madden. This is a predictable result of the combination.
In addition, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these references because it would result in an improved ability to detect vehicles and/or customers (Madden: Col. 2, lines 50-55).
However, Coleman/Madden do not specifically teach that the specific lexicon includes a restricted set of words and phrases to identify user order commands, menu items as nouns, adjectives affecting the menu items, prepositions affecting the menu items, and exclamations that confirm or do not confirm a particular user order; that the voice interaction is initiated when the user utters a specific command into the microphone; or that the designated window or food storage bin automatically unlocks when the order is complete.
However, Carpenter teaches an artificially intelligent natural-language drive-through ordering system (Carpenter: Abstract), including:
that the specific lexicon includes a restricted set of words and phrases to identify user order commands, menu items as nouns, adjectives affecting the menu items, prepositions affecting the menu items, and exclamations that confirm or do not confirm a particular user order (Carpenter: “a highly accurate speech recognition component that is able to be trained to recognize a wide vocabulary of words” [0019] – “The NLP 51 pulls meaning out of the text . In an example , when the text comprises “ I want a cheeseburger." … an instruction set adding one cheeseburger to the order is generated, representing the intent of the order” [0050] – “providing an audio stream of a customer order … converting a word or words in the audio stream to text using the speech recognition module … receives recognized text from the order processor and creates or modifies an order based upon the recognized text … e.g. medium or well-done; mustard or no mustard… hold the onions on the burger or no ice in the drink” [0011] – See also [0037] & [0051].); and 
automatically initiating a voice-based dialogue with a user when the user utters a specific command into the microphone (Carpenter: “the artificially intelligent order processing system is configured such that the arrival of a customer 25 at an order station triggers an alert … or the customer 25 may make their presence known by speaking into a microphone 21 . A transaction is initiated when … the artificially intelligent order processing system 22 … alert the customer 25 that he has been recognized by the artificially intelligent order processing system as having arrived at the ordering station” [0033]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these references because the results would be predictable. Specifically, Coleman/Madden would continue to teach automatically initiating a voice-based dialogue with a user when a vehicle of the user is detected adjacent to the drive thru terminal based on images provided by the camera using a machine-learning algorithm that is trained on images to determine when the vehicle is present at the drive thru terminal and when the user speaks into the microphone, except that now it would also teach that the specific lexicon includes a restricted set of words and phrases to identify user order commands, menu items as nouns, adjectives affecting the menu items, prepositions affecting the menu items, and exclamations that confirm or do not confirm a particular user order; and automatically initiating a voice-based dialogue with a user when the user utters a specific command into the microphone, according to the teachings of Carpenter. This is a predictable result of the combination.
In addition, it would have been obvious to one of ordinary skill in the art before the effective filing date of invention to combine these references because it would result in an improved ability to streamline order processing, thereby enhancing the speed of the process and customer satisfaction (Carpenter: [0004]).
Coleman/Madden/Carpenter teach that the user is instructed to obtain the order once fulfilled from a designated window or a food storage bin (Coleman: [0080], [0027]), but do not specifically teach that the designated window or food storage bin automatically unlocks when the order is completed.

However, Kelly teaches automated drive-thru techniques (Kelly: Abstract), including that the designated window or food storage bin automatically unlocks when the order is completed (Kelly: “When the food is ready for pick-up, the food conveyance and conditioning pod 940 may open at the front access port 920, and the partition door 982 may move to its open position. This frees the distribution tray to be grasped by a consumer and slid out of the food conveyance and conditioning pod 940.” [0156]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of invention to combine these references because the results would be predictable. Specifically, Coleman/Madden/Carpenter would continue to teach the user obtaining the order from a designated window or food storage bin, except that now it would also teach that the designated window or food storage bin automatically unlocks when the order is completed, according to the teachings of Kelly. This is a predictable result of the combination.
In addition, it would have been obvious to one of ordinary skill in the art before the effective filing date of invention to combine these references because it would result in an improved ability for accuracy and effectiveness of fast food restaurant order realization (Kelly: [0141]).

Regarding Claim 21, Coleman/Madden/Carpenter/Kelly teach the system of claim 20, wherein the server is a cloud processing environment and the drive thru terminal is a thin client device (Coleman: “one or more of the components may be included within network as one or more cloud-based services or operations. The network may be all or a portion of an enterprise or secured network, while in another instance, at least a portion of the network may represent a connection to the Internet.” [0098] – With reference to [0035], [0038], and Figure 1, it is understood that the DTOA terminal operates as a thin client whereas the physically remote AOS, where ordering and NL steps are performed, may operate as a thick client for the system.).

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over Coleman/Carpenter/Kelly, in further view of Suthar (US 20040158494 A1), hereinafter Suthar.

Regarding claim 7, Coleman/Carpenter/Kelly teach the method of claim 2, but do not specifically teach that receiving further includes obtaining menu item images for menu items as spoken by the user during the natural language dialogue and presenting the menu item images on the display as visual feedback to the user of a particular menu item ordered with the order by the user.
However, Suthar teaches an automated restaurant system with drive-thru functionality (Suthar: Abstract, [0053]), including obtaining menu item images as spoken by the user during the natural language dialogue and presenting the menu item images on the display as visual feedback to the user of a particular menu item ordered with the order by the user (Suthar: “the AOS and E-Menu provide rich content information on the meals (photos, ingredients, preparation Video clips, nutrition, specials, etc.) in a dynamic format, it makes it easy for customers to be more educated about what they are ordering. For example, photos provide a clear depiction of what the meal will look like prior to ordering.” [0021] – See also Figure 34.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of invention to combine these references because the results would be predictable. Specifically, Coleman/Carpenter/Kelly would continue to teach receiving an order from the user, except that now it would also teach obtaining menu item images as spoken by the user during the natural language dialogue and presenting the menu item images on the display as visual feedback to the user of a particular menu item ordered with the order by the user, according to the teachings of Suthar. This is a predictable result of the combination.
In addition, it would have been obvious to one of ordinary skill in the art before the effective filing date of invention to combine these references because it would result in an improvement to the efficiencies in restaurant operations (Suthar: [0006]).

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Coleman/Madden/Carpenter/Kelly, in further view of Suthar (US 20040158494 A1), hereinafter Suthar.

Regarding Claim 14, Coleman/Madden/Carpenter/Kelly teach the method of claim 13, but do not specifically teach that engaging further includes obtaining menu item images for certain menu items as spoken by the user during the natural language dialogue from the POS terminal and displaying the menu item images as visual feedback to the user on a display of the drive thru terminal.
However, Suthar teaches an automated restaurant system with drive-thru functionality (Suthar: Abstract, [0053]), including obtaining menu item images as spoken by the user during the natural language dialogue from the POS terminal and displaying the menu item images as visual feedback to the user on a display of the drive thru terminal (Suthar: “the AOS and E-Menu provide rich content information on the meals (photos, ingredients, preparation Video clips, nutrition, specials, etc.) in a dynamic format, it makes it easy for customers to be more educated about what they are ordering. For example, photos provide a clear depiction of what the meal will look like prior to ordering.” [0021] – See also Figure 34.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of invention to combine these references because the results would be predictable. Specifically, Coleman/Madden/Carpenter/Kelly would continue to teach engaging the user during the session in a natural language dialogue to receive order details, except that now it would also teach obtaining menu item images as spoken by the user during the natural language dialogue from the POS terminal and displaying the menu item images as visual feedback to the user on a display of the drive thru terminal, according to the teachings of Suthar. This is a predictable result of the combination.
In addition, it would have been obvious to one of ordinary skill in the art before the effective filing date of invention to combine these references because it would result in an improvement to the efficiencies in restaurant operations (Suthar: [0006]).


Response to Arguments
	Applicant's arguments filed 1/27/2026 have been fully considered but are not persuasive.

Claim Rejections – 35 USC § 101
Applicant argues with respect to Step 2A Prong 1 that the claims are not directed to an abstract idea. Applicant makes reference to Desjardins to argue that the claims “provide specific technological improvements to drive-thru terminal systems by implementing a machine-learning algorithm trained to images to automatically detect vehicle presence at drive-thru terminals, using restaurant-specific lexicons constrained to menu items and menu options to substantially improve speech recognition accuracy, integrating specialized voice-enabled network services as enhanced skills for consumer-voice services, and providing real-time speech-based interfaces that adapt dialogue flows based on restaurant specific lexicons.” 
Examiner disagrees. With reference to the rejection above, the claims recite steps that amount to a concept for drive-through ordering. These limitations, under their broadest reasonable interpretation, fall within the “Certain Methods of Organizing Human Activity” grouping of abstract ideas, enumerated in MPEP 2106, in that they recite commercial interactions, e.g. sales activities/behaviors, and managing personal behavior or relationships or interactions between people, e.g., following rules or instructions. This concept includes the argued ability to detect vehicle presence, using restaurant-specific lexicons constrained to menu items and menu options, providing real-time speech dialogue, etc. except for the recitation of computer related additional elements as addressed in subsequent steps. Rather than being directed to any alleged improvement, the claims are directed to this abstract idea. Whereas Desjardins identifies a specific technological problem in the Specification and recites claim limitations providing a specific technological solution to that problem, the pending claims merely invoke the additional elements to provide a general linking to computer technology [MPEP 2106.05(f)], and at best offer the improved speed or efficiency inherent to a general purpose computer [MPEP 2106.05(a)].

Applicant argues with respect to Step 2A Prong 2 that the claims integrate the abstract idea into a practical application “by providing specific improvements to drive-thru terminal technology and restaurant ordering systems.” With further reference to Desjardins, Applicant argues that the claims “similarly integrate technological improvements into practical applications,” arguing that “automated initiation improves drive-thru terminal operations by eliminating the need for manual employee monitoring and enabling immediate customer engagement upon vehicle arrival,” that the “constrained lexicon approach substantially improves speech recognition accuracy compared to general-purpose speech recognition systems,” and that the “specific hardware integration enables the claimed technological improvements and is far removed from abstract ideas.” Applicant also argues that “the claims solve a real world problem in the restaurant industry by automating drive-thru ordering while maintaining or improving accuracy,” which “improves restaurant operations, reduces labor costs, and enhances customer experience.” 
Examiner disagrees. Whereas Desjardins identifies a specific technological problem in the Specification and recites claim limitations providing a specific technological solution to that problem, the pending claims merely invoke the additional elements as mere instructions to apply the abstract idea to a technological environment, providing only a general linking to computer technology [MPEP 2106.05(f)]. The additional elements are recited at a high level of generality, and the alleged improvements either stem solely from the abstract idea, e.g. using a restaurant-specific lexicon or immediately engaging customers for an “enhance[d] customer experience,” or offer only the improved speed or efficiency inherent to a general purpose computer [MPEP 2106.05(a)], e.g. the argued “reduce[d] labor costs” stemming from mere instructions to automate the abstract idea.

Applicant argues with respect to Step 2B that “the additional elements amount to significantly more because they implement unconventional machine learning technology … utilize restaurant-specific lexicons for improved accuracy …provide enhanced consumer-voice services with specialized skills…integrate multiple hardware components in novel ways… [and] provide real-time adaptive dialogue processing.”
Examiner disagrees. Similar to the discussion with respect to Step 2A Prong 2, alleged improvements such as the ability to “utilize restaurant-specific lexicons for improved accuracy” and “provide real-time adaptive dialogue processing” are part of the abstract idea itself, rather than computer-related additional elements, and are at most business improvements rather than technological ones. Additional elements such as the hardware components and “machine learning technology” are recited at a very high level of generality, and provide only a general linking to computer technology, such that they amount to mere instructions to apply the abstract idea to a technological environment [MPEP 2106.05(f)], offering at most only the improved speed or efficiency inherent to a general purpose computer [MPEP 2106.05(a)].

Claim Rejections – 35 USC § 103
Applicant argues with respect to Claim 2 that the claim now recites “detecting a user at a drive thru terminal by processing images captured by a camera using a machine-learning algorithm that is trained on other images to determine when the vehicle is present at the drive thru terminal.” Applicant argues that “none of the cited references…disclose or suggest using a machine learning algorithm that is trained on images.”
Examiner disagrees – Applicant’s argument does not accurately reflect the claim language, which recites that a user is detected by one of “processing images captured by a camera using a machine-learning algorithm that is trained on other images to determine when the vehicle is present at the drive thru terminal, by events triggered by a sensor, or by receiving a vocal request from the user through a microphone.” With reference to the rejection above, Coleman discloses that, upon arrival at the drive thru, the presence of the customer’s car is detected using a detector [0062], which is “any device or sensor operable to sense or otherwise detect a customer's presence” [0028]. In other words, Coleman discloses detecting a user at a drive thru terminal by events triggered by a sensor, and thus teaches the limitation as actually recited in the claim.

Applicant argues that “there is no motivation articulated in the references to combine their teachings to arrive at the claimed invention.” Applicant argues that the references “address fundamentally different problems and operate in different technical domains.” 
In response to applicant’s argument that there is no teaching, suggestion, or motivation to combine the references, the examiner recognizes that obviousness may be established by combining or modifying the teachings of the prior art to produce the claimed invention where there is some teaching, suggestion, or motivation to do so found either in the references themselves or in the knowledge generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988), In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992), and KSR International Co. v. Teleflex, Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007). In this case, Coleman, Carpenter, and Kelly are closely analogous art, and it would have been obvious to combine Coleman’s teaching of engaging the user in a natural language dialogue to start a session for the transaction with the user using a predefined set of vocabulary terms of a lexicon for automated speech recognition that is specific to a menu and menu options associated with the drive thru terminal with Carpenter’s teaching that the lexicon includes predefined order-specific words and phrases comprising commands for one or more of: want, like, make, add, order, buy, purchase, cancel, delete, remove, modify, or change; and that the commands are identified as processing actions that are processed by the transaction manager for a particular order, because it would result in an improved ability to streamline order processing, thereby enhancing the speed of the process and customer satisfaction (Carpenter: [0004]). 
Similarly, it would have been obvious for Coleman/Carpenter to continue to teach the user obtaining the order from the designated window or food storage bin, except that now it would also teach that the designated window or a food storage bin automatically unlocks when the order is completed, according to the teachings of Kelly, because it would result in an improved ability for accuracy and effectiveness of fast food restaurant order realization (Kelly: [0141]).

Applicant further argues that “the claimed invention provides unexpected results not achievable through the references,” arguing that “immediate, automatic initiation upon vehicle detection provides several unexpected advantage: elimination of employee monitoring…immediate customer engagement…improved accuracy…seamless integration,” and argues that “machine-learning trained on images provides more reliable detection” and that “the trained algorithm automatically initiates the restaurant-specific voice ordering system.”
Examiner respectfully disagrees. With reference to the rejection above, Coleman teaches, in part, real-time detection of user vehicles upon arrival using at least a detector that can sense the customer’s presence [0062], and to engage in natural-language understanding and recognition with the user automatically [0070] including with reference to specific terms related to the particular restaurant, such as menu items [0079]. The cited references teach the functionalities, as claimed, which Applicant is alleging provide improvements. Examiner also notes that the argued “machine learning” is not a required element of Claim 2, with no trained algorithm being required to initiate the claimed natural language processing of the voice ordering system, or to detect the user’s arrival.

Applicant argues with respect to Claim 13 that “none of the references – Coleman, Carpenter, or Kelly – disclose or suggest a machine-learning algorithm trained on images to determine vehicle presence at drive-thru terminals.” Applicant argues that there are 6 “specific technical elements” not disclosed or suggested by any reference previously relied upon: “initiating greeting upon vehicle detection using machine learning trained on images,” “obtaining transaction identifier from POS terminal based on automatic greeting,” “configuring speech based on specific lexicon including nouns, adjectives, prepositions, and exclamation,” “providing specialized lexicon to voice-enabled network service as specialized feature,” “providing real-time speech-based interface adapted to restaurant type,” and “complete natural language dialogue with automated order fulfillment and payment procession.”
Examiner partially disagrees. Coleman teaches a drive thru system in which a customer’s vehicle, upon arrival, is detected by a detector, such as a camera [0062], and identified [0082]. An AI system is used to process the camera image in order to recognize that a customer has arrived [0078]. This causes a new session to be initiated, during which the system engages in natural language processing [0070]. Upon arrival, the system also provides an audible greeting to the customer, along with instructions for engaging with the NLP system [0030]. Coleman further teaches that the system uses a loyalty or rewards system to classify a transaction/order with a specific customer [0035], and ties user identifier data to the newly established interaction/order [0083-0084]. The system tracks the number of active transactions [0085]. After providing an initial voice prompt, the system receives the user’s voice response and processes it using NLP and continues to interact with the user and generates responses [0070], such as suggesting menu items [0079] from the restaurant’s menu [0057] until the order is completed for ordering, under the current context tied to the user and the order [0070, 0084], when a transaction identifier is used [0022]. The system provides its responses using automatically-generated speech [0070]. When the order is complete, the user is instructed how to pick up and pay for their food [0080], such as a payment through the automated drive thru terminal [0035]. This order is submitted to the restaurant for fulfillment [0080], and the user is instructed to pull through to a window to pick up their order [0027, 0080]. 
However, the rejection turns to newly relied-upon reference Madden, which teaches an analogous system for detecting customer arrivals at commercial venues [Abstract], including that machine learning/AI models such as those in Coleman can be trained using images to detect arriving vehicles or people [Col. 8, lines 5-15]. Previously-cited reference Carpenter teaches an analogous AI-driven automated drive-through system [Abstract] which teaches that drive-thru systems using a lexicon as in Coleman may use a particular vocabulary [0019], including menu-specific nouns, characteristics of the menu items, expressions related to inclusion of items, and order confirmation terms [0011, 0037, 0050-0051]. In addition, previously-cited reference Kelly teaches an analogous automated drive-through [Abstract], including, in systems such as the one of Coleman in which food is picked up from a designated bin or window, that the bin or window can unlock “when the food is ready for pick-up” [0156].

Applicant further argues that “the references provide no teaching or suggestion that would lead one of the ordinary skill in the art to combine,” arguing that “this specific approach…is neither disclosed nor suggested by the references.”
In response to applicant’s argument that there is no teaching, suggestion, or motivation to combine the references, the examiner recognizes that obviousness may be established by combining or modifying the teachings of the prior art to produce the claimed invention where there is some teaching, suggestion, or motivation to do so found either in the references themselves or in the knowledge generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988), In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992), and KSR International Co. v. Teleflex, Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007). In this case, Coleman, Madden, Carpenter, and Kelly are closely analogous art, and it would have been obvious to combine Coleman’s teaching of initiating a voice interaction with a user when a vehicle of the user is detected adjacent to the drive thru terminal based on images provided by the camera using a machine-learning algorithm to determine when the vehicle is present at the drive thru terminal and when the user speaks into the microphone with Madden’s teaching that the machine-learning algorithm is trained on images because it would result in an improved ability to detect vehicles and/or customers (Madden: Col. 2, lines 50-55). 
Similarly, it would have been obvious to combine Coleman/Madden’s teaching for engaging the user in a natural language dialogue to start a session for the transaction with the user using a predefined set of vocabulary terms of a lexicon for automated speech recognition that is specific to a menu and menu options associated with the drive thru terminal with Carpenter’s teaching that the lexicon includes predefined order-specific words and phrases comprising commands for one or more of: want, like, make, add, order, buy, purchase, cancel, delete, remove, modify, or change; and that the commands are identified as processing actions that are processed by the transaction manager for a particular order, because it would result in an improved ability to streamline order processing, thereby enhancing the speed of the process and customer satisfaction (Carpenter: [0004]). Similarly, it would have been obvious for Coleman/Madden/Carpenter to continue to teach the user obtaining the order from the designated window or food storage bin, except that now it would also teach that the designated window or a food storage bin automatically unlocks when the order is completed, according to the teachings of Kelly, because it would result in an improved ability for accuracy and effectiveness of fast food restaurant order realization (Kelly: [0141]).

Applicant further argues that the claim results in “unexpected advantages” including seamless customer experience, improved accuracy, scalability, context-awareness, and complete automation.
Examiner respectfully disagrees. With reference to the rejection above, the limitations upon which the alleged advantages are based are all taught by the combination of references applied. For instance, Coleman teaches automatic detection of customers, and context-aware natural language processing for an accurate, fully automated drive thru experience. With respect to Applicant’s arguments regarding “scalability,” it is noted that the claims do not recite implementation via, for example, a Siri system, but the use of consumer-voice services, which without further clarification is understood to encompass the ability to engage with a consumer in a voice-based dialogue, such as in Coleman. In addition, the references, including Coleman, describe a scalable solution capable of handling multiple customers at multiple terminals.

Applicant argues with respect to Claim 20 that “none of the references disclose or suggest this specific technical implementation,” arguing that the following elements are not taught by the previous combination of references: machine-learning-based vehicle detection from camera images, obtaining unique transaction identifiers from POS terminal transaction manager, generating automated speech using restaurant-specific lexicon with restricted vocabulary, providing lexicon as specialized feature for modified consumer-voice service, real-time speech-based interface adapting dialogue flow based on restaurant type, and complete integrated system with order fulfillment, payment processing, and automated food storage bins.
Examiner partially disagrees. Coleman teaches a drive thru system in which a customer’s vehicle, upon arrival, is detected by a detector, such as a camera [0062], and identified [0082]. An AI system is used to process the camera image in order to recognize that a customer has arrived [0078]. This causes a new session to be initiated, during which the system engages in natural language processing [0070]. Upon arrival, the system also provides an audible greeting to the customer, along with instructions for engaging with the NLP system [0030]. Coleman further teaches that the system uses a loyalty or rewards system to classify a transaction/order with a specific customer [0035], and ties user identifier data to the newly established interaction/order [0083-0084]. The system tracks the number of active transactions [0085]. After providing an initial voice prompt, the system receives the user’s voice response and processes it using NLP and continues to interact with the user and generates responses [0070], such as suggesting menu items [0079] from the restaurant’s menu [0057] until the order is completed for ordering, under the current context tied to the user and the order [0070, 0084], when a transaction identifier is used [0022]. The system provides its responses using automatically-generated speech [0070]. When the order is complete, the user is instructed how to pick up and pay for their food [0080], such as a payment through the automated drive thru terminal [0035]. This order is submitted to the restaurant for fulfillment [0080], and the user is instructed to pull through to a window to pick up their order [0027, 0080]. 
However, the rejection turns to newly relied-upon reference Madden, which teaches an analogous system for detecting customer arrivals at commercial venues [Abstract], including that machine learning/AI models such as those in Coleman can be trained using images to detect arriving vehicles or people [Col. 8, lines 5-15]. Previously-cited reference Carpenter teaches an analogous AI-driven automated drive-through system [Abstract] which teaches that drive-thru systems using a lexicon as in Coleman may use a particular vocabulary [0019], including menu-specific nouns, characteristics of the menu items, expressions related to inclusion of items, and order confirmation terms [0011, 0037, 0050-0051]. In addition, previously-cited reference Kelly teaches an analogous automated drive-through [Abstract], including, in systems such as the one of Coleman in which food is picked up from a designated bin or window, that the bin or window can unlock “when the food is ready for pick-up” [0156].

Applicant further argues that the claimed system provides “synergistic technical advantages that are unexpected,” including integration of components that “enables immediate engagement of customers upon arrival,” a “restaurant-specific constrained lexicon [that] substantially improves speech recognition accuracy,” “rapid deployment and scalability” from using “existing voice-enabled network service[s]” such as Apple’s Siri, and advantages of “complete automation.” Applicant argues that “this system architecture is not taught, suggested, or motivated by the references.”
Examiner disagrees. The alleged advantages are based on limitations taught by the cited references. For instance, Coleman teaches immediate detection of customers upon arrival and initiation of voice interaction with them, as well as a restaurant-specific lexicon with knowledge of the specific restaurant’s menu items, such that the alleged improvement to speech recognition accuracy is present in the limitations taught by Coleman. With respect to Applicant’s arguments regarding “scalability,” it is noted that the claims do not recite implementation via, for example, a Siri system, but the use of consumer-voice services, which without further clarification is understood to encompass the ability to engage with a consumer in a voice-based dialogue, such as in Coleman. In addition, the references, including Coleman, describe a scalable solution capable of handling multiple customers at multiple terminals.

Applicant further argues that there is “no motivation to construct the claimed system architecture,” as “each reference addresses different problems,” such that “none of the references would motivation one of ordinary skill to” combine them to teach the claim.
In response to applicant’s argument that there is no teaching, suggestion, or motivation to combine the references, the examiner recognizes that obviousness may be established by combining or modifying the teachings of the prior art to produce the claimed invention where there is some teaching, suggestion, or motivation to do so found either in the references themselves or in the knowledge generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988), In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992), and KSR International Co. v. Teleflex, Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007). In this case, Coleman, Madden, Carpenter, and Kelly are closely analogous art, and it would have been obvious to combine Coleman’s teaching of initiating a voice interaction with a user when a vehicle of the user is detected adjacent to the drive thru terminal based on images provided by the camera using a machine-learning algorithm to determine when the vehicle is present at the drive thru terminal and when the user speaks into the microphone with Madden’s teaching that the machine-learning algorithm is trained on images because it would result in an improved ability to detect vehicles and/or customers (Madden: Col. 2, lines 50-55). 
Similarly, it would have been obvious to combine Coleman/Madden’s teaching for engaging the user in a natural language dialogue to start a session for the transaction with the user using a predefined set of vocabulary terms of a lexicon for automated speech recognition that is specific to a menu and menu options associated with the drive thru terminal with Carpenter’s teaching that the lexicon includes predefined order-specific words and phrases comprising commands for one or more of: want, like, make, add, order, buy, purchase, cancel, delete, remove, modify, or change; and that the commands are identified as processing actions that are processed by the transaction manager for a particular order, because it would result in an improved ability to streamline order processing, thereby enhancing the speed of the process and customer satisfaction (Carpenter: [0004]). Similarly, it would have been obvious for Coleman/Madden/Carpenter to continue to teach the user obtaining the order from the designated window or food storage bin, except that now it would also teach that the designated window or a food storage bin automatically unlocks when the order is completed, according to the teachings of Kelly, because it would result in an improved ability for accuracy and effectiveness of fast food restaurant order realization (Kelly: [0141]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Kalaimani (US 20200226667 A1) teaches systems for in-vehicle proximity-based ordering.
Moon et al (US 20090104920 A1) teaches a system for location-based retail-related services on a mobile device, including the ability to toggle, with an opt-in/out option, between two settings that either allow or prevent the receipt of advertisements from proximate merchants, wherein certain information about the merchants may be received regardless of the selected option.
Minter et al (US 20100287052 A1) teaches short-range messaging that allows a merchant to contact a customer device when within geographic range. 
Hohlfeld et al (WO 2013163333 A2) teaches retail proximity marketing that allows a customer to be detected by a merchant and for contact to be made, along with various privacy considerations to protect the identity of a customer. 
Jung et al (US 20140123306 A1) teaches systems for controlling, at each of two devices, privacy settings that determine whether geographically proximate contacts may detect and interact with each other. 
Platt et al (US 10206089 B2) teaches systems for allowing a user to switch between a private and public mode, wherein a device is detectable by other network devices while in public mode, and is not while in private mode.
Reference U (NPL — see attached) discusses online ordering systems using location-based filters.
Siefken et al (US 20230368274 A1) discusses drive through eateries, including natural language processing and automated techniques based on a specific vocabulary.
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
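As a rough illustration of the date arithmetic in the paragraph above, the following Python sketch computes the two-month, three-month, and six-month marks from a mailing date. The mailing date of April 21, 2026 is taken from the prosecution timeline on this page; the helper names `add_months` and `reply_window` are hypothetical. The sketch assumes plain calendar-month arithmetic and does not model weekend or holiday rollover, extension fees, or the advisory-action exception described above, so it should not be relied on in place of a docketed deadline.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day of month is clamped to the last day of the target month
    (for example, Jan 31 plus one month gives the last day of February).
    """
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def reply_window(mailing_date: date) -> dict:
    """Key dates measured from the mailing date of a final action.

    Assumption: simple month offsets only; no weekend/holiday rollover
    and no adjustment for a late-mailed advisory action.
    """
    return {
        "two_month_mark": add_months(mailing_date, 2),
        "shortened_period_ends": add_months(mailing_date, 3),
        "statutory_maximum": add_months(mailing_date, 6),
    }


# Mailing date assumed from the timeline entry for the current final rejection.
window = reply_window(date(2026, 4, 21))
for label, when in window.items():
    print(f"{label}: {when.isoformat()}")
```

A first reply filed on or before the two-month mark is the case in which the advisory-action provision quoted above can apply.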
Any inquiry concerning this communication or earlier communications from the examiner should be directed to THOMAS JOSEPH SULLIVAN whose telephone number is (571)272-9736.  The examiner can normally be reached on Mon - Fri 8-5 PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Marissa Thein, can be reached on 571-272-6764. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. 
Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/T.J.S./Examiner, Art Unit 3689

/MARISSA THEIN/Supervisory Patent Examiner, Art Unit 3689
Prosecution Timeline

Mar 31, 2025
Final Rejection mailed — §101, §103
Jul 30, 2025
Examiner Interview Summary
Jul 30, 2025
Applicant Interview (Telephonic)
Jul 31, 2025
Request for Continued Examination
Aug 01, 2025
Response after Non-Final Action
Oct 27, 2025
Non-Final Rejection mailed — §101, §103
Jan 27, 2026
Response Filed
Apr 21, 2026
Final Rejection mailed — §101, §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/196,365
Patent 12475505
SYSTEM AND METHOD FOR INTRODUCTION OF A TRANSACTION MECHANISM TO AN E-COMMERCE WEBSITE WITHOUT NECESSITATION OF MULTIPARTY SYSTEMS INTEGRATION
4y 8m to grant Granted Nov 18, 2025
17/365,516
Patent 12475444
SYSTEM AND METHOD FOR INTRODUCTION OF A TRANSACTION MECHANISM TO AN E-COMMERCE WEBSITE WITHOUT NECESSITATION OF MULTIPARTY SYSTEMS INTEGRATION
4y 4m to grant Granted Nov 18, 2025
17/479,310
Patent 12380303
IMAGE PROCESSING APPARATUS, IMAGE PROCESSING SYSTEM, AND METHOD TO PREVENT DUPLICATE ORDER FOR SUPPLIES
3y 10m to grant Granted Aug 05, 2025
15/883,618
Patent 12321977
METHOD AND APPARATUS FOR ONE-TAP MOBILE CHECK-IN
7y 4m to grant Granted Jun 03, 2025
18/118,618
Patent 12260441
VISUAL CABLE BUILDER
2y 0m to grant Granted Mar 25, 2025
Based on the 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 9-10
Grant Probability: 28%
With Interview (+21.9%): 50%
Median Time to Grant: 3y 3m (~0m remaining)
PTA Risk: High
Based on 130 resolved cases by this examiner. Grant probability derived from career allowance rate.
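The grant-probability figures can be reproduced approximately from the examiner statistics reported on this page (37 granted out of 130 resolved, and a +21.9% interview lift). The sketch below assumes the "With Interview" figure is the career allowance rate plus the lift in percentage points; that additive rule is an assumption about how the page derives its number, not something the page states. Variable names are illustrative.

```python
# Examiner statistics as reported on this page.
granted = 37
resolved = 130
interview_lift_points = 21.9  # percentage points

# Career allowance rate as a percentage.
allowance_rate = 100 * granted / resolved

# Assumption: the interview lift is added directly to the allowance rate.
with_interview = allowance_rate + interview_lift_points

print(f"Grant probability: {round(allowance_rate)}%")
print(f"With interview:    {round(with_interview)}%")
```

Under that assumption the rounded values agree with the 28% and 50% shown above.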