Prosecution Insights
Last updated: April 18, 2026
Application No. 18/653,485

SYSTEMS AND METHODS FOR AN ARTIFICIAL INTELLIGENCE AVATAR DRIVING COMPANION

Final Rejection — §101, §102
Filed
May 02, 2024
Examiner
JACKSON, JAKIEDA R
Art Unit
2657
Tech Center
2600 — Communications
Assignee
Toyota Motor North America, Inc.
OA Round
2 (Final)
74%
Grant Probability
Favorable
3-4
OA Rounds
3y 0m
To Grant
89%
With Interview

Examiner Intelligence

Grants 74% — above average
74%
Career Allow Rate
669 granted / 905 resolved
+11.9% vs TC avg
Strong +15% interview lift
+15.4%
Interview Lift
resolved cases with interview
Typical timeline
3y 0m
Avg Prosecution
35 currently pending
Career history
940
Total Applications
across all art units
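The headline figures in this panel follow from simple arithmetic on the career counts. A minimal sketch, under the assumption that the "74%" is granted/resolved and the "+15.4%" interview lift is additive percentage points (the dashboard's actual model is not published):

```python
# Reproduce the examiner panel's headline numbers from the raw counts above.
granted, resolved = 669, 905            # career totals shown in the panel
allow_rate = granted / resolved         # career allow rate

interview_lift = 0.154                  # "+15.4%", read as percentage points
with_interview = allow_rate + interview_lift

print(f"allow rate:     {allow_rate:.1%}")      # 73.9%, displayed as 74%
print(f"with interview: {with_interview:.1%}")  # 89.3%, displayed as 89%
```

The reconstruction matches both displayed tiles, which suggests the "89% With Interview" figure is the career allow rate plus the lift rather than an independently measured cohort rate.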

Statute-Specific Performance

§101
25.8%
-14.2% vs TC avg
§103
42.5%
+2.5% vs TC avg
§102
21.8%
-18.2% vs TC avg
§112
3.5%
-36.5% vs TC avg
Black line = Tech Center average estimate • Based on career data from 905 resolved cases
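Each statute's delta appears to be measured against a single Tech Center baseline (the "black line" in the caption). A small sketch, under that assumption, reconstructs the baseline from each (rate, delta) pair shown above:

```python
# Reconstruct the implied Tech Center baseline from each statute's pair.
stats = {            # statute: (examiner rate %, delta vs TC avg %)
    "§101": (25.8, -14.2),
    "§103": (42.5, +2.5),
    "§102": (21.8, -18.2),
    "§112": (3.5, -36.5),
}
for statute, (rate, delta) in stats.items():
    print(f"{statute}: implied TC average = {rate - delta:.1f}%")
# Every pair implies the same 40.0% baseline, consistent with a single
# Tech Center estimate rather than per-statute averages.
```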

Office Action

§101 §102
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

In response to the Office Action mailed November 11, 2025, applicant submitted an amendment filed on February 12, 2026, in which the applicant amended the claims and requested reconsideration.

Response to Arguments

Applicant argues that the cited prior art fails to teach wherein determining the voice profile of the avatar for the scenario of the plurality of scenarios includes determining a responsiveness level of the occupant of the vehicle to the voice profile of the avatar. It is argued that nowhere in Filev is the word “responsiveness” even mentioned. Applicant pointed to paragraph 0055 of the specification, explaining that it refers to different scenarios, such as an oil change (urgent) versus a possible collision impact (very urgent), which require different levels of responsiveness (speed of response).

As explained in the examiner's interview, the responsiveness level is not clearly defined in the specification or the claims. The term “responsive” is only briefly mentioned in four paragraphs of the specification, and those four paragraphs explain that occupants may be more responsive to certain voices or voice profiles. Even the cited paragraph does not mention the word “speed,” or any correlation between speed and responsiveness; it merely states that something very urgent can be scenario 1. Therefore, the term “responsive” is broadly interpreted. It is noted that paragraph 0026 of Filev, for example, teaches that based on inputs (different scenarios), various outputs are generated for the occupants (the system responds based on data received). Paragraph 0033 of Filev explains various responsive states in detail (e.g., the emotion state of the output is altered based on different scenarios).
Regarding the §101 rejection: the claim, considered as a whole, is directed to the abstract idea of collecting information about a human and selecting or adapting a presentation (a voice profile) based on that information (i.e., personalizing output based on scenario/context and user responsiveness). The claim recites high-level functional steps such as “identifying a scenario,” “determining a voice profile,” “dynamically selecting” a voice profile from a plurality, “providing audio,” and “continually training” selection based on additional data. These activities are properly characterized as mental processes, methods of organizing human activity (personalization and tailoring of communications), and/or abstract data processing (classification and selection), which are identified by the courts as abstract ideas.

The additional elements of the claim (the recitation of a “computing device,” the use of “dynamically selecting,” and “continually training”) are presented at a high level and do not meaningfully limit the claim to a specific, unconventional improvement in computer or vehicle technology. The claim does not recite a particular hardware architecture, specialized data structures, concrete signal-processing steps, defined latency or safety constraints, or a specific machine-learning architecture or training regime that produces a technological improvement. The mere instruction to perform the abstract idea on a generic “computing device,” or to use conventional machine learning (“continually training”), is insufficient to transform the abstract idea into patent-eligible subject matter. The claim does not supply an inventive concept that amounts to significantly more than the judicial exception because the claimed elements are routine, conventional data-processing activities implemented on generic computing hardware.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-5, 7-12 and 14-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. The claims are directed to the abstract idea of customizing speech, as explained in detail below.

The limitations, as drafted, recite a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind but for the recitation of generic computer components. That is, other than reciting “various elements,” nothing in the claims precludes the steps from being performed as mental processes, methods of organizing human activity (personalization and tailoring of communications), and/or abstract data processing. The additional elements of the claim (the recitation of a “computing device,” the use of “dynamically selecting,” and “continually training”) are presented at a high level and do not meaningfully limit the claim to a specific, unconventional improvement in computer or vehicle technology.

For example: identifying a scenario of a plurality of scenarios associated with an occupant of a vehicle (can be done by a user observing the scene in a vehicle with another person); determining a voice profile of an avatar for the scenario of the plurality of scenarios associated with the occupant of the vehicle (can be done by a user determining the voice characteristics of a user, as the specification, p. 0015, teaches that the avatar can be voice); selecting the voice profile of the avatar determined for the scenario of the plurality of scenarios associated with the occupant of the vehicle (can be done by a user determining how the user is speaking and associating a profile), wherein the voice profile of the avatar is selected from a plurality of voice profiles; and providing audio to the occupant of the vehicle using the voice profile of the avatar based upon, at least in part, identifying the scenario of the plurality of scenarios associated with the occupant of the vehicle (can be done by a user outputting a voice fitting for the scene).

The present claim language, under its broadest reasonable interpretation, covers performance of mental processes, methods of organizing human activity, and/or abstract data processing, and recites generic computer components, all of which falls within the grouping of abstract ideas. Accordingly, the claim recites an abstract idea.

This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements at a high level of generality (i.e., as a generic processor performing a generic computer function) such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.

The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
The claims are not patent eligible. The dependent claims recite similar language, such as selecting and generating a voice profile (claims 2-3, 6, 9-10, 13, 16-17 and 20), sampling audio (claims 4, 11 and 18), providing a manual selection (claims 5, 12 and 19) and matching the responsiveness level (claims 7, 14 and 20), all of which is non-statutory.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-5, 7-12 and 14-19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Filev et al. (PGPUB 2008/0269958), hereinafter referenced as Filev.

Regarding claims 1, 8 and 15, Filev discloses a computer-implemented method, medium and system, hereinafter referenced as a method, comprising:

identifying, by a computing device, a scenario of a plurality of scenarios associated with an occupant of a vehicle (the avatar's simulated human emotional state may depend on a variety of different criteria including an estimated emotional state of the occupant, a condition of the vehicle and/or a quality with which the EAS 10 is performing a task, etc. For example, the sensors may detect head movements, speech prosody, biometric information, etc. of the occupant that, when processed by the computer, indicate that the occupant is angry; p. 0032, 0172);

determining a voice profile of an avatar for the scenario of the plurality of scenarios associated with the occupant of the vehicle (in one example response, the EAS may limit or discontinue dialog that it initiates with the occupant while the occupant is angry. In another example response, the avatar may be rendered in blue color tones with a concerned facial expression and ask in a calm voice "Is something bothering you?"; p. 0032-0033);

dynamically selecting the voice profile of the avatar determined for the scenario of the plurality of scenarios associated with the occupant of the vehicle, wherein the voice profile of the avatar is selected from a plurality of voice profiles (during the exchange, the avatar may appear to become frustrated if, for example, the vehicle experiences frequent acceleration and deceleration or otherwise harsh handling. This change in simulated emotion may prompt the occupant to ask "What's wrong?" The avatar may answer "Your driving is hurting my fuel efficiency. You might want to cut down on the frequent acceleration and deceleration." The avatar may also appear to become confused if, for example, the avatar does not understand a command or query from the occupant. This type of dialog may continue with the avatar dynamically altering its simulated emotional state via its appearance, expression, tone of voice, word choice, etc. to convey information to the occupant 12; p. 0033, 0064, 0085);

providing audio to the occupant of the vehicle using the voice profile of the avatar based upon, at least in part, identifying the scenario of the plurality of scenarios associated with the occupant of the vehicle (during the exchange, the avatar may appear to become frustrated if, for example, the vehicle experiences frequent acceleration and deceleration or otherwise harsh handling. This change in simulated emotion may prompt the occupant to ask "What's wrong?" The avatar may answer "Your driving is hurting my fuel efficiency. You might want to cut down on the frequent acceleration and deceleration." The avatar may also appear to become confused if, for example, the avatar does not understand a command or query from the occupant. This type of dialog may continue with the avatar dynamically altering its simulated emotional state via its appearance, expression, tone of voice, word choice, etc. to convey information to the occupant 12; p. 0033, 0124); and

continually training the operation of dynamically selecting the voice profile based on additional data (learning module; p. 0164-0177), wherein determining the voice profile of the avatar for the scenario of the plurality of scenarios includes determining a responsiveness level of the occupant of the vehicle to the voice profile of the avatar (the prosodic analysis module may use multi-parametric speech analysis algorithms to determine the occupant's affective state. For example, the specific features of the speech input, such as speech rate, pitch, pitch change rate, pitch variation, Teager energy operator, intensity, intensity change, articulation, phonology, voice quality, harmonics to noise ratio, or other speech characteristics, are computed. The change in these values compared with baseline values is used as input into a classifier algorithm which determines the emotion on either a continuous scale or as speech categories; p. 0088-0092, 0032-0033).

Regarding claims 2, 9 and 16, Filev discloses a method wherein the voice profile is selected from a voice library (avatar emotion includes weighted vector representations of a set of emotions for the avatar. Emotively tagged text includes marked-up phrases that indicate emotional content associated with certain words of the phrase. The avatar appearance is dynamically altered to express emotion, indicate speech is taking place and/or convey information, etc. The avatar expression is controlled by manipulating specific points on the surface of the avatar; p. 0064, 0085).

Regarding claims 3, 10 and 17, Filev discloses a method further comprising generating the voice profile using a voice model (voice persona; p. 0054, 0090, 0145-0147, 0156).

Regarding claims 4, 11 and 18, Filev discloses a method wherein the voice model is based upon, at least in part, a sampling of audio (detecting inputs generated by the occupant and converting them into digital information; wherein the input includes audible data; p. 0026, 0039).

Regarding claims 5, 12 and 19, Filev discloses a method wherein the voice model is based upon, at least in part, a manual selection of a plurality of voice characteristics (this selective capability of the emotion generator 132 may reflect motivation and intent on the part of the EAS. For example, the EAS may use simulated emotion to convey the urgency of what is being said, to appeal to the occupant's emotions, or to convey the state of the vehicle systems. The EAS 10 may thus determine the appropriate times to display emotions indicative of various inputs. The selective capability discussed above may be implemented through occupant request and/or automatically; p. 0121-0122).

Regarding claims 7 and 14, Filev discloses a method further comprising matching the responsiveness level of the occupant of the vehicle to a level of the scenario of the plurality of scenarios (the learning module may implement a neural network that monitors the condition inputs and attempts to match patterns of conditions with occupant requests. Such a neural network may recognize, for example, that at a particular time of day, a certain driver asks for news about financial markets. As a result, the learning module may generate a task for an agent to gather such news about financial markets in advance of the particular time of day, and generate a task to ask the driver just before that particular time of day if they would like the news about financial markets; p. 0172, 0032).

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a).
Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAKIEDA R JACKSON, whose telephone number is (571) 272-7619. The examiner can normally be reached Mon - Fri 6:30a-2:30p. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Daniel Washburn, can be reached at (571) 272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JAKIEDA R JACKSON/
Primary Examiner, Art Unit 2657
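The reply windows recited in the closing paragraph of the action reduce to date arithmetic from the Apr 01, 2026 mailing date. A quick sketch, illustrating only the stated THREE- and SIX-MONTH periods (not the USPTO's own docketing computation, and not accounting for the advisory-action exception):

```python
# Compute the reply deadlines stated in the final action's form paragraph.
from datetime import date

def add_months(d: date, months: int) -> date:
    """Advance a date by whole months, clamping the day when the
    target month is shorter (e.g. Jan 31 + 1 month -> Feb 28/29)."""
    month_index = d.month - 1 + months
    year, month = d.year + month_index // 12, month_index % 12 + 1
    for day in (d.day, 30, 29, 28):       # clamp to a valid day
        try:
            return date(year, month, day)
        except ValueError:
            continue

mailed = date(2026, 4, 1)                  # mailing date of this final action
print(add_months(mailed, 2))  # 2026-06-01: reply by here to preserve the
                              #   shortened period if an advisory action is late
print(add_months(mailed, 3))  # 2026-07-01: shortened statutory period expires
print(add_months(mailed, 6))  # 2026-10-01: absolute statutory maximum
```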

Prosecution Timeline

May 02, 2024
Application Filed
Nov 06, 2025
Non-Final Rejection — §101, §102
Feb 10, 2026
Applicant Interview (Telephonic)
Feb 12, 2026
Examiner Interview Summary
Feb 12, 2026
Response Filed
Apr 01, 2026
Final Rejection — §101, §102 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603079
PROVIDING A REPOSITORY OF AUDIO FILES HAVING PRONUNCIATIONS FOR TEXT STRINGS TO PROVIDE TO A SPEECH SYNTHESIZER
2y 5m to grant Granted Apr 14, 2026
Patent 12603088
TRAINING A DEVICE SPECIFIC ACOUSTIC MODEL
2y 5m to grant Granted Apr 14, 2026
Patent 12598092
SYSTEMS, METHODS, AND APPARATUS FOR NOTIFYING A TRANSCRIBING AND TRANSLATING SYSTEM OF SWITCHING BETWEEN SPOKEN LANGUAGES
2y 5m to grant Granted Apr 07, 2026
Patent 12597427
CONFIGURABLE NATURAL LANGUAGE OUTPUT
2y 5m to grant Granted Apr 07, 2026
Patent 12597418
AUDIO SIGNAL PROCESSING DEVICE AND METHOD FOR SYNCHRONIZING SPEECH AND TEXT BY USING MACHINE LEARNING MODEL
2y 5m to grant Granted Apr 07, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
74%
Grant Probability
89%
With Interview (+15.4%)
3y 0m
Median Time to Grant
Moderate
PTA Risk
Based on 905 resolved cases by this examiner. Grant probability derived from career allow rate.
