Prosecution Insights
Last updated: April 18, 2026
Application No. 17/460,494

AUTO-ADAPTATION OF AI SYSTEM FROM FIRST ENVIRONMENT TO SECOND ENVIRONMENT

Status: Non-Final OA (§103), Round 5
Filed: Aug 30, 2021
Examiner: KIM, JONATHAN J
Art Unit: 2141
Tech Center: 2100 — Computer Architecture & Software
Assignee: International Business Machines Corporation
Grant Probability: 33% (At Risk)
Expected OA Rounds: 5-6
Time to Grant: 3y 3m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 33% (2 granted / 6 resolved; -21.7% vs TC avg)
Interview Lift: +80.0% (resolved cases with interview vs. without)
Avg Prosecution: 3y 3m
Currently Pending: 30 applications
Total Applications: 36 (across all art units)
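
These figures are simple ratios over the examiner's resolved cases. Below is a minimal sketch of the likely arithmetic; the function name and the with/without-interview definition of "lift" are our assumptions, since the page does not publish its formulas or the per-case interview split behind the +80.0% figure.

granted, resolved = 2, 6
career_allow_rate = granted / resolved  # 0.333... -> displayed as 33%

def interview_lift(allow_with: float, allow_without: float) -> float:
    """Assumed definition: allow-rate gap between resolved cases that had
    an examiner interview and those that did not. The per-case split
    behind the +80.0% figure is not published on this page."""
    return allow_with - allow_without

print(f"career allow rate: {career_allow_rate:.0%}")  # -> 33%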

Statute-Specific Performance

§101: 36.7% (-3.3% vs TC avg)
§103: 38.6% (-1.4% vs TC avg)
§102: 15.9% (-24.1% vs TC avg)
§112: 8.7% (-31.3% vs TC avg)
Deltas are relative to Tech Center average estimates; based on career data from 6 resolved cases.
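Back-solving each row (rate minus delta) yields a Tech Center average estimate of 40.0% for every statute, which suggests a single TC-wide baseline rather than per-statute baselines. A short sketch that reproduces the displayed deltas under that inferred 40% baseline (the baseline is back-solved from the rows, not independently sourced):

# Reproduces the statute-specific deltas above. The 0.400 TC baseline is
# back-solved from the displayed rows (e.g., 36.7% - (-3.3%) = 40.0%).
examiner_rate = {"101": 0.367, "103": 0.386, "102": 0.159, "112": 0.087}
tc_avg_estimate = 0.400

for statute, rate in examiner_rate.items():
    delta = rate - tc_avg_estimate
    print(f"§{statute}: {rate:.1%} ({delta:+.1%} vs TC avg)")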

Office Action

Rejection basis: §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 03/17/2026 has been entered.

The status of the claims is as follows: Claims 1, 8, and 15 are amended; Claims 2, 3, 6, 9, 10, 13, 16, 17, and 20 are cancelled; Claims 21-23 have been added. Claims 1, 4, 5, 7, 8, 11, 12, 14, 15, 18, 19, and 21-23 are currently pending.

Claim Interpretation

Claim 15 recites a computer program product comprising one or more computer readable tangible storage media. The specification provides a specific definition for computer readable storage medium, see [0018]: “A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se”.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 4, 5, and 7; 8, 11, 12, and 14; and 15, 18, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Reicher et al. (US 20210081302 A1, hereinafter “Reicher”) in view of McBain et al. (US 20210248825 A1, hereinafter “McBain”), further in view of Hwang et al. (US 20210065685 A1, hereinafter “Hwang”), further in view of Cohen et al. (US 20200310742 A1, hereinafter “Cohen”), further in view of Katayama et al. (“Situation-Aware Emotion Regulation of Conversational Agents with Kinetic Earables” [2019], hereinafter “Katayama”), further in view of Schmidt et al. (US 20200260208 A1, hereinafter “Schmidt”), and further in view of Jadav et al. (US 20200100063 A1, hereinafter “Jadav”).

Regarding Claim 1, Reicher discloses Receiving a digital twin model associated with each user of a plurality of users (Reicher [0018]; “The test generator 120 may receive information via input 140 (e.g., manually from a user) or may receive information automatically generated by the product requirements generator 110.
The information includes job roles (“user roles” or “type of users”), user software activities (“sequences of user actions”), and objectives” Reicher [0020]; “user software activities are actions taken by a user who is using software (e.g., starting a workflow, adding information, requesting a search, etc.)”, wherein the user software representing activities done by the physical counterpart of the digital user, thus reading on a digital twin model associated with a user, Reicher [0004]; “The test operation sequence is executed to simulate different users having different job roles using the software with the user software activities to achieve the objectives.” wherein different users read on a plurality of users and associated software activity digital models)

wherein the digital twin model associated with each user at least includes a first digital twin model of a … user and a second digital twin model of a … user (Reicher [0004]; “The test operation sequence is executed to simulate different users having different job roles using the software with the user software activities to achieve the objectives.”, wherein the different users simulated through the usage of user software activities (read on as digital models) inherently discloses digital twin models of a first and second user)

wherein the first digital twin model and the second digital twin model are virtual representations of the first user and the second user, respectively (Reicher [0004]; “The test operation sequence is executed to simulate different users having different job roles using the software with the user software activities to achieve the objectives.”, wherein the different users simulated through software are interpreted as software simulated representations of the first and second user thus reading on virtual representations of the first and second user)

Identifying one or more characteristics of the digital twin model (Reicher [0018]; “In certain embodiments, the product requirements generator 110 tracks user software activity (e.g., executing software, entering data into a software user interface, etc.) or processes user software activity logs and classifies the user software activity based on objectives and job roles using Artificial Intelligence (AI) (e.g., using the RN 132).”)

Executing a digital twin simulation of movements and activities for the digital model based on the one or more identified characteristics (Reicher [0028]; “In certain additional embodiments, the test generator 120 automatically executes one or more test operation sequences consisting of simulated user software activities in various sequences that are associated with each objective. In this manner, the test generator 120 simulates an environment in which different users (e.g., users unfamiliar with the software and users familiar with the software) are simultaneously attempting to accomplish various objectives by exploring various sequences of user software activities.”)

wherein the executed digital twin simulation is a virtual simulation (Reicher [0019]; “The computing device 100 is coupled to a data store 150. The data store 150 may store test operation sequences 160 (“test sequences”) and may store performance reports 170. The test generator 120 outputs the test operation sequences 160 and the performance reports 170. The test generator 120 generates test plans and executes tests based on an understanding of objectives per each job role. In certain embodiments, the test operation sequences 160 may be workflows.
In additional embodiments, the test operation sequences may be test scripts that are executed by a computer.” wherein the test operation sequences interpreted as digital twin simulations of movements and activities are software executable scripts simulating real users, thus read on as a virtual simulation)

Creating a set of commands to be asked by the digital twin model in the digital twin simulation (Reicher [0016]; “Embodiments intelligently simulate real users based on the work that the users intend to accomplish (“objectives”, “goals” or “intended accomplishments”) to improve identification of software requirements and completion of automated software testing (“quality testing”). Software may also be referred to as a software application or computer program.”, Reicher [0032]; “In certain embodiments, to generate the list of requirements per job role for each site, the RNN classifies job roles and objectives from labeled user software activity logs generated during live use of the software by the user. Then, the frequently performed objectives for each job role may be designated as the site-specific requirements for that job role.”)

executing an additional digital twin simulation (Reicher [0028]; “In certain additional embodiments, the test generator 120 automatically executes one or more test operation sequences consisting of simulated user software activities in various sequences that are associated with each objective. In this manner, the test generator 120 simulates an environment in which different users (e.g., users unfamiliar with the software and users familiar with the software) are simultaneously attempting to accomplish various objectives by exploring various sequences of user software activities.”, wherein the execution of one or more test operation sequences reads on executing an additional digital twin simulation)

Reicher fails to disclose but McBain discloses a digital twin model of a human user and a second digital twin model of a robotic user. (McBain [0023]; “In addition to replicating physical (e.g., equipment, buildings, etc.) and logical (e.g., computer systems/networks/functions), the elements comprising the virtual replica can include replicas of humans that are performing tasks in the real world. In this way, in addition to performing monitoring and management of replicated twin augmented reality systems according to rules and/or user instruction, the ecosystem can dynamically, and in real-time, track and update based on the actions of replicated humans.” McBain [0059]; “FIG. 6 shows an illustration of a work plan 600 generated to implement a change to the digital twin augmented reality environment, in accordance with one or more embodiments. For example, the system can receive, at 610, a change request indicative of a change to the first and/or second elements. The change request can include user input changing one or more of the respective fields, an automatic update to the same, etc. The system can, at 620, generate an assessment of an effect of the change request on the first and/or second elements.
The assessment can be generated by, for example, propagating the change and its effects through the interconnected elements in the digital environment, a machine learning algorithm and/or business rules and/or digital robots and/or operational rules and/or humans where the change is modelled based on training the machine learning algorithm with different scenarios, or predictive algorithms that can calculate likely outcomes and effects of the proposed change.” wherein the virtual environment mimicking digital twin interactions containing digitally represented robots and humans reads on digital twin models of human and robotic users)

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the software testing via simulated digital twins of Reicher to have the simulated digital twins specifically be of a robotic user and a human user. The motivation to combine is because “technologic advancement … is not done in a coordinated manner leading to both inefficiency in the manner in which new technologies are adopted as well as inefficiency in how tasks are performed and managed after new technologies are adopted” (McBain [0003]), thus the simulation of human and technology interactions will allow for more efficient technological advancement.

The combination of Reicher/McBain fails to disclose but Hwang discloses a computer-based method for automatically adapting an artificial intelligence (AI) from a first environment to a second environment (Hwang Fig. 5a, Fig. 5b),

providing the set of commands to an AI virtual assistant (Hwang [0011]; “method may further include receiving an input sequence input from the user according to the output messages. The method may further include training a voice assistant service model for learning the response operation by using the input sequence. The input sequence may include at least one of a voice input, a key input, a touch input, or a motion input from the user”)

determining whether the AI virtual assistant is able to execute each command of the set of commands (Hwang [0011]; “determining whether a response operation with respect to the voice of the user is performable according to a preset criterion”)

and in response to determining the AI virtual assistant is not able to execute each command, identifying … each user whose command was not able to be executed (Hwang [0313]; “processor may transfer information related to the determination that the response operation with respect to the user voice is not performable to the electronic device”)

recommending one or more corrective actions such that each command of the set of commands is able to be executed (Hwang [0011]; “based on the determining the response operation is not performable, outputting a series of guide messages”)

Reicher/McBain discloses identifying the digital twin model for each user. Reicher does not explicitly teach identifying a user whose command was not able to be executed, but by using the simulated users of Reicher/McBain in the “identifying a user whose command was not able to be executed” method of Hwang, the combination consequently teaches identifying the digital twin model for each user whose command was not able to be executed.

Reicher/McBain discloses the corrective action of executing an additional digital twin simulation.
Reicher does not explicitly teach wherein recommending the one or more corrective actions includes executing an additional digital twin simulation … for each digital twin model whose command was not able to be executed, but by performing the digital twin simulation of Reicher/McBain as corrective action in the “recommending corrective action including identifying the digital twin model for each user whose command was not able to be executed” method of Hwang, the combination consequently teaches recommending the one or more corrective actions includes executing an additional digital twin simulation … for each digital twin model whose command was not able to be executed.

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the software testing via digital twin simulations of Reicher/McBain to conduct Hwang’s virtual assistant training testing using the combination’s digital twins. The motivation to combine is to improve the efficacy of virtual assistant training so that it “may learn the plurality of reference patterns representing the acoustic pattern of the user voice based on the user voice and the text information corresponding to the user voice, and then may improve the accuracy of voice recognition” (Hwang [0176]).

The Reicher/McBain/Hwang combination fails to disclose but Cohen discloses wherein the executed digital twin simulation includes the first [user] walking towards an AI virtual assistant and the second [user] walking away from the AI virtual assistant. (Cohen [0072]; “As an example, a person may be walking back and forth, interacting with voice-based interaction agents. As the person walks, her facial direction may change repeatedly, causing the input volume level, as measured, to repeatedly change” where a person walking back and forth, interacting with voice-based interaction agents reads on a user walking towards or away from an AI virtual assistant, Cohen [0028]; “As another example, many users may be using voice-based interaction agents while watching television (TV) in order to order dinner.” where many users using voice-based interaction agents reads on a first user and a second user interacting with the AI virtual assistant)

wherein providing the set of commands to the AI virtual assistant includes … the first [user] walking towards the AI virtual assistant issuing a first voice command and … the second [user] walking away from the AI virtual assistant issuing the second voice command (Cohen [0072]; “As an example, a person may be walking back and forth, interacting with voice-based interaction agents.
As the person walks, her facial direction may change repeatedly, causing the input volume level, as measured, to repeatedly change” where a person walking back and forth, interacting with voice-based interaction agents reads on a user walking towards or away from a virtual assistant, Cohen [0005]; “One exemplary embodiment of the disclosed subject matter is a method comprising: obtaining a vocal input from a user, wherein the vocal input is part of an interaction between the user and the voice-based interaction agent; determining an interaction context of the interaction between the user and the voice-based interaction agent; determining an output volume level of the voice-based interaction agent based on the interaction context; and providing to the user an output of the voice-based interaction agent, wherein the output comprises a voice-based output having a volume level of the output volume level.” wherein the vocal input interactions between a user and a voice-based interaction agent read on as issuing a voice command, Cohen [0028]; “As another example, many users may be using voice-based interaction agents while watching television (TV) in order to order dinner” where many users using voice-based interaction agents reads on a first user and a second user issuing a first and second voice command)

wherein … a change in sound levels observed by the AI virtual assistant (Cohen Fig. 4, Cohen [0098]; “Column 410 shows lower bounds to input volume ranges, while Column 420, shows upper bounds to the input volume ranges. Column 430 shows possible output volume levels, each of which corresponding to a different input volume level range.” which discloses different sound levels, some louder than others, for the input voice commands, thus reading on a first user’s voice command being a louder sound level than a different second user’s second voice command)

Cohen discloses a first user walking towards the AI virtual assistant and a second user walking away from the AI virtual assistant. Cohen does not explicitly teach a first digital twin model walking towards an AI virtual assistant and a second digital twin model walking away from the AI virtual assistant, but by replacing the human users of Cohen with the simulated users of the combination, the users of Cohen correspond to digital twin models in the Reicher/McBain/Hwang/Cohen combination. Henceforth, the combination does teach the first digital twin model walking towards an AI virtual assistant and the second digital twin model walking away from the AI virtual assistant as well as the first digital twin model walking towards the AI virtual assistant issuing a first voice command and the second digital twin model walking away from the AI virtual assistant issuing a second voice command.

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the digital twin simulation of Reicher/McBain/Hwang to incorporate Cohen’s method of users walking towards and away from an AI virtual assistant while issuing a voice command as simulated activity of the simulated user persona in the context of virtual assistant software testing. A person of ordinary skill in the art would have been motivated to do so because “as the person walks, her facial direction may change repeatedly, causing the input volume level, as measured, to repeatedly change.
By using the disclosed subject matter, small changes in the input volume level (e.g 3 db, 4 db, or the like) may be ignored.” (Cohen [0024])

The combination of Reicher/McBain/Hwang/Cohen fails to explicitly disclose but Schmidt discloses the first [user] generating a sound level for a first voice command consistent with a Doppler effect corresponding to the [user] … second [user] generating a sound level for a second voice command consistent with the Doppler effect corresponding to the [user] (Schmidt [29]; "wearable head device 100 may incorporate one or more microphones 150 configured to detect audio signals generated by the user's voice" Schmidt [47]; "The direct source 514 may be configured to provide an audio signal (step 552 of process 550). The Doppler 516 may receive a signal from the direct source 514 and may be configured to introduce a Doppler effect into its input signal (step 554). For example, the Doppler 516 may change the pitch of the sound source (e.g., pitch shifting) to change relative to the motion of the sound source, the user of the system, or both." wherein the inputted audio signals generated by the user’s voice modified by Doppler effect reads on generating sound levels for voice commands of the user)

wherein the Doppler effect causes a change in sound levels … (Schmidt [47]; "The direct source 514 may be configured to provide an audio signal (step 552 of process 550). The Doppler 516 may receive a signal from the direct source 514 and may be configured to introduce a Doppler effect into its input signal (step 554). For example, the Doppler 516 may change the pitch of the sound source (e.g., pitch shifting) to change relative to the motion of the sound source, the user of the system, or both.")

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to, in response to an inability to perceive the combination’s commands of moving users due to varying pitches of the sound commands, incorporate Schmidt’s ability to enhance auditory commands to accommodate for the Doppler effect and better understand the voice signal. A person of ordinary skill in the art would have been motivated to do so to "introduce representations of the movements of the sound source(s), the user, or both" (Schmidt [96]).

Schmidt discloses a first [user] generating a sound level for a first voice command consistent with a Doppler effect corresponding to the [user] … second [user] generating a sound level for a second voice command consistent with the Doppler effect corresponding to the [user]. Schmidt does not explicitly teach a first digital twin model generating a sound level for a first voice command consistent with a Doppler effect corresponding to the first digital twin model … second digital twin model generating a sound level for a second voice command consistent with the Doppler effect corresponding to the second digital twin model, but by replacing the human users of Schmidt with the simulated users of the combination, the users of Schmidt correspond to digital twin models in the Reicher/McBain/Hwang/Cohen/Schmidt combination. Henceforth, the combination does teach the first digital twin model generating a sound level for a first voice command consistent with a Doppler effect corresponding to the first digital twin model … second digital twin model generating a sound level for a second voice command consistent with the Doppler effect corresponding to the second digital twin model.
Similarly, Schmidt’s disclosure of the Doppler effect causing a change in sound levels is performed upon the combination’s sound levels observed by the AI virtual assistant. Thus, the Reicher/McBain/Hwang/Cohen/Schmidt combination also reads on the Doppler effect causing a change in sound levels observed by the AI virtual assistant.

The combination of Reicher/McBain/Hwang/Cohen/Schmidt does not explicitly disclose but Katayama discloses the sound level for the first voice command and the sound level for the second voice command further being generated based on different tones and textures associated with the first [user] and the second [user] (Katayama [Abstract]; “Conversational agents are increasingly becoming digital partners of our everyday computing experiences offering a variety of purposeful information and utility services. Although rich on competency, these agents are entirely oblivious to their users' situational and emotional context today and incapable of adjusting their interaction style and tone contextually. To this end, we present a mixed-method study that informs the design of a situation- and emotion-aware conversational agent for kinetic earables. We surveyed 280 users, and qualitatively interviewed 12 users to understand their expectation from a conversational agent in adapting the interaction style. Grounded on our findings, we develop a first-of-its-kind emotion regulator for a conversational agent on kinetic earable that dynamically adjusts its conversation style, tone, volume in response to users emotional, environmental, social and activity context gathered through speech prosody, motion signals and ambient sound. We describe these context models, the end-to-end system including a purpose-built kinetic earable and their real-world assessment” wherein the conversational agents able to adjust its conversational volume based on the speech prosody and ambient sound of its associated users thus reads on sound levels for vocal input further being generated based on tones and textures associated with the first and second users)

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to determine the sound volumes for the voice commands of the Reicher/McBain/Hwang/Cohen/Schmidt combination through Katayama’s analyzed tones and textures of the simulated conversational inputs associated with a plurality of users. A person of ordinary skill in the art would have been motivated to do so because "simple adjustments of the interaction style of the agents' responses can increase users' conversational experience with these agents" (Katayama [Section I Paragraph 2]).

Katayama discloses the sound level for the first voice command and the sound level for the second voice command further being generated based on different tones and textures associated with the first [user] and the second [user]. Katayama does not explicitly teach the sound level for the first voice command and the sound level for the second voice command further being generated based on different tones and textures associated with the first digital twin model and the second digital twin model, but by replacing the human users of Katayama with the simulated users of the combination, the users of Katayama correspond to digital twin models in the Reicher/McBain/Hwang/Cohen/Schmidt combination.
Henceforth, the combination does teach the sound level for the first voice command and the sound level for the second voice command further being generated based on different tones and textures associated with the first digital twin model and the second digital twin model.

The combination of Reicher/McBain/Hwang/Cohen/Schmidt/Katayama already discloses wherein executing of the digital twin simulation includes the first digital twin model generating simulated physical movements and voice of the first user operating in the second environment simultaneously with the second digital twin model generating simulated physical movements and voice of the second user operating in the second environment (Cohen [0072]; “As an example, a person may be walking back and forth, interacting with voice-based interaction agents. As the person walks, her facial direction may change repeatedly, causing the input volume level, as measured, to repeatedly change” where a person walking back and forth, interacting with voice-based interaction agents is interpreted as simulating physical movements and voice by digital twin models in a second digital environment, Cohen [0005]; “One exemplary embodiment of the disclosed subject matter is a method comprising: obtaining a vocal input from a user, wherein the vocal input is part of an interaction between the user and the voice-based interaction agent; determining an interaction context of the interaction between the user and the voice-based interaction agent; determining an output volume level of the voice-based interaction agent based on the interaction context; and providing to the user an output of the voice-based interaction agent, wherein the output comprises a voice-based output having a volume level of the output volume level.” wherein the vocal input interactions between a user and a voice-based interaction agent read on as issuing a voice command is interpreted as simulating by a first digital twin model’s voice, Cohen [0028]; “As another example, many users may be using voice-based interaction agents while watching television (TV) in order to order dinner” where many users using voice-based interaction agents reads on simulating through associated mobility patterns of at least first and second digital twin models and their associated physical movements and voices)

The Reicher/McBain/Hwang/Cohen/Schmidt/Katayama combination fails to explicitly disclose but Jadav discloses an alternative capture scenario is simulated … wherein the alternative capture scenario includes adding one or more additional microphones (Jadav [0024]; “As understood herein, “sensing devices” refer to any type of device, preferably a mobile device such as a handheld computer, personal digital assistant, tablet, smartphone, and any equivalents thereof that would be understood by skilled artisans reading this description as capable of receiving signals from location sensors deployed throughout an environment, and processing such signals (e.g. using dedicated application software, a dedicated API, and/or services running on or otherwise provided to/by the sensing device) to determine a location of the location sensor sending the signal.
Preferably, the sensing device also includes a graphical display and/or auditory components such as a microphone and speakers to provide visual and/or auditory information to the user” wherein the sensing device used in an indoor environment reads on adding an additional microphone to the environment, Jadav [0007]; “The program instructions are executable by a sensing device to cause the sensing device to perform a method, including: providing, via the sensing device, an instruction to deploy a first location sensor at a first location within an indoor environment; receiving, at the sensing device, a plurality of signals from the first location sensor while moving away from the first location sensor, wherein each signal is characterized by a signal strength” wherein the deployment of the sensing device’s location sensors throughout an indoor environment for analysis of the environment reads on an alternative capture scenario being simulated)

issuing … a new voice command with the one or more additional microphones added (Jadav [0024]; “Preferably, the sensing device also includes a graphical display and/or auditory components such as a microphone and speakers to provide visual and/or auditory information to the user, as well as receive input from the user, e.g. via a touchscreen or the microphone. The sensing device also preferably serves as a primary interface between the system and the user/client, and facilitates cognitive aspects of the invention, as described in greater detail below.” wherein the sensing device contains one microphone; wherein the user input received through the sensing device reads on a new voice being issued)

Reicher/McBain/Hwang/Cohen/Schmidt/Katayama discloses the corrective action of executing an additional digital twin simulation ... for each digital twin model whose command was not able to be executed. Reicher/McBain/Hwang/Cohen/Schmidt/Katayama does not explicitly teach simulation in which an alternative capture scenario is simulated … wherein the alternative capture scenario includes adding one or more additional microphones in the additional digital twin simulation, but by performing Jadav’s alternative capture scenario in the virtual environment of the Reicher/McBain/Hwang/Cohen/Schmidt/Katayama combination, the combination consequently teaches executing an additional digital twin simulation in which an alternative capture scenario is simulated for each digital twin model whose command was not able to be executed, wherein the alternative capture scenario includes adding one or more additional microphones in the additional digital twin simulation.

Reicher/McBain/Hwang/Cohen/Schmidt/Katayama discloses executing the additional digital twin simulation includes … by each digital twin model whose command was not able to be executed based on the additional digital twin simulation. Reicher/McBain/Hwang/Cohen/Schmidt/Katayama does not explicitly teach issuing … a new voice command with the one or more additional microphones added, but by performing Jadav’s method of issuing voice commands using additional microphones with the inexecutable digital twin models of the Reicher/McBain/Hwang/Cohen/Schmidt/Katayama combination, the combination consequently teaches executing the additional digital twin simulation includes issuing, by each digital twin model whose command was not able to be executed based on the additional digital twin simulation, a new voice command with the one or more additional microphones added.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to perform Jadav’s alternative capture scenario of adding microphones and issuing voice commands in the digital twin simulation of Reicher/McBain/Hwang/Cohen/Schmidt/Katayama. A person of ordinary skill in the art would have been motivated to do so “to provide visual and/or auditory information to the user [digital model], as well as receive input from the user” (Jadav [0024]).

Regarding Claim 4, the Reicher/McBain/Hwang/Cohen/Schmidt/Katayama/Jadav combination of Claim 1 teaches the method of Claim 1 (and thus the rejection of Claim 1 is incorporated) and further teaches the corrective action is selected from a group consisting of adding one or more additional microphones in a surrounding environment, adding one or more sensors in the surrounding environment, and installing one or more additional gauges on a piece of environment in the surrounding environment. (Jadav [0072]; “Provide an instruction to deploy an additional location sensor in response to determining either: (a) the signal strength of one of the signals received from the first location sensor, the second location sensor, the third location sensor, and/or one of the iteratively increasing number of additional location sensors placed throughout the indoor environment is less than the predetermined minimum signal strength threshold”)

Regarding Claim 5, the Reicher/McBain/Hwang/Cohen/Schmidt/Katayama/Jadav combination of Claim 1 teaches the method of Claim 1 (and thus the rejection of Claim 1 is incorporated) and further teaches wherein the digital twin model is received from a digital library. (Reicher [0033]; “The site-specific labeled user software activities and objectives are routed from that site's data store 200 to a datastore 202 that stores the labeled user software activities and objectives. A RNN 204 has an input layer, a hidden layer, and an output layer. In block 206, the product requirement generator 110 trains a user software activity sequence model using the RNN 204 and the labeled user software activities and objectives from the data store 202. The user software activity sequence model is a model that is used to predict job roles and objectives based on the user software activities.”)

Regarding Claim 7, the Reicher/McBain/Hwang/Cohen/Schmidt/Katayama/Jadav combination of Claim 1 teaches the method of Claim 1 (and thus the rejection of Claim 1 is incorporated) and already teaches wherein the simulation of the movements and activities includes multi-directional movements of each digital twin model. (Cohen [0068]; “As an example, a person may be walking back and forth, interacting with voice-based interaction agents. As the person walks, her facial direction may change repeatedly, causing the input volume level, as measured, to repeatedly change”)

Claims 8, 11, 12, and 14 recite a computer system, comprising one or more processors, one or more computer readable memories, one or more computer-readable tangible storage media, and program instructions to perform precisely the methods of Claims 1, 4, 5, and 7, respectively. As Reicher teaches such a system to perform their method (Reicher [0090]; “The present invention may be a system, a method, and/or a computer program product at any possible technical detail level of integration.
The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.”), Claims 8, 11, 12, and 14 are thus rejected for the reasons set forth in the rejections of Claims 1, 4, 5, and 7, respectively. Similarly, Claims 15, 18, and 19 recite one or more computer-readable tangible storage media holding the instructions. As Reicher teaches such a computer readable medium (Reicher [0090]), Claims 15, 18, and 19 are rejected for the reasons set forth in the rejections of Claims 1, 4, and 5, respectively.

Claims 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Reicher et al. (US 20210081302 A1, hereinafter “Reicher”) in view of McBain et al. (US 20210248825 A1, hereinafter “McBain”), further in view of Hwang et al. (US 20210065685 A1, hereinafter “Hwang”), further in view of Cohen et al. (US 20200310742 A1, hereinafter “Cohen”), further in view of Katayama et al. (“Situation-Aware Emotion Regulation of Conversational Agents with Kinetic Earables” [2019], hereinafter “Katayama”), further in view of Schmidt et al. (US 20200260208 A1, hereinafter “Schmidt”), further in view of Jadav (US 20200100063 A1), and further in view of Keating et al. (US 20140028712 A1, hereinafter “Keating”).

Regarding Claim 21, the Reicher/McBain/Hwang/Cohen/Schmidt/Katayama/Jadav combination of Claim 1 teaches the method of Claim 1 (and thus the rejection of Claim 1 is incorporated). The combination fails to explicitly disclose but Keating discloses wherein responsive to the first digital twin model issuing the first voice command simultaneously with the second digital twin model issuing the second voice command to the AI virtual assistant, prioritizing, based on a pre-defined sequence of operations, one of the first voice command and the second voice command (Keating [0100]; “The augmentation logic of the ARD 14 can be configured to identify a primary user of the device. Where a single ARD 14 is being used by multiple users, the ARD 14 can identify a primary user of the device and give priority to voice commands and/or verbalizations provided by the primary user. For example, if no primary user is currently associated with the ARD 14, the ARD 14 can be configured to select a user that is loudest as the primary user of the device, as this user may likely to be the user who is closest to the device. After a user's voice has been associated with the ARD 14, the ARD 14 can be configured to continue to recognize that voice as the primary user. The augmentation logic of the ARD 14 can be configured to provide dominant focus on vocalizations from the primary user and secondarily focus on vocalizations from other users.”)

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to perform Keating’s method of assigning priorities to voice commands and verbalizations upon the voice commands of the Reicher/McBain/Hwang/Cohen/Schmidt/Katayama/Jadav combination. A person of ordinary skill in the art would have been motivated to do so because "With this approach, the augmentation logic can resolve conflicting inputs from the users in favor of the primary user of the device" (Keating [0100]).

Claim 22 recites a computer system, comprising one or more processors, one or more computer readable memories, one or more computer-readable tangible storage media, and program instructions to perform precisely the method of Claim 21.
As Reicher teaches such a system to perform their method (Reicher [0090]; “The present invention may be a system, a method, and/or a computer program product at any possible technical detail level of integration. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.”), Claim 22 is thus rejected for the reasons set forth in the rejection of Claim 21. Similarly, Claim 23 recites one or more computer-readable tangible storage media holding the instructions. As Reicher teaches such a computer readable medium (Reicher [0090]), Claim 23 is rejected for the reasons set forth in the rejection of Claim 21.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
“Routing Voice Commands to Virtual Assistants” (US 20200105273 A1), which discloses assigning priority to voice commands and their associated operation scheduling.
“Multiple User Interaction with Audio Devices Using Speech and Gestures” (US 20190317606 A1), which discloses transmission and detection of multiple user voice command inputs to an audio device.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONATHAN J KIM whose telephone number is (571) 272-0523. The examiner can normally be reached 9-6. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matt El, can be reached at (571) 270-3264. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JONATHAN J KIM/
Examiner, Art Unit 2141

/TAN H TRAN/
Primary Examiner, Art Unit 2141
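
Editorial note (not part of the Office Action): the Schmidt-based mapping above turns on the Doppler effect, which shifts the frequency a stationary microphone observes as a talker moves. A toy calculation with the standard relation for a moving source and stationary observer, f_observed = f_source · c / (c − v_source); the walking-speed numbers and the function name are illustrative only and are not drawn from Schmidt's disclosure.

# Illustrative only: classical Doppler shift for a moving talker and a
# stationary microphone. Not taken from Schmidt; standard physics.
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 C

def observed_frequency(f_source_hz: float, v_source_ms: float) -> float:
    """v_source_ms > 0: talker walking toward the microphone;
    v_source_ms < 0: talker walking away from it."""
    return f_source_hz * SPEED_OF_SOUND / (SPEED_OF_SOUND - v_source_ms)

# A 200 Hz voice fundamental at walking speed (~1.4 m/s):
print(observed_frequency(200.0, +1.4))  # ~200.8 Hz while approaching
print(observed_frequency(200.0, -1.4))  # ~199.2 Hz while receding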

Prosecution Timeline

Aug 30, 2021: Application Filed
Dec 23, 2024: Non-Final Rejection — §103
Mar 04, 2025: Interview Requested
Mar 12, 2025: Applicant Interview (Telephonic)
Mar 12, 2025: Examiner Interview Summary
Apr 02, 2025: Response Filed
Apr 28, 2025: Final Rejection — §103
Jun 17, 2025: Interview Requested
Jul 01, 2025: Applicant Interview (Telephonic)
Jul 02, 2025: Request for Continued Examination
Jul 08, 2025: Response after Non-Final Action
Jul 08, 2025: Examiner Interview Summary
Jul 24, 2025: Non-Final Rejection — §103
Sep 19, 2025: Interview Requested
Oct 02, 2025: Applicant Interview (Telephonic)
Oct 10, 2025: Examiner Interview Summary
Oct 21, 2025: Response Filed
Dec 01, 2025: Final Rejection — §103
Feb 05, 2026: Response after Non-Final Action
Mar 17, 2026: Request for Continued Examination
Mar 20, 2026: Response after Non-Final Action
Apr 01, 2026: Non-Final Rejection — §103 (current)


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 33%
Grant Probability With Interview: 99% (+80.0%)
Median Time to Grant: 3y 3m
PTA Risk: High
Based on 6 resolved cases by this examiner. Grant probability derived from career allow rate.
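
The "with interview" figure is consistent with a simple additive model capped at 99%. The sketch below is our reconstruction under that assumption; it reproduces the displayed numbers, but the page does not publish its actual model.

# Reconstruction (assumption): add the interview lift to the base grant
# probability and cap at 99%. Matches the displayed figures exactly.
base_grant_probability = 2 / 6   # career allow rate -> 33%
interview_lift = 0.80            # +80.0 percentage points

with_interview = min(base_grant_probability + interview_lift, 0.99)
print(f"base {base_grant_probability:.0%}, with interview {with_interview:.0%}")
# -> base 33%, with interview 99%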
