DETAILED ACTION
Claims 1-20 have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 U.S.C. § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. § 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 5, 7, 8, 12, 15, and 19 are rejected under 35 U.S.C. § 102(a)(1) as being anticipated by Dehesa, et al., Touché: Data-Driven Interactive Sword Fighting in Virtual Reality, CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 25–30 April 2020, pp. 1-15, in its entirety. Specifically:
Claim 1
Claim 1's ''an interface; and'' is anticipated by Dehesa, et al., page 7, right column, last partial paragraph and page 8, left column, first partial paragraph, where it recites:
We first conducted a questionnaire study, where participants were presented with short videos (one minute each) of sword fighting sequences against the characters described by each of the conditions, driven by similar player interactions. The videos are from the point of view of the player and show the character walking towards the player and engaging in sword fighting, attacking and defending.
Claim 1's ''processing circuitry configured to execute a first neural network to:'' is anticipated by Dehesa, et al., page 6, left column, second full paragraph, where it recites:
We chose to use a neural network as a model for this system, given their convenience and their success in comparable problems. We used an architecture inspired by the phase-functioned neural network by Holden et al. [27]. Instead, however, of training a single neural network, we train a collection, each specialised in particular aspects of the problem. We use a fixed network architecture, but train for multiple sets of weights (parameterisations).
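For illustration only (this sketch is the editor's, not part of the cited reference or the record), the quoted design of "a fixed network architecture, but ... multiple sets of weights (parameterisations)" can be pictured as one feed-forward shape evaluated under several independent weight sets. The dimensions and layer shapes below are hypothetical placeholders:

```python
import numpy as np

# Hypothetical sketch: one fixed feed-forward architecture shared by a
# collection of independently trained weight sets ("parameterisations"),
# each of which could be specialised to a different aspect of the problem.

rng = np.random.default_rng(0)

def init_weights(in_dim, hidden, out_dim):
    # One set of weights for the shared two-layer architecture.
    return {
        "W1": rng.standard_normal((in_dim, hidden)) * 0.1,
        "W2": rng.standard_normal((hidden, out_dim)) * 0.1,
    }

def forward(weights, x):
    # Same forward pass regardless of which parameterisation is used.
    h = np.maximum(0.0, x @ weights["W1"])  # ReLU hidden layer
    return h @ weights["W2"]

# Three parameterisations sharing one architecture.
parameterisations = [init_weights(4, 8, 3) for _ in range(3)]

x = rng.standard_normal(4)
outputs = [forward(w, x) for w in parameterisations]
```

Each weight set produces its own output for the same input, consistent with training "a collection, each specialised in particular aspects of the problem."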
Claim 1's ''receive, via the interface, indications of movement of a player controlled by a user playing a video game application;'' is anticipated by Dehesa, et al., page 1, caption for Figure 1 (images on the right), where it recites:
Figure 1: Left: Our framework splits the problem of simulating interactive VR sword fighting characters into a “physical” level, relying on data-driven models, and a “semantic” level, where designers can configure the behaviour of the character. Right: The framework generates responsive animations against player attacks (top), avoiding nonreactive behaviour from the character (bottom left). A neural network parameterised by the position of the player’s sword synthesises the animation (bottom right).
Claim 1's “program a first distance from the player and a second distance from the player, wherein the second distance is greater than the first distance; and” is anticipated by Dehesa, et al., page 7, right column, Figure 4, where it shows distance differences between the bones of different characters.
Claim 1's “implement a movement scheme for a first non-player character (NPC) indicating that the first NPC is to remain within a region created as a result of a difference between the first distance and the second distance from the player as the player moves; and” is anticipated by Dehesa, et al., page 5, left column, first full paragraph, where it recites:
Initially, the character is at a distance from the player and starts walking in their direction until the distance is small enough (IsClose), from where it can either defend or attack. While defending, the character will simply try to put their sword across the trajectory of the player’s sword. There are two circumstances that puts the character into attacking state: if no strikes from the user are detected for some random amount of time (RandomTimer), the character will proactively initiate an attack, or when a strike from the player is detected (PlayerAttacked), the character may react by counterattacking with a strike following the same direction, in an attempt to hit the unguarded area. For example, if the player performs a strike from right to left, then the right-hand side will be left unprotected, so the character may try to strike there.
Further, the distance differences are anticipated by Dehesa, et al., page 7, right column, Figure 4, where it shows distance differences between the bones of different characters.
Claim 1's ''wherein the apparatus is configured to render the first NPC into a user interface (UI) alongside the player following the movement scheme.'' is anticipated by Dehesa, et al., page 5, right column, first full paragraph, where it recites:
Animation Synthesis
The animation synthesis module is responsible for determining the pose that the virtual character shall adopt in each frame. This module is driven by the directives emitted by the behaviour planning module, and can be seen as a translator of these directives into “physical” actions. Animation synthesis is also a data-driven component: it does not use complex state machines, but rather offers a menu of actions it may perform and acts them out, depending on the context. There are two kinds of actions that the model may perform: defending and attacking. Defending is the more complicated, as it is reactive and depends on what the player does. Therefore, the defence animation synthesis uses a machine learning model capable of generating the motion necessary to block arbitrary strikes from the user.
Claim 5
Claim 5's ''receive feedback on whether behavior of the first NPC is appropriate; and'' is anticipated by Dehesa, et al., page 6, right column, fourth full paragraph, where it recites:
Gesture Recognition
We can assess the accuracy of the gesture recognition system by measuring the mispredictions of the model. However, we also want to have an understanding of the cases in which those mispredictions are more frequent. We therefore analysed the errors of the neural network per gesture and also across the duration of each gesture. In particular, we want to make sure that gesture classification is more reliable in the middle sections of a gesture, allowing more room for error at the beginning and end, where the boundary of the gesture may not be as precisely defined.
Further, it is anticipated by Dehesa, et al., page 5, right column, first full paragraph, where it recites:
Therefore, the defence animation synthesis uses a machine learning model capable of generating the motion necessary to block arbitrary strikes from the user. Attacks, on the other hand, are initiated by the character by request of the behaviour planner, which also indicates the kind of attack to perform. For this, we can simply use a collection of animation clips that are played out as necessary. This simplifies the system and gives designers precise control of what happens when the character attacks the user. The animation synthesis system smoothly blends between the machine learning model and the clips, fading the weight of each one in and out over a short period of time as the character transitions between attack and defence, so the overall animation appears as a continuous action.
For the defence animation synthesis, we start by collecting a set of motion data to train the model. We used Vicon Bonita equipment to record several sword fighting attacks and blocks at different angles. We produced about 15 minutes of training data in total, recorded at 30 frames per second (over 26000 frames), with each frame containing the pose of a 24-joint skeleton and the position of the tip of both swords. The data is split leaving 80% for training and the rest for evaluation.
Defence animation is generated one frame at a time. The model continuously predicts the next character pose using the current pose and the perceived control information from the user. In summary, the collection of features used as input is:
Claim 5's ''train one or more parameters of the first neural network responsive to receiving the feedback on behavior of the first NPC.'' is anticipated by Dehesa, et al., page 6, left column, second full paragraph, where it recites:
We chose to use a neural network as a model for this system, given their convenience and their success in comparable problems. We used an architecture inspired by the phase-functioned neural network by Holden et al. [27]. Instead, however, of training a single neural network, we train a collection, each specialised in particular aspects of the problem. We use a fixed network architecture, but train for multiple sets of weights (parameterisations).
Claim 7
Claim 7's ''The apparatus as recited in claim 1, wherein the first neural network uses the first programmable amount (“IsClose”?) and the second programmable amount of distance (“NPC starting point”?) as parameters of the movement scheme that define (i) a region proximate to the player that the first NPC does not enter and (ii) a surrounding region bounded between the first and second amounts, and generates movement control data to keep the first NPC within the surrounding region as the player moves (“change gestures”?).'' is anticipated by Dehesa, et al., page 5, left column, first and second paragraphs, where it recites:
Initially, the character is at a distance from the player (i.e., the claimed “second programmable amount of distance”) and starts walking in their direction until the distance is small enough (IsClose) (i.e., the claimed “first programmable amount”), from where it can either defend or attack. While defending, the character will simply try to put their sword across the trajectory of the player’s sword. There are two circumstances that puts the character into attacking state: if no strikes from the user are detected for some random amount of time (RandomTimer), the character will proactively initiate an attack, or when a strike from the player is detected (PlayerAttacked), the character may react by counterattacking with a strike following the same direction, in an attempt to hit the unguarded area. For example, if the player performs a strike from right to left, then the right-hand side will be left unprotected, so the character may try to strike there.
When the character is attacking, it will usually continue the attack until it is completed (AttackFinished), and then come back to defend again, but the attack may also be aborted early. This will always happen when the player hits the character in the middle of an attack (HitByPlayer). But the character may also interrupt an attack willingly if an incoming strike from the player is detected (PlayerAttacking) (i.e., the claimed “as the player moves”), quickly coming back to defending in an attempt to block the attack.
Claim 8
Claim 8's ''receiving, by processing circuitry executing a first neural network, indications of movement of a player controlled by a user playing a video game application;'' is anticipated by Dehesa, et al., page 1, caption for Figure 1 (images on the right), where it recites:
Figure 1: Left: Our framework splits the problem of simulating interactive VR sword fighting characters into a “physical” level, relying on data-driven models, and a “semantic” level, where designers can configure the behaviour of the character. Right: The framework generates responsive animations against player attacks (top), avoiding nonreactive behaviour from the character (bottom left). A neural network parameterised by the position of the player’s sword synthesises the animation (bottom right).
Further, the claimed neural network is anticipated by Dehesa, et al., page 6, left column, second full paragraph, where it recites:
We chose to use a neural network as a model for this system, given their convenience and their success in comparable problems. We used an architecture inspired by the phase-functioned neural network by Holden et al. [27]. Instead, however, of training a single neural network, we train a collection, each specialised in particular aspects of the problem. We use a fixed network architecture, but train for multiple sets of weights (parameterisations).
Claim 8's “programming, by the processing circuitry, a first distance from the player and a second distance from the player, wherein the second distance is greater than the first distance; and” is anticipated by Dehesa, et al., page 7, right column, Figure 4, where it shows distance differences between the bones of different characters.
Claim 8's “implementing, by the processing circuitry, a movement scheme for a first non-player character (NPC), indicating that the first NPC is to remain within a region created as a result of a difference between the first distance and the second distance from the player as the player moves; and” is anticipated by Dehesa, et al., page 5, left column, first full paragraph, where it recites:
Initially, the character is at a distance from the player and starts walking in their direction until the distance is small enough (IsClose), from where it can either defend or attack. While defending, the character will simply try to put their sword across the trajectory of the player’s sword. There are two circumstances that puts the character into attacking state: if no strikes from the user are detected for some random amount of time (RandomTimer), the character will proactively initiate an attack, or when a strike from the player is detected (PlayerAttacked), the character may react by counterattacking with a strike following the same direction, in an attempt to hit the unguarded area. For example, if the player performs a strike from right to left, then the right-hand side will be left unprotected, so the character may try to strike there.
Further, the distance differences are anticipated by Dehesa, et al., page 7, right column, Figure 4, where it shows distance differences between the bones of different characters.
Claim 8's ''implementing a movement scheme for a first non-player character (NPC) to remain in relatively close proximity to the player without invading a first programmable amount of distance from the player;'' is anticipated by Dehesa, et al., page 5, left column, first full paragraph, where it recites:
Initially, the character is at a distance from the player and starts walking in their direction until the distance is small enough (IsClose), from where it can either defend or attack. While defending, the character will simply try to put their sword across the trajectory of the player’s sword. There are two circumstances that puts the character into attacking state: if no strikes from the user are detected for some random amount of time (RandomTimer), the character will proactively initiate an attack, or when a strike from the player is detected (PlayerAttacked), the character may react by counterattacking with a strike following the same direction, in an attempt to hit the unguarded area. For example, if the player performs a strike from right to left, then the right-hand side will be left unprotected, so the character may try to strike there.
Claim 8's ''causing the first NPC to not exceed a second programmable amount of distance from the player, wherein the second programmable amount of distance is greater than the first programmable amount of distance; and'' is anticipated by Dehesa, et al., page 5, left column, first full paragraph, where it recites:
Initially, the character is at a distance from the player and starts walking in their direction until the distance is small enough (IsClose), from where it can either defend or attack. While defending, the character will simply try to put their sword across the trajectory of the player’s sword. There are two circumstances that puts the character into attacking state: if no strikes from the user are detected for some random amount of time (RandomTimer), the character will proactively initiate an attack, or when a strike from the player is detected (PlayerAttacked), the character may react by counterattacking with a strike following the same direction, in an attempt to hit the unguarded area. For example, if the player performs a strike from right to left, then the right-hand side will be left unprotected, so the character may try to strike there.
Note that the claimed second distance is taught by the distance at which the character's sword crosses the trajectory of the player's sword.
Claim 8's ''rendering, by the processing circuitry, the first NPC into a user interface (UI) alongside the player following the movement scheme.'' is anticipated by Dehesa, et al., page 5, left column, first full paragraph, where it recites:
Initially, the character is at a distance from the player and starts walking in their direction until the distance is small enough (IsClose), from where it can either defend or attack. While defending, the character will simply try to put their sword across the trajectory of the player’s sword. There are two circumstances that puts the character into attacking state: if no strikes from the user are detected for some random amount of time (RandomTimer), the character will proactively initiate an attack, or when a strike from the player is detected (PlayerAttacked), the character may react by counterattacking with a strike following the same direction, in an attempt to hit the unguarded area. For example, if the player performs a strike from right to left, then the right-hand side will be left unprotected, so the character may try to strike there.
Claim 12
Claim 12's ''receiving feedback on whether behavior of the first NPC is appropriate; and'' is anticipated by Dehesa, et al., page 6, right column, fourth full paragraph, where it recites:
Gesture Recognition
We can assess the accuracy of the gesture recognition system by measuring the mispredictions of the model. However, we also want to have an understanding of the cases in which those mispredictions are more frequent. We therefore analysed the errors of the neural network per gesture and also across the duration of each gesture. In particular, we want to make sure that gesture classification is more reliable in the middle sections of a gesture, allowing more room for error at the beginning and end, where the boundary of the gesture may not be as precisely defined.
Further, it is anticipated by Dehesa, et al., page 5, right column, first full paragraph, where it recites:
Therefore, the defence animation synthesis uses a machine learning model capable of generating the motion necessary to block arbitrary strikes from the user. Attacks, on the other hand, are initiated by the character by request of the behaviour planner, which also indicates the kind of attack to perform. For this, we can simply use a collection of animation clips that are played out as necessary. This simplifies the system and gives designers precise control of what happens when the character attacks the user. The animation synthesis system smoothly blends between the machine learning model and the clips, fading the weight of each one in and out over a short period of time as the character transitions between attack and defence, so the overall animation appears as a continuous action.
For the defence animation synthesis, we start by collecting a set of motion data to train the model. We used Vicon Bonita equipment to record several sword fighting attacks and blocks at different angles. We produced about 15 minutes of training data in total, recorded at 30 frames per second (over 26000 frames), with each frame containing the pose of a 24-joint skeleton and the position of the tip of both swords. The data is split leaving 80% for training and the rest for evaluation.
Defence animation is generated one frame at a time. The model continuously predicts the next character pose using the current pose and the perceived control information from the user. In summary, the collection of features used as input is:
Claim 12's ''training one or more parameters of the first neural network responsive to receiving the feedback on behavior of the first NPC.'' is anticipated by Dehesa, et al., page 6, left column, second full paragraph, where it recites:
We chose to use a neural network as a model for this system, given their convenience and their success in comparable problems. We used an architecture inspired by the phase-functioned neural network by Holden et al. [27]. Instead, however, of training a single neural network, we train a collection, each specialised in particular aspects of the problem. We use a fixed network architecture, but train for multiple sets of weights (parameterisations).
Further, it is anticipated by Dehesa, et al., page 5, right column, first full paragraph, where it recites:
Therefore, the defence animation synthesis uses a machine learning model capable of generating the motion necessary to block arbitrary strikes from the user. Attacks, on the other hand, are initiated by the character by request of the behaviour planner, which also indicates the kind of attack to perform. For this, we can simply use a collection of animation clips that are played out as necessary. This simplifies the system and gives designers precise control of what happens when the character attacks the user. The animation synthesis system smoothly blends between the machine learning model and the clips, fading the weight of each one in and out over a short period of time as the character transitions between attack and defence, so the overall animation appears as a continuous action.
For the defence animation synthesis, we start by collecting a set of motion data to train the model. We used Vicon Bonita equipment to record several sword fighting attacks and blocks at different angles. We produced about 15 minutes of training data in total, recorded at 30 frames per second (over 26000 frames), with each frame containing the pose of a 24-joint skeleton and the position of the tip of both swords. The data is split leaving 80% for training and the rest for evaluation.
Defence animation is generated one frame at a time. The model continuously predicts the next character pose using the current pose and the perceived control information from the user. In summary, the collection of features used as input is:
Claim 15
Claim 15's ''receive indications of movement of a player controlled by a user playing a video game application;'' is anticipated by Dehesa, et al., page 1, caption for Figure 1 (images on the right), where it recites:
Figure 1: Left: Our framework splits the problem of simulating interactive VR sword fighting characters into a “physical” level, relying on data-driven models, and a “semantic” level, where designers can configure the behaviour of the character. Right: The framework generates responsive animations against player attacks (top), avoiding nonreactive behaviour from the character (bottom left). A neural network parameterised by the position of the player’s sword synthesises the animation (bottom right).
Claim 15's “program a first distance from the player and a second distance from the player, wherein the second distance is greater than the first distance; and” is anticipated by Dehesa, et al., page 7, right column, Figure 4, where it shows distance differences between the bones of different characters.
Claim 15's “implement a movement scheme for a first non-player character (NPC) indicating that the first NPC is to remain within a region created as a result of a difference between the first distance and the second distance from the player as the player moves; and” is anticipated by Dehesa, et al., page 5, left column, first full paragraph, where it recites:
Initially, the character is at a distance from the player and starts walking in their direction until the distance is small enough (IsClose), from where it can either defend or attack. While defending, the character will simply try to put their sword across the trajectory of the player’s sword. There are two circumstances that puts the character into attacking state: if no strikes from the user are detected for some random amount of time (RandomTimer), the character will proactively initiate an attack, or when a strike from the player is detected (PlayerAttacked), the character may react by counterattacking with a strike following the same direction, in an attempt to hit the unguarded area. For example, if the player performs a strike from right to left, then the right-hand side will be left unprotected, so the character may try to strike there.
Further, the distance differences are anticipated by Dehesa, et al., page 7, right column, Figure 4, where it shows distance differences between the bones of different characters.
Claim 15's ''a rendering engine configured to render the first NPC into a user interface (UI) alongside the player following the movement scheme enforced by the first neural network.'' is anticipated by Dehesa, et al., page 5, left column, first full paragraph, where it recites:
Initially, the character is at a distance from the player and starts walking in their direction until the distance is small enough (IsClose), from where it can either defend or attack. While defending, the character will simply try to put their sword across the trajectory of the player’s sword. There are two circumstances that puts the character into attacking state: if no strikes from the user are detected for some random amount of time (RandomTimer), the character will proactively initiate an attack, or when a strike from the player is detected (PlayerAttacked), the character may react by counterattacking with a strike following the same direction, in an attempt to hit the unguarded area. For example, if the player performs a strike from right to left, then the right-hand side will be left unprotected, so the character may try to strike there.
Further, the claimed neural network is anticipated by Dehesa, et al., page 6, left column, second full paragraph, where it recites:
We chose to use a neural network as a model for this system, given their convenience and their success in comparable problems. We used an architecture inspired by the phase-functioned neural network by Holden et al. [27]. Instead, however, of training a single neural network, we train a collection, each specialised in particular aspects of the problem. We use a fixed network architecture, but train for multiple sets of weights (parameterisations).
Claim 19
Claim 19's ''receive feedback on whether behavior of the first NPC is appropriate; and'' is anticipated by Dehesa, et al., page 6, right column, fourth full paragraph, where it recites:
Gesture Recognition
We can assess the accuracy of the gesture recognition system by measuring the mispredictions of the model. However, we also want to have an understanding of the cases in which those mispredictions are more frequent. We therefore analysed the errors of the neural network per gesture and also across the duration of each gesture. In particular, we want to make sure that gesture classification is more reliable in the middle sections of a gesture, allowing more room for error at the beginning and end, where the boundary of the gesture may not be as precisely defined.
Further, it is anticipated by Dehesa, et al., page 5, right column, first full paragraph, where it recites:
Therefore, the defence animation synthesis uses a machine learning model capable of generating the motion necessary to block arbitrary strikes from the user. Attacks, on the other hand, are initiated by the character by request of the behaviour planner, which also indicates the kind of attack to perform. For this, we can simply use a collection of animation clips that are played out as necessary. This simplifies the system and gives designers precise control of what happens when the character attacks the user. The animation synthesis system smoothly blends between the machine learning model and the clips, fading the weight of each one in and out over a short period of time as the character transitions between attack and defence, so the overall animation appears as a continuous action.
For the defence animation synthesis, we start by collecting a set of motion data to train the model. We used Vicon Bonita equipment to record several sword fighting attacks and blocks at different angles. We produced about 15 minutes of training data in total, recorded at 30 frames per second (over 26000 frames), with each frame containing the pose of a 24-joint skeleton and the position of the tip of both swords. The data is split leaving 80% for training and the rest for evaluation.
Defence animation is generated one frame at a time. The model continuously predicts the next character pose using the current pose and the perceived control information from the user. In summary, the collection of features used as input is:
Claim 19's ''train one or more parameters of a first neural network responsive to receiving the feedback on behavior of the first NPC.'' is anticipated by Dehesa, et al., page 6, left column, second full paragraph, where it recites:
We chose to use a neural network as a model for this system, given their convenience and their success in comparable problems. We used an architecture inspired by the phase-functioned neural network by Holden et al. [27]. Instead, however, of training a single neural network, we train a collection, each specialised in particular aspects of the problem. We use a fixed network architecture, but train for multiple sets of weights (parameterisations).
Further, it is anticipated by Dehesa, et al., page 5, right column, first full paragraph, where it recites:
Therefore, the defence animation synthesis uses a machine learning model capable of generating the motion necessary to block arbitrary strikes from the user. Attacks, on the other hand, are initiated by the character by request of the behaviour planner, which also indicates the kind of attack to perform. For this, we can simply use a collection of animation clips that are played out as necessary. This simplifies the system and gives designers precise control of what happens when the character attacks the user. The animation synthesis system smoothly blends between the machine learning model and the clips, fading the weight of each one in and out over a short period of time as the character transitions between attack and defence, so the overall animation appears as a continuous action.
For the defence animation synthesis, we start by collecting a set of motion data to train the model. We used Vicon Bonita equipment to record several sword fighting attacks and blocks at different angles. We produced about 15 minutes of training data in total, recorded at 30 frames per second (over 26000 frames), with each frame containing the pose of a 24-joint skeleton and the position of the tip of both swords. The data is split leaving 80% for training and the rest for evaluation.
Defence animation is generated one frame at a time. The model continuously predicts the next character pose using the current pose and the perceived control information from the user. In summary, the collection of features used as input is:
Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 6, 13-14, and 20 are rejected under 35 U.S.C. § 103 as being unpatentable over Dehesa, et al., Touché: Data-Driven Interactive Sword Fighting in Virtual Reality, CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 25–30 April 2020, pp. 1-15, in view of Lee, et al., Precomputing Avatar Behavior From Human Motion Data, Graphical Models, Volume 68, Issue 2, March 2006, Pages 158-174, in their entireties. Specifically:
Claim 6
Claim 6's ''The apparatus as recited in claim 1, wherein the processing circuitry is configured to execute a reinforcement learning application to:'' is not expressly taught by the structure of Dehesa, et al.
It is, however, taught by Lee, et al., page 165, last full paragraph, where it recites:
4.2. Dynamic programming
Computing a control policy is a simple iterative process of sampling states and applying a local update rule to incrementally refine values in the table. On each iteration, we randomly choose a pose s of the avatar and a target e among grid points. The avatar needs to decide which action to take among a set of actions immediately available to the avatar at state s. A greedy policy is to select the one that gains the highest reward in one step. Taking action a brings the avatar to state s′ and the target to a new location e′ (since the location is represented with respect to a local moving coordinate system). According to the greedy policy, the value at (s, e) should be updated to reflect the immediate reward and the value of the next state (s′, e′), using the best available action. This process is called value iteration in reinforcement learning community and can be shown to converge to the optimal values. We repeat the process until all states have been visited dozens of times.
Rationale -- It would have been obvious to one of ordinary skill in the art, as of the effective filing date, to substitute the reinforcement learning system of Lee, et al., for the neural network of Dehesa, et al., because both AI systems are suitable for making the decisions required of the system.
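For context, the value-iteration procedure quoted from Lee, et al., can be sketched in a few lines. The toy state space, transition function, and reward below are hypothetical stand-ins, not the avatar model of the reference:

```python
import random

random.seed(0)  # deterministic for illustration

GAMMA = 0.9               # discount factor (hypothetical; not stated in the quote)
STATES = list(range(10))  # stand-in for the sampled (pose, target) grid cells
ACTIONS = (-1, +1)        # stand-in for the avatar's available actions

def step(state, action):
    """Hypothetical transition: move along the grid; reward 1 for reaching cell 9."""
    nxt = max(0, min(9, state + action))
    return nxt, (1.0 if nxt == 9 else 0.0)

values = {s: 0.0 for s in STATES}

# As in the quoted passage: repeatedly sample a random state and apply the
# greedy local update -- immediate reward plus discounted value of the next
# state, using the best available action -- until the table stops changing.
for _ in range(5000):
    s = random.choice(STATES)
    values[s] = max(r + GAMMA * values[nxt]
                    for nxt, r in (step(s, a) for a in ACTIONS))
```

Under these assumptions, the values converge toward the goal cell, illustrating the asynchronous ("sample and locally update") character of the quoted procedure.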
Claim 6's ''access, from the first neural network, data associated with features based on game scenarios encountered in an environment sequence; and'' is anticipated by Dehesa, et al., page 3, left column, first partial paragraph, where it recites:
Given the limited complexity of our features (two hand sensors and one head sensor), we chose neural networks to implement gesture recognition in our framework for their simplicity and proven effectiveness.
Claim 6's ''select a next action for the first NPC based on the accessed data.'' is anticipated by Dehesa, et al., page 4, left column, third full paragraph, where it recites:
We split the problem into two levels. The physical level represents actual events and information in the virtual world. For our purposes, this means essentially input from the player and character animation. It is difficult to reason directly with this data, as it is mostly 3D geometrical information with little structure. We therefore use data-driven models to project it onto a semantic level, where the information is seen as discrete labels for virtual world events. On the player’s side, a gesture recognition system interprets raw 3D input as sword fighting gestures. It becomes now simple to define custom logic to decide how to react to the actions of the user. This is done by the human-designed behaviour planning module.
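The two-level design quoted above (a data-driven projection from raw input to semantic labels, followed by human-designed behaviour logic) can be sketched as follows. The label names, threshold, and reactions are hypothetical stand-ins, not values from the reference:

```python
# Sketch of the two-level design quoted from Dehesa, et al.: a data-driven
# recogniser maps raw 3D input to a discrete semantic label (physical ->
# semantic), and a hand-designed behaviour planner maps labels to reactions.
def recognise(raw_input):
    """Stand-in for the learned gesture recogniser; a real system would run
    a neural network over 3D sensor data."""
    return "overhead_strike" if raw_input["sword_tip_y"] > 1.8 else "idle"

REACTIONS = {  # human-designed behaviour planning (semantic level)
    "overhead_strike": "block_high",
    "idle": "approach",
}

label = recognise({"sword_tip_y": 2.0})
action = REACTIONS[label]  # "block_high"
```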
Claim 13
Claim 13's ''executing, by the processing circuitry a reinforcement learning engine, to access, from the first neural network, data associated with features based on the game scenarios encountered in an environment sequence; and'' is not expressly taught by the structure of Dehesa, et al.
It is, however, taught by Lee, et al., page 165, last full paragraph, where it recites:
4.2. Dynamic programming
Computing a control policy is a simple iterative process of sampling states and applying a local update rule to incrementally refine values in the table. On each iteration, we randomly choose a pose s of the avatar and a target e among grid points. The avatar needs to decide which action to take among a set of actions immediately available to the avatar at state s. A greedy policy is to select the one that gains the highest reward in one step. Taking action a brings the avatar to state s′ and the target to a new location e′ (since the location is represented with respect to a local moving coordinate system). According to the greedy policy, the value at (s, e) should be updated to reflect the immediate reward and the value of the next state (s′, e′), using the best available action. This process is called value iteration in reinforcement learning community and can be shown to converge to the optimal values. We repeat the process until all states have been visited dozens of times.
Rationale -- It would have been obvious to one of ordinary skill in the art, as of the effective filing date, to substitute the reinforcement learning system of Lee, et al., for the neural network of Dehesa, et al., because both AI systems are suitable for making the decisions required of the system.
Claim 13's ''selecting a next action for the first NPC based on the features.'' is anticipated by Dehesa, et al., page 4, left column, third full paragraph, where it recites:
We split the problem into two levels. The physical level represents actual events and information in the virtual world. For our purposes, this means essentially input from the player and character animation. It is difficult to reason directly with this data, as it is mostly 3D geometrical information with little structure. We therefore use data-driven models to project it onto a semantic level, where the information is seen as discrete labels for virtual world events. On the player’s side, a gesture recognition system interprets raw 3D input as sword fighting gestures. It becomes now simple to define custom logic to decide how to react to the actions of the user. This is done by the human-designed behaviour planning module.
Claim 14
Claim 14's ''receiving a personality score generated based on whether behavior of the first NPC matches an assigned personality and mood; and'' is anticipated by Dehesa, et al., page 3, left column, last full paragraph, where it recites:
For our purposes, we implement a basic behavioural planner based on a simple state machine in our framework, although our approach could support more sophisticated reasoning models to model scenarios that require them.
Claim 14's ''training one or more parameters of the first neural network based on the personality score.'' is anticipated by Dehesa, et al., page 4, right column, first partial paragraph, where it recites:
We trained a neural network as the basis of our recogniser. To this end, we first collected a dataset of gestural actions. This was done in a separate VR environment, where the player was presented with a signal indicating a gesture to perform, which they then did while pressing a button on the hand controller.
Claim 20
Claim 20's ''The system as recited in claim 15, wherein the processor is further to:'' is not expressly taught by the structure of Dehesa, et al.
It is, however, taught by Lee, et al., page 165, last full paragraph, where it recites:
4.2. Dynamic programming
Computing a control policy is a simple iterative process of sampling states and applying a local update rule to incrementally refine values in the table. On each iteration, we randomly choose a pose s of the avatar and a target e among grid points. The avatar needs to decide which action to take among a set of actions immediately available to the avatar at state s. A greedy policy is to select the one that gains the highest reward in one step. Taking action a brings the avatar to state s′ and the target to a new location e′ (since the location is represented with respect to a local moving coordinate system). According to the greedy policy, the value at (s, e) should be updated to reflect the immediate reward and the value of the next state (s′, e′), using the best available action. This process is called value iteration in reinforcement learning community and can be shown to converge to the optimal values. We repeat the process until all states have been visited dozens of times.
Rationale -- It would have been obvious to one of ordinary skill in the art, as of the effective filing date, to substitute the reinforcement learning system of Lee, et al., for the neural network of Dehesa, et al., because both AI systems are suitable for making the decisions required of the system.
Claim 20's ''execute a reinforcement learning application to access, from the first neural network, data associated with features based on the game scenarios encountered in an environment sequence; and'' is not expressly taught by Dehesa, et al.
It is, however, taught by Lee, et al., page 165, last full paragraph, as quoted above with respect to this claim, and the same rationale for the substitution applies.
Claim 20's ''select a next action for the first NPC based on the accessed data.'' is anticipated by Dehesa, et al., page 4, left column, third full paragraph, where it recites:
We split the problem into two levels. The physical level represents actual events and information in the virtual world. For our purposes, this means essentially input from the player and character animation. It is difficult to reason directly with this data, as it is mostly 3D geometrical information with little structure. We therefore use data-driven models to project it onto a semantic level, where the information is seen as discrete labels for virtual world events. On the player’s side, a gesture recognition system interprets raw 3D input as sword fighting gestures. It becomes now simple to define custom logic to decide how to react to the actions of the user. This is done by the human-designed behaviour planning module.
Allowable Subject Matter
Claims 2-4, 9-11, and 16-18 are allowed. Specifically, for:
Claims 2, 9, and 16, the cited prior art does not teach a score representative of truthfulness of information contained in the output, and wherein the score includes metadata indicating a time when the output was received and information about the first NPC;
Claims 3, 10, and 17, the cited prior art does not teach that the second neural network has a different complexity level from the first neural network;
Claims 4, 11, and 18, the cited prior art does not teach “a friend threshold” nor “a foe threshold”.
Response to Arguments
Applicant's arguments filed 04 SEP 2025 have been fully considered but they are not persuasive. Specifically, Applicant argues:
Argument 1
Claim 1 recites an apparatus comprising circuitry configured to program two distances from a player, a first distance and a second distance that is greater than the first distance, and implement a movement scheme that causes the NPC to remain within a region created by the difference between those two distances as the player moves. The present Office Action maps the recited "region" to Dehesa's "IsClose" condition and maps the "two distances" to Figure 4 of Dehesa. See Office Action at pp. 12-13 (IsClose) and pp. 20-21 (IsClose + Fig. 4). The Office Action further states: "[t]he region in the prior art is '(IsClose).'" and relies again on Figure 4 for the "distance differences." Applicant respectfully submits the cited disclosures of Dehesa are not equivalent to the claim features for at least the following reasons.
Dehesa discloses a single threshold ("IsClose"), not two programmed distances with a region (band/annulus) in which the NPC must remain. Dehesa's behavior diagram and disclosure explicitly describe the character "walking in [the player's] direction until the distance is small enough (IsClose)," a single threshold to engage, not a bounded region defined by the difference between two programmed distances that is continuously enforced as the player moves.
In the prior art, there are two distances between which the opponent stays: 1) the initial location of the character, from which it approaches in order to fight; and 2) the “IsClose” distance at which the characters begin to engage.
In the claims, there is no limit on the claimed “first distance.” It may be as small as zero. There is no limit on the claimed “second distance.” It may be infinite. There is no limit on the difference between the claimed “first distance” and “second distance” other than that the second is greater than the first. That difference may be as small as a pixel…or smaller.
Further, regarding the argued limitation “band/annulus,” neither “band” nor “annulus” appears in the Specification. Not even a “radius” is in the Specification. Nor are they claimed. There is no indication that the claimed distances refer to a circular planar segment.
Applicant's argument is unpersuasive.
The rejections stand.
Argument 2
In addition, Dehesa Figure 4 is not NPC-to-player distance at all. Rather, it defines a pose-difference metric measured as areas between corresponding bones of the same character across poses. Accordingly, it says nothing about programming two distances from the player or maintaining an NPC within an annulus relative to a moving player.
Accordingly, Dehesa does not disclose (i) programming two distances from the player, nor (ii) a movement scheme that keeps the NPC within the region created by their difference. For at least these reasons, claim 1 is not anticipated by Dehesa.
Claims 8 and 15 are similarly not anticipated by Dehesa.
Neither “band” nor “annulus” appears in the Specification. Not even a “radius” is in the Specification. Nor are they claimed. There is no indication that the claimed distances refer to a circular planar segment.
Further, the claim does not limit what the two distances pertain to. They cannot be limited to an annulus, since no annulus was claimed. Further, the claim does not specify from what part of the character the distances are measured, or what governs the limits of those distances. The limits may be infinite, or the limits may be as close as the end of the character’s weapons or appendages. Under the broadest reasonable interpretation of the claim, the distances may be measured to different parts of the same character.
Further, Applicant argues that the claim contains the following limitation:
…a movement scheme that keeps the NPC within the region created by their difference…
That is not a direct quote of the claim and has a different scope than the actual claim. The claim actually recites:
… the first NPC is to remain within a region created as a result of a difference between the first distance and the second distance…
Many regions may be created “as a result of” a difference. In the prior art, the difference pertains to a pose of the user. As a result of that pose, the NPC approaches in one of many ways to interact with the player. The NPC remains between its initial starting point and the “IsClose” distance “as a result of” the pose distances.
Again, there is no teaching of an annulus in the Specification. Applicant argues matter that is not in the claims.
Applicant's argument is unpersuasive.
The rejections stand.
Argument 3
Further, Applicant respectfully submits claims 5, 12 and 19 are not anticipated by Dehesa. For example, claim 5 further recites:
"The apparatus as recited in claim 1, wherein the processing circ