Prosecution Insights
Last updated: April 19, 2026
Application No. 18/454,990

CONTROL APPARATUS, CONTROL SYSTEM, CONTROL METHOD, AND PROGRAM

Non-Final OA: §101, §103, §112
Filed: Aug 24, 2023
Examiner: YUAN, PETER LI
Art Unit: 2197
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Toyota Jidosha Kabushiki Kaisha
OA Round: 1 (Non-Final)
Grant Probability: Favorable
OA Rounds: 1-2
To Grant: 3y 3m

Examiner Intelligence

Career Allow Rate: 0% (0 granted / 0 resolved; -55.0% vs TC avg)
Interview Lift: +0.0% (minimal lift; based on resolved cases with interview)
Avg Prosecution: 3y 3m (typical timeline)
Total Applications: 10 across all art units (10 currently pending)

Statute-Specific Performance

§101: 29.0% (-11.0% vs TC avg)
§103: 52.6% (+12.6% vs TC avg)
§112: 13.2% (-26.8% vs TC avg)
Tech Center averages are estimates; based on career data from 0 resolved cases.

Office Action

§101 §103 §112
DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This Office Action is in response to the claims filed 08/24/2023. Claims 1-9 are pending.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

The term “near” in claims 1, 8, and 9 is a relative term which renders the claims indefinite. Claims 2-7 are also rejected because they depend on claim 1 and share the same deficiency. The term “near” is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. At best, the specification provides a plurality of exemplary ways to define “near”. However, the exemplary definitions also rely on relative terminology, such as the term “closest” at Page 11, Line 1, and therefore do not remedy the claims’ indefiniteness. For the purposes of compact prosecution, the examiner interprets “at least one other agent near the agent” to mean any other agent within the environment.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-9 are rejected under 35 U.S.C.
101 because the claimed invention recites a judicial exception, an abstract idea, that has not been integrated into a practical application, and the claims do not recite significantly more than the judicial exception. The examiner has evaluated the claims under the framework provided in the 2019 Patent Eligibility Guidance published in the Federal Register on 01/07/2019; the analysis is provided below.

Step 1: Claims 1-7 are directed to a control apparatus and fall within the statutory class of machine. Claim 8 is directed to a control system and falls within the statutory class of machine. Claim 9 is directed to a control method and falls within the statutory class of process. Therefore, “Are the claims to a process, machine, manufacture or composition of matter?” Yes.

Step 2A Prong 1: Claims 1, 8, and 9: The limitations of claims 1 and 8, “a request response processing unit, … , configured to calculate, based on observation information about the agent, at least one other agent near the agent, and the task, a request parameter as to whether or not to request help, and a response parameter as to whether or not to respond to a request from the at least one other agent”, “an importance processing unit, … , configured to perform processing for calculating, based on at least the request parameter of the at least one other agent and the response parameter of the agent, importance of each of the tasks for the agent”, and “a task selection unit, … , configured to select the task to be performed by the agent according to the importance”, as drafted, recite a process that, under its broadest reasonable interpretation, covers performance of the limitations in the mind. Claim 9 recites the same limitations in the statutory class of process. Calculating a value based on an observation obtained from the environment and selecting a task based on a value involve the mental process of observing and then forming a judgment.
The recited actions are understood to be performed by a processor, but could also be performed entirely in the mind. Therefore, Yes, claims 1, 8, and 9 recite a judicial exception. Step 2A Prong 2 will evaluate whether the claims are directed to the judicial exception.

Step 2A Prong 2: Claims 1, 8, and 9: The judicial exception is not integrated into a practical application. Claims 1 and 8 recite the following additional elements: “hardware, including at least one memory configured to store a computer program and at least one processor configured to execute the computer program”, “a task execution unit, implemented by hardware, configured to control the agent so that it performs the selected task”, and “implemented by hardware” in the limitations analyzed in Step 2A Prong 1. All of these additional elements are mere recitations of generic computing components and functions used as a tool to apply the abstract idea (MPEP § 2106.05(f)). Claims 1, 8, and 9 also recite the additional elements “a control apparatus configured to control an agent configured to perform a task”, “a control system configured to control a plurality of agents in a distributed manner, each of the plurality of agents being configured to perform a task”, “a control method for controlling agent configured to perform a task”, “wherein the larger a number of agents that perform the task is, the greater a possibility that a target for the task will be achieved increases”, and “there are a plurality of tasks in an environment”. These additional elements are considered to be field of use/technological environment (MPEP § 2106.05(h)). The recited additional elements do not integrate the judicial exception into a practical application.
Therefore, “Do the claims recite additional elements that integrate the judicial exception into a practical application?” No; these additional elements do not integrate the abstract idea into a practical application and do not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea. Having evaluated the inquiries set forth in Step 2A Prongs 1 and 2, it is concluded that claims 1, 8, and 9 not only recite a judicial exception but are directed to the judicial exception, as the judicial exception has not been integrated into a practical application.

Step 2B: Claims 1, 8, and 9: The claims do not include additional elements, alone or in combination, that are sufficient to amount to significantly more than the judicial exception. As discussed above, the additional elements amount only to generic computing components and functions used as a tool to apply the abstract idea, and to a field of use/technological environment. Therefore, “Do the claims recite additional elements that amount to significantly more than the judicial exception?” No; these additional elements, alone or in combination, do not amount to significantly more than the judicial exception. Having concluded the analysis within the provided framework, claims 1, 8, and 9 do not recite eligible subject matter under 35 U.S.C. § 101.

With regard to claim 2, it recites “the request response processing unit calculates the request parameter and the response parameter based on a respective one of the plurality of policies”. The claim further limits the judicial exception recited in claim 1, so the limitation is also considered a mental process: calculating parameters based on a policy involves observing, understanding, and calculating, and is still a process that can be performed entirely in the mind. Therefore, claim 2 recites a judicial exception and fails Step 2A Prong 1.
Claim 2 also recites an additional element, “wherein a plurality of policies are learned for a plurality of agents, respectively”, which is considered an insignificant extra-solution activity (MPEP § 2106.05(g)) because the agents are acquiring the policy. The limitation does not integrate the judicial exception into a practical application, so the claim fails Step 2A Prong 2. Lastly, when reexamining the additional element for an inventive concept that is significantly more, claim 2 does not add an inventive concept beyond what is well understood, routine, and conventional in the field. MPEP § 2106.05(d)(II) lists “receiving or transmitting data over a network” as a well understood, routine, and conventional computer function, and the policy is considered the data that is received, so the claim fails Step 2B. Therefore, claim 2 does not recite patent-eligible subject matter under 35 U.S.C. § 101.

With regard to claim 3, it recites “wherein the request response processing unit calculates the request parameter and the response parameter based on a request level and a response level, respectively”. The limitation further limits the judicial exception recited in claims 1 and 2, so it is also considered a mental process. Therefore, claim 3 recites a judicial exception and fails Step 2A Prong 1. Claim 3 also recites “the request level and the response level being output from the one of the plurality of policies by inputting the observation information into the one of the plurality of policies”. Under its broadest reasonable interpretation, this limitation is considered an insignificant extra-solution activity (MPEP § 2106.05(g)) because it is mere data gathering/transmission: inputting data and collecting outputted data is data gathering/transmission. The limitation does not integrate the judicial exception into a practical application, so claim 3 fails Step 2A Prong 2.
When reevaluating the insignificant extra-solution activity for an inventive concept that is significantly more, claim 3 does not add an inventive concept beyond what is well understood, routine, and conventional in the field. MPEP § 2106.05(d)(II) lists “receiving or transmitting data over a network” as a well understood, routine, and conventional computer function. The limitation fails Step 2B. Therefore, claim 3 does not recite patent-eligible subject matter under 35 U.S.C. § 101.

With regard to claim 4, it recites “wherein the request response processing unit calculates the request parameter indicating that help should be requested when the request level exceeds a predetermined threshold and the task that the agent is performing or about to perform is not proceeding”. The claim further limits the judicial exception recited in claims 1, 2, and 3, so this limitation is also considered a mental process involving observing, understanding, and calculating. Therefore, claim 4 recites a judicial exception and fails Step 2A Prong 1. Claim 4 does not include any other additional elements that integrate the judicial exception into a practical application, so claim 4 fails Step 2A Prong 2. Additionally, there are no additional elements that amount to something significantly more, so the claim fails Step 2B. Therefore, claim 4 does not recite patent-eligible subject matter under 35 U.S.C. § 101.

With regard to claim 5, it recites “wherein the request response processing unit calculates the response parameter indicating that the request should be responded to when the response level exceeds a predetermined threshold and the task that the agent is performing or about to perform is not proceeding”. The claim further limits the judicial exception recited in claims 1, 2, and 3, so this limitation is also considered a mental process involving observing, understanding, and calculating. Therefore, claim 5 recites a judicial exception and fails Step 2A Prong 1.
Claim 5 does not include any other additional elements that integrate the judicial exception into a practical application, so claim 5 fails Step 2A Prong 2. Lastly, the additional elements, alone or in combination, do not amount to something significantly more, so the claim fails Step 2B. Therefore, claim 5 does not recite patent-eligible subject matter under 35 U.S.C. § 101.

With regard to claim 6, it recites “the importance processing unit calculates importance of each of the tasks for the agent based on the one of the plurality of policies that has been learned for that agent”. This limitation further limits the judicial exception recited in claim 1, so it is also considered a mental process involving observing, understanding, and calculating. Therefore, claim 6 recites a judicial exception and fails Step 2A Prong 1. Claim 6 also recites an additional element, “wherein a plurality of policies are learned for a plurality of agents”, which is considered an insignificant extra-solution activity (MPEP § 2106.05(g)) because the agents are acquiring the policy. The additional element does not integrate the judicial exception into a practical application, so the claim fails Step 2A Prong 2. Lastly, when reexamining the additional element for an inventive concept that is significantly more, claim 6 does not add an inventive concept beyond what is well understood, routine, and conventional in the field. MPEP § 2106.05(d)(II) lists “receiving or transmitting data over a network” as a well understood, routine, and conventional computer function, and the policy is considered the data that is received. Therefore, claim 6 does not recite patent-eligible subject matter under 35 U.S.C. § 101.

With regard to claim 7, it recites “wherein the importance processing unit calculates, based on a target value of importance of the task corresponding to the observation information, the importance of the task corresponding to the observation information for the agent”.
This limitation further limits the judicial exception recited in claims 1 and 6, so it is also considered a mental process involving observing, understanding, and calculating. Therefore, the claim fails Step 2A Prong 1. The claim also recites “the target value of the importance being output from the policy by inputting the observation information into the policy”. Under its broadest reasonable interpretation, this limitation is considered an insignificant extra-solution activity (MPEP § 2106.05(g)) because it is mere data gathering/transmission: inputting data and collecting outputted data is data gathering/transmission. The limitation does not integrate the judicial exception into a practical application, so claim 7 fails Step 2A Prong 2. When reevaluating the insignificant extra-solution activity for an inventive concept that is significantly more, claim 7 does not add an inventive concept beyond what is well understood, routine, and conventional in the field. MPEP § 2106.05(d)(II) lists “receiving or transmitting data over a network” as a well understood, routine, and conventional computer function. The limitation fails Step 2B. Therefore, claim 7 does not recite patent-eligible subject matter under 35 U.S.C. § 101.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1, 8, and 9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Torri et al., Pub. No. US 2020/0282549 A1 (hereafter Torri), in view of Orita et al., Pub. No. US 2008/0109114 A1 (hereafter Orita).

Regarding claim 1, Torri teaches a control apparatus configured to control an agent configured to perform a task (¶ [0046] states “the control device according to the present embodiment causes multiple robots 1 and 2 to cooperate with each other to execute one piece of work (task)”), wherein the larger a number of agents that perform the task is, the greater a possibility that a target for the task will be achieved increases (¶ [0047] states “in a case where the robot 1 is allocated with a task exceeding the ability (capability) that can be executed by the robot 1, the control device determines that the allocated task cannot be handled by the robot 1 alone”. ¶ [0049] states “Then, the control device causes the robots 1 and 2 to operate in cooperation with each other to cause the robots 1 and 2 to execute the task that cannot be executed by the robot 1 alone”); and there are a plurality of tasks in an environment (¶ [0055] states “The task management unit 110 manages tasks allocated to the robot 1”), and the control apparatus comprises: hardware, including at least one memory configured to store a computer program and at least one processor configured to execute the computer program (¶ [0139] states “As illustrated in FIG. 15, the control device 100 includes a central processing unit (CPU) 901, a read only memory (ROM) 902, a random access memory (RAM) 903”.
¶ [0140] states “The ROM 902 stores programs and arithmetic operation parameters used by the CPU 901, and the RAM 903 temporarily stores programs used in execution of the CPU 901”); a request response processing unit, implemented by the hardware, configured to calculate, based on observation information about the agent, at least one other agent near the agent, and the task, a request parameter as to whether or not to request help, and a response parameter as to whether or not to respond to a request from the at least one other agent (¶ [0055] states “the task management unit 110 manages the start time, the end time, and the execution period of a task which is allocated to the robot 1 and is to be executed (that is, a reserved state) or being executed. The task management unit 110 further refers to a database that covers the ability used for execution of each the tasks”. ¶ [0056] states “Specifically, the ability management unit 120 determines the capability indicating the ability that the robot 1 can execute on the basis of the ability of the hardware and software of the robot 1 and the state of the robot 1 at the time of the determination”. ¶ [0070] states “the help management unit 130 acquires the capability of the robot 2” and “the help management unit 130 compares the generated help list with the acquired capability of the robot 2 to determine whether or not the capability of the robot 2 satisfies the ability indicated in the help list". ¶ [0064] states “Furthermore, in a case where the capability of the robot 1 does not satisfy the ability necessary for execution of the task, the help management unit 130 determines the capability required for the other robot 2 to execute the task, and a help list indicating the ability is generated”. 
¶ [0071] states “the help management unit 130 may receive the determination result of the availability of cooperation from the robot 2 having been selected as the cooperation target, and thereby determine whether or not the capability of the robot 2 satisfies the ability indicated in the help list”. Examiner’s Notes: a first robot, a second robot, and the task are observed. In light of the 112(b) issue, the “other agent” that is “near” the first agent is interpreted as any other agent within the environment. If the first robot cannot complete the task, it requests help by generating a help list. This is interpreted to be part of the request parameter. The second robot calculates “determination result of the availability of cooperation”, which is interpreted to be the response parameter); an importance processing unit, implemented by the hardware, configured to perform processing for calculating, based on at least the request parameter of the at least one other agent and the response parameter of the agent, importance of each of the tasks for the agent (¶ [0064] states “Furthermore, in a case where the capability of the robot 1 does not satisfy the ability necessary for execution of the task, the help management unit 130 determines the capability required for the other robot 2 to execute the task, and a help list indicating the ability is generated”. ¶ [0071] states “the help management unit 130 may receive the determination result of the availability of cooperation from the robot 2 having been selected as the cooperation target, and thereby determine whether or not the capability of the robot 2 satisfies the ability indicated in the help list”. Examiner’s Notes: when a robot cannot complete a task, it begins to request help from another robot. 
When a robot sends the “determination result of the availability of cooperating”, this is the response parameter); a task selection unit, implemented by the hardware, configured to select the task to be performed by the agent according to the importance (¶ [0101] states “a task to be allocated to the first robot 1 is determined”. ¶ [0055] states “The task management unit 110 manages tasks allocated to the robot 1”); and a task execution unit, implemented by the hardware, configured to control the agent so that it performs the selected task (¶ [0104] states “the task is executed by cooperation between the first robot 1 and the second robot 2 (S123)”. ¶ [0096] states “The mechanism control unit 170 controls the overall operation of each mechanism of the robot 1” and “the mechanism control unit 170 executes the task instructed by the task management unit 110”). Torri does not explicitly teach calculating importance for each task and using importance for task selection. However, in an analogous art, Orita teaches an importance processing unit, implemented by the hardware, configured to perform processing for calculating, based on at least the request parameter of the at least one other agent and the response parameter of the agent, importance of each of the tasks for the agent (¶ [0094] states “The priority data generator 320 is configured to determine priority of tasks to be executed by the robots R”. ¶ [0097] states “determination of priority of each task is made with consideration given to: the importance of the task set at the time of registration of the task into the task information database 220; the distance between the start position of the task and the robot located closest to the task start position; and the time remaining until the start time or end time of the task from the present time”. Examiner’s Notes: the priority of the task is interpreted to be the importance of a task. Other parameters can be used to calculate priority). 
a task selection unit, implemented by the hardware, configured to select the task to be performed by the agent according to the importance (¶ [0158] states “The task schedule production unit 341 determines which robot R will be caused to carry out a task based on the priority determined for each task, and determines the order of execution of the tasks assigned to each robot R”).

It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine Orita’s task priority calculation based on importance, and use of priority to create a schedule, with the request and response parameters of Torri. A person having ordinary skill in the art would have been motivated to make this combination for the purpose of finding “an optimum value can be obtained as the priority P of the task” (Orita ¶ [0101]). Further, the priority is used to determine the execution order of tasks (Orita ¶ [0158] states “The task schedule production unit 341 determines which robot R will be caused to carry out a task based on the priority determined for each task, and determines the order of execution of the tasks assigned to each robot R”), which makes it “possible to control a plurality of robots to effectively perform a plurality of tasks even in the condition where an unexpected factor such as human interaction response exists” (Orita ¶ [0017]). Additionally, it would have been obvious to include the request and response parameters in determining the priority of the task because they correspond to robot needs and availability, respectively. Calculating the priority of a task can involve multiple parameters, and one of ordinary skill in the art would recognize that knowing robot needs and availability would contribute to calculating priority in an environment with multiple robots and multiple tasks.
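The claimed units map onto a simple control loop. As a minimal sketch, assuming purely illustrative names, threshold values, and an additive importance formula (none of which appear in the application or in Torri/Orita):

```python
# Hypothetical sketch of the control flow recited in claims 1, 4, and 5.
# Every identifier, threshold value, and formula here is an assumption made
# for illustration; the claims recite only functional units.

REQUEST_THRESHOLD = 0.5   # claim 4 recites only "a predetermined threshold"
RESPONSE_THRESHOLD = 0.5  # claim 5 likewise

def request_response_parameters(request_level, response_level, task_proceeding):
    """Request response processing unit: request help when the request level
    exceeds its threshold and the current task is not proceeding; decide to
    respond under the analogous condition on the response level."""
    request = request_level > REQUEST_THRESHOLD and not task_proceeding
    respond = response_level > RESPONSE_THRESHOLD and not task_proceeding
    return request, respond

def task_importance(base_importance, requests_from_others, own_response):
    """Importance processing unit: score each task from the other agents'
    request parameters and this agent's response parameter (the +1.0 boost
    for requested tasks is a made-up stand-in for the claimed calculation)."""
    return {
        task: base + (1.0 if own_response and requests_from_others.get(task) else 0.0)
        for task, base in base_importance.items()
    }

def select_task(importance):
    """Task selection unit: pick the task with the highest importance."""
    return max(importance, key=importance.get)

# A stalled agent whose response level clears the threshold prioritizes the
# task another agent has requested help with.
_, respond = request_response_parameters(0.8, 0.6, task_proceeding=False)
chosen = select_task(task_importance({"carry": 0.3, "patrol": 0.5},
                                     {"carry": True}, respond))
```

Here `chosen` comes out as `"carry"`: under these assumed numbers, the other agent's outstanding request outweighs the routine task's higher base importance.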
Regarding claim 8, Torri teaches a control system configured to control a plurality of agents in a distributed manner, each of the plurality of agents being configured to perform a task (¶ [0007] states “a control system that enable a robot to flexibly cooperate with another robot in order to execute an allocated task in an environment where the situation changes dynamically”), wherein the larger a number of agents that perform the task is, the greater a possibility that a target for the task will be achieved increases (¶ [0047] states “in a case where the robot 1 is allocated with a task exceeding the ability (capability) that can be executed by the robot 1, the control device determines that the allocated task cannot be handled by the robot 1 alone”. ¶ [0049] states “Then, the control device causes the robots 1 and 2 to operate in cooperation with each other to cause the robots 1 and 2 to execute the task that cannot be executed by the robot 1 alone”); and there are a plurality of tasks in an environment (¶ [0055] states “The task management unit 110 manages tasks allocated to the robot 1”), the control system comprises a plurality of control apparatuses, each of the plurality of control apparatuses being configured to control a respective one of the plurality of agents, (¶ [0007] states “a control system that enable a robot to flexibly cooperate with another robot in order to execute an allocated task in an environment where the situation changes dynamically”. ¶ [0053] states “it is assumed that the control device 100 is included in the robot 1”), Torri does not explicitly teach that each robot has the control device. However, it would have been obvious to try. The control device serves to solve the problem of controlling the robot’s action and communications. 
¶ [0054] lists that “the control device 100 includes a task management unit 110, an ability management unit 120, a help management unit 130, a robot management unit 140, a communication unit 160, a cooperation management unit 150, a mechanism control unit 170, and a notification unit 180”. The control device can be implemented only in a finite number of identified and predictable potential solutions. These include: implemented outside of the robots in a centralized device, implemented within one robot, implemented within multiple robots, or implemented across multiple robots in a distributed manner. One of ordinary skill in the art could have pursued implementing the control device in each robot of the system with a reasonable expectation of success because all robots are expected to determine their own capability, communicate with other robots, and perform actions. The difference between robots 1 and 2 in Torri is that they have different abilities (¶ [0064], [0069]). In other words, there is no limitation that only one robot could possess a control device. By having a control device in each robot, any one robot could initiate a request for help. The benefit of doing so is that it contributes to causing “a robot to execute an allocated task flexibly by causing the robot to cooperate with another robot in an environment where the situation changes dynamically” (¶ [0012]).

and each of the plurality of control apparatuses comprises: hardware, including at least one memory configured to store a computer program and at least one processor configured to execute the computer program (¶ [0139] states “As illustrated in FIG. 15, the control device 100 includes a central processing unit (CPU) 901, a read only memory (ROM) 902, a random access memory (RAM) 903”.
¶ [0140] states “The ROM 902 stores programs and arithmetic operation parameters used by the CPU 901, and the RAM 903 temporarily stores programs used in execution of the CPU 901”); a request response processing unit, implemented by the hardware, configured to calculate, based on observation information about the agent controlled by that control apparatus, at least one other agent near the agent, and the task, a request parameter as to whether or not to request help, and a response parameter as to whether or not to respond to a request from the at least one other agent (¶ [0055] states “the task management unit 110 manages the start time, the end time, and the execution period of a task which is allocated to the robot 1 and is to be executed (that is, a reserved state) or being executed. The task management unit 110 further refers to a database that covers the ability used for execution of each the tasks”. ¶ [0056] states “Specifically, the ability management unit 120 determines the capability indicating the ability that the robot 1 can execute on the basis of the ability of the hardware and software of the robot 1 and the state of the robot 1 at the time of the determination”. ¶ [0070] states “the help management unit 130 acquires the capability of the robot 2” and “the help management unit 130 compares the generated help list with the acquired capability of the robot 2 to determine whether or not the capability of the robot 2 satisfies the ability indicated in the help list". ¶ [0064] states “Furthermore, in a case where the capability of the robot 1 does not satisfy the ability necessary for execution of the task, the help management unit 130 determines the capability required for the other robot 2 to execute the task, and a help list indicating the ability is generated”. 
¶ [0071] states “the help management unit 130 may receive the determination result of the availability of cooperation from the robot 2 having been selected as the cooperation target, and thereby determine whether or not the capability of the robot 2 satisfies the ability indicated in the help list”. Examiner’s Notes: a first robot, a second robot, and the task are observed. In light of the 112(b) issue, the “other agent” that is “near” the first agent is interpreted as any other agent within the environment. If the first robot cannot complete the task, it requests help by generating a help list. This is interpreted to be part of the request parameter. The second robot calculates “determination result of the availability of cooperation”, which is interpreted to be the response parameter); an importance processing unit, implemented by the hardware, configured to perform processing for calculating, based on at least the request parameter of the at least one other agent and the response parameter of the agent, importance of each of the tasks for the agent (¶ [0064] states “Furthermore, in a case where the capability of the robot 1 does not satisfy the ability necessary for execution of the task, the help management unit 130 determines the capability required for the other robot 2 to execute the task, and a help list indicating the ability is generated”. ¶ [0071] states “the help management unit 130 may receive the determination result of the availability of cooperation from the robot 2 having been selected as the cooperation target, and thereby determine whether or not the capability of the robot 2 satisfies the ability indicated in the help list”. Examiner’s Notes: when a robot cannot complete a task, it begins to request help from another robot. 
When a robot sends the “determination result of the availability of cooperation”, this is the response parameter); a task selection unit, implemented by the hardware, configured to select the task to be performed by the agent according to the importance (¶ [0101] states “a task to be allocated to the first robot 1 is determined”. ¶ [0055] states “The task management unit 110 manages tasks allocated to the robot 1”); and a task execution unit, implemented by the hardware, configured to control the agent so that it performs the selected task (¶ [0104] states “the task is executed by cooperation between the first robot 1 and the second robot 2 (S123)”. ¶ [0096] states “The mechanism control unit 170 controls the overall operation of each mechanism of the robot 1” and “the mechanism control unit 170 executes the task instructed by the task management unit 110”). Torri does not explicitly teach calculating importance for each task and using importance for task selection. However, in an analogous art, Orita teaches an importance processing unit, implemented by the hardware, configured to perform processing for calculating, based on at least the request parameter of the at least one other agent and the response parameter of the agent, importance of each of the tasks for the agent (¶ [0094] states “The priority data generator 320 is configured to determine priority of tasks to be executed by the robots R”. ¶ [0097] states “determination of priority of each task is made with consideration given to: the importance of the task set at the time of registration of the task into the task information database 220; the distance between the start position of the task and the robot located closest to the task start position; and the time remaining until the start time or end time of the task from the present time”. Examiner’s Notes: the priority of the task is interpreted to be the importance of a task. Other parameters can be used to calculate priority). 
a task selection unit, implemented by the hardware, configured to select the task to be performed by the agent according to the importance (¶ [0158] states “The task schedule production unit 341 determines which robot R will be caused to carry out a task based on the priority determined for each task, and determines the order of execution of the tasks assigned to each robot R”); It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine task priority calculation based on importance and using priority to create a schedule of Orita with the request and response parameter of Torri. A person having ordinary skill in the art would have been motivated to make this combination for the purpose of finding “an optimum value can be obtained as the priority P of the task” (Orita ¶ [0101]). Further, the priority is used to determine the execution order of tasks (Orita ¶ [0158] states “The task schedule production unit 341 determines which robot R will be caused to carry out a task based on the priority determined for each task, and determines the order of execution of the tasks assigned to each robot R”) which makes it “possible to control a plurality of robots to effectively perform a plurality of tasks even in the condition where an unexpected factor such as human interaction response exists” (Orita ¶ [0017]). Additionally, it would have been obvious to include the request and response parameters in determining the priority of the task because they correspond to robot needs and availability respectively. Calculating priority of the task can involve multiple parameters. One of ordinary skill in the art would acknowledge that knowing robot needs and availability would contribute to calculating priority in an environment with multiple robots and multiple tasks. 
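The combination the examiner proposes for claim 1 (Torri's help-list comparison supplying request/response parameters, Orita's priority driving task selection) can be sketched as a toy model. This is an illustrative sketch only, not code from any cited reference; every class, function, and parameter name below is hypothetical, and the importance boost is an assumed stand-in for the unspecified way the parameters would feed the priority calculation.

```python
# Hypothetical sketch of the examiner's proposed Torri + Orita combination.
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    required_abilities: set      # abilities needed to execute the task
    base_importance: float       # cf. Orita's importance set at registration

@dataclass
class Agent:
    name: str
    capability: set              # abilities the agent can currently execute

    def request_parameter(self, task: Task) -> bool:
        # Request help iff own capability does not cover the task
        # (Torri's help-list generation, read as part of this parameter).
        return not task.required_abilities <= self.capability

    def response_parameter(self, help_list: set) -> bool:
        # Respond iff own capability satisfies the received help list
        # (Torri's "determination result of the availability of cooperation").
        return help_list <= self.capability

def importance(task: Task, requester: Agent, responder: Agent) -> float:
    # Assumed combination: boost a task's importance when another agent
    # requests help and this agent can respond.
    boost = 1.0
    if requester.request_parameter(task):
        help_list = task.required_abilities - requester.capability
        if responder.response_parameter(help_list):
            boost = 2.0
    return task.base_importance * boost

# Task selection: pick the task with the highest computed importance.
tasks = [Task("carry", {"lift", "move"}, 1.0), Task("scan", {"camera"}, 1.5)]
robot1 = Agent("robot1", {"move"})           # cannot lift, so it requests help
robot2 = Agent("robot2", {"lift", "move"})   # can satisfy the help list
chosen = max(tasks, key=lambda t: importance(t, robot1, robot2))
print(chosen.name)  # "carry": the helped task outranks the unhelped one
```

The set-containment tests play the role of Torri's capability-versus-help-list comparison; the scalar boost is merely one plausible way the request and response parameters could enter Orita's priority, as the rejection suggests.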
Regarding claim 9, Torri teaches a control method for controlling an agent configured to perform a task (¶ [0007] states “the present disclosure proposes a new and improved control device, a control method, and a control system that enable a robot to flexibly cooperate with another robot in order to execute an allocated task in an environment where the situation changes dynamically”), wherein the larger a number of agents that perform the task is, the greater a possibility that a target for the task will be achieved increases (¶ [0047] states “in a case where the robot 1 is allocated with a task exceeding the ability (capability) that can be executed by the robot 1, the control device determines that the allocated task cannot be handled by the robot 1 alone”. ¶ [0049] states “Then, the control device causes the robots 1 and 2 to operate in cooperation with each other to cause the robots 1 and 2 to execute the task that cannot be executed by the robot 1 alone”); and there are a plurality of tasks in an environment (¶ [0055] states “The task management unit 110 manages tasks allocated to the robot 1”), and the control method comprises: calculating, based on observation information about the agent, at least one other agent near the agent, and the task, a request parameter as to whether or not to request help, and a response parameter as to whether or not to respond to a request from the at least one other agent (¶ [0055] states “the task management unit 110 manages the start time, the end time, and the execution period of a task which is allocated to the robot 1 and is to be executed (that is, a reserved state) or being executed. The task management unit 110 further refers to a database that covers the ability used for execution of each the tasks”. 
¶ [0056] states “Specifically, the ability management unit 120 determines the capability indicating the ability that the robot 1 can execute on the basis of the ability of the hardware and software of the robot 1 and the state of the robot 1 at the time of the determination”. ¶ [0070] states “the help management unit 130 acquires the capability of the robot 2” and “the help management unit 130 compares the generated help list with the acquired capability of the robot 2 to determine whether or not the capability of the robot 2 satisfies the ability indicated in the help list”. ¶ [0064] states “Furthermore, in a case where the capability of the robot 1 does not satisfy the ability necessary for execution of the task, the help management unit 130 determines the capability required for the other robot 2 to execute the task, and a help list indicating the ability is generated”. ¶ [0071] states “the help management unit 130 may receive the determination result of the availability of cooperation from the robot 2 having been selected as the cooperation target, and thereby determine whether or not the capability of the robot 2 satisfies the ability indicated in the help list”. Examiner’s Notes: a first robot, a second robot, and the task are observed. In light of the 112(b) issue, the “other agent” that is “near” the first agent is interpreted as any other agent within the environment. If the first robot cannot complete the task, it requests help by generating a help list. This is interpreted to be part of the request parameter. 
The second robot calculates “determination result of the availability of cooperation”, which is interpreted to be the response parameter); performing processing for calculating, based on at least the request parameter of the at least one other agent and the response parameter of the agent, importance of each of the tasks for the agent (¶ [0064] states “Furthermore, in a case where the capability of the robot 1 does not satisfy the ability necessary for execution of the task, the help management unit 130 determines the capability required for the other robot 2 to execute the task, and a help list indicating the ability is generated”. ¶ [0071] states “the help management unit 130 may receive the determination result of the availability of cooperation from the robot 2 having been selected as the cooperation target, and thereby determine whether or not the capability of the robot 2 satisfies the ability indicated in the help list”. Examiner’s Notes: when a robot cannot complete a task, it begins to request help from another robot. When a robot sends the “determination result of the availability of cooperation”, this is the response parameter); selecting the task to be performed by the agent according to the importance (¶ [0101] states “a task to be allocated to the first robot 1 is determined”. ¶ [0055] states “The task management unit 110 manages tasks allocated to the robot 1”); and controlling the agent so that it performs the selected task (¶ [0104] states “the task is executed by cooperation between the first robot 1 and the second robot 2 (S123)”. ¶ [0096] states “The mechanism control unit 170 controls the overall operation of each mechanism of the robot 1” and “the mechanism control unit 170 executes the task instructed by the task management unit 110”). Torri does not explicitly teach calculating importance for each task and using importance for task selection. 
However, in an analogous art, Orita teaches performing processing for calculating, based on at least the request parameter of the at least one other agent and the response parameter of the agent, importance of each of the tasks for the agent (¶ [0094] states “The priority data generator 320 is configured to determine priority of tasks to be executed by the robots R”. ¶ [0097] states “determination of priority of each task is made with consideration given to: the importance of the task set at the time of registration of the task into the task information database 220; the distance between the start position of the task and the robot located closest to the task start position; and the time remaining until the start time or end time of the task from the present time”. Examiner’s Notes: the priority of the task is interpreted to be the importance of a task. Other parameters can be used to calculate priority); selecting the task to be performed by the agent according to the importance (¶ [0158] states “The task schedule production unit 341 determines which robot R will be caused to carry out a task based on the priority determined for each task, and determines the order of execution of the tasks assigned to each robot R”); It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine task priority calculation based on importance and using priority to create a schedule of Orita with the request and response parameter of Torri. A person having ordinary skill in the art would have been motivated to make this combination for the purpose of finding “an optimum value can be obtained as the priority P of the task” (Orita ¶ [0101]). 
Further, the priority is used to determine the execution order of tasks (Orita ¶ [0158] states “The task schedule production unit 341 determines which robot R will be caused to carry out a task based on the priority determined for each task, and determines the order of execution of the tasks assigned to each robot R”) which makes it “possible to control a plurality of robots to effectively perform a plurality of tasks even in the condition where an unexpected factor such as human interaction response exists” (Orita ¶ [0017]). Additionally, it would have been obvious to include the request and response parameters in determining the priority of the task because they correspond to robot needs and availability respectively. Calculating priority of the task can involve multiple parameters. One of ordinary skill in the art would acknowledge that knowing robot needs and availability would contribute to calculating priority in an environment with multiple robots and multiple tasks. Claim(s) 2, 3, 6, and 7 is/are rejected under 35 U.S.C. 103 as being unpatentable over Torri in view of Orita and further in view of Silver et al., US Pub. No. US 20200244707 A1 (hereafter Silver). Regarding claim 2, Torri and Orita teach the control apparatus according to claim 1. Additionally, Torri teaches wherein a plurality of policies are learned for a plurality of agents, respectively, and the request response processing unit calculates the request parameter and the response parameter based on a respective one of the plurality of policies (¶ [0064] states “Furthermore, in a case where the capability of the robot 1 does not satisfy the ability necessary for execution of the task, the help management unit 130 determines the capability required for the other robot 2 to execute the task, and a help list indicating the ability is generated”. 
¶ [0071] states “the help management unit 130 may receive the determination result of the availability of cooperation from the robot 2 having been selected as the cooperation target, and thereby determine whether or not the capability of the robot 2 satisfies the ability indicated in the help list”. Examiner’s Notes: the request parameter is calculated based on whether the first robot can complete the task. The response parameter is calculated based on whether the second robot can complete the task. The process of comparing robot capabilities and task requirements to calculate the request and response parameter is considered to be the policy). Although Torri and Orita teach calculating response and request parameters based on one policy, they do not teach calculating the parameters based on the respective learned policies for each agent. However, in an analogous art, Silver teaches wherein a plurality of policies are learned for a plurality of agents, respectively, and the request response processing unit calculates the request parameter and the response parameter based on a respective one of the plurality of policies (¶ [0008] states “generating training data for the learner policy by causing a first agent controlled using the learner policy to perform the particular task while interacting with one or more second agents, where each second agent is controlled by a respective one of the selected policies; and updating the respective set of policy parameters that define the learner policy by training the learner policy on the training data through reinforcement learning to optimize a reinforcement learning loss function for the learner policy”). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine agents learning respective policies of Silver with the response and request parameter of Torri and Orita resulting in a system where the request and response parameter is dependent on the policy of the specific agent. 
A person having ordinary skill in the art would have been motivated to make this combination so that agents can achieve improved performance on a task (Silver ¶ [0006] states “techniques for reinforcement learning which use interactions between agents to achieve better final performance on a task. The agents may interact cooperatively or competitively”). It would be obvious that the learned policies would become optimized for the respective agent. Regarding claim 3, Torri, Orita, and Silver teach the control apparatus according to claim 2. Additionally, Torri teaches wherein the request response processing unit calculates the request parameter and the response parameter based on a request level and a response level, respectively, the request level and the response level being output from the one of the plurality of policies by inputting the observation information into the one of the plurality of policies (see ¶ [0064] and [0071] and explanation above in rejection of claim 1 for request and response parameter. See ¶ [0055], [0056] and [0070] and explanation above in rejection of claim 1 for observation information). Additionally, Silver teaches wherein the request response processing unit calculates the request parameter and the response parameter based on a request level and a response level, respectively, the request level and the response level being output from the one of the plurality of policies by inputting the observation information into the one of the plurality of policies (¶ [0042] states “The network output includes an action selection output and, in some cases, a predicted expected return output. The action selection output defines an action selection policy for selecting an action to be performed by the agent in response to the input observation”. ¶ [0043] states “In some cases, the action selection output defines a probability distribution over possible actions to be performed by the agent. 
For example, the action selection output can include a respective action probability for each action in a set of possible actions that can be performed by the agent to interact with the environment”. Examiner’s Note: the probabilities associated with possible actions are considered to be the “level”. The request and response parameters are considered to be possible actions that could be output by the policy. Therefore, the request level and response level are the probabilities associated with the request and response parameters). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine the policy outputting a probability associated with a specific action of Silver with the request and response parameter of Torri and Orita. It would be obvious that the probability of requesting help or responding to a help request would be taken into consideration when determining to request help or respond to a help request. A person having ordinary skill in the art would have been motivated to make this combination so that agents can achieve improved performance on a task (Silver ¶ [0006] states “techniques for reinforcement learning which use interactions between agents to achieve better final performance on a task. The agents may interact cooperatively or competitively”). Additionally, the output probabilities contribute to the goal of allowing “an agent (e.g., agent 102A) to better perform the particular task by more effectively interacting with the environment 104, with the other agents (e.g., agents 102B-N) in the environment 104, or both” (Silver ¶ [0047]). Regarding claim 6, Torri and Orita teach the control apparatus according to claim 1. 
Additionally, Orita teaches wherein a plurality of policies are learned for a plurality of agents, respectively, and the importance processing unit calculates importance of each of the tasks for the agent based on the one of the plurality of policies that has been learned for that agent (¶ [0098] states “the priority data generator 320 determines the priority of each task by calculating the priority P of the task using Equation (1) below: P = (T_pri + n(T_sp))f(T_err) (1)”. Examiner’s Notes: the priority is interpreted to be the importance of a task. The equation is considered a policy). Although Torri and Orita teach calculating priority based on a single policy, Torri and Orita do not explicitly teach calculating importance based on the agent’s respective learned policy. However, in an analogous art, Silver teaches wherein a plurality of policies are learned for a plurality of agents, respectively, and the importance processing unit calculates importance of each of the tasks for the agent based on the one of the plurality of policies that has been learned for that agent (¶ [0008] states “generating training data for the learner policy by causing a first agent controlled using the learner policy to perform the particular task while interacting with one or more second agents, where each second agent is controlled by a respective one of the selected policies; and updating the respective set of policy parameters that define the learner policy by training the learner policy on the training data through reinforcement learning to optimize a reinforcement learning loss function for the learner policy”). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine agents learning respective policies of Silver with the priority calculation based on policy of Torri and Orita to create a system where each agent has a different priority calculation based on the learned policy. 
A person having ordinary skill in the art would have been motivated to make this combination so that agents can achieve improved performance on a task (Silver ¶ [0006] states “techniques for reinforcement learning which use interactions between agents to achieve better final performance on a task. The agents may interact cooperatively or competitively”). It would be obvious that the learned policies would become optimized for the respective agent. With regard to claim 7, Torri, Orita, and Silver teach the control apparatus according to claim 6. Additionally, Orita teaches wherein the importance processing unit calculates, based on a target value of importance of the task corresponding to the observation information, the importance of the task corresponding to the observation information for the agent, the target value of the importance being output from the policy by inputting the observation information into the policy (¶ [0097] states “determination of priority of each task is made with consideration given to: the importance of the task set at the time of registration of the task into the task information database 220; the distance between the start position of the task and the robot located closest to the task start position; and the time remaining until the start time or end time of the task from the present time”. Examiner’s Notes: the “importance of the task set at the time of registration” of Orita is interpreted to be the “target value of importance”. The “priority” of Orita of each task is interpreted to be the “importance”). 
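Orita's Equation (1), quoted earlier, combines the registered importance, a distance term, and a time term. A minimal numeric sketch follows; the quoted passages do not define the functions n and f, so the monotone placeholders below are assumptions for illustration only, not Orita's actual definitions.

```python
# Toy rendering of Orita's Equation (1): P = (T_pri + n(T_sp)) * f(T_err).
# n(.) and f(.) are undefined in the quoted passages; these placeholders
# simply encode "closer raises priority" and "less time raises priority".

def n(distance_m: float) -> float:
    # Placeholder: priority contribution decays with distance to the task.
    return 1.0 / (1.0 + distance_m)

def f(time_remaining_s: float) -> float:
    # Placeholder: priority multiplier grows as the deadline approaches.
    return 1.0 / (1.0 + time_remaining_s / 60.0)

def priority(t_pri: float, t_sp: float, t_err: float) -> float:
    """P = (T_pri + n(T_sp)) * f(T_err), per Orita Equation (1)."""
    return (t_pri + n(t_sp)) * f(t_err)

# A near, urgent task outranks a distant, relaxed one of equal importance.
urgent = priority(t_pri=1.0, t_sp=1.0, t_err=30.0)
relaxed = priority(t_pri=1.0, t_sp=10.0, t_err=600.0)
print(urgent > relaxed)  # True
```

Any monotone choices for n and f preserve the qualitative behavior the examiner relies on: priority rises for important, nearby, time-critical tasks, and the resulting scalar can then drive task selection.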
Additionally, Silver teaches wherein the importance processing unit calculates, based on a target value of importance of the task corresponding to the observation information, the importance of the task corresponding to the observation information for the agent, the target value of the importance being output from the policy by inputting the observation information into the policy (¶ [0042] states “The network output includes an action selection output and, in some cases, a predicted expected return output. The action selection output defines an action selection policy for selecting an action to be performed by the agent in response to the input observation”. ¶ [0043] – [0046] explain other possible outputs. ¶ [0044] states “In some other cases, the action selection output includes a respective action-value estimate (e.g., Q value) for each of a plurality of possible actions”. Examiner’s Notes: the network creates the outputs for the policy. Agent actions are one kind of output. There is evidence, such as the Q value, that suggests other values can also be output from the network. It would be obvious that the “target value of importance”, or the “importance” of Orita, could be an output). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine the additional policy network outputs based on observations of Silver with the priority being calculated with respect to “importance” of Orita resulting in a system where the policy network also outputs Orita’s “importance”. A person having ordinary skill in the art would have been motivated to make this combination because “the action selection output identifies an optimal action from the set of possible actions to be performed by the agent in response to the observation” (Silver ¶ [0045]). 
It would be obvious to one of ordinary skill in the art that having the importance of a task, which is then used in priority calculations, would also help achieve the goal of identifying optimal actions. Identifying the optimal actions supports “interactions between agents to achieve better final performance on a task” (Silver ¶ [0006]). Claim(s) 4 and 5 is/are rejected under 35 U.S.C. 103 as being unpatentable over Torri in view of Orita and Silver and further in view of Nikou et al., US Pub. No. US 20240311687 A1 (hereafter Nikou). Regarding claim 4, Torri, Orita, and Silver teach the control apparatus according to claim 3. Additionally, Torri teaches wherein the request response processing unit calculates the request parameter indicating that help should be requested when the request level exceeds a predetermined threshold and the task that the agent is performing or about to perform is not proceeding (¶ [0102] states “On the other hand, if the task cannot be executed by the first robot 1 alone, the help management unit 130 generates a help list indicating the ability required for execution of the task (S109)”. Examiner’s Note: the help management unit is generating a help list as part of a request for help). Torri, Orita, and Silver do not explicitly teach a level exceeding a predetermined threshold. However, in an analogous art, Nikou teaches wherein the request response processing unit calculates the request parameter indicating that help should be requested when the request level exceeds a predetermined threshold and the task that the agent is performing or about to perform is not proceeding (¶ [0037] states “The predetermined probability threshold may be set depending on the specific system configuration and environment, taking into account factors such as the severity of an intent being violated”. 
¶ [0012] states “The step of selecting an action may further comprise determining if any actions, from the one or more suggested actions obtained from the policy, have a probability above a predetermined threshold of causing a criterion from among the one or more criteria to be satisfied by the environment to be violated, based on the combinations of a CMDP output state and a logic state”. ¶ [0038] states “an action to be performed on the environment may then be selected from the one or more available actions suggested by a policy, taking into account the actions that are blocked based on the intent (product of the logic states and CMDP output states)”. Examiner’s Notes: the probability associated with an action is evaluated against a predetermined probability threshold. If it crosses that threshold, the corresponding action is removed. It would be obvious that the action could be to “not request help”. If the probability associated with that action crosses the predetermined threshold, the “not request help” action would be removed, which could signal the agent to request help. Inverting the action and/or the probabilities would be obvious). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine the probability threshold to determine available actions of Nikou with the request probability and request parameter calculation of Torri, Orita, and Silver resulting in a system where a probability and threshold are used in calculating whether to request help or not. A person having ordinary skill in the art would have been motivated to make this combination to “improve the reliability of the actions selected for implementation in the environment, and may also provide increased flexibility and control over the RL agent decision making and/or training” (Nikou ¶ [0010]). Regarding claim 5, Torri, Orita, and Silver teach the control apparatus according to claim 3. 
Additionally, Torri teaches wherein the request response processing unit calculates the response parameter indicating that the request should be responded to when the response level exceeds a predetermined threshold and the task that the agent is performing or about to perform is not proceeding (¶ [0103] states “In the second robot 2 that has received the help list, a capability list indicating the capability of the second robot 2 as of the current time is generated. (S115). Thereafter, the second robot 2 compares the capability of the second robot 2 with the received help list to determine whether or not a help can be provided (S117)”. ¶ [0102] states “On the other hand, if the task cannot be executed by the first robot 1 alone, the help management unit 130 generates a help list indicating the ability required for execution of the task (S109)”. Examiner’s Notes: the response parameter calculation happens when the other agent’s task cannot be done by the other agent). Torri, Orita, and Silver do not explicitly teach a level exceeding a predetermined threshold. However, in an analogous art, Nikou teaches wherein the request response processing unit calculates the response parameter indicating that the request should be responded to when the response level exceeds a predetermined threshold and the task that the agent is performing or about to perform is not proceeding (¶ [0037] states “The predetermined probability threshold may be set depending on the specific system configuration and environment, taking into account factors such as the severity of an intent being violated”. ¶ [0012] states “The step of selecting an action may further comprise determining if any actions, from the one or more suggested actions obtained from the policy, have a probability above a predetermined threshold of causing a criterion from among the one or more criteria to be satisfied by the environment to be violated, based on the combinations of a CMDP output state and a logic state”. 
¶ [0038] states “an action to be performed on the environment may then be selected from the one or more available actions suggested by a policy, taking into account the actions that are blocked based on the intent (product of the logic states and CMDP output states)”. Examiner’s Notes: the probability associated with an action is evaluated against a predetermined probability threshold. If it crosses that threshold, the corresponding action is removed. Similar to claim 4, an obvious action could be to “not respond to a request for help”. If the probability associated with that action crosses the predetermined threshold, the “not respond to a request for help” action would be removed, which could signal the agent to respond to the help request. Inverting the action and/or the probabilities would be obvious). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date to combine the probability threshold to determine available actions of Nikou with the response parameter calculation of Torri, Orita, and Silver resulting in a system where a probability and threshold are used in calculating whether to respond to a help request or not. A person having ordinary skill in the art would have been motivated to make this combination to “improve the reliability of the actions selected for implementation in the environment, and may also provide increased flexibility and control over the RL agent decision making and/or training” (Nikou ¶ [0010]).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20230347503 A1 teaches Robots With Lift Mechanisms
US 12050438 B1 teaches Collaborative Intelligence Of Artificial Intelligence Agents
US 20230339108 A1 teaches Machine-Learned Robot Fleet Management For Value Chain Networks
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PETER L YUAN whose telephone number is (571)272-5737. 
The examiner can normally be reached Mon-Fri 7:30am-5pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bradley Teets, can be reached at 571-272-3338. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PETER LI YUAN/Examiner, Art Unit 2197
/BRADLEY A TEETS/Supervisory Patent Examiner, Art Unit 2197

Prosecution Timeline

Aug 24, 2023
Application Filed
Feb 23, 2026
Non-Final Rejection — §101, §103, §112 (current)


Prosecution Projections

1-2
Expected OA Rounds
Grant Probability
3y 3m
Median Time to Grant
Low
PTA Risk
Based on 0 resolved cases by this examiner. Grant probability derived from career allow rate.
