DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
Claims 1-23 are pending and examined below.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-7, 10-20, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over US 20240286283 A1 (“Hong”) in view of US 20200114506 A1 (“Toshev”).
As per Claim 1, Hong discloses a learning visual pose estimation robotic system, said system comprising:
a robot with a gripper configured to perform an operation on a workpiece (¶ 46—“the tool 110 may be a robot arm having a manipulator 112 positioned at one end thereof. The manipulator 112 may include a device such as an end-effector for interacting with the target object 102. Examples of the end effector may include grippers”);
a camera coupled to an outer arm of the robot proximal the gripper, the camera providing images of a workpiece operational scene (¶ 48—“The vision sensor 120 may include one or more cameras, and may be configured to capture images of at least one of the tool 110 and the target object 102 …the vision sensor 120 may be attached to the robot arm such that the vision sensor 120 is located at a fixed position with respect to the manipulator 112”); and
at least one computing device in communication with the robot and the camera, the at least one computing device being configured with a neural network, where the neural network is trained for visual pose estimation with a plurality of the images having a variety of workpiece positions along with an actual relative pose for each image, and after training the neural network runs in inference mode for visual pose estimation used in visual servoing control of the robot performing the operation (¶ 7—“Another approach uses a convolutional neural network (CNN)-based visual servoing system… The CNN may be trained in a supervised manner for regression of the relative pose between two input images.”; ¶ 40—“embodiments may determine 3D relative egomotion based on information about the movement of a robot, for example motor encoder readings which reflect joint angles of the robot's joints”; ¶ 43—“an apparatus 100 according to embodiments may include a tool 110, a vision sensor 120, and a computer system 130. The computer system 130 may include an input/output interface 131, an image pre-processor 132, a relative pose neural network 133, and a command generator 134.”; ¶ 62—“a feature extractor 301a, which may be a CNN such as a ResNet18 network, and a feature extractor 301b, which may be a multilayer perceptron (MLP)”; ¶ 77—“the motion controller 1342 may select a desired relative pose p.sub.d, and may generate a movement command which may cause the tool 110 to move according to the desired relative poses p.sub.d. In order to move the tool 110 from a current absolute pose H.sub.1 to a desired absolute pose H.sub.d corresponding to the desired relative pose p.sub.d, the motion controller 1342 may calculate a plurality of joint angles”; ¶ 112—“generating a movement command based on the relative pose information, and operation 807, which includes moving the manipulator based on the generated movement command”).
Toshev teaches additional limitations not expressly disclosed by Hong, namely one or more lights illuminating a workspace of the robot and a variety of lighting conditions (¶ 51—“To encourage the model to learn a robust policy invariant to the shape and appearance of the target objects and scene appearances, a diverse set of objects can be utilized and exponentially augment the visual diversity of the environment using texture randomization, lighting randomization, and/or other techniques”). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Hong to include the limitations as taught by Toshev to avoid the time-consuming heavy usage of physical robots in attempting robotic grasps or other manipulations (Toshev: ¶ 4).
As per Claim 2, Hong further discloses wherein the workpiece operational scene in each image includes at least a portion of the workpiece in the gripper of the robot and a placement target area, and the actual relative pose for each image defines a relative position of the workpiece with respect to a target position as determined from robot joint positions (¶ 48—“capture images which may indicate a relative position of the tool 110 or the manipulator 112 with respect to the target object 102”; ¶ 40—“motor encoder readings which reflect joint angles of the robot's joints”).
As per Claim 3, Hong further discloses wherein the at least one computing device is configured with a supervised learning algorithm which trains the neural network for visual pose estimation (¶ 44—“the relative pose neural network 133 may be trained to minimize a loss function that is calculated by the loss calculator 1333 to measure the similarity between features obtained from a target image, and features obtained from a source image that is transformed based on the relative pose.”).
Toshev teaches additional limitations not expressly disclosed by Hong, namely computing, for each image of the plurality of images, a difference between an inferred pose from the neural network for the image and the actual relative pose for the image, and applying a cost function which rewards a small difference and penalizes a large difference (¶ 8—“a shaped reward can be determined at each time step based on comparison of (e.g. Euclidean distance between) a direction indicated by an action prediction of the time step and a “ground truth” direction to the target object. The ground truth direction to the target object can be efficiently determined at each time step as the pose of a simulated target object is known by the simulator.”). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Hong to include the limitations as taught by Toshev to avoid the time-consuming heavy usage of physical robots in attempting robotic grasps or other manipulations (Toshev: ¶ 4).
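For illustration only, the cost function character of this limitation — rewarding a small difference between the inferred and actual relative pose and penalizing a large one — can be sketched as a simple mean-squared-error loss. All names and the six-component pose layout are hypothetical assumptions, not drawn from Hong or Toshev.

```python
def pose_cost(inferred, actual):
    """Mean squared error over the pose components (e.g. x, y, z, rx, ry, rz):
    zero when the inferred pose matches the actual relative pose exactly,
    and growing with the size of the difference."""
    return sum((i - a) ** 2 for i, a in zip(inferred, actual)) / len(actual)

actual = [0.11, 0.21, 0.29, 0.0, 0.0, 0.0]
close = pose_cost([0.10, 0.20, 0.30, 0.0, 0.0, 0.0], actual)
far = pose_cost([0.50, 0.90, 0.10, 0.3, 0.1, 0.2], actual)
assert close < far  # a small difference is rewarded with a lower cost
```

Minimizing such a cost over the training images is one conventional way to realize the supervised regression of relative pose that Hong's ¶ 44 describes.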
As per Claim 4, Toshev teaches additional limitations not expressly disclosed by Hong, namely wherein the variety of lighting conditions are achieved by individually or collectively turning on, turning off and/or dimming the one or more lights (¶ 51—“lighting randomization”). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Hong to include the limitations as taught by Toshev to avoid the time-consuming heavy usage of physical robots in attempting robotic grasps or other manipulations (Toshev: ¶ 4).
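For illustration only, the lighting variation recited in this claim — each light independently on, off, or dimmed — can be sketched as follows. The light model and all names are hypothetical assumptions, not drawn from Toshev.

```python
import random

random.seed(1)  # deterministic for this illustration

def random_lighting(num_lights=3):
    # Each light is independently off (0.0), dimmed (0.2-0.8), or on (1.0).
    return [random.choice([0.0, round(random.uniform(0.2, 0.8), 2), 1.0])
            for _ in range(num_lights)]

def apply_lighting(pixel, levels):
    # Pixel brightness scales with the average illumination, clamped to [0, 1].
    brightness = sum(levels) / len(levels)
    return min(1.0, pixel * brightness)

levels = random_lighting()
assert len(levels) == 3 and all(0.0 <= v <= 1.0 for v in levels)
assert apply_lighting(0.8, [0.0, 0.0]) == 0.0  # all lights off: dark image
```

Sampling a fresh `levels` vector per training image is one way to realize the “lighting randomization” Toshev's ¶ 51 describes.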
As per Claim 5, Hong further discloses wherein the neural network has a structure including a plurality of layers pre-configured for image feature extraction and a linear projection layer which receives output from the plurality of layers and provides an inferred pose (¶ 62—“the feature extractor 301 may include a feature extractor 301a, which may be a CNN such as a ResNet18 network, and a feature extractor 301b, which may be a multilayer perceptron (MLP)”).
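For illustration only, the claimed structure — pre-configured feature-extraction layers followed by a linear projection that outputs an inferred pose — can be sketched as follows. The stub extractor, dimensions, and all names are hypothetical assumptions standing in for, e.g., a ResNet18 backbone as in Hong's ¶ 62.

```python
import random

random.seed(0)  # deterministic for this illustration
FEATURE_DIM, POSE_DIM = 8, 6

def extract_features(image):
    # Stand-in for the pre-configured feature-extraction layers;
    # these layers are fixed and not revised during training.
    return [sum(row) / len(row) for row in image[:FEATURE_DIM]]

# Trainable linear projection: inferred pose = W @ features.
W = [[random.uniform(-0.1, 0.1) for _ in range(FEATURE_DIM)]
     for _ in range(POSE_DIM)]

def infer_pose(image):
    feats = extract_features(image)
    return [sum(w * f for w, f in zip(row, feats)) for row in W]

image = [[random.random() for _ in range(4)] for _ in range(FEATURE_DIM)]
assert len(infer_pose(image)) == POSE_DIM  # one value per pose component
```

Under this reading, training revises only the entries of `W` (the parameter values), while the structure itself is left unchanged, consistent with Claim 6.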
As per Claim 6, Hong further discloses wherein parameter values of the neural network are revised during training to improve accuracy of the inferred pose, and the structure is not revised (¶ 7—“Another approach uses a convolutional neural network (CNN)-based visual servoing system… The CNN may be trained in a supervised manner for regression of the relative pose between two input images.”; ¶ 44—“the relative pose neural network 133 may be trained to minimize a loss function that is calculated by the loss calculator 1333 to measure the similarity between features obtained from a target image”).
As per Claim 7, Hong further discloses wherein, in the visual servoing control of the robot, the neural network receives a camera image and computes an inferred relative pose, and a position planning module computes robot joint motions needed to move the workpiece to a target position based on the inferred relative pose (¶ 7—“Another approach uses a convolutional neural network (CNN)-based visual servoing system… The CNN may be trained in a supervised manner for regression of the relative pose between two input images.”; ¶ 77—“the motion controller 1342 may select a desired relative pose p.sub.d, and may generate a movement command which may cause the tool 110 to move according to the desired relative poses p.sub.d. In order to move the tool 110 from a current absolute pose H.sub.1 to a desired absolute pose H.sub.d corresponding to the desired relative pose p.sub.d, the motion controller 1342 may calculate a plurality of joint angles”).
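For illustration only, the closed visual-servoing loop of this claim — inferred relative pose in, planned motion toward the target out — can be sketched as a simple proportional controller. The stand-in inference function, the proportional planner, and all names are hypothetical assumptions, not drawn from Hong.

```python
GAIN = 0.5  # fraction of the pose error corrected per control cycle

def infer_relative_pose(workpiece, target):
    # Stand-in for the neural network's inference on a camera image:
    # the relative pose of the workpiece with respect to the target.
    return [t - w for w, t in zip(workpiece, target)]

def plan_step(workpiece, relative_pose):
    # Stand-in for the position-planning module: move a fraction
    # of the remaining error each cycle (joint motions abstracted away).
    return [w + GAIN * r for w, r in zip(workpiece, relative_pose)]

workpiece, target = [0.0, 0.0, 0.0], [1.0, 2.0, 0.5]
for _ in range(20):
    workpiece = plan_step(workpiece, infer_relative_pose(workpiece, target))
err = max(abs(t - w) for w, t in zip(workpiece, target))
assert err < 1e-3  # the loop converges on the target position
```

The geometric decay of the error here mirrors, at a toy level, the repeated image-capture, pose-inference, and movement-command cycle of Hong's ¶ 77 and ¶ 112.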
As per Claim 10, Hong further discloses wherein the camera is a two-dimensional (2D) camera (¶ 48—“one or more red/green/blue (RGB) cameras”).
As per Claim 11, Hong further discloses wherein the operation is moving the workpiece to a destination location, or fitting the workpiece with or into a second workpiece (¶ 46—“The tool 110 may be operated under the control of the computer system 130 to manipulate the target object 102”).
As per Claim 12, Hong further discloses wherein the at least one computing device is a robot controller which controls movements of the robot and receives joint state data from the robot, and which also performs training of the neural network (¶ 7—“The CNN may be trained in a supervised manner for regression of the relative pose between two input images.”; ¶ 44—“the apparatus 100 may train the relative pose neural network 133”).
As per Claim 13, Hong further discloses wherein the at least one computing device includes a computer in communication with a robot controller, where the computer receives images from the camera and joint position data from the robot controller and performs training of the neural network, and the robot controller performs the visual servoing control of the robot using the trained neural network (¶ 48—“The vision sensor 120 may include one or more cameras, and may be configured to capture images of at least one of the tool 110 and the target object 102 …the vision sensor 120 may be attached to the robot arm such that the vision sensor 120 is located at a fixed position with respect to the manipulator 112”; ¶ 7—“Another approach uses a convolutional neural network (CNN)-based visual servoing system…The CNN may be trained in a supervised manner for regression of the relative pose between two input images.”).
As per Claim 14, Hong discloses a learning visual pose estimation robotic system, said system comprising:
a robot with a gripper configured to perform an operation on a workpiece (¶ 46—“the tool 110 may be a robot arm having a manipulator 112 positioned at one end thereof. The manipulator 112 may include a device such as an end-effector for interacting with the target object 102. Examples of the end effector may include grippers”);
a camera coupled to an outer arm of the robot proximal the gripper, the camera providing images of a workpiece operational scene (¶ 48—“The vision sensor 120 may include one or more cameras, and may be configured to capture images of at least one of the tool 110 and the target object 102 …the vision sensor 120 may be attached to the robot arm such that the vision sensor 120 is located at a fixed position with respect to the manipulator 112”);
at least one computing device in communication with the robot and the camera,
where the at least one computing device is configured with a supervised learning algorithm which trains a neural network for visual pose estimation using a plurality of the images having a variety of workpiece positions along with an actual relative pose for each image (¶ 44—“the relative pose neural network 133 may be trained to minimize a loss function that is calculated by the loss calculator 1333 to measure the similarity between features obtained from a target image, and features obtained from a source image that is transformed based on the relative pose.”)
and after training the at least one computing device runs the neural network in inference mode for visual pose estimation used in visual servoing control of the robot performing the operation, where the neural network receives a camera image and computes an inferred relative pose, and a position planning module computes robot joint motions needed to move the workpiece to a target position based on the inferred relative pose (¶ 7—“Another approach uses a convolutional neural network (CNN)-based visual servoing system… The CNN may be trained in a supervised manner for regression of the relative pose between two input images.”; ¶ 40—“embodiments may determine 3D relative egomotion based on information about the movement of a robot, for example motor encoder readings which reflect joint angles of the robot's joints”; ¶ 43—“an apparatus 100 according to embodiments may include a tool 110, a vision sensor 120, and a computer system 130. The computer system 130 may include an input/output interface 131, an image pre-processor 132, a relative pose neural network 133, and a command generator 134.”; ¶ 62—“a feature extractor 301a, which may be a CNN such as a ResNet18 network, and a feature extractor 301b, which may be a multilayer perceptron (MLP)”; ¶ 77—“the motion controller 1342 may select a desired relative pose p.sub.d, and may generate a movement command which may cause the tool 110 to move according to the desired relative poses p.sub.d. In order to move the tool 110 from a current absolute pose H.sub.1 to a desired absolute pose H.sub.d corresponding to the desired relative pose p.sub.d, the motion controller 1342 may calculate a plurality of joint angles”; ¶ 112—“generating a movement command based on the relative pose information, and operation 807, which includes moving the manipulator based on the generated movement command”).
Toshev teaches additional limitations not expressly disclosed by Hong, namely
one or more lights illuminating a workspace of the robot and a variety of lighting conditions (¶ 51—“To encourage the model to learn a robust policy invariant to the shape and appearance of the target objects and scene appearances, a diverse set of objects can be utilized and exponentially augment the visual diversity of the environment using texture randomization, lighting randomization, and/or other techniques”)
wherein the supervised learning algorithm computes for each image of the plurality of images a difference between an inferred pose from the neural network for the image and the actual relative pose for the image, and uses a cost function to train parameters of the neural network by rewarding a small difference and penalizing a large difference (¶ 8—“a shaped reward can be determined at each time step based on comparison of (e.g. Euclidean distance between) a direction indicated by an action prediction of the time step and a “ground truth” direction to the target object. The ground truth direction to the target object can be efficiently determined at each time step as the pose of a simulated target object is known by the simulator.”). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Hong to include the limitations as taught by Toshev to avoid the time-consuming heavy usage of physical robots in attempting robotic grasps or other manipulations (Toshev: ¶ 4).
As per Claim 15, Hong discloses a method for learning visual pose estimation in a robotic operation, said method comprising:
providing a robot with a gripper configured to perform an operation on a workpiece and a camera coupled to an outer arm of the robot proximal the gripper, where the camera is configured to provide images of a workpiece operational scene (¶ 46—“the tool 110 may be a robot arm having a manipulator 112 positioned at one end thereof. The manipulator 112 may include a device such as an end-effector for interacting with the target object 102. Examples of the end effector may include grippers”; ¶ 48—“The vision sensor 120 may include one or more cameras, and may be configured to capture images of at least one of the tool 110 and the target object 102 …the vision sensor 120 may be attached to the robot arm such that the vision sensor 120 is located at a fixed position with respect to the manipulator 112”);
collecting training data, by a computing device in communication with the robot and the camera, where the training data includes a plurality of the images having a variety of workpiece positions along with an actual relative pose for each image (¶ 7—“Another approach uses a convolutional neural network (CNN)-based visual servoing system… The CNN may be trained in a supervised manner for regression of the relative pose between two input images.”);
training the neural network for visual pose estimation with the training data (¶ 7—“Another approach uses a convolutional neural network (CNN)-based visual servoing system… The CNN may be trained in a supervised manner for regression of the relative pose between two input images.”); and
running the neural network in inference mode for visual pose estimation used in visual servoing control of the robot performing the operation (¶ 112—“generating a movement command based on the relative pose information, and operation 807, which includes moving the manipulator based on the generated movement command”; ¶ 77—“the motion controller 1342 may select a desired relative pose p.sub.d, and may generate a movement command which may cause the tool 110 to move according to the desired relative poses p.sub.d. In order to move the tool 110 from a current absolute pose H.sub.1 to a desired absolute pose H.sub.d corresponding to the desired relative pose p.sub.d, the motion controller 1342 may calculate a plurality of joint angles”).
Toshev teaches additional limitations not expressly disclosed by Hong, namely one or more lights illuminating a workspace of the robot and a variety of lighting conditions (¶ 51—“To encourage the model to learn a robust policy invariant to the shape and appearance of the target objects and scene appearances, a diverse set of objects can be utilized and exponentially augment the visual diversity of the environment using texture randomization, lighting randomization, and/or other techniques”). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Hong to include the limitations as taught by Toshev to avoid the time-consuming heavy usage of physical robots in attempting robotic grasps or other manipulations (Toshev: ¶ 4).
As per Claim 16, Hong further discloses wherein the workpiece operational scene in each image includes at least a portion of the workpiece in the gripper of the robot and a placement target area, and the actual relative pose for each image defines a relative position of the workpiece with respect to a target position as determined from robot joint positions (¶ 48—“capture images which may indicate a relative position of the tool 110 or the manipulator 112 with respect to the target object 102”; ¶ 40—“motor encoder readings which reflect joint angles of the robot's joints”).
As per Claim 17, Toshev teaches additional limitations not expressly disclosed by Hong, namely wherein the computing device is configured with a supervised learning algorithm which trains the neural network for visual pose estimation by computing, for each image of the plurality of images, a difference between an inferred pose from the neural network for the image and the actual relative pose for the image, and applying a cost function which rewards a small difference and penalizes a large difference (¶ 8—“a shaped reward can be determined at each time step based on comparison of (e.g. Euclidean distance between) a direction indicated by an action prediction of the time step and a “ground truth” direction to the target object. The ground truth direction to the target object can be efficiently determined at each time step as the pose of a simulated target object is known by the simulator.”). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Hong to include the limitations as taught by Toshev to avoid the time-consuming heavy usage of physical robots in attempting robotic grasps or other manipulations (Toshev: ¶ 4).
As per Claim 18, Toshev teaches additional limitations not expressly disclosed by Hong, namely wherein the variety of lighting conditions are achieved by individually or collectively turning on, turning off and/or dimming the one or more lights (¶ 51—“To encourage the model to learn a robust policy invariant to the shape and appearance of the target objects and scene appearances, a diverse set of objects can be utilized and exponentially augment the visual diversity of the environment using texture randomization, lighting randomization, and/or other techniques”). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Hong to include the limitations as taught by Toshev to avoid the time-consuming heavy usage of physical robots in attempting robotic grasps or other manipulations (Toshev: ¶ 4).
As per Claim 19, Hong further discloses wherein the neural network has a structure including a plurality of layers pre-configured for image feature extraction and a linear projection layer which receives output from the plurality of layers and provides an inferred pose, and where parameter values of the neural network are revised during training to improve accuracy of the inferred pose (¶ 7—“Another approach uses a convolutional neural network (CNN)-based visual servoing system… The CNN may be trained in a supervised manner for regression of the relative pose between two input images.”; ¶ 44—“the relative pose neural network 133 may be trained to minimize a loss function that is calculated by the loss calculator 1333 to measure the similarity between features obtained from a target image”).
As per Claim 20, Hong further discloses wherein, in the visual servoing control of the robot, the neural network receives a camera image and computes an inferred relative pose, and a position planning module computes robot joint motions needed to move the workpiece to a target position based on the inferred relative pose (¶ 7—“Another approach uses a convolutional neural network (CNN)-based visual servoing system… The CNN may be trained in a supervised manner for regression of the relative pose between two input images.”; ¶ 77—“the motion controller 1342 may select a desired relative pose p.sub.d, and may generate a movement command which may cause the tool 110 to move according to the desired relative poses p.sub.d. In order to move the tool 110 from a current absolute pose H.sub.1 to a desired absolute pose H.sub.d corresponding to the desired relative pose p.sub.d, the motion controller 1342 may calculate a plurality of joint angles”).
As per Claim 23, Hong further discloses wherein the operation is moving the workpiece to a destination location, or fitting the workpiece with or into a second workpiece (¶ 46—“The tool 110 may be operated under the control of the computer system 130 to manipulate the target object 102”).
Claims 8, 9, 21, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Hong in view of Toshev as applied to claim 1 above, and further in view of US 20170182659 A1 (“Simaan”).
As per Claim 8, Simaan teaches additional limitations not expressly disclosed by Hong, namely wherein the visual servoing control is used in cooperation with compliance control of the robot performing the operation (¶ 227—“Switching between full motion control and hybrid motion/force control”). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Hong to include the limitations as taught by Simaan for active compliant instrumentation that can safely interact with the surrounding environment while maintaining manipulation precision and delivering adequate forces for executing manipulation tasks (Simaan: ¶ 5).
As per Claim 9, Simaan teaches additional limitations not expressly disclosed by Hong, namely wherein the visual servoing control is used to perform a preliminary positioning of the workpiece and the compliance control is subsequently used to perform a final positioning of the workpiece, or the visual servoing control operates in an outer feedback control loop and the compliance control operates in an inner feedback control loop during positioning of the workpiece (¶ 227—“Switching between full motion control and hybrid motion/force control”; ¶ 228—“the continuum robot autonomously regulates a force…Position data of the probe were only collected when the sensed force…and the hybrid motion/force controller was engaged”). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Hong to include the limitations as taught by Simaan for active compliant instrumentation that can safely interact with the surrounding environment while maintaining manipulation precision and delivering adequate forces for executing manipulation tasks (Simaan: ¶ 5).
As per Claim 21, Simaan teaches additional limitations not expressly disclosed by Hong, namely wherein the visual servoing control is used in cooperation with compliance control of the robot performing the operation (¶ 227—“Switching between full motion control and hybrid motion/force control”). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Hong to include the limitations as taught by Simaan for active compliant instrumentation that can safely interact with the surrounding environment while maintaining manipulation precision and delivering adequate forces for executing manipulation tasks (Simaan: ¶ 5).
As per Claim 22, Simaan teaches additional limitations not expressly disclosed by Hong, namely wherein the visual servoing control is used to perform a preliminary positioning of the workpiece and the compliance control is subsequently used to perform a final positioning of the workpiece, or the visual servoing control operates in an outer feedback control loop and the compliance control operates in an inner feedback control loop during positioning of the workpiece (¶ 227—“Switching between full motion control and hybrid motion/force control”; ¶ 228—“the continuum robot autonomously regulates a force…Position data of the probe were only collected when the sensed force…and the hybrid motion/force controller was engaged”). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Hong to include the limitations as taught by Simaan for active compliant instrumentation that can safely interact with the surrounding environment while maintaining manipulation precision and delivering adequate forces for executing manipulation tasks (Simaan: ¶ 5).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BASIL T JOS whose telephone number is (571)270-5915. The examiner can normally be reached 11:00 AM - 8:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, THOMAS WORDEN can be reached at (571) 272-4876. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Basil T. Jos/Primary Examiner, Art Unit 3658