DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
The 35 U.S.C. § 101 rejection of claim 1 is withdrawn in view of the amendments. Applicant's arguments filed on 12/15/2025 have been fully considered, but they are either not persuasive or moot in view of the new grounds of rejection provided below, which were necessitated by Applicant's amendments to the claims. The new grounds of rejection for the independent claims are based on the combination of Gabriel, Van Heukelom, Wagner, and Li.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 2, 8, 10, 14-17, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Gabriel et al. (DE102022206274B4) (hereinafter Gabriel) in view of Van Heukelom et al. (US 2020/0174481 A1) (hereinafter Van Heukelom), Wagner et al. (US 2017/0136632 A1) (hereinafter Wagner), and further in view of Li (US 2023/0125022 A1).
Regarding Claim 1, Gabriel teaches a sorting system, comprising:
an object recognition device configured to capture an image of a first target object (See at least Para [0010] “According to various embodiments, a method for controlling a robot for manipulating, in particular capturing, an object is provided, comprising acquiring an image showing the object, …”, Para [0041] “According to various embodiments, the machine learning model 112 is a neural network 112 and the controller 106 supplies input data to the neural network 112 based on the one or more digital images (color images, depth images, or both) of an object 113, and the neural network 112 is configured to indicate locations (or areas) of the object 113 suitable for capturing the object 113…”); …
a processor configured to:
input the image into a machine learning model that is configured to output a pick probability heat map corresponding to the first target object (See at least Para [0051] “…This can be done, for example, in such a way that a respective heat map is generated per registered descriptor and the manipulation preference image is formed as a (pixel-by-pixel) maximum over all these heat maps.”, Para [0041] “According to various embodiments, the machine learning model 112 is a neural network 112 and the controller 106 supplies input data to the neural network 112 based on the one or more digital images (color images, depth images, or both) of an object 113, and the neural network 112 is configured to indicate locations (or areas) of the object 113 suitable for capturing the object 113. For example, the neural network may segment an input image showing object 113 accordingly…” Para [0075] “Input data for the machine learning models are, for example, color and depth images. However, these can also be supplemented by sensor signals from other sensors such as radar, LiDAR, ultrasound, movement, thermal images, etc. For example, an RGB and depth image is captured in a robot cell and the image (or more such images) is (or are) used to generate candidate locations for grasping one or more objects and generate a descriptor matching heat map based on annotations according to a user preference…”) …;
a diverting mechanism that performs the first pick operation on the first target object based at least in part on a pick location (See at least Para [0039] “According to various embodiments, the machine learning model 112 is configured and trained to enable the robot 100 to recognize a location of an object 113 where the robot 100 may pick up (or otherwise interact with, e.g., paint) the object 113.”) on the first target object determined based at least in part on the first pick probability heat map (See at least Para [0051] “(4) For the input image a. The DON 115 determines a descriptor image from the input image b. The registered descriptors are searched for in the descriptor image for the input image and the manipulation preference image (for example with pixel values in the interval [0, 1]) is generated such that the pixel value of a pixel indicates how well the descriptor of the pixel matches one of the registered descriptors, that is to say illustratively in the form of a "(descriptor match) map" with respect to the match with the registered descriptors and thus the locations selected by the user. This can be done, for example, in such a way that a respective heat map is generated per registered descriptor and the manipulation preference image is formed as a (pixel-by-pixel) maximum over all these heat maps.”).
However, Gabriel does not explicitly spell out … and a second target object; …
… and a second pick probability heat map corresponding to the second target object, wherein the first pick probability heat map shows corresponding probabilities of a successful pick operation of the first target object associated with respective locations along at least one surface of the first target object that is exposed in the image, wherein the machine learning model was trained on ground truth heat maps, and wherein a ground truth heat map comprises an image depicting a historical pick operation on an object that includes a first annotation of a historical pick location on the object and a second annotation of whether the historical pick operation was successful or not; …
compare the first pick probability heat map and the second pick probability heat map;
determine to perform a first pick operation of the first target object based at least in part on the comparison;
determine to omit performing a second pick operation of the second target object based at least in part on the comparison; and …
Van Heukelom teaches … and a second target object (See at least Para [0024] “The techniques described herein can be implemented in a number of ways. Example implementations are provided below with reference to the following figures. Although discussed in the context of an autonomous vehicle, the methods, apparatuses, and systems described herein can be applied to a variety of systems (e.g., a sensor system or a robotic platform), and is not limited to autonomous vehicles. In another example, the techniques can be utilized in an aviation or nautical context, or in any system involving objects or entity that may be associated with behavior that is unknown to the system. Further, although discussed in the context of lidar data, sensor data can include any two-dimensional, three-dimensional, or multi-dimensional data such as image data (e.g., stereo cameras, time-of-flight data, and the like), radar data, sonar data, and the like. Additionally, the techniques described herein can be used with real data (e.g., captured using sensor(s)), simulated data (e.g., generated by a simulator), or any combination of the two.”, Para [0134] “… the first heat map represents first prediction probabilities of first possible locations associated with the first object in the environment; the second heat map represents second prediction probabilities of second possible locations associated with a second object in the environment…”); …
… and a second pick probability heat map corresponding to the second target object (See at least Para [0134] “… the first heat map represents first prediction probabilities of first possible locations associated with the first object in the environment; the second heat map represents second prediction probabilities of second possible locations associated with a second object in the environment…”), …
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Gabriel with the teachings of Van Heukelom and include the feature of capturing an image of a second target object and creating a second pick probability heat map corresponding to the second target object, thereby providing the option of comparing heat maps and reaching a decision accordingly, and thereby enhancing the performance and computing accuracy (See at least Para [0023] “… Accordingly, techniques for evaluating risk can be performed faster than conventional techniques, which may allow for a faster response or may allow a computing system to consider additional alternative trajectories, thereby improving safety outcomes, performance, and/or accuracy…”).
Wagner teaches …
wherein the first pick probability heat map shows corresponding probabilities of a successful pick operation of the first target object associated with respective locations along at least one surface of the first target object that is exposed in the image, wherein the machine learning model was trained on ground truth heat maps, and wherein a ground truth heat map comprises an image depicting a historical pick operation on an object that includes a first annotation of a historical pick location on the object and a second annotation of whether the historical pick operation was successful or not (See at least Para [0055] “This learning can be accelerated by off-line generation of human-corrected images. For instance, a human could be presented with thousands of images from previous system operation and manually annotate good and bad grasp points on each one. This would generate a large amount of data that could also be input into the machine learning algorithms…”, discloses annotating good and bad grasp points on each of the images from previous system operation, which could also be input into the machine learning algorithms, which is construed as the machine learning model being trained on ground truth heat maps, wherein the ground truth heat map comprises an image depicting a historical pick operation on an object that includes a first annotation of a historical pick location on the object and a second annotation of whether the historical pick operation was successful or not)…
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Gabriel with the teachings of Wagner and include the feature of a machine learning model being trained on ground truth heat maps, wherein a ground truth heat map comprises an image depicting a historical pick operation on an object that includes a first annotation of a historical pick location on the object and a second annotation of whether the historical pick operation was successful or not, thereby enhancing the speed and efficacy of the system learning (See at least Para [0055] “This learning can be accelerated by off-line generation of human-corrected images. For instance, a human could be presented with thousands of images from previous system operation and manually annotate good and bad grasp points on each one. This would generate a large amount of data that could also be input into the machine learning algorithms to enhance the speed and efficacy of the system learning.”).
Li teaches …
compare the first pick probability heat map and the second pick probability heat map (See at least Para [0091] “The inference unit 54 may set an order of priority for picking to a plurality of target workpieces Wo based on the depth information included in the two-and-a-half dimensional image data acquired by the acquisition unit 51. Specifically, the inference unit 54 may determine that the target workpiece Wo with a shallower depth of the picking position is more easily picked, and is picked at a higher priority order. The inference unit 54 may determine an order of priority for picking of a plurality of target workpieces Wo based on the scores calculated with a weighting coefficient using both of a score set according to the depth of a picking position and a score set according to the commonality of the image in the proximity of the above-described picking position...”, Para [0135] “The training unit 53a generates a trained model for inferring a picking position which is a three-dimensional position of the target workpiece Wo by machine learning (supervised learning) based on the training input data including the three-dimensional point cloud data and the teaching position which is the three-dimensional picking position…”, Para [0031] “The information acquisition device 10 may be configured as a camera for capturing a visible light image such as an RGB image and a grayscale image. Examples of such a camera configured to acquire an invisible light image include an infrared ray camera configured to acquire a heat image used for inspecting humans, animals, or the like…”);
determine to perform a first pick operation of the first target object based at least in part on the comparison (See at least Para [0091] “The inference unit 54 may set an order of priority for picking to a plurality of target workpieces Wo based on the depth information included in the two-and-a-half dimensional image data acquired by the acquisition unit 51. Specifically, the inference unit 54 may determine that the target workpiece Wo with a shallower depth of the picking position is more easily picked, and is picked at a higher priority order. The inference unit 54 may determine an order of priority for picking of a plurality of target workpieces Wo based on the scores calculated with a weighting coefficient using both of a score set according to the depth of a picking position and a score set according to the commonality of the image in the proximity of the above-described picking position...”, Para [0135] “The training unit 53a generates a trained model for inferring a picking position which is a three-dimensional position of the target workpiece Wo by machine learning (supervised learning) based on the training input data including the three-dimensional point cloud data and the teaching position which is the three-dimensional picking position…”, Para [0031] “The information acquisition device 10 may be configured as a camera for capturing a visible light image such as an RGB image and a grayscale image. Examples of such a camera configured to acquire an invisible light image include an infrared ray camera configured to acquire a heat image used for inspecting humans, animals, or the like…”);
determine to omit performing a second pick operation of the second target object based at least in part on the comparison (See at least Para [0091] “The inference unit 54 may set an order of priority for picking to a plurality of target workpieces Wo based on the depth information included in the two-and-a-half dimensional image data acquired by the acquisition unit 51. Specifically, the inference unit 54 may determine that the target workpiece Wo with a shallower depth of the picking position is more easily picked, and is picked at a higher priority order. The inference unit 54 may determine an order of priority for picking of a plurality of target workpieces Wo based on the scores calculated with a weighting coefficient using both of a score set according to the depth of a picking position and a score set according to the commonality of the image in the proximity of the above-described picking position...”, discloses setting an order of priority for picking a plurality of target workpieces which is construed as omit performing certain pick operation of a target object based at least in part on the comparison, Para [0135] “The training unit 53a generates a trained model for inferring a picking position which is a three-dimensional position of the target workpiece Wo by machine learning (supervised learning) based on the training input data including the three-dimensional point cloud data and the teaching position which is the three-dimensional picking position…”);
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Gabriel with the teachings of Li and include the feature of comparing the first pick probability heat map and the second pick probability heat map, determining to perform a first pick operation of the first target object based at least in part on the comparison, and determining to omit performing a second pick operation of the second target object based at least in part on the comparison, thereby increasing efficiency.
Regarding Claim 2, modified Gabriel teaches all the elements of claim 1. Gabriel further teaches the sorting system of claim 1, wherein the diverting mechanism comprises a robot and a suction gripper mechanism that is to be actuated by the robot, and wherein the suction gripper mechanism comprises a hollow linear shaft to communicate a vacuum airflow (See at least Para [0012] “Manipulating can mean in particular taking up (e.g. gripping or also suction in the case of a suction gripper)…”, Para [0043] “One possibility for this is, for example, that during preprocessing, the standard deviation (or another measure for the scattering, such as the variance) of normal vectors of the surface of objects shown in the digital images. The normal vector standard deviation is suitable for representing the local flatness of a surface, and is thus particularly relevant information for the grasp quality (or for the quality of an object region for suction).”).
Regarding Claim 8, modified Gabriel teaches all the elements of claim 1. Gabriel further teaches the sorting system of claim 1, wherein to select the pick location on the first target object based at least in part on the first pick probability heat map corresponding to the first target object comprises to select a location on the at least one surface of the first target object that is exposed in the image based on the location’s probability of pick success that is indicated by the first pick probability heat map (See at least Para [0022] “Embodiment 5 is a method according to any one of Embodiments 1 to 4, comprising generating the manipulation quality image by forming, for each registered descriptor, a descriptor matching image in which, for each pixel representing a location on the surface of the object…”, Para [0043] “One possibility for this is, for example, that during preprocessing, the standard deviation (or another measure for the scattering, such as the variance) of normal vectors of the surface of objects shown in the digital images. The normal vector standard deviation is suitable for representing the local flatness of a surface, and is thus particularly relevant information for the grasp quality (or for the quality of an object region for suction).”).
Regarding Claim 10, modified Gabriel teaches all the elements of claim 1.
However, Gabriel does not teach the sorting system of claim 1, wherein the pick location is selected based on a probability of pick success corresponding to the pick location meeting a threshold probability.
Li teaches the sorting system of claim 1, wherein the pick location is selected based on a probability of pick success corresponding to the pick location meeting a threshold probability (See at least Para [0091] “… the inference unit 54 sets a threshold of the score set according to the commonality of the image in the proximity of the above-described picking position, and defines all of the picking positions with commonality with the image exceeding the threshold as the picking positions with high possibility of successful picking determined by the findings of the teaching person, so that from among these picking positions as a more appropriate candidate group, the target workpieces Wo with a shallower depth of the picking position may be preferentially picked.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Gabriel with the teachings of Li and include the feature of the pick location being selected based on a probability of pick success corresponding to the pick location meeting a threshold probability, thereby increasing efficiency.
Regarding Claim 14, modified Gabriel teaches all the elements of claim 1.
However, Gabriel does not explicitly spell out the sorting system of claim 1, wherein the machine learning model is trained on data associated with historical pick operations, wherein data associated with the historical pick operation comprises one or more of the following: an object type, executed sorting parameters, an object image, the historical pick location, a predicted pick probability heat map corresponding to the historical pick operation, and a determination of whether the historical pick operation was successful or not.
Wagner teaches the sorting system of claim 1, wherein the machine learning model is trained on data associated with historical pick operations, wherein data associated with the historical pick operation comprises one or more of the following: an object type, executed sorting parameters, an object image, the historical pick location, a predicted pick probability heat map corresponding to the historical pick operation, and a determination of whether the historical pick operation was successful or not (See at least Para [0055] “This learning can be accelerated by off-line generation of human-corrected images. For instance, a human could be presented with thousands of images from previous system operation and manually annotate good and bad grasp points on each one. This would generate a large amount of data that could also be input into the machine learning algorithms to enhance the speed and efficacy of the system learning.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Gabriel with the teachings of Wagner and include the feature of the machine learning model being trained on data associated with historical pick operations, wherein data associated with the historical pick operation comprises one or more of the following: an object type, executed sorting parameters, an object image, the historical pick location, a predicted pick probability heat map corresponding to the historical pick operation, and a determination of whether the historical pick operation was successful or not, thereby enhancing the speed and efficacy of the system learning (See at least Para [0055] “This learning can be accelerated by off-line generation of human-corrected images. For instance, a human could be presented with thousands of images from previous system operation and manually annotate good and bad grasp points on each one. This would generate a large amount of data that could also be input into the machine learning algorithms to enhance the speed and efficacy of the system learning.”).
Regarding Claim 15, modified Gabriel teaches all the elements of claim 14. Gabriel further teaches the sorting system of claim 14, wherein the machine learning model is configured to predict sorting parameters to be used for the first pick operation based at least in part on the image of the first target object (See at least Para [0010] “According to various embodiments, a method for controlling a robot for manipulating, in particular capturing, an object is provided, comprising acquiring an image showing the object, …”, Para [0041] “According to various embodiments, the machine learning model 112 is a neural network 112 and the controller 106 supplies input data to the neural network 112 based on the one or more digital images (color images, depth images, or both) of an object 113, and the neural network 112 is configured to indicate locations (or areas) of the object 113 suitable for capturing the object 113…”).
Regarding Claim 16, Gabriel teaches a method, comprising:
receiving an image of a first target object (See at least Para [0010] “According to various embodiments, a method for controlling a robot for manipulating, in particular capturing, an object is provided, comprising acquiring an image showing the object, …”, Para [0041] “According to various embodiments, the machine learning model 112 is a neural network 112 and the controller 106 supplies input data to the neural network 112 based on the one or more digital images (color images, depth images, or both) of an object 113, and the neural network 112 is configured to indicate locations (or areas) of the object 113 suitable for capturing the object 113…”);
inputting the image into a machine learning model that is configured to output a first pick probability heat map corresponding to the first target object (See at least Para [0041] “According to various embodiments, the machine learning model 112 is a neural network 112 and the controller 106 supplies input data to the neural network 112 based on the one or more digital images (color images, depth images, or both) of an object 113, and the neural network 112 is configured to indicate locations (or areas) of the object 113 suitable for capturing the object 113. For example, the neural network may segment an input image showing object 113 accordingly…” Para [0075] “Input data for the machine learning models are, for example, color and depth images. However, these can also be supplemented by sensor signals from other sensors such as radar, LiDAR, ultrasound, movement, thermal images, etc. For example, an RGB and depth image is captured in a robot cell and the image (or more such images) is (or are) used to generate candidate locations for grasping one or more objects and generate a descriptor matching heat map based on annotations according to a user preference…”, Para [0051] “…This can be done, for example, in such a way that a respective heat map is generated per registered descriptor and the manipulation preference image is formed as a (pixel-by-pixel) maximum over all these heat maps.”) … ;
performing, using a diverting mechanism, the first pick operation on the first target object based at least in part on a pick location (See at least Para [0039] “According to various embodiments, the machine learning model 112 is configured and trained to enable the robot 100 to recognize a location of an object 113 where the robot 100 may pick up (or otherwise interact with, e.g., paint) the object 113.”) on the first target object determined based at least in part on the first pick probability heat map (See at least Para [0051] “(4) For the input image a. The DON 115 determines a descriptor image from the input image b. The registered descriptors are searched for in the descriptor image for the input image and the manipulation preference image (for example with pixel values in the interval [0, 1]) is generated such that the pixel value of a pixel indicates how well the descriptor of the pixel matches one of the registered descriptors, that is to say illustratively in the form of a "(descriptor match) map" with respect to the match with the registered descriptors and thus the locations selected by the user. This can be done, for example, in such a way that a respective heat map is generated per registered descriptor and the manipulation preference image is formed as a (pixel-by-pixel) maximum over all these heat maps.”).
However, Gabriel does not explicitly spell out … and a second target object; …
… and a second pick probability heat map corresponding to the second target object, wherein the first pick probability heat map shows corresponding probabilities of a successful pick operation of the first target object associated with respective locations along at least one surface of the first target object that is exposed in the image, wherein the machine learning model was trained on ground truth heat maps, and wherein a ground truth heat map comprises an image depicting a historical pick operation on an object that includes a first annotation of a historical pick location on the object and a second annotation of whether the historical pick operation was successful or not …
comparing the first pick probability heat map and the second pick probability heat map;
determining to perform a first pick operation of the first target object based at least in part on the comparison;
determining to omit performing a second pick operation of the second target object based at least in part on the comparison;
Van Heukelom teaches … and a second target object (See at least Para [0024] “The techniques described herein can be implemented in a number of ways. Example implementations are provided below with reference to the following figures. Although discussed in the context of an autonomous vehicle, the methods, apparatuses, and systems described herein can be applied to a variety of systems (e.g., a sensor system or a robotic platform), and is not limited to autonomous vehicles. In another example, the techniques can be utilized in an aviation or nautical context, or in any system involving objects or entity that may be associated with behavior that is unknown to the system. Further, although discussed in the context of lidar data, sensor data can include any two-dimensional, three-dimensional, or multi-dimensional data such as image data (e.g., stereo cameras, time-of-flight data, and the like), radar data, sonar data, and the like. Additionally, the techniques described herein can be used with real data (e.g., captured using sensor(s)), simulated data (e.g., generated by a simulator), or any combination of the two.”, Para [0134] “… the first heat map represents first prediction probabilities of first possible locations associated with the first object in the environment; the second heat map represents second prediction probabilities of second possible locations associated with a second object in the environment…”); …
… and a second pick probability heat map corresponding to the second target object (See at least Para [0134] “… the first heat map represents first prediction probabilities of first possible locations associated with the first object in the environment; the second heat map represents second prediction probabilities of second possible locations associated with a second object in the environment…”), …
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Gabriel with the teachings of Van Heukelom and include the feature of capturing an image of a second target object and creating a second pick probability heat map corresponding to the second target object, thereby providing the option of comparing heat maps and reaching a decision accordingly, and thereby enhancing the performance and computing accuracy (See at least Para [0023] “… Accordingly, techniques for evaluating risk can be performed faster than conventional techniques, which may allow for a faster response or may allow a computing system to consider additional alternative trajectories, thereby improving safety outcomes, performance, and/or accuracy…”).
Wagner teaches …
wherein the first pick probability heat map shows corresponding probabilities of a successful pick operation of the first target object associated with respective locations along at least one surface of the first target object that is exposed in the image, wherein the machine learning model was trained on ground truth heat maps, and wherein a ground truth heat map comprises an image depicting a historical pick operation on an object that includes a first annotation of a historical pick location on the object and a second annotation of whether the historical pick operation was successful or not (See at least Para [0055] “This learning can be accelerated by off-line generation of human-corrected images. For instance, a human could be presented with thousands of images from previous system operation and manually annotate good and bad grasp points on each one. This would generate a large amount of data that could also be input into the machine learning algorithms…”, discloses annotating good and bad grasp points on each one of the images from previous system operation which could also be input into the machine learning algorithms which is construed as machine learning model being trained on ground truth heat maps wherein the ground truth heat map comprises an image depicting a historical pick operation on an object that includes a first annotation of a historical pick location on the object and a second annotation of whether the historical pick operation was successful or not)…
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Gabriel with the teachings of Wagner and include the feature of a machine learning model being trained on ground truth heat maps wherein a ground truth heat map comprises an image depicting a historical pick operation on an object that includes a first annotation of a historical pick location on the object and a second annotation of whether the historical pick operation was successful or not, thereby enhancing the speed and efficacy of the system learning (See at least Para [0055] “This learning can be accelerated by off-line generation of human-corrected images. For instance, a human could be presented with thousands of images from previous system operation and manually annotate good and bad grasp points on each one. This would generate a large amount of data that could also be input into the machine learning algorithms to enhance the speed and efficacy of the system learning.”).
Li teaches …
comparing the first pick probability heat map and the second pick probability heat map (See at least Para [0091] “The inference unit 54 may set an order of priority for picking to a plurality of target workpieces Wo based on the depth information included in the two-and-a-half dimensional image data acquired by the acquisition unit 51. Specifically, the inference unit 54 may determine that the target workpiece Wo with a shallower depth of the picking position is more easily picked, and is picked at a higher priority order. The inference unit 54 may determine an order of priority for picking of a plurality of target workpieces Wo based on the scores calculated with a weighting coefficient using both of a score set according to the depth of a picking position and a score set according to the commonality of the image in the proximity of the above-described picking position...”, Para [0135] “The training unit 53a generates a trained model for inferring a picking position which is a three-dimensional position of the target workpiece Wo by machine learning (supervised learning) based on the training input data including the three-dimensional point cloud data and the teaching position which is the three-dimensional picking position…”, Para [0031] “The information acquisition device 10 may be configured as a camera for capturing a visible light image such as an RGB image and a grayscale image. Examples of such a camera configured to acquire an invisible light image include an infrared ray camera configured to acquire a heat image used for inspecting humans, animals, or the like…”);
determining to perform a first pick operation of the first target object based at least in part on the comparison (See at least Para [0091] “The inference unit 54 may set an order of priority for picking to a plurality of target workpieces Wo based on the depth information included in the two-and-a-half dimensional image data acquired by the acquisition unit 51. Specifically, the inference unit 54 may determine that the target workpiece Wo with a shallower depth of the picking position is more easily picked, and is picked at a higher priority order. The inference unit 54 may determine an order of priority for picking of a plurality of target workpieces Wo based on the scores calculated with a weighting coefficient using both of a score set according to the depth of a picking position and a score set according to the commonality of the image in the proximity of the above-described picking position...”, Para [0135] “The training unit 53a generates a trained model for inferring a picking position which is a three-dimensional position of the target workpiece Wo by machine learning (supervised learning) based on the training input data including the three-dimensional point cloud data and the teaching position which is the three-dimensional picking position…”, Para [0031] “The information acquisition device 10 may be configured as a camera for capturing a visible light image such as an RGB image and a grayscale image. Examples of such a camera configured to acquire an invisible light image include an infrared ray camera configured to acquire a heat image used for inspecting humans, animals, or the like…”);
determining to omit performing a second pick operation of the second target object based at least in part on the comparison (See at least Para [0091] “The inference unit 54 may set an order of priority for picking to a plurality of target workpieces Wo based on the depth information included in the two-and-a-half dimensional image data acquired by the acquisition unit 51. Specifically, the inference unit 54 may determine that the target workpiece Wo with a shallower depth of the picking position is more easily picked, and is picked at a higher priority order. The inference unit 54 may determine an order of priority for picking of a plurality of target workpieces Wo based on the scores calculated with a weighting coefficient using both of a score set according to the depth of a picking position and a score set according to the commonality of the image in the proximity of the above-described picking position...”, which discloses setting an order of priority for picking a plurality of target workpieces and is construed as omitting a certain pick operation of a target object based at least in part on the comparison, Para [0135] “The training unit 53a generates a trained model for inferring a picking position which is a three-dimensional position of the target workpiece Wo by machine learning (supervised learning) based on the training input data including the three-dimensional point cloud data and the teaching position which is the three-dimensional picking position…”); and …
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Gabriel with the teachings of Li and include the feature of comparing the first pick probability heat map and the second pick probability heat map, determining to perform a first pick operation of the first target object based at least in part on the comparison, and determining to omit performing a second pick operation of the second target object based at least in part on the comparison, thereby increasing efficiency.
Regarding Claim 17, modified Gabriel teaches all the elements of claim 16. Gabriel further teaches the method of claim 16, wherein selecting the pick location on the first target object based at least in part on the first pick probability heat map corresponding to the first target object comprises selecting a location on the at least one surface of the target object that is exposed in the image based on the location’s probability of pick success that is indicated by the first pick probability heat map (See at least Para [0022] “Embodiment 5 is a method according to any one of Embodiments 1 to 4, comprising generating the manipulation quality image by forming, for each registered descriptor, a descriptor matching image in which, for each pixel representing a location on the surface of the object…”, Para [0043] “One possibility for this is, for example, that during preprocessing, the standard deviation (or another measure for the scattering, such as the variance) of normal vectors of the surface of objects shown in the digital images. The normal vector standard deviation is suitable for representing the local flatness of a surface, and is thus particularly relevant information for the grasp quality (or for the quality of an object region for suction).”).
Regarding Claim 19, modified Gabriel teaches all the elements of claim 16.
However, Gabriel does not teach the method of claim 16, wherein the pick location is selected based on a probability of pick success corresponding to the pick location meeting a threshold probability.
Li teaches the method of claim 16, wherein the pick location is selected based on a probability of pick success corresponding to the pick location meeting a threshold probability (See at least Para [0091] “… the inference unit 54 sets a threshold of the score set according to the commonality of the image in the proximity of the above-described picking position, and defines all of the picking positions with commonality with the image exceeding the threshold as the picking positions with high possibility of successful picking determined by the findings of the teaching person, so that from among these picking positions as a more appropriate candidate group, the target workpieces Wo with a shallower depth of the picking position may be preferentially picked.”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Gabriel with the teachings of Li and include the feature of the pick location being selected based on a probability of pick success corresponding to the pick location meeting a threshold probability, thereby increasing efficiency.
Claim(s) 3, 5-7, and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Gabriel et al. (DE102022206274B4) (Hereinafter Gabriel) in view of Van Heukelom et al. (US 20200174481 A1) (Hereinafter Van Heukelom), Wagner et al. (US 20170136632 A1) (Hereinafter Wagner), Li (US 2023/0125022 A1), and further in view of Nakashima et al. (US 2023/0064484 A1) (Hereinafter Nakashima).
Regarding Claim 3, Gabriel teaches all the elements of claim 1. Gabriel further teaches the sorting system of claim 1, wherein the processor is further configured to:
obtain at least one of respective quality control video frames (See at least Para [0059] “The neural network 112 that determines the pick qualities can be trained in a monitored manner from existing input data (e.g., RGB-D images) annotated so that they are labeled identifying locations on the surface of objects where the objects can be picked.”) and …
receive annotations of pick quality types associated with the historical pick operations based on the at least one of respective quality control video frames (See at least Para [0059] “The neural network 112 that determines the pick qualities can be trained in a monitored manner from existing input data (e.g., RGB-D images) annotated so that they are labeled identifying locations on the surface of objects where the objects can be picked.”, Para [0075]) …
receive annotations of respective pick locations within object recognition device captured images of the target objects (See at least Para [0059] “The neural network 112 that determines the pick qualities can be trained in a monitored manner from existing input data (e.g., RGB-D images) annotated so that they are labeled identifying locations on the surface of objects where the objects can be picked.”, Para [0075]); and
generate the ground truth heat maps corresponding to the historical pick operations based on the annotations of the pick quality types (See at least Para [0044] “For example, one or more cameras, for example corresponding to camera 114, provide raw image data (i.e., one or more images) that particularly includes depth information about an object. From this depth information, normal vectors of the surface of the object and their standard deviations (in different regions of the surface) are determined in a preprocessing. This is supplied as input to the neural network 112. The input of the neural network 112 may also include portions of (or all of) raw image data or image data generated therefrom according to previous image enhancement pre-processings (e.g., noise reduction). Such prior image enhancement pre-processings may also be used to generate image data which is then used as the basis of the standard deviation determination. The neural network is trained thereon (e.g. by means of corresponding training inputs (including. Normal vector standard deviations) and associated target outputs, i.e., ground truth information for supervised learning), map the input to an output that identifies locations or areas of the object that are (e.g., particularly well) suitable for capturing the object.”, Para [0059] “The neural network 112 that determines the pick qualities can be trained in a monitored manner from existing input data (e.g., RGB-D images) annotated so that they are labeled identifying locations on the surface of objects where the objects can be picked.”, Para [0075])…
However, Gabriel does not explicitly spell out …
… and pressure sequences corresponding to historical pick operations performed on target objects
… and pressure sequences corresponding to historical pick operations
Nakashima teaches … and pressure sequences corresponding to historical pick operations performed on target objects (See at least Para [0138] “The computational model may be a machine learning model such as a linear regression model, a neural network, or a support vector machine. For the computational model including a machine learning model, pressure detection values are collected by causing the suction head to attempt to pick up the workpiece W at various positions. The obtained detection values and the distances to the detection point are associated with true inter-distances to generate training datasets. The generated training datasets are then used in machine learning to train the machine learning model to output a true inter-distance in response to an input of a pressure detection value and a distance to the detection point. Machine learning may be performed with a known method such regression analysis or backpropagation. In this manner, a trained machine learning model is generated for calculation of the inter-distance.”);
… and pressure sequences corresponding to historical pick operations (See at least Para [0138] “The computational model may be a machine learning model such as a linear regression model, a neural network, or a support vector machine. For the computational model including a machine learning model, pressure detection values are collected by causing the suction head to attempt to pick up the workpiece W at various positions. The obtained detection values and the distances to the detection point are associated with true inter-distances to generate training datasets. The generated training datasets are then used in machine learning to train the machine learning model to output a true inter-distance in response to an input of a pressure detection value and a distance to the detection point. Machine learning may be performed with a known method such regression analysis or backpropagation. In this manner, a trained machine learning model is generated for calculation of the inter-distance.”);
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Gabriel with the teachings of Nakashima and include data of pressure sequences corresponding to historical pick operations performed on target objects as inputs to machine learning, thereby processing more data precisely for a stable and successful pick-up operation (See at least Para [0025] “…The structure thus allows more appropriate estimation of the position at which the suction head can stably pick up any workpiece deviating from the predetermined position.”).
Regarding Claim 5, modified Gabriel teaches all the elements of claim 3. Gabriel further teaches the sorting system of claim 3, wherein the annotations of the pick quality types associated with the historical pick operations (See at least Para [0059] “The neural network 112 that determines the pick qualities can be trained in a monitored manner from existing input data (e.g., RGB-D images) annotated so that they are labeled identifying locations on the surface of objects where the objects can be picked.”, Para [0075]) … pick quality types (See at least Para [0059] “The neural network 112 that determines the pick qualities can be trained in a monitored manner from existing input data (e.g., RGB-D images) annotated so that they are labeled identifying locations on the surface of objects where the objects can be picked.”, Para [0075])…
However, Gabriel does not explicitly spell out … were determined based on correlating the pressure sequences corresponding to the historical pick operations with representative pressure sequences associated with corresponding …
Nakashima teaches the sorting system of claim 3, wherein the … were determined based on correlating the pressure sequences corresponding to the historical pick operations with representative pressure sequences associated with corresponding (See at least Para [0074] “…pressure data 121 indicating the detection values of the compressed air pressure detected by the pressure sensor 32 in time series…”, Fig 6 Pressure data 121, Determine whether pick up is successful, Para [0138] “The computational model may be a machine learning model such as a linear regression model, a neural network, or a support vector machine. For the computational model including a machine learning model, pressure detection values are collected by causing the suction head to attempt to pick up the workpiece W at various positions. The obtained detection values and the distances to the detection point are associated with true inter-distances to generate training datasets. The generated training datasets are then used in machine learning to train the machine learning model to output a true inter-distance in response to an input of a pressure detection value and a distance to the detection point. Machine learning may be performed with a known method such regression analysis or backpropagation. In this manner, a trained machine learning model is generated for calculation of the inter-distance.”)…
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Gabriel with the teachings of Nakashima and include the feature of correlating the pressure sequences corresponding to the historical pick operations with representative pressure sequences, thereby processing more data precisely for a stable and successful pick-up operation (See at least Para [0025] “…The structure thus allows more appropriate estimation of the position at which the suction head can stably pick up any workpiece deviating from the predetermined position.”).
Regarding Claim 6, modified Gabriel teaches all the elements of claim 3. Gabriel further teaches the sorting system of claim 3, wherein the annotations of the respective pick locations within the object recognition device captured images of the target objects were determined based on predetermined pick locations for objects (See at least Para [0059] “The neural network 112 that determines the pick qualities can be trained in a monitored manner from existing input data (e.g., RGB-D images) annotated so that they are labeled identifying locations on the surface of objects where the objects can be picked.”, Para [0075]).
Regarding Claim 7, modified Gabriel teaches all the elements of claim 3. Gabriel further teaches the sorting system of claim 3, wherein the annotations of the respective pick locations within the object recognition device captured images of the target objects were determined based on evaluating the respective quality control video frames (See at least Para [0059] “The neural network 112 that determines the pick qualities can be trained in a monitored manner from existing input data (e.g., RGB-D images) annotated so that they are labeled identifying locations on the surface of objects where the objects can be picked.”, Para [0075]).
Regarding Claim 11, modified Gabriel teaches all the elements of claim 1.
However, Gabriel does not explicitly spell out the sorting system of claim 1, further comprising:
a pressure meter configured to sense a pressure associated with an airflow through a gripper mechanism of the diverting mechanism over time during the first pick operation of the first target object; and
a memory configured to store the sensed pressure associated with the airflow through the gripper mechanism over time as a pressure sequence associated with the pick operation of the first target object, wherein the pressure sequence is correlated with representative pressure sequences associated with corresponding first pick quality types to determine whether the first pick operation on the first target object was successful or not.
Nakashima teaches the sorting system of claim 1, further comprising:
a pressure meter configured to sense a pressure associated with an airflow through a gripper mechanism of the diverting mechanism over time during the first pick operation of the first target object (See at least Para [0144] “The control unit 11 then identifies one section of the sections K1 to K8 with the highest compressed air pressure detected by the pressure sensor 32…”); and
a memory configured to store the sensed pressure associated with the airflow through the gripper mechanism over time as a pressure sequence associated with the first pick operation of the first target object (See at least Para [0074] “…pressure data 121 indicating the detection values of the compressed air pressure detected by the pressure sensor 32 in time series…”), wherein the pressure sequence is correlated with representative pressure sequences associated with corresponding pick quality types to determine whether the first pick operation on the first target object was successful or not (Fig 6 Determine whether pickup is successful).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Gabriel with the teachings of Nakashima and include the feature of a pressure meter to keep track of pressure sequences, which are associated with corresponding pick quality types to determine whether the first pick operation on the first target object was successful or not, thereby increasing efficiency by providing pick quality information indicating whether the pick was successful or unsuccessful.
Claim(s) 12, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Gabriel et al. (DE102022206274B4) (Hereinafter Gabriel) in view of Van Heukelom et al. (US 20200174481 A1) (Hereinafter Van Heukelom), Wagner et al. (US 20170136632 A1) (Hereinafter Wagner), Li (US 2023/0125022 A1), and further in view of Duan et al. (US 2021/0069904 A1) (Hereinafter Duan).
Regarding Claim 12, modified Gabriel teaches all the elements of claim 1.
However, Gabriel does not explicitly spell out the sorting system of claim 1, wherein the processor is further configured to store a determination of whether the first pick operation on the first target object was successful or not along with executed sorting parameters associated with the first pick operation.
Duan teaches the sorting system of claim 1, wherein the processor is further configured to store a determination of whether the first pick operation on the first target object was successful or not along with executed sorting parameters associated with the first pick operation (See at least Para [0045] “The output or outputs of unified reasoning module 310 serves as input to ranking and decision module 315. Ranking and decision module 315 may process the unified model provided by unified reasoning module 310 to produce a ranking of potential pick-up points. The ranking may include a ranking of items according to probability of successful pick-up and a ranking of points on one item or multiple items according to the probabilities of successful pick-up. In some examples, the different types of rankings may be included in the same list. The ranking model may include one or more deep neural nets that have been trained for ranking the probability of success in pick-up options…”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Gabriel with the teachings of Duan and include the feature of storing a determination of whether the first pick operation on the first target object was successful or not along with executed sorting parameters associated with the first pick operation, thereby providing precise information regarding pick-up operations that can be used to improve future robot operation for increased efficiency (See at least Para [0023] “An intelligent robot may be able to perceive and perform more like a human would through various learning techniques and computer vision rather than following pre-programmed trajectories.”).
Regarding Claim 13, modified Gabriel teaches all the elements of claim 12. Gabriel further teaches the sorting system of claim 12, wherein the sorting parameters include one or more of the following: the pick location on the first target object, an air pressure, a robot speed, a robot acceleration, a height at which a gripper mechanism of the diverting mechanism hovers when the first target object is deposited, a blow off time, and a conveyor belt speed (See at least Para [0041] “According to various embodiments, the machine learning model 112 is a neural network 112 and the controller 106 supplies input data to the neural network 112 based on the one or more digital images (color images, depth images, or both) of an object 113, and the neural network 112 is configured to indicate locations (or areas) of the object 113 suitable for capturing the object 113…” discloses location or areas of the object which is construed as a parameter of the pick location on the target object).
Regarding Claim 20, modified Gabriel teaches all the elements of claim 16.
However, Gabriel does not explicitly spell out the method of claim 16, further comprising storing a determination of whether the first pick operation on the first target object was successful or not along with executed sorting parameters associated with the first pick operation.
Duan teaches the method of claim 16, further comprising storing a determination of whether the first pick operation on the first target object was successful or not along with executed sorting parameters associated with the first pick operation (See at least Para [0045] “The output or outputs of unified reasoning module 310 serves as input to ranking and decision module 315. Ranking and decision module 315 may process the unified model provided by unified reasoning module 310 to produce a ranking of potential pick-up points. The ranking may include a ranking of items according to probability of successful pick-up and a ranking of points on one item or multiple items according to the probabilities of successful pick-up. In some examples, the different types of rankings may be included in the same list. The ranking model may include one or more deep neural nets that have been trained for ranking the probability of success in pick-up options…”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Gabriel with the teachings of Duan and include the feature of storing a determination of whether the first pick operation on the first target object was successful or not along with executed sorting parameters associated with the first pick operation, thereby providing precise information regarding pick-up operations that can be used to improve future robot operation for increased efficiency (See at least Para [0023] “An intelligent robot may be able to perceive and perform more like a human would through various learning techniques and computer vision rather than following pre-programmed trajectories.”).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Chavez et al. (US 11813758 B2) teaches object picking using visual data and a score associated with a successful grasp strategy.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAHEDA HOQUE whose telephone number is (571)270-5310. The examiner can normally be reached Monday-Friday 8:00 am- 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ramon Mercado can be reached at 571-270-5744. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SHAHEDA HOQUE/Examiner, Art Unit 3658
/Ramon A. Mercado/Supervisory Patent Examiner, Art Unit 3658