Prosecution Insights
Last updated: April 19, 2026
Application No. 18/176,337

OBJECT MANIPULATION APPARATUS, HANDLING METHOD, AND PROGRAM PRODUCT

Status: Non-Final OA (§103)
Filed: Feb 28, 2023
Examiner: MOLNAR, SIDNEY LEIGH
Art Unit: 3656
Tech Center: 3600 — Transportation & Electronic Commerce
Assignee: Kabushiki Kaisha Toshiba
OA Round: 3 (Non-Final)
Grant Probability: 54% (Moderate)
Expected OA Rounds: 3-4
Time to Grant: 2y 4m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 54% of resolved cases (7 granted / 13 resolved; +1.8% vs TC avg)
Interview Lift: +85.7% for resolved cases with an interview (strong)
Typical Timeline: 2y 4m average prosecution; 31 applications currently pending
Career History: 44 total applications across all art units
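
The headline figures above are simple ratios over the examiner's 13 resolved cases. As a rough illustration, the Python sketch below recomputes them; the counts (7 granted / 13 resolved) come from this page, but the assumption that every interviewed resolved case was granted is ours, chosen only because it reproduces the displayed +85.7% lift (13/7 - 1 = 6/7) and the capped 99% figure. The dashboard does not publish the underlying per-case split.

```python
# Illustrative recomputation of the dashboard's examiner metrics.
# Counts are from the page; the 100% with-interview rate is an assumption.

def allow_rate(granted: int, resolved: int) -> float:
    """Career allow rate: grants as a share of resolved cases."""
    return granted / resolved

def interview_lift(rate_with: float, rate_without: float) -> float:
    """Relative improvement in grant rate when an interview was held."""
    return rate_with / rate_without - 1.0

base = allow_rate(7, 13)                                  # ~0.538, shown as 54%
lift = interview_lift(rate_with=1.0, rate_without=base)   # 6/7, ~ +85.7%
with_interview = min(base * (1.0 + lift), 0.99)           # displayed capped at 99%
print(f"base {base:.0%}, with interview {with_interview:.0%} (+{lift:.1%} lift)")
```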

Statute-Specific Performance

§101: 8.7% (-31.3% vs TC avg)
§103: 42.2% (+2.2% vs TC avg)
§102: 22.3% (-17.7% vs TC avg)
§112: 26.1% (-13.9% vs TC avg)
Tech Center averages are estimates • Based on career data from 13 resolved cases

Office Action (§103)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on November 28, 2025 has been entered.

Response to Amendment

This correspondence is in response to amendments filed on November 28, 2025. Claims 1, 6, 7, and 9 are amended. Claims 2-4 and 8 are as originally filed or previously presented. Claim 5 is cancelled. Claims 10-15 are newly added.

Response to Arguments

Applicant generally argues that the Earth model of Wang as recited in the previous rejection does not teach the circular anchor of claims 1 and 6, specifically as amended. Applicant's arguments with respect to claims 1 and 6 have been considered but are moot because the new ground of rejection does not rely on the same combination of references applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Objections

Claims 9 and 14 are objected to because of the following informalities: Claim 9 depends from claim 7, which in turn depends from claim 6; claim 9 also depends from claim 8, which likewise depends from claim 6. Although not technically incorrect in form, Examiner recommends simply listing the further limitations of claim 8 when furthering the invention of claim 7, such that the limitations of claim 6 are not doubly accounted for in claim 9. Such amendment would be most consistent with common U.S. practice. Similar amendments are recommended for claim 14, which depends from both claim 7 and claim 12, each of which depends from claim 6. Further, as an example, claim 15 furthers the features of claim 14 by restating the further limitations of the method of claim 13 rather than depending from claim 13 itself. Appropriate correction is required.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-4 and 6-15 are rejected under 35 U.S.C. 103 as being unpatentable over Humayun et al. (US 2022/0016766 A1; hereinafter "Humayun") in view of Maehara et al. (US 2011/00774171 A1; hereinafter "Maehara").

Regarding claim 1, Humayun teaches an object manipulation apparatus (robotic arm 120 in Fig. 2A and 2B) comprising: one or more hardware processors coupled to a memory (The "processing system" includes one or more processors connected to a non-transitory computer readable storage medium [0124].) and configured to function as a feature calculation unit to calculate a feature map indicating a feature of a captured image of a grasping target object ("The graspability network can rapidly generate graspability scores for pixels and/or a graspability map 108 for an image of the scene, wherein the grasp(s) can be selected based on the graspability scores. In some variants, auxiliary scene information can also be generated in parallel (e.g., the object detector can be run on the image to extract object poses), wherein the grasps can be further selected based on the auxiliary data (e.g., the grasps identified from the heatmap can be prioritized based on the corresponding object poses)." [0024]. The features calculated for the associated map are the graspability scores and grasping object poses.); a region calculation unit to … calculate a position and a posture of a handling tool ("The grasp selector 146 is preferably configured to select grasp points from the output of the graspability network, but can additionally or alternatively be configured to select grasp points from the output of the object detector (e.g., an object detector can pre-process inputs to the grasp selector)." [0041]. "Planning the grasp can include determining a grasp pose, where the grasp is planned based on the grasp point and the grasp pose. In a first variant, the grasp pose can be determined from the object parameters output by an object detector (e.g., running in series and/or parallel with the graspability network/grasp selector, based on the same or a contemporaneously-captured image), and planning the grasp for the object parameters for the detected object that encompasses (e.g., includes, is associated with) the grasp point" [0099]. The system, as outlined, thus uses a grasp selector to select a grasp point and object to grasp before planning the grasp, inclusive of the position and posture of the robotic manipulator. The grasp selector selects grasp points on the basis of the graspability network, i.e., the feature map.) …, the position and the posture enabling the handling tool to grasp the grasping target object ("The computing system can include a motion planner 148, which functions to determine control instructions for the robotic arm to execute a grasp attempt for a selected grasp point" [0043]. Thus, the position and posture enable the handling tool to grasp the grasping target object by determining control instructions for a selected grasp point.); …

However, Humayun does not explicitly teach … detect a circular anchor on the feature map, and calculate a position and a posture of a handling tool by a first parameter on the circular anchor, the position and the posture enabling the handling tool to grasp the grasping target object; and a grasp configuration (GC) calculation unit to calculate a GC of the handling tool by converting the position and the posture indicated by the first parameter into a second parameter indicating a position and a posture of the handling tool on the captured image, wherein the circular anchor is a circumscribed circle of the GC, the first parameter includes a first sub-parameter indicating coordinates of a center of the circumscribed circle and a second sub-parameter indicating a radius of the circumscribed circle, and the GC calculation unit calculates the GC by acquiring the second parameter from the first sub-parameter, the second sub-parameter, and a third sub-parameter indicating a midpoint of a side of the GC.

Maehara, in the same field of endeavor, teaches … detect a circular anchor on the feature map (The "possibility of interference J set to the approximate circle of graspable member S" [0060] will be considered the circular anchor on the feature map, i.e., identified object image. See Fig. 6.), and calculate a position and a posture of a handling tool by a first parameter on the circular anchor, the position and the posture enabling the handling tool to grasp the grasping target object (See Fig. 3 S130 "set grasping target," which determines the grasping target based on grasping candidates and grasping attitude (see Paragraphs [0074-0079]). Viable grasps are selected based on the center of the graspable member, i.e., center of the circular anchor, as well as the non-interference attitude ranges, i.e., radius of the circular anchor (see Fig. 15-16). The center and radius are determined as the first parameter of the circular anchor (described below). Further, S150 performs grasping based on the position and posture of the fingers determined when selecting the grasping target.); and a grasp configuration (GC) calculation unit to calculate a GC of the handling tool by converting the position and the posture indicated by the first parameter into a second parameter indicating a position and a posture of the handling tool on the captured image ("In step S150, the object grasping control apparatus controls the grasping unit 3 to grasp the set grasping target with the set attitude and performs grasping of the graspable member W" [0081]. Thus, the parameters of Fig. 6 are converted into a control signal to perform the grasping, thus converting the first parameter into a position and posture of the handling tool which is required to perform the grasping. Note, as this is a 103 rejection, such conversion is implied by this step of controlling the grasping unit to perform the grasp of the target object.), wherein the circular anchor is a circumscribed circle of the GC (The "possibility of interference J" is circumscribed around fingers F, i.e., the GC.), the first parameter includes a first sub-parameter indicating coordinates of a center of the circumscribed circle and a second sub-parameter indicating a radius of the circumscribed circle ("Coordinates of the center of the approximate circle of a graspable member S whose identifier is (i, j) are set as p (i, j) (x (i, j), y (i, j)), and the radius of the approximate circle of a graspable member S is set as r (i, j)" [0052]. Provided that "possibility of interference J" is set on S, Fig. 6 shows that J shares the center p (i, j) and would have a radius which is the sum of graspable member radius r (i, j) and finger diameter Rf.), and the GC calculation unit calculates the GC by acquiring the second parameter from the first sub-parameter, the second sub-parameter, and a third sub-parameter indicating a midpoint of a side of the GC (Fig. 6 shows the position of fingers F aligned through their middle axis, i.e., midpoints of a side of the GC, which are measured based on their rotational configurations of ψ1 and ψ2. Alignment of the position is based on the center of the graspable member and the total radius of the possibility of interference regarding the fingers. Thus, the GC is calculated by acquiring all sub-parameters above, which contribute to the alignment of fingers F in Fig. 6.).

Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the calculation of position and posture of the handling tool on the feature map as taught by Humayun to include the circular anchor calculations as taught by Maehara with a reasonable expectation of success. One of ordinary skill in the art would have been motivated to make this modification as the inclusion of the circular anchor (referred to as "the possibility of interference") in determining the grip configuration for the fingers in grasping the graspable member allows for steady grasps regardless of any attitude recognition errors which result from visual sensing or positioning errors of the grasping attitude (Maehara, [0082]).
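Editor's note: the rejection maps the claimed sub-parameters onto circle geometry, with a center (cx, cy), a radius R, and the midpoint (dRx, dRy) of one side of the GC. The claimed conversion formula itself appears only as an unreproduced image later in the action (claims 10, 12, and 14), so the Python sketch below is a plausible reconstruction under the stated geometry, not the application's actual equation; the function name and the choice of which side the midpoint lies on are our assumptions.

```python
# Hedged sketch: recover a GC (x, y, w, h, theta) from a circular anchor,
# assuming the GC is a rectangle whose circumscribed circle is the anchor,
# (cx, cy) and R are the circle's center and radius (first and second
# sub-parameters), and (dRx, dRy) is the midpoint of the side of length h
# (third sub-parameter). Illustration only; not the claimed formula.
import math

def gc_from_anchor(cx: float, cy: float, R: float, dRx: float, dRy: float):
    x, y = cx, cy                         # GC center coincides with circle center
    vx, vy = dRx - cx, dRy - cy           # vector from center to side midpoint
    theta = math.atan2(vy, vx)            # orientation vs. image horizontal axis
    w = 2.0 * math.hypot(vx, vy)          # side midpoint sits w/2 from the center
    h = 2.0 * math.sqrt(max(R**2 - (w / 2.0)**2, 0.0))  # corner lies on the circle
    return x, y, w, h, theta
```

For example, a circle of radius 25 centered at (100, 80) with side midpoint (120, 80) yields w = 40, h = 30, and theta = 0.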
Regarding claim 2, Humayun as modified by Maehara (references made to Humayun) teaches the apparatus according to claim 1, wherein the feature calculation unit implements the calculation of the feature map by receiving input of a plurality of pieces of image sensor information ("The input to the graspability network can include: an RGB image, receptive field from image, optionally depth, optionally object detector output (e.g., object parameters, etc.), and/or any other suitable information. In a first variant, the input to the graspability network is a 2D image having 3 channels per pixel (i.e., red-green-blue; RGB). In a second variant, the input to the graspability network can be a 2.5D image having 4 channels per pixel (RGB-depth image). In a first example, the depth can be a sensed depth (e.g., from a lower-accuracy sensor or a higher-accuracy sensor such a Lidar). In a second example, the depth can be a 'refined' depth determined by a trained depth enhancement network (e.g., wherein the depth enhancement network can be a precursor neural network or form the initial layers of the graspability network; etc.). In a third variant, the input to the graspability network can include an object detection output as an input feature (e.g., an object parameter, such a characteristic axis of a detected object)" [0069]. Thus, the feature map receives a plurality of inputs from a plurality of pieces of image sensor information, including color, object detection, and depth perception images.), extracting a plurality of intermediate features extracted by a plurality of feature extractors from the plurality of pieces of image sensor information, integrating the plurality of intermediate features, and fusing, by convolution calculation, features of the plurality of pieces of image sensor information including the plurality of intermediate features ("The graspability network is preferably trained based on the labelled images. The labelled images can include: the image (e.g., RGB, RGB-D, RGB and point cloud, etc.), grasp point (e.g., the image features depicting a 3D physical point to grasp in the scene), and grasp outcome; and optionally the object parameters (e.g., object pose, surface normal, etc.), effector parameters (e.g., end effector pose, grasp pose, etc.), and/or other information. In particular, the graspability network is trained to predict the outcome of a grasp attempt at the grasp point, given the respective image as the input. However, the network can additionally or alternatively be trained based on object parameters and/or robotic manipulator parameters, such as may be used to: train the graspability network to predict the object parameter values (or bins) and/or robotic manipulator parameter values (or bins), given the respective image as input" [0075]. "The graspability network can be a neural network (e.g., CNN, fully connected, etc.), such as a convolutional neural network (CNN), fully convolutional neural network (FCN), artificial neural network (ANN), a feed forward network, a clustering algorithm, and/or any other suitable neural network or ML model" [0039]. The image sensor information is labelled such that it may be fed into a training model in order to delineate features for each of the images. The features are integrated such that they are provided to a convolutional neural network model in order to fuse the image data and determine graspability outcomes.).

Regarding claim 3, Humayun as modified by Maehara (references made to Humayun) teaches the apparatus according to claim 2, wherein the plurality of pieces of image sensor information include a color image indicating a color of the captured image and a depth image indicating a distance from camera to objects in the captured image (Input for the graspability network includes RGB color images with red-green-blue inputs and depth images indicating distance inputs from the sensor data [0069].), and the plurality of feature extractors is implemented by a neural network having an encoder-decoder model structure ("The graspability network can include an encoder (e.g., VGG-16, ResNet, etc.), a decoder (e.g., CCN decoder, FCN decoder, RNN-based decoder, etc.), and/or any other suitable components" [0039].).

Regarding claim 4, Humayun as modified by Maehara teaches the apparatus according to claim 1. Humayun further teaches wherein the one or more hardware processors are further configured to function as a position heatmap calculation unit to calculate a position heatmap indicating success probability for the grasping target object ("The computing system can include a graspability network 144 which functions to determine a grasp score (e.g., prediction of grasp success probability) for points and/or regions of an image... In one example, the graspability network 144 functions to generate a graspability map (e.g., grasp score mask, a heatmap) for an object scene" [0038]. The grasp scores indicate the probability of success for each predicted grasp point, and this information is applied to each pixel/image region to produce the graspability map, which may take the form of a heatmap.), and the region calculation unit is implemented by a neural network … on the basis of the position heatmap ("The graspability network can be a neural network (e.g., CNN, fully connected, etc.), such as a convolutional neural network (CNN), fully convolutional neural network (FCN), artificial neural network (ANN), a feed forward network, a clustering algorithm, and/or any other suitable neural network or ML model. The graspability network can include an encoder (e.g., VGG-16, ResNet, etc.), a decoder (e.g., CCN decoder, FCN decoder, RNN-based decoder, etc.), and/or any other suitable components" [0039]. "The grasp selector 146 is preferably configured to select grasp points from the output of the graspability network.... Additionally or alternatively, the grasp selector can function to select a grasp point based on a plurality of object poses and/or based on a graspability heat map (e.g., grasp score mask; examples are shown in FIGS. 6, 7, and 8)" [0041]. Thus, with regard to the grasp selector, i.e., the region calculation unit, the graspability network and graspability heat map are the driving force behind the grasp selector. The graspability network is a trained neural network.).

However, Humayun does not teach … detecting the circular anchor on the feature map … and calculating the first parameter on the detected circular anchor. Maehara further teaches … detecting the circular anchor on the feature map … and calculating the first parameter on the detected circular anchor (In the steps of Fig. 3, the graspable attitude range, i.e., first parameter based on center and radius of the possibility of interference, is determined by setting a graspable member model, i.e., the circle which approximates the graspable member. Such graspable member model is adjusted for the possibility of interference (S123). As such, the circular anchor is detected on the feature map of the graspable member.).

Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the feature map of Humayun to include the detection of a circular anchor on the features which require grasping as taught by Maehara. One of ordinary skill would have been motivated to make this modification because such grasps resulting from detection of the circular approximation of graspable members will not interfere with other features, i.e., graspable members, which surround the target member (Maehara, [0064]).
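Editor's note: claims 2-4 describe a concrete pipeline, with per-modality feature extractors, integration of their intermediate features, convolutional fusion, and (claim 4) a per-pixel grasp-success heatmap. Neither reference is quoted at the level of layer shapes, so the PyTorch-style sketch below only illustrates that pattern; the class name, channel counts, and layer choices are our assumptions, not Humayun's disclosed network.

```python
# Minimal sketch of the claimed fusion path: separate extractors for the
# color and depth images, concatenation of their intermediate features,
# fusion by convolution, and an optional heatmap head (claim 4).
# All sizes are illustrative assumptions.
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.rgb_encoder = nn.Sequential(    # feature extractor for the color image
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.depth_encoder = nn.Sequential(  # feature extractor for the depth image
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU())
        self.fuse = nn.Conv2d(64, 64, 3, padding=1)  # fusion by convolution calculation
        self.heatmap_head = nn.Conv2d(64, 1, 1)      # per-pixel grasp-success score

    def forward(self, rgb, depth):
        feats = torch.cat([self.rgb_encoder(rgb),
                           self.depth_encoder(depth)], dim=1)   # integrate features
        fused = torch.relu(self.fuse(feats))                    # fused feature map
        return fused, torch.sigmoid(self.heatmap_head(fused))   # map + heatmap

fused, heatmap = FusionNet()(torch.rand(1, 3, 96, 96), torch.rand(1, 1, 96, 96))
```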
Regarding claim 6, Humayun teaches a handling method implemented by a computer (The processing system, i.e., computer, is responsible for performing the methods [0124].), the method comprising: calculating a feature map indicating a feature of an image on the basis of image sensor information including a grasping target object ("The graspability network can rapidly generate graspability scores for pixels and/or a graspability map 108 for an image of the scene, wherein the grasp(s) can be selected based on the graspability scores. In some variants, auxiliary scene information can also be generated in parallel (e.g., the object detector can be run on the image to extract object poses), wherein the grasps can be further selected based on the auxiliary data (e.g., the grasps identified from the heatmap can be prioritized based on the corresponding object poses)." [0024]. The features calculated for the associated map are the graspability scores and grasping object poses.); … calculating a position and a posture of a handling tool ("The grasp selector 146 is preferably configured to select grasp points from the output of the graspability network, but can additionally or alternatively be configured to select grasp points from the output of the object detector (e.g., an object detector can pre-process inputs to the grasp selector)." [0041]. "Planning the grasp can include determining a grasp pose, where the grasp is planned based on the grasp point and the grasp pose. In a first variant, the grasp pose can be determined from the object parameters output by an object detector (e.g., running in series and/or parallel with the graspability network/grasp selector, based on the same or a contemporaneously-captured image), and planning the grasp for the object parameters for the detected object that encompasses (e.g., includes, is associated with) the grasp point" [0099]. The system, as outlined, thus uses a grasp selector to select a grasp point and object to grasp before planning the grasp, inclusive of the position and posture of the robotic manipulator. The grasp selector selects grasp points on the basis of the graspability network, i.e., the feature map.) …, the position and the posture enabling the handling tool to grasp the grasping target object ("The computing system can include a motion planner 148, which functions to determine control instructions for the robotic arm to execute a grasp attempt for a selected grasp point" [0043]. Thus, the position and posture enable the handling tool to grasp the grasping target object by determining control instructions for a selected grasp point.); …

However, Humayun does not explicitly teach … detecting a circular anchor on the feature map, and calculating a position and a posture of a handling tool by a first parameter on the circular anchor, the position and the posture enabling the handling tool to grasp the grasping target object; and calculating a grasp configuration (GC) calculation unit to calculate a GC of the handling tool by converting the position and the posture indicated by the first parameter into a second parameter indicating a position and a posture of the handling tool on the captured image, wherein the circular anchor is a circumscribed circle of the GC, the first parameter includes a first sub-parameter indicating coordinates of a center of the circumscribed circle and a second sub-parameter indicating a radius of the circumscribed circle, and the GC calculation unit calculates the GC by acquiring the second parameter from the first sub-parameter, the second sub-parameter, and a third sub-parameter indicating a midpoint of a side of the GC.

Maehara, in the same field of endeavor, teaches … detecting a circular anchor on the feature map (The "possibility of interference J set to the approximate circle of graspable member S" [0060] will be considered the circular anchor on the feature map, i.e., identified object image. See Fig. 6.), and calculating a position and a posture of a handling tool by a first parameter on the circular anchor, the position and the posture enabling the handling tool to grasp the grasping target object (See Fig. 3 S130 "set grasping target," which determines the grasping target based on grasping candidates and grasping attitude (see Paragraphs [0074-0079]). Viable grasps are selected based on the center of the graspable member, i.e., center of the circular anchor, as well as the non-interference attitude ranges, i.e., radius of the circular anchor (see Fig. 15-16). The center and radius are determined as the first parameter of the circular anchor (described below). Further, S150 performs grasping based on the position and posture of the fingers determined when selecting the grasping target.); and calculating a grasp configuration (GC) calculation unit to calculate a GC of the handling tool by converting the position and the posture indicated by the first parameter into a second parameter indicating a position and a posture of the handling tool on the captured image ("In step S150, the object grasping control apparatus controls the grasping unit 3 to grasp the set grasping target with the set attitude and performs grasping of the graspable member W" [0081]. Thus, the parameters of Fig. 6 are converted into a control signal to perform the grasping, thus converting the first parameter into a position and posture of the handling tool which is required to perform the grasping. Note, as this is a 103 rejection, such conversion is implied by this step of controlling the grasping unit to perform the grasp of the target object.), wherein the circular anchor is a circumscribed circle of the GC (The "possibility of interference J" is circumscribed around fingers F, i.e., the GC.), the first parameter includes a first sub-parameter indicating coordinates of a center of the circumscribed circle and a second sub-parameter indicating a radius of the circumscribed circle ("Coordinates of the center of the approximate circle of a graspable member S whose identifier is (i, j) are set as p (i, j) (x (i, j), y (i, j)), and the radius of the approximate circle of a graspable member S is set as r (i, j)" [0052]. Provided that "possibility of interference J" is set on S, Fig. 6 shows that J shares the center p (i, j) and would have a radius which is the sum of graspable member radius r (i, j) and finger diameter Rf.), and the GC calculation unit calculates the GC by acquiring the second parameter from the first sub-parameter, the second sub-parameter, and a third sub-parameter indicating a midpoint of a side of the GC (Fig. 6 shows the position of fingers F aligned through their middle axis, i.e., midpoints of a side of the GC, which are measured based on their rotational configurations of ψ1 and ψ2. Alignment of the position is based on the center of the graspable member and the total radius of the possibility of interference regarding the fingers. Thus, the GC is calculated by acquiring all sub-parameters above, which contribute to the alignment of fingers F in Fig. 6.).

Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the calculation of position and posture of the handling tool on the feature map as taught by Humayun to include the circular anchor calculations as taught by Maehara with a reasonable expectation of success. One of ordinary skill in the art would have been motivated to make this modification as the inclusion of the circular anchor (referred to as "the possibility of interference") in determining the grip configuration for the fingers in grasping the graspable member allows for steady grasps regardless of any attitude recognition errors which result from visual sensing or positioning errors of the grasping attitude (Maehara, [0082]).

Regarding claim 7, Humayun as modified by Maehara (references made to Humayun) teaches a computer program product comprising a non-transitory computer-readable recording medium on which an executable program is recorded ("Alternative embodiments implement the above methods and/or processing modules in non-transitory computer-readable media, storing computer-readable instructions that, when executed by a processing system, cause the processing system to perform the method(s) discussed herein" [0124].), the program instructing a computer to execute the handling method according to claim 6 (see above).

Regarding claim 8, Humayun as modified by Maehara (references made to Humayun) teaches the handling method according to claim 6, wherein the calculating the feature map is performed by receiving input of a plurality of pieces of image sensor information ("The input to the graspability network can include: an RGB image, receptive field from image, optionally depth, optionally object detector output (e.g., object parameters, etc.), and/or any other suitable information. In a first variant, the input to the graspability network is a 2D image having 3 channels per pixel (i.e., red-green-blue; RGB). In a second variant, the input to the graspability network can be a 2.5D image having 4 channels per pixel (RGB-depth image). In a first example, the depth can be a sensed depth (e.g., from a lower-accuracy sensor or a higher-accuracy sensor such a Lidar). In a second example, the depth can be a 'refined' depth determined by a trained depth enhancement network (e.g., wherein the depth enhancement network can be a precursor neural network or form the initial layers of the graspability network; etc.). In a third variant, the input to the graspability network can include an object detection output as an input feature (e.g., an object parameter, such a characteristic axis of a detected object)" [0069]. Thus, the feature map receives a plurality of inputs from a plurality of pieces of image sensor information, including color, object detection, and depth perception images.), extracting a plurality of intermediate features extracted by a plurality of feature extractors from the plurality of pieces of image sensor information, integrating the plurality of intermediate features, and fusing, by convolution calculation, features of the plurality of pieces of image sensor information including the plurality of intermediate features ("The graspability network is preferably trained based on the labelled images. The labelled images can include: the image (e.g., RGB, RGB-D, RGB and point cloud, etc.), grasp point (e.g., the image features depicting a 3D physical point to grasp in the scene), and grasp outcome; and optionally the object parameters (e.g., object pose, surface normal, etc.), effector parameters (e.g., end effector pose, grasp pose, etc.), and/or other information. In particular, the graspability network is trained to predict the outcome of a grasp attempt at the grasp point, given the respective image as the input. However, the network can additionally or alternatively be trained based on object parameters and/or robotic manipulator parameters, such as may be used to: train the graspability network to predict the object parameter values (or bins) and/or robotic manipulator parameter values (or bins), given the respective image as input" [0075]. "The graspability network can be a neural network (e.g., CNN, fully connected, etc.), such as a convolutional neural network (CNN), fully convolutional neural network (FCN), artificial neural network (ANN), a feed forward network, a clustering algorithm, and/or any other suitable neural network or ML model" [0039]. The image sensor information is labelled such that it may be fed into a training model in order to delineate features for each of the images. The features are integrated such that they are provided to a convolutional neural network model in order to fuse the image data and determine graspability outcomes.).

Regarding claim 9, Humayun as modified by Maehara (references made to Humayun) teaches the computer program product according to claim 7, wherein the calculation of the feature map is performed by receiving input of a plurality of pieces of image sensor information ("The input to the graspability network can include: an RGB image, receptive field from image, optionally depth, optionally object detector output (e.g., object parameters, etc.), and/or any other suitable information. In a first variant, the input to the graspability network is a 2D image having 3 channels per pixel (i.e., red-green-blue; RGB). In a second variant, the input to the graspability network can be a 2.5D image having 4 channels per pixel (RGB-depth image). In a first example, the depth can be a sensed depth (e.g., from a lower-accuracy sensor or a higher-accuracy sensor such a Lidar). In a second example, the depth can be a 'refined' depth determined by a trained depth enhancement network (e.g., wherein the depth enhancement network can be a precursor neural network or form the initial layers of the graspability network; etc.). In a third variant, the input to the graspability network can include an object detection output as an input feature (e.g., an object parameter, such a characteristic axis of a detected object)" [0069]. Thus, the feature map receives a plurality of inputs from a plurality of pieces of image sensor information, including color, object detection, and depth perception images.), extracting a plurality of intermediate features extracted by a plurality of feature extractors from the plurality of pieces of image sensor information, integrating the plurality of intermediate features, and fusing, by convolution calculation, features of the plurality of pieces of image sensor information including the plurality of intermediate features ("The graspability network is preferably trained based on the labelled images. The labelled images can include: the image (e.g., RGB, RGB-D, RGB and point cloud, etc.), grasp point (e.g., the image features depicting a 3D physical point to grasp in the scene), and grasp outcome; and optionally the object parameters (e.g., object pose, surface normal, etc.), effector parameters (e.g., end effector pose, grasp pose, etc.), and/or other information. In particular, the graspability network is trained to predict the outcome of a grasp attempt at the grasp point, given the respective image as the input. However, the network can additionally or alternatively be trained based on object parameters and/or robotic manipulator parameters, such as may be used to: train the graspability network to predict the object parameter values (or bins) and/or robotic manipulator parameter values (or bins), given the respective image as input" [0075]. "The graspability network can be a neural network (e.g., CNN, fully connected, etc.), such as a convolutional neural network (CNN), fully convolutional neural network (FCN), artificial neural network (ANN), a feed forward network, a clustering algorithm, and/or any other suitable neural network or ML model" [0039]. The image sensor information is labelled such that it may be fed into a training model in order to delineate features for each of the images. The features are integrated such that they are provided to a convolutional neural network model in order to fuse the image data and determine graspability outcomes.).
Regarding claim 10, Humayun as modified by Maehara (references made to Maehara) teaches the apparatus according to claim 1, wherein the second parameter is constituted by elements {x, y, w, h, θ}, the elements x and y indicating a center position of the handling tool ("Upon performing grasping, the object grasping control apparatus positions the grasping center O of the grasping unit 3 at the center of one of the approximate circles of the graspable member S set to the graspable member W" [0051]. Thus, O, which is the center of the handling tool (see Fig. 2), is positioned at the center of graspable member S and thus would have center position p(i, j).), the element w indicating an opening width of the handling tool (The opening width of the handling tool may be described as the diameter of graspable member S, which is 2*r(i, j).), the element h indicating a width of a finger of the handling tool (The width of the finger of the handling tool is represented by its diameter Rf.), and the element θ indicating an angle between the element w and an image horizontal axis (Such an angle may be determined as the difference between 360° (2π) and the angle ψ2 as shown in Fig. 6.) …

Maehara does not explicitly teach that the second parameter is acquired by a following equation with the first sub-parameter as cx and cy, the second sub-parameter as R, and the third sub-parameter as dRx and dRy: [claimed equation rendered as an image in the original office action; not reproduced here]. However, as stated in the rejection of claim 1, the first sub-parameter and second sub-parameter are defined as center coordinates p(i, j) and the sum of r(i, j) and Rf, respectively. Examiner previously defined the third sub-parameter as angular attitudes ψ1 and ψ2, and it would have been obvious to one of ordinary skill in the art to convert such angular positions to linear positions based on the provided measured distance from the center of the grasping model to the center of finger F. Given that the parameters necessary for performing the desired calculations and producing the desired outcomes are properly defined, Maehara discloses the described calculations for the second parameter. However, it is silent as to the specifics of the mathematical formula applied in determining such parameters ([equation image; not reproduced here]). Nevertheless, applying any mathematical formulae, including that of the claimed invention, would have been an obvious design choice for one of ordinary skill in the art because it facilitates known mathematical means for deriving grasp coordinates, as shown by Maehara. Since the invention failed to provide novel or unexpected results from the usage of said claimed formula, use of any mathematical means, including that of the claimed invention, would be an obvious matter of design choice within the skill of the art.

Regarding claim 11, Humayun as modified by Maehara (references made to Maehara) teaches the apparatus according to claim 10, wherein the third sub-parameter indicates coordinates of a center of the element h (The center of the grip position would be provided in the middle of the finger F as shown in Fig. 6, which are also the center coordinates of the finger width Rf, i.e., the coordinates of a center of the element h.).
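Editor's note: for claims 10-15 the examiner reads Maehara's quantities directly onto the claimed elements {x, y, w, h, θ}. Written out as code, that reading looks like the sketch below; the function name, signature, and variable names are illustrative assumptions, and the mapping records the office action's interpretation rather than an equation stated in either document.

```python
# The office action's claim-10 mapping of Maehara's quantities onto the
# claimed elements {x, y, w, h, theta}: (p_x, p_y) is the approximate-circle
# center p(i, j), r its radius r(i, j), Rf the finger diameter, and psi2 the
# finger attitude angle from Fig. 6. Names and signature are illustrative.
import math

def second_parameter(p_x: float, p_y: float, r: float, Rf: float, psi2: float):
    x, y = p_x, p_y               # grasping center O placed at p(i, j)
    w = 2.0 * r                   # opening width read as the member's diameter
    h = Rf                        # finger width read as the finger diameter
    theta = 2.0 * math.pi - psi2  # angle vs. the image horizontal axis
    return x, y, w, h, theta
```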
Regarding claim 12, Humayun as modified by Maehara (references made to Maehara) teaches the handling method according to claim 6, wherein the second parameter is constituted by elements {x, y, w, h, θ}, the elements x and y indicating a center position of the handling tool ("Upon performing grasping, the object grasping control apparatus positions the grasping center O of the grasping unit 3 at the center of one of the approximate circles of the graspable member S set to the graspable member W" [0051]. Thus, O, which is the center of the handling tool (see Fig. 2), is positioned at the center of graspable member S and thus would have center position p(i, j).), the element w indicating an opening width of the handling tool (The opening width of the handling tool may be described as the diameter of graspable member S, which is 2*r(i, j).), the element h indicating a width of a finger of the handling tool (The width of the finger of the handling tool is represented by its diameter Rf.), and the element θ indicating an angle between the element w and an image horizontal axis (Such an angle may be determined as the difference between 360° (2π) and the angle ψ2 as shown in Fig. 6.) …

Maehara does not explicitly teach that the second parameter is acquired by a following equation with the first sub-parameter as cx and cy, the second sub-parameter as R, and the third sub-parameter as dRx and dRy: [claimed equation rendered as an image in the original office action; not reproduced here]. However, as stated in the rejection of claim 1, the first sub-parameter and second sub-parameter are defined as center coordinates p(i, j) and the sum of r(i, j) and Rf, respectively. Examiner previously defined the third sub-parameter as angular attitudes ψ1 and ψ2, and it would have been obvious to one of ordinary skill in the art to convert such angular positions to linear positions based on the provided measured distance from the center of the grasping model to the center of finger F. Given that the parameters necessary for performing the desired calculations and producing the desired outcomes are properly defined, Maehara discloses the described calculations for the second parameter. However, it is silent as to the specifics of the mathematical formula applied in determining such parameters ([equation image; not reproduced here]). Nevertheless, applying any mathematical formulae, including that of the claimed invention, would have been an obvious design choice for one of ordinary skill in the art because it facilitates known mathematical means for deriving grasp coordinates, as shown by Maehara. Since the invention failed to provide novel or unexpected results from the usage of said claimed formula, use of any mathematical means, including that of the claimed invention, would be an obvious matter of design choice within the skill of the art.

Regarding claim 13, Humayun as modified by Maehara (references to Maehara) teaches the handling method according to claim 12, wherein the third sub-parameter indicates coordinates of a center of the element h (The center of the grip position would be provided in the middle of the finger F as shown in Fig. 6, which are also the center coordinates of the finger width Rf, i.e., the coordinates of a center of the element h.).

Regarding claim 14, Humayun as modified by Maehara (references to Maehara) teaches the computer program product according to claim 7, wherein the program further instructs the computer to execute the handling method, the elements x and y indicating a center position of the handling tool ("Upon performing grasping, the object grasping control apparatus positions the grasping center O of the grasping unit 3 at the center of one of the approximate circles of the graspable member S set to the graspable member W" [0051]. Thus, O, which is the center of the handling tool (see Fig. 2), is positioned at the center of graspable member S and thus would have center position p(i, j).), the element w indicating an opening width of the handling tool (The opening width of the handling tool may be described as the diameter of graspable member S, which is 2*r(i, j).), the element h indicating a width of a finger of the handling tool (The width of the finger of the handling tool is represented by its diameter Rf.), and the element θ indicating an angle between the element w and an image horizontal axis (Such an angle may be determined as the difference between 360° (2π) and the angle ψ2 as shown in Fig. 6.) …

Maehara does not explicitly teach that the second parameter is acquired by a following equation with the first sub-parameter as cx and cy, the second sub-parameter as R, and the third sub-parameter as dRx and dRy: [claimed equation rendered as an image in the original office action; not reproduced here]. However, as stated in the rejection of claim 1, the first sub-parameter and second sub-parameter are defined as center coordinates p(i, j) and the sum of r(i, j) and Rf, respectively. Examiner previously defined the third sub-parameter as angular attitudes ψ1 and ψ2, and it would have been obvious to one of ordinary skill in the art to convert such angular positions to linear positions based on the provided measured distance from the center of the grasping model to the center of finger F. Given that the parameters necessary for performing the desired calculations and producing the desired outcomes are properly defined, Maehara discloses the described calculations for the second parameter. However, it is silent as to the specifics of the mathematical formula applied in determining such parameters ([equation image; not reproduced here]). Nevertheless, applying any mathematical formulae, including that of the claimed invention, would have been an obvious design choice for one of ordinary skill in the art because it facilitates known mathematical means for deriving grasp coordinates, as shown by Maehara. Since the invention failed to provide novel or unexpected results from the usage of said claimed formula, use of any mathematical means, including that of the claimed invention, would be an obvious matter of design choice within the skill of the art.

Regarding claim 15, Humayun as modified by Maehara (references to Maehara) teaches the computer program product according to claim 14, wherein the third sub-parameter indicates coordinates of a center of the element h (The center of the grip position would be provided in the middle of the finger F as shown in Fig. 6, which are also the center coordinates of the finger width Rf, i.e., the coordinates of a center of the element h.).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Hasegawa et al. (US 2024/0100692 A1) and Kitai (US 2020/0406466 A1) additionally teach features similar to the circular anchor which has been claimed by Applicant.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY L MOLNAR, whose telephone number is (571) 272-2276. The examiner can normally be reached 8 A.M. to 3 P.M. EST Monday-Friday. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Jonathan (Wade) Miles, can be reached at (571) 270-7777. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/S.L.M./
Examiner, Art Unit 3656

/WADE MILES/
Supervisory Patent Examiner, Art Unit 3656

Prosecution Timeline

Feb 28, 2023: Application Filed
Feb 22, 2025: Non-Final Rejection — §103
Jun 26, 2025: Response Filed
Aug 22, 2025: Final Rejection — §103
Nov 28, 2025: Request for Continued Examination
Dec 10, 2025: Response after Non-Final Action
Dec 11, 2025: Non-Final Rejection — §103
Apr 16, 2026: Interview Requested

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12600039: ROBOT, CONVEYING SYSTEM, AND ROBOT-CONTROLLING METHOD (granted Apr 14, 2026; 2y 5m to grant)
Patent 12533807: ROBOTIC APPARATUS AND CONTROL METHOD THEREOF (granted Jan 27, 2026; 2y 5m to grant)
Patent 12479098: SURGICAL ROBOTIC SYSTEM WITH ACCESS PORT STORAGE (granted Nov 25, 2025; 2y 5m to grant)
Patent 12384048: TRANSFER APPARATUS (granted Aug 12, 2025; 2y 5m to grant)
Patent 12376922: TOOL HEAD POSTURE ADJUSTMENT METHOD, APPARATUS AND READABLE STORAGE MEDIUM (granted Aug 05, 2025; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 54%
With Interview: 99% (+85.7% lift)
Median Time to Grant: 2y 4m
PTA Risk: High
Based on 13 resolved cases by this examiner. Grant probability derived from career allow rate.
