DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
The drawing and specification objections are withdrawn in view of the amendments. The claim rejections under 35 U.S.C. § 101 are withdrawn in view of the amendments.
Applicant argues on page 12 of the Applicant’s remarks that “Koga's height map (reproduced below) lacks edges and nodes and is not a graph (Koga, para 40 and FIG. 5)”. The Examiner respectfully disagrees. A graph can be considered a type of visual representation. Koga discloses using a depth sensor to acquire data about the environment that is used to control the robotic system (See at least Para [0030]). Under the broadest reasonable interpretation, Koga’s height map can be considered a graph because it is a visual representation of the work environment, and because the Specification of the instant application describes that “a graph based on the sensor data of the work environment may be generated” (see at least Para [0008]).
Applicant argues on page 13 of the Applicant’s remarks that “However, Koga never describes the height map as a graph. Applicant submits that one of ordinary skill in the art, applying even the broadest reasonable interpretation independent of the specification, would not understand Koga's height map to be a 'graph' as claimed. Furthermore, even if the term 'graph' alone could be construed as a height map, the claims clarify that the 'graph' comprises nodes and edges: 'identifying a plurality of connections on the graph.' This further excludes construction of the term 'graph' as Koga's height map”. The Examiner respectfully disagrees. As described in the Specification of the instant application, “a graph based on the sensor data of the work environment may be generated” (see at least Para [0008]). The sensors are depth sensors, as disclosed in Para [0044] of the Specification. Koga likewise discloses using a depth sensor to acquire data about the environment that is used to control the robotic system (See at least Para [0030]). Koga’s height map, which is generated from depth data from the depth camera, is therefore construed as a graph generated based on the sensor data of the work environment, since a graph is considered to be a visual representation of the work environment.
Applicant argues on page 13 of the Applicant’s remarks that “Koga fails to teach the limitations of identifying graph connections based on relative locations of sensed objects”. The Examiner respectfully disagrees. The Specification discloses: “A graph may be generated based on the image of the work environment. A plurality of connections on the graph may be identified based on a plurality of relative locations, positions, and/or poses corresponding to each of the plurality of objects.” (See at least Para [0007]). Therefore, the relative locations can also be the positions or poses of each of the plurality of objects.
Under the broadest reasonable interpretation (BRI), the limitations “generating a graph based
on the sensor data of the work environment” and “identifying a plurality of connections on the graph based on a plurality of relative locations corresponding to each of the plurality of objects” do not require the graph connections to explicitly encode or represent the relative location of objects. Instead, these limitations are met so long as the connections (edges) between nodes (objects) in the graph are identified, selected, or established based on the relative positions or proximity of the objects as determined from sensor data. In other words, the claim is satisfied if sensor data is used to determine which objects are near each other, and those relationships are reflected as connections in a graph structure, regardless of whether the graph explicitly contains spatial information.
In light of this BRI, the combination of Agboh and Koga teaches these limitations of claim 1. Agboh discloses processing sensor data to determine the locations of objects and grouping them based on proximity, effectively identifying which objects are “connected” according to their spatial relationships. Although Agboh does not explicitly refer to a “graph,” this neighborhood-based grouping is functionally equivalent to generating a graph in which objects are nodes and edges are established between objects that are within a certain distance of each other. Koga, in turn, provides explicit teachings of constructing and using graphs in robotic planning contexts, reinforcing the reasonableness of representing such relationships as a graph. Thus, when combined, these references disclose generating a graph from sensor data, with connections identified based on the relative locations of objects, thereby teaching the relevant limitations of claim 1 under their broadest reasonable interpretation.
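For illustration only, the proximity-based interpretation described above can be sketched as follows. This sketch is not drawn from Agboh or Koga; the function name, object identifiers, and distance threshold are hypothetical, and the sketch merely shows how sensed object locations can yield a graph whose connections are identified from relative locations:

```python
# Illustrative sketch only (hypothetical names and threshold): objects are
# nodes, and an edge (connection) is identified between any two objects
# whose sensed relative locations fall within a threshold distance.
import math

def build_proximity_graph(object_positions, threshold):
    """object_positions: dict mapping object id -> (x, y) sensed location.
    Returns (nodes, edges), where edges connect objects closer than threshold."""
    nodes = list(object_positions)
    edges = []
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            # Relative location of the two objects determines the connection.
            if math.dist(object_positions[a], object_positions[b]) < threshold:
                edges.append((a, b))
    return nodes, edges

# Three hypothetical sensed objects: two near each other, one far away.
nodes, edges = build_proximity_graph(
    {"obj1": (0.0, 0.0), "obj2": (0.1, 0.0), "obj3": (5.0, 5.0)},
    threshold=0.5)
print(edges)  # [('obj1', 'obj2')]
```

Under this sketch, the graph need not encode spatial coordinates in its edges; the connections merely reflect which objects were determined, from sensor data, to be near one another.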
Therefore, the Applicant's arguments filed on 12/05/25 with respect to claims 1-19 have been fully considered, but they are not persuasive.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1-7 are rejected under 35 U.S.C. 103 as being unpatentable over Agboh et al. (“Learning to Efficiently Plan Robust Frictional Multi-Object Grasps,” by Wisdom C. Agboh, Satvik Sharma, Kishore Srinivas, Mallika Parulekar, Gaurav Datta, Tianshuang Qiu, Jeffrey Ichnowski, Eugen Solowjow, Mehmet Dogar, and Ken Goldberg) (Hereinafter Agboh) in view of Koga et al. (US 20230278213 A1) (Hereinafter Koga).
Regarding Claim 1, Agboh teaches a method for generating a picking plan for a robotic picking arm, the method comprising:
…
processing sensor data indicative of a work environment containing a plurality of objects (See at
least Page 6 Col 1 Para 2 “We use an RGBD camera (Intel Realsense Camera D435) to get a top-down image of the cluttered scene and then extract vertices of all objects to get the state x…”);
extracting a plurality of clusters from the plurality of connections (See at least Page 5 Col 2 – Page 6 Col 1 “VII. PHYSICAL EXPERIMENTS - A. Experimental setup - … We begin by repeatedly creating random object clusters. Each scene contains 17 non-overlapping object clusters that have a random center point. Within each cluster, we randomly sample the number of objects, their types, positions, and orientations.”);
determining a plurality of ranks corresponding to each of the plurality of clusters using a ranking algorithm (See at least Page 3 Col 2 Para 2 “A. Decluttering - Then, the RankObjGroups(.) subroutine(line4) ranks the list of object groups by their size. ”);
generating the picking plan, where the picking plan comprises a plurality of grasping poses associated with each of the plurality of ranks (See at least Page 3 Col 1 “The next step is to plan a robust multi-object grasp for this object group.”, Page 7 Col 1 “VIII. LIMITATIONS AND FUTURE WORK - … We use MOG-Net in a novel grasp planner to generate robust multi-object grasps.”, Page 3 Col 1 Algorithm 1: Decluttering Algorithm shows that the grasp planner receives ranked object groups as inputs, which is construed as the picking plan comprising a plurality of grasping poses associated with each of the plurality of ranks);
selecting a grasping pose from the plurality of grasping poses (See at least Figure 2 “… The chosen robust grasp maximizes the product γk·Nkg.”, Page 6 Col 1 Para 2 “(1) Moving the open gripper above the desired grasp pose and lowering until just above the table.”); and
sending a plurality of movement instructions to the robotic picking arm to cause the robotic device to pick a selected plurality of objects in one grasping motion, the grasping motion being determined in accordance with the grasping pose (See at least Figure 2 “… The chosen robust grasp maximizes the product γk·Nkg. We execute the robust grasp and continue to the next object group”, Page 6 Col 1 Para 2 “(1) Moving the open gripper above the desired grasp pose and lowering until just above the table.”).
However, Agboh does not explicitly spell out …
obtaining a user input, the user input comprising a desired picking characteristic …
generating a graph based on the sensor data of the work environment
identifying a plurality of connections on the graph based on a plurality of relative locations
corresponding to each of the plurality of objects; …
Koga teaches …
obtaining a user input, the user input comprising a desired picking characteristic (See at least Para [0014] “FIGS. 4A-4F illustrate how an exemplar grasp proposal can be generated from user specified robot grasps, according to various embodiments.”, Para [0037] “FIGS. 4A-4F illustrate how an exemplar grasp proposal can be generated from user specified robot grasps, according to various embodiments. Each of FIGS. 4A-4E shows a different way that the fingers of a robot 402 can grasp a part 404. In some embodiments, the model trainer 116 provides a GUI (e.g., a point-and-click GUI) that permits a user to input a set of robot grasps for each part, such as the grasps shown in FIGS. 4A-4E. Each robot grasp in the set of robot grasps can be either a single position and orientation of fingers of a robot with respect to the part, or a range between two endpoints. Each grasp input by the user can also be flipped by 180 degrees along a z-axis of the fingers, producing another robot grasp for the set of robot grasps…”); …
generating a graph based on the sensor data of the work environment (See at least Para [0040] “… In some embodiments, height maps (e.g., height map 502), which each indicate the height from a plane behind one or more parts, can be generated from depth data indicating distance from a depth sensor, such as depth images captured by a depth camera that is mounted on a robot or elsewhere…”, which discloses a height map generated from depth data from the depth camera, and which is construed as generating a graph based on the sensor data of the work environment);
identifying a plurality of connections on the graph based on a plurality of relative locations
corresponding to each of the plurality of objects (See at least Para [0053] “FIG. 8 illustrates an exemplar graph 800 for determining how a part can be re-grasped, according to various embodiments. As shown, the graph 800 includes a number of nodes connected by edges…”, Para [0073] “… In some embodiments, the robot control application 146 determines the sequence of movements by performing a search of a graph whose nodes represent one or more robots grasping the part in different poses, as described above in conjunction with FIG. 8.”); …
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the method of Agboh with the teachings of Koga and include the features of obtaining a user input comprising a desired picking characteristic, generating a graph based on the sensor data of the work environment, and identifying a plurality of connections on the graph based on a plurality of relative locations corresponding to each of the plurality of objects, thereby improving efficiency and reliability by providing an option for user engagement related to picking and by visually communicating large or complex datasets (See at least Para [0005] “As the foregoing illustrates, what is needed in the art are more effective techniques for controlling robots when performing assembly tasks.”).
Regarding Claim 2, modified Agboh teaches all the elements of claim 1. Agboh further teaches the method of claim 1, further comprising:
analyzing the plurality of clusters and detecting a collision cluster containing a collision pose (See at least Page 4 Col 1 Para 1 “The algorithm generates multiple grasp candidates (line 1) using GenGraspCands(.). It finds the convex hull of a given group of objects and generates Np points that uniformly cover the convex hull. At each point, it generates Nθ orientation samples. It rejects grasp samples that result in collisions between the gripper jaws and any object.”, which discloses rejecting grasp samples that result in collisions between the gripper jaws and any object; analyzing and detecting the collision pose is done prior to rejecting them); and
deleting the collision cluster from the plurality of clusters (See at least Page 4 Col 1 Para 1 “The algorithm generates multiple grasp candidates (line 1) using GenGraspCands(.). It finds the convex hull of a given group of objects and generates Np points that uniformly cover the convex hull. At each point, it generates Nθ orientation samples. It rejects grasp samples that result in collisions between the gripper jaws and any object.”).
Regarding Claim 3, modified Agboh teaches all the elements of claim 1.
However, Agboh does not explicitly spell out the method of claim 1, wherein the plurality of grasping poses comprises collision-free poses.
Koga teaches the method of claim 1, wherein the plurality of grasping poses comprises collision-free poses (See at least Para [0045] “… The grasp associated with the resulting collision-free path is then the goal grasp…”, Para [0036] “… When adding a part to an assembly, the grasping module 304 applies the grasp perception model 150 to determine grasp proposals indicating stable and collision-free regions of parts that fingers of the robotic system can grasp…”, Para [0053] “… Then, the grasp pose can be tested to determine if the grasp pose for the fingers of a robot is achievable. Achievable grasps can further be tested to determine if the robot and fingers are collision-free with the environment during the grasps…”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the method of Agboh with the teachings of Koga and include the feature of the plurality of grasping poses comprising collision-free poses, thereby providing collision-free poses, which will improve accuracy, efficiency, and safety when performing robotic tasks (See at least Para [0005] “As the foregoing illustrates, what is needed in the art are more effective techniques for controlling robots when performing assembly tasks.”).
Regarding Claim 4, modified Agboh teaches all the elements of claim 1. Agboh further teaches the method of claim 1, further comprising generating a confidence estimation using a trained neural network model (See at least Page 5 Col 1 “VI. LEARNING A MULTI-OBJECT GRASP NEURAL NETWORK - We train MOG-Net with self-supervised learning in real to predict the number of objects (Ng) that can be successfully grasped from a target object group. It takes the state of all objects in a target group, and a grasp action u as inputs. In Sec. VI-A we detail our data collection process, and in Sec. VI-B we explain details of the neural network model.”, Page 6 “Table II. Physical decluttering experimental results for 10 scenes, each with 58 objects randomized as described in VII-A. We reset each scene precisely by hand to compare the methods. Errors here are within 95% confidence interval of the mean.”).
Regarding Claim 5, modified Agboh teaches all the elements of claim 1.
However, Agboh does not explicitly spell out the method of claim 1, wherein the plurality of grasping poses fit in an effective gripping area.
Koga teaches the method of claim 1, wherein the plurality of grasping poses fit in an effective gripping area (See at least Para [0082] “5. The computer-implemented method of any of clauses 1-4, further comprising training the second trained machine learning model based on training data that includes (i) sensor data associated with a computer-aided design (CAD) model that represents the first part being grasped in a plurality of poses, and a (ii) plurality of pose proposals that each represents the first part being grasped in a different pose included in the plurality of poses”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the method of Agboh with the teachings of Koga and include the feature of the plurality of grasping poses fitting in an effective gripping area, thereby improving efficiency and reliability when performing robotic tasks (See at least Para [0005] “As the foregoing illustrates, what is needed in the art are more effective techniques for controlling robots when performing assembly tasks.”).
Regarding Claim 6, modified Agboh teaches all the elements of claim 5. Agboh further teaches the method of claim 5, wherein generating the picking plan comprises performing a set of operations substantially in accordance with Algorithm 1 (See at least Page 1 Col 2 “A grasp planning algorithm, µ-MOG, that generates grasps that are robust to state and control uncertainty, by considering the probability of necessary conditions being satisfied.”).
Regarding Claim 7, modified Agboh teaches all the elements of claim 1. Agboh further teaches the method of claim 1, wherein the ranking algorithm comprises calculating the plurality of ranks substantially in accordance with Algorithm 2 (See at least Page 4 Col 1, Algorithm 1, which shows the ranking algorithm calculating the plurality of ranks substantially in accordance with another algorithm).
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Agboh et al. (“Learning to Efficiently Plan Robust Frictional Multi-Object Grasps,” by Wisdom C. Agboh, Satvik Sharma, Kishore Srinivas, Mallika Parulekar, Gaurav Datta, Tianshuang Qiu, Jeffrey Ichnowski, Eugen Solowjow, Mehmet Dogar, and Ken Goldberg) (Hereinafter Agboh) in view of Koga et al. (US 20230278213 A1) (Hereinafter Koga), Fakuda et al. (US 20180236669 A1) (Hereinafter Fakuda), and further in view of Cai et al. (CN113927601A) (Hereinafter Cai).
Regarding Claim 8, modified Agboh teaches all the elements of claim 1.
However, Agboh does not explicitly spell out the method of claim 1, wherein the picking plan further comprises a first plurality of coordinates corresponding to a location in the work environment; and a second plurality of coordinates corresponding to a location of the robotic picking arm.
Fakuda teaches the method of claim 1, wherein the picking plan further comprises a first plurality of coordinates corresponding to a location in the work environment (See at least Para [0032] “… The work robot 5 operates based on a position command and a posture command that are based on the work space coordinate system XYZ…”); …
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Agboh with the teachings of Fakuda and include the feature of the picking plan further comprising a first plurality of coordinates corresponding to a location in the work environment, thereby providing precise and accurate movement when performing robotic tasks.
Cai teaches … and a second plurality of coordinates corresponding to a location of the robotic
picking arm (See at least Page 2 Para 2 “Obtain a preset grasping recognition model, input the three-dimensional model into the grasping identification model, and obtain the first grasping direction and the coordinates of multiple grasping points of the mechanical claw at the end of the robotic arm to grasp the picking target”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the method of Agboh with the teachings of Cai and include the feature of the picking plan further comprising a second plurality of coordinates corresponding to a location of the robotic picking arm, thereby providing precise and accurate movement when performing robotic tasks (See at least Page 15 Para 10 “The embodiment of the present invention utilizes the cooperation of the camera and the robotic arm to accurately select the picking target, realizes visual recognition, saves labor costs, and also avoids the problem that manual picking is prone to errors.”).
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Agboh et al. (“Learning to Efficiently Plan Robust Frictional Multi-Object Grasps,” by Wisdom C. Agboh, Satvik Sharma, Kishore Srinivas, Mallika Parulekar, Gaurav Datta, Tianshuang Qiu, Jeffrey Ichnowski, Eugen Solowjow, Mehmet Dogar, and Ken Goldberg) (Hereinafter Agboh) in view of Koga et al. (US 20230278213 A1) (Hereinafter Koga), and further in view of Sorin et al. (US 20210023706 A1) (Hereinafter Sorin).
Regarding Claim 9, modified Agboh teaches all the elements of claim 1. Agboh further teaches the method of claim 1, wherein outputting the picking plan comprises:
…
storing the pair in a training dataset (See at least Page 2 Col 2 Para 4 “… Popular data-driven single-object robust grasp synthesis approaches are Dex-Net 2.0 and Dex-Net 4.0 [2] [40]. They train a grasp quality convolutional neural network (GQ-CNN) with synthetic data to predict grasp success probability.”)…
training a neural network to generate picking plans that implement the desired picking
characteristics for objects based on using sensor data as an input (See at least “Abstract - … We train a neural network using real examples to plan robust multi-object grasps…”).
Agboh does not explicitly spell out …
matching the picking plan with the sensor data of the work environment to create a pair
Sorin teaches …
matching the picking plan with the sensor data of the work environment to create a pair (See at
least Para [0140] “During the runtime phase, the sensors 282 send perception data to processor 212a. The perception data may be a stream of which voxels or boxes that are present in the current environment and are stored in on-chip environment memory 294. Using Boolean circuitry to compare the perception data retrieved from the environment memory 294 to the information stored in the planning graph edge information memory 284 ...”); and …
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the method of Agboh with the teachings of Sorin and include the feature of matching the picking plan with the sensor data of the work environment, thereby providing a precise and accurate movement plan when performing robotic tasks with efficiency (See at least Para [0124] “… It is advantageous for robot 102 to be able to quickly and efficiently determine which movements of robot arm 106 (and any movement of robot 102) would result in a collision with obstacle A 112. Therefore, the present disclosure provides solutions that would enable robot 102 to efficiently represent, communicate and compare the space occupied by robot 102 and obstacle A 112 in the environment 100 to facilitate determining which movements of robot arm 106 would result in a collision with obstacle A 112…”).
Claim(s) 10-12 and 14-19 are rejected under 35 U.S.C. 103 as being unpatentable over Agboh et al. (“Learning to Efficiently Plan Robust Frictional Multi-Object Grasps,” by Wisdom C. Agboh, Satvik Sharma, Kishore Srinivas, Mallika Parulekar, Gaurav Datta, Tianshuang Qiu, Jeffrey Ichnowski, Eugen Solowjow, Mehmet Dogar, and Ken Goldberg) (Hereinafter Agboh) in view of Koga et al. (US 20230278213 A1) (Hereinafter Koga), and further in view of Cloetingh et al. (US 20210069917 A1) (Hereinafter Cloetingh).
Regarding Claim 10, Agboh teaches a system for picking objects, the system comprising:
a sensor (See at least Page 6 Col 1 Para 2 “We use an RGBD camera (Intel Realsense Camera
D435) to get a top-down image of the cluttered scene and then extract vertices of all objects to get the state x…”);
a robotic device (See at least Fig 1 shows a robotic device), the robotic device comprising a multi-axis arm (See at least Page 5 Col 2 “A. Experimental setup - The setup is as shown in Fig. 1 where we use a UR5 robot with a Robotiq 2F-85 gripper.”, which discloses a UR5 robot, which is a 6-axis robot) … ;
a processor electrically coupled to the sensor and the robotic device (See at least Fig 2 shows an image of a work environment containing a plurality of objects is processed, Page 6 Col 1 Para 2 “We use an RGBD camera (Intel Realsense Camera D435) to get a top-down image of the cluttered scene and then extract vertices of all objects to get the state x…”, Page 1 Col 1 Para 3 “Instead of using a physics simulator, we propose to collect data entirely on a physical robot and use it to train a multi-object grasping function, MOG-Net, which is robust to state and control uncertainty and predicts the number of objects that will be grasped out of a target object group.”, performing the above-mentioned operations requires a processor electrically coupled to the sensor and the robotic device);
a memory in communication with the processor, wherein the processor is configured to execute instructions embodied in the memory (See at least Page 1 Col 1 Para 3 “Instead of using a physics simulator, we propose to collect data entirely on a physical robot and use it to train a multi-object grasping function, MOG-Net, which is robust to state and control uncertainty and predicts the number of objects that will be grasped out of a target object group.”, performing the above-mentioned operations requires a memory in communication with the processor, wherein the processor is configured to execute instructions embodied in the memory) to:
…
process an image of a work environment containing a plurality of objects (See at least Fig 2 shows an image of a work environment containing a plurality of objects is processed, Page 6 Col 1 Para 2 “We use an RGBD camera (Intel Realsense Camera D435) to get a top-down image of the cluttered scene and then extract vertices of all objects to get the state x…”);
…
extract a plurality of clusters from the plurality of connections (See at least Page 5 Col 2 – Page 6 Col 1 “VII. PHYSICAL EXPERIMENTS - A. Experimental setup - … We begin by repeatedly creating random object clusters. Each scene contains 17 non-overlapping object clusters that have a random center point. Within each cluster, we randomly sample the number of objects, their types, positions, and orientations.”);
determine a plurality of ranks corresponding to each of the plurality of clusters using a ranking algorithm (See at least Page 3 Col 2 Para 2 “A. Decluttering - Then, the RankObjGroups(.) subroutine(line4) ranks the list of object groups by their size. ”);
generate a picking plan, wherein the picking plan comprises a plurality of grasping poses associated with each of the plurality of ranks (See at least Page 3 Col 1 “The next step is to plan a robust multi-object grasp for this object group.”, Page 7 Col 1 “VIII. LIMITATIONS AND FUTURE WORK - … We use MOG-Net in a novel grasp planner to generate robust multi-object grasps.”, Page 3 Col 1 Algorithm 1: Decluttering Algorithm shows that the grasp planner receives ranked object groups as inputs, which is construed as the picking plan comprising a plurality of grasping poses associated with each of the plurality of ranks);
select a grasping pose from the plurality of grasping poses (See at least Figure 2 “… The chosen robust grasp maximizes the product γk·Nkg.”, Page 6 Col 1 Para 2 “(1) Moving the open gripper above the desired grasp pose and lowering until just above the table.”); and
send a plurality of movement instructions to the robotic picking arm to cause the robotic device to pick a selected plurality of objects in one grasping motion, the grasping motion being determined in accordance with the grasping pose (See at least Figure 2 “… The chosen robust grasp maximizes the product γk·Nkg. We execute the robust grasp and continue to the next object group”, Page 6 Col 1 Para 2 “(1) Moving the open gripper above the desired grasp pose and lowering until just above the table.”).
However, Agboh does not explicitly spell out … and a set of paddles …
obtain a user input, the user input comprising a desired picking characteristic …
generate a graph based on the image of the work environment;
identify a plurality of connections on the graph based on a plurality of relative locations corresponding to each of the plurality of objects; …
Koga teaches …
obtain a user input, the user input comprising a desired picking characteristic (See at least Para [0014] “FIGS. 4A-4F illustrate how an exemplar grasp proposal can be generated from user specified robot grasps, according to various embodiments.”, Para [0037] “FIGS. 4A-4F illustrate how an exemplar grasp proposal can be generated from user specified robot grasps, according to various embodiments. Each of FIGS. 4A-4E shows a different way that the fingers of a robot 402 can grasp a part 404. In some embodiments, the model trainer 116 provides a GUI (e.g., a point-and-click GUI) that permits a user to input a set of robot grasps for each part, such as the grasps shown in FIGS. 4A-4E. Each robot grasp in the set of robot grasps can be either a single position and orientation of fingers of a robot with respect to the part, or a range between two endpoints. Each grasp input by the user can also be flipped by 180 degrees along a z-axis of the fingers, producing another robot grasp for the set of robot grasps…”); …
generate a graph based on the sensor data of the work environment (See at least Para [0040] “… In some embodiments, height maps (e.g., height map 502), which each indicate the height from a plane behind one or more parts, can be generated from depth data indicating distance from a depth sensor, such as depth images captured by a depth camera that is mounted on a robot or elsewhere…”, which discloses a height map generated from depth data from the depth camera, and which is construed as generating a graph based on the sensor data of the work environment);
identify a plurality of connections on the graph based on a plurality of relative locations
corresponding to each of the plurality of objects (See at least Para [0053] “FIG. 8 illustrates an exemplar graph 800 for determining how a part can be re-grasped, according to various embodiments. As shown, the graph 800 includes a number of nodes connected by edges…”, Para [0073] “… In some embodiments, the robot control application 146 determines the sequence of movements by performing a search of a graph whose nodes represent one or more robots grasping the part in different poses, as described above in conjunction with FIG. 8.”);
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Agboh with the teachings of Koga and include the features of obtaining a user input comprising a desired picking characteristic, generating a graph based on the sensor data of the work environment, and identifying a plurality of connections on the graph based on a plurality of relative locations corresponding to each of the plurality of objects, thereby improving efficiency and reliability by providing an option for user engagement related to picking and by visually communicating large or complex datasets (See at least Para [0005] “As the foregoing illustrates, what is needed in the art are more effective techniques for controlling robots when performing assembly tasks.”).
However, neither Agboh nor Koga explicitly discloses … and a set of paddles …
Cloetingh teaches … and a set of paddles (See at least Para [0015] “… A paddle component 123 may be mounted onto a different finger of the plurality of fingers 121. The paddle component 123 may be slightly curved to accommodate the cup stack. Together, the cone-shaped component 122 and the paddle component 123 act together to grip the cup stack…”)…
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Agboh with the teachings of Cloetingh and include the feature of using a set of paddles attached to the robotic arm/gripper, thereby providing easy and stable grasping of multiple objects at once by the robotic device.
Regarding Claim 11, modified Agboh teaches all the elements of claim 10. Agboh further teaches the system of claim 10, wherein the sensor is a Red Green Blue Depth (RGBD) vision sensor (See at least Page 6 Col 1 Para 2 “We use an RGBD camera (Intel Real sense Camera D435) to get a top-down image of the cluttered scene and then extract vertices of all objects to get the state x…”).
Regarding Claim 12, modified Agboh teaches all the elements of claim 10. Agboh further teaches the system of claim 10, wherein the sensor is positioned above the work environment (See at least Page 2 Col 2 “III. PROBLEM STATEMENT We consider a decluttering problem where multiple rigid convex polygonal objects rest in randomly placed positions and orientations on a planar surface, visible from an overhead camera, and must be transported to a packing box.”, Page 6 Col 1 Para 2 “We use an RGBD camera (Intel Real sense Camera D435) to get a top-down image of the cluttered scene and then extract vertices of all objects to get the state x. The grasp action u involves four steps. (1) Moving the open gripper above the desired grasp pose and lowering until just above the table.”).
Regarding Claim 14, modified Agboh teaches all the elements of claim 10. Agboh further teaches the system of claim 10, wherein the plurality of objects comprises at least one of: a cube, a cylinder, a cuboid, or a hexagon (See at least Fig 1, which shows that the plurality of objects includes cubes and cylinders).
Regarding Claim 15, modified Agboh teaches all the elements of claim 10. Agboh further teaches the system of claim 10, wherein the plurality of grasping poses comprises at least one of a gripping pose to grip multiple objects (See at least Fig 1 and Fig 2, which show a gripping pose gripping multiple objects).
Regarding Claim 16, Agboh teaches a system for picking objects, the system comprising:
a sensor (See at least Page 6 Col 1 Para 2 “We use an RGBD camera (Intel Realsense Camera
D435) to get a top-down image of the cluttered scene and then extract vertices of all objects to get the state x…”);
a robotic device (See at least Fig 1, which shows a robotic device), the robotic device comprising a multi-axis arm … (See at least Page 5 Col 2 “A. Experimental setup - The setup is as shown in Fig. 1 where we use a UR5 robot with a Robotiq 2F-85 gripper.”, discloses a UR5 robot, which is a 6-axis robot);
a processor electrically coupled to the sensor and the robotic device (See at least Fig 2, which shows an image of a work environment containing a plurality of objects being processed, Page 6 Col 1 Para 2 “We use an RGBD camera (Intel Realsense Camera D435) to get a top-down image of the cluttered scene and then extract vertices of all objects to get the state x…”, Page 1 Col 1 Para 3 “Instead of using a physics simulator, we propose to collect data entirely on a physical robot and use it to train a multi-object grasping function, MOG-Net, which is robust to state and control uncertainty and predicts the number of objects that will be grasped out of a target object group.”, performing the above-mentioned operations requires a processor electrically coupled to the sensor and the robotic device);
a memory in communication with the processor, wherein the processor is configured to execute instructions embodied in the memory (See at least Page 1 Col 1 Para 3 “Instead of using a physics simulator, we propose to collect data entirely on a physical robot and use it to train a multi-object grasping function, MOG-Net, which is robust to state and control uncertainty and predicts the number of objects that will be grasped out of a target object group.”, performing the above-mentioned operations requires a memory in communication with the processor, wherein the processor is configured to execute instructions embodied in the memory) which cause the processor to:
determine a desired grasping characteristic (See at least Page 1 Col 1 Para 3 “Instead of using a physics simulator, we propose to collect data entirely on a physical robot and use it to train a multi-object grasping function, MOG-Net, which is robust to state and control uncertainty and predicts the number of objects that will be grasped out of a target object group.”);
receive an output of the sensor (See at least Fig 2, which shows an image of a work environment containing a plurality of objects being processed, Page 6 Col 1 Para 2 “We use an RGBD camera (Intel Realsense Camera D435) to get a top-down image of the cluttered scene and then extract vertices of all objects to get the state x…”), and determine a grasping plan to achieve the desired grasping characteristic for a given workspace sensed by the sensor (See at least Page 3 Col 1 “The next step is to plan a robust multi-object grasp for this object group.”, Page 7 Col 1 “VIII. LIMITATIONS AND FUTURE WORK - … We use MOG-Net in a novel grasp planner to generate robust multi-object grasps.”, Page 3 Col 1 Algorithm 1: Decluttering Algorithm shows the grasp planner receives ranked object groups as inputs, which is construed as a picking plan comprising a plurality of grasping poses associated with each of the plurality of ranks); and …
extract a plurality of clusters from the plurality of connections (See at least Page 5 Col 2 – Page 6 Col 1 “VII. PHYSICAL EXPERIMENTS - A. Experimental setup - … We begin by repeatedly creating random object clusters. Each scene contains 17 non-overlapping object clusters that have a random center point. Within each cluster, we randomly sample the number of objects, their types, positions, and orientations.”);
rank the plurality of clusters (See at least Page 3 Col 2 Para 2 “A. Decluttering - Then, the RankObjGroups(.) subroutine (line 4) ranks the list of object groups by their size.”);
select one of the ranked clusters for the grasping plan (See at least Figure 2 “… The chosen robust grasp maximizes the product γk·Nkg.”, Page 3 Col 1 “The next step is to plan a robust multi-object grasp for this object group.”, Page 7 Col 1 “VIII. LIMITATIONS AND FUTURE WORK - … We use MOG-Net in a novel grasp planner to generate robust multi-object grasps.”, Page 3 Col 1 Algorithm 1: Decluttering Algorithm shows the grasp planner receives ranked object groups as inputs);
send a plurality of movement instructions to a plurality of motors of the multi-axis arm (See at least Fig 1, which shows a plurality of movements of the multi-axis arm), wherein the plurality of movement instructions cause the robotic device to pick a plurality of objects in one grasping motion (See at least Page 3 Col 1 Algorithm 1: Decluttering Algorithm shows executing the plan, Fig 1 and Fig 2 show picking a plurality of objects in one grasping motion) …, the grasping motion being determined in accordance with the grasping plan and the plurality of objects being determined in accordance with the desired grasping characteristic (See at least Page 3 Col 1 Para 1 “We further assume that a group of objects in force closure will be securely grasped during motion, and neither the grasping force nor the speed of the motion will dislodge the objects.”, Page 3 Col 1 Algorithm 1: Decluttering Algorithm shows executing the plan, Fig 1 and Fig 2 show picking a plurality of objects in one grasping motion).
However, Agboh does not explicitly disclose … and a set of paddles … using the paddles …
generate a graph based on the sensor output of the given workspace;
identify a plurality of connections on the graph based on a plurality of relative locations corresponding to each of the plurality of objects; …
Koga teaches …
generate a graph based on the sensor output of the given workspace (See at least Para [0040] “… In some embodiments, height maps (e.g., height map 502), which each indicate the height from a plane behind one or more parts, can be generated from depth data indicating distance from a depth sensor, such as depth images captured by a depth camera that is mounted on a robot or elsewhere…”, discloses a height map generated from depth data from the depth camera, which is construed as generating a graph based on the sensor output of the given workspace);
identify a plurality of connections on the graph based on a plurality of relative locations corresponding to each of the plurality of objects (See at least Para [0053] “FIG. 8 illustrates an exemplar graph 800 for determining how a part can be re-grasped, according to various embodiments. As shown, the graph 800 includes a number of nodes connected by edges…”, Para [0073] “… In some embodiments, the robot control application 146 determines the sequence of movements by performing a search of a graph whose nodes represent one or more robots grasping the part in different poses, as described above in conjunction with FIG. 8.”); …
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Agboh with the teachings of Koga and include the features of generating a graph based on the sensor output of the given workspace and identifying a plurality of connections on the graph based on a plurality of relative locations corresponding to each of the plurality of objects, thereby improving efficiency and reliability by visually communicating large or complex datasets (See at least Para [0005] “As the foregoing illustrates, what is needed in the art are more effective techniques for controlling robots when performing assembly tasks.”).
Cloetingh teaches … and a set of paddles (See at least Para [0015] “… A paddle component 123 may be mounted onto a different finger of the plurality of fingers 121. The paddle component 123 may be slightly curved to accommodate the cup stack. Together, the cone-shaped component 122 and the paddle component 123 act together to grip the cup stack…”)… using the paddles (See at least Para [0015] “… A paddle component 123 may be mounted onto a different finger of the plurality of fingers 121. The paddle component 123 may be slightly curved to accommodate the cup stack. Together, the cone-shaped component 122 and the paddle component 123 act together to grip the cup stack…”)…
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Agboh with the teachings of Cloetingh and include the feature of using a set of paddles attached to the robotic arm/gripper, thereby providing easy and stable grasping of multiple objects at once by the robotic device.
Regarding Claim 17, modified Agboh teaches all the elements of claim 16. Agboh further teaches the system of claim 16, wherein the plurality of objects are selected from a plurality of clusters (See at least Page 5 Col 2 Para 6 “We begin by repeatedly creating random object clusters. Each scene contains 17 non-overlapping object clusters that have a random”).
Regarding Claim 18, modified Agboh teaches all the elements of claim 16. Koga further teaches the system of claim 16, wherein the desired grasping characteristic comprises at least one of: an orientation of the plurality of objects, a number of objects to be picked from the plurality of objects, one or more types of objects in the plurality of objects, a maximum number of grasping motions, a minimization of grasping motions for a given number of objects, or an identified importance of objects in the plurality of objects (See at least Para [0068] “… In some embodiments, the assembly order is specified by a user or determined based on a user specification (e.g., a user specification of a disassembly order)…”, Para [0034] “In some embodiments, a graphical user interface (GUI) can be provided (e.g., by the model trainer 116 and/or the simulation application 118) that permits a user to specify fingers 220 and 224 of the robot models 212 and 214, respectively, that can grasp parts; sensors (e.g., cameras) 222 and 226 that acquire sensor data; a pickup area 230 where parts can be picked up…”, Para [0037] “FIGS. 4A-4F illustrate how an exemplar grasp proposal can be generated from user specified robot grasps, according to various embodiments. Each of FIGS. 4A-4E shows a different way that the fingers of a robot 402 can grasp a part 404. In some embodiments, the model trainer 116 provides a GUI (e.g., a point-and-click GUI) that permits a user to input a set of robot grasps for each part, such as the grasps shown in FIGS. 4A-4E. Each robot grasp in the set of robot grasps can be either a single position and orientation of fingers of a robot with respect to the part…”).
Regarding Claim 19, modified Agboh teaches all the elements of claim 16. Agboh further teaches the system of claim 16, wherein the instructions further cause the processor to provide the output of the sensor as an input to a trained machine learning model (See at least Fig 2, which shows the output of the sensor being provided as an input to a trained machine learning model; Figure 2 caption: “An overview of the decluttering system proposed in this paper. It finds the maximum group of objects that can fit in the gripper and generates a robust grasp for that group. First, it generates candidate grasps, and for each grasp uk, estimates a probability of satisfying multi-object grasp necessary conditions (γk), under state and control uncertainty. Thereafter, it uses MOG-Net which was trained in real to predict the number of objects (Nkg) that will be grasped using uk. The chosen robust grasp maximizes the product γk·Nkg. We execute the robust grasp and continue to the next object group until the table is cleared of all objects.”), and to obtain the grasping plan as an output of the trained machine learning model (See at least Page 3 Col 1 Algorithm 1: Decluttering Algorithm shows outputting the picking plan in variable u, Page 3 Col 1 “The next step is to plan a robust multi-object grasp for this object group.”, Page 7 Col 1 “VIII. LIMITATIONS AND FUTURE WORK - … We use MOG-Net in a novel grasp planner to generate robust multi-object grasps.”, Page 3 Col 1 Algorithm 1: Decluttering Algorithm shows the grasp planner receives ranked object groups as inputs); wherein the trained machine learning model was trained to generate grasping plans from sensor data using training data generated in accordance with the method of claim 9 (See at least Fig 2, which shows the machine learning model was trained to generate grasping plans from sensor data using generated training data).
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Agboh et al. (Learning to Efficiently Plan Robust Frictional Multi-Object Grasps, Wisdom C. Agboh, Satvik Sharma, Kishore Srinivas, Mallika Parulekar, Gaurav Datta, Tianshuang Qiu, Jeffrey Ichnowski, Eugen Solowjow, Mehmet Dogar, Ken Goldberg) (Hereinafter Agboh) in view of Koga et al. (US 20230278213 A1) (Hereinafter Koga), Cloetingh et al. (US 20210069917 A1) (Hereinafter Cloetingh), Fakuda et al. (US 20180236669 A1) (Hereinafter Fakuda), and further in view of Cai et al. (CN113927601A) (Hereinafter Cai).
Regarding Claim 13, modified Agboh teaches all the elements of claim 10.
However, Agboh does not explicitly disclose the system of claim 10, wherein the picking plan further comprises a first plurality of coordinates corresponding to a location in the work environment; and a second plurality of coordinates corresponding to a location of the set of paddles.
Fakuda teaches the system of claim 10, wherein the picking plan further comprises a first plurality of coordinates corresponding to a location in the work environment (See at least Para [0032] “… The work robot 5 operates based on a position command and a posture command that are based on the work space coordinate system XYZ…”); …
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Agboh with the teachings of Fakuda and include the feature of the picking plan further comprising a first plurality of coordinates corresponding to a location in the work environment, thereby providing precise and accurate movement when performing robotic tasks.
Cai teaches … and a second plurality of coordinates corresponding to a location of the set of
paddles (See at least Page 2 Para 2 “Obtain a preset grasping recognition model, input the three-dimensional model into the grasping identification model, and obtain the first grasping direction and the coordinates of multiple grasping points of the mechanical claw at the end of the robotic arm to grasp the picking target”).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system of Agboh with the teachings of Cai and include the feature of the picking plan further comprising a second plurality of coordinates corresponding to a location of the set of paddles, thereby providing precise and accurate movement when performing robotic tasks (See at least Page 15 Para 10 “The embodiment of the present invention utilizes the cooperation of the camera and the robotic arm to accurately select the picking target, realizes visual recognition, saves labor costs, and also avoids the problem that manual picking is prone to errors.”).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Mutarelli et al. (US 20230020976 A1) teaches a manipulation apparatus configured to transfer and release the multiple items coupled to the end effector onto the surface.
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAHEDA HOQUE whose telephone number is (571)270-5310. The examiner can normally be reached Monday-Friday 8:00 am- 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ramon Mercado can be reached at 571-270-5744. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SHAHEDA HOQUE/Examiner, Art Unit 3658
/Ramon A. Mercado/Supervisory Patent Examiner, Art Unit 3658